
Support remote assets and additional image formats#46

Merged
jongalloway merged 3 commits into main from copilot/support-remote-assets
Mar 4, 2026

Conversation

Contributor

Copilot AI commented Mar 4, 2026

  • Explore repository and understand existing image handling
  • Add WebP and BMP support to ImageMetadataReader
  • Add TryDetectContentType to ImageMetadataReader for magic-byte detection
  • Create RemoteAssetResolver class that downloads/caches remote HTTP images to temp files
  • Add AllowRemoteAssets and RemoteAssetHandler to PptxRenderOptions for configurability/testability
  • Update OpenXmlPptxRenderer to resolve remote URLs via RemoteAssetResolver
  • Update GetImageContentType to detect format from bytes when extension is absent/unknown; returns null for unsupported formats
  • Improve error messages: "Missing image: {url} ({reason})" and "Unsupported image format: {path}"
  • Add tests: remote image resolution (mock HTTP), HTTP failure (404), disabled remote assets, BMP/WebP acceptance, unsupported format, URL without extension
  • Address review feedback:
    • AllowRemoteAssets defaults to false (opt-in, avoids unexpected outbound HTTP)
    • URL cache uses StringComparer.Ordinal (URL paths are case-sensitive)
    • GetAsync uses HttpCompletionOption.ResponseHeadersRead for streaming (avoids buffering full response in memory)
    • Extension selection prefers Content-Type header over URL path extension for remote downloads
  • All 54 tests pass, CodeQL clean
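
For readers unfamiliar with magic-byte detection, the idea behind the TryDetectContentType item above can be sketched as follows. This is a minimal illustration, not the PR's code: the class name `ImageSniffer`, the method name `DetectContentType`, and its signature are assumptions, since the actual implementation in ImageMetadataReader is not shown in this conversation. Only the file signatures themselves (PNG, JPEG, GIF, BMP, WebP) are standard.

```csharp
using System;

// Hypothetical helper; the real TryDetectContentType signature is not shown in the PR text.
public static class ImageSniffer
{
    private static readonly byte[] Png = { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A };
    private static readonly byte[] Jpeg = { 0xFF, 0xD8, 0xFF };
    private static readonly byte[] Gif = { 0x47, 0x49, 0x46, 0x38 };   // "GIF8"
    private static readonly byte[] Bmp = { 0x42, 0x4D };               // "BM"
    private static readonly byte[] Riff = { 0x52, 0x49, 0x46, 0x46 };  // "RIFF"
    private static readonly byte[] Webp = { 0x57, 0x45, 0x42, 0x50 };  // "WEBP"

    // True when 'expected' appears in 'data' starting at 'offset'.
    private static bool HasBytesAt(ReadOnlySpan<byte> data, int offset, ReadOnlySpan<byte> expected) =>
        data.Length >= offset + expected.Length
        && data.Slice(offset, expected.Length).SequenceEqual(expected);

    // Returns a MIME type for recognized signatures, or null for anything else.
    public static string? DetectContentType(ReadOnlySpan<byte> header)
    {
        if (HasBytesAt(header, 0, Png)) return "image/png";
        if (HasBytesAt(header, 0, Jpeg)) return "image/jpeg";
        if (HasBytesAt(header, 0, Gif)) return "image/gif";
        if (HasBytesAt(header, 0, Bmp)) return "image/bmp";
        // WebP is a RIFF container: "RIFF", a 4-byte size, then "WEBP".
        if (HasBytesAt(header, 0, Riff) && HasBytesAt(header, 8, Webp)) return "image/webp";
        return null;
    }
}
```

Returning null for unknown signatures mirrors the behavior described in the task list, where unsupported formats are rejected rather than guessed.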
Original prompt

This section details the original issue you should resolve

<issue_title>Support remote assets and additional image formats</issue_title>
<issue_description>## Summary
Improve asset handling to fetch remote images safely and support more input formats.

## Scope

  • add remote asset resolution rules
  • broaden intrinsic metadata support where needed
  • define caching and error handling behavior
  • add tests for asset resolution and failure cases

## Acceptance Criteria

  • remote images can be resolved intentionally and predictably
  • more image formats are accepted or clearly rejected
  • asset failures produce actionable errors</issue_description>

Comments on the Issue (you are @copilot in this section)



Co-authored-by: jongalloway <68539+jongalloway@users.noreply.github.com>
@jongalloway jongalloway marked this pull request as ready for review March 4, 2026 07:04
@jongalloway jongalloway requested a review from Copilot March 4, 2026 07:04
Copilot AI changed the title from "[WIP] Improve asset handling for remote images and formats" to "Support remote assets and additional image formats" Mar 4, 2026
Contributor

Copilot AI left a comment


Pull request overview

This PR expands the PPTX renderer’s asset pipeline to support remote HTTP(S) images and additional raster formats, improving format detection and making failures more actionable during rendering.

Changes:

  • Add a RemoteAssetResolver that downloads/caches HTTP(S) images to temporary files during a render.
  • Extend image format handling (BMP/WebP) and add magic-byte MIME detection for extensionless/unknown files.
  • Update rendering behavior and tests to cover remote asset success/failure, disabled remote assets, and unsupported formats.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| tests/MarpToPptx.Tests/PptxRendererTests.cs | Adds coverage for remote image resolution, HTTP failures, remote-disabled behavior, and BMP/WebP/unsupported formats. |
| src/MarpToPptx.Pptx/Rendering/RemoteAssetResolver.cs | Introduces a resolver that fetches remote images via HttpClient, caches by URL, and cleans up temp files. |
| src/MarpToPptx.Pptx/Rendering/PptxRenderOptions.cs | Adds AllowRemoteAssets and injectable RemoteAssetHandler for configurability/testability. |
| src/MarpToPptx.Pptx/Rendering/OpenXmlPptxRenderer.cs | Plumbs remote asset resolution into rendering and updates image content-type detection/error reporting. |
| src/MarpToPptx.Pptx/Rendering/ImageMetadataReader.cs | Adds BMP/WebP size parsing and stream-based magic-byte content-type detection. |
Comments suppressed due to low confidence (1)

tests/MarpToPptx.Tests/PptxRendererTests.cs:283

  • These new remote-asset tests assert picture presence/text, but they don’t run the OpenXmlPackageValidator like other tests in this file. Adding validation here would better catch regressions (e.g., incorrect content types/relationships) introduced by remote image embedding.
        using var document = PresentationDocument.Open(outputPath, false);
        var slidePart = document.PresentationPart!.SlideParts.First();
        Assert.Single(slidePart.Slide!.Descendants<P.Picture>());
        Assert.DoesNotContain("Missing image", slidePart.Slide!.Descendants<A.Text>().Select(t => t.Text));
    }


Comment on lines +10 to +13
/// When <c>true</c> (the default), HTTP and HTTPS image URLs are fetched at render time.
/// Set to <c>false</c> to treat all remote references as missing assets.
/// </summary>
public bool AllowRemoteAssets { get; init; } = true;

Copilot AI Mar 4, 2026


AllowRemoteAssets defaults to true, which changes rendering from “no network access” to “performs outbound HTTP(S) requests” unless callers opt out. That’s a security/operational concern for library consumers (SSRF and unexpected network dependency). Consider defaulting to false (opt-in) or very explicitly documenting the risk/expected usage in server contexts.

Suggested change
- /// When <c>true</c> (the default), HTTP and HTTPS image URLs are fetched at render time.
- /// Set to <c>false</c> to treat all remote references as missing assets.
- /// </summary>
- public bool AllowRemoteAssets { get; init; } = true;
+ /// When <c>true</c>, HTTP and HTTPS image URLs are fetched at render time.
+ /// The default is <c>false</c>, which treats all remote references as missing assets
+ /// and avoids outbound HTTP(S) requests unless explicitly enabled.
+ /// </summary>
+ public bool AllowRemoteAssets { get; init; } = false;
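
To make the opt-in behavior concrete, here is a small sketch of how a caller-facing gate might look. The types `RenderOptionsSketch` and `RemotePolicy` and the method `ShouldFetch` are hypothetical names introduced for illustration; they are not taken from PptxRenderOptions or OpenXmlPptxRenderer, whose internals are not shown here.

```csharp
using System;

// Hypothetical stand-in for the options type; only the property name comes from the PR.
public sealed class RenderOptionsSketch
{
    // Opt-in: no outbound HTTP(S) unless the caller sets this to true.
    public bool AllowRemoteAssets { get; init; } = false;
}

// Hypothetical policy helper, not part of the PR.
public static class RemotePolicy
{
    // Fetch only when remote assets are enabled AND the source is an absolute http/https URL.
    public static bool ShouldFetch(string source, RenderOptionsSketch options) =>
        options.AllowRemoteAssets
        && Uri.TryCreate(source, UriKind.Absolute, out var uri)
        && (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps);
}
```

With this shape, `new RenderOptionsSketch()` never triggers network access, and a caller has to write `new RenderOptionsSketch { AllowRemoteAssets = true }` to enable it.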


private readonly HttpClient _client;
private readonly bool _ownsClient;
private readonly Dictionary<string, string> _cache = new(StringComparer.OrdinalIgnoreCase);

Copilot AI Mar 4, 2026


RemoteAssetResolver caches by URL using StringComparer.OrdinalIgnoreCase. URL paths can be case-sensitive, so this may incorrectly treat distinct resources as the same cache key. Use a case-sensitive comparer (Ordinal) or normalize URLs in a way that preserves semantics.

Suggested change
- private readonly Dictionary<string, string> _cache = new(StringComparer.OrdinalIgnoreCase);
+ private readonly Dictionary<string, string> _cache = new(StringComparer.Ordinal);
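
The effect of the comparer choice can be seen in a small, self-contained demonstration. The helper `CacheKeyDemo.CountDistinct` is a hypothetical name used only for this illustration; the URLs are placeholders.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class; it only shows how the comparer changes cache-key identity.
public static class CacheKeyDemo
{
    // Adds each URL to a dictionary built with the given comparer and
    // returns how many distinct keys the dictionary ends up holding.
    public static int CountDistinct(IEqualityComparer<string> comparer, params string[] urls)
    {
        var cache = new Dictionary<string, string>(comparer);
        foreach (var url in urls)
        {
            // TryAdd leaves the existing entry in place when the key is already present.
            cache.TryAdd(url, "temp-path-for-" + url);
        }
        return cache.Count;
    }
}
```

Called with `https://example.com/Logo.png` and `https://example.com/logo.png`, the case-insensitive comparer yields one entry (the second URL would be served the first URL's cached file), while `StringComparer.Ordinal` yields two.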

Comment on lines +45 to +63
using var response = _client.GetAsync(url).ConfigureAwait(false).GetAwaiter().GetResult();
if (!response.IsSuccessStatusCode)
{
    errorMessage = $"HTTP {(int)response.StatusCode} {response.ReasonPhrase}";
    return null;
}

var contentType = response.Content.Headers.ContentType?.MediaType;
var extension = GetExtensionFromUrl(url)
    ?? GetExtensionFromContentType(contentType)
    ?? ".bin";

var tempPath = Path.Combine(Path.GetTempPath(), $"marp2pptx_{Guid.NewGuid():N}{extension}");

using (var fileStream = File.Create(tempPath))
using (var contentStream = response.Content.ReadAsStream())
{
    contentStream.CopyTo(fileStream);
}

Copilot AI Mar 4, 2026


Resolve() performs a synchronous wait on HttpClient.GetAsync and uses the default completion option, which buffers the entire response content in memory. This can cause thread-pool starvation/deadlocks in some sync contexts and can spike memory for large images. Prefer a streaming request (ResponseHeadersRead) and avoid sync-over-async (e.g., make resolution async or isolate sync waits carefully), ideally with an upper bound on bytes copied.
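
One possible shape for the streaming, size-bounded download suggested here is sketched below. It is an assumption-laden illustration rather than the merged code: `StreamingDownload`, `CopyWithLimitAsync`, `DownloadToFileAsync`, and the `maxBytes` parameter are hypothetical names, and the merged fix (per the follow-up comment) adopted ResponseHeadersRead but is not stated to have added a byte limit or an async API. The sketch assumes .NET 5 or later.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper class; not part of RemoteAssetResolver.
public static class StreamingDownload
{
    // Copies source to destination, throwing once more than maxBytes have been read.
    public static async Task<long> CopyWithLimitAsync(
        Stream source, Stream destination, long maxBytes, CancellationToken ct = default)
    {
        var buffer = new byte[81920];
        long total = 0;
        int read;
        while ((read = await source.ReadAsync(buffer.AsMemory(), ct).ConfigureAwait(false)) > 0)
        {
            total += read;
            if (total > maxBytes)
                throw new InvalidOperationException($"Response exceeds the {maxBytes}-byte limit.");
            await destination.WriteAsync(buffer.AsMemory(0, read), ct).ConfigureAwait(false);
        }
        return total;
    }

    // Streams the response body to a file without buffering it fully in memory.
    // If the limit is exceeded, the caller is responsible for deleting the partial file.
    public static async Task<long> DownloadToFileAsync(
        HttpClient client, string url, string path, long maxBytes, CancellationToken ct = default)
    {
        // ResponseHeadersRead returns as soon as headers arrive; the body is read on demand.
        using var response = await client
            .GetAsync(url, HttpCompletionOption.ResponseHeadersRead, ct)
            .ConfigureAwait(false);
        response.EnsureSuccessStatusCode();
        await using var content = await response.Content.ReadAsStreamAsync(ct).ConfigureAwait(false);
        await using var file = File.Create(path);
        return await CopyWithLimitAsync(content, file, maxBytes, ct).ConfigureAwait(false);
    }
}
```

Keeping the copy loop separate from the HTTP call makes the size limit testable with in-memory streams, without a network dependency.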

Comment on lines +53 to +54
var extension = GetExtensionFromUrl(url)
    ?? GetExtensionFromContentType(contentType)

Copilot AI Mar 4, 2026


Extension selection prefers the URL’s extension over the HTTP Content-Type, which can produce temp files with a misleading extension (and later a mismatched content type in the PPTX). For remote downloads, consider preferring Content-Type (or magic-byte detection after download) over URL extension, and/or validating that the bytes match the selected type.

Suggested change
- var extension = GetExtensionFromUrl(url)
-     ?? GetExtensionFromContentType(contentType)
+ var extension = GetExtensionFromContentType(contentType)
+     ?? GetExtensionFromUrl(url)
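
A standalone sketch of the suggested precedence follows. The bodies of GetExtensionFromContentType and GetExtensionFromUrl are not shown in the PR, so the class `ExtensionPicker`, its method names, and the MIME-to-extension mapping below are assumptions made for illustration.

```csharp
using System;
using System.IO;

// Hypothetical helper class; the PR's own helper implementations are not shown.
public static class ExtensionPicker
{
    // Maps a few image media types to file extensions; anything else yields null.
    public static string? FromContentType(string? mediaType) => mediaType?.ToLowerInvariant() switch
    {
        "image/png" => ".png",
        "image/jpeg" => ".jpg",
        "image/gif" => ".gif",
        "image/bmp" => ".bmp",
        "image/webp" => ".webp",
        _ => null,
    };

    // Takes the extension from the URL path only, ignoring query string and fragment.
    public static string? FromUrl(string url)
    {
        if (!Uri.TryCreate(url, UriKind.Absolute, out var uri)) return null;
        var ext = Path.GetExtension(uri.AbsolutePath);
        return string.IsNullOrEmpty(ext) ? null : ext.ToLowerInvariant();
    }

    // Content-Type wins; the URL extension is a fallback; ".bin" is the last resort.
    public static string Pick(string url, string? mediaType) =>
        FromContentType(mediaType) ?? FromUrl(url) ?? ".bin";
}
```

For example, a URL ending in `photo.jpg` served with `image/png` yields `.png`, so the temp file extension follows what the server reports rather than what the path implies.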

@jongalloway
Owner

@copilot review and address comments

…, streaming download, prefer Content-Type over URL extension

Co-authored-by: jongalloway <68539+jongalloway@users.noreply.github.com>
Contributor Author

Copilot AI commented Mar 4, 2026

@copilot review and address comments

Addressed all four review comments in commit 2c038e7:

  • AllowRemoteAssets defaults to false — remote fetching is now opt-in; callers must explicitly set AllowRemoteAssets = true to enable outbound HTTP(S) requests.
  • URL cache uses StringComparer.Ordinal — URL paths are case-sensitive, so the cache no longer conflates distinct resources that differ only in case.
  • Streaming download: GetAsync now uses HttpCompletionOption.ResponseHeadersRead, so response content is streamed directly to the temp file rather than buffered in memory first.
  • Content-Type preferred over URL extension — extension selection now tries Content-Type first, then the URL path extension as fallback, ensuring the temp file extension (and resulting MIME type) matches what the server actually sent.

Tests that exercise remote resolution were also updated to set AllowRemoteAssets = true explicitly.

@jongalloway jongalloway merged commit b254d55 into main Mar 4, 2026
5 checks passed
@jongalloway jongalloway deleted the copilot/support-remote-assets branch March 4, 2026 07:17

Development

Successfully merging this pull request may close these issues.

Support remote assets and additional image formats

3 participants