A PHP library to extract embedded data: URI images from HTML, replace them with temporary placeholders, and restore them with real URLs once the images have been saved or uploaded.
- Requires PHP 8.1+
composer require mathsgod/html-image-extractor

HTML (with data: URIs)
↓ extract()
HTML with __IMG_xxx__ placeholders + image data map
↓ save / upload images, get URLs
↓ restore()
Final HTML (with real URLs)
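
The flow above can be illustrated with a small standalone round trip. This is only a sketch of the idea, not the library's implementation: `sketchExtract()` and `sketchRestore()` are hypothetical helper names, the regular expression is a simplification, and the sequential `__IMG_0__` placeholder format is an assumption made for the example.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for extract(): swaps each base64 data: URI image
 * for a placeholder and records its MIME type and base64 payload.
 *
 * @return array{0: string, 1: array<string, array{mimeType: string, data: string}>}
 */
function sketchExtract(string $html): array
{
    $images = [];
    // Simplified pattern: matches only base64-encoded image data: URIs.
    $pattern = '~data:(image/[a-z0-9.+-]+);base64,([A-Za-z0-9+/=]+)~i';

    $out = preg_replace_callback(
        $pattern,
        function (array $m) use (&$images): string {
            // Sequential IDs are an assumption made for this sketch.
            $id = '__IMG_' . count($images) . '__';
            $images[$id] = ['mimeType' => $m[1], 'data' => $m[2]];
            return $id;
        },
        $html
    );

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return [$out ?? $html, $images];
}

/** Hypothetical stand-in for restore(): replaces each placeholder with its URL. */
function sketchRestore(string $html, array $urlMap): string
{
    return strtr($html, $urlMap);
}

$html = '<p><img src="data:image/png;base64,iVBORw0KGgo="></p>';
[$withPlaceholders, $images] = sketchExtract($html);
$final = sketchRestore(
    $withPlaceholders,
    ['__IMG_0__' => 'https://example.com/uploads/a.png']
);
```

Keeping the placeholder step separate from the restore step means the HTML can be stored or processed while the uploads are still in progress, and the URL map can come from any storage backend.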
use HtmlImageExtractor\HtmlImageExtractor;
$extractor = new HtmlImageExtractor();
// Step 1: extract embedded images
$extractor->extract($html);
$modifiedHtml = $extractor->getHtml(); // HTML with __IMG_xxx__ placeholders
$images = $extractor->getImages(); // image data map
echo $extractor->count() . ' image(s) found';
// Step 2: save to disk and get URL map
$urlMap = $extractor->saveToDir(
    saveDir: __DIR__ . '/uploads',
    baseUrl: 'https://example.com/uploads'
);
// Step 3: restore placeholders with real URLs
$finalHtml = $extractor->restore($urlMap);

$extractor->extract($html);
// Build the URL map yourself after uploading
$urlMap = [];
foreach ($extractor->getImages() as $id => $info) {
    // $info['mimeType']: e.g. "image/png"
    // $info['data']: base64-encoded image data
    // $info['extension']: e.g. "png"
    // myCloudUpload() is a placeholder for your own upload function
    $url = myCloudUpload(base64_decode($info['data']), $info['mimeType']);
    $urlMap[$id] = $url;
}
$finalHtml = $extractor->restore($urlMap);

| Method | Description |
|---|---|
| `extract(string $html): static` | Extract all `data:` URI images and replace them with placeholders. Returns `$this` for chaining. |
| `getHtml(): string` | Get the HTML with placeholders (after `extract()`). |
| `getImages(): array` | Get the extracted image data, keyed by placeholder ID. Each entry has `mimeType`, `data` (base64), and `extension`. |
| `count(): int` | Number of images found in the last `extract()` call. |
| `saveToDir(string $saveDir, string $baseUrl): array` | Save images to a local directory. Returns a URL map ready for `restore()`. |
| `restore(array $urlMap): string` | Replace placeholders with real URLs. Returns the final HTML. |
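
For readers who want to customise the save step, the behaviour described for `saveToDir()` can be approximated as below. This is a sketch under assumptions, not the library's code: `sketchSaveToDir()` is a hypothetical function, the content-hash file names are invented for the example, and skipping undecodable entries is a choice made here rather than documented behaviour.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch of a save-to-directory step.
 *
 * @param array<string, array{mimeType: string, data: string, extension: string}> $images
 * @return array<string, string> placeholder ID => public URL
 */
function sketchSaveToDir(array $images, string $saveDir, string $baseUrl): array
{
    // Create the target directory if it does not exist yet.
    if (!is_dir($saveDir) && !mkdir($saveDir, 0775, true) && !is_dir($saveDir)) {
        throw new RuntimeException("Cannot create directory: $saveDir");
    }

    $urlMap = [];
    foreach ($images as $id => $info) {
        // Strict mode returns false for input that is not valid base64.
        $bytes = base64_decode($info['data'], true);
        if ($bytes === false) {
            continue; // skip undecodable entries (a choice made for this sketch)
        }

        // Content-hash file names are an assumption made for this sketch.
        $name = sha1($bytes) . '.' . $info['extension'];
        if (file_put_contents($saveDir . '/' . $name, $bytes) === false) {
            throw new RuntimeException("Cannot write file: $name");
        }

        $urlMap[$id] = rtrim($baseUrl, '/') . '/' . $name;
    }

    return $urlMap;
}

// Example input shaped like the entries described for getImages().
$images = [
    '__IMG_0__' => [
        'mimeType'  => 'image/png',
        'data'      => base64_encode('fake-png-bytes'),
        'extension' => 'png',
    ],
];
$dir = sys_get_temp_dir() . '/img-sketch-' . bin2hex(random_bytes(4));
$urlMap = sketchSaveToDir($images, $dir, 'https://example.com/uploads/');
```

The returned array has the same shape as the URL map built by hand in the custom-upload example, so it could be passed to `restore()` in the same way.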
Supported image formats: jpeg, png, gif, webp, svg, bmp, tiff, avif
License: MIT