# Text in Image Files

In this notebook we look at several ways to read text embedded in PNG and JPEG image files.

In [1]:
import numpy as np
import matplotlib.pyplot as plt

from PIL import Image
from PIL.ExifTags import TAGS

import exiftool

## The Linux `strings` Command

We saw an example of this command in module 1.1. Here we take another look.

The `strings` command scans a file and prints the sequences of bytes that could be text. By default it reports runs of at least four printable characters.
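To make the idea concrete, here is a minimal Python sketch of the same technique. It is not how `strings` itself is implemented. The function name `find_strings` and the sample bytes are made up for illustration, and the sketch assumes that printable means ASCII space through tilde, plus tab.

```python
import re

def find_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of at least min_len printable ASCII bytes found in data."""
    # Assumption: "printable" means space (0x20) through tilde (0x7e), plus tab.
    pattern = re.compile(rb'[\t\x20-\x7e]{%d,}' % min_len)
    return [match.decode('ascii') for match in pattern.findall(data)]

# Hand-made sample bytes that imitate the start of a PNG file.
sample = b'\x00\x00\x00\rIHDR\x00\x00\x02\x00\x08\x02tEXtparameters\x00old dog\xdb[\x07\xe3'
print(find_strings(sample))
```

The lone `[` near the end of the sample is printable, but it is dropped because the run is shorter than four characters. That same rule explains the junk in the real output: random bytes in compressed image data sometimes form short printable runs by chance.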

Here is an image I created while trying to make something that looks like Sarah's dog Stormy.

<img src='../data/image.png'>

The command `strings ../data/image.png` produces a great deal of output, most of it junk, but the useful information is near the top. The command `strings ../data/image.png | head` pipes that output to `head`, which shows only the first ten lines.

In [2]:
!strings ../data/image.png | head

IHDR
tEXtparameters
((old dog)), mixed breed dachsund, and boykin spaniel, long hair, dark brown
Steps: 20, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 716484137, Size: 512x512, Model hash: aadddd3d75, Model: deliberate_v3, Version: v1.6.0
IDATx
*+3c
F#h$h
O@$A@"!$
TJJe*33
9""2"


## PNG

The PNG (Portable Network Graphics) format stores text in chunks of type `tEXt`. After the `tEXt` chunk type there is a keyword, `parameters` in this case, which is followed by a null byte, `\x00`.

After the null byte comes the text.
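As a small sketch of that layout, the label (keyword) and the text can be separated by splitting at the first null byte. The `payload` value below is a hand-shortened sample, not the real chunk from image.png.

```python
# A tEXt payload, shortened by hand for illustration.
payload = b'parameters\x00((old dog)), long hair, dark brown'

# partition splits at the first null byte: keyword before it, text after it.
keyword, separator, text = payload.partition(b'\x00')

# The PNG specification defines tEXt keywords and text as Latin-1.
print(keyword.decode('latin-1'))
print(text.decode('latin-1'))
```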

In the cell below, I read the first 300 bytes of image.png and store them in a bytes object named `dog_bytes`.

In [3]:
with open('../data/image.png','rb') as file:
    dog_bytes = file.read(300)
    
dog_bytes

b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x02\x00\x00\x00\x02\x00\x08\x02\x00\x00\x00{\x1aC\xad\x00\x00\x00\xe8tEXtparameters\x00((old dog)), mixed breed dachsund, and boykin spaniel, long hair, dark brown\nSteps: 20, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 716484137, Size: 512x512, Model hash: aadddd3d75, Model: deliberate_v3, Version: v1.6.0\xdb[\x07\xe3\x00\x01\x00\x00IDATx\x9cd\xfdY\xb7dir\x1d\x06\xeemv\x8e'

The chunk type `tEXt` starts at index 37. You can slice a bytes object just as you would a string or list. Here are bytes 0 through 36 of `dog_bytes`.

In [8]:
dog_bytes[:37]

b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x02\x00\x00\x00\x02\x00\x08\x02\x00\x00\x00{\x1aC\xad\x00\x00\x00\xe8'

Look at the four bytes before tEXt: `\x00\x00\x00\xe8`. Read as one hexadecimal number, they give 14*16+8 = 232. You can have Python do that arithmetic for you:

In [19]:
0x000000e8

232
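Python can also do the conversion directly on the bytes, without retyping them as a hex literal. This is a small sketch. The variable name `length_field` is made up, and its value is copied by hand from the output above.

```python
import struct

length_field = b'\x00\x00\x00\xe8'  # the four bytes just before tEXt

# PNG stores chunk lengths as 4-byte big-endian unsigned integers.
print(int.from_bytes(length_field, byteorder='big'))
print(struct.unpack('>I', length_field)[0])
```

Both lines print 232. With the bytes from the file, the same value would come from `int.from_bytes(dog_bytes[33:37], 'big')`.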

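232 is the length of the chunk's data, which begins right after the 4-byte chunk type `tEXt`. The type and the data together therefore occupy 232+4 bytes. To check this, look at the slice that starts at index 37 and ends before 37+232+4. The +4 is needed because the length field counts only the data, not the chunk type.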

In [27]:
dog_bytes[37:37+232+4]

b'tEXtparameters\x00((old dog)), mixed breed dachsund, and boykin spaniel, long hair, dark brown\nSteps: 20, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 716484137, Size: 512x512, Model hash: aadddd3d75, Model: deliberate_v3, Version: v1.6.0'

Here is the part of the PNG specification that describes the tEXt chunk:

https://www.w3.org/TR/png-3/#11tEXt
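Every PNG chunk has the same layout: a 4-byte length, a 4-byte type, the data, and a 4-byte CRC. The sketch below walks through the chunks of a PNG held in memory and checks each CRC. The names `iter_chunks` and `make_chunk` are made up for this example. The sample is built by hand, and its IHDR data is 13 zero bytes, so it is structurally a PNG but not a viewable image.

```python
import struct
import zlib

PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'

def iter_chunks(data: bytes):
    """Yield (chunk_type, chunk_data, crc_ok) for each complete chunk in data."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError('not a PNG file')
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # 4-byte big-endian length, then the 4-byte chunk type.
        length, chunk_type = struct.unpack('>I4s', data[pos:pos + 8])
        end = pos + 8 + length
        if end + 4 > len(data):
            break  # chunk is cut off, e.g. when only part of the file was read
        chunk_data = data[pos + 8:end]
        (stored_crc,) = struct.unpack('>I', data[end:end + 4])
        # The CRC covers the type and the data, but not the length field.
        crc_ok = zlib.crc32(chunk_type + chunk_data) == stored_crc
        yield chunk_type, chunk_data, crc_ok
        pos = end + 4

def make_chunk(chunk_type: bytes, chunk_data: bytes) -> bytes:
    """Build one chunk; used here only to create sample bytes for the demo."""
    crc = zlib.crc32(chunk_type + chunk_data)
    return (struct.pack('>I', len(chunk_data)) + chunk_type
            + chunk_data + struct.pack('>I', crc))

# Hand-built sample: placeholder IHDR data, one tEXt chunk, and IEND.
sample = (PNG_SIGNATURE
          + make_chunk(b'IHDR', bytes(13))
          + make_chunk(b'tEXt', b'parameters\x00old dog')
          + make_chunk(b'IEND', b''))

for chunk_type, chunk_data, crc_ok in iter_chunks(sample):
    print(chunk_type, len(chunk_data), crc_ok)
```

In this sample, as in image.png, the IHDR chunk takes 25 bytes after the 8-byte signature. That puts the tEXt length field at indexes 33 to 36 and the chunk type at index 37. Because `iter_chunks` stops at an incomplete chunk, it should also work on a partial read such as the first 300 bytes of a file.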

### Exercise

Read the first 500 bytes of the image at the path defined below and store it in a variable. Find the length of the tEXt chunk and use a slice to extract just the tEXt chunk.

In [9]:
path = '../data/00002-430620326.png'

Here is the image at that path:

<img src='../data/00002-430620326.png' width=500>

In [7]:
!strings ../data/00002-430620326.png | head

IHDR
tEXtparameters
mj, cinematic close up photo of an ethereal neural network organism, divine woman, anatomical face, ((biomechanical details))
Negative prompt: [deformed | disfigured], poorly drawn, [bad : wrong] anatomy, [extra | missing | floating | disconnected] limb, (mutated hands and fingers), blurry
Steps: 20, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 430620326, Size: 768x1024, Model hash: aadddd3d75, Model: deliberate_v3, Version: v1.6.0
IDATx
73]D
RJJQB2
<Uum
ovC@om


In [13]:
with open(path,'rb') as file:
    womenmag_bytes = file.read(500)
    
womenmag_bytes

b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x03\x00\x00\x00\x04\x00\x08\x02\x00\x00\x00\xd9D\xa9W\x00\x00\x01\xbftEXtparameters\x00mj, cinematic close up photo of an ethereal neural network organism, divine woman, anatomical face, ((biomechanical details))\nNegative prompt: [deformed | disfigured], poorly drawn, [bad : wrong] anatomy, [extra | missing | floating | disconnected] limb, (mutated hands and fingers), blurry\nSteps: 20, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 430620326, Size: 768x1024, Model hash: aadddd3d75, Model: deliberate_v3, Version: v1.6.0\xa0\x97\xcb\xb3\x00\x01\x00\x00IDAT'

In [14]:
0x0001bf

447

In [16]:
womenmag_bytes[37:37+447+4]

b'tEXtparameters\x00mj, cinematic close up photo of an ethereal neural network organism, divine woman, anatomical face, ((biomechanical details))\nNegative prompt: [deformed | disfigured], poorly drawn, [bad : wrong] anatomy, [extra | missing | floating | disconnected] limb, (mutated hands and fingers), blurry\nSteps: 20, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 430620326, Size: 768x1024, Model hash: aadddd3d75, Model: deliberate_v3, Version: v1.6.0'

## JPEG

JPEG files are not as easy to deal with as PNG files.

The file `../data/ape.jpeg` is an image I found at civitai.com.

<img src='../data/ape.jpeg' width=350>

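JPEG supports Exif (Exchangeable image file format) metadata.

I imported the `Image` module from PIL at the top of the notebook. An image opened with it has a method called `_getexif()`. The leading underscore marks the method as private, so it may change between Pillow versions.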

Here is what that function finds about the ape.jpeg image:

In [28]:
ape = Image.open('../data/ape.jpeg')

ape_exif = ape._getexif()
type(ape_exif)

dict

In [3]:
ape_exif

{34665: 26,
 37510: b'UNICODE\x00\x00(\x00(\x00(\x00m\x00o\x00n\x00k\x00e\x00y\x00 \x00(\x00(\x00i\x00n\x00 \x00c\x00o\x00l\x00o\x00r\x00f\x00u\x00l\x00 \x00s\x00u\x00i\x00t\x00)\x00)\x00 \x00s\x00i\x00t\x00t\x00i\x00n\x00g\x00 \x00o\x00n\x00 \x00t\x00r\x00e\x00e\x00 \x00b\x00r\x00a\x00n\x00c\x00h\x00,\x00 \x00d\x00a\x00r\x00k\x00 \x00m\x00y\x00s\x00t\x00e\x00r\x00i\x00o\x00u\x00s\x00 \x00f\x00o\x00r\x00e\x00s\x00t\x00,\x00 \x00d\x00r\x00a\x00m\x00a\x00t\x00i\x00c\x00 \x00l\x00i\x00g\x00h\x00t\x00i\x00n\x00g\x00,\x00 \x00d\x00a\x00r\x00k\x00 \x00p\x00h\x00o\x00t\x00o\x00,\x00 \x00n\x00i\x00g\x00h\x00t\x00)\x00)\x00)\x00,\x00,\x00 \x00b\x00e\x00s\x00t\x00 \x00q\x00u\x00a\x00l\x00i\x00t\x00y\x00,\x00 \x00u\x00l\x00t\x00r\x00a\x00 \x00h\x00i\x00g\x00h\x00 \x00r\x00e\x00s\x00,\x00 \x00(\x00p\x00h\x00o\x00t\x00o\x00r\x00e\x00a\x00l\x00i\x00s\x00t\x00i\x00c\x00:\x001\x00.\x004\x00)\x00,\x00,\x00 \x00h\x00i\x00g\x00h\x00 \x00r\x00e\x00s\x00o\x00l\x00u\x00t\x00i\x00o\x00n\x00,\x00 \x00d\x00e\x

The dictionary `ape_exif` has two items, with the keys 34665 and 37510. The keys are Exif tag numbers.

Tag 37510 is the UserComment field.

http://www.digitalvitral.com/wiki/exif/fields
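The `TAGS` dictionary imported from `PIL.ExifTags` at the top of the notebook can do this lookup in code. This is a minimal sketch. The names it prints depend on the installed Pillow version, but 37510 should come back as `UserComment`.

```python
from PIL.ExifTags import TAGS

for tag_id in (34665, 37510):
    # TAGS maps a numeric Exif tag id to its name.
    # Unknown ids fall back to the number itself.
    print(tag_id, hex(tag_id), TAGS.get(tag_id, tag_id))
```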

Look at the first 150 bytes of the user comment section.

In [29]:
ape_exif[37510][:150]

b'UNICODE\x00\x00(\x00(\x00(\x00m\x00o\x00n\x00k\x00e\x00y\x00 \x00(\x00(\x00i\x00n\x00 \x00c\x00o\x00l\x00o\x00r\x00f\x00u\x00l\x00 \x00s\x00u\x00i\x00t\x00)\x00)\x00 \x00s\x00i\x00t\x00t\x00i\x00n\x00g\x00 \x00o\x00n\x00 \x00t\x00r\x00e\x00e\x00 \x00b\x00r\x00a\x00n\x00c\x00h\x00,\x00 \x00d\x00a\x00r\x00k\x00 \x00m\x00y\x00s\x00t\x00e\x00r\x00i\x00o\x00u\x00s\x00 '

When Python shows a bytes object, it shows printable ASCII bytes as the characters they represent, and other bytes as \x followed by two hex digits.

In [33]:
ape_exif[37510][14:26]

b'\x00m\x00o\x00n\x00k\x00e\x00y'

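This text is UTF-16 encoded. Each character here is represented by two bytes. (UTF-16 needs four bytes for characters outside the Basic Multilingual Plane, but none appear in this text.)

Bytes 14 and 15 are shown as `\x00m`, which is the character `m`.

The first 8 bytes are not UTF-16. They are the ASCII text UNICODE followed by a null byte, and they identify the encoding of the rest of the comment.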

In [36]:
ape_exif[37510][8:].decode('utf-16-be')

'(((monkey ((in colorful suit)) sitting on tree branch, dark mysterious forest, dramatic lighting, dark photo, night))),, best quality, ultra high res, (photorealistic:1.4),, high resolution, detailed, raw photo, sharp re, by lee jeffries nikon d850 film stock photograph 4 kodak portra 400 camera f1.6 lens rich colors hyper realistic lifelike texture dramatic lighting unrealengine trending on artstation cinestill 800,, photorealistic, photo, masterpiece, realistic, realism, photorealism, high contrast, photorealistic digital art trending on Artstation 8k HD high definition detailed realistic, detailed, skin texture, hyper detailed, realistic skin texture, armature, ((HD, clear, sharp, in focus)), (((national geographic)))\nNegative prompt: (((person))), (((portrait))), EasyNegative, paintings, sketches, (worst quality:2), (low quality:2), (normal quality:2), lowres, normal quality, ((monochrome)), ((grayscale)), skin spots, acnes, skin blemishes, age spot, glans,extra fingers,fewer fin

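Why is there `-be` after `utf-16`? It stands for big endian, as opposed to little endian. In big-endian order the high-order byte of each two-byte pair comes first, which is why `m` is stored as `\x00m` rather than `m\x00`.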
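A quick way to see the difference is to encode a single character both ways. This is a small illustrative sketch and uses no data from the image.

```python
text = 'm'

print(text.encode('utf-16-be'))   # high-order byte first
print(text.encode('utf-16-le'))   # low-order byte first

# Decoding big-endian bytes with the wrong byte order gives a different character.
print(b'\x00m'.decode('utf-16-le'))
```

The last line does not raise an error. It prints an unrelated character, so a wrong byte order shows up as garbled text rather than an exception.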


### Exercise

The image file `../data/wolf.jpeg` is another image I found at civitai.com.

Here is the image:

<img src='../data/wolf.jpeg' width=600>

Extract the user comment chunk from the wolf image.

In [19]:
ape1 = Image.open('../data/wolf.jpeg')

ape1_exif = ape1._getexif()
type(ape1_exif)

dict

In [20]:
ape1_exif

{34665: 26,
 37510: b'UNICODE\x00\x00w\x00o\x00l\x00f\x00 \x00w\x00e\x00a\x00r\x00i\x00n\x00g\x00 \x00s\x00h\x00e\x00e\x00p\x00 \x00c\x00o\x00s\x00t\x00u\x00m\x00e\x00,\x00 \x00m\x00a\x00n\x00y\x00 \x00s\x00h\x00e\x00e\x00p\x00 \x00i\x00n\x00 \x00t\x00h\x00e\x00 \x00b\x00a\x00c\x00k\x00g\x00r\x00o\x00u\x00n\x00d\x00,\x00 \x00o\x00u\x00t\x00d\x00o\x00o\x00r\x00s\x00,\x00 \x00f\x00a\x00r\x00m\x00,\x00 \x00m\x00a\x00s\x00t\x00e\x00r\x00p\x00i\x00e\x00c\x00e\x00,\x00 \x00b\x00e\x00s\x00t\x00 \x00q\x00u\x00a\x00l\x00i\x00t\x00y\x00,\x00 \x00f\x00u\x00n\x00n\x00y\x00\n\x00N\x00e\x00g\x00a\x00t\x00i\x00v\x00e\x00 \x00p\x00r\x00o\x00m\x00p\x00t\x00:\x00 \x00l\x00o\x00w\x00 \x00q\x00u\x00a\x00l\x00i\x00t\x00y\x00,\x00 \x00w\x00o\x00r\x00s\x00t\x00 \x00q\x00u\x00a\x00l\x00i\x00t\x00y\x00,\x00 \x00m\x00o\x00n\x00o\x00c\x00h\x00r\x00o\x00m\x00e\x00,\x00 \x00l\x00o\x00w\x00r\x00e\x00s\x00,\x00 \x00p\x00a\x00i\x00n\x00t\x00i\x00n\x00g\x00,\x00 \x00c\x00r\x00a\x00y\x00o\x00n\x00,\x00 \x00s\x00k\x00e\

In [28]:
ape1_exif[37510][8:].decode('utf-16-be')

'wolf wearing sheep costume, many sheep in the background, outdoors, farm, masterpiece, best quality, funny\nNegative prompt: low quality, worst quality, monochrome, lowres, painting, crayon, sketch, graphite, impressionist, noisy, blurry, long neck, long torso, bad anatomy, Cropped, Fake\nSteps: 20, Sampler: Euler a, CFG scale: 7, Seed: 1005428236, Face restoration: CodeFormer, Size: 1024x1024, Model hash: 31e35c80fc, Model: sd_xl_base_1.0, Version: v1.5.1'
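Both images needed the same two steps: skip the 8-byte prefix, then decode the rest. Here is a minimal sketch of a helper that does both. The name `decode_user_comment` is made up. The sketch assumes big-endian UTF-16 after a `UNICODE` prefix, because that is what these two files contain and the prefix does not state the byte order. It handles only the `UNICODE` and `ASCII` prefixes.

```python
def decode_user_comment(raw: bytes, utf16_codec: str = 'utf-16-be') -> str:
    """Decode an Exif UserComment value: 8-byte character code, then the text."""
    code, body = raw[:8], raw[8:]
    if code == b'UNICODE\x00':
        # Assumption: big-endian, as seen in the ape and wolf images.
        return body.decode(utf16_codec)
    if code == b'ASCII\x00\x00\x00':
        return body.decode('ascii')
    raise ValueError(f'unhandled character code: {code!r}')

# Hand-built sample in the same shape as the wolf comment.
sample = b'UNICODE\x00' + 'wolf wearing sheep costume'.encode('utf-16-be')
print(decode_user_comment(sample))
```

With the real data, the call would be `decode_user_comment(ape1_exif[37510])`.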

## exiftool

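The `exiftool` package (PyExifTool) is one that I found just yesterday. It is a Python wrapper around the ExifTool command-line program, and it appears to be a really good package for extracting Exif data from images.

Note that `get_metadata` takes a list, so you can pass several image files at once.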

In [42]:
files = ['../data/ape.jpeg']

with exiftool.ExifToolHelper() as et:
    meta = et.get_metadata(files)

type(meta),len(meta),type(meta[0])

(list, 1, dict)

In [44]:
for k,v in meta[0].items():
    print(k,v,'\n')

SourceFile ../data/ape.jpeg 

ExifTool:ExifToolVersion 12.6 

File:FileName ape.jpeg 

File:Directory ../data 

File:FileSize 65041 

File:FileModifyDate 2023:09:26 15:04:34-04:00 

File:FileAccessDate 2023:10:09 15:46:47-04:00 

File:FileInodeChangeDate 2023:09:26 15:04:34-04:00 

File:FilePermissions 100644 

File:FileType JPEG 

File:FileTypeExtension JPG 

File:MIMEType image/jpeg 

File:ExifByteOrder II 

File:ImageWidth 480 

File:ImageHeight 960 

File:EncodingProcess 0 

File:BitsPerSample 8 

File:ColorComponents 3 

File:YCbCrSubSampling 2 2 

JFIF:JFIFVersion 1 1 

JFIF:ResolutionUnit 1 

JFIF:XResolution 96 

JFIF:YResolution 96 

EXIF:UserComment (((monkey ((in colorful suit)) sitting on tree branch, dark mysterious forest, dramatic lighting, dark photo, night))),, best quality, ultra high res, (photorealistic:1.4),, high resolution, detailed, raw photo, sharp re, by lee jeffries nikon d850 film stock photograph 4 kodak portra 400 camera f1.6 lens rich colors hyper realist

All of the images I have gotten from civitai.com have information about how the image was created.
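When you only want the creation information, you can pick the `EXIF:UserComment` entry out of each dictionary instead of printing every key. The sketch below works on a hand-written sample shaped like the list that `get_metadata` returned above, so it runs without exiftool installed. The function name `user_comments` and the second file, `no_comment.jpeg`, are hypothetical.

```python
def user_comments(metadata):
    """Map each file path to its EXIF:UserComment, or None when the tag is absent."""
    return {item['SourceFile']: item.get('EXIF:UserComment') for item in metadata}

# Hand-written sample shaped like the list of dicts printed above.
sample_meta = [
    {'SourceFile': '../data/ape.jpeg', 'File:FileType': 'JPEG',
     'EXIF:UserComment': '(((monkey ((in colorful suit)) sitting on tree branch'},
    # Hypothetical file with no user comment.
    {'SourceFile': '../data/no_comment.jpeg', 'File:FileType': 'JPEG'},
]

for path, comment in user_comments(sample_meta).items():
    print(path, '->', comment)
```

Using `.get` rather than indexing avoids a `KeyError` for files that carry no user comment, such as a photo straight from a camera.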

### Exercise

Here is a cat image I got from civitai.com.  The path is ../data/cat.png

<img src='../data/cat.png' width=600>

In [30]:
with exiftool.ExifToolHelper() as et:
    meta = et.get_metadata(['../data/cat.png'])
    
for k,v in meta[0].items():
    print(k,v,'\n')

SourceFile ../data/cat.png 

ExifTool:ExifToolVersion 12.6 


File:FileName cat.png 

File:Directory ../data 

File:FileSize 2677646 

File:FileModifyDate 2023:10:26 10:21:53-04:00 

File:FileAccessDate 2023:10:31 11:04:24-04:00 

File:FileInodeChangeDate 2023:10:26 10:22:14-04:00 

File:FilePermissions 100644 

File:FileType PNG 

File:FileTypeExtension PNG 

File:MIMEType image/png 

File:ExifByteOrder II 

PNG:ImageWidth 1024 

PNG:ImageHeight 1536 

PNG:BitDepth 8 

PNG:ColorType 6 

PNG:Compression 0 

PNG:Filter 0 

PNG:Interlace 0 

PNG:PixelsPerUnitX 3780 

PNG:PixelsPerUnitY 3780 

PNG:PixelUnits 1 

PNG:Parameters (Binary data 674 bytes, use -b option to extract) 

EXIF:UserComment a cute kitten made out of metal, (cyborg:1.1), ([tail | detailed wire]:1.3), (intricate details), hdr, (intricate details, hyperdetailed:1.2), cinematic shot, vignette, centered

Negative prompt: (worst quality, low quality, normal quality, lowres, low details, oversaturated, undersaturated, overex

Use exiftool to extract information about the cat image.

Bella

<img src='../data/bella.JPG' width=600>

In [3]:
with exiftool.ExifToolHelper() as et:
    meta = et.get_metadata(['../data/bella.JPG'])
    
for k,v in meta[0].items():
    print(k,v,'\n')

SourceFile ../data/bella.JPG 

ExifTool:ExifToolVersion 12.6 

File:FileName bella.JPG 

File:Directory ../data 

File:FileSize 3907329 

File:FileModifyDate 2023:10:09 15:53:15-04:00 

File:FileAccessDate 2023:10:09 15:54:39-04:00 

File:FileInodeChangeDate 2023:10:09 15:53:15-04:00 

File:FilePermissions 100644 

File:FileType JPEG 

File:FileTypeExtension JPG 

File:MIMEType image/jpeg 

File:ExifByteOrder MM 

File:ImageWidth 4032 

File:ImageHeight 3024 

File:EncodingProcess 0 

File:BitsPerSample 8 

File:ColorComponents 3 

File:YCbCrSubSampling 2 2 

EXIF:Make Apple 

EXIF:Model iPhone SE (3rd generation) 

EXIF:Orientation 6 

EXIF:XResolution 72 

EXIF:YResolution 72 

EXIF:ResolutionUnit 2 

EXIF:Software 16.5 

EXIF:ModifyDate 2023:09:09 21:49:20 

EXIF:HostComputer iPhone SE (3rd generation) 

EXIF:YCbCrPositioning 1 

EXIF:ExposureTime 0.04166666667 

EXIF:FNumber 1.8 

EXIF:ExposureProgram 2 

EXIF:ISO 640 

EXIF:ExifVersion 0232 

EXIF:DateTimeOriginal 2023:09:09 21:49