# Getting image metadata


In this brief notebook we explore how to get image metadata, more specifically EXIF (Exchangeable image file format) data, from a file. More about the format can be found on [Wikipedia](https://en.wikipedia.org/wiki/Exif) and the implementation is based on [this answer](https://stackoverflow.com/questions/4764932). Let's use some pictures from my personal collection to see what sort of information we can extract.

To follow this notebook you will need `Pillow==8.3.1` (the package that is imported as `PIL`), `exifread==2.3.2`, and `folium==0.12.1`. Other versions might work, but since I am not making a `requirements.txt` for this notebook it is better to keep track of the versions it worked with. Methods for extracting data will be explained for both `PIL` and [`exifread`](https://github.com/ianare/exif-py); `folium` is there to do something actually useful with the data. So let's import the packages:

In [1]:
from pathlib import Path
from PIL import Image
import exifread
import folium

I put some images from my last cool trip in 2018 to play with under ['data/note-500-image-metadata'](./data/note-500-image-metadata). Here we make use of Python's `pathlib.Path` to collect all images matching the extension `jpg` in a list. We will use a single image here, but in the end you can change the index into the list and check the others.

In [2]:
file_path = Path('data/note-500-image-metadata')
file_list = list(file_path.glob('*.jpg'))
file_list

[PosixPath('data/note-500-image-metadata/20180430_195550.jpg'),
 PosixPath('data/note-500-image-metadata/20180522_140536.jpg'),
 PosixPath('data/note-500-image-metadata/20180522_205408.jpg'),
 PosixPath('data/note-500-image-metadata/20180526_090143.jpg')]
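
Note that `glob('*.jpg')` matches the pattern literally, so files named `.JPG` or `.jpeg` can be missed on case-sensitive file systems. A minimal sketch of a more permissive listing follows. It assumes a hypothetical helper `list_images` (not part of the original notebook) and that only these two suffixes matter:

```python
from pathlib import Path

# Suffixes treated as JPEG images (an assumption made for this sketch).
JPEG_SUFFIXES = {'.jpg', '.jpeg'}


def list_images(folder):
    """ Return the JPEG files in `folder`, sorted by name, ignoring suffix case. """
    return sorted(
        (p for p in Path(folder).iterdir()
         if p.is_file() and p.suffix.lower() in JPEG_SUFFIXES),
        key=lambda p: p.name,
    )
```

Calling `list_images('data/note-500-image-metadata')` should give the same four files as the listing above, in a stable order.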

Next we select one image from the list. If you want to see it you can uncomment `Image.open(selected)` below. I keep it commented to keep the file size of this notebook small (since the images are already in the repository).

In [3]:
selected = file_list[1]

# Image.open(selected)

## Using PIL.Image (not recommended)

Using the well-known `PIL.Image` is not the recommended way; we present it first so that the advantages of the recommended one become clearer. The image object has a method `getexif` (commented line) that provides (i) little data, in (ii) a format that is hard to read. Using the private method `_getexif` (which you should not do) gets more data, but good luck making any proper use of it: the keys are numeric tag IDs and several values are raw bytes.

In [4]:
# exif_data = Image.open(selected).getexif().items()
exif_data = Image.open(selected)._getexif().items()
# Drop tag 37500 (MakerNote), which holds vendor-specific binary data.
exif_data = {k: v for k, v in exif_data if k != 37500}
exif_data

{34853: {0: b'\x02\x02\x00\x00',
  1: 'N',
  2: (45.0, 20.0, 53.0),
  3: 'E',
  4: (14.0, 3.0, 4.0),
  5: b'\x00',
  6: 397.0,
  7: (12.0, 5.0, 34.0),
  29: '2018:05:22'},
 296: 2,
 34665: 214,
 271: 'samsung',
 272: 'SM-G955F',
 305: 'G955FXXU1CRD7',
 274: 6,
 306: '2018:05:22 14:05:36',
 531: 1,
 282: 72.0,
 283: 72.0,
 36864: b'0220',
 37121: b'\x01\x02\x03\x00',
 37377: 9.78,
 36867: '2018:05:22 14:05:36',
 36868: '2018:05:22 14:05:36',
 37378: 1.53,
 37379: 7.61,
 37380: 0.0,
 37381: 1.53,
 37383: 5,
 37385: 0,
 37386: 4.2,
 37510: b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00',
 40961: 1,
 40962: 4032,
 41990: 0,
 37520: '0546',
 37521: '0546',
 37522: '0546',
 40963: 3024,
 33434: 0.001141552511415525,
 40965: 852,
 33437: 1.7,
 42016: 'F12LLJA00VM F12LLKL01GM\n',
 34850: 2,
 34855: 40,
 41986: 0,
 40960: b'0100',
 41987: 0,
 41989: 26}
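
The numeric keys above are EXIF tag IDs. Pillow ships lookup tables for them in `PIL.ExifTags` (`TAGS` for the main tags and `GPSTAGS` for the nested GPS block), so the dictionary can be made somewhat more readable. A minimal sketch, assuming a hypothetical helper `label_exif` that is not part of the original notebook; IDs missing from the tables are kept as they are:

```python
from PIL.ExifTags import GPSTAGS, TAGS

GPS_INFO_ID = 34853  # tag ID holding the nested GPS dictionary


def label_exif(raw):
    """ Replace numeric EXIF tag IDs by their names where known. """
    labeled = {}
    for tag_id, value in raw.items():
        if tag_id == GPS_INFO_ID and isinstance(value, dict):
            # GPS sub-tags have their own ID space, so use GPSTAGS here.
            value = {GPSTAGS.get(k, k): v for k, v in value.items()}
        labeled[TAGS.get(tag_id, tag_id)] = value
    return labeled


# A few entries copied from the output above.
sample = {271: 'samsung', 272: 'SM-G955F', 34853: {1: 'N', 3: 'E'}}
labeled = label_exif(sample)
```

In the notebook this would be called as `label_exif(exif_data)`. Values that are raw bytes stay raw bytes, so this only addresses the naming half of the problem.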

## Using exifread (recommended)

The small and embeddable package [exifread](https://github.com/ianare/exif-py) provides what it takes to get EXIF data from an image in a well-formatted dictionary. The following snippet illustrates how to do so:

In [5]:
with open(selected, 'rb') as reader:
    exif_data = exifread.process_file(reader).items()
    # Drop the embedded thumbnail, which is raw binary data.
    exif_data = {k: v for k, v in exif_data if k != 'JPEGThumbnail'}

exif_data

{'Image Make': (0x010F) ASCII=samsung @ 162,
 'Image Model': (0x0110) ASCII=SM-G955F @ 170,
 'Image Orientation': (0x0112) Short=Rotated 90 CW @ 42,
 'Image XResolution': (0x011A) Ratio=72 @ 146,
 'Image YResolution': (0x011B) Ratio=72 @ 154,
 'Image ResolutionUnit': (0x0128) Short=Pixels/Inch @ 78,
 'Image Software': (0x0131) ASCII=G955FXXU1CRD7 @ 180,
 'Image DateTime': (0x0132) ASCII=2018:05:22 14:05:36 @ 194,
 'Image YCbCrPositioning': (0x0213) Short=Centered @ 114,
 'Image ExifOffset': (0x8769) Long=214 @ 126,
 'GPS GPSVersionID': (0x0000) Byte=[2, 2, 0, 0] @ 892,
 'GPS GPSLatitudeRef': (0x0001) ASCII=N @ 904,
 'GPS GPSLatitude': (0x0002) Ratio=[45, 20, 53] @ 996,
 'GPS GPSLongitudeRef': (0x0003) ASCII=E @ 928,
 'GPS GPSLongitude': (0x0004) Ratio=[14, 3, 4] @ 1020,
 'GPS GPSAltitudeRef': (0x0005) Byte=0 @ 952,
 'GPS GPSAltitude': (0x0006) Ratio=397 @ 1044,
 'GPS GPSTimeStamp': (0x0007) Ratio=[12, 5, 34] @ 1064,
 'GPS GPSDate': (0x001D) ASCII=2018:05:22 @ 1052,
 'Image GPSInfo': (0
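
Dates in EXIF, such as the `Image DateTime` value above, are plain strings in the `YYYY:MM:DD HH:MM:SS` layout. A minimal sketch of converting one to a `datetime` with the standard library, assuming a hypothetical helper `parse_exif_datetime` and that the string has already been pulled out of the tag:

```python
from datetime import datetime

# Layout used by the EXIF date strings shown above.
EXIF_DATETIME_FORMAT = '%Y:%m:%d %H:%M:%S'


def parse_exif_datetime(text):
    """ Parse an EXIF date string; no time zone is attached. """
    return datetime.strptime(text.strip(), EXIF_DATETIME_FORMAT)


taken = parse_exif_datetime('2018:05:22 14:05:36')
```

The result is a naive `datetime`, since this string alone carries no time zone information.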

There is a bunch of information above, but let's imagine you forgot where a picture was taken. Well, if your camera or cellphone has an embedded GPS, you can find out with the extracted metadata. The following function extracts the latitude and longitude from the format exported by `exifread` and provides the results in decimal degrees.

In [6]:
def get_exif_location(exif_data):
    """ Returns the latitude and longitude from exif data.

    Based on https://gist.github.com/snakeye/fdc372dbf11370fe29eb which
    in turn is based on https://gist.github.com/erans/983821.
    """

    def to_degrees(value):
        """ Convert the GPS coordinates stored in the EXIF to degrees. """
        # Each entry is a ratio with a numerator and a denominator.
        d = float(value.values[0].num) / float(value.values[0].den)
        m = float(value.values[1].num) / float(value.values[1].den)
        s = float(value.values[2].num) / float(value.values[2].den)
        return d + m / 60 + s / 3600

    lat, lon = None, None
    gps_lat_val = exif_data.get('GPS GPSLatitude', None)
    gps_lat_ref = exif_data.get('GPS GPSLatitudeRef', None)
    gps_lon_val = exif_data.get('GPS GPSLongitude', None)
    gps_lon_ref = exif_data.get('GPS GPSLongitudeRef', None)

    if gps_lat_val and gps_lat_ref and gps_lon_val and gps_lon_ref:
        lat = to_degrees(gps_lat_val)
        lon = to_degrees(gps_lon_val)

        # Southern latitudes and western longitudes are negative.
        lat = -lat if gps_lat_ref.values[0] != 'N' else lat
        lon = -lon if gps_lon_ref.values[0] != 'E' else lon

    return lat, lon
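
To make the conversion concrete, here is the same arithmetic applied to the values printed earlier: 45° 20′ 53″ N is 45 + 20/60 + 53/3600 ≈ 45.3481, and 14° 3′ 4″ E is 14 + 3/60 + 4/3600 ≈ 14.0511. The sketch below works on plain numbers instead of `exifread` tags and uses a hypothetical helper `dms_to_decimal` that is not part of the original notebook:

```python
from fractions import Fraction


def dms_to_decimal(degrees, minutes, seconds, ref):
    """ Convert degrees/minutes/seconds plus a hemisphere letter to decimal degrees. """
    value = Fraction(degrees) + Fraction(minutes) / 60 + Fraction(seconds) / 3600
    # South and west are negative by convention.
    sign = -1 if ref in ('S', 'W') else 1
    return float(sign * value)


# Values taken from the GPS block of the selected image.
lat = dms_to_decimal(45, 20, 53, 'N')
lon = dms_to_decimal(14, 3, 4, 'E')
```

Using `Fraction` keeps the intermediate sums exact; the conversion to `float` happens only once, at the end.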

So here we go: we get the coordinates from the image and, with the help of `folium`, check on the map where it was taken.

In [7]:
coordinates = get_exif_location(exif_data)

m = folium.Map(location=coordinates)
folium.Marker(coordinates).add_to(m)
m