# Media GeoTag Mapper (MGTM): iPhone Media Example 

By Kenneth Burchfiel

Released under the MIT license

GitHub link: https://github.com/kburchfiel/media_geotag_mapper

This notebook shows how to use the functions in media_mapper_functions.py to retrieve geographic coordinates from iPhone images and videos, then store them as interactive maps. **For further documentation on this code, please reference my main Media Geotag Mapper tutorial (media_geotag_mapper_tutorial_v10.ipynb).** The documentation and comments in this notebook will focus on how to get the code to work with iPhone media.

In order to create maps showing the paths traveled between geotags, it's crucial to have your geotag data sorted in chronological order. I found that, for iPhone data, the timestamps stored within the 'alt_capture_time' column (created through the generate_loc_list function) best represented the actual time that pictures and video clips were taken. Therefore, this notebook sorts the geotag data by 'alt_capture_time' instead of 'modified_date' (which worked well for mapping Samsung media data). 
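
As a rough illustration of why the choice of sort column matters, the sketch below compares a plain string sort of 'alt_capture_time' with a sort on parsed UTC instants. The sample rows are made up, and the assumption that the timestamps are ISO-style strings carrying UTC offsets is for illustration only. When all files share one offset the two orderings agree, but they can diverge when offsets differ.

```python
import pandas as pd

# Hypothetical sample rows; real values come from generate_loc_list.
df = pd.DataFrame({
    'file': ['b.jpg', 'a.jpg', 'c.mov'],
    'alt_capture_time': [
        '2022-03-01 09:00:00-07:00',
        '2022-03-01 10:00:00-04:00',
        '2022-03-01 08:30:00-07:00',
    ],
})

# Plain string sort: orders rows by local clock time as written.
by_string = df.sort_values('alt_capture_time')

# Parse to UTC first, then sort by the actual instant of capture.
by_instant = df.assign(
    utc_time=pd.to_datetime(df['alt_capture_time'], utc=True)
).sort_values('utc_time')

print(by_string['file'].tolist())   # ['c.mov', 'b.jpg', 'a.jpg']
print(by_instant['file'].tolist())  # ['a.jpg', 'c.mov', 'b.jpg']
```

For media captured within a single time zone, the string sort used in this notebook produces the same order as the UTC-based sort.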

Since I don't own an iPhone, I tested out this notebook on iPhone image and video files that a friend had taken. To protect that friend's privacy, I won't be sharing the maps or other output files here, but those files are very similar to the ones found in the main tutorial notebook.


In [1]:
import time
start_time = time.time() # Allows the program's runtime to be measured
import os
import pandas as pd
import datetime
import matplotlib.pyplot as plt
import IPython.display
from IPython.display import display

from media_geotag_functions_v2 import generate_media_list, \
    retrieve_pic_locations, retrieve_clip_locations, generate_loc_list, \
    map_media_locations, folder_list_to_map, flip_lon, create_map_screenshot, \
    calculate_distance_by_year, convert_png_to_smaller_jpg, \
    batch_create_map_screenshots, batch_convert_pngs_to_smaller_jpgs

In [2]:
folder_name = 'combined'

top_folder_list = ['temp_iphone_image_data']

In [3]:
generate_new_lists = True # Set to False to skip regenerating the lists 
# and rely on the CSV files read in below

In [4]:
if generate_new_lists:
    df_media = generate_media_list(top_folder_list = top_folder_list,
        folder_name = folder_name)

In [5]:
if generate_new_lists:
    df_all_locations = generate_loc_list(df_media = df_media,
        folder_name = folder_name)

Retrieving picture locations:


100%|██████████| 2429/2429 [00:53<00:00, 45.66it/s] 


Retrieving clip locations:


100%|██████████| 320/320 [00:53<00:00,  6.00it/s]


In [6]:
df_media = pd.read_csv(f'{folder_name}_media_list.csv')
# df_media.head(5)

I'll next create df_all_locations, a DataFrame containing all location data provided for the files in df_media. I'll limit the sample output of this notebook to geotags taken within Colorado.

In [7]:
df_all_locations = pd.read_csv(
    f'{folder_name}_media_locations.csv').reset_index(drop=True)
# df_all_locations

In [8]:
if generate_new_lists:
    df_all_locations = flip_lon(df_all_locations, lat_south_bound = 25,
        lat_north_bound = 45, lon_west_bound = 70, lon_east_bound = 95)
    df_all_locations.to_csv(f'{folder_name}_media_locations.csv', index = False)
    df_all_locations = pd.read_csv(f'{folder_name}_media_locations.csv')

In [9]:
# df_all_locations.query("lon > 70 & lon < 95 & lat > 25 & lat < 45") 
# Confirms that I flipped all longitude coordinates within this frame
# back to their correct value
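
The implementation of flip_lon isn't shown in this notebook. The sketch below is only a guess at the general idea, based on the commented-out check above: longitudes that were stored as positive values inside a bounding box get negated. The function name flip_lon_sketch and the sample coordinates are hypothetical, and the real function may behave differently.

```python
import pandas as pd

def flip_lon_sketch(df, lat_south_bound, lat_north_bound,
                    lon_west_bound, lon_east_bound):
    """Hypothetical stand-in for flip_lon: negate longitudes that fall
    inside the given (positive-longitude) bounding box."""
    in_box = ((df['lat'] > lat_south_bound) & (df['lat'] < lat_north_bound)
              & (df['lon'] > lon_west_bound) & (df['lon'] < lon_east_bound))
    result = df.copy()  # Leave the caller's DataFrame untouched
    result.loc[in_box, 'lon'] = -result.loc[in_box, 'lon']
    return result

# Made-up rows: the first sits inside the box, the second does not.
sample = pd.DataFrame({'lat': [38.9, 48.9], 'lon': [77.0, 2.35]})
fixed = flip_lon_sketch(sample, lat_south_bound = 25, lat_north_bound = 45,
                        lon_west_bound = 70, lon_east_bound = 95)
print(fixed['lon'].tolist())  # [-77.0, 2.35]
```

After a flip like this, a query for rows still inside the positive-longitude box should return no rows, which is what the commented-out check is meant to confirm.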

The following code removes rows from df_all_locations that lack geotag data (a lat or lon value of 0) or whose EXIF data or metadata didn't provide an alt_capture_time value (stored as 'x'). It then sorts the remaining rows by alt_capture_time. These steps prepare the dataset for mapping.

In [10]:
df_all_locations = df_all_locations.query(
    "lat != 0 & lon != 0 & alt_capture_time != 'x'").sort_values(
        'alt_capture_time')

df_all_locations.to_csv('df_all_locations.csv', index = False)

In [11]:
absolute_path_to_map_folder = \
r'D:\iPhone_media_geotag_tests\maps'
screenshot_save_path = 'map_screenshots'

In [12]:
# df_all_locations

The calls to map_media_locations below pass 'alt_capture_time' as the timestamp_column argument, since this column best represented the actual time that iPhone photos and clips were taken.

In [13]:
combined_map = map_media_locations(df_all_locations, folder_path = 'maps', 
timestamp_column = 'alt_capture_time',
file_name = 'combined', zoom_start = 6)

Added 747 markers to the map.


In [14]:
create_map_screenshot(absolute_path_to_map_folder = 
absolute_path_to_map_folder, map_name= 'combined_locations.html', 
screenshot_save_path = screenshot_save_path)

convert_png_to_smaller_jpg(png_folder = 'map_screenshots', 
png_image_name = 'combined_locations.png', jpg_folder = 'smaller_screenshots', 
reduction_factor = 1, quality_factor = 50)

# IPython.display.display(IPython.display.Image(
#     filename='smaller_screenshots/'+'combined_locations.jpg'))

In [15]:
map_media_locations(df_all_locations, folder_path = 'maps', 
file_name = 'combined_routes', timestamp_column = 'alt_capture_time', 
add_paths = True, zoom_start = 6)
print("Done")

Added 747 markers to the map.
Done


In [16]:
create_map_screenshot(absolute_path_to_map_folder = absolute_path_to_map_folder,
map_name= 'combined_routes_locations.html', 
screenshot_save_path = screenshot_save_path)

convert_png_to_smaller_jpg(png_folder = 'map_screenshots', 
png_image_name = 'combined_routes_locations.png', 
jpg_folder = 'smaller_screenshots', reduction_factor = 1, quality_factor = 50)

# IPython.display.display(IPython.display.Image(
#     filename='smaller_screenshots/'+'combined_routes_locations.jpg'))

In [17]:
create_map_screenshot(absolute_path_to_map_folder = absolute_path_to_map_folder,
map_name= 'combined_routes_intl_locations.html', 
screenshot_save_path = screenshot_save_path)

convert_png_to_smaller_jpg(png_folder = 'map_screenshots', 
png_image_name = 'combined_routes_intl_locations.png', 
jpg_folder = 'smaller_screenshots', reduction_factor = 1, quality_factor = 50)

# IPython.display.display(IPython.display.Image(
#     filename='smaller_screenshots/'+'combined_routes_intl_locations.jpg'))

In [18]:
map_dict = {}
for i in range(2018, datetime.date.today().year+1):
    print(f"Creating map for {i}:")
    year_as_string = str(i)
    next_year = str(i+1)
    map_dict[i] = map_media_locations(df_all_locations.query(
        "alt_capture_time >= @year_as_string & alt_capture_time < @next_year"),
        folder_path = 'maps', file_name = f'{i}_combined', 
        add_paths = True, zoom_start = 6, timestamp_column = 'alt_capture_time')

Creating map for 2018:
Added 41 markers to the map.
Creating map for 2019:
Added 0 markers to the map.
Creating map for 2020:
Added 1 markers to the map.
Creating map for 2021:
Added 25 markers to the map.
Creating map for 2022:
Added 680 markers to the map.


In [19]:
batch_create_map_screenshots(absolute_path_to_map_folder = 
absolute_path_to_map_folder, screenshot_save_path = 
screenshot_save_path)

In [20]:
batch_convert_pngs_to_smaller_jpgs(png_folder = 'map_screenshots', 
    jpg_folder = 'smaller_screenshots', reduction_factor = 1, 
    quality_factor = 50) 

In [21]:
for root, dirs, files in os.walk('smaller_screenshots'):
    smaller_screenshot_list = files

# smaller_screenshot_list

In [22]:
# for map in smaller_screenshot_list:
#     print(f'\n\n{map}:')
#     IPython.display.display(IPython.display.Image(
#         filename='smaller_screenshots/'+map))

# # This method of displaying images within a loop comes from Stack Overflow
# # user DrMcCleod at https://stackoverflow.com/a/35061341/13097194 .


In order to get the calculate_distance_by_year function to work with this data, I converted the alt_capture_time column to strings; created a new column (alt_capture_time_no_tz) that didn't contain any time zone offset data; and then converted the alt_capture_time_no_tz column data to DateTime values. There's probably a more elegant way to remove time zone data from this column.

In [23]:
df_all_locations['alt_capture_time'] = df_all_locations[
    'alt_capture_time'].astype('str')

The following code initializes 'alt_capture_time_no_tz' by searching for a '-' in the [-6] position within each row's alt_capture_time value (which indicates the presence of time zone offset data). If that hyphen is found, then that row's alt_capture_time_no_tz value will equal the alt_capture_time value with the time zone offset data removed. If that hyphen isn't found, then the alt_capture_time_no_tz value will be the same as the alt_capture_time value.

In [24]:
df_all_locations['alt_capture_time_no_tz'] = df_all_locations[
    'alt_capture_time'].astype('str').apply(
        lambda x: x[:-6] if x[-6] == '-' else x)
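
One possible alternative, offered only as a sketch: a regular expression can strip a trailing offset regardless of its sign, and it leaves short strings alone instead of raising an IndexError on x[-6]. This assumes the offsets take the form "+HH:MM" or "-HH:MM" at the end of the string, which matches the six-character slice used above. The sample values are made up.

```python
import pandas as pd

# Hypothetical sample values covering the cases of interest.
times = pd.Series([
    '2022-03-01 09:00:00-07:00',   # negative offset
    '2022-03-01 18:00:00+01:00',   # positive offset
    '2022-03-01 09:00:00',         # no offset
    'x',                           # too short to index at [-6]
])

# Remove a trailing "+HH:MM" or "-HH:MM" if one is present.
no_tz = times.str.replace(r'[+-]\d{2}:\d{2}$', '', regex=True)
print(no_tz.tolist())
# ['2022-03-01 09:00:00', '2022-03-01 18:00:00', '2022-03-01 09:00:00', 'x']
```

Like the cell above, this keeps the local clock time and simply discards the offset. It does not convert the timestamps to a common time zone.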

In [25]:
# df_all_locations

In [26]:
df_all_locations['alt_capture_time_no_tz'] = pd.to_datetime(
    df_all_locations['alt_capture_time_no_tz'])

In [27]:
df_distances_by_year = calculate_distance_by_year(
    df_all_locations, timestamp_column = 'alt_capture_time_no_tz')

# df_distances_by_year

In [28]:
# sum(df_distances_by_year['total_distance'])

In [29]:
# fig, axes = plt.subplots()
# plt.bar(x = df_distances_by_year['year'].astype('str'), 
# height = df_distances_by_year['total_distance'])

In [30]:
end_time = time.time()
run_time = end_time - start_time
run_minutes = run_time // 60
run_seconds = run_time % 60
print("Completed run at",time.ctime(end_time),"(local time)")
print("Total run time:",'{:.2f}'.format(run_time),
"second(s) ("+str(run_minutes),"minute(s) and",'{:.2f}'.format(run_seconds),
"second(s))") 

Completed run at Tue May 10 14:27:11 2022 (local time)
Total run time: 238.82 second(s) (3.0 minute(s) and 58.82 second(s))
