# Flickr API  

![Flickr](https://www.flickrhelp.com/hc/article_attachments/4419907628308/unnamed__1_.png)  

Flickr provides access to download geotagged images through its API, which is relatively straightforward to use and imposes minimal restrictions for scraping images.  

A typical Flickr post can be viewed here: [link](https://www.flickr.com/photos/92959567@N00/227080829).  

The objective of this exercise is to create a table containing images from a specific area along with their associated information.  

To achieve this, we will use the `flickrapi` library ("Python Flickr API") to handle authentication. Once authenticated, you can call the methods listed in the [API documentation](https://www.flickr.com/services/api/).  

The API methods fall under different authentication levels, such as:  
- *Requires 'write' permission for authentication.*  
- *Requires 'read' permission for authentication.*  
- *Does not require authentication.*  

Focus only on the methods that do not require authentication and disregard those needing "write" or "read" permissions.

## Geo Search
### Preparation & First Searches
First, install the library with `pip install flickrapi`.

Import the API key and secret from a local `passwords` module:

In [79]:
import sys
sys.path.append(r'C:\Users\vince\Dropbox\Codes\passwords')

import passwords as pw

api_key = pw.flickr_key
api_secret = pw.flickr_secret

# print (api_key)
# print (api_secret)
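If you prefer not to keep a separate `passwords` module, environment variables are a common alternative. The sketch below is only an illustration: the variable names `FLICKR_API_KEY` and `FLICKR_API_SECRET` are assumptions, not names required by Flickr or by `flickrapi`.

```python
import os

# Hypothetical variable names; use whatever names you set in your own shell.
api_key = os.environ.get("FLICKR_API_KEY", "")
api_secret = os.environ.get("FLICKR_API_SECRET", "")

# Warn early instead of failing later with an unclear API error.
if not api_key or not api_secret:
    print("Set FLICKR_API_KEY and FLICKR_API_SECRET before connecting.")
```

This keeps the credentials out of the notebook and out of version control.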

Import the libraries we need and connect to the Flickr API. The connection is stored in an object called `flickr`.

In [80]:
import flickrapi
import pandas as pd
import pprint
import folium


#connect to flickr
flickr = flickrapi.FlickrAPI(api_key, api_secret, format='parsed-json')


### Define search location and display map

In [81]:
# Define the location (latitude and longitude) and search parameters
centre_latitude = 51.51235 
centre_longitude = -0.11720 

# Define the bounding box. Use Open Street Map to get the coordinates
lat_north = 51.51491
lat_south = 51.50893
long_west = -0.12167
long_east = -0.11171

# Define the number of photos to retrieve per page
per_page = 250

# Create a map centered on the location
map_london = folium.Map(
    location=[centre_latitude, centre_longitude],
    tiles="Cartodb dark_matter",
    zoom_start=16,
    control_scale=True,
    zoom_control=False,
    dragging=False,
    scrollWheelZoom=False
)

# Add a circle marker to the map
folium.CircleMarker(
    location=[centre_latitude, centre_longitude],
    radius=2,
    color="cornflowerblue",
    stroke=False,
    fill=True,
    fill_opacity=0.6,
    opacity=1,
    popup="2 pixels",  # same value as the radius set above
).add_to(map_london) 

# Add a rectangle to the map
folium.Rectangle(
    bounds=[[lat_north , long_west], [lat_south, long_east]],
    fill=True,
    fill_opacity=0.1,
    weight=1,
    color="cornflowerblue",  
).add_to(map_london)

# you can also save the map as html
# map_london.save("map_london.html")

map_london

The main method is `flickr.photos.search`; read its [documentation](https://www.flickr.com/services/api/flickr.photos.search.html) carefully. We will search for the Flickr photos that lie inside the rectangle.

In [82]:
# Define the bounding box coordinates (latitude and longitude)
# Format: bbox = "min_longitude,min_latitude,max_longitude,max_latitude"
bbox = str(long_west)+','+str(lat_south)+','+str(long_east)+','+str(lat_north)

# Search for photos in the bounding box
photos = flickr.photos.search(bbox=bbox, 
                              per_page=per_page, 
                              page=1,
                              has_geo=1, 
                              extras='geo,description,tags,views,media,url_n,date_taken,owner_name')
# Print the results
pprint.pprint(photos, compact=True)


{'photos': {'page': 1,
            'pages': 293,
            'perpage': 250,
            'photo': [{'accuracy': '16',
                       'context': 0,
                       'datetaken': '2024-10-10 17:04:12',
                       'datetakengranularity': 0,
                       'datetakenunknown': '0',
                       'description': {'_content': 'Londoner'},
                       'farm': 66,
                       'geo_is_contact': 0,
                       'geo_is_family': 0,
                       'geo_is_friend': 0,
                       'geo_is_public': 1,
                       'height_n': 320,
                       'id': '54246247201',
                       'isfamily': 0,
                       'isfriend': 0,
                       'ispublic': 1,
                       'latitude': '51.510608',
                       'longitude': '-0.117187',
                       'media': 'photo',
                       'media_status': 'ready',
                       'owner': 

Currently, we face two issues:  
- The response is paginated. We can raise the page size to the maximum of 250 results, but the search still spans many pages.  
- The response data needs to be transformed into a table.  

The suggested solution involves the following steps:  

1. Determine the total number of pages.  
2. Retrieve each page sequentially.  
3. For every page retrieved, append the relevant content to a new table.  
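The three steps above can be sketched without touching the API. In this minimal illustration, `fetch_page` is a hypothetical stand-in for whatever function requests one page and returns a dataframe; `fake_fetch` only simulates it.

```python
import pandas as pd


def collect_pages(fetch_page, total_pages):
    """Fetch pages 1..total_pages and stack them into one dataframe."""
    frames = []
    for page in range(1, total_pages + 1):
        frames.append(fetch_page(page))
    if not frames:
        return pd.DataFrame()
    # A single concat at the end avoids copying the growing table on every page.
    return pd.concat(frames, ignore_index=True)


# Stand-in for a real request: every "page" holds two rows.
def fake_fetch(page):
    return pd.DataFrame({"page": [page, page], "item": [0, 1]})


table = collect_pages(fake_fetch, total_pages=3)
print(len(table))  # 6
```

Collecting the pages in a list and concatenating once is a design choice; appending inside the loop gives the same result on small downloads.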

### Determine the number of pages

Determine the number of pages and extract the details of the first entry.

In [84]:
photos = flickr.photos.search(bbox=bbox, 
                              per_page=per_page, 
                              page=1,
                              has_geo=1, 
                              extras='geo,description,tags,views,media,url_o,url_s,date_taken,owner_name')

total_pages = photos['photos']['pages']
total_photos = photos['photos']['total']

print(f"Total photos: {total_photos}")
print(f"Total pages: {total_pages}")
print("First entry:")
print("ID: " + photos['photos']['photo'][0]["id"])
print("Title: " + photos['photos']['photo'][0]["title"])
print("Owner: " + photos['photos']['photo'][0]["owner"])
print("Secret: " + photos['photos']['photo'][0]["secret"])
print("Server: " + photos['photos']['photo'][0]["server"])


Total photos: 74512
Total pages: 299
First entry:
ID: 54246247201
Title: IMG_4542
Owner: 202051625@N07
Secret: b1761dbd2a
Server: 65535
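As a quick sanity check, the page count should equal the photo total divided by the page size, rounded up. The numbers below are copied from the output above; if the API returns the total as a string, wrap it in `int()` first (an assumption worth checking on your own response).

```python
import math

total_photos = 74512  # "Total photos" from the output above
per_page = 250

# A partially filled last page still counts as a page, so round up.
expected_pages = math.ceil(total_photos / per_page)
print(expected_pages)  # 299
```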


### Extract Page Information

The function below requests one page, extracts the information, and returns it as a dataframe:

In [85]:
def get_page(bbox, page, per_page):
    response = flickr.photos.search(
        bbox=bbox,
        per_page=per_page,
        page=page,
        has_geo=1,
        extras="geo,description,tags,views,media,url_s,date_taken,owner_name",
    )
    photos = response["photos"]["photo"]
    rows = []
    for photo in photos:
        new_row = {
            "id": photo["id"],
            "server": photo["server"],
            "secret": photo["secret"],
            "title": photo["title"],
            "tags": photo["tags"],
            "views": photo["views"],
            "description": photo["description"]["_content"],
            "date_taken": photo["datetaken"],
            "latitude": photo["latitude"],
            "longitude": photo["longitude"],
            "url_s": photo["url_s"],
            "owner": photo["owner"],
            "owner_name": photo["ownername"],
            "media": photo["media"],
        }
        rows.append(new_row)
    df = pd.DataFrame(
        rows,
        columns=[
            "id",
            "server",
            "secret",
            "title",
            "tags",
            "views",
            "description",
            "date_taken",
            "latitude",
            "longitude",
            "url_s",
            "owner_name",
            "owner",
            "media",
        ],
    )
    return df


get_page(bbox, 1, per_page)

Unnamed: 0,id,server,secret,title,tags,views,description,date_taken,latitude,longitude,url_s,owner_name,owner,media
0,54246247201,65535,b1761dbd2a,IMG_4542,,0,Londoner,2024-10-10 17:04:12,51.510608,-0.117187,https://live.staticflickr.com/65535/5424624720...,paulabarzini,202051625@N07,photo
1,54245922084,65535,454c7ba00f,clare market,claremarket aldwych london uk crazycontrailsday,194,"aldwych, london",2024-10-27 12:59:26,51.514244,-0.116364,https://live.staticflickr.com/65535/5424592208...,pete gardner,28668614@N07,photo
2,54244605017,65535,0d2df1b751,Temple,,19,,2024-01-25 15:49:16,51.511244,-0.113706,https://live.staticflickr.com/65535/5424460501...,John Gulliver,32163022@N08,photo
3,54241915384,65535,d87952ff37,,,3,,2024-12-26 13:40:24,51.509241,-0.118884,https://live.staticflickr.com/65535/5424191538...,The West End,69754957@N00,photo
4,54241842203,65535,01abdbe5dd,,,5,,2024-12-26 13:40:33,51.509322,-0.118925,https://live.staticflickr.com/65535/5424184220...,The West End,69754957@N00,photo
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
245,54145645288,65535,c4ac8368fd,DSC_0865,london bus route 26 city fleet street,287,City of London Bus Route #26 Fleet Street,2024-11-06 17:29:29,51.513317,-0.113782,https://live.staticflickr.com/65535/5414564528...,photographer695,41087279@N00,photo
246,54145687604,65535,c6efba64e6,DSC_0864,london bus route 26 aldwych,286,London Bus Route #26 Aldwych,2024-11-06 17:29:19,51.513317,-0.113975,https://live.staticflickr.com/65535/5414568760...,photographer695,41087279@N00,photo
247,54144505847,65535,b3b48a0ca7,DSC_0866,london bus route 26 city fleet street,303,City of London Bus Route #26 Fleet Street,2024-11-06 17:30:25,51.513371,-0.113546,https://live.staticflickr.com/65535/5414450584...,photographer695,41087279@N00,photo
248,54145687494,65535,7c865d5a25,DSC_0867,london bus route 26 city fleet street,294,City of London Bus Route #26 Fleet Street,2024-11-06 17:30:59,51.513538,-0.112923,https://live.staticflickr.com/65535/5414568749...,photographer695,41087279@N00,photo
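The `server`, `id`, and `secret` columns are enough to rebuild an image URL at another size. The sketch below follows the URL pattern described in Flickr's photo source URL documentation. The helper name `photo_url` is hypothetical, and the default suffix `b` (large) is an assumption: not every size exists for every photo.

```python
def photo_url(server, photo_id, secret, size="b"):
    """Build a static image URL from the fields returned by the search."""
    return f"https://live.staticflickr.com/{server}/{photo_id}_{secret}_{size}.jpg"


# Values taken from the first row of the table above.
url = photo_url("65535", "54246247201", "b1761dbd2a")
print(url)
# https://live.staticflickr.com/65535/54246247201_b1761dbd2a_b.jpg
```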


### Go through all pages
The next step is a loop that requests every page and consolidates the results into one large dataframe. Since each request takes approximately 5 to 6 seconds, processing all 299 pages could take around 30 minutes. This duration is manageable, but a long run carries risks: a network error or an API failure partway through interrupts the loop.

Here are a few considerations to address these challenges:
- **Download in batches**: For instance, process 20 pages at a time. However, this may not be an ideal solution.
- **Use a smaller bounding box**: This approach is more effective.
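One way to work with smaller bounding boxes is to split the original rectangle into a grid of tiles and search each tile separately. The helper `split_bbox` below is a hypothetical sketch, not part of `flickrapi`; it only produces the `bbox` strings in the "min_longitude,min_latitude,max_longitude,max_latitude" order used by the search.

```python
def split_bbox(west, south, east, north, n_cols=2, n_rows=2):
    """Split a bounding box into n_cols x n_rows tiles as bbox strings."""
    lon_step = (east - west) / n_cols
    lat_step = (north - south) / n_rows
    tiles = []
    for row in range(n_rows):
        for col in range(n_cols):
            tile_west = west + col * lon_step
            tile_south = south + row * lat_step
            tiles.append(
                f"{tile_west:.5f},{tile_south:.5f},"
                f"{tile_west + lon_step:.5f},{tile_south + lat_step:.5f}"
            )
    return tiles


# The bounding box defined earlier, cut into four tiles.
tiles = split_bbox(-0.12167, 51.50893, -0.11171, 51.51491)
for tile in tiles:
    print(tile)
```

Photos that sit exactly on a shared edge could appear in two tiles, so dropping duplicate `id` values after combining the results is probably worthwhile.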

In the example below, only the first 19 pages are downloaded, because `range(1, 20)` stops before 20. To process all pages, replace that line with `for page in range(1, total_pages + 1):`.

In [88]:
df = pd.DataFrame()

for page in range(1, 20):
    new_df = get_page(bbox, page, per_page)
    print(f"Getting page {page} of {total_pages}")
    df = pd.concat([df, new_df], ignore_index=True)
    print(f"Total photos so far: {len(df)}")


df

Getting page 1 of 299
Total photos so far: 250
Getting page 2 of 299
Total photos so far: 500
Getting page 3 of 299
Total photos so far: 750
Getting page 4 of 299
Total photos so far: 1000
Getting page 5 of 299
Total photos so far: 1250
Getting page 6 of 299
Total photos so far: 1500
Getting page 7 of 299
Total photos so far: 1750
Getting page 8 of 299
Total photos so far: 2000
Getting page 9 of 299
Total photos so far: 2250
Getting page 10 of 299
Total photos so far: 2500
Getting page 11 of 299
Total photos so far: 2750
Getting page 12 of 299
Total photos so far: 3000
Getting page 13 of 299
Total photos so far: 3250
Getting page 14 of 299
Total photos so far: 3500
Getting page 15 of 299
Total photos so far: 3750
Getting page 16 of 299
Total photos so far: 4000
Getting page 17 of 299
Total photos so far: 4250
Getting page 18 of 299
Total photos so far: 4500
Getting page 19 of 299
Total photos so far: 4750


Unnamed: 0,id,server,secret,title,tags,views,description,date_taken,latitude,longitude,url_s,owner_name,owner,media
0,54246247201,65535,b1761dbd2a,IMG_4542,,0,Londoner,2024-10-10 17:04:12,51.510608,-0.117187,https://live.staticflickr.com/65535/5424624720...,paulabarzini,202051625@N07,photo
1,54245922084,65535,454c7ba00f,clare market,claremarket aldwych london uk crazycontrailsday,194,"aldwych, london",2024-10-27 12:59:26,51.514244,-0.116364,https://live.staticflickr.com/65535/5424592208...,pete gardner,28668614@N07,photo
2,54244605017,65535,0d2df1b751,Temple,,19,,2024-01-25 15:49:16,51.511244,-0.113706,https://live.staticflickr.com/65535/5424460501...,John Gulliver,32163022@N08,photo
3,54241915384,65535,d87952ff37,,,3,,2024-12-26 13:40:24,51.509241,-0.118884,https://live.staticflickr.com/65535/5424191538...,The West End,69754957@N00,photo
4,54241842203,65535,01abdbe5dd,,,5,,2024-12-26 13:40:33,51.509322,-0.118925,https://live.staticflickr.com/65535/5424184220...,The West End,69754957@N00,photo
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4745,54143806807,65535,0655cb9fde,London Transport Museum,,10,,2024-10-19 12:08:06,51.512177,-0.121131,https://live.staticflickr.com/65535/5414380680...,Stephen Cannon,68842060@N00,photo
4746,54145124490,65535,1da5057b1e,London Transport Museum,,29,,2024-10-19 12:06:34,51.512102,-0.121162,https://live.staticflickr.com/65535/5414512449...,Stephen Cannon,68842060@N00,photo
4747,54145124450,65535,3decea7ac1,London Transport Museum,,11,,2024-10-19 11:46:06,51.512027,-0.121200,https://live.staticflickr.com/65535/5414512445...,Stephen Cannon,68842060@N00,photo
4748,54144943348,65535,c5bb827571,London Transport Museum,,8,,2024-10-19 11:45:08,51.511991,-0.121200,https://live.staticflickr.com/65535/5414494334...,Stephen Cannon,68842060@N00,photo


Then we save the dataframe to a CSV file:

In [89]:
df.to_csv("london_photos.csv", index=False)
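To reduce the risk of a long download stopping on a single failed request, each page request can be wrapped in a small retry helper. This is a sketch under assumptions: `fetch_with_retry` and `flaky_fetch` are hypothetical names, and the pause values are arbitrary. The `fetch_page` argument could be something like `lambda page: get_page(bbox, page, per_page)`.

```python
import time


def fetch_with_retry(fetch_page, page, retries=3, pause=2.0):
    """Call fetch_page(page); on failure wait and try again, up to `retries` attempts."""
    for attempt in range(1, retries + 1):
        try:
            return fetch_page(page)
        except Exception as error:
            if attempt == retries:
                raise  # give up after the last attempt
            print(f"Page {page} failed ({error}); retrying")
            time.sleep(pause * attempt)  # wait a little longer each time


# Simulated request that fails twice before succeeding.
calls = {"count": 0}


def flaky_fetch(page):
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return f"page {page} ok"


result = fetch_with_retry(flaky_fetch, page=1, retries=3, pause=0)
print(result)  # page 1 ok
```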

### Map Display
We display the photo locations on the map:

In [114]:
# Create a map centered on the location
map_london = folium.Map(
    location=[centre_latitude, centre_longitude],
    tiles="Cartodb dark_matter",
    zoom_start=16,
    control_scale=True,
    zoom_control=False,
    dragging=False,
    scrollWheelZoom=False
)

# Add a circle marker to the map
folium.CircleMarker(
    location=[centre_latitude, centre_longitude],
    radius=2,
    color="cornflowerblue",
    stroke=False,
    fill=True,
    fill_opacity=0.6,
    opacity=1,
    popup="2 pixels",  # same value as the radius set above
).add_to(map_london) 

# Add a rectangle to the map
folium.Rectangle(
    bounds=[[lat_north , long_west], [lat_south, long_east]],
    fill=True,
    fill_opacity=0.05,
    weight=.5,
    color="cornflowerblue",  
).add_to(map_london)

# Add a circle marker for each photo
latitudes = df['latitude']
longitudes = df['longitude']

for latitude, longitude in  zip(latitudes,longitudes):
  coordinate = [latitude,longitude]
  radius = 1
  folium.CircleMarker(
    location=coordinate,
    radius=radius,
    stroke=False,
    fill=True,
    fill_color="orchid",  # folium's documented keyword is fill_color
    fill_opacity=0.3,
    opacity=0.3,
  ).add_to(map_london)
  
map_london

**Colors**  

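Ever wondered where color names like "orchid" or "cornflowerblue" come from? They are the standard named colors of CSS (inherited from the X11 color list), which is why they work in a web map. The same names also appear in .NET's `Colors` class, and you can find that list [here](https://learn.microsoft.com/en-us/dotnet/api/system.windows.media.colors?view=windowsdesktop-9.0).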

---

## Combined Script

This is the combined script: 

### Import Libraries

In [115]:
import sys

sys.path.append(r"C:\Users\vince\Dropbox\Codes\passwords")

import passwords as pw
import flickrapi
import pandas as pd
import pprint
import folium

api_key = pw.flickr_key
api_secret = pw.flickr_secret

# connect to flickr
flickr = flickrapi.FlickrAPI(api_key, api_secret, format="parsed-json")

### Set search parameters

In [126]:
# Define the location (latitude and longitude) and search parameters
centre_latitude = 51.51239
centre_longitude = 0.00496

# Define the bounding box. Use Open Street Map to get the coordinates
lat_north = 51.51550
lat_south = 51.50941
long_west = 0.00024
long_east = 0.01015

# Format: bbox = "min_longitude,min_latitude,max_longitude,max_latitude" for Flickr search
bbox = str(long_west)+','+str(lat_south)+','+str(long_east)+','+str(lat_north)

# Define the number of photos to retrieve per page
per_page = 250

# Create a map centered on the location
map_london = folium.Map(
    location=[centre_latitude, centre_longitude],
    tiles="Cartodb dark_matter",
    zoom_start=16,
    control_scale=True,
    zoom_control=False,
    dragging=False,
    scrollWheelZoom=False,
)

# Add a circle marker to the map
folium.CircleMarker(
    location=[centre_latitude, centre_longitude],
    radius=2,
    color="cornflowerblue",
    stroke=False,
    fill=True,
    fill_opacity=0.6,
    opacity=1,
    popup="2 pixels",  # same value as the radius set above
).add_to(map_london)

# Add a rectangle to the map
folium.Rectangle(
    bounds=[[lat_north, long_west], [lat_south, long_east]],
    fill=True,
    fill_opacity=0.1,
    weight=1,
    color="cornflowerblue",
).add_to(map_london)

map_london

### Get number of pages and photos

In [127]:
photos = flickr.photos.search(bbox=bbox, per_page=per_page, page=1, has_geo=1)

total_pages = photos["photos"]["pages"]
total_photos = photos["photos"]["total"]

print(f"Total photos: {total_photos}")
print(f"Total pages: {total_pages}")

Total photos: 3841
Total pages: 16


### Extract information

In [128]:
def get_page(bbox, page, per_page):
    response = flickr.photos.search(
        bbox=bbox,
        per_page=per_page,
        page=page,
        has_geo=1,
        extras="geo,description,tags,views,media,url_s,date_taken,owner_name",
    )
    photos = response["photos"]["photo"]
    rows = []
    for photo in photos:
        new_row = {
            "id": photo["id"],
            "server": photo["server"],
            "secret": photo["secret"],
            "title": photo["title"],
            "tags": photo["tags"],
            "views": photo["views"],
            "description": photo["description"]["_content"],
            "date_taken": photo["datetaken"],
            "latitude": photo["latitude"],
            "longitude": photo["longitude"],
            "url_s": photo["url_s"],
            "owner": photo["owner"],
            "owner_name": photo["ownername"],
            "media": photo["media"],
        }
        rows.append(new_row)
    df = pd.DataFrame(
        rows,
        columns=[
            "id",
            "server",
            "secret",
            "title",
            "tags",
            "views",
            "description",
            "date_taken",
            "latitude",
            "longitude",
            "url_s",
            "owner_name",
            "owner",
            "media",
        ],
    )
    return df


df = pd.DataFrame()

# Use this code to get only the first 9 pages of photos (range(1, 10) stops before 10)
for page in range(1, 10):
    new_df = get_page(bbox, page, per_page)
    print(f"Getting page {page} of {total_pages}")
    df = pd.concat([df, new_df], ignore_index=True)
    print(f"Total photos so far: {len(df)}")

""" 
Use this code to get all the photos in the bounding box: 
for page in range(1, total_pages + 1):
    new_df = get_page(bbox, page, per_page)
    print(f"Getting page {page} of {total_pages}")
    df = pd.concat([df, new_df], ignore_index=True)
    print(f"Total photos so far: {len(df)}") 
"""

df.to_csv("london_photos.csv", index=False)

df

Getting page 1 of 16
Total photos so far: 250
Getting page 2 of 16
Total photos so far: 500
Getting page 3 of 16
Total photos so far: 750
Getting page 4 of 16
Total photos so far: 1000
Getting page 5 of 16
Total photos so far: 1250
Getting page 6 of 16
Total photos so far: 1500
Getting page 7 of 16
Total photos so far: 1750
Getting page 8 of 16
Total photos so far: 2000
Getting page 9 of 16
Total photos so far: 2250


Unnamed: 0,id,server,secret,title,tags,views,description,date_taken,latitude,longitude,url_s,owner_name,owner,media
0,54188656132,65535,9eb07f8473,Go-Ahead London MHV6,goahead london mhv6 bu16oyo volvo b5lh mcv evo...,485,Go-Ahead London MHV6 (BU16 OYO)\nVolvo B5LH/MC...,2024-12-07 13:17:20,51.514731,0.007992,https://live.staticflickr.com/65535/5418865613...,gbenviro200,33732381@N04,photo
1,54188288630,65535,1e94c66157,EastLondon-37556-YX60DXO-CanningTown-300124,yx60dxo enviro200 firstcapital dm44169 route30...,433,East London 37556 (YX60 DXO) \n\nADL Enviro 20...,2024-01-30 00:00:00,51.514291,0.008368,https://live.staticflickr.com/65535/5418828863...,Michael Wadman,33075566@N08,photo
2,54186949737,65535,fbf67379c4,EastLondon-47992-YJ12GVR-CanningTown-300124,yj12gvr ctplus optaresolo optare solo route309...,470,East London 47992 (YJ12 GVR) \n\nFormer CT Plu...,2024-01-30 00:00:03,51.514291,0.008368,https://live.staticflickr.com/65535/5418694973...,Michael Wadman,33075566@N08,photo
3,54188115424,65535,6b76e2d37b,EastLondon-64202-LF20XKP-CanningTown-300124,lf20xkp bydd8ur route323 eastlondonbus canning...,400,East London 64202 (LF20 XKP) \n\nBYD D8UR / AD...,2024-01-30 00:00:02,51.514291,0.008368,https://live.staticflickr.com/65535/5418811542...,Michael Wadman,33075566@N08,photo
4,54186949517,65535,a17c975acf,EastLondon-47985-YJ60PFE-CanningTown-300124,yj60pfe ctplus optaresolo optare solo route309...,309,East London 47985 (YJ60 PFE) \n\nFormer CT Plu...,2024-01-30 00:00:01,51.514291,0.008368,https://live.staticflickr.com/65535/5418694951...,Michael Wadman,33075566@N08,photo
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2245,19669717713,270,d30ac971ea,"City Island, Orchard Place",roundtower towerhamlets trinitybuoywharf round...,892,Thousands of private residents will be driving...,2015-07-11 11:03:51,51.510772,0.005171,https://live.staticflickr.com/270/19669717713_...,diamond geezer,36101699310@N01,photo
2246,20261193506,459,b17fea2926,Stagecoach London 19795 on route 5 at Canning ...,bus transport publictransport stagecoach londo...,2204,This is the closest route 5 gets to Central Lo...,2015-08-03 13:57:59,51.515259,0.007756,https://live.staticflickr.com/459/20261193506_...,SW11simon,42037567@N08,photo
2247,20276112171,3788,146972c94d,City Island,roundtower towerhamlets roundtower2,1237,One day this red footbridge will be a new way ...,2015-08-02 13:49:56,51.513363,0.005578,https://live.staticflickr.com/3788/20276112171...,diamond geezer,36101699310@N01,photo
2248,19644544533,3774,b8596fbdbc,City Island,roundtower towerhamlets roundtower2,1081,"Former industrial peninsula, soon to be luxury...",2015-08-02 13:48:49,51.513363,0.005578,https://live.staticflickr.com/3774/19644544533...,diamond geezer,36101699310@N01,photo


### Map display

In [130]:
# Create a map centered on the location
map_london = folium.Map(
    location=[centre_latitude, centre_longitude],
    tiles="Cartodb dark_matter",
    zoom_start=16,
    control_scale=True,
    zoom_control=False,
    dragging=False,
    scrollWheelZoom=False
)

# Add a circle marker to the map
folium.CircleMarker(
    location=[centre_latitude, centre_longitude],
    radius=2,
    color="cornflowerblue",
    stroke=False,
    fill=True,
    fill_opacity=0.6,
    opacity=1,
    popup="2 pixels",  # same value as the radius set above
).add_to(map_london) 

# Add a rectangle to the map
folium.Rectangle(
    bounds=[[lat_north , long_west], [lat_south, long_east]],
    fill=True,
    fill_opacity=0.05,
    weight=.5,
    color="cornflowerblue",  
).add_to(map_london)

# Add a circle marker for each photo
latitudes = df['latitude']
longitudes = df['longitude']

for latitude, longitude in  zip(latitudes,longitudes):
  coordinate = [latitude,longitude]
  radius = 1
  folium.CircleMarker(
    location=coordinate,
    radius=radius,
    stroke=False,
    fill=True,
    fill_color="orchid",  # folium's documented keyword is fill_color
    fill_opacity=0.3,
    opacity=0.3,
  ).add_to(map_london)
  
map_london

## What's Next?  

If you examine this [Flickr Image](https://www.flickr.com/photos/33075566@N08/54187832141), you'll notice additional information associated with the image, such as "Favorites" or "Comments," which were not captured during the initial search.  
To address this, you'll need to iterate through the list and include the missing entries. Refer to the documentation to determine the appropriate API calls required to retrieve this data.

You can also convert the image's capture date into a datetime object, which helps you better analyze and understand the time aspect.
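The date conversion can be sketched with `pandas.to_datetime`. The sample values below are copied from the `date_taken` column shown earlier; the extra `year` and `hour` columns are just illustrations of what becomes possible once the column holds real datetimes.

```python
import pandas as pd

# Three date_taken strings in the format returned by the search.
sample = pd.DataFrame(
    {
        "date_taken": [
            "2024-10-10 17:04:12",
            "2024-10-27 12:59:26",
            "2024-01-25 15:49:16",
        ]
    }
)

# errors="coerce" turns unparseable values into NaT instead of raising.
sample["date_taken"] = pd.to_datetime(sample["date_taken"], errors="coerce")
sample["year"] = sample["date_taken"].dt.year
sample["hour"] = sample["date_taken"].dt.hour
print(sample)
```

The same two lines should work on the full `df`, after which you can group photos by year, month, or hour of day.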