# Flickr API

The dataset obtained from QGIS is a good starting point. However, it lacks some information that would be useful to include in the database.

A typical Flickr post can be seen here: [link](https://www.flickr.com/photos/92959567@N00/227080829).

The goal of this exercise is to add the following data to the table:

- Number of views
- Number of favorites
- Number of comments
- Username

We will use the Python library `flickrapi` ("Python Flickr API") to handle authentication. Once authenticated, you can call the methods described in the documentation available [here](https://www.flickr.com/services/api/).

Each method's documentation page states the authentication level it requires, for example:

- *This method requires authentication with 'write' permission.*
- *This method requires authentication with 'read' permission.*
- *This method does not require authentication.*

For this exercise, ignore the methods that require "write" or "read" permission and use only those that do not require authentication.

First, we install the library with pip.

In [54]:
!pip install flickrapi



The code below connects to the Flickr API. The connection is stored in an object called `flickr`.

In [55]:
import pandas as pd
import flickrapi

# The API key and secret are stored in an external file.
creditsPath = "/content/drive/MyDrive/Colab Notebooks/RC15 23 Excercises/Day 3/keys/keys.csv"
df_credits = pd.read_csv(creditsPath)
api_key = df_credits["flickr_Key"][0]
api_secret = df_credits["flickr_Secret"][0]

# Connect to Flickr.
flickr = flickrapi.FlickrAPI(api_key, api_secret, format='parsed-json')
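
The cell above expects a CSV file with the columns `flickr_Key` and `flickr_Secret`. As a minimal sketch of that layout, the snippet below reads an in-memory stand-in for `keys.csv` and checks that both columns exist before using them. The sample values and the column check are illustrative assumptions, not part of the original notebook.

```python
import io

import pandas as pd

# Hypothetical stand-in for keys.csv; the values are placeholders, not real credentials.
sample_keys = io.StringIO(
    "flickr_Key,flickr_Secret\n"
    "YOUR_API_KEY,YOUR_API_SECRET\n"
)

df_credits = pd.read_csv(sample_keys)

# Fail early with a readable message if a column is missing.
required = ["flickr_Key", "flickr_Secret"]
missing = [c for c in required if c not in df_credits.columns]
if missing:
    raise KeyError(f"keys file is missing columns: {missing}")

# Take the first row and make sure both values are plain strings.
api_key = str(df_credits["flickr_Key"].iloc[0])
api_secret = str(df_credits["flickr_Secret"].iloc[0])
```

Keeping the credentials in a separate file means the notebook itself can be shared without exposing them.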


We load the table into a dataframe called `df`.

In [56]:
#get the table
table = '/content/drive/MyDrive/Colab Notebooks/RC15 23 Excercises/Day 3/Excercise 03 - Flickr/Flickr.csv'
df = pd.read_csv(table)

df.head(3)

| Unnamed: 0 | p_id | lat | lon | o_id | p_date | accuracy | title | tags | url |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 214028 | 51.504265 | -0.078659 | 51035570500@N01 | 2003-05-24 17:40:38 | 16 | City Hall, London | london england normanfoster architecture geota... | https://live.staticflickr.com/1/214028_a96ba3e... |
| 1 | 436102 | 51.505937 | -0.082182 | 40385587@N00 | 2004-03-16 14:02:05 | 14 | Angel. A Figurehead | london thames ship olympus c740 figurehead wys... | https://live.staticflickr.com/1/436102_7907e51... |
| 2 | 436113 | 51.505323 | -0.08965 | 40385587@N00 | 2004-03-16 14:04:28 | 14 | Sign at London Bridge Station | signs london olympus lee c740 wysiwyg | https://live.staticflickr.com/1/436113_d87fc68... |


The `p_id` column holds the unique ID of each photo. With this ID we can request the first set of [information](https://www.flickr.com/services/api/flickr.photos.getInfo.html):

In [57]:
import json

# Take the ID of the first photo in the table.
photo_ID = df["p_id"][0]

info = flickr.photos.getInfo(photo_id=photo_ID)

# Pretty-print the parsed JSON response.
print(json.dumps(info, indent=2))

{
  "photo": {
    "id": "214028",
    "secret": "a96ba3e77a",
    "server": "1",
    "farm": 1,
    "dateuploaded": "1085416838",
    "isfavorite": 0,
    "license": "0",
    "safety_level": "0",
    "rotation": 0,
    "originalsecret": "a96ba3e77a",
    "originalformat": "jpg",
    "owner": {
      "nsid": "51035570500@N01",
      "username": "intuor",
      "realname": "Edward King",
      "location": "San Francisco, USA",
      "iconserver": "1",
      "iconfarm": 1,
      "path_alias": "kingedward",
      "gift": {
        "gift_eligible": true,
        "eligible_durations": [
          "year",
          "month",
          "week"
        ],
        "new_flow": true
      }
    },
    "title": {
      "_content": "City Hall, London"
    },
    "description": {
      "_content": "<a href=\"http://www.euboia.org\" rel=\"noreferrer nofollow\">www.euboia.org</a>\r\n<a href=\"http://www.geobloggers.com\" rel=\"noreferrer nofollow\">geotagged</a>"
    },
    "visibility": {
      "ispubl

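From this response we can read the owner's name (the `realname` field; the account name is stored separately under `username`), the number of views and the number of comments: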

In [58]:
name = info["photo"]["owner"]["realname"]
number_views = info["photo"]["views"]
number_comments = info["photo"]["comments"]["_content"]

print(  "Name: "              + "\t\t\t" + name        + "\n" +\
        "Number of views: "   + "\t"     + number_views+ "\n" +\
        "Number of comments: "+ "\t"     + number_comments )

Name: 			Edward King
Number of views: 	482
Number of comments: 	0
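
Note that the view and comment counts arrive as strings, which is why they can be concatenated directly in the `print` call above. The following sketch shows one way to turn them into integers and to pick up both owner fields. It runs on a hand-trimmed copy of the response printed earlier, and the helper name `extract_counts` is an assumption made for illustration.

```python
# Trimmed, hand-written stand-in for the getInfo response shown above.
sample_info = {
    "photo": {
        "owner": {"username": "intuor", "realname": "Edward King"},
        "views": "482",
        "comments": {"_content": "0"},
    }
}


def extract_counts(info):
    """Return owner names and integer counts from a getInfo-style dict."""
    photo = info["photo"]
    return {
        "username": photo["owner"]["username"],
        "realname": photo["owner"]["realname"],
        # The counts are strings in the response, so convert them explicitly.
        "number_views": int(photo["views"]),
        "number_comments": int(photo["comments"]["_content"]),
    }


counts = extract_counts(sample_info)
print(counts)
```

Storing the counts as integers lets them be sorted and summed numerically once they are in the table.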


The number of favorites is not part of that response. It comes from a separate [method](https://www.flickr.com/services/api/flickr.photos.getFavorites.html):

In [None]:
photo_favorites = flickr.photos.getFavorites(photo_id=photo_ID)

# json was already imported in an earlier cell.
print(json.dumps(photo_favorites, indent=2))

In [60]:
number_fav = photo_favorites["photo"]["total"]

number_fav

1

We can now wrap this process in a function. It takes the photo ID as its input argument and returns a dictionary with the extracted values. Returning a dictionary makes it easy to add further values later, and dictionaries combine well with pandas dataframes.

In [81]:
import time

def get_info(photo_ID):
  # Request the photo details, pause briefly, then request the favorites.
  info = flickr.photos.getInfo(photo_id=photo_ID)
  time.sleep(0.25)
  photo_favorites = flickr.photos.getFavorites(photo_id=photo_ID)

  # Collect the extracted values in a dictionary.
  info_dict = {'ID'              : photo_ID,
               'name'            : info["photo"]["owner"]["realname"],
               'number_views'    : info["photo"]["views"],
               'number_comments' : info["photo"]["comments"]["_content"],
               'number_fav'      : photo_favorites["photo"]["total"]}
  return info_dict
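
When this function runs over many photos, a single failed request (for example for a photo that is no longer available) would stop the whole run. As a hedged sketch, the wrapper below retries a call and then falls back to a placeholder row. The names `get_info_safe` and `fake_fetch` and the retry settings are assumptions for illustration; `fake_fetch` stands in for `get_info` so the example runs without network access.

```python
import time


def get_info_safe(fetch, photo_id, retries=2, pause=0.5):
    """Call fetch(photo_id); retry on failure, then return a placeholder row."""
    last_error = None
    for attempt in range(retries + 1):
        try:
            return fetch(photo_id)
        except Exception as err:  # broad on purpose; narrow it once the failure modes are known
            last_error = err
            if attempt < retries:
                time.sleep(pause)
    # Same keys as the get_info dictionary, plus the error message.
    return {"ID": photo_id, "name": None, "number_views": None,
            "number_comments": None, "number_fav": None,
            "error": str(last_error)}


def fake_fetch(photo_id):
    """Hypothetical stand-in for get_info that fails for one ID."""
    if photo_id == 2:
        raise RuntimeError("photo not found")
    return {"ID": photo_id, "name": "", "number_views": "101",
            "number_comments": "0", "number_fav": 0}


rows = [get_info_safe(fake_fetch, pid, retries=1, pause=0) for pid in (1, 2)]
print(rows)
```

In the notebook, the call would become `get_info_safe(get_info, ID)`, so the loop keeps running and failed rows can be inspected afterwards.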

First, we test the function on a single photo. A call such as `get_info(df["p_id"][30])` is easy to repeat in a loop, which lets us collect the data for every entry.

In [82]:
#test on one photo
test = get_info(df["p_id"][30])
test

{'ID': 7862581,
 'name': '',
 'number_views': '101',
 'number_comments': '0',
 'number_fav': 0}

In [83]:
# It is a good idea to work out the loop on a smaller table first.
# .copy() creates an independent table, so inserting columns does not affect df.
df_small = df[5:100].copy()

# To loop, it is necessary to extract the ID column.
# It is not advisable to loop through a table directly.
p_IDs = df_small["p_id"]

# We prepare empty lists that can then be added as columns to the dataframe.
# (get_info also returns 'number_comments'; it is not added to the table here.)
ID_list = []
name = []
number_views = []
number_fav = []

# This is the loop. First, the dict object is created for each photo.
# From this dict object, the values are taken out and appended to the lists.
for ID in p_IDs:
  info_ID = get_info(ID)
  ID_list.append(info_ID['ID'])
  name.append(info_ID['name'])
  number_views.append(info_ID['number_views'])
  number_fav.append(info_ID['number_fav'])

# The lists are added as columns.
df_small.insert(loc=6, column='Name', value=name)
df_small.insert(loc=7, column='ID', value=ID_list)
df_small.insert(loc=8, column='NumberViews', value=number_views)
df_small.insert(loc=9, column='NumberFavorites', value=number_fav)

df_small

| Unnamed: 0 | p_id | lat | lon | o_id | p_date | accuracy | Name | ID | NumberViews | NumberFavorites | title | tags | url |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5 | 459444 | 51.505163 | -0.078696 | 98406434@N00 | 2004-06-27 11:01:11 | 16 |  | 459444 | 221 | 0 | London - Office reflection | reflection london fountain towerbridge cityhal... | https://live.staticflickr.com/1/459444_6560853... |
| 6 | 1265192 | 51.505045 | -0.079495 | 78462059@N00 | 2004-10-27 14:55:31 | 16 | Andy Aldridge | 1265192 | 117 | 0 | 20041027_14553165 | adam london geotagged geotoolyuancc geolat5150... | https://live.staticflickr.com/1/1265192_f0bf20... |
| 7 | 3824097 | 51.505002 | -0.082740 | 35468140399@N01 | 2005-01-23 13:10:58 | 15 | Richard | 3824097 | 546 | 0 | Much glass | windows reflection london architecture southba... | https://live.staticflickr.com/3/3824097_3733c6... |
| 8 | 3824098 | 51.505002 | -0.082740 | 35468140399@N01 | 2005-01-23 13:15:48 | 15 | Richard | 3824098 | 683 | 1 | City Hall x 2 | windows reflection london architecture towerbr... | https://live.staticflickr.com/3/3824098_5d8c31... |
| 9 | 3845978 | 51.505022 | -0.079629 | 35468140399@N01 | 2005-01-23 13:17:57 | 15 | Richard | 3845978 | 619 | 3 | City Hall | london architecture towerbridge cityhall south... | https://live.staticflickr.com/1/3845978_46d752... |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 95 | 12365939 | 51.505323 | -0.081195 | 63113103@N00 | 2005-05-01 09:48:33 | 12 |  | 12365939 | 17 | 0 | DSC01489.JPG | weeeman | https://live.staticflickr.com/9/12365939_30627... |
| 96 | 12366056 | 51.505323 | -0.081195 | 63113103@N00 | 2005-05-01 09:48:44 | 12 |  | 12366056 | 86 | 0 | WEEE Man | weeeman | https://live.staticflickr.com/8/12366056_27724... |
| 97 | 12366248 | 51.505323 | -0.081195 | 63113103@N00 | 2005-05-01 09:48:58 | 12 |  | 12366248 | 37 | 0 | DSC01491.JPG | weeeman | https://live.staticflickr.com/10/12366248_f582... |
| 98 | 12366401 | 51.505323 | -0.081195 | 63113103@N00 | 2005-05-01 09:49:12 | 12 |  | 12366401 | 28 | 0 | DSC01492.JPG | weeeman | https://live.staticflickr.com/8/12366401_837ad... |
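
The loop above fills one list per column. Because `get_info` returns a dictionary, an alternative is to collect the dictionaries themselves and let pandas build the new columns, which also makes it easy to include the comment count. The sketch below is a hedged illustration on made-up data: `df_demo`, `records` and all values in them are hypothetical stand-ins for `df_small` and the results of `get_info`.

```python
import pandas as pd

# Hypothetical stand-in for df_small (only two columns, made-up values).
df_demo = pd.DataFrame({"p_id": [1, 2], "title": ["Photo A", "Photo B"]})

# Hypothetical results in the shape returned by get_info.
records = [
    {"ID": 1, "name": "", "number_views": "10", "number_comments": "0", "number_fav": 0},
    {"ID": 2, "name": "Sample Name", "number_views": "20", "number_comments": "3", "number_fav": 1},
]

# One row per dictionary; rename ID so it matches the key column of the table.
df_info = pd.DataFrame(records).rename(columns={"ID": "p_id"})

# Views and comments arrive as strings; cast them so they behave as numbers.
count_cols = ["number_views", "number_comments"]
df_info[count_cols] = df_info[count_cols].astype(int)

# A left merge keeps every row of the original table.
df_merged = df_demo.merge(df_info, on="p_id", how="left")
print(df_merged)
```

In the notebook, `records` would be built with something like `[get_info(ID) for ID in p_IDs]`. Merging on `p_id` avoids relying on the lists and the table staying in the same order.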


Finally, the extended table can be saved as a new CSV file:

In [84]:
path_updatedFile = "/content/drive/MyDrive/Colab Notebooks/RC15 23 Excercises/Day 3/Excercise 03 - Flickr/FlickrNew.csv"
df_small.to_csv(path_or_buf=path_updatedFile)