# A photo feed

The file ```photos_public.txt``` contains the [JSON version of the Flickr photo feed](https://api.flickr.com/services/feeds/photos_public.gne?&format=json) as accessed on 4 August 2015.  (We've removed some apostrophes to aid in processing.)  To see what the actual photos might look like, check out [the RSS version of the feed](https://api.flickr.com/services/feeds/photos_public.gne).

JSON, which stands for "JavaScript Object Notation", is a standard format used by many websites to exchange information.  Our goal is to organize the feed information in Python, and investigate how many of the Flickr photos were posted via the [Instagram](http://instagram.com/) app.

We start by looking at the ```items``` list in ```photos_public.txt```.  Each one of these items represents a photo.  We need to decide what information we're interested in keeping.  For simplicity, let's suppose we want:

* The link to the photo's Flickr page
* The link to a thumbnail version of the photo (listed in the JSON file as "media")
* The photo's title
* The author id
* The tags
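
To make these fields concrete, here is a minimal sketch of what one entry of `items` might look like once parsed, and how the five values can be read from it. Every value in the sample is an invented placeholder. The helper `thumbnail_url` is hypothetical, and the idea that `media` may be either a plain URL string or a small dictionary with an `"m"` key is an assumption about the feed layout, so check it against the actual file.

```python
import json

# Invented sample shaped like a feed with one entry; all values are placeholders.
sample_feed = """
{
  "title": "Sample feed",
  "items": [
    {
      "title": "Sunset",
      "link": "https://example.com/photos/1",
      "media": {"m": "https://example.com/photos/1_m.jpg"},
      "author_id": "12345@N00",
      "tags": "sunset beach instagramapp"
    }
  ]
}
"""


def thumbnail_url(media):
    """Hypothetical helper: accept a URL string or a dict with an "m" key (assumed layouts)."""
    if isinstance(media, dict):
        return media.get("m", "")
    return media


feed = json.loads(sample_feed)
item = feed["items"][0]  # each element of "items" describes one photo

# Pick out only the five fields we decided to keep.
wanted = {
    "link": item["link"],
    "thumb_link": thumbnail_url(item["media"]),
    "title": item["title"],
    "author_id": item["author_id"],
    "tags": item["tags"],
}
print(wanted)
```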

In [1]:
# Build a basic class to contain the information we want to save

class FlickrPhoto(object):
    
    def __init__(self, link="", thumb_link="", title="", a_id="", tags=""):
        self.link = link
        self.thumb_link = thumb_link
        self.title = title
        self.author_id = a_id
        self.tags = tags
    
    def get_link(self):
        """ return link to Flickr page for photo"""
        return self.link
    
    def get_thumb_link(self):
        """ return link to thumbnail version of photo"""
        return self.thumb_link
    
    def get_title(self):
        """ return photo title"""
        return self.title
    
    def get_author_id(self):
        """ return author id"""
        return self.author_id
    
    def get_tags(self):
        """ return tags for photo"""
        return self.tags

In [5]:
blank = FlickrPhoto()

In [6]:
blank.get_tags()


''

In [7]:
test_photo = FlickrPhoto("a","b","c","d","e")
test_photo.get_tags()

'e'

In [8]:
test_photo_2 = FlickrPhoto("a","b","c")
test_photo_2.get_tags()

''

Let's make a fancier version of our photo class, which incorporates the ability to display the thumbnail.

In [9]:
# Import code for dealing with URLs and images
from PIL import Image
import requests
from io import BytesIO

class FlickrPhoto(object):
    
    def __init__(self, link="", thumb_link="", title="", a_id="", tags=""):
        self.link = link
        self.thumb_link = thumb_link
        self.title = title
        self.author_id = a_id
        self.tags = tags
    
    def get_link(self):
        """ return link to Flickr page for photo"""
        return self.link
    
    def get_thumb_link(self):
        """ return link to thumbnail version of photo"""
        return self.thumb_link
    
    def get_title(self):
        """ return photo title"""
        return self.title
    
    def get_author_id(self):
        """ return author id"""
        return self.author_id
    
    def get_tags(self):
        """ return tags for photo"""
        return self.tags


    def display_thumbnail(self):
        """ open the thumbnail version in default image viewer"""
        
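        if not self.thumb_link:
            print("No thumbnail link!")
            return
        
        # Download the image bytes; give up after 10 seconds and raise on HTTP errors
        response = requests.get(self.thumb_link, timeout=10)
        response.raise_for_status()
        # Image.open expects a file name or file-like object, so wrap the bytes in BytesIO
        Image.open(BytesIO(response.content)).show()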

In [10]:
blank = FlickrPhoto()
blank.display_thumbnail()

No thumbnail link!


In [None]:
test_photo = FlickrPhoto(thumb_link = "https://farm1.staticflickr.com/301/20299106861_23979a8033_m.jpg")
test_photo.display_thumbnail()
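
The `display_thumbnail` method wraps the downloaded bytes in `BytesIO` because `Image.open` works with a file name or a file-like object rather than a raw `bytes` value. The sketch below shows only that wrapping step, using the standard library and an invented byte string in place of `response.content`, so it runs without a network connection.

```python
from io import BytesIO

# Stand-in for response.content: a made-up payload that starts with the
# two-byte JPEG start-of-image marker (0xFF 0xD8).
raw_bytes = b"\xff\xd8\xff\xe0" + b"\x00" * 12

buffer = BytesIO(raw_bytes)  # in-memory object that behaves like an open binary file
header = buffer.read(2)      # read from it just as from a file on disk
buffer.seek(0)               # rewind so a later reader starts at the beginning

print(header == b"\xff\xd8")
```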

We'll use the [`json` module](https://docs.python.org/3/library/json.html) from the standard library to parse our JSON file and create a list of `FlickrPhoto` objects.

In [None]:
import json

In [None]:
with open('photos_public.txt') as data_file:
    photos_j = json.load(data_file)  # parse the open file directly

In [None]:
# Build one FlickrPhoto per entry in the feed's "items" list
photo_list = [
    FlickrPhoto(
        photo["link"],
        photo["media"],
        photo["title"],
        photo["author_id"],
        photo["tags"],
    )
    for photo in photos_j["items"]
]

Now, let's create a data frame that records, for each photo, the author id and whether the photo is tagged as coming from Instagram.

In [None]:
import pandas as pd

In [None]:
# One row per photo: the author id and whether the tags mention Instagram
rows = []
for photo in photo_list:
    # Substring test on the tags string, so any tag containing "instagram" counts
    has_instagram = "instagram" in photo.get_tags()
    rows.append({"Author id": photo.get_author_id(), "Instagram": has_instagram})
photo_df = pd.DataFrame(rows)
photo_df
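
The stated goal is a count of the photos posted via Instagram. Below is a minimal sketch of that last step in plain Python, run on invented `(author id, tags)` pairs standing in for `photo_list`. It also contrasts the substring test used above with a stricter whole-tag test; the function names are hypothetical, and the stricter test assumes that tags are separated by spaces.

```python
# Invented (author id, tags) pairs standing in for photo_list.
sample_photos = [
    ("111@N00", "sunset beach instagramapp"),
    ("222@N00", "cat"),
    ("333@N00", "instagram square"),
    ("444@N00", ""),
]


def mentions_instagram(tags):
    """Substring test: any tag containing 'instagram' counts."""
    return "instagram" in tags


def has_exact_tag(tags, wanted="instagram"):
    """Stricter test: some tag must equal `wanted` exactly (assumes space-separated tags)."""
    return wanted in tags.split()


# Booleans sum as 0 and 1, so summing the tests counts the matching photos.
substring_count = sum(mentions_instagram(tags) for _, tags in sample_photos)
exact_count = sum(has_exact_tag(tags) for _, tags in sample_photos)

print(substring_count, "of", len(sample_photos), "photos mention Instagram")
print(exact_count, "of", len(sample_photos), "photos carry the exact tag")
```

With the data frame built above, `photo_df["Instagram"].sum()` should give the count for the substring test, since `True` values are summed as 1.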