# Activity 03: Exploring Apple Photos

You may not know this, but every time you take a photo on your iPhone, machine learning runs behind the scenes to determine a lot about the photo: where it was taken, what's in it, and even how good a photo it is.

A data scientist, Simon Willison, built a tool that extracts all of that information from any Apple Photos database and puts it into a more readable format. The tool is called dogsheep-photos and is available to install on your Mac: https://github.com/dogsheep/dogsheep-photos. For today, we'll use about 500 of his photos to explore the information that an Apple Photos database contains. All of the data comes from https://dogsheep-photos.dogsheep.net/, a publicly available database of Simon Willison's photos.

## Getting Started

Start by loading all the table operations you learned about in Lesson 03: Tables and Intro to Python. These table operations are not part of the standard Python library, so we need to import them from the `datascience` Python module, which is maintained by UC Berkeley. You can do this by running the cell below. You'll see this code reappear quite a bit in this course!

In [None]:
from datascience import *

Now we need to load some information about the photos. We'll create two tables: `photos`, which will contain almost all the metadata about the photos, and `labels`, which contains additional information about what Apple thinks each photo shows (person, animal, tree, etc.).

In [None]:
photos = Table.read_table('data/apple_photos.csv')
labels = Table.read_table('data/apple_labels.csv')

### The `photos` Table

Let's take a look at what we're working with in the `photos` table. Run the cell below to see the first 10 rows of this table and all of its columns. You can scroll to the right to see the columns that don't fit on screen. We'll often call the column labels "attributes" or "features" of the data contained in each row.

In [None]:
photos

There are a lot of attributes in this table! Let's just keep a few to work with:

* photo
* uuid
* date
* place_city
* place_state_province
* place_country
* ZOVERALLAESTHETICSCORE
* pick another 1 or 2 that you're interested in exploring

Modify the cell below so that you select only the attributes / columns in the list above, plus one or two that you think might be interesting. Be sure to either copy/paste the labels or type very carefully. **Remember:** LaBeLs ArE cAsE sEnSiTiVe!

In [None]:
# You complete some code in this cell
my_photos = ...
my_photos

Explore this data set. Some questions to consider:

1. What does Apple think is the "best" photo as measured by its aesthetic score, contained in `ZOVERALLAESTHETICSCORE`?
2. How many photos were taken in the country of France? The city of Sausalito?
3. **Find something that you think is interesting and post about it to the Activity 2 thread in the EdSTEM discussion board.**

### The `labels` table

Start by looking at what the `labels` table contains:

In [None]:
labels

Let's keep only the `uuid` and `normalized_string` columns; the rest won't be too interesting:

In [None]:
my_labels = labels.select('uuid', 'normalized_string')
my_labels

Let's create a table of the observations where the photo contains a pelican. These rows will have the string `'pelican'` in the column labeled `normalized_string`. We'll call this table `pelican_uuid`, because we are interested in using the UUID to match it back to our `my_photos` table, which has a lot more information about each photo.

Complete the code below to create the table as described above. It should contain the columns `uuid` and `normalized_string` and only contain data about the photos that contain pelicans.

In [None]:
# You complete some code in this cell
pelican_uuid = ...
pelican_uuid

## What's the best pelican photo?

The code below will create a new table, `pelican_photos` which has all the information found in `my_photos`, but only for the photos we found in the `pelican_uuid` table. 

Don't worry about what this code is doing. Just run it and see how the table is structured.

In [None]:
pelican_photos = my_photos.join('uuid', pelican_uuid)
pelican_photos

Using table operations, sort the table named `pelican_photos` by `ZOVERALLAESTHETICSCORE` to determine which pelican photo Apple thinks looks best. Save this sorted table as `best_pelican_at_the_top`.

In [None]:
# You complete some code in this cell
best_pelican_at_the_top = ...
best_pelican_at_the_top

## The WINNER!

Assuming the table above is sorted properly, running the cell below should show you the best pelican as determined by Apple.

In [None]:
best_pelican = best_pelican_at_the_top.column('photo').item(0)[12:-8]
from IPython import display
display.Image(best_pelican)
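
The `[12:-8]` at the end of the first line is ordinary Python string slicing: it drops the first 12 characters and the last 8 characters of the value stored in the `photo` column, leaving just the image address. The sketch below uses a made-up value. The exact format of the real `photo` entries is an assumption here, chosen only so that the prefix and suffix have the right lengths.

```python
# Hypothetical cell value; the real format of the `photo` column is an assumption.
cell = '{"img_src":"https://example.com/pelican.jpeg?w=600"}'

prefix = '{"img_src":"'   # 12 characters to drop from the front
suffix = '?w=600"}'       # 8 characters to drop from the end

# Slicing from index 12 up to (but not including) the last 8 characters
# leaves only the address in the middle.
url = cell[12:-8]
print(url)
```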

## The LOSER!

Assuming the table above is sorted properly, running the cell below should show you the worst pelican as determined by Apple.

In [None]:
worst_pelican = best_pelican_at_the_top.column('photo').item(-1)[12:-8]
from IPython import display
display.Image(worst_pelican)