# Semantic search of your personal cloud photos

### A generative AI app that uses:

#### **Pinecone** vector database for embedding storage and semantic search

#### **Hugging Face** models and pipelines
#### **OpenAI CLIP** model for image and query text embedding creation

#### **Google API** to access your personal Google Photos to perform semantic search and find "that one photo" 

#### Tested locally on a **MacBook Pro**, M2 chip

# References

**polzerdo55862** has a great notebook tutorial on using the Google Photos API via Python. Some of the Google API cells below are copied from that notebook.
https://github.com/polzerdo55862/google-photos-api/blob/main/Google_API.ipynb

**Pinecone** quick tour shows how to initialize, fill, and delete a Pinecone "index" 
https://github.com/pinecone-io/examples/blob/master/docs/quick-tour/hello-pinecone.ipynb

**Pinecone** examples of how to query an index
https://github.com/pinecone-io/examples/blob/master/docs/semantic-search.ipynb

**HuggingFace** example of how to use CLIP in a stand-alone query and search pipeline
https://huggingface.co/openai/clip-vit-large-patch14

**Antti Havanko** example of how to use CLIP to generate embeddings for use in a vector search engine
https://anttihavanko.medium.com/building-image-search-with-openai-clip-5a1deaa7a6e2

# Getting Started Tips

## Create virtualenv and install required packages

(Use Anaconda, or follow steps 1-4 below.)

1. Open the terminal and navigate to your working directory. The folder structure of the repo includes the following directories:

    * **credentials**: folder to store the credentials you need to authenticate your "Python App" to the Google Photos Library


2. Create a virtual environment `python3 -m venv venv`, activate it `. ./venv/bin/activate` and install requirements `pip install -r requirements.txt`

3. Install ipykernel which provides the IPython kernel for Jupyter: `pip install ipykernel` and add your virtual environment to Jupyter: `python -m ipykernel install --user --name=venv` 

    You can check the installation by navigating to /Users/<user>/Library/Jupyter/kernels. There should be a new directory named after your kernel (`venv` if you used the command above; 'photoapp' in the author's setup). Inside it, the file 'kernel.json' defines the path to the Python installation the kernel uses.

4. Start Jupyter Notebook or JupyterLab (`jupyter lab .`) and select the newly created "venv" environment as the kernel

![](read_me_img/select_kernel.png)
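
To confirm which interpreter a registered kernel launches, you can read its `kernel.json` directly. The sketch below is a minimal helper, assuming the standard kernelspec layout in which the first entry of `argv` is the Python executable. The function name `kernel_python` and the example path are illustrative and not part of this repo.

```python
import json
from pathlib import Path


def kernel_python(kernel_json_path):
    """Return the interpreter path recorded in a Jupyter kernel.json file."""
    spec = json.loads(Path(kernel_json_path).read_text())
    # By kernelspec convention, argv[0] is the executable that starts the kernel.
    return spec["argv"][0]


# Hypothetical macOS location; adjust the kernel name ("venv") to your setup.
example = Path.home() / "Library" / "Jupyter" / "kernels" / "venv" / "kernel.json"
if example.exists():
    print(kernel_python(example))
else:
    print(f"No kernel spec found at {example}")
```

If the printed path does not point into your virtual environment, re-run the `ipykernel install` command from inside the activated environment.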

## Enable Google API

5. Enable Google Photos API Service

   1. Go to the Google API Console [https://console.cloud.google.com/](https://console.cloud.google.com/). 
   2. From the menu bar, select a project or create a new project.
   
      ![](read_me_img/gifs/create_new_project_speed.gif)
      
   3. To open the Google API Library, from the Navigation menu, select APIs & Services > Library. 
   4. Search for "Google Photos Library API". Select the correct result and click "Enable". If it's already enabled, click "Manage"
   
       ![](read_me_img/gifs/enable_api_speed.gif)
       
   5. Afterwards it will forward you to the "Photos API/Service details" page (https://console.cloud.google.com/apis/credentials)


6. Configure "OAuth consent screen" ([Source](https://stackoverflow.com/questions/65184355/error-403-access-denied-from-google-authentication-web-api-despite-google-acc))

   1. Go back to the Photos API Service details page and click on "[OAuth consent screen](https://console.cloud.google.com/apis/credentials/consent)" on the left side (below "Credentials") 
   2. Add a Test user: Use the email of the account you want to use for testing the API call
   
        ![](read_me_img/add_test_user.png)

7. Create API/OAuth credentials

   1. On the left side of the Google Photos API Service page, click Credentials
   2. Click on "Create Credentials" and create an OAuth client ID
   3. As application type, choose "Desktop app" and give the client you will use to call the API a name
   4. Download the JSON file for the created credentials, rename it to "client_secret.json" and save it in the folder "credentials"
   
        ![](read_me_img/gifs/create_credentials_speed.gif)
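
Before moving on, it can help to sanity-check the downloaded file. The sketch below is a hypothetical helper (`check_client_secret` is not part of the notebook). It assumes that credentials created as a "Desktop app" are stored under a top-level `"installed"` key, which is how such files are usually laid out; treat the list of required fields as an assumption too.

```python
import json
from pathlib import Path


def check_client_secret(path="credentials/client_secret.json"):
    """Return a list of problems found in the downloaded OAuth client file."""
    file = Path(path)
    if not file.exists():
        return [f"{file} does not exist"]
    try:
        data = json.loads(file.read_text())
    except json.JSONDecodeError as err:
        return [f"{file} is not valid JSON: {err}"]
    # Assumption: "Desktop app" clients live under the "installed" key.
    client = data.get("installed")
    if client is None:
        return ['no "installed" section; was the client created as a "Desktop app"?']
    required = ("client_id", "client_secret", "token_uri")
    return [f'missing field "{key}"' for key in required if key not in client]


# An empty list means no problems were found.
print(check_client_secret() or "client_secret.json looks usable")
```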

# Building the Semantic Photo Search Application

Verify that the notebook is using your virtual environment:

In [1]:
!which python
!which pip

/Users/joshuapoduska/anaconda3/envs/photoapp/bin/python
/Users/joshuapoduska/anaconda3/envs/photoapp/bin/pip


## Use the Google Photo Library API for the first time

The following section shows how to use OAuth credentials to authenticate with the Google Photos Library API. The code below covers the following steps:

Initialize GooglePhotosApi: `google_photos_api = GooglePhotosApi()`

Obtain credentials using the `client_secret.json` file: `creds = google_photos_api.run_local_server()`
        
       Calling the API for the first time:
       1. Google will ask whether you want to grant the app the permissions you defined with the scope
       2. Since it's just a test app at the moment, Google will warn you about this > Click on "Continue"
       3. Once you have granted the app the required permissions, a "token_......pickle" file is created in the folder "credentials". This token file is used for future calls.
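
The token handling described above comes down to three outcomes: reuse the cached token, refresh it, or run the browser consent flow. The sketch below is a simplified, hypothetical helper (`next_auth_step` is not part of the notebook) that mirrors the checks `run_local_server` makes in the class that follows. It only illustrates the decision and never contacts Google.

```python
def next_auth_step(has_token_file, valid=False, expired=False, has_refresh_token=False):
    """Decide what to do with cached credentials.

    Returns one of "reuse", "refresh", or "authorize".
    """
    if has_token_file and valid:
        return "reuse"       # the cached token still works
    if has_token_file and expired and has_refresh_token:
        return "refresh"     # obtain a new access token without user interaction
    return "authorize"       # open the browser consent flow


print(next_auth_step(has_token_file=False))
print(next_auth_step(has_token_file=True, expired=True, has_refresh_token=True))
```

Deleting the pickle file forces the "authorize" branch on the next run, which is why that is a common fix when authentication misbehaves.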

Class to establish GoogleAPI credentials

In [2]:
import pickle
import os
from google_auth_oauthlib.flow import Flow, InstalledAppFlow
from googleapiclient.discovery import build
#from googleapiclient.http import MediaFileUpload
from google.auth.transport.requests import Request
import requests

class GooglePhotosApi:
    def __init__(self,
                 api_name = 'photoslibrary',
                 client_secret_file= r'./credentials/client_secret.json',
                 api_version = 'v1',
                 scopes = ['https://www.googleapis.com/auth/photoslibrary']):
        '''
        Args:
            api_name: string, name of the API, e.g. "docs", "photoslibrary", ...
            client_secret_file: string, path to the downloaded OAuth client secret file
            api_version: string, version of the API
            scopes: list of strings, OAuth scopes to request
        '''

        self.api_name = api_name
        self.client_secret_file = client_secret_file
        self.api_version = api_version
        self.scopes = scopes
        self.cred_pickle_file = f'./credentials/token_{self.api_name}_{self.api_version}.pickle'

        self.cred = None

    def run_local_server(self):
        # check whether a pickle file with cached credentials already exists
        if os.path.exists(self.cred_pickle_file):
            with open(self.cred_pickle_file, 'rb') as token:
                self.cred = pickle.load(token)

        # if there are no valid stored credentials, refresh them or create new ones using google_auth_oauthlib.flow
        if not self.cred or not self.cred.valid:
            if self.cred and self.cred.expired and self.cred.refresh_token:
                self.cred.refresh(Request())
            else:
                flow = InstalledAppFlow.from_client_secrets_file(self.client_secret_file, self.scopes)
                self.cred = flow.run_local_server()

            with open(self.cred_pickle_file, 'wb') as token:
                pickle.dump(self.cred, token)
        
        return self.cred


### Initialize photos api and create service

In [4]:
# No errors means you are good to go
# A common fix is to delete the pickle file in the credentials folder and rerun this cell

google_photos_api = GooglePhotosApi()
creds = google_photos_api.run_local_server()

## Load the images and build metadata

### Setup

Use the Python requests module and the cached token to retrieve data from Google Photos

In [5]:
import json
import requests

def get_response_from_medium_api(year, month, day):
    url = 'https://photoslibrary.googleapis.com/v1/mediaItems:search'
    payload = {
                  "filters": {
                    "dateFilter": {
                      "dates": [
                        {
                          "day": day,
                          "month": month,
                          "year": year
                        }
                      ]
                    }
                  }
                }
    headers = {
        'content-type': 'application/json',
        'Authorization': 'Bearer {}'.format(creds.token)
    }
    
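    try:
        res = requests.request("POST", url, data=json.dumps(payload), headers=headers)
    except requests.RequestException as err:
        # re-raise: without a response there is nothing meaningful to return
        print(f'Request error: {err}')
        raise

    return res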
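
Note that a single search call returns only one page of results. The API documentation describes a default page size of 25 and a maximum of 100, with a `nextPageToken` in the response when more items exist. The sketch below shows one way to follow those tokens. `iter_media_items` and `fetch_page` are hypothetical names; `fetch_page` is any callable that sends the request body and returns the decoded JSON, which keeps the loop testable without network access.

```python
def iter_media_items(fetch_page, payload, page_size=100):
    """Yield every media item for a search payload, following nextPageToken.

    fetch_page: callable taking the request body (dict) and returning the
                decoded JSON response (dict).
    """
    body = dict(payload, pageSize=page_size)
    while True:
        data = fetch_page(body)
        # Days without photos return an empty object, so default to an empty list.
        yield from data.get("mediaItems", [])
        token = data.get("nextPageToken")
        if not token:
            break
        body = dict(body, pageToken=token)


# With the notebook's url and headers, fetch_page could look like this:
#   fetch_page = lambda body: requests.post(url, json=body, headers=headers).json()
```

This matters for busy days: without paging, any date with more items than the page size is silently truncated.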

Use the response of the API to write the results and required metadata into a data frame:

In [6]:
def list_of_media_items(year, month, day, media_items_df):
    '''
    Args:
        year, month, day: date used to filter the API call
        media_items_df: existing data frame with all media items found so far
    Return:
        items_list_df: media items found for the specified date
        media_items_df: media items data frame extended by the items found for the specified date
    '''

    items_list_df = pd.DataFrame()
    
    # create request for specified date
    response = get_response_from_medium_api(year, month, day)

    try:
        for item in response.json()['mediaItems']:
            # the nested mediaMetadata dict becomes the row index; keep only its creationTime row
            items_df = pd.DataFrame(item)
            items_df = items_df.rename(columns={"mediaMetadata": "creationTime"})
            items_df = items_df[items_df.index == 'creationTime']

            # append to the existing media_items data frame
            items_list_df = pd.concat([items_list_df, items_df])
            media_items_df = pd.concat([media_items_df, items_df])
    
    except KeyError:
        # dates with no media items return a JSON object without a 'mediaItems' key
        print(response.text)

    return items_list_df, media_items_df
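
To make the reshaping above easier to follow, here is a plain-Python sketch of the same idea: pull `creationTime` out of the nested `mediaMetadata` object and keep it next to the top-level fields. `flatten_media_item` is a hypothetical helper, and the sample item is made up and abbreviated for illustration.

```python
def flatten_media_item(item):
    """Reduce one mediaItems entry to the fields used later in the notebook."""
    return {
        "id": item["id"],
        "baseUrl": item["baseUrl"],
        # creationTime is nested under mediaMetadata in the API response
        "creationTime": item.get("mediaMetadata", {}).get("creationTime"),
    }


# Hypothetical, abbreviated response item (not real data)
sample = {
    "id": "abc",
    "baseUrl": "https://example.com/photo",
    "mimeType": "image/jpeg",
    "mediaMetadata": {"creationTime": "2023-02-02T03:54:08Z", "width": "4032"},
}
print(flatten_media_item(sample))
```

A list of such dictionaries can be passed to `pd.DataFrame(...)` in one call, which avoids concatenating one small data frame per item.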

Data fields of note:

**id** Immutable identifier of the media item
      
**baseUrl** Base URLs within the Google Photos Library API allow you to access the bytes of the media items. They are valid for 60 minutes. 

(https://developers.google.com/photos/library/guides/access-media-items)
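
Two practical consequences of base URLs are sketched below: the linked guide describes appending `=w{width}-h{height}` to request a scaled image, and a URL fetched more than 60 minutes ago should be refreshed before use. The helper names `sized_url` and `is_expired` are hypothetical, and tracking the fetch time yourself is an assumption of this sketch, since the notebook does not store it.

```python
from datetime import datetime, timedelta, timezone

BASE_URL_LIFETIME = timedelta(minutes=60)


def sized_url(base_url, width, height):
    """Append the =w{width}-h{height} size suffix to a base URL."""
    return f"{base_url}=w{width}-h{height}"


def is_expired(fetched_at, now=None):
    """Return True if a base URL fetched at `fetched_at` is 60 or more minutes old."""
    now = now or datetime.now(timezone.utc)
    return now - fetched_at >= BASE_URL_LIFETIME


fetched = datetime.now(timezone.utc)
print(sized_url("https://example.com/photo", 512, 512))
print(is_expired(fetched))
```

Requesting smaller images also reduces download time when the pictures are only needed for embedding.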

In [7]:
import timm
import transformers
import torch



Use the Mac's MPS (Metal Performance Shaders) backend if an NVIDIA GPU is not available

In [8]:
# Check that MPS is available
if not torch.backends.mps.is_available():
    if not torch.backends.mps.is_built():
        print("MPS not available because the current PyTorch install was not "
              "built with MPS enabled.")
    else:
        print("MPS not available because the current MacOS version is not 12.3+ "
              "and/or you do not have an MPS-enabled device on this machine.")

else:
    mps_device = torch.device("mps")

In [9]:
# Prefer CUDA, then Apple MPS, then fall back to the CPU (avoids a NameError when MPS is missing)
device = torch.device('cuda' if torch.cuda.is_available()
                      else 'mps' if torch.backends.mps.is_available() else 'cpu')

In [10]:
device

device(type='mps')

### Get list of media files for the dates specified

In [14]:
import pandas as pd
from datetime import date, timedelta, datetime
import requests

# create a list with all dates from the start date up to (but not including) the end date
sdate = date(2023,2,1)  
edate = date(2023,4,1)
# sdate = date(2023,9,21)  
# edate = date(2023,10,2)
# sdate = date(2023,10,2)  
# edate = date.today()
date_list = pd.date_range(sdate,edate-timedelta(days=1),freq='d')

media_items_df = pd.DataFrame()

for current_date in date_list:
    
    # get a list with all media items for specified date (year, month, day)
    # (the loop variable is not named `date`, which would shadow datetime.date)
    items_df, media_items_df = list_of_media_items(year = current_date.year, 
                                                   month = current_date.month, 
                                                   day = current_date.day, 
                                                   media_items_df = media_items_df)

(Output: the cell prints `{}` once for each date that has no media items, 17 times in this run.)



In [15]:
print(f'{len(media_items_df)} images captured')

172 images captured


### Build metadata in the dataframe

In [16]:
media_items_df.drop(['productUrl', 'mimeType', 'filename'], axis=1, inplace=True)
media_items_df['year'] = [int(x) for x in media_items_df['creationTime'].str[0:4].values]
media_items_df['month'] = [int(x) for x in media_items_df['creationTime'].str[5:7].values]
media_items_df['day'] = [int(x) for x in media_items_df['creationTime'].str[8:10].values]
media_items_df.reset_index(drop=True, inplace=True)
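
The string slicing above relies on `creationTime` always having the layout seen in this notebook (for example `2023-02-02T03:54:08Z`). Parsing the timestamp is a slightly more defensive alternative, because a malformed value raises an error instead of producing wrong numbers. The sketch below assumes second resolution with a trailing `Z`; `split_creation_time` is a hypothetical helper.

```python
from datetime import datetime


def split_creation_time(creation_time):
    """Parse an ISO-style UTC timestamp into year, month, and day."""
    # Assumption: timestamps look like 2023-02-02T03:54:08Z (no fractional seconds)
    parsed = datetime.strptime(creation_time, "%Y-%m-%dT%H:%M:%SZ")
    return {"year": parsed.year, "month": parsed.month, "day": parsed.day}


print(split_creation_time("2023-02-02T03:54:08Z"))
# {'year': 2023, 'month': 2, 'day': 2}
```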

### Show first image

In [17]:
from IPython.display import Image
Image(url=media_items_df['baseUrl'].values[0])

## Generate image embeddings

### Load images into memory

In [18]:
from PIL import Image
import requests

url_list = media_items_df['baseUrl'].values.tolist()
images=[Image.open(requests.get(i, stream=True).raw)  for i in url_list]

### Load the Model

In [21]:
from sentence_transformers import SentenceTransformer

img_model = SentenceTransformer('clip-ViT-B-32')

### Create embeddings

In [22]:
embeddings = img_model.encode(images)

### Load embeddings and metadata into the dataframe

In [23]:
img_embeddings = [embedding.tolist() for embedding in embeddings]
    
media_items_df['vector'] = img_embeddings
media_items_df['metadata'] = media_items_df.loc[:,['year','month','day']].to_dict('records')
media_items_df.head()

| | id | baseUrl | creationTime | year | month | day | vector | metadata |
|---|---|---|---|---|---|---|---|---|
| 0 | ALuQekrbqnqakVkqJYakQ4j5qzL4TTTbwqRT69PsQhJFaB... | https://lh3.googleusercontent.com/lr/AAJ1LKcpe... | 2023-02-02T03:54:08Z | 2023 | 2 | 2 | [-0.19268657267093658, 0.28584080934524536, 0.... | {'year': 2023, 'month': 2, 'day': 2} |
| 1 | ALuQekqGvuwe5LuX__EGFulwMtnGJBJSPJuOS4_r-MMnid... | https://lh3.googleusercontent.com/lr/AAJ1LKdPe... | 2023-02-02T03:53:54Z | 2023 | 2 | 2 | [-0.16807980835437775, 0.10398601740598679, 0.... | {'year': 2023, 'month': 2, 'day': 2} |
| 2 | ALuQekr9UXxZI9QDrGA7LpnzU-ETdX_858BchNH_3Fn5As... | https://lh3.googleusercontent.com/lr/AAJ1LKcda... | 2023-02-04T07:12:15Z | 2023 | 2 | 4 | [-0.2119603306055069, 0.1973370760679245, 0.37... | {'year': 2023, 'month': 2, 'day': 4} |
| 3 | ALuQekobrryEN7uljUaErFG2SmiJwFaQfrB4UfVxphnceg... | https://lh3.googleusercontent.com/lr/AAJ1LKc3t... | 2023-02-04T07:12:15Z | 2023 | 2 | 4 | [0.11843045800924301, 0.11033139377832413, 0.0... | {'year': 2023, 'month': 2, 'day': 4} |
| 4 | ALuQeko0Yzx8x4L9eI5nxhT6Q4srmwAGC4W1N11d-jDG1H... | https://lh3.googleusercontent.com/lr/AAJ1LKf0G... | 2023-02-04T07:12:15Z | 2023 | 2 | 4 | [-0.1757127046585083, 0.6187204122543335, 0.44... | {'year': 2023, 'month': 2, 'day': 4} |


## Load embeddings to Pinecone

Pinecone offers a free tier with one index: https://www.pinecone.io/pricing/

### Create a Pinecone index

In [24]:
from tqdm.autonotebook import tqdm
import getpass

In [25]:
import os
import pinecone

# get api key from app.pinecone.io
print("Enter your Pinecone API key") 
api_key = getpass.getpass()

# find your environment next to the api key in pinecone console
# print("Enter your Pinecone Environment") 
# env = getpass.getpass()
env = 'gcp-starter'

pinecone.init(
    api_key=api_key,
    environment=env
)

Enter your Pinecone API key


 ········


In [63]:
# Giving our index a name
index_name = "photo-captions"

In [64]:
# Delete the index, if an index of the same name already exists
if index_name in pinecone.list_indexes():
    pinecone.delete_index(index_name)

In [65]:
# vector dimensions
vdim = len(media_items_df['vector'][0])
print(f'{vdim} dimensions in each vector')

# vector count
vcount = len(media_items_df)
print(f'{vcount} images captured')

512 dimensions in each vector
172 images captured


In [66]:
import time

pinecone.create_index(name=index_name, dimension=vdim, metric="cosine")

# wait for index to be ready before connecting
while not pinecone.describe_index(index_name).status['ready']:
    time.sleep(1)
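
The index above is created with `metric="cosine"`. As a rough illustration of what that metric measures, the pure-Python sketch below computes cosine similarity between two small vectors. It is a textbook formula for intuition only, not Pinecone's implementation, and the example vectors are made up.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Vectors pointing the same way score 1.0 regardless of length;
# orthogonal vectors score 0.0.
print(cosine_similarity([1.0, 0.0], [2.0, 0.0]))
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```

Because only direction matters, an image embedding and a text embedding can be compared without normalising their lengths first.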

### Parallel async index load in batches of 100

In [68]:
import random
import itertools

def chunks(iterable, batch_size=100):
    """A helper function to break an iterable into chunks of size batch_size."""
    it = iter(iterable)
    chunk = tuple(itertools.islice(it, batch_size))
    while chunk:
        yield chunk
        chunk = tuple(itertools.islice(it, batch_size))

In [69]:
vectors=zip(media_items_df.id, media_items_df.vector, media_items_df.metadata)
index = pinecone.Index(index_name=index_name)

In [70]:
# Upsert data with 100 vectors per upsert request asynchronously
# - Create pinecone.Index with pool_threads=30 (limits to 30 simultaneous requests)
# - Pass async_req=True to index.upsert()
with pinecone.Index(index_name=index_name, pool_threads=30) as index:
    # Send requests in parallel
    async_results = [
        index.upsert(vectors=ids_vectors_chunk, async_req=True)
        for ids_vectors_chunk in chunks(vectors, batch_size=100)
    ]
    # Wait for and retrieve responses (this raises in case of error)
    [async_result.get() for async_result in async_results]
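
One detail worth knowing when re-running the upsert cell: `vectors` is a `zip` iterator, so it can be consumed only once. The sketch below demonstrates this with toy data. The `chunks` helper is repeated from the notebook so the example runs on its own, and the ids, vectors, and metadata are made-up stand-ins.

```python
import itertools


def chunks(iterable, batch_size=100):
    """Same helper as in the notebook, repeated so this sketch is self-contained."""
    it = iter(iterable)
    chunk = tuple(itertools.islice(it, batch_size))
    while chunk:
        yield chunk
        chunk = tuple(itertools.islice(it, batch_size))


# Toy stand-ins for the id, vector, and metadata columns (hypothetical values)
ids = ["a", "b", "c"]
vecs = [[0.1], [0.2], [0.3]]
meta = [{"year": 2023}] * 3

vectors = zip(ids, vecs, meta)
first_pass = [len(batch) for batch in chunks(vectors, batch_size=2)]
second_pass = [len(batch) for batch in chunks(vectors, batch_size=2)]
print(first_pass)   # [2, 1]
print(second_pass)  # [] because the zip iterator is already exhausted
```

If an upsert needs to be repeated, recreate `vectors` first (or wrap the `zip` in `list(...)`), otherwise no batches are sent.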

In [79]:
# poll until the upserted vectors are visible in the index stats
while index.describe_index_stats()['total_vector_count'] == 0:
    time.sleep(1)
index.describe_index_stats()

{'dimension': 512,
 'index_fullness': 0.00172,
 'namespaces': {'': {'vector_count': 172}},
 'total_vector_count': 172}

## Query embeddings with Pinecone

### Load the Text Model

In [33]:
text_model = SentenceTransformer('sentence-transformers/clip-ViT-B-32-multilingual-v1')

### Helper functions

In [34]:
import ipyplot

In [35]:
import calendar

def month_name(month):
    # calendar.month_name maps 1..12 to 'January'..'December' (in the default locale)
    return calendar.month_name[month]

In [36]:
def query_images(query, years_filter, months_filter, top_k):
    
    # create the query vector
    xq = text_model.encode(query).tolist()

    # now query
    xc = index.query(xq,
                     filter= {
                         "year": {"$in":years_filter},
                         "month": {"$in":months_filter}
                     }, 
                     top_k= top_k, 
                     include_metadata=True)

    img_urls = []
    meta_text = []

    # iterate over the returned matches; there may be fewer than top_k
    for match in xc['matches']:
        row = media_items_df.loc[media_items_df['id'] == match['id']].iloc[0]
        img_urls.append(row['baseUrl'])
        img_text = month_name(row['metadata']['month']) + ' of ' + str(row['metadata']['year'])
        meta_text.append(img_text)

    ipyplot.plot_images(img_urls, meta_text, img_width=250, show_url=False)
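
The `filter` argument above combines two `$in` conditions, so a photo must match one of the listed years and one of the listed months. The sketch below builds the same dictionary and evaluates it locally against sample metadata. `build_filter` and `matches_filter` are hypothetical helpers; the local matcher is only an approximation for reasoning about results and is not how Pinecone evaluates filters internally.

```python
def build_filter(years, months):
    """Build the metadata filter dictionary used in the query."""
    return {"year": {"$in": list(years)}, "month": {"$in": list(months)}}


def matches_filter(metadata, flt):
    """Local stand-in for `$in` semantics: every field must hit its allowed list."""
    return all(metadata.get(field) in cond["$in"] for field, cond in flt.items())


flt = build_filter([2023], [2, 3])
print(flt)
print(matches_filter({"year": 2023, "month": 2, "day": 4}, flt))
print(matches_filter({"year": 2023, "month": 5, "day": 1}, flt))
```

Checking a few rows of `media_items_df['metadata']` this way is a quick sanity test when a query returns fewer images than expected.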

### Queries

In [37]:
years_filter = [2023]
months_filter = [2, 3]
top_k = 3

In [80]:
query = "birthday party with a donut cake"
query_images(query, years_filter, months_filter, top_k)

In [81]:
query = "snowballs flying through the air in a snowball fight"
query_images(query, years_filter, months_filter, top_k)

In [82]:
query = "busted lip due to a snowball fight"
query_images(query, years_filter, months_filter, top_k)