# Bulk Upload Product Images from Google Drive to Shopify

![picture](drive-to-shopify.jpg)


### OBJECTIVE:

The following code allows Shopify store owners to bulk upload product images from a Google Drive folder to a Shopify store, via Shopify's "import products by CSV" feature.


### STEPS:

   * [1. Preparation](#1)
   * [2. Get Data from Google Drive](#2)
   * [3. Add Google Drive Image Data to Shopify Product CSV](#3)
   * [4. Export New Shopify Product CSV](#4)


### ACKNOWLEDGEMENTS:

Thanks to: [Jie Jenn](https://www.youtube.com/watch?v=kFR-O8BHIH4), [NeuralNine](https://www.youtube.com/watch?v=fkWM7A-MxR0), [alicia_mb](https://www.freepik.com/free-vector/hand-drawn-arrow-set_15961277.htm#query=hand%20drawn%20arrow&position=0&from_view=keyword&track=ais)

### 1. PREPARATION<a class="anchor" id="1"></a>

There are several steps that must be completed before running the code in this .ipynb file.

*1. Create a Shopify product .csv file*  
  
A detailed description of how to create a properly formatted Shopify product .csv file can be found [here](https://help.shopify.com/en/manual/products/import-export/using-csv). The product .csv file must be called shopify-products.csv and must be located in the same directory as this .ipynb file. The file's "Image Src" column may be empty or may already contain data. Existing "Image Src" values are overwritten for every product that has a matching image in the Google Drive folder.

*2. Upload product images to Google Drive*  

Upload all product images to a Google Drive folder. The folder must be shared so that it is viewable by "anyone with the link". Each product image must be named after the Shopify handle of its product. If a product has multiple images, name the first image after the product's Shopify handle and give each additional image the suffix "-image[number]". For example:

| Image Description | Shopify Handle | Image File Name |
| --- | --- | --- |
| the product's primary image | running-shoes | running-shoes.jpg |
| another image of the same product | running-shoes | running-shoes-image2.jpg |
| a third image of the same product | running-shoes | running-shoes-image3.jpg |

Images may be of any file type supported by Shopify.
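
The naming convention above can be checked before uploading. The following is a minimal sketch, not part of the notebook's code: the helper name `split_image_name` and its regular expression are illustrative assumptions, and primary images are reported as image number 1.

```python
import re
from pathlib import PurePath

# Hypothetical pattern: a handle, then "-image", then a number (case-insensitive)
_EXTRA_IMAGE = re.compile(r"^(?P<handle>.+)-image(?P<number>\d+)$", re.IGNORECASE)


def split_image_name(file_name):
    """Return (handle, image_number) for a product image file name."""
    stem = PurePath(file_name).stem  # drop the file extension
    match = _EXTRA_IMAGE.match(stem)
    if match:
        return match.group("handle"), int(match.group("number"))
    return stem, 1  # no suffix: treat as the product's primary image


for name in ["running-shoes.jpg", "running-shoes-image2.jpg", "running-shoes-image3.jpg"]:
    print(name, "->", split_image_name(name))
```

A handle returned by this helper should match a value in the "Handle" column of shopify-products.csv exactly.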

*3. Connect to the Google Drive API*  

A detailed description of how to connect to the Google Drive API can be found [here](https://developers.google.com/drive/api/quickstart/python). A video walkthrough of this process can be found [here](https://www.youtube.com/watch?v=fkWM7A-MxR0). Once the credentials file is downloaded, it must be named credentials.json and added to the same directory as this .ipynb file.

*4. Install all necessary Python packages*  

google-api-python-client  
google-auth-httplib2  
google-auth-oauthlib  
pandas  

A description of how to install the Google packages can be found [here](https://developers.google.com/drive/api/quickstart/python). A video walkthrough can be found [here](https://www.youtube.com/watch?v=fkWM7A-MxR0). pandas, which the code below imports, can be installed with pip in the same way.


### 2. GET DATA FROM GOOGLE DRIVE<a class="anchor" id="2"></a>

Once all of the preparation is complete, there should be 3 files in the current working directory: this .ipynb file, shopify-products.csv, and credentials.json. There should also be a folder on Google Drive containing the product images.  
  
The only variable in the following code that needs to be set is the Google Drive folder id. This can be copied from the web browser's address bar, after opening the Google Drive folder. The folder id is what comes after "folders/" in the URL.
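
As a convenience, the folder id can also be pulled out of a pasted URL programmatically. This is a small sketch under the assumption that the URL contains a `folders/<id>` segment as described above. The helper name `folder_id_from_url` and the example URL are hypothetical, and the id in the example is a placeholder rather than a real folder.

```python
import re


def folder_id_from_url(url):
    """Extract the id that follows 'folders/' in a Google Drive folder URL."""
    match = re.search(r"/folders/([^/?#]+)", url)
    if match is None:
        raise ValueError("No 'folders/<id>' segment found in: " + url)
    return match.group(1)


# Placeholder URL: FOLDER_ID_PLACEHOLDER is not a real folder id
example_url = "https://drive.google.com/drive/folders/FOLDER_ID_PLACEHOLDER?usp=sharing"
print(folder_id_from_url(example_url))
```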

In [None]:
# Folder id (taken from URL, after opening folder in browser)
folder_id = ''  # paste your folder id between the quotes

In [None]:
# Import libraries
from __future__ import print_function

import os.path

from google.auth.transport.requests import Request
from google.oauth2.credentials import Credentials
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build
from googleapiclient.errors import HttpError

import pandas as pd

In [None]:
# Create Drive API access token (this step can be skipped if you already have an access token)
# Code was copied from: https://developers.google.com/drive/api/quickstart/python

# If modifying these scopes, delete the file token.json.
SCOPES = ['https://www.googleapis.com/auth/drive.metadata.readonly']


def main():
    """Shows basic usage of the Drive v3 API.
    Prints the names and ids of the first 10 files the user has access to.
    """
    creds = None
    # The file token.json stores the user's access and refresh tokens, and is
    # created automatically when the authorization flow completes for the first
    # time.
    if os.path.exists('token.json'):
        creds = Credentials.from_authorized_user_file('token.json', SCOPES)
    # If there are no (valid) credentials available, let the user log in.
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(
                'credentials.json', SCOPES)
            creds = flow.run_local_server(port=0)
        # Save the credentials for the next run
        with open('token.json', 'w') as token:
            token.write(creds.to_json())

    try:
        service = build('drive', 'v3', credentials=creds)

        # Call the Drive v3 API
        results = service.files().list(
            pageSize=10, fields="nextPageToken, files(id, name)").execute()
        items = results.get('files', [])

        if not items:
            print('No files found.')
            return
        print('Files:')
        for item in items:
            print(u'{0} ({1})'.format(item['name'], item['id']))
    except HttpError as error:
        # TODO(developer) - Handle errors from drive API.
        print(f'An error occurred: {error}')


if __name__ == '__main__':
    main()

In [None]:
# Create service instance
SCOPES = ['https://www.googleapis.com/auth/drive.metadata.readonly']
creds = Credentials.from_authorized_user_file('token.json', SCOPES)
service = build('drive', 'v3', credentials=creds)

# Folder id (taken from URL, after opening folder in browser)
# folder_id = '' <----- This was set above, instead
query = "parents = '{}'".format(folder_id)

# Get file information for all files in the folder, one page at a time
response = service.files().list(q=query).execute()
files = response.get('files', [])
nextPageToken = response.get('nextPageToken')

while nextPageToken:
    response = service.files().list(q=query, pageToken=nextPageToken).execute()
    # Append each page (plain assignment would keep only the last page)
    files.extend(response.get('files', []))
    nextPageToken = response.get('nextPageToken')

# Save data to a Pandas DataFrame
drive_data = pd.DataFrame(files)
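
For folders with many images, the listing logic may be easier to reuse as a function. The sketch below is one possible arrangement, not the notebook's own code: `list_folder_files` is a hypothetical helper, it writes the query in the `'<id>' in parents` form and adds `trashed = false` to skip deleted files, and it asks only for the `id` and `name` fields. It expects a Drive v3 `service` object like the one built above.

```python
def list_folder_files(service, folder_id):
    """Collect id and name for every file in a Drive folder, across all pages."""
    query = "'{}' in parents and trashed = false".format(folder_id)
    files = []
    page_token = None
    while True:
        response = service.files().list(
            q=query,
            fields="nextPageToken, files(id, name)",
            pageToken=page_token,  # None on the first request
        ).execute()
        # Accumulate this page's files rather than replacing earlier pages
        files.extend(response.get("files", []))
        page_token = response.get("nextPageToken")
        if not page_token:
            return files
```

With this helper, the DataFrame could be created as `drive_data = pd.DataFrame(list_folder_files(service, folder_id))`.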

### 3. ADD GOOGLE DRIVE IMAGE DATA TO SHOPIFY PRODUCT CSV<a class="anchor" id="3"></a>

In [None]:
# Keep only necessary columns from Google Drive data
# (copy, so the column assignments below do not act on a view)
drive_data = drive_data[['id', 'name']].copy()

# Remove file extension from image name
drive_data['name'] = drive_data['name'].str.rsplit('.', n=1).str[0]

# Change id column into a URL to that file that Shopify can use
drive_data['id'] = "https://docs.google.com/uc?export=download&confirm=no_antivirus&id=" + drive_data['id']

# Filter out extra images (i.e. the 2nd, 3rd, etc. images of each product),
# whose names end in "-image" followed by a number
mask = drive_data['name'].str.contains(r'(?i)-image\d+$', na=False)
extra_images = drive_data[mask].copy()  # copy, because columns are added to it later
drive_data = drive_data[~mask]

# Import Shopify data
shopify_data = pd.read_csv('shopify-products.csv')

# Merge Shopify data with Drive data
data = pd.merge(shopify_data, drive_data, how='left', left_on='Handle', right_on='name')

# Update "Image Src"
data.loc[data['id'].notnull(), 'Image Src'] = data['id']

# Drop extra columns
data = data.drop(columns=['id', 'name'])

# Make separate columns for handles and image-number suffixes
# (case-insensitive, to stay consistent with the extra-image filter)
parts = extra_images['name'].str.extract(r'(?i)^(.*)-image(\d+)$')
extra_images['handle'] = parts[0]
extra_images['number'] = parts[1].astype(int)

# Put extra images in order
extra_images = extra_images.sort_values(['handle', 'number']).reset_index(drop=True)

# Create data frame with the same columns as a Shopify product .csv file
extra_images_formatted = pd.DataFrame(columns=data.columns)
extra_images_formatted['Handle'] = extra_images['handle']
extra_images_formatted['Image Src'] = extra_images['id']

# Slice and concatenate data frame, to insert extra images
for i in range(0, len(extra_images_formatted)):
    line_to_concat = extra_images_formatted[extra_images_formatted.index == i]
    handle_to_concat = extra_images_formatted.at[i, 'Handle']
    index_to_concat_at = data['Handle'].where(data['Handle'] == handle_to_concat).last_valid_index()
    data = pd.concat([data.iloc[:index_to_concat_at + 1],
               line_to_concat,
               data.iloc[index_to_concat_at + 1:]]).reset_index(drop=True)
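
An alternative to inserting rows one at a time is to append all extra-image rows and then stable-sort by the order in which each handle first appears. The sketch below illustrates the idea on toy data. The frames, URLs, and the `_rank` column are made up for the example, and it assumes that all rows for a given handle are already contiguous in the product file. Extra images whose handle is missing from the product file would end up at the bottom, so they may need to be filtered out first.

```python
import pandas as pd

# Toy stand-ins for the product data and the formatted extra-image rows
data = pd.DataFrame({
    "Handle": ["running-shoes", "running-shoes", "sandals"],
    "Title": ["Running Shoes", None, "Sandals"],
    "Image Src": ["url-shoes-1", None, "url-sandals-1"],
})
extra = pd.DataFrame({
    "Handle": ["running-shoes", "running-shoes"],
    "Image Src": ["url-shoes-2", "url-shoes-3"],
})

# Rank each handle by its first appearance in the product data
order = {handle: rank for rank, handle in enumerate(pd.unique(data["Handle"]))}

# Append the extra rows, then stable-sort by handle rank so that each
# product's extra images land directly after its existing rows
combined = pd.concat([data, extra], ignore_index=True)
combined["_rank"] = combined["Handle"].map(order)
combined = (combined.sort_values("_rank", kind="mergesort")  # mergesort is stable
                    .drop(columns="_rank")
                    .reset_index(drop=True))
print(combined[["Handle", "Image Src"]])
```

Because the sort is stable, rows keep their relative order within each handle, which preserves the image numbering established earlier.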

### 4. EXPORT NEW SHOPIFY PRODUCT CSV<a class="anchor" id="4"></a>

In [None]:
# Export product data
data.to_csv('shopify-products-new.csv', index=False)