# Amazon UK Query Suggestions

## Installations

Import the modules needed to run this notebook. Install fake-useragent (for example with `pip install fake-useragent`) before running it.

In [None]:
import pandas as pd
import numpy as np
import requests
import json
import time
import warnings

In [None]:
from fake_useragent import UserAgent

## UserAgent

Create a UserAgent instance and build request headers with a Chrome user-agent string before constructing the scraping URL.

In [None]:
ua = UserAgent()
headers = {"user-agent": ua.chrome}

## Loading List of Toys Collected from Previous Research

all_items.txt contains a list of strings, where each string represents a toy that will be searched on Amazon UK. This text file contains 166 rows.

In [None]:
with open('../predoc_info/all_items.txt') as f:
    contents = f.read().splitlines()

## Loading Pre-Documented Gender Stereotyped Toys

Load predoc_stereotyped_items.csv, a CSV file containing 72 rows, and preview the first 10 rows.

In [None]:
stereo_toys = pd.read_csv('../predoc_info/predoc_stereotyped_items.csv', delimiter =',')
stereo_toys[:10]

## Exploratory Data Analysis

### Dataset Statistics

Analyzing toys with respect to their pre-documented gender stereotype.

In [None]:
boy_toys = stereo_toys['BOY'].dropna().unique().tolist()
girl_toys = stereo_toys['GIRL'].dropna().unique().tolist()
neutral_toys = stereo_toys['NEUTRAL'].dropna().unique().tolist()

print("stereotypical boy toys: ", len(boy_toys), 
      " stereotypical girl toys: ", len(girl_toys), 
      " stereotypically gender neutral toys: ", len(neutral_toys))

## Preparing the data for query

### Adding "for"

The following code appends "for" to each item to form the query, keeping the original item alongside it.

In [None]:
# Pair each item with its query string: (item, "item for").
search_terms = [(x, x + ' for') for x in contents]
search_terms[:5]
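
The pairing can be checked on a small sample. This is a minimal sketch; the toy names below are placeholders chosen for illustration, not entries taken from all_items.txt.

```python
# Hypothetical sample items; the real list comes from all_items.txt.
sample_items = ["lego", "doll house"]

# Keep the original item alongside the query sent to the autocomplete endpoint.
sample_terms = [(item, f"{item} for") for item in sample_items]
print(sample_terms)
```

Keeping the unmodified item in the tuple lets each set of suggestions be traced back to the toy it belongs to.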

## Scraping Functions

### Amazon UK Scrape Function

amazon_scrape() inserts the query into the autocomplete URL and returns the parsed JSON response. amazon_auto() then extracts the suggestion strings from that response.

In [None]:
def amazon_scrape(query):
    # Percent-encode the query so spaces and characters such as "&" do not break the URL.
    prefix = requests.utils.quote(query)
    url = f"https://completion.amazon.com/api/2017/suggestions?session-id=131-6901588-5783061&customer-id=A373R49950VTB6&request-id=72HVV86S4JC3AK898B3X&page-type=Gateway&lop=en_gb&site-variant=desktop&client-info=amazon-search-ui&mid=ATVPDKIKX0DER&alias=aps&ks=undefined&prefix={prefix}&event=onFocusWithSearchTerm&limit=11&b2b=0&fresh=0&fb=1&suggestion-type=KEYWORD&suggestion-type=WIDGET&_=1637596795610"
    # verify=False disables TLS certificate verification for this request.
    response = requests.get(url, headers=headers, verify=False).json()
    return response
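
As an illustration of how a query ends up inside the URL, the sketch below builds the query string with `urllib.parse.urlencode` rather than string formatting. The helper name `build_suggestion_url` is hypothetical, and only a subset of the parameters from the URL above is reproduced; whether the endpoint accepts this reduced set is an untested assumption.

```python
from urllib.parse import urlencode

# Endpoint taken from the function above.
BASE_URL = "https://completion.amazon.com/api/2017/suggestions"


def build_suggestion_url(query, limit=11):
    """Return the request URL with the query safely encoded (hypothetical helper)."""
    # Subset of the original parameters, kept short for illustration only.
    params = {
        "lop": "en_gb",
        "alias": "aps",
        "prefix": query,
        "limit": limit,
        "suggestion-type": "KEYWORD",
    }
    return f"{BASE_URL}?{urlencode(params)}"


example_url = build_suggestion_url("doll house for")
print(example_url)
```

`urlencode` encodes spaces as `+` and reserved characters such as `&` as percent escapes, so a query cannot spill into neighbouring parameters.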

In [None]:
def amazon_auto(item):
    res = amazon_scrape(item)
    suggestions = res['suggestions']
    results = []
    for s in suggestions:
        results.append(s['value'])
    return results
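
The extraction step can be tried without a network call. The sketch below assumes only the response shape that amazon_auto() already relies on (a `suggestions` list whose entries carry a `value` key); the payload and the helper name `extract_values` are made up for illustration.

```python
def extract_values(response):
    """Return each suggestion's 'value', or [] if 'suggestions' is missing."""
    return [s["value"] for s in response.get("suggestions", []) if "value" in s]


# Hypothetical payload mirroring only the fields amazon_auto() reads.
mock_response = {
    "suggestions": [
        {"value": "lego for girls"},
        {"value": "lego for boys"},
    ]
}

print(extract_values(mock_response))
print(extract_values({}))
```

Using `.get` with a default avoids a `KeyError` when a response comes back without suggestions.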

## DataFrame Initialization

Initializing an empty DataFrame to store the scraped data.

In [None]:
columns = ['platform', 'item', 'suggestions']
df = pd.DataFrame(columns = columns)

## Running Queries

Query the autocomplete endpoint for every term in search_terms, pausing 1.5 seconds between requests.

In [None]:
warnings.filterwarnings('ignore')  # silence warnings, including those caused by verify=False
platforms = ['Amazon_UK']
data = []
for item, q in search_terms:
    for platform in platforms:
        result = amazon_auto(q)
        # Build one row keyed by the DataFrame column names.
        data.append(dict(zip(columns, [platform, item, result])))
        time.sleep(1.5)  # pause between requests
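
A single failed request would stop a loop like this partway through. One possible way to make it more tolerant is sketched below. The helper `collect_suggestions` and the stand-in `fake_fetch` are hypothetical names introduced here; in the notebook, amazon_auto would be passed as `fetch`.

```python
import time


def collect_suggestions(search_terms, fetch, platform="Amazon_UK", delay=1.5):
    """Query each term via `fetch`, recording an empty list when a request fails."""
    rows = []
    for item, query in search_terms:
        try:
            suggestions = fetch(query)
        except Exception as exc:  # e.g. a network error or unexpected JSON
            print(f"Failed for {query!r}: {exc}")
            suggestions = []
        rows.append({"platform": platform, "item": item, "suggestions": suggestions})
        time.sleep(delay)  # pause between requests
    return rows


# Stand-in for amazon_auto() so the sketch runs without network access.
def fake_fetch(query):
    if query.startswith("broken"):
        raise ValueError("no suggestions key")
    return [f"{query} girls", f"{query} boys"]


rows = collect_suggestions(
    [("lego", "lego for"), ("broken", "broken for")], fake_fetch, delay=0
)
print(rows)
```

Recording an empty list keeps one row per item, so failed queries remain visible in the output and can be retried later.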

Appending suggestion data to previously initialized dataframe.

In [None]:
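# DataFrame.append was removed in pandas 2.0; pd.concat is the supported equivalent.
df = pd.concat([df, pd.DataFrame(data, columns=columns)], ignore_index=True)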
df

Export data to CSV file.

In [None]:
df.to_csv('az_uk_query_suggestions.csv', index = False)
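
Because each `suggestions` cell holds a Python list, the CSV stores it as its string representation. The sketch below shows one way to recover the lists when reading the file back, using `ast.literal_eval` as a converter. The sample row is invented for illustration, and an in-memory buffer stands in for the real file.

```python
import ast
import io

import pandas as pd

# Hypothetical row in the same shape as the scraped data.
sample = pd.DataFrame(
    {
        "platform": ["Amazon_UK"],
        "item": ["lego"],
        "suggestions": [["lego for girls", "lego for boys"]],
    }
)

# Write to an in-memory buffer instead of a file to keep the sketch self-contained.
buffer = io.StringIO()
sample.to_csv(buffer, index=False)
buffer.seek(0)

# Lists are stored as strings in the CSV, so parse them back with ast.literal_eval.
restored = pd.read_csv(buffer, converters={"suggestions": ast.literal_eval})
print(restored.loc[0, "suggestions"])
```

Without the converter, the column would be read back as plain strings rather than lists.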