# Argos Query Suggestions

## Installations

Import the modules this notebook needs. Install fake-useragent before running it.

In [None]:
import pandas as pd
import numpy as np
import requests
import json
import time
import warnings
import matplotlib.pyplot as plt
import seaborn

In [None]:
from fake_useragent import UserAgent

## UserAgent

Initialize the request headers used for scraping. Note that ua is created here, but the headers below use a fixed User-Agent string rather than one generated by ua.

In [None]:
ua = UserAgent()
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36'}

## Loading List of Toys Collected from Previous Research

all_items.txt contains a list of strings, where each string represents a toy that will be searched on Argos. This text file contains 166 rows.

In [None]:
with open('../predoc_info/all_items.txt') as f:
    contents = f.read().splitlines()

## Loading Pre-Documented Gender Stereotyped Toys

Load predoc_stereotyped_items.csv, a CSV file containing 72 rows, and preview the first 10.

In [None]:
stereo_toys = pd.read_csv('../predoc_info/predoc_stereotyped_items.csv', delimiter =',')
stereo_toys[:10]

## Exploratory Data Analysis

### Dataset Statistics

Analyzing toys with respect to their pre-documented gender stereotype.

In [None]:
boy_toys = stereo_toys['BOY'].dropna().unique().tolist()
girl_toys = stereo_toys['GIRL'].dropna().unique().tolist()
neutral_toys = stereo_toys['NEUTRAL'].dropna().unique().tolist()

print("stereotypical boy toys: ", len(boy_toys), 
      " stereotypical girl toys: ", len(girl_toys), 
      " stereotypically gender neutral toys: ", len(neutral_toys))

Visualizing item gender distribution.

In [None]:
data = [len(boy_toys), len(girl_toys), len(neutral_toys)]
keys = ['Boys', 'Girls', 'Neutral']
  
# define Seaborn color palette to use
palette_color = seaborn.color_palette('bright')
  
# plotting data on chart
plt.pie(data, labels=keys, colors=palette_color, autopct='%.0f%%')
  
# displaying chart
plt.title('Percentage of Pre-documented Gendered Toys')
plt.show()

This pie chart shows the share of toys in each pre-documented gender category. Boys account for the largest share, followed by girls, then neutral.

## Preparing the data for query

### Adding "for"

The following code appends " for" to each item to form the query and keeps the original item alongside it as an (item, query) pair.

In [None]:
search_terms = []
for x in contents:
    search_terms.append((x, x+' for'))
search_terms[:5]
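
As a small illustration of the pairing above, the sketch below builds the same kind of (item, query) tuples with a list comprehension. The item names are hypothetical placeholders, not entries taken from all_items.txt.

```python
# Hypothetical items standing in for the contents of all_items.txt.
sample_items = ["doll", "toy car"]

# Pair each original item with its " for"-suffixed query.
sample_terms = [(item, f"{item} for") for item in sample_items]

print(sample_terms)
```

Keeping the original item in the tuple means the results can later be labelled by item rather than by the modified query.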

## Scraping Functions

### Argos Scrape Function

The following functions build the suggestion URL from the input query and parse the response. argos_scrape() requests the URL and returns the parsed JSON; argos_auto() extracts the suggestion strings from that JSON.

In [None]:
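def argos_scrape(query):
    # Note: the URL adds '%20f' (a space plus 'f') after the query text.
    url = f'https://www.argos.co.uk/suggest?term={query}%20f&fuzziness=true&size=5&includeFaq=true'
    # verify=False skips TLS certificate verification, which triggers the
    # warnings suppressed later with warnings.filterwarnings('ignore').
    response = requests.get(url, headers=headers, verify=False).json()
    return response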

In [None]:
def argos_auto(item):
    res = argos_scrape(item)
    suggestions = res['autoSuggest']['keywords']
    results = []
    for s in suggestions:
        results.append(s['value'])
    return results
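
argos_auto() indexes straight into res['autoSuggest']['keywords'], so a response without those keys would raise a KeyError. The sketch below shows one possible defensive variant. extract_suggestions is a hypothetical helper, and the mock payload only mirrors the shape implied by the code above; it is not taken from Argos documentation.

```python
def extract_suggestions(payload):
    """Return suggestion strings from a parsed response, or [] if keys are missing."""
    # Fall back to empty containers when a key is absent or set to None.
    keywords = (payload.get("autoSuggest") or {}).get("keywords") or []
    return [k["value"] for k in keywords if isinstance(k, dict) and "value" in k]


# Mock payload with the shape assumed by argos_auto(); values are placeholders.
mock_payload = {
    "autoSuggest": {
        "keywords": [{"value": "example suggestion 1"}, {"value": "example suggestion 2"}]
    }
}

print(extract_suggestions(mock_payload))
print(extract_suggestions({}))
```

Returning an empty list keeps the output column consistent when no suggestions come back for a query.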

## DataFrame Initialization

Initializing an empty DataFrame to store the scraped data.

In [None]:
columns = ['platform', 'item', 'suggestions']
df = pd.DataFrame(columns = columns)

## Running Queries

The code below fetches query suggestions for every toy in search_terms. To check that it works before iterating over the full list, use 'Trial Run' first.

### Trial Run

In [None]:
trial = search_terms[:5]

warnings.filterwarnings('ignore')
platform = 'Argos'
trial_data = []
for item, q in trial:
    result = argos_auto(q)
    values = [platform, item, result]
    zipped = zip(columns, values)
    a_dictionary = dict(zipped)
    time.sleep(1.5)
    trial_data.append(a_dictionary)

In [None]:
trial_data

### Full Run

In [None]:
warnings.filterwarnings('ignore')
platform = 'Argos'
data = []
for item, q in search_terms:
    result = argos_auto(q)
    values = [platform, item, result]
    zipped = zip(columns, values)
    a_dictionary = dict(zipped)
    time.sleep(1.5)
    data.append(a_dictionary)
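
The loop above stops at the first exception, which discards everything collected so far if a single request fails. The sketch below shows one way to keep going and record failures instead. collect and fake_fetch are hypothetical names; fake_fetch stands in for argos_auto() so the example runs without network access.

```python
import time


def collect(terms, fetch, platform="Argos", delay=0.0):
    """Run fetch(query) for each (item, query) pair, recording failures separately."""
    rows, failed = [], []
    for item, query in terms:
        try:
            suggestions = fetch(query)
        except Exception as exc:  # broad on purpose: any failure is logged, not fatal
            failed.append((item, repr(exc)))
        else:
            rows.append({"platform": platform, "item": item, "suggestions": suggestions})
        time.sleep(delay)  # pause between requests, as in the loop above
    return rows, failed


def fake_fetch(query):
    # Stand-in for argos_auto(); fails for one query to simulate a bad response.
    if query == "bad for":
        raise ValueError("simulated failure")
    return [query + " example"]


rows, failed = collect([("good", "good for"), ("bad", "bad for")], fake_fetch)
print(rows)
print(failed)
```

With the real function, a call along the lines of collect(search_terms, argos_auto, delay=1.5) would match the pacing used in this notebook, and the failed list could be retried afterwards.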

In [None]:
data[0]

Append the suggestion data to the previously initialized DataFrame.

In [None]:
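# DataFrame.append was removed in pandas 2.0; pd.concat is the replacement.
df = pd.concat([df, pd.DataFrame(data, columns=columns)], ignore_index=True)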
df

Export data to CSV file.

In [None]:
df.to_csv('argos_query_suggestions.csv', index = False)
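
One thing to keep in mind when reloading this file: the suggestions column holds Python lists, which are written to CSV as their string representation. The sketch below uses only the standard library to mimic that round trip and shows how ast.literal_eval can recover the list. The row values are placeholders, and the in-memory buffer stands in for the exported file.

```python
import ast
import csv
import io

# Write one row the way a list-valued cell ends up in a CSV: as str(list).
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["platform", "item", "suggestions"])
writer.writerow(["Argos", "example item", str(["suggestion a", "suggestion b"])])

# Read it back: the suggestions cell is now a plain string.
buffer.seek(0)
rows = list(csv.DictReader(buffer))
raw = rows[0]["suggestions"]

# Safely parse the string back into a Python list.
restored = ast.literal_eval(raw)
print(type(raw).__name__, type(restored).__name__)
```

ast.literal_eval only evaluates Python literals, which makes it a safer choice than eval for this kind of parsing.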