# Google Trends Tutorial


Google Trends is a website for investigating how different Google keyword searches have trended over time. There is so much you can do with this resource, from comparing the interest in different topics by state to discovering new trends in shopping searches.

### To get started, first check out the [Google Trends website](https://trends.google.com/trends/)
  - Enter some keyword search terms to see trends over time (you can compare up to 5 terms)
  - You can filter by date range, geography, search category (e.g. Shopping), and search type (e.g. News, Images, and the other standard kinds of Google search)
  - Google picks the aggregation level (daily or weekly) based on the length of the time frame; you can't choose it yourself

### A few important caveats to bear in mind with Google Trends
  - Google provides *sampled* data, so if you pull the same data twice there may be a small amount of variability.
  - The data is scaled so that the maximum value in any requested dataset is 100. If you pull separate reports for different keywords, you therefore can't tell their relative popularity; to compare search interest directly, pull the keywords together in a single request.
  - The search interest values provided are NOT the same as search *volume*. They are search interest *relative to all Google searches* during the time period and geography you are requesting. Since the overall usage of Google has grown over the years, this means that a search term that has constant search *volume* will show up as having decreasing search *interest* over time (e.g. "computer"). A term whose search volume has grown in pace with the internet's growth will show up as having a relatively constant search interest (e.g. "life").
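
To make the last two caveats concrete, here is a minimal sketch using made-up numbers (not real Google Trends data). The `scale_to_100` helper is hypothetical and only mimics the idea of rescaling a request so that its peak is 100; Google's actual processing also involves sampling and is not reproduced here.

```python
def scale_to_100(series_by_term):
    """Rescale every series so the largest value in the whole request is 100."""
    peak = max(max(values) for values in series_by_term.values())
    return {
        term: [round(100 * value / peak) for value in values]
        for term, values in series_by_term.items()
    }


# Made-up search counts per million searches for two terms over three periods
counts = {"skis": [20, 40, 80], "snowboards": [8, 16, 40]}

# Pulled separately: each term is scaled against its own peak, so both reach 100
separate = {
    term: scale_to_100({term: values})[term] for term, values in counts.items()
}

# Pulled together: both terms share one scale, so relative popularity is preserved
together = scale_to_100(counts)

# Constant search volume divided by a growing total gives falling search interest
volume = [1000, 1000, 1000]
all_searches = [100_000, 200_000, 400_000]
share = [v / total for v, total in zip(volume, all_searches)]
interest = scale_to_100({"computer": share})["computer"]

print(separate)
print(together)
print(interest)
```

In the separate pulls both terms peak at 100, which hides the fact that one term is searched half as often as the other at its peak. The joint pull keeps that ratio visible.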
  
### PyTrends
Google does not supply an API for Google Trends, but some lovely folks at General Mills have created the **PyTrends** package for pulling this data into Python. Documentation is in the README of the GitHub repo [here](https://github.com/GeneralMills/pytrends). You can install PyTrends by running:

```
pip install pytrends
```

Let's get started by importing the above package into this Jupyter Notebook:

In [None]:
from pytrends.request import TrendReq

import pandas as pd
import matplotlib.pyplot as plt

%matplotlib inline

# set a nice plotting style
# plt.style.use('ggplot')

In [None]:
# super_secrets.txt holds the Google username on line 1 and the password on line 2
with open('super_secrets.txt', 'r') as f:
    secrets = f.read().splitlines()
username, password = secrets

connector = TrendReq(username, password)

In [None]:
# Enter your Google username and password

# connector = TrendReq('username', 'password')

## Download some data

Now that you are logged into Google, you can build a "payload" that specifies what data you want to download. PyTrends returns the data as a pandas DataFrame.

In [None]:
my_search_terms = ['sklearn']  # list with up to 5 items

connector.build_payload(my_search_terms)
df = connector.interest_over_time()

In [None]:
df.head()

In [None]:
df.plot()

In [None]:
connector.build_payload?

### Building advanced requests

By default the `build_payload()` method will give you 5 years of worldwide data, with no filters applied on search type or category. There are many categories you can choose from by setting the `cat` parameter (default `cat=0` means all categories, i.e. a normal Google search); a few other options are:
- Shopping: `cat=18`
- Sports: `cat=20`
- Travel: `cat=67`
    
Similarly, the default setting of `gprop=''` gives you regular Web Search data. You can filter to other types of searches by setting `gprop` to:
- `'images'`
- `'news'`
- `'youtube'`
- `'froogle'` (Google Shopping)

Geography can be specified with the `geo` parameter, e.g. `geo='US'` will give you search interest in the U.S.

Timeframe is a bit tricky. If you need to be more specific than the default 5 year window, I recommend giving custom start and end dates. The structure for this is to set the `timeframe` parameter to a string with a space between start and end dates: `'yyyy-mm-dd yyyy-mm-dd'`. See the following example.

In [None]:
# Google Shopping searches for skis and snowboards in the US, August 2014 to August 2017
connector.build_payload(['skis', 'snowboards'], cat=0, timeframe='2014-08-01 2017-08-01',
                        geo='US', gprop='froogle')

In [None]:
df = connector.interest_over_time()
df.plot()
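
The options described above (up to 5 search terms, the `gprop` values, and the `'yyyy-mm-dd yyyy-mm-dd'` timeframe string) can be checked before a request is sent. The sketch below is one possible way to do that. `make_payload_kwargs` and `ALLOWED_GPROP` are hypothetical names introduced for illustration and are not part of PyTrends; the keyword names in the returned dictionary assume that `build_payload()` accepts `kw_list`, `cat`, `timeframe`, `geo` and `gprop`.

```python
from datetime import date

# Search types listed in this tutorial ('' means regular Web Search)
ALLOWED_GPROP = {"", "images", "news", "youtube", "froogle"}


def make_payload_kwargs(terms, start=None, end=None, cat=0, geo="", gprop=""):
    """Validate the arguments and return keyword arguments for build_payload()."""
    if not 1 <= len(terms) <= 5:
        raise ValueError("provide between 1 and 5 search terms")
    if gprop not in ALLOWED_GPROP:
        raise ValueError(f"gprop must be one of {sorted(ALLOWED_GPROP)}")
    if (start is None) != (end is None):
        raise ValueError("provide both start and end dates, or neither")

    kwargs = {"kw_list": list(terms), "cat": cat, "geo": geo, "gprop": gprop}
    if start is not None:
        if start >= end:
            raise ValueError("start date must be before end date")
        # Custom window: start and end dates separated by a single space
        kwargs["timeframe"] = f"{start.isoformat()} {end.isoformat()}"
    return kwargs


kwargs = make_payload_kwargs(
    ["skis", "snowboards"],
    start=date(2014, 8, 1),
    end=date(2017, 8, 1),
    geo="US",
    gprop="froogle",
)
print(kwargs["timeframe"])

# With a connector from earlier in the notebook, this could then be used as:
# connector.build_payload(**kwargs)
```

When no dates are given, the `timeframe` key is left out so that the default window of `build_payload()` applies.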

In [None]:
connector.build_payload(['whistler', 'snowbird'], geo='US')
df = connector.interest_over_time()
df.plot()

In [None]:
connector.interest_by_region()  # check out hawaii!

### Related search terms

In [None]:
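# rebuild the payload first: the last request was for 'whistler' and 'snowbird',
# so related queries for 'sklearn' would otherwise be missing
connector.build_payload(['sklearn'])
related = connector.related_queries()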

In [None]:
related.keys()

In [None]:
related['sklearn']['top']

In [None]:
related['sklearn']['rising']