# Task: Collect Google Trends weekly and daily data since 2015-01-01

This notebook uses pytrends, a Python API for Google Trends, to collect the data. First, install pytrends with the commands below. They are commented out because the library was already installed on my laptop; uncomment them to install it on your machine, then comment them out again.

In [1]:
# !pip install pytrends
# !pip install pytrends --upgrade

Next, import the required libraries and create a pytrends client:

In [2]:
import pandas as pd
from pytrends.request import TrendReq
pytrend = TrendReq()

Identify the keyword needed to get the data:

In [3]:
kw_list = ["Bitcoin"]

Next, declare the timeframe we need to collect data from:

In [4]:
pytrend.build_payload(kw_list, timeframe='2015-01-01 2022-09-18', geo='', gprop='')

First, take a look at the data returned for this timeframe. It is monthly. Google Trends changes the frequency of the data points depending on the length of the selected timeframe, so to get weekly or daily data we need to either shorten the timeframe or use a helper from the library, and then merge the pieces. That is the approach I take in this assessment.

In [5]:
# Monthly data
df = pytrend.interest_over_time()
df

date,Bitcoin,isPartial
2015-01-01,3,False
2015-02-01,3,False
2015-03-01,3,False
2015-04-01,2,False
2015-05-01,2,False
...,...,...
2022-05-01,35,False
2022-06-01,41,False
2022-07-01,30,False
2022-08-01,22,False
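
Because the frequency depends on the requested timeframe, it can help to check the spacing of the returned index instead of reading the table by eye. The following is a minimal sketch, not part of pytrends: `detect_frequency` is a hypothetical helper, and the synthetic indexes stand in for `pytrend.interest_over_time().index` so the example runs without a network call.

```python
import pandas as pd


def detect_frequency(index: pd.DatetimeIndex) -> str:
    """Return a rough label for the spacing of a DatetimeIndex."""
    # Median gap between consecutive timestamps, in whole days.
    median_gap = index.to_series().diff().dropna().dt.days.median()
    if median_gap <= 1:
        return "daily"
    if median_gap <= 7:
        return "weekly"
    return "monthly"


# Synthetic indexes that mimic the monthly and weekly results in this notebook.
monthly_idx = pd.date_range("2015-01-01", periods=6, freq="MS")
weekly_idx = pd.date_range("2015-01-04", periods=6, freq="W-SUN")

print(detect_frequency(monthly_idx))
print(detect_frequency(weekly_idx))
```

In the notebook, the same check could be applied as `detect_frequency(df.index)` right after each call to `interest_over_time()`.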


For daily data, I use the dailydata module that ships with the pytrends package.

In [10]:
# Daily data
from pytrends import dailydata

df3 = dailydata.get_daily_data('bitcoin', 2015, 1, 2022, 9, geo='')
df3.to_csv('dailyData.csv')  # export the daily data to a CSV file (also uploaded to the GitHub repository)

bitcoin:2015-01-01 2015-01-31
bitcoin:2015-02-01 2015-02-28
bitcoin:2015-03-01 2015-03-31
bitcoin:2015-04-01 2015-04-30
bitcoin:2015-05-01 2015-05-31
bitcoin:2015-06-01 2015-06-30
bitcoin:2015-07-01 2015-07-31
bitcoin:2015-08-01 2015-08-31
bitcoin:2015-09-01 2015-09-30
bitcoin:2015-10-01 2015-10-31
bitcoin:2015-11-01 2015-11-30
bitcoin:2015-12-01 2015-12-31
bitcoin:2016-01-01 2016-01-31
bitcoin:2016-02-01 2016-02-29
bitcoin:2016-03-01 2016-03-31
bitcoin:2016-04-01 2016-04-30
bitcoin:2016-05-01 2016-05-31
bitcoin:2016-06-01 2016-06-30
bitcoin:2016-07-01 2016-07-31
bitcoin:2016-08-01 2016-08-31
bitcoin:2016-09-01 2016-09-30
bitcoin:2016-10-01 2016-10-31
bitcoin:2016-11-01 2016-11-30
bitcoin:2016-12-01 2016-12-31
bitcoin:2017-01-01 2017-01-31
bitcoin:2017-02-01 2017-02-28
bitcoin:2017-03-01 2017-03-31
bitcoin:2017-04-01 2017-04-30
bitcoin:2017-05-01 2017-05-31
bitcoin:2017-06-01 2017-06-30
bitcoin:2017-07-01 2017-07-31
bitcoin:2017-08-01 2017-08-31
bitcoin:2017-09-01 2017-09-30
bitcoin:20
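
The log above suggests that the daily data is requested one calendar month at a time. As an illustration of how such month-sized windows could be generated, here is a small sketch. It is not the pytrends implementation; `month_windows` is a hypothetical helper, and it only reproduces the timeframe strings seen in the log.

```python
import calendar


def month_windows(start_year, start_month, end_year, end_month):
    """Return 'YYYY-MM-01 YYYY-MM-DD' strings, one per calendar month (inclusive)."""
    windows = []
    year, month = start_year, start_month
    while (year, month) <= (end_year, end_month):
        # monthrange returns (weekday of first day, number of days in month).
        last_day = calendar.monthrange(year, month)[1]
        windows.append(f"{year}-{month:02d}-01 {year}-{month:02d}-{last_day:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return windows


for window in month_windows(2015, 1, 2015, 3):
    print(window)
```

Each string has the same format as the timeframe argument passed to `build_payload` earlier in this notebook.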

For weekly data, I adjust the timeframe so that Google Trends returns weekly data points. Testing on the Google Trends website showed that a 5-year timeframe returns weekly data, so I split the period into two timeframes: one from 2015-01-01 to 2020-01-01, and one from 2020-01-01 to now (this code was written on 2022-09-18, so that is the end date used here). I create one dataframe per timeframe and then append the second to the first to get the complete data for this assessment.

In [11]:
# Weekly data
pytrend.build_payload(kw_list, timeframe='2015-01-01 2020-01-01', geo='', gprop='')
df1 = pytrend.interest_over_time()

In [12]:
df1

date,Bitcoin,isPartial
2015-01-04,3,False
2015-01-11,3,False
2015-01-18,2,False
2015-01-25,2,False
2015-02-01,2,False
...,...,...
2019-12-01,9,False
2019-12-08,8,False
2019-12-15,9,False
2019-12-22,8,False


In [13]:
pytrend.build_payload(kw_list, timeframe='2020-01-01 2022-09-18', geo='', gprop='')
df2 = pytrend.interest_over_time()
df2

date,Bitcoin,isPartial
2020-01-05,14,False
2020-01-12,14,False
2020-01-19,13,False
2020-01-26,13,False
2020-02-02,14,False
...,...,...
2022-08-14,24,False
2022-08-21,23,False
2022-08-28,24,False
2022-09-04,22,False


In [14]:
df_weekly = pd.concat([df1, df2])  # DataFrame.append was removed in pandas 2.0
df_weekly

date,Bitcoin,isPartial
2015-01-04,3,False
2015-01-11,3,False
2015-01-18,2,False
2015-01-25,2,False
2015-02-01,2,False
...,...,...
2022-08-14,24,False
2022-08-21,23,False
2022-08-28,24,False
2022-09-04,22,False
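
One caveat when joining the two pieces: Google Trends scales each request to 0-100 relative to the peak inside that request's timeframe, so df1 and df2 are not necessarily on the same scale. This may explain part of the jump from 8 at the end of 2019 to 14 at the start of 2020. A possible adjustment, assuming the two windows are fetched with several weeks of overlap (which is not what the cells above do), is to rescale the second series by the ratio of the means over the shared dates. The sketch below uses synthetic data, and `stitch_overlapping` is a hypothetical helper, not a pytrends function.

```python
import pandas as pd


def stitch_overlapping(first: pd.Series, second: pd.Series) -> pd.Series:
    """Rescale `second` onto the scale of `first` via their shared dates, then join."""
    shared = first.index.intersection(second.index)
    if len(shared) == 0:
        raise ValueError("the two series need overlapping dates")
    # Ratio of average levels over the overlap.
    # Assumes the overlap mean of `second` is not zero.
    factor = first.loc[shared].mean() / second.loc[shared].mean()
    rescaled = second * factor
    # Keep `first` wherever both exist, then add the later part of `second`.
    tail = rescaled.loc[~rescaled.index.isin(first.index)]
    return pd.concat([first, tail])


# Synthetic weekly series that overlap on 2019-12-15 and 2019-12-22.
dates_a = pd.date_range("2019-12-01", periods=4, freq="W-SUN")
dates_b = pd.date_range("2019-12-15", periods=4, freq="W-SUN")
first = pd.Series([10.0, 10.0, 20.0, 20.0], index=dates_a)
second = pd.Series([40.0, 40.0, 80.0, 60.0], index=dates_b)

stitched = stitch_overlapping(first, second)
print(stitched)
```

With real data, `first` and `second` would be the Bitcoin columns of two overlapping requests. A mean ratio is only one choice; a median ratio would be less sensitive to a single spike in the overlap.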


In [15]:
df_weekly.to_csv('weeklyData.csv')  # export the weekly data to a CSV file (also uploaded to GitHub)