# Extracting Google Search Data - PyTrends
This notebook demonstrates how to use PyTrends to extract daily data from Google Search.<br>
I've created a simple function that extracts data for multiple search terms and countries and merges the results, using the pytrends library: https://pypi.org/project/pytrends/

In [None]:
%%time
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

In [None]:
%%time
# Install Pytrends
!pip install pytrends

In [None]:
%%time
# Import PyTrends
from pytrends.request import TrendReq
from pytrends import dailydata

In [None]:
%%time
def extract_data(start_date: str, end_date: str, search_terms: list, countries: list) -> pd.DataFrame:
    """
    Extract the score and unscaled score for a list of search terms
    and a list of countries between a start date and an end date.
    Only the year and month of each date are passed to pytrends, so the request
    is made per calendar month; the information is aggregated by day.
    
    Args: 
        start_date: Start date for the data extraction
        end_date: End date for the data extraction
        search_terms: A list of strings containing the search items example = ['hat', 'mug', 'sticker'] 
        countries: A list of strings containing the countries example = ['FI', 'SE']
    
    Returns: 
        search_df: Combined DataFrame of all the searched terms by country
    
    """
    start_date = pd.Timestamp(start_date)
    end_date = pd.Timestamp(end_date)
    
    frames = []  # one DataFrame per (search term, country) pair
    total = len(search_terms) * len(countries)

    counter = 1
    for search_term in search_terms:
        for country in countries:
            print('Extracting:', counter, '/', total, '...')
            df = dailydata.get_daily_data(search_term,
                                          start_year = start_date.year,
                                          start_mon  = start_date.month,
                                          stop_year  = end_date.year,
                                          stop_mon   = end_date.month,
                                          geo        = country,
                                          verbose    = False
                                         )

            # Keep only the columns of interest, under term-independent names
            df['date'] = df.index
            df['search_word'] = search_term
            df['country'] = country
            df['score'] = df[search_term]
            df['score_unscaled'] = df[search_term + '_unscaled']
            df = df[['date', 'search_word', 'country', 'score', 'score_unscaled']]
            frames.append(df)
            counter += 1

    # DataFrame.append was removed in pandas 2.0, so concatenate once at the end
    search_df = pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()
    print('...')
    print('Extraction Completed!')
    return search_df
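
The function keeps two columns per term, `score` and `score_unscaled`. As far as I understand the pytrends source, the unscaled value is the daily interest within a single month (0 to 100), and the scaled value multiplies it by that month's own interest divided by 100, so that days from different months become comparable. The sketch below only illustrates that arithmetic; `scale_daily_scores` is a hypothetical helper, not part of pytrends, and the numbers are made up.

```python
def scale_daily_scores(daily_unscaled, monthly_score):
    """Rescale within-month daily scores by the month's own score (both on a 0-100 scale)."""
    return [value * monthly_score / 100 for value in daily_unscaled]


# Made-up numbers, for illustration only: three days in a month whose monthly score is 40
scaled = scale_daily_scores([50, 100, 80], monthly_score=40)
print(scaled)
```

A day that peaks at 100 within a quiet month therefore ends up with a lower `score` than the same peak in a busy month.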

In [None]:
%%time
example_df = extract_data(start_date = '01-01-2015', end_date = '01-01-2015', search_terms = ['hat', 'mug'], countries = ['FI', 'SE'])
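
Note that the start and end dates in the call above are the same day, yet `extract_data` only forwards the year and month of each date. Assuming pytrends pulls complete months (which is how I read its source), the call returns all of January 2015 rather than a single day. A minimal sketch of that window, using a hypothetical `month_window` helper that is not part of pytrends:

```python
import calendar
from datetime import date


def month_window(start: date, end: date):
    """Return the first day of the start month and the last day of the end month."""
    first_day = date(start.year, start.month, 1)
    days_in_end_month = calendar.monthrange(end.year, end.month)[1]
    last_day = date(end.year, end.month, days_in_end_month)
    return first_day, last_day


# Same start and end date as in the example call
window = month_window(date(2015, 1, 1), date(2015, 1, 1))
print(window)
```

If exact day boundaries matter, the returned DataFrame can be filtered on its `date` column afterwards.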

In [None]:
%%time
example_df.sample(10)
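
The result is in long format: one row per date, search word and country. For plotting or modelling it can be handy to have one column per (country, search word) pair instead. Here is a rough sketch of that reshaping with `pivot_table`; `toy_df` and `wide_df` are hypothetical names, and the values are placeholders laid out like the output of `extract_data`, not real Google Trends data.

```python
import pandas as pd

# Placeholder values in the same column layout that extract_data returns
toy_df = pd.DataFrame({
    'date': pd.to_datetime(['2015-01-01', '2015-01-01', '2015-01-02', '2015-01-02']),
    'search_word': ['hat', 'mug', 'hat', 'mug'],
    'country': ['FI', 'FI', 'FI', 'FI'],
    'score': [10.0, 20.0, 30.0, 40.0],
    'score_unscaled': [25, 50, 75, 100],
})

# One row per date, one column per (country, search_word) pair
wide_df = toy_df.pivot_table(index='date',
                             columns=['country', 'search_word'],
                             values='score')
print(wide_df)
```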