# Fetching Data with an API and Preparing the Data

In [None]:
import numpy as np
import pandas as pd

## Requesting data from Google's YouTube API

First we have to create credentials. Go to https://console.cloud.google.com/ and sign in with your Google account if you haven't already. Click "Create Project" and give the new project any name you like. When prompted to enable APIs, browse the API library and select the YouTube Data API. Once you've enabled the API, you should be able to access your API key.

In [None]:
# Paste your YouTube API key here
api_key = "YOUR_API_KEY"  # placeholder: replace with your own key
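
Pasting a key into a notebook makes it easy to share the key by accident. As a minimal sketch of one alternative, the key can be read from an environment variable. The variable name `YOUTUBE_API_KEY` and the helper `load_api_key` are assumptions made for this illustration, not part of the YouTube API.

```python
import os


def load_api_key(var_name="YOUTUBE_API_KEY"):
    """Return the API key stored in an environment variable.

    The variable name is a hypothetical choice for this sketch.
    """
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first.")
    return key
```

With the variable set in your shell before starting Jupyter, `api_key = load_api_key()` would replace the pasted string.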

We will request data from Cal Poly Humboldt's YouTube channel ([@CalPolyHumboldt](https://www.youtube.com/@CalPolyHumboldt)).

In [None]:
# The requests library sends HTTP requests; import it whenever we want to request data from an API
import requests

Read the YouTube API documentation [here](https://developers.google.com/youtube/v3/docs). We first want to get data about the YouTube channel. Navigate to the Channels endpoint and read its documentation.

In [None]:
# The URL for the request: the channels endpoint, looked up by channel handle
url = "https://www.googleapis.com/youtube/v3/channels?key=" + api_key + "&part=snippet&forHandle=CalPolyHumboldt"
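
The URL above is a base address followed by `?` and a list of `name=value` pairs joined with `&`. The sketch below builds the same kind of query string from a dictionary using only the standard library, which avoids concatenation mistakes. The key shown is a placeholder. Passing a dictionary to `requests.get(..., params=...)` performs this encoding for you.

```python
from urllib.parse import urlencode

base_url = "https://www.googleapis.com/youtube/v3/channels"
params = {
    "key": "YOUR_API_KEY",  # placeholder, not a real key
    "part": "snippet",
    "forHandle": "CalPolyHumboldt",
}

# urlencode joins the pairs with "&" and escapes special characters
url = base_url + "?" + urlencode(params)
print(url)
```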

In [None]:
# Request the data with a "GET" request
response = requests.get(url)
response

In [None]:
# check out response
type(response)

In [None]:
# learn more about response
help(response)

In [None]:
# use the json() method to access the data
response.json()

In [None]:
# Save the response data
payload = response.json()
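
A request can fail (for example, with an invalid key) and still return JSON. Google APIs generally describe the failure in an object under an `error` key. The sketch below checks for that object before going further. The helper name and both sample payloads are made up for illustration, and the exact error fields are an assumption to verify against the documentation.

```python
def describe_error(payload):
    """Return a short message if the payload holds an error object, else None."""
    error = payload.get("error")
    if error is None:
        return None
    return f"{error.get('code')}: {error.get('message')}"


# Hypothetical payloads, for illustration only
bad = {"error": {"code": 400, "message": "API key not valid."}}
good = {"kind": "youtube#channelListResponse", "items": []}

print(describe_error(bad))
print(describe_error(good))
```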

In [None]:
# Inspect the payload's top-level keys
payload.keys()

In [None]:
# Inspect the data
payload['items'][0]['id']

In [None]:
# Save the channel id
channel_id = payload['items'][0]['id']

In [None]:
payload['items'][0]['snippet']
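
The payload is a dictionary whose `items` entry is a list of dictionaries, so indexing with `[0]` fails if the list is empty (for instance, when no channel matches the handle). Here is a small sketch of a guarded lookup. The sample payload and its values are hypothetical and only mimic the shape used above.

```python
# Hypothetical payload that mimics the shape of a channels response
sample = {
    "items": [
        {
            "id": "UC_HYPOTHETICAL_ID",
            "snippet": {"title": "Example Channel", "description": "..."},
        }
    ]
}


def first_item(payload):
    """Return the first entry of payload['items'], or None if there is none."""
    items = payload.get("items", [])
    return items[0] if items else None


item = first_item(sample)
channel_id = item["id"]
title = item["snippet"]["title"]
print(channel_id, title)
```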

In [None]:
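# Use the channel ID to search for the channel's videos (parameters passed as a dict this time)
search_url = "https://www.googleapis.com/youtube/v3/search"
# Assumed parameter choices: videos only, newest first, up to 50 results
parameters = {"key": api_key, "channelId": channel_id, "part": "snippet", "type": "video", "order": "date", "maxResults": 50}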

search_response = requests.get(search_url, params = parameters)

In [None]:
# Check that the request succeeded (a status code of 200 means OK)
search_response.status_code

In [None]:
payload = search_response.json()
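
A single search response holds only one page of results. When more are available, the payload includes a `nextPageToken`, which can be sent back as the `pageToken` parameter. The sketch below shows the looping pattern. `fetch_all_items` and the fake pages are hypothetical stand-ins so the example runs without a network connection.

```python
def fetch_all_items(fetch_page, max_pages=5):
    """Collect items across pages.

    fetch_page(token) must return one parsed JSON page as a dict.
    max_pages caps the number of requests to protect the API quota.
    """
    items, token = [], None
    for _ in range(max_pages):
        page = fetch_page(token)
        items.extend(page.get("items", []))
        token = page.get("nextPageToken")
        if token is None:
            break
    return items


# Stand-in for real requests: two fake pages keyed by page token
fake_pages = {
    None: {"items": [1, 2], "nextPageToken": "p2"},
    "p2": {"items": [3]},
}
all_items = fetch_all_items(lambda token: fake_pages[token])
print(all_items)  # [1, 2, 3]
```

With real requests, `fetch_page` could be something like `lambda token: requests.get(search_url, params={**parameters, "pageToken": token}).json()`. This relies on `requests` leaving out parameters whose value is `None`.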

## Parsing/Preparing the Data

In [None]:
# Put the data in a pandas DataFrame (the nested dicts stay packed inside single columns)
payload_df = pd.DataFrame(payload['items'])
payload_df.head()

In [None]:
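# Flatten the nested dicts so each nested field gets its own column
payload_normalized_df = pd.json_normalize(payload['items'])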
payload_normalized_df.head()
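
To show why a flattening step helps, the sketch below compares `pd.DataFrame` with `pd.json_normalize` on one hand-written item. The item is hypothetical and only imitates the nested shape of a search result.

```python
import pandas as pd

# One hypothetical search result with nested dictionaries
items = [
    {
        "kind": "youtube#searchResult",
        "id": {"kind": "youtube#video", "videoId": "abc123"},
        "snippet": {"title": "Example video", "publishedAt": "2024-01-01T00:00:00Z"},
    }
]

plain = pd.DataFrame(items)      # nested dicts stay inside single cells
flat = pd.json_normalize(items)  # nested keys become dotted column names

print(list(plain.columns))
print(list(flat.columns))
```

The flattened frame has columns such as `id.videoId` and `snippet.title`, which is why the column names get shortened in a later step.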

In [None]:
payload_normalized_df.drop(columns = 'kind',inplace=True)

In [None]:
# Shorten the dotted column names: keep the last part, or the last two parts for thumbnail columns
payload_normalized_df.columns = ['_'.join(i.split('.')[-2:]) if 'snippet.thumbnails' in i 
                                 else i.split('.')[-1] for i in payload_normalized_df.columns]
payload_normalized_df.head()
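
The renaming rule is easier to follow on a plain list of names. The sketch below applies the same logic to a few sample column names, chosen here for illustration. Thumbnail columns keep two parts because several of them end in `url`. Note that `id.kind` and a top-level `kind` would both shorten to `kind`, which is one reason to drop the top-level `kind` column first.

```python
def short_name(column):
    """Shorten a dotted column name using the same rule as the cell above."""
    parts = column.split(".")
    if "snippet.thumbnails" in column:
        return "_".join(parts[-2:])  # e.g. default_url, high_url
    return parts[-1]


columns = [
    "etag",
    "id.videoId",
    "snippet.title",
    "snippet.thumbnails.default.url",
    "snippet.thumbnails.high.url",
]
print([short_name(c) for c in columns])
# ['etag', 'videoId', 'title', 'default_url', 'high_url']
```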

In [None]:
# Work on a copy so the normalized data stays unchanged
clean_df = payload_normalized_df.copy()
clean_df.head()

## Enhancing the data with video-specific info

In [None]:
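# Test getting data for a specific video
video_id = clean_df['videoId'][0]  # first video from the search results
video_url = "https://www.googleapis.com/youtube/v3/videos"
video_params = {"key": api_key, "id": video_id, "part": "statistics"}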

In [None]:
response_video_stats_test = requests.get(video_url,params = video_params)

In [None]:
response_video_stats_test.json()

Get data for multiple videos

In [None]:
# Join the video IDs into one comma-separated string
ids = ','.join(clean_df['videoId'])
ids
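
One comma-separated string works for a single page of results, but the API limits how many IDs one request may carry. The sketch below splits a longer list into batches. The batch size of 50 is an assumption to confirm in the videos endpoint documentation, and the IDs are made up.

```python
def chunk_ids(video_ids, size=50):
    """Split a list of IDs into comma-separated strings of at most `size` IDs."""
    return [
        ",".join(video_ids[i:i + size])
        for i in range(0, len(video_ids), size)
    ]


sample_ids = [f"id{n}" for n in range(120)]  # hypothetical IDs
batches = chunk_ids(sample_ids)
print(len(batches))  # 3
```

Each string in `batches` could then be sent as the `id` parameter of its own request.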

In [None]:
# Create parameters to request statistics for all of the videos at once
more_video_params = {"key": api_key, "id": ids, "part": "statistics"}

In [None]:
# Request the data
response_stats = requests.get(video_url, params = more_video_params).json()

In [None]:
# Inspect the result
response_stats.keys()

In [None]:
# Access the statistics
response_stats['items'][0]['statistics']

In [None]:
# Add to the dataframe
clean_df['viewCount'] = [i['statistics']['viewCount'] for i in response_stats['items']]
clean_df.head()

In [None]:
# Add like counts; .get() returns None when a video's statistics have no likeCount
clean_df['likeCount'] = [i['statistics'].get('likeCount') for i in response_stats['items']]
clean_df.head()
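
Assigning a list to a new column assumes the response holds one item per row, in the same order as the DataFrame. As a more defensive sketch, the statistics can be matched to rows by video ID. The counts arrive as strings, so they are also converted to numbers. All of the data below is hypothetical.

```python
import pandas as pd

clean = pd.DataFrame({"videoId": ["a", "b", "c"]})

# Hypothetical response items: out of order, one video missing, one without likes
stats_items = [
    {"id": "c", "statistics": {"viewCount": "30"}},
    {"id": "a", "statistics": {"viewCount": "10", "likeCount": "2"}},
]

# Look up each video's statistics by ID instead of by position
stats_by_id = {item["id"]: item["statistics"] for item in stats_items}

for field in ("viewCount", "likeCount"):
    values = clean["videoId"].map(lambda vid: stats_by_id.get(vid, {}).get(field))
    clean[field] = pd.to_numeric(values)  # strings become numbers, gaps become NaN

print(clean)
```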

## Activity

**Activity 1:** Explore other endpoints or parts of the YouTube API to get more information about Cal Poly Humboldt's channel or a specific video.

**Activity 2:** With a partner, choose a YouTube channel and use the `requests` module to fetch basic video data from the YouTube API (e.g. videoId, publishedAt, title).

**Activity 3:** Put the response data in a pandas DataFrame and use it to create two new columns, `date` and `time`, showing the date and the time each video was published.

In [None]:
import numpy as np
import pandas as pd

## Requesting data from Google's YouTube API

First we have to create credentials. Go to https://console.cloud.google.com/ and sign in with your Google account if you haven't already. Click "Create Project" and give the new project any name you like. When prompted to enable APIs, browse the API library and select the YouTube Data API. Once you've enabled the API, you should be able to access your API key.

In [None]:
# Paste your YouTube API key here
api_key = "YOUR_API_KEY"  # placeholder: replace with your own key

We will request data from Cal Poly Humboldt's YouTube channel (@HumboldtOnline). For this, we need the channel's ID. Go to YouTube and open one of Cal Poly Humboldt's videos. Right-click on the page and choose View Page Source. Search (Ctrl-F) the page source for https://www.youtube.com/channel/. The channel ID appears directly after /channel/ in the URL path.
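
The manual search through the page source can also be expressed as a regular expression. This is a minimal sketch: the helper name, the sample HTML, and the ID in it are hypothetical, and the pattern assumes that channel IDs contain only letters, digits, underscores, and hyphens.

```python
import re


def extract_channel_id(page_source):
    """Return the text that follows youtube.com/channel/ or None if absent."""
    match = re.search(r"youtube\.com/channel/([A-Za-z0-9_-]+)", page_source)
    return match.group(1) if match else None


# Hypothetical fragment of page source, for illustration only
sample_source = '<link rel="canonical" href="https://www.youtube.com/channel/UC_hypothetical-ID123">'
print(extract_channel_id(sample_source))  # UC_hypothetical-ID123
```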

In [None]:
# Paste the channel ID here
channel_id = "YOUR_CHANNEL_ID"  # placeholder: replace with the ID found in the page source

In [None]:
# The requests library sends HTTP requests; import it whenever we want to request data from an API
import requests

Read the YouTube API documentation [here](https://developers.google.com/youtube/v3/docs).

In [None]:
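# The URL to request from: the search endpoint, limited to this channel's videos
# (the type, order, and maxResults values are assumed choices)
url = "https://www.googleapis.com/youtube/v3/search?key=" + api_key + "&channelId=" + channel_id + "&part=snippet&type=video&order=date&maxResults=50"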

In [None]:
# Request the data with a "GET" request
response = requests.get(url)
response

In [None]:
# Check what kind of object the response is
type(response)

In [None]:
# Learn more about the response object
help(response)

In [None]:
# Use the json() method to access the data
response.json()

In [None]:
# Save the response data
payload = response.json()

In [None]:
# Inspect the payload's top-level keys
payload.keys()

In [None]:
# Inspect the first item in the data
payload['items'][0]

## Parsing/Preparing the Data

In [None]:
# Put the data in a pandas DataFrame
payload_df = pd.DataFrame(payload['items'])
payload_df.head()

In [None]:
# Inspect the id data
payload_df['id']

In [None]:
# Inspect the snippet data
payload_df['snippet']

In [None]:
# Create a new dataframe to store the clean data
clean_df = pd.DataFrame()

In [None]:
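# Add the video ID (assumes every item is a video, so each id dict has a videoId)
clean_df['videoID'] = [i['videoId'] for i in payload_df['id']]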
clean_df.head()

In [None]:
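# Add the published date/time (the column name is an assumed choice)
clean_df['publishedAt'] = [i['publishedAt'] for i in payload_df['snippet']]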
clean_df.head()

In [None]:
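# Add the video title (the column name is an assumed choice)
clean_df['title'] = [i['title'] for i in payload_df['snippet']]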
clean_df.head()

## Enhancing the data with video-specific info

In [None]:
# Test getting data for a specific video
video_id = clean_df['videoID'][0]  # first video in the cleaned data
video_stats_test = "https://www.googleapis.com/youtube/v3/videos?id=" + video_id + "&part=statistics&key=" + api_key

In [None]:
response_video_stats_test = requests.get(video_stats_test)

In [None]:
response_video_stats_test.json()

Get data for multiple videos

In [None]:
# Join the video IDs into one comma-separated string
ids = ','.join(clean_df['videoID'])
ids

In [None]:
# Create url for the video statistics
url_for_stats = "https://www.googleapis.com/youtube/v3/videos?id="+ids+"&part=statistics&key="+api_key

In [None]:
# Request the data and parse the JSON in one step
response_stats = requests.get(url_for_stats).json()

In [None]:
# Inspect the result
response_stats.keys()

In [None]:
# Access the statistics
response_stats['items'][0]['statistics']

In [None]:
# Add to the dataframe
clean_df['viewCount'] = [i['statistics']['viewCount'] for i in response_stats['items']]
clean_df.head()

In [None]:
# Add like counts; .get() returns None when a video's statistics have no likeCount
clean_df['likeCount'] = [i['statistics'].get('likeCount') for i in response_stats['items']]
clean_df.head()

## Activity

**Activity 1:** With a partner, choose a YouTube channel and use the `requests` module to fetch basic video data from the YouTube API (e.g. videoId, publishedAt, title).

**Activity 2:** Put the response data in a pandas DataFrame and use it to create two new columns, `date` and `time`, showing the date and the time each video was published.