# Data Connector for YouTube 

In this example, we go over how to use Data Connector with YouTube.

## Prerequisites

data_connector is a component of the dataprep library that simplifies data access by providing a standard set of APIs, so that users can skip complex API configuration. This tutorial demonstrates how to use data_connector with YouTube.

If you haven't installed dataprep, run `pip install dataprep` or execute the following cell.

In [1]:
# Run me if you'd like to install
#!pip install dataprep

To use the YouTube API, you first need a [Google Account](https://accounts.google.com/signup/v2/webcreateaccount?continue=https%3A%2F%2Faccounts.google.com%2FManageAccount%3Fnc%3D1&flowName=GlifWebSignIn&flowEntry=SignUp).

# Obtaining an access token from YouTube

Assuming you have a Google account, you can obtain an API token by following these three steps:

1. Log in to [Google Cloud Platform](https://console.developers.google.com/) using your Google account. On your dashboard, click on **Select a Project** and choose an existing project if you have one, or click on **New Project** and provide a project name and organization as required.

2. Next, click on the **Enable APIs and Services** button in the top left corner of the window. Scroll down to find **YouTube Data API v3** and **Enable** the API service. Check that the API service is enabled, as shown in the image below.

![title](images/youtube_enabled.png)

3. Under **APIs & Services** in your project's Navigation Menu, navigate to the **Credentials** section. Click on **Create Credentials** at the top of the window and select **API Key**. This generates an API key that can be used to search data from YouTube using Data Connector.

![title](images/youtube_credentials.png)

# Initialize data_connector

To initialize, run the following code. Copy and paste your YouTube API key into the **auth_token** variable and ensure the connector name or path passed to `Connector` is correct. This returns an object that establishes a connection with YouTube. Once that is running, you can use the built-in functions available in the connector.

In [5]:
from dataprep.data_connector import Connector

auth_token = "<your_access_token>"
dc = Connector('youtube', _auth={"access_token":auth_token})

dc

<dataprep.data_connector.connector.Connector at 0x7f296ba68f10>
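
The key does not have to be pasted into the notebook itself. A minimal sketch, assuming the key has been exported as an environment variable beforehand; the variable name `YOUTUBE_API_KEY` and the helper `load_api_key` are hypothetical and not part of dataprep:

```python
import os


def load_api_key(var_name="YOUTUBE_API_KEY"):
    """Return the API key stored in the environment variable `var_name`.

    `YOUTUBE_API_KEY` is a hypothetical name chosen for this sketch.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {var_name} environment variable to your API key first."
        )
    return key


# Usage, assuming the variable is set and dataprep is installed:
# auth_token = load_api_key()
# dc = Connector('youtube', _auth={"access_token": auth_token})
```

Keeping the key out of the notebook makes it less likely to be shared by accident when the notebook is committed or published.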

# Functionalities

Data Connector provides several functions for exploring the data downloaded from YouTube.

### Connector.info
The info method gives information and guidelines on using the connector. The response has three sections: table, parameters, and examples.
>1. Table - The table(s) being accessed.
>2. Parameters - Identifies which parameters can be used to call the method. For YouTube:
    * **q** is a required parameter that acts as a filter to fetch relevant video content.
    * **part** is a required parameter for retrieving any resource from YouTube. It lets you fetch only the resource components that your application actually uses (for example, snippet, contentDetails, player, statistics). To learn more about the part parameter, see the [YouTube Developer Documentation](https://developers.google.com/youtube/v3/getting-started#part).
    * **type** is an optional parameter that lets you specify the type of data (for example, videos, channels, or playlists). If the type is not specified, all types of content related to your search query are fetched.
    * **maxResults** is an optional parameter that specifies the number of items to fetch per request.
>3. Examples - Shows how you can call the methods in the Connector class.
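
The required and optional parameters can be checked before a query is sent. The following is only a sketch: `check_query_params` is a hypothetical helper, not a dataprep function, and the two parameter sets are copied from the `dc.info()` output for the `videos` table.

```python
# Parameter names as reported by dc.info() for the videos table.
REQUIRED = {"q", "part"}
OPTIONAL = {"type", "maxResults", "pageToken"}


def check_query_params(**params):
    """Raise ValueError if a required parameter is missing or an unknown one is given."""
    given = set(params)
    missing = REQUIRED - given
    unknown = given - REQUIRED - OPTIONAL
    if missing:
        raise ValueError(f"missing required parameter(s): {sorted(missing)}")
    if unknown:
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")
    return params


# Usage: validate first, then pass the same keyword arguments on.
params = check_query_params(q="Data Science", part="snippet", maxResults=40)
# df = dc.query("videos", **params)
```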

In [3]:
dc.info()


Table youtube.videos

Parameters
----------
q, part required 
type, maxResults, pageToken optional 

Examples
--------
>>> dc.query("videos", q="word1", part="word2")
>>> dc.show_schema("videos")



### Connector.show_schema
The show_schema method returns the schema of the website data that will be returned in a DataFrame. There are two columns in the response: the first is the column name and the second is the data type.

As an example, let's see what is in the videos table.

In [4]:
dc.show_schema("videos")

table: videos


| | column_name | data_type |
|---|---|---|
| 0 | etag | string |
| 1 | videoId | string |
| 2 | publishedAt | string |
| 3 | channelId | string |
| 4 | title | string |
| 5 | description | string |
| 6 | channelTitle | string |
| 7 | publishTime | string |


### Connector.query
The query method downloads the website data and displays it in a DataFrame. The parameters must meet the requirements indicated by `Connector.info` for the operation to run. You can use the **maxResults** parameter to specify the number of videos/channels/playlists to be fetched. Each request can currently fetch a maximum of 50 items.
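
Because a single request returns at most 50 items, collecting more than that means issuing several requests and passing along a page token each time. The sketch below illustrates that loop against a stand-in function. `collect_items` and `fake_fetch_page` are hypothetical helpers, and the response keys `items` and `nextPageToken` are assumed to follow the YouTube Data API search response; how Data Connector itself exposes the next-page token is not covered here.

```python
MAX_PER_REQUEST = 50  # per-request limit stated above


def collect_items(fetch_page, total):
    """Gather up to `total` items by following next-page tokens.

    `fetch_page(max_results, page_token)` is a hypothetical callable that
    returns a dict with an "items" list and an optional "nextPageToken".
    """
    items, token = [], None
    while len(items) < total:
        batch = min(MAX_PER_REQUEST, total - len(items))
        response = fetch_page(batch, token)
        items.extend(response.get("items", []))
        token = response.get("nextPageToken")
        if not token or not response.get("items"):
            break  # no further pages
    return items[:total]


def fake_fetch_page(max_results, page_token, _data=list(range(120))):
    """Stand-in for a real API call: serves slices of a 120-element list."""
    start = int(page_token or 0)
    end = start + max_results
    page = {"items": _data[start:end]}
    if end < len(_data):
        page["nextPageToken"] = str(end)
    return page


print(len(collect_items(fake_fetch_page, 120)))  # 120, fetched as 50 + 50 + 20
```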

When the data is received from the server, it is in either JSON or XML format. data_connector reformats the data into a pandas DataFrame for the convenience of downstream operations.
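
To give a feel for this reformatting step, here is a minimal sketch that flattens a nested JSON search result into flat rows. It is not dataprep's actual implementation: `flatten_item` is a hypothetical helper, the nested layout (`id.videoId`, `snippet.*`) is assumed to follow the YouTube Data API search response, and the sample values are taken from the first row of the query output in this tutorial.

```python
import json

# Assumed shape of a search response, reduced to a single item.
sample_response = json.loads("""
{
  "items": [
    {
      "etag": "6znlxNDEq3exijy84F6TBYh_Voc",
      "id": {"kind": "youtube#video", "videoId": "xC-c7E5PK0Y"},
      "snippet": {
        "publishedAt": "2018-06-23T01:51:50Z",
        "channelId": "UCV0qA-eDDICsRR9rPcnG7tw",
        "channelTitle": "Joma Tech"
      }
    }
  ]
}
""")


def flatten_item(item):
    """Turn one nested search result into a flat row (dict); missing fields become None."""
    snippet = item.get("snippet", {})
    return {
        "etag": item.get("etag"),
        "videoId": item.get("id", {}).get("videoId"),
        "publishedAt": snippet.get("publishedAt"),
        "channelId": snippet.get("channelId"),
        "channelTitle": snippet.get("channelTitle"),
    }


rows = [flatten_item(item) for item in sample_response["items"]]
print(rows[0]["videoId"])  # xC-c7E5PK0Y
# With pandas installed, pandas.DataFrame(rows) gives the tabular form.
```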

As an example, let's try to fetch **40 videos** related to **Data Science** from YouTube.

#### Searching for Videos related to Data Science

In [8]:
df = dc.query("videos", q="Data Science", part="snippet", type='videos', maxResults=40)
df

| | etag | videoId | publishedAt | channelId | title | description | channelTitle | publishTime |
|---|---|---|---|---|---|---|---|---|
| 0 | 6znlxNDEq3exijy84F6TBYh_Voc | xC-c7E5PK0Y | 2018-06-23T01:51:50Z | UCV0qA-eDDICsRR9rPcnG7tw | What REALLY is Data Science? Told by a Data Sc... | Resume Template and Cover letter I used for ap... | Joma Tech | |
| 1 | BHk3ZOf85w_GJwg3WuR76i-q7nU | ua-CiDNNj30 | 2019-05-30T12:48:19Z | UC8butISFwT-Wl7EV0hUK0BQ | Learn Data Science Tutorial - Full Course for ... | Learn Data Science is this full tutorial cours... | freeCodeCamp.org | |
| 2 | xYtx3nTrU44ji_QqH_c5vo7UymM | 4OZip0cgOho | 2020-05-08T13:00:03Z | UCiT9RITQ9PW6BhXK0y2jaeg | How I Would Learn Data Science (If I Had to St... | In this video, I talk about how I would learn ... | Ken Jee | |
| 3 | FE_xTTsNKnXGVu9BTpKldUsYrCM | X3paOmcrTjQ | 2018-12-04T14:30:01Z | UCsvqVGtbbyHaMoevxPAq9Fg | Data Science In 5 Minutes \| Data Science For B... | This Data Science tutorial video will give you... | Simplilearn | |
| 4 | NsdHjZvCjLF_8mkwcbf5JuNbuBY | -ETQ97mXXF0 | 2019-08-18T08:30:02Z | UCkw4JCwteGrDHIsyIIKo4tQ | Data Science Full Course - Learn Data Science ... | Data Science Master Program: https://www.edure... | edureka! | |
| 5 | gjuObVjaLkAKukHym389_YrHNpw | m5pwx3hgtzM | 2019-12-16T18:46:58Z | UCiT9RITQ9PW6BhXK0y2jaeg | 3 Reasons You Should NOT Become a Data Scientist | In this video I talk about 3 reasons that you ... | Ken Jee | |
| 6 | bSpzDqJtfgC2QCpCKqTngxsbHfs | iJUzouXg5kY | 2019-07-23T21:28:41Z | UCsT0YIqwnpJCM-mx7-gSA4Q | Demystifying Data Science \| Mr.Asitang Mishra ... | In this talk Mr.Asitang Mishra relates his exp... | TEDx Talks | |
| 7 | faKapvwZjBip4FT75gEOhXDXKtA | tQYCd8tg56U | 2019-08-08T10:46:27Z | UCeObZv89Stb2xLtjLJ0De3Q | Big data, дополненная реальность и компьютерно... | В сегодняшнем выпуске у меня в гостях Data Sci... | АйТиБорода | |
| 8 | rjNxmufFnhJR_gCJATtUojNzF4E | PXLVLS1vJHY | 2020-01-15T16:15:01Z | UCEBpSZhI1X8WaP-kY_2LLcg | Is Data Science Really a Rising Career in 2020... | Download Our Free Data Science Career Guide:✅h... | 365 Data Science | |
| 9 | pWjhYaLQQ9_DeEZ3V13ZXKDWOB0 | UXi8Ml2UoYk | 2019-03-08T08:05:41Z | UCEBpSZhI1X8WaP-kY_2LLcg | What Do You Need to Become a Data Scientist in... | Download Our Free Data Science Career Guide:✅h... | 365 Data Science | |
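
Once the results are available, a few lines of post-processing are often enough to get a first summary. The sketch below works on plain tuples copied from four rows of the output above, so it runs without dataprep. `parse_timestamp` is a hypothetical helper, and the watch-URL pattern is an assumption about how a `videoId` maps to a video page. With the real DataFrame, pandas calls such as `df["channelTitle"].value_counts()` and `pd.to_datetime(df["publishedAt"])` would serve the same purpose.

```python
from collections import Counter
from datetime import datetime, timezone

# (videoId, publishedAt, channelTitle) copied from four rows of the output above.
sample_rows = [
    ("xC-c7E5PK0Y", "2018-06-23T01:51:50Z", "Joma Tech"),
    ("ua-CiDNNj30", "2019-05-30T12:48:19Z", "freeCodeCamp.org"),
    ("4OZip0cgOho", "2020-05-08T13:00:03Z", "Ken Jee"),
    ("m5pwx3hgtzM", "2019-12-16T18:46:58Z", "Ken Jee"),
]


def parse_timestamp(text):
    """Parse a timestamp such as 2018-06-23T01:51:50Z into a UTC-aware datetime."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


# Count how many of the sampled videos each channel contributed.
videos_per_channel = Counter(channel for _, _, channel in sample_rows)

# Find the most recently published video and build a link to it
# (URL pattern assumed, not taken from the tutorial).
newest = max(sample_rows, key=lambda row: parse_timestamp(row[1]))
watch_url = "https://www.youtube.com/watch?v=" + newest[0]

print(videos_per_channel.most_common(1))  # [('Ken Jee', 2)]
print(watch_url)  # https://www.youtube.com/watch?v=4OZip0cgOho
```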


# That's all for now. 
If you are interested in writing your own configuration file or modifying an existing one, refer to the [Configuration Files](https://github.com/sfu-db/DataConnectorConfigs).