In [None]:
"""
[Step 1]
- Go to console.cloud.google.com
- Create a project
- Search for "YouTube Data API v3" 
- Click Enable

- Click Credentials on the bar on the left
- Click "Create Credentials" at the top of the screen 
- You will get an API key

- With the API key, create the file src/.env
- In src/.env, add the line:
	YOUTUBE_DATA_API_KEY=INSERT_YOUR_API_HERE
	(replace INSERT_YOUR_API_HERE with the API key you obtained in the previous step)

- Add .env to your .gitignore so that the API key is never committed by accident!
"""
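As a quick check that src/.env is set up correctly, the key can be read back with a few lines of standard-library Python. This is a minimal sketch: `read_env_file` and `get_api_key` are hypothetical helpers written for illustration, and the project itself may load the file differently (for example with a dotenv package).

```python
from pathlib import Path


def read_env_file(path):
    """Parse KEY=VALUE lines from a .env file into a dict (hypothetical helper)."""
    values = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything without a KEY=VALUE shape.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_api_key(env_path="src/.env"):
    """Return YOUTUBE_DATA_API_KEY, failing loudly if it is missing or still the placeholder."""
    key = read_env_file(env_path).get("YOUTUBE_DATA_API_KEY", "")
    if not key or key == "INSERT_YOUR_API_HERE":
        raise ValueError(f"Set YOUTUBE_DATA_API_KEY in {env_path}")
    return key
```

Calling `get_api_key()` from the project root raises a `ValueError` if the placeholder was never replaced, which catches the most common setup mistake before any quota is spent.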

In [None]:
"""
[Step 2] Go to src/query_config.yaml and modify the parameters 
"""

In [None]:
"""
[Step 3] Read the following!

We want to extract all YouTube videos with 'wayang kulit' in the title using the YouTube Data API.
Each day, we are given 10,000 credits (quota units) for API calls.
	- Getting one page of search results (up to 50 results) costs 100 credits.
	- Getting the details of a specific video costs 1 credit.
This works out to at least 3 credits per video (≥2 for the search + 1 for the details),
or at most about 3,333 videos per day (10,000 / 3).

Note that a given search can only return up to 10 pages (a restriction imposed by Google).
This gives a maximum of 500 videos per search.
To work around this restriction, we choose a time window narrow enough that
each search returns fewer than 500 results (10 pages x 50 results/page).

Given a time window defined in minutes, we collect all the results, then shift
the time window backwards in time.

The code below does the following: 
	1. Get the list of YouTube videos with 'wayang kulit' in the title from 
		Google's YouTube API within the specified timeframe

	2. For each video in the list:
		a. Get the relevant information from the response
			- Video URL
			- Title
			- Duration
			- Name of the Channel 
			- Number of Likes 
			- Date 
			- Description
		b. Write the data into {local_i={i}_metadata} csv file

	3. If there's a next page, get the results for the next page and repeat step 2.
		If not:
			- combine the {local_i={i}_metadata} csv file into the
				main {output_data_path} csv file
			- write to the {output_completion_log_path} csv file
			- shift the timeframe to an earlier window
			- go to step 1.

"""
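The quota arithmetic described in this step can be checked with a short calculation. This is only a sketch using the costs stated above; the constant and function names are made up for illustration and are not taken from src/query_youtube_api.

```python
import math

DAILY_QUOTA = 10_000      # credits per day, as stated above
SEARCH_PAGE_COST = 100    # credits per page of search results
RESULTS_PER_PAGE = 50     # maximum results on one page
VIDEO_DETAIL_COST = 1     # credits per video-details call


def credits_for_window(n_videos):
    """Credits needed to search and fetch details for n_videos in one time window."""
    # Even an empty search costs one page.
    pages = max(1, math.ceil(n_videos / RESULTS_PER_PAGE))
    return pages * SEARCH_PAGE_COST + n_videos * VIDEO_DETAIL_COST


def max_videos_per_day():
    """Upper bound on videos per day, assuming every search page is full."""
    cost_per_full_page = SEARCH_PAGE_COST + RESULTS_PER_PAGE * VIDEO_DETAIL_COST
    full_pages = DAILY_QUOTA // cost_per_full_page
    return full_pages * RESULTS_PER_PAGE


print(credits_for_window(500))  # 1500: ten full pages plus 500 detail calls
print(max_videos_per_day())     # 3300: 66 full pages of 50 videos
```

Counting in whole pages gives 3,300 videos per day rather than 10,000 / 3, and sparse windows do worse: a window containing a single video still costs 101 credits, so very narrow windows waste quota.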

In [None]:
"""
[Step 4]
The Python script accepts the following parameters:

	output_data_path
	output_completion_log_path
	temp_dir_file_path
	i_start
	i_end
	ref_datetime
	window_size_in_mins
    
[Important!]
Each run of the command processes one set of search results (one time window).
Ensure that the number of videos in each set does not exceed 500!
The YouTube API does not return the 501st, 502nd, ... results!
As a workaround, narrow the search window.

[Outputs]
data/completion_results.csv - output of search results
data/video_metadata.csv 	- provides metadata on the results that were completed successfully
data/temp_dir/* 			- output of each page for each search result

[Mechanism]
1. The first run searches one time window of the specified size: [now - window, now)
2. Running the command again searches [now - 2*window, now - window)
   The next window is determined from the metadata found in data/video_metadata.csv
"""
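The window shifting described under [Mechanism] can be sketched as follows. The function names are hypothetical, and the way the index i relates to i_start, i_end and ref_datetime inside the script is an assumption. The timestamps are formatted as RFC 3339 because that is the format the search endpoint's publishedAfter and publishedBefore parameters expect.

```python
from datetime import datetime, timedelta, timezone


def window_bounds(ref_datetime, window_size_in_mins, i):
    """Return (start, end) of the i-th window counting back from ref_datetime.

    i = 0 gives [ref - window, ref); i = 1 gives [ref - 2*window, ref - window).
    """
    window = timedelta(minutes=window_size_in_mins)
    end = ref_datetime - i * window
    return end - window, end


def to_rfc3339(dt):
    """Format a timezone-aware datetime as a UTC RFC 3339 timestamp."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


ref = datetime(2024, 1, 2, tzinfo=timezone.utc)  # example reference time
for i in range(2):
    start, end = window_bounds(ref, 1800, i)
    print(i, to_rfc3339(start), to_rfc3339(end))
# 0 2023-12-31T18:00:00Z 2024-01-02T00:00:00Z
# 1 2023-12-30T12:00:00Z 2023-12-31T18:00:00Z
```

Because each window's end equals the previous window's start and the intervals are half-open, consecutive windows neither overlap nor leave gaps.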

In [None]:
# Running the command without arguments uses its built-in default values.
!python3 -m src.query_youtube_api

In [None]:
# I recommend running the command with window_size_in_mins set explicitly and adjusting it as needed.
!python3 -m src.query_youtube_api --window_size_in_mins 1800
