# Tutorial

Run the following commands in your terminal to install the libraries we will use later:
* pip install google-api-python-client
* pip install youtube_dl

Optional (we might not get to this today):
* pip install librosa

### Step 1: Setting up the API through the Google Cloud Console
* Create a Google account
* Create a new project in the Google Cloud Console (https://console.cloud.google.com)
* Search for "APIs & Services" > "Credentials". Click "Create Credentials" and select "API key"
    * Copy that API key and put it in a JSON file named `api_keys.json` (the file name the code in Step 2 reads), similar to what we did in class

In [None]:
{
    "developerKey": "key"
}

* Click "Library" on the left side.
* Search for "YouTube" and click on "YouTube Data API v3".
* Click "Enable".

### Step 2: Using the YouTube API
* Create a new file; I'm going to name mine `youtube-main.py`
* We need to import a couple of libraries

In [None]:
import json
import csv
import googleapiclient.discovery

* We're going to read that key from the JSON file and use it to build the API client

In [None]:
if __name__ == "__main__":
    
    # get api key from json file
    with open('api_keys.json', 'r') as f:
        json_obj = json.load(f)
        api_key = json_obj["developerKey"]
        
    youtube = googleapiclient.discovery.build("youtube", "v3", developerKey=api_key)
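If `api_keys.json` is missing or has no `developerKey` entry, the code above stops with a fairly terse error. As a minimal sketch, the lookup could be wrapped in a helper that reports the problem more clearly. The name `load_api_key` and its error messages are illustrative assumptions, not part of the Google client library.

```python
import json

def load_api_key(path="api_keys.json", field="developerKey"):
    """Return the API key stored under `field` in the JSON file at `path`.

    Hypothetical helper for this tutorial: raises ValueError with a
    readable message when the file or the field is missing.
    """
    try:
        with open(path, "r") as f:
            json_obj = json.load(f)
    except FileNotFoundError:
        raise ValueError(f"Could not find {path}; create it as shown in Step 1")
    if field not in json_obj or not json_obj[field]:
        raise ValueError(f"{path} has no '{field}' entry")
    return json_obj[field]
```

The returned value could then be passed as `developerKey` exactly as in the cell above.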

* We're going to open our file `songs.csv` and read each row into a dictionary, which gives us a list of dictionaries

In [None]:
# get songs from songs.csv
with open('songs.csv', 'r') as csvfile:
    reader = csv.DictReader(csvfile)
    songs_info = list(reader)
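To see what `csv.DictReader` produces, here is a small sketch that reads from an in-memory string instead of a file. The sample rows are made up for illustration; the only thing taken from this tutorial is the set of column names (`song name`, `artist`, `genre`) that the later code relies on.

```python
import csv
import io

# Hypothetical sample contents for songs.csv; the real file only needs
# the same three column names in its header row.
sample_csv = """song name,artist,genre
Example Song A,Example Artist A,rock
Example Song B,Example Artist B,jazz
"""

reader = csv.DictReader(io.StringIO(sample_csv))
songs_info = list(reader)

# Each row is a dictionary keyed by the header names.
for song in songs_info:
    print(song["song name"], "|", song["artist"], "|", song["genre"])
```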

* Now we're going to create a function that gets the video ID of a video with the audio that we want

In [None]:
def get_video_id(query):
    # search for videos in the Music category (videoCategoryId "10")
    request = youtube.search().list(
        q=query,
        type="video",
        videoCategoryId="10",
        part="id"
    )
    
    response = request.execute()
    print("response is:", response)
    # return the ID of the first search result, or None if nothing matched
    if len(response["items"]) > 0:
        return response["items"][0]["id"]["videoId"]
    else:
        return None
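The indexing `response["items"][0]["id"]["videoId"]` is easier to follow with the shape of the response in view. The sketch below uses a hand-written stand-in for a search response, trimmed to the fields the function reads; the video ID is made up, and `first_video_id` is a hypothetical name used only here.

```python
# Hand-written stand-in for a search response; the video ID is made up.
sample_response = {
    "kind": "youtube#searchListResponse",
    "items": [
        {
            "kind": "youtube#searchResult",
            "id": {"kind": "youtube#video", "videoId": "abc123XYZ_0"},
        },
    ],
}

def first_video_id(response):
    """Return the first result's video ID, or None when there are no items."""
    items = response.get("items", [])
    if items:
        return items[0]["id"]["videoId"]
    return None

print(first_video_id(sample_response))
print(first_video_id({"items": []}))
```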

* This returns the video ID for us, so now we're going to put it all together to build a URL for each video we want to extract audio from

In [None]:
video_url = "https://www.youtube.com/watch?v="
    
urls = []
for song in songs_info:
    # build the search query from the song name and artist
    query = " ".join([song["song name"], song["artist"]])
    video_id = get_video_id(query)
    if video_id is not None:
        url = "".join([video_url, video_id])
        urls.append({"song name": song["song name"], "artist": song["artist"], "genre": song["genre"], "url": url})
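While developing, it can be handy to exercise this loop without calling the API. The following is a sketch under that assumption: the lookup function is passed in as a parameter, so a dictionary of made-up IDs can stand in for `get_video_id`. The names `build_rows` and `fake_ids` are hypothetical, and unlike the loop above this version copies every column of the input row rather than naming three of them.

```python
VIDEO_URL = "https://www.youtube.com/watch?v="

def build_rows(songs_info, lookup):
    """Return one dict per song for which `lookup` finds a video ID."""
    rows = []
    for song in songs_info:
        query = " ".join([song["song name"], song["artist"]])
        video_id = lookup(query)
        if video_id is not None:
            # copy the song's columns and add the finished URL
            rows.append({**song, "url": VIDEO_URL + video_id})
    return rows

# Stand-in for get_video_id so the sketch runs without an API key.
fake_ids = {"Song A Artist A": "id_A"}
songs = [
    {"song name": "Song A", "artist": "Artist A", "genre": "rock"},
    {"song name": "Song B", "artist": "Artist B", "genre": "jazz"},
]

rows = build_rows(songs, fake_ids.get)
print(rows)
```

Songs with no match are skipped, which mirrors the `if video_id is not None` check above.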

* Now we're going to save that to another file

In [None]:
with open("urls_songs.csv", "w", newline="") as f:
    header = ["song name", "artist", "genre", "url"]
    writer = csv.DictWriter(f, fieldnames=header)
    writer.writeheader()
    writer.writerows(urls) 
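As a quick check of what `csv.DictWriter` writes, here is a sketch that writes one made-up row to an in-memory buffer and reads it back. The row contents are hypothetical; the header matches the one used above.

```python
import csv
import io

header = ["song name", "artist", "genre", "url"]
# Hypothetical row, shaped like the entries collected in `urls`.
sample_rows = [
    {"song name": "Song A", "artist": "Artist A", "genre": "rock",
     "url": "https://www.youtube.com/watch?v=id_A"},
]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=header)
writer.writeheader()
writer.writerows(sample_rows)

# Read the text back to confirm the rows survive the round trip.
buffer.seek(0)
round_trip = list(csv.DictReader(buffer))
print(round_trip == sample_rows)
```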

### Step 3: Using youtube-dl
* We will use youtube-dl to extract audio from a video
* First we will import a couple of libraries

In [None]:
import youtube_dl # pip install youtube_dl, pretty quick
import csv

* We will then extract the data from our file and store the urls in a list

In [None]:
# get urls from urls_songs.csv (the file we wrote in Step 2)
with open('urls_songs.csv', 'r') as csvfile:
    reader = csv.DictReader(csvfile)
    urls_info = list(reader)

print(urls_info)

urls = []
for row in urls_info:
    urls.append(row["url"])
    
print(urls)

* youtube-dl is also a command-line program, but to use it from Python we need to pass our download options as a dictionary

In [None]:
ydl_opts = {
    'format': 'bestaudio/best',
    'outtmpl': 'audio/%(title)s.%(ext)s',
    'extractaudio': True,
    'postprocessors': [{
        'key': 'FFmpegExtractAudio',
        'preferredcodec': 'mp3',
        'preferredquality': '192',
    }],
    'noplaylist': True
}

* Now we'll actually download each video's audio and save it to a file

In [None]:
# create one downloader and reuse it for every url
with youtube_dl.YoutubeDL(ydl_opts) as ydl:
    for url in urls:
        print("Downloading audio from", url)
        ydl.download([url])
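A single failed download raises an exception and stops the whole loop. One possible way to keep going is sketched below: the download step is passed in as a function so the pattern can be tried without network access. `download_all` and `fake_download` are hypothetical names, not part of youtube-dl.

```python
def download_all(urls, download_one, errors=(Exception,)):
    """Call download_one(url) for each URL and collect the ones that fail."""
    failed = []
    for url in urls:
        try:
            download_one(url)
        except errors as exc:
            print("Skipping", url, "because of:", exc)
            failed.append(url)
    return failed

# Stand-in downloader so the sketch runs without network access.
def fake_download(url):
    if "bad" in url:
        raise RuntimeError("simulated download failure")

failed = download_all(
    ["https://example.com/ok", "https://example.com/bad"],
    fake_download,
)
print("failed:", failed)
```

With youtube-dl, `download_one` could be `lambda url: ydl.download([url])` and `errors` could be `(youtube_dl.utils.DownloadError,)`, the exception type named in the error messages below.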

You might get an error that says

`youtube_dl.utils.DownloadError: ERROR: Unable to extract uploader id; please report this issue on https://yt-dl.org/bug. Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.`

If you get this error, look at the traceback to find the path of the file ending in `extractor/youtube.py`, open that file, and navigate to the line that says:

`'uploader_id': self._search_regex(r'/(?:channel|user)/([^/?&#]+)', owner_profile_url, 'uploader id') if owner_profile_url else None,` 

Replace that line with:

`'uploader_id': self._search_regex(r'/(?:channel/|user/|@)([^/?&#]+)', owner_profile_url, 'uploader id', default=None),`

Now run the program again and you should see your audio files downloading.

If you get an error that says:

`youtube_dl.utils.DownloadError: ERROR: ffprobe/avprobe and ffmpeg/avconv not found. Please install one.`

Install FFmpeg. On Debian or Ubuntu, run this command: `sudo apt-get install ffmpeg`
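To check for FFmpeg before starting any downloads, a small sketch using the standard library's `shutil.which` could look like this. The function name `ffmpeg_available` is an assumption for illustration.

```python
import shutil

def ffmpeg_available():
    """Return True if an `ffmpeg` executable is found on the PATH."""
    return shutil.which("ffmpeg") is not None

if not ffmpeg_available():
    print("ffmpeg not found on PATH; install it before extracting audio")
```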

### Step 4: Using Librosa
* Librosa is a library for extracting audio features, which will help the classifier decide which genre a piece of music belongs to
* We will just compute a few:
    * tempo
    * spectral centroid
    * zero-crossing rate

In [None]:
import librosa
import numpy as np

audio_file = 'audio/Whitney Houston - I Will Always Love You (Official 4K Video).mp3'
y, sr = librosa.load(audio_file)

# compute spectral centroid
centroid = librosa.feature.spectral_centroid(y=y, sr=sr)

# compute zero-crossing rate
zcr = librosa.feature.zero_crossing_rate(y)

# compute tempo
tempo, _ = librosa.beat.beat_track(y=y, sr=sr)

print("tempo:", tempo)
print("spectral centroid mean:", np.mean(centroid))
print("zero crossing rate:", np.mean(zcr))
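To build some intuition for the zero-crossing rate, here is a simplified pure-Python sketch on synthetic sine waves. It measures the fraction of neighbouring samples whose sign differs over the whole signal; this is an illustration of the idea, not Librosa's implementation, which works frame by frame. The sample rate and frequencies are arbitrary choices.

```python
import math

def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)

def sine(freq, sr=8000, seconds=1.0):
    """Generate a sine wave at `freq` Hz sampled at `sr` samples per second."""
    n = int(sr * seconds)
    return [math.sin(2 * math.pi * freq * i / sr) for i in range(n)]

low = zero_crossing_rate(sine(100))    # low-pitched tone
high = zero_crossing_rate(sine(1000))  # higher-pitched tone
print("100 Hz:", low)
print("1000 Hz:", high)
```

The higher-frequency tone changes sign more often, so its rate is larger. Noisy or percussive audio tends to behave the same way, which is why this feature can help separate genres.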

#### youtube-dl options information
* Here is what each option in `ydl_opts` (from Step 3) does:
    * format: 'bestaudio/best'
        * Specifies the format for the downloaded video or audio. In this case, it is set to 'bestaudio/best', which means that youtube-dl will download the best-quality audio-only stream if one exists, and otherwise fall back to the best-quality combined format.

    * outtmpl: 'audio/%(title)s.%(ext)s'
        * Specifies the output file path and name for the downloaded audio file. In this case, the downloaded audio file will be saved in a directory named 'audio', and its name will be the video title followed by the file extension.

    * extractaudio: True
        * Specifies that youtube-dl should extract the audio from the downloaded video file.

    * postprocessors: [{'key': 'FFmpegExtractAudio', 'preferredcodec': 'mp3', 'preferredquality': '192'}]
        * Specifies a list of post-processing steps that should be applied to the downloaded audio file. In this case, the 'FFmpegExtractAudio' key specifies that youtube-dl should extract the audio using FFmpeg, and the 'preferredcodec' and 'preferredquality' keys specify that the audio should be encoded as an MP3 file with a bitrate of 192 kbps (kilobits/second).

    * noplaylist: True
        * Specifies that youtube-dl should only download a single video, rather than downloading an entire playlist if the URL provided is a playlist.
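The `outtmpl` value uses Python's `%`-style named placeholders, so its effect can be previewed with ordinary string formatting. This is a sketch with made-up metadata; youtube-dl fills in the real fields from the video and may also replace characters that are unsafe in file names, so the actual file name can differ.

```python
outtmpl = 'audio/%(title)s.%(ext)s'

# Hypothetical metadata; youtube-dl takes these fields from the video itself.
info = {'title': 'Example Song', 'ext': 'mp3'}

path = outtmpl % info
print(path)  # audio/Example Song.mp3
```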