## YouTube

YouTube grants access to its public data without requiring user authentication, but we must still generate an API key that identifies our project.

Follow these steps to get set up:

1. [Get a Google account](https://www.google.com/accounts) if you don't already have one.
2. Go to the [Google API Console](https://console.developers.google.com/) and create a project (e.g. msds692).
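3. Enable the "YouTube Data API v3" API from your console.
4. Create an API key under "Credentials" and save it in a file outside your code (this notebook reads it from `~/api_key/google_key`).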

[Never store your API key in your code](https://www.zdnet.com/article/over-100000-github-repos-have-leaked-api-or-cryptographic-keys/).
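
One common way to follow this advice is to keep the key in an environment variable, or in a file outside the repository, and read it at run time. The sketch below assumes a hypothetical environment variable named `YOUTUBE_API_KEY` and falls back to the `~/api_key/google_key` path used in this notebook; neither name is required by Google.

```python
import os
from pathlib import Path


def load_api_key(env_var="YOUTUBE_API_KEY", key_file="~/api_key/google_key"):
    """Return the API key from an environment variable or, failing that, a file.

    The variable name and the default path are illustrative assumptions.
    """
    key = os.environ.get(env_var)
    if key:
        return key.strip()
    path = Path(key_file).expanduser()
    if path.is_file():
        # strip the trailing newline that most editors add
        return path.read_text().strip()
    raise RuntimeError(f"No API key found in ${env_var} or {path}")
```

The returned string can be passed directly as `developerKey` when building the client.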

Take a look at the [API documentation](https://developers.google.com/youtube/v3/) and install Google's Python client library, which will simplify our tasks:

```bash
$ pip install google-api-python-client
```


In [12]:
import sys
import urllib
from googleapiclient.discovery import build

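# IPython shell capture: returns a list of output lines, not a plain string
DEVELOPER_KEY = !cat ~/api_key/google_key

In [17]:
# the key is the first (and only) line of the captured output
youtube = build("youtube", "v3", developerKey=DEVELOPER_KEY[0])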

### Retrieve Video Information

In [18]:
video_id = 'pRpeEdMmmQ0'
video_response = youtube.videos().list(
            part="snippet,statistics",
            id=video_id).execute()

In [19]:
video_response

{'kind': 'youtube#videoListResponse',
 'etag': 'wM0gq9mDwijDitrz1sq-x0y4IsU',
 'items': [{'kind': 'youtube#video',
   'etag': 'wVos9nGyts2XDJoBtwZ4hv5pXXo',
   'id': 'pRpeEdMmmQ0',
   'snippet': {'publishedAt': '2010-06-04T22:30:35Z',
    'channelId': 'UCGnjeahCJW1AF34HBmQTJ-Q',
    'title': 'Shakira - Waka Waka (This Time for Africa) (The Official 2010 FIFA World Cup™ Song)',
    'description': 'Watch the official music video for "Waka Waka (This Time for Africa) [The Official 2010 FIFA World Cup (TM) Song]" by Shakira\nListen to Shakira: https://Shakira.lnk.to/listen_YD\n\nSubscribe to the official Shakira youtube channel: https://Shakira.lnk.to/subscribeYD\n\nWatch more of Shakira\'s Music Videos: https://Shakira.lnk.to/listen_YC/youtube\n\nFollow Shakira:\nFacebook: https://Shakira.lnk.to/followFI\nInstagram: https://Shakira.lnk.to/followII\nTwitter: https://Shakira.lnk.to/followTI\nWebsite: https://Shakira.lnk.to/followWI\nSpotify: https://Shakira.lnk.to/followSI\nYouTube: https:/

In [7]:
video_response['items'][0]['snippet']['title']

'Shakira - Waka Waka (This Time for Africa) (The Official 2010 FIFA World Cup™ Song)'

In [33]:
video_response['items'][0].keys()

dict_keys(['kind', 'etag', 'id', 'snippet', 'statistics'])

In [20]:
video_response['items'][0]['statistics']

{'viewCount': '3740213210',
 'likeCount': '21699574',
 'favoriteCount': '0',
 'commentCount': '1308541'}
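
Note that the API returns every count as a string. A small helper such as the following (the name `stats_to_int` is made up for this example) converts them to integers before any arithmetic or sorting:

```python
def stats_to_int(statistics):
    """Convert the string counts in a `statistics` dict to integers."""
    return {name: int(value) for name, value in statistics.items()}


# Sample values copied from the response shown above.
stats = {'viewCount': '3740213210',
         'likeCount': '21699574',
         'favoriteCount': '0',
         'commentCount': '1308541'}

counts = stats_to_int(stats)

# With integers, ratios such as likes per thousand views are straightforward.
likes_per_thousand_views = 1000 * counts['likeCount'] / counts['viewCount']
```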

### Search for Videos Based on Keywords

In [21]:
QUERY = "k-pop"

In [36]:
search_response = youtube.search().list(
    q=QUERY,            # search terms
    part="id,snippet",  # what we want back
    maxResults=20,      # how many results we want back
    type="video"        # only tell me about videos
).execute()

In [37]:
search_response.keys()

dict_keys(['kind', 'etag', 'nextPageToken', 'regionCode', 'pageInfo', 'items'])
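
The `nextPageToken` key indicates that more results are available. A sketch of how one might collect several pages, assuming `youtube` is the client built earlier and that each following page is requested through the `pageToken` parameter of `search().list`; the helper name `search_all` and the `max_pages` cap are inventions for this example:

```python
def search_all(youtube, query, max_pages=3, page_size=20):
    """Collect search result items across up to `max_pages` pages."""
    items = []
    token = None
    for _ in range(max_pages):
        params = dict(q=query, part="id,snippet",
                      maxResults=page_size, type="video")
        if token is not None:
            params["pageToken"] = token   # ask for the page after the last one
        response = youtube.search().list(**params).execute()
        items.extend(response.get("items", []))
        token = response.get("nextPageToken")
        if token is None:                 # no further pages
            break
    return items
```

Every request counts against the project's API quota, so capping the number of pages is a reasonable default.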

In [38]:
[item['snippet']['title'] for item in search_response['items']]

['Travis Scott, Bad Bunny, The Weeknd - K-POP (Official Music Video)',
 'Travis Scott, Bad Bunny, The Weeknd - K-POP (Official Audio)',
 'ITZY “CAKE” M/V @ITZY',
 'NewJeans (뉴진스) &#39;Super Shy&#39; Official MV',
 '(여자)아이들((G)I-DLE) - &#39;퀸카 (Queencard)&#39; Official Music Video',
 'BLACKPINK THE GAME - ‘THE GIRLS’ MV',
 'Stray Kids &quot;특(S-Class)&quot; M/V',
 '정국 (Jung Kook) &#39;Seven (feat. Latto)&#39; Official MV',
 'How Lisa’s family is in Danger…#shorts#lisa#blackpink#kpop#kpopidol#fyp#fypシ',
 'SEVENTEEN (세븐틴) &#39;손오공&#39; Official MV',
 'KPOP PLAYLIST 2023 💖🐰 K-POP Lite',
 'IVE 아이브 &#39;I AM&#39; MV',
 'NewJeans (뉴진스) &#39;OMG&#39; Official MV (Performance ver.1)',
 'ENHYPEN (엔하이픈) &#39;Bite Me&#39; Official MV',
 'KPOP PLAYLIST 2023 💖💖 K-POP Lite',
 '(여자)아이들((G)I-DLE) - &#39;퀸카 (Queencard)&#39; M/V (Performance Ver.)',
 'aespa 에스파 &#39;Spicy&#39; MV',
 'FIFTY FIFTY (피프티피프티) - &#39;Cupid&#39;  Official MV',
 'JISOO - ‘꽃(FLOWER)’ M/V',
 '[𝑷𝒍𝒂𝒚𝒍𝒊𝒔𝒕] 24/7 💘4세대 걸그룹💘 플레이리스트 | K-P
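
The titles above contain HTML entities such as `&#39;` and `&quot;`. Python's standard `html.unescape` decodes them; a short sketch using two titles from the output:

```python
import html

raw_titles = [
    "NewJeans (뉴진스) &#39;Super Shy&#39; Official MV",
    "Stray Kids &quot;특(S-Class)&quot; M/V",
]

# Replace entities like &#39; and &quot; with the characters they stand for.
clean_titles = [html.unescape(title) for title in raw_titles]
```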

### Retrieve Channel Information

In [22]:
channel_id = 'UCXUPKJO5MZQN11PqgIvyuvQ'
channel_response = youtube.channels().list(
            part="snippet,statistics",
            id=channel_id).execute()

In [23]:
channel_response

{'kind': 'youtube#channelListResponse',
 'etag': '5wisQwmFypXYp0BiQHNYO8y9awU',
 'pageInfo': {'totalResults': 1, 'resultsPerPage': 5},
 'items': [{'kind': 'youtube#channel',
   'etag': 'obkEhmxxOuJWjQTkNHBaFw_ynGE',
   'id': 'UCXUPKJO5MZQN11PqgIvyuvQ',
   'snippet': {'title': 'Andrej Karpathy',
    'description': 'My motivation for creating this channel: https://twitter.com/karpathy/status/1577746577463967745\n\nFAQ\nQ: How can I pay you? Do you have a Patreon or etc?\nA: As YouTube partner I do share in a small amount of the ad revenue on the videos, but I don\'t maintain any other extra payment channels. I would prefer that people "pay me back" by using the knowledge to build something great.\n',
    'customUrl': '@andrejkarpathy',
    'publishedAt': '2013-09-07T08:21:13Z',
    'thumbnails': {'default': {'url': 'https://yt3.ggpht.com/ytc/APkrFKYFOSxGL4HypEGJ_1rOLkzUlT7qvOiflHFqKTonUrs=s88-c-k-c0x00ffffff-no-rj',
      'width': 88,
      'height': 88},
     'medium': {'url': 'https://

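### How to Find a Channel ID

The `id` parameter of `channels().list` expects a channel ID, so a handle such as @AndrejKarpathy will not work there. Search for the channel by name instead.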

In [28]:
query = 'Andrej Karpathy'

In [25]:
search_response = youtube.search().list(
    q=query,            # search terms
    part="id,snippet",  # what we want back
    maxResults=20,      # how many results we want back
    type="channel"      # only return channels
).execute()

In [64]:
search_response['items'][0]['id']

{'kind': 'youtube#channel', 'channelId': 'UCXUPKJO5MZQN11PqgIvyuvQ'}
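
A search can return no match, so it may be worth guarding the lookup. A minimal sketch; `first_channel_id` is a hypothetical helper, not part of the client library:

```python
def first_channel_id(search_response):
    """Return the channel ID of the first channel hit, or None if there is none."""
    for item in search_response.get("items", []):
        if item["id"].get("kind") == "youtube#channel":
            return item["id"]["channelId"]
    return None


# Sample shaped like the search result shown above.
sample = {"items": [{"id": {"kind": "youtube#channel",
                            "channelId": "UCXUPKJO5MZQN11PqgIvyuvQ"}}]}
channel_id = first_channel_id(sample)
```

The first hit is not guaranteed to be the intended channel, so checking `item['snippet']['title']` before trusting the ID is a sensible extra step.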

## Exercise 1
Make a dataframe with the 10 most popular videos by Andrej Karpathy (videos from his own channel are fine). It should include the title, year, number of views, and number of likes for each video.

Hint: use ChatGPT to understand how to get the top videos.

In [38]:
query = 'Andrej Karpathy'
search_response = youtube.search().list(
    q=query,            # search terms
    part="id,snippet",  # what we want back
    maxResults=10,      # how many results we want back
    type="video",       # only return videos
    order='viewCount'   # sort by view count, highest first
).execute()

## Exercise 2
Find Peter Attia's channel ID, then find his top 10 videos and collect the comments from all of them.
Channel page: https://www.youtube.com/@PeterAttiaMD/about
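
For the comments part, one possible starting point is the `commentThreads().list` endpoint, which returns top-level comments for a video. The sketch below assumes `youtube` is the client built earlier; the helper names are made up for this example, and it fetches only the first page of results.

```python
def comment_texts(response):
    """Pull the top-level comment text out of a commentThreads response."""
    return [
        item["snippet"]["topLevelComment"]["snippet"]["textDisplay"]
        for item in response.get("items", [])
    ]


def fetch_comment_page(youtube, video_id, page_size=100):
    """Request one page of top-level comments for `video_id`."""
    response = youtube.commentThreads().list(
        part="snippet",          # include the top-level comment itself
        videoId=video_id,        # which video to read comments from
        maxResults=page_size,    # comments per page
        textFormat="plainText",  # plain text rather than HTML
    ).execute()
    return comment_texts(response)
```

Further pages can be requested with the `nextPageToken` value from each response.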

In [39]:
search_response

{'kind': 'youtube#searchListResponse',
 'etag': 'G2OywhnGPiEEJOqvknVkylkYmJ8',
 'nextPageToken': 'CAoQAA',
 'regionCode': 'US',
 'pageInfo': {'totalResults': 25044, 'resultsPerPage': 10},
 'items': [{'kind': 'youtube#searchResult',
   'etag': 'GooZIkea890foDmpR9WWDMs76vU',
   'id': {'kind': 'youtube#video', 'videoId': 'kCc8FmEb1nY'},
   'snippet': {'publishedAt': '2023-01-17T16:33:27Z',
    'channelId': 'UCXUPKJO5MZQN11PqgIvyuvQ',
    'title': 'Let&#39;s build GPT: from scratch, in code, spelled out.',
    'description': 'We build a Generatively Pretrained Transformer (GPT), following the paper "Attention is All You Need" and OpenAI\'s GPT-2 ...',
    'thumbnails': {'default': {'url': 'https://i.ytimg.com/vi/kCc8FmEb1nY/default.jpg',
      'width': 120,
      'height': 90},
     'medium': {'url': 'https://i.ytimg.com/vi/kCc8FmEb1nY/mqdefault.jpg',
      'width': 320,
      'height': 180},
     'high': {'url': 'https://i.ytimg.com/vi/kCc8FmEb1nY/hqdefault.jpg',
      'width': 480,
     

In [53]:
search_response['items'][0]['id']['videoId']

'kCc8FmEb1nY'

In [55]:
# each item is a dict, so index it by key; item[0] would raise KeyError: 0
video_ids = [item['id']['videoId'] for item in search_response['items']]
video_ids
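
Once the video IDs are in a list, `videos().list` accepts them as a comma-separated string in `id`, and the response can be flattened into one row per video. A sketch under those assumptions; `video_rows` is a hypothetical helper, and the sample dict reuses values from the Waka Waka response earlier in this section:

```python
def _to_int(value):
    """Convert a count string to int, keeping None when the count is missing."""
    return int(value) if value is not None else None


def video_rows(video_response):
    """Flatten a videos().list response into one dict per video."""
    rows = []
    for item in video_response.get("items", []):
        snippet = item["snippet"]
        stats = item.get("statistics", {})
        rows.append({
            "title": snippet["title"],
            "year": int(snippet["publishedAt"][:4]),  # e.g. '2010-06-04T...' -> 2010
            "views": _to_int(stats.get("viewCount")),
            "likes": _to_int(stats.get("likeCount")),
        })
    return rows


sample_response = {
    "items": [{
        "snippet": {
            "publishedAt": "2010-06-04T22:30:35Z",
            "title": "Shakira - Waka Waka (This Time for Africa) "
                     "(The Official 2010 FIFA World Cup™ Song)",
        },
        "statistics": {"viewCount": "3740213210", "likeCount": "21699574"},
    }]
}
rows = video_rows(sample_response)
```

In the notebook, the response would come from a call such as `youtube.videos().list(part="snippet,statistics", id=",".join(ids)).execute()`, where `ids` is the list of video IDs, and `pandas.DataFrame(rows)` would turn the rows into a dataframe.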