# Crawling YouTube Comment Data
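- **YouTube:** "Keras! Permainan Timnas Indonesia melawan Timnas Arab Saudi" (roughly: "Tough! The Indonesian national team's match against the Saudi Arabian national team"), Sep 6, 2024 and Nov 19, 2024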
- **Link:** https://www.youtube.com/watch?v=4F2oOGDyWeY & https://www.youtube.com/watch?v=LjOxZjSujFI

## Import Libraries and Configure the API

In [11]:
from googleapiclient.discovery import build
import pandas as pd
import os
from dotenv import load_dotenv

load_dotenv()

API_KEY = os.getenv('API_KEY')  # Read from .env; set API_KEY there to your own YouTube Data API key
YOUTUBE_API_SERVICE_NAME = 'youtube'
YOUTUBE_API_VERSION = 'v3'
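
`os.getenv` returns `None` when the variable is missing, so a forgotten `.env` entry would only surface later as a failed API request. A minimal sketch of an early check follows; the helper name `require_api_key` and its `environ` parameter are illustrative assumptions, not part of the original notebook.

```python
import os


def require_api_key(env_var="API_KEY", environ=None):
    """Return the key stored in env_var, or raise if it is missing or blank."""
    # `environ` is injectable so the check can be exercised without touching os.environ.
    environ = os.environ if environ is None else environ
    key = (environ.get(env_var) or "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set; add it to your .env file or environment.")
    return key


# Example with an explicit mapping standing in for the real environment.
print(require_api_key(environ={"API_KEY": "  demo-key  "}))
```

In the notebook this could replace the bare `os.getenv('API_KEY')` call after `load_dotenv()`, so a missing key fails immediately with a readable message.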

## Crawling Function

In [12]:
def get_comments(video_id):
    """Return the text of every top-level comment on a video (replies are not fetched)."""
    youtube = build(YOUTUBE_API_SERVICE_NAME, YOUTUBE_API_VERSION, developerKey=API_KEY)
    comments = []
    page_token = None

    while True:
        # Request one page of comment threads; 100 is the largest page size the API allows.
        params = dict(part='snippet', videoId=video_id, textFormat='plainText', maxResults=100)
        if page_token:
            params['pageToken'] = page_token
        response = youtube.commentThreads().list(**params).execute()

        for item in response.get('items', []):
            comment = item['snippet']['topLevelComment']['snippet']['textDisplay']
            comments.append(comment)

        # Stop once the API no longer returns a token for a further page.
        page_token = response.get('nextPageToken')
        if not page_token:
            break
    return comments
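
The pagination loop in `get_comments` can be tried out without an API key or quota by swapping the request for a stand-in. The sketch below is an illustration under assumptions: `iter_comment_rows`, `fetch_page`, `_thread`, and `FAKE_PAGES` are hypothetical names, and the fake responses only mimic the shape of a `commentThreads.list` response. It also keeps a few extra snippet fields (`authorDisplayName`, `likeCount`, `publishedAt`) next to the comment text.

```python
def iter_comment_rows(fetch_page):
    """Yield one dict per top-level comment, following nextPageToken.

    fetch_page is a hypothetical callable: it takes a page token
    (None for the first page) and returns a response dict shaped
    like the result of commentThreads.list.
    """
    token = None
    while True:
        response = fetch_page(token)
        for item in response.get("items", []):
            snippet = item["snippet"]["topLevelComment"]["snippet"]
            yield {
                "author": snippet.get("authorDisplayName"),
                "comment": snippet.get("textDisplay"),
                "likes": snippet.get("likeCount"),
                "published_at": snippet.get("publishedAt"),
            }
        token = response.get("nextPageToken")
        if not token:
            break


def _thread(author, text):
    """Build a fake comment thread with the nesting used by the API."""
    return {
        "snippet": {
            "topLevelComment": {
                "snippet": {
                    "authorDisplayName": author,
                    "textDisplay": text,
                    "likeCount": 0,
                    "publishedAt": "2024-11-19T00:00:00Z",
                }
            }
        }
    }


# Stand-in for the API: two fake pages keyed by page token.
FAKE_PAGES = {
    None: {"items": [_thread("a", "first")], "nextPageToken": "p2"},
    "p2": {"items": [_thread("b", "second")]},
}

rows = list(iter_comment_rows(FAKE_PAGES.get))
print([row["comment"] for row in rows])  # ['first', 'second']
```

With the real client, `fetch_page` would wrap the `youtube.commentThreads().list(...).execute()` call from `get_comments`, and the resulting rows could be passed to `pd.DataFrame(rows)` in place of the single-column list.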

## Set the Video ID and Save the Crawl Results

In [13]:
video_id = 'LjOxZjSujFI'
comments = get_comments(video_id)
df = pd.DataFrame(comments, columns=['comment'])
df.to_csv('../data/comment1.csv', index=False)
df.head()

| | comment |
|---|---|
| 0 | 1 POIN PERDANA!!!! Berapa Nilai Untuk Pertandi... |
| 1 | GarudaQ sekarang sekelas argentina..vietnam bs... |
| 2 | Padahal mau lihat mancini di gbk tapi eh di pe... |
| 3 | Pffsoz |
| 4 | shin busukk indo laos 33 busuuuukkk |


In [14]:
video_id = '4F2oOGDyWeY'
comments = get_comments(video_id)
df = pd.DataFrame(comments, columns=['comment'])
df.to_csv('../data/comment2.csv', index=False)
df.head()

| | comment |
|---|---|
| 0 | yok kita trendingkan timnas indonesia ... Merd... |
| 1 | Buat dong acara timnas senior di RCTI biar tam... |
| 2 | Buat dong acara kusus timnas senior di RCTI bi... |
| 3 | Gila menit dari awal kena serangan terus.. |
| 4 | Mar7 |
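
The two `video_id` values used above are the `v` query parameters of the links listed at the top of the notebook. A small sketch of deriving them from the URLs instead of copying them by hand; the helper name `video_id_from_url` is an assumption, and it only handles the `youtube.com/watch?v=...` form used here.

```python
from urllib.parse import urlparse, parse_qs


def video_id_from_url(url):
    """Return the `v` query parameter of a youtube.com/watch URL, or None if absent."""
    query = parse_qs(urlparse(url).query)
    values = query.get("v")
    return values[0] if values else None


links = [
    "https://www.youtube.com/watch?v=4F2oOGDyWeY",
    "https://www.youtube.com/watch?v=LjOxZjSujFI",
]
video_ids = [video_id_from_url(link) for link in links]
print(video_ids)  # ['4F2oOGDyWeY', 'LjOxZjSujFI']
```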


In [15]:
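# Reload the two per-video files saved above.
csv1 = pd.read_csv('../data/comment1.csv')
csv2 = pd.read_csv('../data/comment2.csv')

# Stack them into one dataset with a fresh 0..n-1 index.
concatenated_data = pd.concat([csv1, csv2], ignore_index=True)

# Save the combined comments for the next step.
concatenated_data.to_csv('../data/youtube-comment.csv', index=False)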