# Sentiment Analysis: Football Fans React to Real Madrid vs Man City

The data consists of comments on the YouTube video [Real Madrid 3-1 Manchester City | Champions League 24/25 Match Highlights](https://www.youtube.com/watch?v=aXf0AJyAqAU), 7,187 comments in total. It is used to analyze the sentiment of Indonesian football supporters toward the result of the **Real Madrid** vs **Manchester City** match in the **UEFA Champions League (UCL) 2024/2025**.

Scraping was done with the YouTube API v3 using page tokens for pagination, because the API returns at most 100 comments per request.
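As a rough worked example of what the 100-comment limit implies, the sketch below estimates the minimum number of requests needed for the 7,187 comments mentioned above. It assumes every page comes back full, which the API does not guarantee, so the real number of requests can be higher. The names `TOTAL_COMMENTS` and `PAGE_SIZE` are illustrative only.

```python
import math

TOTAL_COMMENTS = 7187  # total comment count stated for the video
PAGE_SIZE = 100        # maximum comments returned per request

# Each request returns at most PAGE_SIZE comments, so the minimum number
# of requests is the total divided by the page size, rounded up.
requests_needed = math.ceil(TOTAL_COMMENTS / PAGE_SIZE)

# Size of the final page if all earlier pages are full.
last_page_size = TOTAL_COMMENTS - (requests_needed - 1) * PAGE_SIZE

print(requests_needed)  # 72
print(last_page_size)   # 87
```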

# Import Libraries

In [5]:
import googleapiclient.discovery
import pandas as pd

# Scraping

In [6]:
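import os

# Initialize the YouTube API client.
# Assumption: the key is read from a YOUTUBE_API_KEY environment variable
# (a hypothetical name) rather than being hardcoded in the notebook.
api_key = os.environ["YOUTUBE_API_KEY"]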
youtube = googleapiclient.discovery.build("youtube", "v3", developerKey=api_key)

def get_comments(video_id, max_comments=7187):
    comments = []
    next_page_token = None

    # Request comments page by page
    while len(comments) < max_comments:
        request = youtube.commentThreads().list(
            part="snippet",
            videoId=video_id,
            maxResults=100,  # Maximum per request
            pageToken=next_page_token
        )
        response = request.execute()
        
        # Extract the comment data
        for item in response["items"]:
            comment_data = item["snippet"]["topLevelComment"]["snippet"]
            comments.append([
                comment_data["authorDisplayName"],
                comment_data["publishedAt"],
                comment_data["updatedAt"],
                comment_data["likeCount"],
                comment_data["textDisplay"]
            ])

        # Check whether there is another page
        next_page_token = response.get("nextPageToken")
        if not next_page_token:
            break  # Stop when there are no more pages

    return comments[:max_comments]  # Keep only the requested number

# ID of the YouTube video to scrape
video_id = "JgccvZkTVaw"
comments = get_comments(video_id)

# Store the results in a DataFrame
df = pd.DataFrame(comments, columns=['author', 'published_at', 'updated_at', 'like_count', 'comment'])
print(df)

                       author          published_at            updated_at  \
0           @dimaskentung7370  2025-02-25T14:29:57Z  2025-02-25T14:29:57Z   
1               @AhmadKuncuro  2025-02-25T12:22:51Z  2025-02-25T12:25:03Z   
2           @AbdulMutalib-t1j  2025-02-25T00:16:41Z  2025-02-25T00:16:41Z   
3                 @JeffSwedes  2025-02-24T14:21:55Z  2025-02-24T14:21:55Z   
4             @NickyHarrissed  2025-02-24T14:19:42Z  2025-02-24T14:19:42Z   
...                       ...                   ...                   ...   
5040             @ArdyDwi-z5m  2025-02-19T22:15:04Z  2025-02-19T22:15:04Z   
5041  @olamardikadwixpemb8263  2025-02-19T22:15:03Z  2025-02-19T22:15:03Z   
5042          @NdraAquarius14  2025-02-19T22:15:02Z  2025-02-19T22:15:02Z   
5043         @edothereddevils  2025-02-19T22:15:00Z  2025-02-19T22:15:00Z   
5044         @bertcreator.864  2025-02-19T22:14:58Z  2025-02-19T22:14:58Z   

      like_count                                            comment  
0    

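A total of 5,103 comments were scraped successfully. Not all comments could be retrieved, probably because some were flagged as spam, hidden by users, or filtered out by the API. The code above also collects only top-level comments, so replies, which count toward a video's total, are not included.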
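One way to check how much of the gap might come from replies is to sum the `totalReplyCount` field that each `commentThreads` item carries in its `snippet`. The sketch below is a minimal illustration, not the notebook's method: `count_threads_and_replies` is a hypothetical helper, and `sample_pages` is made-up data shaped like API responses. Whether replies actually explain the gap for this video is not confirmed by the data above.

```python
def count_threads_and_replies(pages):
    """Count top-level comments and their replies across response pages.

    `pages` is an iterable of commentThreads.list responses (dicts)
    requested with part="snippet".
    """
    top_level = 0
    replies = 0
    for page in pages:
        for item in page.get("items", []):
            top_level += 1
            # totalReplyCount is the number of replies in this thread.
            replies += item["snippet"].get("totalReplyCount", 0)
    return top_level, replies


# Hypothetical sample pages, for illustration only (not real scraped data).
sample_pages = [
    {"items": [{"snippet": {"totalReplyCount": 2}},
               {"snippet": {"totalReplyCount": 0}}]},
    {"items": [{"snippet": {"totalReplyCount": 5}}]},
]

top_level, replies = count_threads_and_replies(sample_pages)
print(top_level, replies, top_level + replies)  # 3 7 10
```

In the scraping loop, the same function could be fed each `response` as it arrives, giving a reply count to compare against the video's reported total.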

In [7]:
df.head()

| | author | published_at | updated_at | like_count | comment |
|---|---|---|---|---|---|
| 0 | @dimaskentung7370 | 2025-02-25T14:29:57Z | 2025-02-25T14:29:57Z | 0 | Yang gendong prancis di final emang beda🥶 |
| 1 | @AhmadKuncuro | 2025-02-25T12:22:51Z | 2025-02-25T12:25:03Z | 0 | Mbappe harus d naturalisasi |
| 2 | @AbdulMutalib-t1j | 2025-02-25T00:16:41Z | 2025-02-25T00:16:41Z | 0 | Siapa suruh jual Julian Alvarez 😃😃😃😃😃😃😃 |
| 3 | @JeffSwedes | 2025-02-24T14:21:55Z | 2025-02-24T14:21:55Z | 0 | Bermain di NAGAPLAY138 itu menguntungkan berka... |
| 4 | @NickyHarrissed | 2025-02-24T14:19:42Z | 2025-02-24T14:19:42Z | 0 | Coba main di NAGAPLAY138, di sana pelayanan te... |


# Save to a CSV File

In [8]:
df.to_csv("youtube_comments.csv", index=False, encoding="utf-8")