# Capstone Project: Sentiment Analysis for CMON

This project consists of three Jupyter notebooks.


## Notebook 1 Contents:
- 1. Importing Libraries
- 2. Scraping and Saving Train/Test Dataset


## Notebook 2 Contents:

- Problem Statement
- Executive Summary
- Conclusions and Recommendations
- 1. Importing Libraries
- 2. Importing Data
- 3. Data Cleansing
    - Remove moderator posts
    - Selecting the columns with useful text data
    - Filling up null values
    - Removing duplicated posts
    - No outliers removed; "keywords" are handled in Data Preprocessing

- 4. Data Preprocessing
    - Remove HTML tags using BeautifulSoup
    - Lowercase all words and split the text into words
    - Remove non-letters (special characters and numbers)
    - Remove keywords that point to a specific subreddit
    - Remove stopwords: common words that are not useful for text classification
    - Lemmatize words: convert each word to its base form
    - Rejoin words back into a string

- 5. Exploratory Data Analysis
    - Wordcloud
    - Barcharts
    - Distribution of Meaningful Words 

- 6. Saving and exporting the train/test set
- 7. Preparing, saving, and exporting the holdout set following steps 1 to 4 above
- 8. Topic modelling
- 9. Success Evaluation
- 10. Findings
- 11. Conclusion and Recommendations
- 12. Next Steps
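
The preprocessing steps listed under item 4 can be sketched roughly as follows. This is a minimal, dependency-free illustration rather than the notebook's actual code: the stopword set is a hypothetical stand-in, a regular expression replaces BeautifulSoup for tag removal, and the subreddit-keyword and lemmatization steps are left out.

```python
import re

# Hypothetical, deliberately tiny stopword list used only for this sketch.
STOPWORDS = {"the", "a", "an", "is", "it", "and", "of", "to", "this"}

def preprocess(text, stopwords=STOPWORDS):
    """Clean one comment: strip tags, keep letters, lowercase, drop stopwords."""
    text = re.sub(r"<[^>]+>", " ", text)    # crude stand-in for BeautifulSoup tag removal
    text = re.sub(r"[^a-zA-Z]", " ", text)  # remove special characters and numbers
    words = text.lower().split()            # lowercase and split into words
    words = [w for w in words if w not in stopwords]
    # Lemmatization would go here; it is omitted to avoid extra dependencies.
    return " ".join(words)                  # rejoin words into a string

print(preprocess("<b>This</b> is the BEST game of 2021!!"))  # best game
```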

## Notebook 3 Contents:

- 1. Importing Libraries
- 2. Importing Data
- 3. Modeling
    - CountVectorizer & Logistic Regression
    - TF-IDF & Logistic Regression   
    - CountVectorizer & Naive Bayes 
    - TF-IDF & Naive Bayes  
    - CountVectorizer & SVC
    - TF-IDF & SVC
    - LSTM
- 4. Model Evaluation
- 5. Selection of Production Model
- 6. Identifying the Most Predictive Words
- 7. Applying Chosen Model on Holdout Set


## Data Collection

In [1]:
import pandas as pd
import requests
import re
from bs4 import BeautifulSoup

## Scraping for Ankh: Gods of Egypt

In [2]:
game_id = '285967'
page_number = 1

In [3]:
comment_list = []
name = ''
while page_number != 0:
    url = f'https://www.boardgamegeek.com/xmlapi/boardgame/{game_id}?comments=1&page={page_number}'
    print(f'Request response from {url}')
    res = requests.get(url)
    if res.status_code == 200:
        print('Request successful. Start scrapping comments...')
        soup = BeautifulSoup(res.content, 'lxml')
        name = soup.find('name').text
        comment = soup.find_all('comment')
        if len(comment) != 0: # there are some comments available in the url
            page_number += 1
            for i in range(len(comment)):
                comment_dict = {}
                comment_dict['username'] = comment[i].attrs['username']
                comment_dict['rating'] = comment[i].attrs['rating']
                comment_dict['comment'] = comment[i].text
                comment_list.append(comment_dict)
        else:
            print(f'No comments found in page {page_number}')
            print(f'Total {len(comment_list)} comments scraped for <{name}>. End of Scrapping')
            page_number = 0
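    else:
        print(f'Getting response failed with status {res.status_code}. Stopping.')
        page_number = 0  # stop; otherwise the loop retries the same page forever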

Request response from https://www.boardgamegeek.com/xmlapi/boardgame/285967?comments=1&page=1
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/285967?comments=1&page=2
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/285967?comments=1&page=3
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/285967?comments=1&page=4
Request successful. Start scrapping comments...
No comments found in page 4
Total 222 comments scraped for <Ankh: Gods of Egypt>. End of Scrapping


In [4]:
comment_df = pd.DataFrame(comment_list)

In [5]:
name = ' '.join(re.findall(r"(?i)\b[a-z]+\b", name))  # keep only words made entirely of ASCII letters
comment_df.to_csv(f'{name} comments.csv')
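
The cells above are repeated for every game with only `game_id` changing. One possible way to avoid that repetition is to move the paging logic into a function. The sketch below is an assumption about how that could look, not part of the original notebook: `collect_comments`, `fetch_page`, and `max_pages` are hypothetical names, and the network call is replaced by a fake in-memory lookup so the example runs offline.

```python
from typing import Callable, Dict, List

Comment = Dict[str, str]

def collect_comments(fetch_page: Callable[[int], List[Comment]],
                     max_pages: int = 500) -> List[Comment]:
    """Request pages 1, 2, 3, ... until one comes back empty."""
    collected: List[Comment] = []
    for page_number in range(1, max_pages + 1):
        page_comments = fetch_page(page_number)
        if not page_comments:  # an empty page marks the end of the comments
            break
        collected.extend(page_comments)
    return collected

# Stand-in for the network call so the sketch runs without internet access.
fake_pages = {
    1: [{"username": "alice", "rating": "8", "comment": "Great minis."}],
    2: [{"username": "bob", "rating": "7", "comment": "Solid co-op."}],
}
comments = collect_comments(lambda page: fake_pages.get(page, []))
print(len(comments))  # 2
```

In the notebook, `fetch_page` would wrap the `requests.get` and `BeautifulSoup` calls from the cell above, and the function could then be called once per `game_id`. The `max_pages` cap also guards against a loop that never terminates.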

## Scraping for Marvel United

In [6]:
game_id = '298047'
page_number = 1

In [7]:
comment_list = []
name = ''
while page_number != 0:
    url = f'https://www.boardgamegeek.com/xmlapi/boardgame/{game_id}?comments=1&page={page_number}'
    print(f'Request response from {url}')
    res = requests.get(url)
    if res.status_code == 200:
        print('Request successful. Start scrapping comments...')
        soup = BeautifulSoup(res.content, 'lxml')
        name = soup.find('name').text
        comment = soup.find_all('comment')
        if len(comment) != 0: # there are some comments available in the url
            page_number += 1
            for i in range(len(comment)):
                comment_dict = {}
                comment_dict['username'] = comment[i].attrs['username']
                comment_dict['rating'] = comment[i].attrs['rating']
                comment_dict['comment'] = comment[i].text
                comment_list.append(comment_dict)
        else:
            print(f'No comments found in page {page_number}')
            print(f'Total {len(comment_list)} comments scraped for <{name}>. End of Scrapping')
            page_number = 0
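    else:
        print(f'Getting response failed with status {res.status_code}. Stopping.')
        page_number = 0  # stop; otherwise the loop retries the same page forever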

Request response from https://www.boardgamegeek.com/xmlapi/boardgame/298047?comments=1&page=1
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/298047?comments=1&page=2
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/298047?comments=1&page=3
Request successful. Start scrapping comments...
No comments found in page 3
Total 109 comments scraped for <Marvel United>. End of Scrapping


In [8]:
comment_df = pd.DataFrame(comment_list)

In [9]:
name = ' '.join(re.findall(r"(?i)\b[a-z]+\b", name))  # keep only words made entirely of ASCII letters
comment_df.to_csv(f'{name} comments.csv')

## Scraping for Sheriff of Nottingham 

In [10]:
game_id = '298638'
page_number = 1

In [11]:
comment_list = []
name = ''
while page_number != 0:
    url = f'https://www.boardgamegeek.com/xmlapi/boardgame/{game_id}?comments=1&page={page_number}'
    print(f'Request response from {url}')
    res = requests.get(url)
    if res.status_code == 200:
        print('Request successful. Start scrapping comments...')
        soup = BeautifulSoup(res.content, 'lxml')
        name = soup.find('name').text
        comment = soup.find_all('comment')
        if len(comment) != 0: # there are some comments available in the url
            page_number += 1
            for i in range(len(comment)):
                comment_dict = {}
                comment_dict['username'] = comment[i].attrs['username']
                comment_dict['rating'] = comment[i].attrs['rating']
                comment_dict['comment'] = comment[i].text
                comment_list.append(comment_dict)
        else:
            print(f'No comments found in page {page_number}')
            print(f'Total {len(comment_list)} comments scraped for <{name}>. End of Scrapping')
            page_number = 0
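    else:
        print(f'Getting response failed with status {res.status_code}. Stopping.')
        page_number = 0  # stop; otherwise the loop retries the same page forever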

Request response from https://www.boardgamegeek.com/xmlapi/boardgame/298638?comments=1&page=1
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/298638?comments=1&page=2
Request successful. Start scrapping comments...
No comments found in page 2
Total 9 comments scraped for <Sheriff of Nottingham (2nd Edition)>. End of Scrapping


In [12]:
comment_df = pd.DataFrame(comment_list)

In [13]:
name = ' '.join(re.findall(r"(?i)\b[a-z]+\b", name))  # keep only words made entirely of ASCII letters
comment_df.to_csv(f'{name} comments.csv')

## Scraping for Trudvang Legends

In [14]:
game_id = '266064'
page_number = 1

In [15]:
comment_list = []
name = ''
while page_number != 0:
    url = f'https://www.boardgamegeek.com/xmlapi/boardgame/{game_id}?comments=1&page={page_number}'
    print(f'Request response from {url}')
    res = requests.get(url)
    if res.status_code == 200:
        print('Request successful. Start scrapping comments...')
        soup = BeautifulSoup(res.content, 'lxml')
        name = soup.find('name').text
        comment = soup.find_all('comment')
        if len(comment) != 0: # there are some comments available in the url
            page_number += 1
            for i in range(len(comment)):
                comment_dict = {}
                comment_dict['username'] = comment[i].attrs['username']
                comment_dict['rating'] = comment[i].attrs['rating']
                comment_dict['comment'] = comment[i].text
                comment_list.append(comment_dict)
        else:
            print(f'No comments found in page {page_number}')
            print(f'Total {len(comment_list)} comments scraped for <{name}>. End of Scrapping')
            page_number = 0
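    else:
        print(f'Getting response failed with status {res.status_code}. Stopping.')
        page_number = 0  # stop; otherwise the loop retries the same page forever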

Request response from https://www.boardgamegeek.com/xmlapi/boardgame/266064?comments=1&page=1
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/266064?comments=1&page=2
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/266064?comments=1&page=3
Request successful. Start scrapping comments...
No comments found in page 3
Total 145 comments scraped for <Trudvang Legends>. End of Scrapping


In [16]:
comment_df = pd.DataFrame(comment_list)

In [17]:
name = ' '.join(re.findall(r"(?i)\b[a-z]+\b", name))  # keep only words made entirely of ASCII letters
comment_df.to_csv(f'{name} comments.csv')

## Scraping for God of War: The Card Game

In [18]:
game_id = '278120'
page_number = 1

In [19]:
comment_list = []
name = ''
while page_number != 0:
    url = f'https://www.boardgamegeek.com/xmlapi/boardgame/{game_id}?comments=1&page={page_number}'
    print(f'Request response from {url}')
    res = requests.get(url)
    if res.status_code == 200:
        print('Request successful. Start scrapping comments...')
        soup = BeautifulSoup(res.content, 'lxml')
        name = soup.find('name').text
        comment = soup.find_all('comment')
        if len(comment) != 0: # there are some comments available in the url
            page_number += 1
            for i in range(len(comment)):
                comment_dict = {}
                comment_dict['username'] = comment[i].attrs['username']
                comment_dict['rating'] = comment[i].attrs['rating']
                comment_dict['comment'] = comment[i].text
                comment_list.append(comment_dict)
        else:
            print(f'No comments found in page {page_number}')
            print(f'Total {len(comment_list)} comments scraped for <{name}>. End of Scrapping')
            page_number = 0
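    else:
        print(f'Getting response failed with status {res.status_code}. Stopping.')
        page_number = 0  # stop; otherwise the loop retries the same page forever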

Request response from https://www.boardgamegeek.com/xmlapi/boardgame/278120?comments=1&page=1
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/278120?comments=1&page=2
Request successful. Start scrapping comments...
No comments found in page 2
Total 76 comments scraped for <God of War: Das Kartenspiel>. End of Scrapping


In [20]:
comment_df = pd.DataFrame(comment_list)

In [21]:
name = ' '.join(re.findall(r"(?i)\b[a-z]+\b", name))  # keep only words made entirely of ASCII letters
comment_df.to_csv(f'{name} comments.csv')
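
The output above reports the title as "God of War: Das Kartenspiel" because `soup.find('name')` returns the first `<name>` element in the response, which is not necessarily the English title. The sketch below shows one way to prefer a primary title. It assumes the response flags that title with a `primary="true"` attribute, which should be checked against a real response. The sample XML and the `primary_name` helper are hypothetical, and the standard-library `xml.etree.ElementTree` is used here instead of BeautifulSoup so the example has no third-party dependencies.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed response used only for illustration.
SAMPLE_XML = """
<boardgames>
  <boardgame objectid="1">
    <name sortindex="1">Beispielspiel</name>
    <name primary="true" sortindex="1">Example Game</name>
    <comment username="alice" rating="8">Fun with four players.</comment>
  </boardgame>
</boardgames>
"""

def primary_name(xml_text):
    """Return the name flagged primary="true", else the first name, else ''."""
    root = ET.fromstring(xml_text.strip())
    names = root.findall(".//name")
    for element in names:
        if element.get("primary") == "true":  # assumed marker for the main title
            return element.text or ""
    return (names[0].text or "") if names else ""

print(primary_name(SAMPLE_XML))  # Example Game
```

With BeautifulSoup, a comparable lookup would be `soup.find('name', attrs={'primary': 'true'})`, under the same assumption about the attribute.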

## Scraping for Bloodborne: The Board Game

In [22]:
game_id = '273330'
page_number = 1

In [23]:
comment_list = []
name = ''
while page_number != 0:
    url = f'https://www.boardgamegeek.com/xmlapi/boardgame/{game_id}?comments=1&page={page_number}'
    print(f'Request response from {url}')
    res = requests.get(url)
    if res.status_code == 200:
        print('Request successful. Start scrapping comments...')
        soup = BeautifulSoup(res.content, 'lxml')
        name = soup.find('name').text
        comment = soup.find_all('comment')
        if len(comment) != 0: # there are some comments available in the url
            page_number += 1
            for i in range(len(comment)):
                comment_dict = {}
                comment_dict['username'] = comment[i].attrs['username']
                comment_dict['rating'] = comment[i].attrs['rating']
                comment_dict['comment'] = comment[i].text
                comment_list.append(comment_dict)
        else:
            print(f'No comments found in page {page_number}')
            print(f'Total {len(comment_list)} comments scraped for <{name}>. End of Scrapping')
            page_number = 0
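    else:
        print(f'Getting response failed with status {res.status_code}. Stopping.')
        page_number = 0  # stop; otherwise the loop retries the same page forever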

Request response from https://www.boardgamegeek.com/xmlapi/boardgame/273330?comments=1&page=1
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/273330?comments=1&page=2
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/273330?comments=1&page=3
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/273330?comments=1&page=4
Request successful. Start scrapping comments...
No comments found in page 4
Total 201 comments scraped for <Bloodborne: The Board Game>. End of Scrapping


In [24]:
comment_df = pd.DataFrame(comment_list)

In [25]:
name = ' '.join(re.findall(r"(?i)\b[a-z]+\b", name))  # keep only words made entirely of ASCII letters
comment_df.to_csv(f'{name} comments.csv')

## Scraping for Foodies

In [26]:
game_id = '280896'
page_number = 1

In [27]:
comment_list = []
name = ''
while page_number != 0:
    url = f'https://www.boardgamegeek.com/xmlapi/boardgame/{game_id}?comments=1&page={page_number}'
    print(f'Request response from {url}')
    res = requests.get(url)
    if res.status_code == 200:
        print('Request successful. Start scrapping comments...')
        soup = BeautifulSoup(res.content, 'lxml')
        name = soup.find('name').text
        comment = soup.find_all('comment')
        if len(comment) != 0: # there are some comments available in the url
            page_number += 1
            for i in range(len(comment)):
                comment_dict = {}
                comment_dict['username'] = comment[i].attrs['username']
                comment_dict['rating'] = comment[i].attrs['rating']
                comment_dict['comment'] = comment[i].text
                comment_list.append(comment_dict)
        else:
            print(f'No comments found in page {page_number}')
            print(f'Total {len(comment_list)} comments scraped for <{name}>. End of Scrapping')
            page_number = 0
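    else:
        print(f'Getting response failed with status {res.status_code}. Stopping.')
        page_number = 0  # stop; otherwise the loop retries the same page forever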

Request response from https://www.boardgamegeek.com/xmlapi/boardgame/280896?comments=1&page=1
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/280896?comments=1&page=2
Request successful. Start scrapping comments...
No comments found in page 2
Total 67 comments scraped for <Foodies>. End of Scrapping


In [28]:
comment_df = pd.DataFrame(comment_list)

In [29]:
name = ' '.join(re.findall(r"(?i)\b[a-z]+\b", name))  # keep only words made entirely of ASCII letters
comment_df.to_csv(f'{name} comments.csv')

## Scraping for Munchkin Dungeon

In [30]:
game_id = '257001'
page_number = 1

In [31]:
comment_list = []
name = ''
while page_number != 0:
    url = f'https://www.boardgamegeek.com/xmlapi/boardgame/{game_id}?comments=1&page={page_number}'
    print(f'Request response from {url}')
    res = requests.get(url)
    if res.status_code == 200:
        print('Request successful. Start scrapping comments...')
        soup = BeautifulSoup(res.content, 'lxml')
        name = soup.find('name').text
        comment = soup.find_all('comment')
        if len(comment) != 0: # there are some comments available in the url
            page_number += 1
            for i in range(len(comment)):
                comment_dict = {}
                comment_dict['username'] = comment[i].attrs['username']
                comment_dict['rating'] = comment[i].attrs['rating']
                comment_dict['comment'] = comment[i].text
                comment_list.append(comment_dict)
        else:
            print(f'No comments found in page {page_number}')
            print(f'Total {len(comment_list)} comments scraped for <{name}>. End of Scrapping')
            page_number = 0
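    else:
        print(f'Getting response failed with status {res.status_code}. Stopping.')
        page_number = 0  # stop; otherwise the loop retries the same page forever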

Request response from https://www.boardgamegeek.com/xmlapi/boardgame/257001?comments=1&page=1
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/257001?comments=1&page=2
Request successful. Start scrapping comments...
No comments found in page 2
Total 63 comments scraped for <Munchkin Dungeon>. End of Scrapping


In [32]:
comment_df = pd.DataFrame(comment_list)

In [33]:
name = ' '.join(re.findall(r"(?i)\b[a-z]+\b", name))  # keep only words made entirely of ASCII letters
comment_df.to_csv(f'{name} comments.csv')

## Scraping for Project: ELITE

In [34]:
game_id = '256999'
page_number = 1

In [35]:
comment_list = []
name = ''
while page_number != 0:
    url = f'https://www.boardgamegeek.com/xmlapi/boardgame/{game_id}?comments=1&page={page_number}'
    print(f'Request response from {url}')
    res = requests.get(url)
    if res.status_code == 200:
        print('Request successful. Start scrapping comments...')
        soup = BeautifulSoup(res.content, 'lxml')
        name = soup.find('name').text
        comment = soup.find_all('comment')
        if len(comment) != 0: # there are some comments available in the url
            page_number += 1
            for i in range(len(comment)):
                comment_dict = {}
                comment_dict['username'] = comment[i].attrs['username']
                comment_dict['rating'] = comment[i].attrs['rating']
                comment_dict['comment'] = comment[i].text
                comment_list.append(comment_dict)
        else:
            print(f'No comments found in page {page_number}')
            print(f'Total {len(comment_list)} comments scraped for <{name}>. End of Scrapping')
            page_number = 0
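    else:
        print(f'Getting response failed with status {res.status_code}. Stopping.')
        page_number = 0  # stop; otherwise the loop retries the same page forever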

Request response from https://www.boardgamegeek.com/xmlapi/boardgame/256999?comments=1&page=1
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/256999?comments=1&page=2
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/256999?comments=1&page=3
Request successful. Start scrapping comments...
No comments found in page 3
Total 148 comments scraped for <Project: ELITE>. End of Scrapping


In [36]:
comment_df = pd.DataFrame(comment_list)

In [37]:
name = ' '.join(re.findall(r"(?i)\b[a-z]+\b", name))  # keep only words made entirely of ASCII letters
comment_df.to_csv(f'{name} comments.csv')

## Scraping for Starcadia Quest

In [38]:
game_id = '257193'
page_number = 1

In [39]:
comment_list = []
name = ''
while page_number != 0:
    url = f'https://www.boardgamegeek.com/xmlapi/boardgame/{game_id}?comments=1&page={page_number}'
    print(f'Request response from {url}')
    res = requests.get(url)
    if res.status_code == 200:
        print('Request successful. Start scrapping comments...')
        soup = BeautifulSoup(res.content, 'lxml')
        name = soup.find('name').text
        comment = soup.find_all('comment')
        if len(comment) != 0: # there are some comments available in the url
            page_number += 1
            for i in range(len(comment)):
                comment_dict = {}
                comment_dict['username'] = comment[i].attrs['username']
                comment_dict['rating'] = comment[i].attrs['rating']
                comment_dict['comment'] = comment[i].text
                comment_list.append(comment_dict)
        else:
            print(f'No comments found in page {page_number}')
            print(f'Total {len(comment_list)} comments scraped for <{name}>. End of Scrapping')
            page_number = 0
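    else:
        print(f'Getting response failed with status {res.status_code}. Stopping.')
        page_number = 0  # stop; otherwise the loop retries the same page forever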

Request response from https://www.boardgamegeek.com/xmlapi/boardgame/257193?comments=1&page=1
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/257193?comments=1&page=2
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/257193?comments=1&page=3
Request successful. Start scrapping comments...
No comments found in page 3
Total 122 comments scraped for <Starcadia Quest>. End of Scrapping


In [40]:
comment_df = pd.DataFrame(comment_list)

In [41]:
name = ' '.join(re.findall(r"(?i)\b[a-z]+\b", name))  # keep only words made entirely of ASCII letters
comment_df.to_csv(f'{name} comments.csv')

## Scraping for Cthulhu: Death May Die

In [42]:
game_id = '253344'
page_number = 1

In [43]:
comment_list = []
name = ''
while page_number != 0:
    url = f'https://www.boardgamegeek.com/xmlapi/boardgame/{game_id}?comments=1&page={page_number}'
    print(f'Request response from {url}')
    res = requests.get(url)
    if res.status_code == 200:
        print('Request successful. Start scrapping comments...')
        soup = BeautifulSoup(res.content, 'lxml')
        name = soup.find('name').text
        comment = soup.find_all('comment')
        if len(comment) != 0: # there are some comments available in the url
            page_number += 1
            for i in range(len(comment)):
                comment_dict = {}
                comment_dict['username'] = comment[i].attrs['username']
                comment_dict['rating'] = comment[i].attrs['rating']
                comment_dict['comment'] = comment[i].text
                comment_list.append(comment_dict)
        else:
            print(f'No comments found in page {page_number}')
            print(f'Total {len(comment_list)} comments scraped for <{name}>. End of Scrapping')
            page_number = 0
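    else:
        print(f'Getting response failed with status {res.status_code}. Stopping.')
        page_number = 0  # stop; otherwise the loop retries the same page forever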

Request response from https://www.boardgamegeek.com/xmlapi/boardgame/253344?comments=1&page=1
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/253344?comments=1&page=2
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/253344?comments=1&page=3
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/253344?comments=1&page=4
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/253344?comments=1&page=5
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/253344?comments=1&page=6
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/253344?comments=1&page=7
Request successful. Start scrapping comments...
Reques

In [44]:
comment_df = pd.DataFrame(comment_list)

In [45]:
name = ' '.join(re.findall(r"(?i)\b[a-z]+\b", name))  # keep only words made entirely of ASCII letters
comment_df.to_csv(f'{name} comments.csv')

## Scraping for Wacky Races

In [46]:
game_id = '256729'
page_number = 1

In [47]:
comment_list = []
name = ''
while page_number != 0:
    url = f'https://www.boardgamegeek.com/xmlapi/boardgame/{game_id}?comments=1&page={page_number}'
    print(f'Request response from {url}')
    res = requests.get(url)
    if res.status_code == 200:
        print('Request successful. Start scrapping comments...')
        soup = BeautifulSoup(res.content, 'lxml')
        name = soup.find('name').text
        comment = soup.find_all('comment')
        if len(comment) != 0: # there are some comments available in the url
            page_number += 1
            for i in range(len(comment)):
                comment_dict = {}
                comment_dict['username'] = comment[i].attrs['username']
                comment_dict['rating'] = comment[i].attrs['rating']
                comment_dict['comment'] = comment[i].text
                comment_list.append(comment_dict)
        else:
            print(f'No comments found in page {page_number}')
            print(f'Total {len(comment_list)} comments scraped for <{name}>. End of Scrapping')
            page_number = 0
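    else:
        print(f'Getting response failed with status {res.status_code}. Stopping.')
        page_number = 0  # stop; otherwise the loop retries the same page forever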

Request response from https://www.boardgamegeek.com/xmlapi/boardgame/256729?comments=1&page=1
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/256729?comments=1&page=2
Request successful. Start scrapping comments...
No comments found in page 2
Total 88 comments scraped for <Los Autos Locos: El Juego de Mesa>. End of Scrapping


In [48]:
comment_df = pd.DataFrame(comment_list)

In [49]:
name = ' '.join(re.findall(r"(?i)\b[a-z]+\b", name))  # keep only words made entirely of ASCII letters
comment_df.to_csv(f'{name} comments.csv')

## Scraping for Blue Moon City

In [50]:
game_id = '21882'
page_number = 1

In [51]:
comment_list = []
name = ''
while page_number != 0:
    url = f'https://www.boardgamegeek.com/xmlapi/boardgame/{game_id}?comments=1&page={page_number}'
    print(f'Request response from {url}')
    res = requests.get(url)
    if res.status_code == 200:
        print('Request successful. Start scrapping comments...')
        soup = BeautifulSoup(res.content, 'lxml')
        name = soup.find('name').text
        comment = soup.find_all('comment')
        if len(comment) != 0: # there are some comments available in the url
            page_number += 1
            for i in range(len(comment)):
                comment_dict = {}
                comment_dict['username'] = comment[i].attrs['username']
                comment_dict['rating'] = comment[i].attrs['rating']
                comment_dict['comment'] = comment[i].text
                comment_list.append(comment_dict)
        else:
            print(f'No comments found in page {page_number}')
            print(f'Total {len(comment_list)} comments scraped for <{name}>. End of Scrapping')
            page_number = 0
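    else:
        print(f'Getting response failed with status {res.status_code}. Stopping.')
        page_number = 0  # stop; otherwise the loop retries the same page forever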

Request response from https://www.boardgamegeek.com/xmlapi/boardgame/21882?comments=1&page=1
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/21882?comments=1&page=2
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/21882?comments=1&page=3
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/21882?comments=1&page=4
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/21882?comments=1&page=5
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/21882?comments=1&page=6
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/21882?comments=1&page=7
Request successful. Start scrapping comments...
Request respo

In [52]:
comment_df = pd.DataFrame(comment_list)

In [53]:
name = ' '.join(re.findall(r"(?i)\b[a-z]+\b", name))  # keep only words made entirely of ASCII letters
comment_df.to_csv(f'{name} comments.csv')

## Scraping for Victorian Masterminds

In [54]:
game_id = '189453'
page_number = 1

In [55]:
comment_list = []
name = ''
while page_number != 0:
    url = f'https://www.boardgamegeek.com/xmlapi/boardgame/{game_id}?comments=1&page={page_number}'
    print(f'Request response from {url}')
    res = requests.get(url)
    if res.status_code == 200:
        print('Request successful. Start scrapping comments...')
        soup = BeautifulSoup(res.content, 'lxml')
        name = soup.find('name').text
        comment = soup.find_all('comment')
        if len(comment) != 0: # there are some comments available in the url
            page_number += 1
            for i in range(len(comment)):
                comment_dict = {}
                comment_dict['username'] = comment[i].attrs['username']
                comment_dict['rating'] = comment[i].attrs['rating']
                comment_dict['comment'] = comment[i].text
                comment_list.append(comment_dict)
        else:
            print(f'No comments found in page {page_number}')
            print(f'Total {len(comment_list)} comments scraped for <{name}>. End of Scrapping')
            page_number = 0
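    else:
        print(f'Getting response failed with status {res.status_code}. Stopping.')
        page_number = 0  # stop; otherwise the loop retries the same page forever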

Request response from https://www.boardgamegeek.com/xmlapi/boardgame/189453?comments=1&page=1
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/189453?comments=1&page=2
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/189453?comments=1&page=3
Request successful. Start scrapping comments...
No comments found in page 3
Total 197 comments scraped for <Genios Victorianos>. End of Scrapping


In [56]:
comment_df = pd.DataFrame(comment_list)

In [57]:
name = ' '.join(re.findall(r"(?i)\b[a-z]+\b", name))  # keep only words made entirely of ASCII letters
comment_df.to_csv(f'{name} comments.csv')

## Scraping for Narcos

In [58]:
game_id = '253106'
page_number = 1

In [59]:
comment_list = []
name = ''
while page_number != 0:
    url = f'https://www.boardgamegeek.com/xmlapi/boardgame/{game_id}?comments=1&page={page_number}'
    print(f'Request response from {url}')
    res = requests.get(url)
    if res.status_code == 200:
        print('Request successful. Start scrapping comments...')
        soup = BeautifulSoup(res.content, 'lxml')
        name = soup.find('name').text
        comment = soup.find_all('comment')
        if len(comment) != 0: # there are some comments available in the url
            page_number += 1
            for i in range(len(comment)):
                comment_dict = {}
                comment_dict['username'] = comment[i].attrs['username']
                comment_dict['rating'] = comment[i].attrs['rating']
                comment_dict['comment'] = comment[i].text
                comment_list.append(comment_dict)
        else:
            print(f'No comments found in page {page_number}')
            print(f'Total {len(comment_list)} comments scraped for <{name}>. End of Scrapping')
            page_number = 0
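    else:
        print(f'Getting response failed with status {res.status_code}. Stopping.')
        page_number = 0  # stop; otherwise the loop retries the same page forever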

Request response from https://www.boardgamegeek.com/xmlapi/boardgame/253106?comments=1&page=1
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/253106?comments=1&page=2
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/253106?comments=1&page=3
Request successful. Start scrapping comments...
No comments found in page 3
Total 104 comments scraped for <Narcos: Desková hra>. End of Scrapping


In [60]:
comment_df = pd.DataFrame(comment_list)

In [61]:
name = ' '.join(re.findall(r"(?i)\b[a-z]+\b", name))  # keep only words made entirely of ASCII letters
comment_df.to_csv(f'{name} comments.csv')
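
For a title such as "Narcos: Desková hra", the pattern `\b[a-z]+\b` silently drops the whole word "Desková", because the accented letter prevents a word boundary after "Deskov". The sketch below compares the notebook's pattern with one possible alternative that strips punctuation but keeps accented letters. `safe_filename` is a hypothetical helper, not part of the original notebook.

```python
import re

def ascii_words(name):
    """The notebook's approach: keep words made only of ASCII letters."""
    return " ".join(re.findall(r"(?i)\b[a-z]+\b", name))

def safe_filename(name):
    """Alternative sketch: drop punctuation, keep accented letters and digits."""
    cleaned = re.sub(r"[^\w\s-]", "", name)      # remove characters such as ':' or '!'
    return re.sub(r"\s+", " ", cleaned).strip()  # collapse repeated whitespace

title = "Narcos: Desková hra"
print(ascii_words(title))    # Narcos hra
print(safe_filename(title))  # Narcos Desková hra
```

Whether accented characters are acceptable in file names depends on the file system and on any downstream tooling, so the stricter ASCII-only pattern may still be the right choice for some setups.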

## Scraping for Sugar Blast

In [62]:
game_id = '261449'
page_number = 1

In [63]:
comment_list = []
name = ''
while page_number != 0:
    url = f'https://www.boardgamegeek.com/xmlapi/boardgame/{game_id}?comments=1&page={page_number}'
    print(f'Request response from {url}')
    res = requests.get(url)
    if res.status_code == 200:
        print('Request successful. Start scrapping comments...')
        soup = BeautifulSoup(res.content, 'lxml')
        name = soup.find('name').text
        comment = soup.find_all('comment')
        if len(comment) != 0: # there are some comments available in the url
            page_number += 1
            for i in range(len(comment)):
                comment_dict = {}
                comment_dict['username'] = comment[i].attrs['username']
                comment_dict['rating'] = comment[i].attrs['rating']
                comment_dict['comment'] = comment[i].text
                comment_list.append(comment_dict)
        else:
            print(f'No comments found in page {page_number}')
            print(f'Total {len(comment_list)} comments scraped for <{name}>. End of Scrapping')
            page_number = 0
    else:
        print('Getting response failed.')
        break  # stop here; otherwise the loop would re-request the same page forever

Request response from https://www.boardgamegeek.com/xmlapi/boardgame/261449?comments=1&page=1
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/261449?comments=1&page=2
Request successful. Start scrapping comments...
No comments found in page 2
Total 1 comments scraped for <Sugar Blast>. End of Scrapping


In [64]:
comment_df = pd.DataFrame(comment_list)

In [65]:
name = ' '.join(re.findall(r"(?i)\b[a-z]+\b", name))  # keep only purely alphabetic words for the filename
comment_df.to_csv(f'{name} comments.csv')
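
The same paging loop is repeated for every game in this notebook. As a rough sketch, it could be factored into one function that keeps requesting pages until a page has no comments. To stay runnable offline, this version swaps BeautifulSoup for the standard-library `xml.etree.ElementTree` and takes the page fetcher as a parameter. The names `parse_comments`, `scrape_comments`, `fake_fetch`, `max_pages` and the sample XML are all assumptions made up for illustration:

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, List

def parse_comments(xml_text: str) -> List[Dict[str, str]]:
    """Extract username, rating and text from every <comment> element."""
    root = ET.fromstring(xml_text)
    return [
        {'username': c.get('username'), 'rating': c.get('rating'), 'comment': c.text or ''}
        for c in root.iter('comment')
    ]

def scrape_comments(fetch_page: Callable[[int], str], max_pages: int = 1000) -> List[Dict[str, str]]:
    """Collect comments page by page until an empty page is returned."""
    all_comments: List[Dict[str, str]] = []
    for page in range(1, max_pages + 1):  # max_pages guards against endless loops
        rows = parse_comments(fetch_page(page))
        if not rows:  # an empty page marks the end of the comments
            break
        all_comments.extend(rows)
    return all_comments

# Made-up pages standing in for the real API responses.
FAKE_PAGES = {
    1: '<boardgames><boardgame><name>Demo</name>'
       '<comment username="a" rating="8">Great</comment>'
       '<comment username="b" rating="7">Fine</comment>'
       '</boardgame></boardgames>',
    2: '<boardgames><boardgame><name>Demo</name></boardgame></boardgames>',
}

def fake_fetch(page: int) -> str:
    """Offline stand-in for an HTTP request; any page past the first is empty."""
    return FAKE_PAGES.get(page, FAKE_PAGES[2])

comments = scrape_comments(fake_fetch)
print(len(comments))  # 2
```

In the notebook, `fetch_page` could be a small function that builds the URL from `game_id` and the page number and returns `requests.get(url).text`.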

## Scraping for Kick-Ass

In [66]:
game_id = '250561'
page_number = 1

In [67]:
comment_list = []
name = ''
while page_number != 0:
    url = f'https://www.boardgamegeek.com/xmlapi/boardgame/{game_id}?comments=1&page={page_number}'
    print(f'Request response from {url}')
    res = requests.get(url)
    if res.status_code == 200:
        print('Request successful. Start scrapping comments...')
        soup = BeautifulSoup(res.content, 'lxml')
        name = soup.find('name').text
        comment = soup.find_all('comment')
        if len(comment) != 0: # there are some comments available in the url
            page_number += 1
            for i in range(len(comment)):
                comment_dict = {}
                comment_dict['username'] = comment[i].attrs['username']
                comment_dict['rating'] = comment[i].attrs['rating']
                comment_dict['comment'] = comment[i].text
                comment_list.append(comment_dict)
        else:
            print(f'No comments found in page {page_number}')
            print(f'Total {len(comment_list)} comments scraped for <{name}>. End of Scrapping')
            page_number = 0
    else:
        print('Getting response failed.')
        break  # stop here; otherwise the loop would re-request the same page forever

Request response from https://www.boardgamegeek.com/xmlapi/boardgame/250561?comments=1&page=1
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/250561?comments=1&page=2
Request successful. Start scrapping comments...
No comments found in page 2
Total 82 comments scraped for <Ha/Ver: A Társasjáték>. End of Scrapping


In [68]:
comment_df = pd.DataFrame(comment_list)

In [69]:
name = ' '.join(re.findall(r"(?i)\b[a-z]+\b", name))  # keep only purely alphabetic words for the filename
comment_df.to_csv(f'{name} comments.csv')
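
The log above reports a localized title for this game because `soup.find('name')` returns whichever `<name>` element comes first in the response. A hedged sketch of selecting a preferred title instead: it assumes, without confirmation from this notebook, that the response marks one name with a `primary="true"` attribute. The sample XML and the `primary_name` helper are invented for illustration, and the standard-library parser is used here in place of BeautifulSoup:

```python
import xml.etree.ElementTree as ET

# Invented sample; the primary="true" attribute is an assumption about the response format.
SAMPLE_XML = (
    '<boardgames><boardgame>'
    '<name sortindex="1">Ha/Ver: A Társasjáték</name>'
    '<name primary="true" sortindex="1">Kick-Ass</name>'
    '</boardgame></boardgames>'
)

def primary_name(xml_text: str) -> str:
    """Return the name flagged as primary, else the first name, else ''."""
    names = list(ET.fromstring(xml_text).iter('name'))
    for n in names:
        if n.get('primary') == 'true':
            return n.text or ''
    return (names[0].text or '') if names else ''

print(primary_name(SAMPLE_XML))  # Kick-Ass
```

If the attribute is absent, the function falls back to the first name, which matches the behaviour of the original cells.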

## Scraping for Gizmos 

In [70]:
game_id = '246192'
page_number = 1

In [71]:
comment_list = []
name = ''
while page_number != 0:
    url = f'https://www.boardgamegeek.com/xmlapi/boardgame/{game_id}?comments=1&page={page_number}'
    print(f'Request response from {url}')
    res = requests.get(url)
    if res.status_code == 200:
        print('Request successful. Start scrapping comments...')
        soup = BeautifulSoup(res.content, 'lxml')
        name = soup.find('name').text
        comment = soup.find_all('comment')
        if len(comment) != 0: # there are some comments available in the url
            page_number += 1
            for i in range(len(comment)):
                comment_dict = {}
                comment_dict['username'] = comment[i].attrs['username']
                comment_dict['rating'] = comment[i].attrs['rating']
                comment_dict['comment'] = comment[i].text
                comment_list.append(comment_dict)
        else:
            print(f'No comments found in page {page_number}')
            print(f'Total {len(comment_list)} comments scraped for <{name}>. End of Scrapping')
            page_number = 0
    else:
        print('Getting response failed.')
        break  # stop here; otherwise the loop would re-request the same page forever

Request response from https://www.boardgamegeek.com/xmlapi/boardgame/246192?comments=1&page=1
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/246192?comments=1&page=2
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/246192?comments=1&page=3
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/246192?comments=1&page=4
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/246192?comments=1&page=5
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/246192?comments=1&page=6
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/246192?comments=1&page=7
Request successful. Start scrapping comments...
[output truncated]

In [72]:
comment_df = pd.DataFrame(comment_list)

In [73]:
name = ' '.join(re.findall(r"(?i)\b[a-z]+\b", name))  # keep only purely alphabetic words for the filename
comment_df.to_csv(f'{name} comments.csv')

## Scraping for Hate

In [74]:
game_id = '233868'
page_number = 1

In [75]:
comment_list = []
name = ''
while page_number != 0:
    url = f'https://www.boardgamegeek.com/xmlapi/boardgame/{game_id}?comments=1&page={page_number}'
    print(f'Request response from {url}')
    res = requests.get(url)
    if res.status_code == 200:
        print('Request successful. Start scrapping comments...')
        soup = BeautifulSoup(res.content, 'lxml')
        name = soup.find('name').text
        comment = soup.find_all('comment')
        if len(comment) != 0: # there are some comments available in the url
            page_number += 1
            for i in range(len(comment)):
                comment_dict = {}
                comment_dict['username'] = comment[i].attrs['username']
                comment_dict['rating'] = comment[i].attrs['rating']
                comment_dict['comment'] = comment[i].text
                comment_list.append(comment_dict)
        else:
            print(f'No comments found in page {page_number}')
            print(f'Total {len(comment_list)} comments scraped for <{name}>. End of Scrapping')
            page_number = 0
    else:
        print('Getting response failed.')
        break  # stop here; otherwise the loop would re-request the same page forever

Request response from https://www.boardgamegeek.com/xmlapi/boardgame/233868?comments=1&page=1
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/233868?comments=1&page=2
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/233868?comments=1&page=3
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/233868?comments=1&page=4
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/233868?comments=1&page=5
Request successful. Start scrapping comments...
No comments found in page 5
Total 334 comments scraped for <HATE>. End of Scrapping


In [76]:
comment_df = pd.DataFrame(comment_list)

In [77]:
name = ' '.join(re.findall(r"(?i)\b[a-z]+\b", name))  # keep only purely alphabetic words for the filename
comment_df.to_csv(f'{name} comments.csv')
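
Each scraping cell prints `Getting response failed.` when the status code is not 200. One possible way to handle that branch, sketched under assumptions rather than taken from the original notebook, is a bounded retry with a growing pause between attempts. The helper `fetch_with_retry`, its parameters, and the fake responses are hypothetical; the request itself is passed in as a callable so the example runs without network access:

```python
import time
from typing import Callable, Optional, Tuple

def fetch_with_retry(
    get: Callable[[], Tuple[int, str]],
    retries: int = 3,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call get() up to `retries` times; return the body on status 200, else None."""
    for attempt in range(1, retries + 1):
        status, body = get()
        if status == 200:
            return body
        print(f'Attempt {attempt} failed with status {status}.')
        if attempt < retries:
            sleep(delay * 2 ** (attempt - 1))  # wait 1x, 2x, 4x ... the base delay
    return None

# Made-up responses: two failures followed by a success.
responses = iter([(503, ''), (503, ''), (200, '<boardgames/>')])
waits = []  # records the requested pauses instead of actually sleeping
body = fetch_with_retry(lambda: next(responses), sleep=waits.append)
print(body, waits)  # <boardgames/> [1.0, 2.0]
```

With `requests`, `get` could be a lambda that returns `(res.status_code, res.text)` for the current page URL.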

## Scraping for The World of SMOG

In [78]:
game_id = '209324'
page_number = 1

In [79]:
comment_list = []
name = ''
while page_number != 0:
    url = f'https://www.boardgamegeek.com/xmlapi/boardgame/{game_id}?comments=1&page={page_number}'
    print(f'Request response from {url}')
    res = requests.get(url)
    if res.status_code == 200:
        print('Request successful. Start scrapping comments...')
        soup = BeautifulSoup(res.content, 'lxml')
        name = soup.find('name').text
        comment = soup.find_all('comment')
        if len(comment) != 0: # there are some comments available in the url
            page_number += 1
            for i in range(len(comment)):
                comment_dict = {}
                comment_dict['username'] = comment[i].attrs['username']
                comment_dict['rating'] = comment[i].attrs['rating']
                comment_dict['comment'] = comment[i].text
                comment_list.append(comment_dict)
        else:
            print(f'No comments found in page {page_number}')
            print(f'Total {len(comment_list)} comments scraped for <{name}>. End of Scrapping')
            page_number = 0
    else:
        print('Getting response failed.')
        break  # stop here; otherwise the loop would re-request the same page forever

Request response from https://www.boardgamegeek.com/xmlapi/boardgame/209324?comments=1&page=1
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/209324?comments=1&page=2
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/209324?comments=1&page=3
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/209324?comments=1&page=4
Request successful. Start scrapping comments...
No comments found in page 4
Total 250 comments scraped for <Moloch felemelkedése>. End of Scrapping


In [80]:
comment_df = pd.DataFrame(comment_list)

In [81]:
name = ' '.join(re.findall(r"(?i)\b[a-z]+\b", name))  # keep only purely alphabetic words for the filename
comment_df.to_csv(f'{name} comments.csv')

## Scraping for Rising Sun

In [82]:
game_id = '205896'
page_number = 1

In [83]:
comment_list = []
name = ''
while page_number != 0:
    url = f'https://www.boardgamegeek.com/xmlapi/boardgame/{game_id}?comments=1&page={page_number}'
    print(f'Request response from {url}')
    res = requests.get(url)
    if res.status_code == 200:
        print('Request successful. Start scrapping comments...')
        soup = BeautifulSoup(res.content, 'lxml')
        name = soup.find('name').text
        comment = soup.find_all('comment')
        if len(comment) != 0: # there are some comments available in the url
            page_number += 1
            for i in range(len(comment)):
                comment_dict = {}
                comment_dict['username'] = comment[i].attrs['username']
                comment_dict['rating'] = comment[i].attrs['rating']
                comment_dict['comment'] = comment[i].text
                comment_list.append(comment_dict)
        else:
            print(f'No comments found in page {page_number}')
            print(f'Total {len(comment_list)} comments scraped for <{name}>. End of Scrapping')
            page_number = 0
    else:
        print('Getting response failed.')
        break  # stop here; otherwise the loop would re-request the same page forever

Request response from https://www.boardgamegeek.com/xmlapi/boardgame/205896?comments=1&page=1
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/205896?comments=1&page=2
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/205896?comments=1&page=3
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/205896?comments=1&page=4
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/205896?comments=1&page=5
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/205896?comments=1&page=6
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/205896?comments=1&page=7
Request successful. Start scrapping comments...
[output truncated]

In [84]:
comment_df = pd.DataFrame(comment_list)

In [85]:
name = ' '.join(re.findall(r"(?i)\b[a-z]+\b", name))  # keep only purely alphabetic words for the filename
comment_df.to_csv(f'{name} comments.csv')

## Scraping for Zombicide

In [86]:
game_id = '113924'
page_number = 1

In [87]:
comment_list = []
name = ''
while page_number != 0:
    url = f'https://www.boardgamegeek.com/xmlapi/boardgame/{game_id}?comments=1&page={page_number}'
    print(f'Request response from {url}')
    res = requests.get(url)
    if res.status_code == 200:
        print('Request successful. Start scrapping comments...')
        soup = BeautifulSoup(res.content, 'lxml')
        name = soup.find('name').text
        comment = soup.find_all('comment')
        if len(comment) != 0: # there are some comments available in the url
            page_number += 1
            for i in range(len(comment)):
                comment_dict = {}
                comment_dict['username'] = comment[i].attrs['username']
                comment_dict['rating'] = comment[i].attrs['rating']
                comment_dict['comment'] = comment[i].text
                comment_list.append(comment_dict)
        else:
            print(f'No comments found in page {page_number}')
            print(f'Total {len(comment_list)} comments scraped for <{name}>. End of Scrapping')
            page_number = 0
    else:
        print('Getting response failed.')
        break  # stop here; otherwise the loop would re-request the same page forever

Request response from https://www.boardgamegeek.com/xmlapi/boardgame/113924?comments=1&page=1
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/113924?comments=1&page=2
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/113924?comments=1&page=3
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/113924?comments=1&page=4
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/113924?comments=1&page=5
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/113924?comments=1&page=6
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/113924?comments=1&page=7
Request successful. Start scrapping comments...
[output truncated]

In [88]:
comment_df = pd.DataFrame(comment_list)

In [89]:
name = ' '.join(re.findall(r"(?i)\b[a-z]+\b", name))  # keep only purely alphabetic words for the filename
comment_df.to_csv(f'{name} comments.csv')

## Scraping for Dream On

In [90]:
game_id = '232980'
page_number = 1

In [91]:
comment_list = []
name = ''
while page_number != 0:
    url = f'https://www.boardgamegeek.com/xmlapi/boardgame/{game_id}?comments=1&page={page_number}'
    print(f'Request response from {url}')
    res = requests.get(url)
    if res.status_code == 200:
        print('Request successful. Start scrapping comments...')
        soup = BeautifulSoup(res.content, 'lxml')
        name = soup.find('name').text
        comment = soup.find_all('comment')
        if len(comment) != 0: # there are some comments available in the url
            page_number += 1
            for i in range(len(comment)):
                comment_dict = {}
                comment_dict['username'] = comment[i].attrs['username']
                comment_dict['rating'] = comment[i].attrs['rating']
                comment_dict['comment'] = comment[i].text
                comment_list.append(comment_dict)
        else:
            print(f'No comments found in page {page_number}')
            print(f'Total {len(comment_list)} comments scraped for <{name}>. End of Scrapping')
            page_number = 0
    else:
        print('Getting response failed.')
        break  # stop here; otherwise the loop would re-request the same page forever

Request response from https://www.boardgamegeek.com/xmlapi/boardgame/232980?comments=1&page=1
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/232980?comments=1&page=2
Request successful. Start scrapping comments...
No comments found in page 2
Total 49 comments scraped for <Chyba śnisz!>. End of Scrapping


In [92]:
comment_df = pd.DataFrame(comment_list)

In [93]:
name = ' '.join(re.findall(r"(?i)\b[a-z]+\b", name))  # keep only purely alphabetic words for the filename
comment_df.to_csv(f'{name} comments.csv')

## Scraping for Modern Art

In [94]:
game_id = '40381'
page_number = 1

In [95]:
comment_list = []
name = ''
while page_number != 0:
    url = f'https://www.boardgamegeek.com/xmlapi/boardgame/{game_id}?comments=1&page={page_number}'
    print(f'Request response from {url}')
    res = requests.get(url)
    if res.status_code == 200:
        print('Request successful. Start scrapping comments...')
        soup = BeautifulSoup(res.content, 'lxml')
        name = soup.find('name').text
        comment = soup.find_all('comment')
        if len(comment) != 0: # there are some comments available in the url
            page_number += 1
            for i in range(len(comment)):
                comment_dict = {}
                comment_dict['username'] = comment[i].attrs['username']
                comment_dict['rating'] = comment[i].attrs['rating']
                comment_dict['comment'] = comment[i].text
                comment_list.append(comment_dict)
        else:
            print(f'No comments found in page {page_number}')
            print(f'Total {len(comment_list)} comments scraped for <{name}>. End of Scrapping')
            page_number = 0
    else:
        print('Getting response failed.')
        break  # stop here; otherwise the loop would re-request the same page forever

Request response from https://www.boardgamegeek.com/xmlapi/boardgame/40381?comments=1&page=1
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/40381?comments=1&page=2
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/40381?comments=1&page=3
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/40381?comments=1&page=4
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/40381?comments=1&page=5
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/40381?comments=1&page=6
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/40381?comments=1&page=7
Request successful. Start scrapping comments...
[output truncated]

In [96]:
comment_df = pd.DataFrame(comment_list)

In [97]:
name = ' '.join(re.findall(r"(?i)\b[a-z]+\b", name))  # keep only purely alphabetic words for the filename
comment_df.to_csv(f'{name} comments.csv')

## Scraping for Massive Darkness

In [98]:
game_id = '197070'
page_number = 1

In [99]:
comment_list = []
name = ''
while page_number != 0:
    url = f'https://www.boardgamegeek.com/xmlapi/boardgame/{game_id}?comments=1&page={page_number}'
    print(f'Request response from {url}')
    res = requests.get(url)
    if res.status_code == 200:
        print('Request successful. Start scrapping comments...')
        soup = BeautifulSoup(res.content, 'lxml')
        name = soup.find('name').text
        comment = soup.find_all('comment')
        if len(comment) != 0: # there are some comments available in the url
            page_number += 1
            for i in range(len(comment)):
                comment_dict = {}
                comment_dict['username'] = comment[i].attrs['username']
                comment_dict['rating'] = comment[i].attrs['rating']
                comment_dict['comment'] = comment[i].text
                comment_list.append(comment_dict)
        else:
            print(f'No comments found in page {page_number}')
            print(f'Total {len(comment_list)} comments scraped for <{name}>. End of Scrapping')
            page_number = 0
    else:
        print('Getting response failed.')
        break  # stop here; otherwise the loop would re-request the same page forever

Request response from https://www.boardgamegeek.com/xmlapi/boardgame/197070?comments=1&page=1
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/197070?comments=1&page=2
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/197070?comments=1&page=3
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/197070?comments=1&page=4
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/197070?comments=1&page=5
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/197070?comments=1&page=6
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/197070?comments=1&page=7
Request successful. Start scrapping comments...
[output truncated]

In [100]:
comment_df = pd.DataFrame(comment_list)

In [101]:
name = ' '.join(re.findall(r"(?i)\b[a-z]+\b", name))  # keep only purely alphabetic words for the filename
comment_df.to_csv(f'{name} comments.csv')

## Scraping for Ethnos

In [102]:
game_id = '206718'
page_number = 1

In [103]:
comment_list = []
name = ''
while page_number != 0:
    url = f'https://www.boardgamegeek.com/xmlapi/boardgame/{game_id}?comments=1&page={page_number}'
    print(f'Request response from {url}')
    res = requests.get(url)
    if res.status_code == 200:
        print('Request successful. Start scrapping comments...')
        soup = BeautifulSoup(res.content, 'lxml')
        name = soup.find('name').text
        comment = soup.find_all('comment')
        if len(comment) != 0: # there are some comments available in the url
            page_number += 1
            for i in range(len(comment)):
                comment_dict = {}
                comment_dict['username'] = comment[i].attrs['username']
                comment_dict['rating'] = comment[i].attrs['rating']
                comment_dict['comment'] = comment[i].text
                comment_list.append(comment_dict)
        else:
            print(f'No comments found in page {page_number}')
            print(f'Total {len(comment_list)} comments scraped for <{name}>. End of Scrapping')
            page_number = 0
    else:
        print('Getting response failed.')
        break  # stop here; otherwise the loop would re-request the same page forever

Request response from https://www.boardgamegeek.com/xmlapi/boardgame/206718?comments=1&page=1
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/206718?comments=1&page=2
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/206718?comments=1&page=3
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/206718?comments=1&page=4
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/206718?comments=1&page=5
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/206718?comments=1&page=6
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/206718?comments=1&page=7
Request successful. Start scrapping comments...
[output truncated]

In [104]:
comment_df = pd.DataFrame(comment_list)

In [105]:
name = ' '.join(re.findall(r"(?i)\b[a-z]+\b", name))  # keep only purely alphabetic words for the filename
comment_df.to_csv(f'{name} comments.csv')

## Scraping for The Others

In [109]:
game_id = '172047'
page_number = 1

In [110]:
comment_list = []
name = ''
while page_number != 0:
    url = f'https://www.boardgamegeek.com/xmlapi/boardgame/{game_id}?comments=1&page={page_number}'
    print(f'Request response from {url}')
    res = requests.get(url)
    if res.status_code == 200:
        print('Request successful. Start scrapping comments...')
        soup = BeautifulSoup(res.content, 'lxml')
        name = soup.find('name').text
        comment = soup.find_all('comment')
        if len(comment) != 0: # there are some comments available in the url
            page_number += 1
            for i in range(len(comment)):
                comment_dict = {}
                comment_dict['username'] = comment[i].attrs['username']
                comment_dict['rating'] = comment[i].attrs['rating']
                comment_dict['comment'] = comment[i].text
                comment_list.append(comment_dict)
        else:
            print(f'No comments found in page {page_number}')
            print(f'Total {len(comment_list)} comments scraped for <{name}>. End of Scrapping')
            page_number = 0
    else:
        print('Getting response failed.')
        break  # stop here; otherwise the loop would re-request the same page forever

Request response from https://www.boardgamegeek.com/xmlapi/boardgame/172047?comments=1&page=1
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/172047?comments=1&page=2
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/172047?comments=1&page=3
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/172047?comments=1&page=4
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/172047?comments=1&page=5
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/172047?comments=1&page=6
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/172047?comments=1&page=7
Request successful. Start scrapping comments...
[output truncated]

In [111]:
comment_df = pd.DataFrame(comment_list)

In [112]:
name = ' '.join(re.findall(r"(?i)\b[a-z]+\b", name))  # keep only purely alphabetic words for the filename
comment_df.to_csv(f'{name} comments.csv')

## Scraping for Blood Rage

In [113]:
game_id = '170216'
page_number = 1

In [114]:
comment_list = []
name = ''
while page_number != 0:
    url = f'https://www.boardgamegeek.com/xmlapi/boardgame/{game_id}?comments=1&page={page_number}'
    print(f'Request response from {url}')
    res = requests.get(url)
    if res.status_code == 200:
        print('Request successful. Start scrapping comments...')
        soup = BeautifulSoup(res.content, 'lxml')
        name = soup.find('name').text
        comment = soup.find_all('comment')
        if len(comment) != 0: # there are some comments available in the url
            page_number += 1
            for i in range(len(comment)):
                comment_dict = {}
                comment_dict['username'] = comment[i].attrs['username']
                comment_dict['rating'] = comment[i].attrs['rating']
                comment_dict['comment'] = comment[i].text
                comment_list.append(comment_dict)
        else:
            print(f'No comments found in page {page_number}')
            print(f'Total {len(comment_list)} comments scraped for <{name}>. End of Scrapping')
            page_number = 0
    else:
        print('Getting response failed.')
        break  # stop here; otherwise the loop would re-request the same page forever

Request response from https://www.boardgamegeek.com/xmlapi/boardgame/170216?comments=1&page=1
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/170216?comments=1&page=2
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/170216?comments=1&page=3
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/170216?comments=1&page=4
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/170216?comments=1&page=5
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/170216?comments=1&page=6
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/170216?comments=1&page=7
Request successful. Start scrapping comments...
[output truncated]

In [115]:
comment_df = pd.DataFrame(comment_list)

In [116]:
name = ' '.join(re.findall(r"(?i)\b[a-z]+\b", name))  # keep only purely alphabetic words for the filename
comment_df.to_csv(f'{name} comments.csv')

## Scraping for Arcadia Quest

In [117]:
game_id = '155068'
page_number = 1

In [118]:
comment_list = []
name = ''
while page_number != 0:
    url = f'https://www.boardgamegeek.com/xmlapi/boardgame/{game_id}?comments=1&page={page_number}'
    print(f'Request response from {url}')
    res = requests.get(url)
    if res.status_code == 200:
        print('Request successful. Start scrapping comments...')
        soup = BeautifulSoup(res.content, 'lxml')
        name = soup.find('name').text
        comment = soup.find_all('comment')
        if len(comment) != 0: # there are some comments available in the url
            page_number += 1
            for i in range(len(comment)):
                comment_dict = {}
                comment_dict['username'] = comment[i].attrs['username']
                comment_dict['rating'] = comment[i].attrs['rating']
                comment_dict['comment'] = comment[i].text
                comment_list.append(comment_dict)
        else:
            print(f'No comments found in page {page_number}')
            print(f'Total {len(comment_list)} comments scraped for <{name}>. End of Scrapping')
            page_number = 0
    else:
        print('Getting response failed.')
        break  # stop here; otherwise the loop would re-request the same page forever

Request response from https://www.boardgamegeek.com/xmlapi/boardgame/155068?comments=1&page=1
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/155068?comments=1&page=2
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/155068?comments=1&page=3
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/155068?comments=1&page=4
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/155068?comments=1&page=5
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/155068?comments=1&page=6
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/155068?comments=1&page=7
Request successful. Start scrapping comments...
[output truncated]

In [119]:
comment_df = pd.DataFrame(comment_list)

In [120]:
name = ' '.join(re.findall(r"(?i)\b[a-z]+\b", name))  # keep only purely alphabetic words for the filename
comment_df.to_csv(f'{name} comments.csv')

## Scraping for The Grizzled 

In [121]:
game_id = '171668'
page_number = 1

In [122]:
comment_list = []
name = ''
while page_number != 0:
    url = f'https://www.boardgamegeek.com/xmlapi/boardgame/{game_id}?comments=1&page={page_number}'
    print(f'Request response from {url}')
    res = requests.get(url)
    if res.status_code == 200:
        print('Request successful. Start scrapping comments...')
        soup = BeautifulSoup(res.content, 'lxml')
        name = soup.find('name').text
        comment = soup.find_all('comment')
        if len(comment) != 0: # there are some comments available in the url
            page_number += 1
            for i in range(len(comment)):
                comment_dict = {}
                comment_dict['username'] = comment[i].attrs['username']
                comment_dict['rating'] = comment[i].attrs['rating']
                comment_dict['comment'] = comment[i].text
                comment_list.append(comment_dict)
        else:
            print(f'No comments found in page {page_number}')
            print(f'Total {len(comment_list)} comments scraped for <{name}>. End of Scrapping')
            page_number = 0
    else:
        print('Getting response failed.')
        break  # stop here; otherwise the loop would re-request the same page forever

Request response from https://www.boardgamegeek.com/xmlapi/boardgame/171668?comments=1&page=1
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/171668?comments=1&page=2
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/171668?comments=1&page=3
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/171668?comments=1&page=4
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/171668?comments=1&page=5
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/171668?comments=1&page=6
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/171668?comments=1&page=7
Request successful. Start scrapping comments...
Reques

In [123]:
comment_df = pd.DataFrame(comment_list)

In [124]:
name = ' '.join(re.findall(r"(?i)\b[a-z]+\b", name))  # keep only whole words made of ASCII letters so the name is safe to use in a filename
comment_df.to_csv(f'{name} comments.csv')

## Scraping for A Song of Ice and Fire

In [125]:
game_id = '223376'
page_number = 1

In [126]:
comment_list = []
name = ''
while page_number != 0:
    url = f'https://www.boardgamegeek.com/xmlapi/boardgame/{game_id}?comments=1&page={page_number}'
    print(f'Request response from {url}')
    res = requests.get(url)
    if res.status_code == 200:
        print('Request successful. Start scrapping comments...')
        soup = BeautifulSoup(res.content, 'lxml')
        name = soup.find('name').text
        comment = soup.find_all('comment')
        if len(comment) != 0: # there are some comments available in the url
            page_number += 1
            for c in comment:
                # one record per comment: author, rating, and comment text
                comment_list.append({
                    'username': c.attrs['username'],
                    'rating': c.attrs['rating'],
                    'comment': c.text,
                })
        else:
            print(f'No comments found in page {page_number}')
            print(f'Total {len(comment_list)} comments scraped for <{name}>. End of Scrapping')
            page_number = 0
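    else:
        print(f'Getting response failed with status {res.status_code}.')
        break  # stop here; without this the loop would request the same page forever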

Request response from https://www.boardgamegeek.com/xmlapi/boardgame/223376?comments=1&page=1
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/223376?comments=1&page=2
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/223376?comments=1&page=3
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/223376?comments=1&page=4
Request successful. Start scrapping comments...
No comments found in page 4
Total 214 comments scraped for <Canción de hielo y fuego: El juego de miniaturas>. End of Scrapping


In [127]:
comment_df = pd.DataFrame(comment_list)

In [128]:
name = ' '.join(re.findall(r"(?i)\b[a-z]+\b", name))  # keep only whole words made of ASCII letters so the name is safe to use in a filename
comment_df.to_csv(f'{name} comments.csv')
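The scraped name for this game is the localized title `Canción de hielo y fuego: El juego de miniaturas`. Because `ó` counts as a word character, the pattern `\b[a-z]+\b` finds no ASCII-only whole word inside `Canción`, so that word is dropped from the filename. The sketch below compares the notebook's expression with one possible alternative that strips accents first. The helper names `original_stem` and `ascii_filename_stem` are hypothetical and are not part of the notebook.

```python
import re
import unicodedata


def original_stem(title):
    # Same expression as the notebook cell above.
    return ' '.join(re.findall(r"(?i)\b[a-z]+\b", title))


def ascii_filename_stem(title):
    # Hypothetical alternative: decompose accented letters (e.g. "ó" -> "o" + accent),
    # drop everything that is not ASCII, then keep runs of letters.
    decomposed = unicodedata.normalize('NFKD', title)
    ascii_only = decomposed.encode('ascii', 'ignore').decode('ascii')
    return ' '.join(re.findall(r'[A-Za-z]+', ascii_only))


title = 'Canción de hielo y fuego: El juego de miniaturas'
print(original_stem(title))        # de hielo y fuego El juego de miniaturas
print(ascii_filename_stem(title))  # Cancion de hielo y fuego El juego de miniaturas
```

For titles made only of ASCII letters and punctuation, such as `Wrath of Kings`, both functions return the same string.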

## Scraping for Wrath of Kings

In [129]:
game_id = '146451'
page_number = 1

In [130]:
comment_list = []
name = ''
while page_number != 0:
    url = f'https://www.boardgamegeek.com/xmlapi/boardgame/{game_id}?comments=1&page={page_number}'
    print(f'Request response from {url}')
    res = requests.get(url)
    if res.status_code == 200:
        print('Request successful. Start scrapping comments...')
        soup = BeautifulSoup(res.content, 'lxml')
        name = soup.find('name').text
        comment = soup.find_all('comment')
        if len(comment) != 0: # there are some comments available in the url
            page_number += 1
            for c in comment:
                # one record per comment: author, rating, and comment text
                comment_list.append({
                    'username': c.attrs['username'],
                    'rating': c.attrs['rating'],
                    'comment': c.text,
                })
        else:
            print(f'No comments found in page {page_number}')
            print(f'Total {len(comment_list)} comments scraped for <{name}>. End of Scrapping')
            page_number = 0
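    else:
        print(f'Getting response failed with status {res.status_code}.')
        break  # stop here; without this the loop would request the same page forever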

Request response from https://www.boardgamegeek.com/xmlapi/boardgame/146451?comments=1&page=1
Request successful. Start scrapping comments...
Request response from https://www.boardgamegeek.com/xmlapi/boardgame/146451?comments=1&page=2
Request successful. Start scrapping comments...
No comments found in page 2
Total 39 comments scraped for <Wrath of Kings>. End of Scrapping


In [131]:
comment_df = pd.DataFrame(comment_list)

In [132]:
name = ' '.join(re.findall(r"(?i)\b[a-z]+\b", name))  # keep only whole words made of ASCII letters so the name is safe to use in a filename
comment_df.to_csv(f'{name} comments.csv')
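The same paging loop is repeated for every game above. As a minimal sketch, it could be factored into one function that is called once per `game_id`. This is not the notebook's code: `scrape_comments`, `fetch_page`, and `max_pages` are hypothetical names, the standard library's `xml.etree.ElementTree` is swapped in for BeautifulSoup so the example runs without extra packages, and the XML layout in `fake_fetch` is an assumed, simplified stand-in for the real response.

```python
import xml.etree.ElementTree as ET


def scrape_comments(game_id, fetch_page, max_pages=1000):
    """Collect all comments for one game, page by page.

    fetch_page(game_id, page_number) is a hypothetical callable that returns
    the XML text of one page, or None when the request failed.
    """
    name, comments = '', []
    for page_number in range(1, max_pages + 1):
        xml_text = fetch_page(game_id, page_number)
        if xml_text is None:
            break  # request failed: stop instead of retrying forever
        root = ET.fromstring(xml_text)
        name_node = root.find('.//name')
        if name_node is not None:
            name = name_node.text
        page_comments = root.findall('.//comment')
        if not page_comments:
            break  # an empty page means the last page has been passed
        for node in page_comments:
            comments.append({
                'username': node.get('username'),
                'rating': node.get('rating'),
                'comment': node.text,
            })
    return name, comments


def fake_fetch(game_id, page_number):
    # Stand-in for a network call so the sketch runs offline (assumed XML shape).
    pages = {
        1: ('<boardgames><boardgame><name>Demo Game</name>'
            '<comment username="a" rating="8">Fun</comment>'
            '<comment username="b" rating="N/A">Too long</comment>'
            '</boardgame></boardgames>'),
        2: '<boardgames><boardgame><name>Demo Game</name></boardgame></boardgames>',
    }
    return pages.get(page_number)


name, comments = scrape_comments('000000', fake_fetch)
print(name, len(comments))  # Demo Game 2
```

In the notebook, `fetch_page` would wrap the `requests.get` call used in the cells above and return `None` when the status code is not 200. The returned list can be passed to `pd.DataFrame` exactly as `comment_list` is.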