# Web Scraping of Gaming Sites [(View)](https://www.y8.com/)

![image](https://i.imgur.com/dLB0NVQ.jpg)


- [Y8.com](https://www.y8.com/#:~:text=Y8.com%20is%20home%20for,HTML5%20games%20will%20suit%20you.) is home for gamers on any device.
- Play phone games, or get rich 3D graphics on desktops by playing WebGL games.
- Otherwise, if your preference is casual 2D worlds, then HTML5 games will suit you.
- Founded in 2006, Y8.com is a game development site and a growing community that offers thousands of free games in many formats.
- It supports Adobe Flash Player, Unity 3D, HTML5, Android, and Java.
- Many of us enjoyed playing browser games during Y8's glory days back in 2007-2008.
- Competitors of [Y8.com](https://www.y8.com/#:~:text=Y8.com%20is%20home%20for,HTML5%20games%20will%20suit%20you.) include [agame.com](https://www.agame.com/), [friv.com](https://www.friv.com/), [poki.com](https://poki.com/), and [twoplayergames.org](https://www.twoplayergames.org/).
- Y8 groups its games into categories, and we can use those category pages to extract data with [web scraping](https://www.geeksforgeeks.org/python-web-scraping-tutorial/).

### [Web Scraping ](https://www.geeksforgeeks.org/python-web-scraping-tutorial/)


# ![banner-image](https://i.imgur.com/owoOoCy.png)



- Web scraping is the technique of extracting large amounts of data from websites and saving the extracted data to a local file on your device.

- The simplest form of Web scraping is manually copying and pasting data from a web page into a text file or spreadsheet. 

- Sometimes this is the only solution when the websites set up barriers. 

- But in most cases, a huge amount of data is required which is difficult for a human to scrape. 

- Therefore, we have Web scraping tools to automate the process.

- We will use the `requests` and `BeautifulSoup` libraries to scrape data from [Y8.com](https://www.y8.com/#:~:text=Y8.com%20is%20home%20for,HTML5%20games%20will%20suit%20you.).

Outline of the steps we'll follow:

1. Download the web page using `requests`.
2. Parse the HTML document using Beautiful Soup.
3. Extract the game name, technology used, rating, and number of players.
4. Compile extracted information into lists and dictionaries.
5. Read and save the extracted information as CSV files.
6. Combine data from multiple pages.

Libraries used in this project:

* [requests](https://requests.readthedocs.io/en/latest/)

* [Beautiful Soup](https://www.crummy.com/software/BeautifulSoup/bs4/doc/)

* [pandas](https://www.geeksforgeeks.org/introduction-to-pandas-in-python/)

### Step:1 - Download the WebPage using `requests`

We'll use a library called [requests](https://requests.readthedocs.io/en/latest/) to download web pages from the internet. Let's begin by installing and importing the library.

In [1]:
! pip install requests --upgrade --quiet

Now that `requests` is installed, we need to import it.

In [2]:
import requests

Next, we need to select a topic/category from the website.

![image](https://i.imgur.com/69JD5hb.jpg)

From the gaming categories, I have chosen `Action-Adventure`.

In [3]:
topic_url = 'https://www.y8.com/categories/action_adventure'

To download a page, we can use the `get` function from `requests`, which returns a `Response` object.

In [4]:
response = requests.get(topic_url)

In [5]:
type(response)

requests.models.Response

`requests.get` returns a response object containing the page contents and a status code that indicates whether the request was successful.

Learn more about HTTP status codes here: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status

If the request was successful, `response.status_code` is set to a value between `200` and `299`.

In [6]:
response.status_code

200
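As a small illustration of the range check described above, the helper below tests whether a status code falls in the 2xx range. The name `is_success` is hypothetical and not part of `requests`. In a real script, `response.raise_for_status()` can be used instead to raise an exception for 4xx and 5xx responses.

```python
def is_success(status_code):
    """Return True when an HTTP status code is in the 200-299 range."""
    return 200 <= status_code <= 299


# A few sample codes: 200 (OK), 204 (No Content), 404 (Not Found)
for code in (200, 204, 404):
    print(code, is_success(code))
```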

The contents of the web page can be accessed using the `.text` property of the `response`

In [7]:
content = response.text

In [8]:
len(content)

287757

In [9]:
content

'<!DOCTYPE html>\n<html class="no-touch" lang="en"\n      dir="ltr">\n  <head>\n    <meta charset="utf-8">\n<script type="text/javascript">window.NREUM||(NREUM={});NREUM.info={"beacon":"bam.nr-data.net","errorBeacon":"bam.nr-data.net","licenseKey":"64ea7759b2","applicationID":"2972689","transactionName":"eltWERRfXFtVQB5WVU1RXwoUWVVEH0FZWkM=","queueTime":1,"applicationTime":275,"agent":""}</script>\n<script type="text/javascript">(window.NREUM||(NREUM={})).init={ajax:{deny_list:["bam.nr-data.net"]}};(window.NREUM||(NREUM={})).loader_config={xpid:"VwcCU1ZVGwEJU1NUDwg=",licenseKey:"64ea7759b2",applicationID:"2972689"};;(()=>{var e,t,r={9071:(e,t,r)=>{"use strict";r.d(t,{I:()=>n});var n=0,i=navigator.userAgent.match(/Firefox[\\/\\s](\\d+\\.\\d+)/);i&&(n=+i[1])},6562:(e,t,r)=>{"use strict";r.d(t,{P_:()=>p,Mt:()=>v,C5:()=>d,DL:()=>y,OP:()=>k,lF:()=>H,Yu:()=>E,Dg:()=>g,CX:()=>f,GE:()=>w,sU:()=>L});var n={};r.r(n),r.d(n,{agent:()=>x,match:()=>_,version:()=>O});var i=r(6797),o=r(909),a=r(8610);

Let's save the contents to a file with the `.html` extension.

In [10]:
with open('Games.html', 'w') as file:
    file.write(content)

You can view the file by clicking the "File > Open" menu option within Jupyter and selecting `Games.html` from the list of files displayed.

![img1](https://i.imgur.com/snQ44To.jpg)
![img2](https://i.imgur.com/kOXzO0z.jpg)

We have successfully downloaded the webpage
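One detail worth noting: `open` falls back to a platform-dependent default encoding when none is given, so a page containing non-ASCII characters may fail to save on some systems. The sketch below assumes UTF-8 is a suitable encoding for the page. It writes and re-reads a short sample string in a temporary directory; the sample text and the file name `Sample.html` are made up for illustration.

```python
import os
import tempfile

sample = '<html><body><p>Café Ünïcode 100%</p></body></html>'

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'Sample.html')

    # Write with an explicit encoding instead of the platform default
    with open(path, 'w', encoding='utf-8') as file:
        file.write(sample)

    # Read back with the same encoding
    with open(path, 'r', encoding='utf-8') as file:
        restored = file.read()

print(restored == sample)
```

The same `encoding='utf-8'` argument could be passed to both `open` calls that handle `Games.html`.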

### Step:2 - Extracting information from HTML using Beautiful Soup

To extract information from the HTML source code of a webpage programmatically, we can use the [Beautiful Soup](https://www.crummy.com/software/BeautifulSoup/bs4/doc/) library.

In [100]:
! pip install BeautifulSoup4 --upgrade --quiet

Now that `BeautifulSoup4` is installed, we need to import it.

In [12]:
from bs4 import BeautifulSoup

Next, read the contents of the file `Games.html` and create a `BeautifulSoup` object to parse the content.

In [13]:
with open('Games.html', 'r') as file:
    html_source = file.read()

In [14]:
len(html_source)

287757

In [15]:
html_source

'<!DOCTYPE html>\n<html class="no-touch" lang="en"\n      dir="ltr">\n  <head>\n    <meta charset="utf-8">\n<script type="text/javascript">window.NREUM||(NREUM={});NREUM.info={"beacon":"bam.nr-data.net","errorBeacon":"bam.nr-data.net","licenseKey":"64ea7759b2","applicationID":"2972689","transactionName":"eltWERRfXFtVQB5WVU1RXwoUWVVEH0FZWkM=","queueTime":1,"applicationTime":275,"agent":""}</script>\n<script type="text/javascript">(window.NREUM||(NREUM={})).init={ajax:{deny_list:["bam.nr-data.net"]}};(window.NREUM||(NREUM={})).loader_config={xpid:"VwcCU1ZVGwEJU1NUDwg=",licenseKey:"64ea7759b2",applicationID:"2972689"};;(()=>{var e,t,r={9071:(e,t,r)=>{"use strict";r.d(t,{I:()=>n});var n=0,i=navigator.userAgent.match(/Firefox[\\/\\s](\\d+\\.\\d+)/);i&&(n=+i[1])},6562:(e,t,r)=>{"use strict";r.d(t,{P_:()=>p,Mt:()=>v,C5:()=>d,DL:()=>y,OP:()=>k,lF:()=>H,Yu:()=>E,Dg:()=>g,CX:()=>f,GE:()=>w,sU:()=>L});var n={};r.r(n),r.d(n,{agent:()=>x,match:()=>_,version:()=>O});var i=r(6797),o=r(909),a=r(8610);

In [16]:
doc = BeautifulSoup(html_source, 'html.parser')

Check the type of `doc` to verify the result.

In [17]:
type(doc)

bs4.BeautifulSoup

### Step:3 Use Beautiful Soup to parse and extract information

Let's find the first `<a>` tag in the parsed document.

In [18]:
doc.find('a')

<a aria-label="logo" class="no-event" href="https://www.y8.com/"><img alt="Y8.com" height="48" src="https://img.y8.com/assets/y8/header-logo-b39e5071cb111465fc5a5aef6496121adfcb414692d067f967434d9d80418afc.svg" width="100"/>
</a>
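`find` returns only the first matching tag, while `find_all` returns every match as a list. Here is a minimal sketch on a made-up HTML snippet (the links are placeholders, not taken from Y8):

```python
from bs4 import BeautifulSoup

html = """
<div>
  <a href="/first">First</a>
  <a href="/second">Second</a>
</div>
"""
sample_doc = BeautifulSoup(html, 'html.parser')

first_link = sample_doc.find('a')       # first <a> tag only
all_links = sample_doc.find_all('a')    # list of every <a> tag

print(first_link['href'])
print([a.text for a in all_links])
```

When nothing matches, `find` returns `None` and `find_all` returns an empty list.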

Let's define a reusable function that downloads a Y8 tag page (for example, `action`) and parses it with Beautiful Soup.

In [19]:
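def get_topic_page(url):
    # Build the full URL of the tag page, e.g. https://www.y8.com/tags/action
    topic_url = 'https://www.y8.com/tags/' + url
    response = requests.get(topic_url)
    # Stop early if the request did not succeed
    if response.status_code != 200:
        print('Status Code = ', response.status_code)
        raise Exception('failed to fetch web page ' + topic_url)

    # Name the parser explicitly so the result does not depend on which parsers are installed
    topic_doc = BeautifulSoup(response.text, 'html.parser')
    return topic_doc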

In [20]:
get_topic_page('action')

<!DOCTYPE html>
<html class="no-touch" dir="ltr" lang="en">
<head>
<meta charset="utf-8"/>
<meta content="width=device-width, initial-scale=1.0, maximum-scale=5.0, minimum-scale=1.0, minimal-ui" name="viewport"/>
<meta content="#FFF" name="theme-color"/>
<link href="https://img.y8.com" rel="preconnect"/>
<link href="https://cdn.y8.com" rel="preconnect"/>
<link href="https://fonts.googleapis.com" rel="preconnect"/>
<link href="https://fonts.gstatic.com" rel="preconnect"/>
<link href="https://account.y8.com" rel="preconnect"/>
<link href="/y8_manifest.json" rel="manifest"/>
<script id="gp-app-info">
//<![CDATA[

  window.appInfo = {
    cdn: 'https://cdn.y8.com',
    skinName: 'y8.com',
    skinShortName: 'y8',
    skinDomain: 'y8.com',
    currentLocale: 'en',
    currentKind: 'game',
    recaptchaV3SiteKey: '6LcCYbgUAAAAAC6jOB2wXW8L59EmH3oVZr0r-qxZ',

    accountServiceApiUrls: {
      pointsBonusTimerUrl: 'https://account.y8.com/points/bonus_timer',
      profilePointsTotalUrl: 'https

### Extract Name, Technology, Ratings and Players from the Games

#### To find the relevant `div`, `h4`, `p`, and `span` tags and their `class` names, right-click a game and select `Inspect`

![img1](https://i.imgur.com/QeYBSND.jpg)
![img2](https://i.imgur.com/kW9b6hk.jpg)

In [21]:
div_tags = doc.find_all('div',class_='item thumb videobox grid-column')

In [22]:
div_tags

[<div class="item thumb videobox grid-column" data-item-id="68221" data-label-ids="1 Player,3D,Free Game,Default,Unity3D,Blocks,Adventure,Undead,Voxel,WebGL,Zombies,Mine" data-mp4-movie="https://img.y8.com/cloud/v2-y8-video-previews-001/videos/130170/e43eeeeefaf81f9ac5785e0d7e4d62f5fdfe1331.mp4?1554276634" data-ogv-movie="https://img.y8.com/cloud/v2-y8-video-previews-001/videos/130170/5c61f9722448a3e6809f04479e7eca83b1df7050.ogv?1554276634" data-poster-url="https://img.y8.com/assets/video_loader_180x135-63697df7850db644b0fe994bd8a23977d297e8e22941cb82c831a334ec57745a.gif" data-technologies='["unity_webgl","unity_webgl_32_bit"]' data-thumb-movie='["https://img.y8.com/cloud/v2-y8-video-previews-001/videos/130170/5a65f2a0509c0d2a583cff64e97c37767a91bd29.gif?1554276634","https://img.y8.com/cloud/v2-y8-video-previews-001/videos/130170/d2bd72e81c7006fe8e963986e8aa2c997ac55daa.gif?1554276634","https://img.y8.com/cloud/v2-y8-video-previews-001/videos/130170/05cab884a515c49418159399ce69727417e4

In [23]:
len(div_tags)

64
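Passing a multi-class string to `class_` compares it with the whole `class` attribute, so it only matches when the classes appear in exactly that order. A CSS selector via `select` is one alternative that ignores the order. The sketch below uses a made-up snippet with shortened class names, not the real Y8 markup:

```python
from bs4 import BeautifulSoup

html = """
<div class="item thumb">A</div>
<div class="thumb item">B</div>
<div class="item">C</div>
"""
sample_doc = BeautifulSoup(html, 'html.parser')

# A multi-class string is compared with the whole class attribute
exact = sample_doc.find_all('div', class_='item thumb')

# A CSS selector requires both classes, in any order
both = sample_doc.select('div.item.thumb')

print([d.text for d in exact])
print([d.text for d in both])
```

For the page scraped here, the exact-string match works because every game card carries the same class attribute.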

In [24]:
abc = div_tags[0]

In [25]:
abc

<div class="item thumb videobox grid-column" data-item-id="68221" data-label-ids="1 Player,3D,Free Game,Default,Unity3D,Blocks,Adventure,Undead,Voxel,WebGL,Zombies,Mine" data-mp4-movie="https://img.y8.com/cloud/v2-y8-video-previews-001/videos/130170/e43eeeeefaf81f9ac5785e0d7e4d62f5fdfe1331.mp4?1554276634" data-ogv-movie="https://img.y8.com/cloud/v2-y8-video-previews-001/videos/130170/5c61f9722448a3e6809f04479e7eca83b1df7050.ogv?1554276634" data-poster-url="https://img.y8.com/assets/video_loader_180x135-63697df7850db644b0fe994bd8a23977d297e8e22941cb82c831a334ec57745a.gif" data-technologies='["unity_webgl","unity_webgl_32_bit"]' data-thumb-movie='["https://img.y8.com/cloud/v2-y8-video-previews-001/videos/130170/5a65f2a0509c0d2a583cff64e97c37767a91bd29.gif?1554276634","https://img.y8.com/cloud/v2-y8-video-previews-001/videos/130170/d2bd72e81c7006fe8e963986e8aa2c997ac55daa.gif?1554276634","https://img.y8.com/cloud/v2-y8-video-previews-001/videos/130170/05cab884a515c49418159399ce69727417e45

The `Game Name` is in the `h4` tag.

In [26]:
h4_tags = abc.find('h4')

In [27]:
h4_tags

<h4 class="item__title ltr">Mineclone 3</h4>

In [28]:
name = h4_tags.text.strip()

In [29]:
name

'Mineclone 3'

The `Technology` is in the `div` with `class = 'item__technology'`.

In [30]:
tech = abc.find('div', class_='item__technology')

In [31]:
tech

<div class="item__technology">
<p class="unity_webgl">
          WebGL
        </p>
</div>

In [32]:
technology = tech.text.strip()

In [33]:
technology

'WebGL'

The `Ratings` value is in the element with `class = 'item__number'`.

In [34]:
rate = abc.find(class_='item__number')

In [35]:
rate

<span class="item__number">
          78%
        </span>

In [36]:
ratings = rate.text.strip()

In [37]:
ratings

'78%'

The `No. of Players` value is in the `p` tag with `class = 'item__plays-count'`.

In [38]:
play = abc.find('p', class_='item__plays-count')

In [39]:
play

<p class="item__plays-count">
          13,076,707 plays
      </p>

In [40]:
player = play.text.strip()

In [41]:
players = player[:-6].replace(',','')

In [42]:
players

'13076707'
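The slice `[:-6]` relies on the text ending with exactly `' plays'`, and the result is still a string. As a hedged alternative, the sketch below pulls the digits out with a regular expression and converts both the play count and the rating to integers. The helper names `parse_plays` and `parse_rating` are hypothetical.

```python
import re


def parse_plays(text):
    """Convert a string such as '13,076,707 plays' to an integer."""
    # Match a digit followed by any mix of digits and thousands separators
    match = re.search(r'\d[\d,]*', text)
    if match is None:
        return None
    return int(match.group().replace(',', ''))


def parse_rating(text):
    """Convert a string such as '78%' to an integer percentage."""
    return int(text.strip().rstrip('%'))


print(parse_plays('13,076,707 plays'))
print(parse_rating('78%'))
```

Storing numbers rather than strings makes later sorting and averaging straightforward.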

Now let's combine the cells above into a reusable function.

In [43]:
def parse_top_games(abc):
    # `abc` is a <div> tag containing the game name, technology, rating, and number of players
    h4_tags = abc.find('h4')
    # Game name
    name = h4_tags.text.strip()
    # Technology
    tech = abc.find('div', class_='item__technology')
    technology = tech.text.strip()
    # Ratings
    rate = abc.find(class_='item__number')
    ratings = rate.text.strip()
    # Number of players: drop the trailing ' plays' and the thousands separators
    play = abc.find('p', class_='item__plays-count')
    player = play.text.strip()
    players = player[:-6].replace(',','')
    # Return the values as a dictionary
    return {
        'Game' : name,
        'Technology' : technology,
        'Ratings' : ratings,
        'Players' : players
    }

In [44]:
parse_top_games(div_tags[0])

{'Game': 'Mineclone 3',
 'Technology': 'WebGL',
 'Ratings': '78%',
 'Players': '13076707'}

In [45]:
parse_top_games(div_tags[1])

{'Game': 'Brawl Bash',
 'Technology': 'WebGL',
 'Ratings': '77%',
 'Players': '71009'}

We have successfully extracted the data.
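`find` returns `None` when a tag is missing, so calling `.text` on the result would raise an `AttributeError` for any game card that lacks one of these elements. The sketch below shows one possible guard. The helper name `text_or_default` and the sample card are assumptions for illustration, not part of the notebook.

```python
from bs4 import BeautifulSoup


def text_or_default(parent, name, class_name=None, default=''):
    """Return the stripped text of the first match, or `default` if nothing matches."""
    if class_name is None:
        tag = parent.find(name)
    else:
        tag = parent.find(name, class_=class_name)
    if tag is None:
        return default
    return tag.text.strip()


# A made-up card that has a title but no play count
html = '<div><h4 class="item__title ltr"> Sample Game </h4></div>'
card = BeautifulSoup(html, 'html.parser').div

print(text_or_default(card, 'h4'))
print(repr(text_or_default(card, 'p', 'item__plays-count')))
```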

### Compile extracted information into lists and dictionaries

In [46]:
top_games = [parse_top_games(a) for a in div_tags]

In [47]:
top_games

[{'Game': 'Mineclone 3',
  'Technology': 'WebGL',
  'Ratings': '78%',
  'Players': '13076707'},
 {'Game': 'Brawl Bash',
  'Technology': 'WebGL',
  'Ratings': '77%',
  'Players': '71009'},
 {'Game': 'Hard Life',
  'Technology': 'HTML5',
  'Ratings': '67%',
  'Players': '525360'},
 {'Game': 'The Backrooms',
  'Technology': 'WebGL',
  'Ratings': '51%',
  'Players': '104878'},
 {'Game': 'Light in the Dark',
  'Technology': 'HTML5',
  'Ratings': '76%',
  'Players': '7623'},
 {'Game': 'Police Chase Real Cop Driver',
  'Technology': 'WebGL',
  'Ratings': '79%',
  'Players': '1490237'},
 {'Game': 'Zombie Horde',
  'Technology': 'WebGL',
  'Ratings': '88%',
  'Players': '2992'},
 {'Game': 'Stick Fighter 3D',
  'Technology': 'WebGL',
  'Ratings': '69%',
  'Players': '167090'},
 {'Game': 'Creepy Evil Granny',
  'Technology': 'WebGL',
  'Ratings': '76%',
  'Players': '78681'},
 {'Game': 'Kogama: Piggy',
  'Technology': 'WebGL',
  'Ratings': '89%',
  'Players': '11019'},
 {'Game': 'Stickman Boost!'

This is the list of game data for a single page, the `action_adventure` category page downloaded earlier.

Now let's define a reusable function that returns the list of game data, based on the cells above.

In [48]:
def get_top_games(doc):
    div_tags = doc.find_all('div', class_='item thumb videobox grid-column')
    top_games = [parse_top_games(a) for a in div_tags]
    return top_games

In [50]:
topic_page_xyz = get_topic_page('action')
top_games_xyz = get_top_games(topic_page_xyz)
top_games_xyz[:5]

[{'Game': 'Mini Samurai: Kurofune',
  'Technology': 'WebGL',
  'Ratings': '87%',
  'Players': '18023'},
 {'Game': 'Diary of a Wimpy Kid: The Meltdown',
  'Technology': 'HTML5',
  'Ratings': '88%',
  'Players': '116225'},
 {'Game': 'Stained Act 1',
  'Technology': 'WebGL',
  'Ratings': '81%',
  'Players': '137812'},
 {'Game': 'LA Shark',
  'Technology': 'WebGL',
  'Ratings': '76%',
  'Players': '1454426'},
 {'Game': 'Madness: Off-Color',
  'Technology': 'HTML5',
  'Ratings': '87%',
  'Players': '216682'}]

From the above, we have extracted the list of game data for the `action` tag page.

### Step:4 Read and save the extracted information as CSV files

Here we define a `write_csv` function that writes the dictionary keys as the header row and the values of each dictionary as one row.

In [51]:
def write_csv(items, path):
    # Open the file in write mode
    with open(path, 'w') as f:
        # Return if there's nothing to write
        if len(items) == 0:
            return
        
        # Write the headers in the first line
        headers = list(items[0].keys())
        f.write(','.join(headers) + '\n')
        
        # Write one item per line
        for item in items:
            values = []
            for header in headers:
                values.append(str(item.get(header, "")))
            f.write(','.join(values) + "\n")

In [52]:
write_csv(top_games, 'Games.csv')

In [53]:
with open('Games.csv', 'r') as file:
    print(file.read())

Game,Technology,Ratings,Players
Mineclone 3,WebGL,78%,13076707
Brawl Bash,WebGL,77%,71009
Hard Life,HTML5,67%,525360
The Backrooms,WebGL,51%,104878
Light in the Dark,HTML5,76%,7623
Police Chase Real Cop Driver,WebGL,79%,1490237
Zombie Horde,WebGL,88%,2992
Stick Fighter 3D,WebGL,69%,167090
Creepy Evil Granny,WebGL,76%,78681
Kogama: Piggy,WebGL,89%,11019
Stickman Boost!,HTML5,72%,4979446
Fireboy And Watergirl Light Temple,HTML5,85%,83397838
Pants,HTML5,91%,12725
Samurai Warrios,WebGL,91%,29641
Cave Jumper,WebGL,77%,3190
Locoman,HTML5,67%,3138
Castle of Magic,HTML5,80%,4371
Minescrafter: Steve and Alex,HTML5,75%,18567
Lick 'em All,HTML5,66%,1786470
Bomb It 6,HTML5,88%,29547593
Chef Quest,HTML5,93%,1839
A Flying Machine,HTML5,75%,29825
Clickventure: Castaway,WebGL,80%,13015
Super Hot,WebGL,81%,2897474
Hall of Palettes,HTML5,82%,1605
Total Darkness,HTML5,92%,15271
Vex 5,HTML5,81%,847362
Trapdoor,HTML5,54%,699068
Impostor,HTML5,78%,4381048
Pixel Challenge,HTML5,68%,18067
Bomb It 4,HTML5,87%,

The image below shows the contents of `Games.csv`.

![img](https://i.imgur.com/qZ8iKkl.jpg)

As shown above, we have extracted the data, converted it to CSV format, and saved it as a `.csv` file.
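Joining values with `','` works as long as no value contains a comma or a double quote. A game title with a comma would shift the columns. As a hedged alternative, the sketch below uses the standard library's `csv.DictWriter`, which quotes such values automatically. The function name `write_csv_safe` and the two sample rows are made up for illustration.

```python
import csv
import io

games = [
    {'Game': 'Sample, the Game', 'Technology': 'HTML5', 'Ratings': '80%', 'Players': '1234'},
    {'Game': 'Another "Quoted" Game', 'Technology': 'WebGL', 'Ratings': '75%', 'Players': '567'},
]


def write_csv_safe(items, file):
    """Write a list of dictionaries to an open text file as CSV."""
    if not items:
        return
    writer = csv.DictWriter(file, fieldnames=list(items[0].keys()))
    writer.writeheader()
    writer.writerows(items)


# An in-memory buffer stands in for a real file here
buffer = io.StringIO()
write_csv_safe(games, buffer)
print(buffer.getvalue())

# Reading the text back recovers the original values
buffer.seek(0)
rows = list(csv.DictReader(buffer))
print(rows[0]['Game'])
```

When writing to a real file, the `csv` documentation recommends opening it with `newline=''`.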

### Complete reusable function code in single cell

By joining all the reusable functions above, we get a single function through which we can get information about games.

In [54]:
import requests
from bs4 import BeautifulSoup
topic_url = 'https://www.y8.com/tags/'

# Scrape one tag page and write its games to a CSV file
def scrape_topic_games(url, path = None):
    if path is None:
        path = url + '.csv'
    topic_page_doc = get_topic_page(url)
    top_games = get_top_games(topic_page_doc)
    write_csv(top_games, path)
    print('Top games for topic "{}" written to file "{}"'.format(url, path))
    return path

# Download a tag page and parse it with Beautiful Soup
def get_topic_page(url):
    topic_url = 'https://www.y8.com/tags/' + url
    response = requests.get(topic_url)
    if response.status_code != 200:
        print('Status Code = ', response.status_code)
        raise Exception('failed to fetch web page ' + topic_url)

    topic_doc = BeautifulSoup(response.text, 'html.parser')
    return topic_doc

# Collect the parsed details of every game on the page
def get_top_games(doc):
    div_tags = doc.find_all('div', class_='item thumb videobox grid-column')
    top_games = [parse_top_games(a) for a in div_tags]
    return top_games

# Extract the name, technology, rating, and number of players from one game <div>
def parse_top_games(abc):
    
    h4_tags = abc.find('h4')
    
    name = h4_tags.text.strip()
    
    tech = abc.find('div', class_='item__technology')
    technology = tech.text.strip()
    
    rate = abc.find(class_='item__number')
    ratings = rate.text.strip()
    
    play = abc.find('p', class_='item__plays-count')
    player = play.text.strip()
    players = player[:-6].replace(',','')
    
    return {
        
        'Game' : name,
        'Technology' : technology,
        'Ratings' : ratings,
        'Players' : players
    }

# Write a list of dictionaries to a CSV file
def write_csv(items, path):
    # Open the file in write mode
    with open(path, 'w') as f:
        # Return if there's nothing to write
        if len(items) == 0:
            return
        
        # Write the headers in the first line
        headers = list(items[0].keys())
        f.write(','.join(headers) + '\n')
        
        # Write one item per line
        for item in items:
            values = []
            for header in headers:
                values.append(str(item.get(header, "")))
            f.write(','.join(values) + "\n")

In [55]:
scrape_topic_games('action')

Top games for topic "action" written to file "action.csv"


'action.csv'

In [56]:
!head action.csv

Game,Technology,Ratings,Players
Mini Samurai: Kurofune,WebGL,87%,18023
Diary of a Wimpy Kid: The Meltdown,HTML5,88%,116225
Stained Act 1,WebGL,81%,137812
LA Shark,WebGL,76%,1454426
Madness: Off-Color,HTML5,87%,216682
Shooter Rush,HTML5,81%,167605
Call of Mini Zombies,WebGL,81%,20864
Giant Wanted,WebGL,80%,8872
Alien Survival 2022,WebGL,72%,44558


In [57]:
scrape_topic_games('action?page=2')

Top games for topic "action?page=2" written to file "action?page=2.csv"


'action?page=2.csv'

In [58]:
!head action?page=2.csv

Game,Technology,Ratings,Players
Protektor,HTML5,88%,18425
Rum & Gun,WebGL,76%,103865
Dead Zed,HTML5,85%,5170996
The Patriots: Fight and Freedom,WebGL,81%,24524
Vex 3,HTML5,81%,34569171
Striker Dummies,WebGL,80%,530734
Fireboy and Watergirl Forest Temple,HTML5,78%,71365324
Battle for the Galaxy,WebGL,92%,1167792
Death Lab,HTML5,89%,108610
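File names such as `action?page=2.csv` contain a `?`, which is not allowed in Windows file names and acts as a wildcard in most shells. One option is to derive a portable name and pass it as the `path` argument of `scrape_topic_games`. The helper `safe_csv_name` below is a hypothetical sketch of that idea.

```python
import re


def safe_csv_name(slug):
    """Turn a slug such as 'action?page=2' into a portable file name."""
    # Replace every run of characters other than letters, digits, '-' and '_'
    cleaned = re.sub(r'[^A-Za-z0-9_-]+', '_', slug).strip('_')
    return cleaned + '.csv'


print(safe_csv_name('action'))
print(safe_csv_name('action?page=2'))
```

For example, `scrape_topic_games('action?page=2', path=safe_csv_name('action?page=2'))` would write to `action_page_2.csv`.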


In [59]:
scrape_topic_games('action?page=3')

Top games for topic "action?page=3" written to file "action?page=3.csv"


'action?page=3.csv'

In [60]:
!head action?page=3.csv

Game,Technology,Ratings,Players
Jungle Treasure,HTML5,75%,27812
Chaos Roadkill,WebGL,71%,17814
Battle Royale Gangs,WebGL,90%,851571
Amazing Crime Strange Stickman - Rope Vice Vegas,WebGL,87%,1026302
Polygon Royale Shooter,WebGL,81%,15200
Uphill Halloween Racing,WebGL,77%,32966
Red Crucible 2,WebGL,88%,11478050
Mixed Macho Arts,HTML5,85%,368820
Dexomon,HTML5,75%,185110


In [61]:
scrape_topic_games('action?page=4')

Top games for topic "action?page=4" written to file "action?page=4.csv"


'action?page=4.csv'

In [62]:
!head action?page=4.csv

Game,Technology,Ratings,Players
Black Panther: Jungle Pursuit,HTML5,83%,36488
Die Alone,HTML5,76%,29816
Metal Guns Fury,HTML5,82%,164027
Drakensang Online,HTML5,85%,2943379
The Office Guy,HTML5,69%,19373
Ragdoll io,WebGL,82%,240149
Cube Battle Royale,HTML5,80%,258585
Zombie Train,WebGL,92%,4245
Space Marines,WebGL,78%,449699


In [63]:
scrape_topic_games('action?page=5')

Top games for topic "action?page=5" written to file "action?page=5.csv"


'action?page=5.csv'

In [64]:
!head action?page=5.csv

Game,Technology,Ratings,Players
Sun Wukong vs Robot,HTML5,72%,7911
Waaaar io,HTML5,66%,134093
Day of Danger - Henry Danger,HTML5,85%,15828
Lords of the Arena,HTML5,86%,230382
City Siege 4 - Alien Siege,HTML5,93%,187962
Epic War,HTML5,87%,307406
Masked Forces Vs Coronavirus,WebGL,88%,20983
City Siege,HTML5,86%,386746
Street Fight,HTML5,81%,130861


In [65]:
scrape_topic_games('action?page=6')

Top games for topic "action?page=6" written to file "action?page=6.csv"


'action?page=6.csv'

In [66]:
!head action?page=6.csv

Game,Technology,Ratings,Players
Captain Rogers: Defense of Karmax-3,HTML5,57%,104413
Furious Road,HTML5,78%,121741
Burnin' Rubber,HTML5,76%,42342
Real Shooting FPS Strike,WebGL,79%,30391
Block Pixel Cops,WebGL,77%,28810
City Siege Faction Island,HTML5,94%,7319
Imposter Killer,WebGL,71%,9493
Robot Wars,HTML5,81%,15501
Minerest,WebGL,83%,3836


In [67]:
scrape_topic_games('action?page=7')

Top games for topic "action?page=7" written to file "action?page=7.csv"


'action?page=7.csv'

In [68]:
!head action?page=7.csv

Game,Technology,Ratings,Players
CAD War 4,WebGL,81%,30564
Tank Arena Game,HTML5,78%,142183
Army Combat 3D,WebGL,85%,266130
Jet Boi,HTML5,71%,295187
FZ Tap Touch Run,HTML5,48%,13007
Panzer Hero,HTML5,65%,133739
Ezender Keeper,WebGL,79%,46651
300: Seize Your Glory,Unity 3D,91%,1029915
State Police "Police Bike City Simulator",WebGL,78%,56792


In [69]:
scrape_topic_games('action?page=8')

Top games for topic "action?page=8" written to file "action?page=8.csv"


'action?page=8.csv'

In [70]:
!head action?page=8.csv

Game,Technology,Ratings,Players
Ben 10: The Lost World,Unity 3D,85%,122730
Colossal Catastrophe – Tom and Jerry,Unity 3D,91%,41225
Thor's Hammer,Unity 3D,67%,7671
Stephen Karsch,HTML5,82%,13611
Crazy Car Driver,WebGL,80%,101402
Spin Shot,HTML5,84%,9138
Guardians of the Kingdoms,WebGL,65%,26102
Dynamons,HTML5,87%,1086151
Galaxy Aggression 2,HTML5,89%,5403


In [71]:
scrape_topic_games('action?page=9')

Top games for topic "action?page=9" written to file "action?page=9.csv"


'action?page=9.csv'

In [72]:
!head action?page=9.csv

Game,Technology,Ratings,Players
Space Ripper,WebGL,87%,4374
Pick Your Poison Remix,HTML5,81%,3527
Marine Invaders,WebGL,68%,30873
Me and Dungeons,WebGL,82%,11030
Air Warfare,HTML5,76%,60140
Grenade Toss,HTML5,62%,22174
Grandma Chainsaw Action,WebGL,89%,4993
Aliens Attack,HTML5,60%,18176
Galactic Missile Defense,HTML5,88%,9637


In [73]:
scrape_topic_games('action?page=10')

Top games for topic "action?page=10" written to file "action?page=10.csv"


'action?page=10.csv'

In [74]:
!head action?page=10.csv

Game,Technology,Ratings,Players
Mission Ammunition,WebGL,84%,13035
City Escape 2,WebGL,56%,4187
T-Rex N.Y Online,HTML5,89%,25957
Chop Chop Ninja Academy,HTML5,81%,12378
Aliens Gone Wild,HTML5,88%,10184
Smook Shoodoo,HTML5,67%,7092
Heist Run,WebGL,66%,7650
Strited 2D,WebGL,79%,5573
Masters of the Universe,HTML5,79%,13100


### Step:6 Combine data from multiple pages by importing pandas

In [75]:
import pandas as pd

Reading a CSV file with pandas creates a data frame, as shown below.

In [76]:
games1 = pd.read_csv('action.csv', index_col=False)

In [77]:
games1

|  | Game | Technology | Ratings | Players |
|---|---|---|---|---|
| 0 | Mini Samurai: Kurofune | WebGL | 87% | 18023 |
| 1 | Diary of a Wimpy Kid: The Meltdown | HTML5 | 88% | 116225 |
| 2 | Stained Act 1 | WebGL | 81% | 137812 |
| 3 | LA Shark | WebGL | 76% | 1454426 |
| 4 | Madness: Off-Color | HTML5 | 87% | 216682 |
| ... | ... | ... | ... | ... |
| 59 | Crime City 3D | WebGL | 88% | 17382987 |
| 60 | Galactic Forces | WebGL | 82% | 324946 |
| 61 | Warmerise Lite Version | WebGL | 90% | 10972447 |
| 62 | Punch The Wall | WebGL | 60% | 20808 |

The image below shows the contents of the `action.csv` file.

![img](https://i.imgur.com/ZDiPGKt.jpg)
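Because the `Ratings` column holds strings such as `87%`, pandas treats it as text. The sketch below shows one way to convert it to integers after loading. It reads two rows copied from the output above out of an in-memory string rather than from `action.csv`, and the variable names are illustrative.

```python
import io

import pandas as pd

csv_text = """Game,Technology,Ratings,Players
Mini Samurai: Kurofune,WebGL,87%,18023
LA Shark,WebGL,76%,1454426
"""

# io.StringIO stands in for a file path here
sample_df = pd.read_csv(io.StringIO(csv_text))

# Strip the percent sign so ratings can be sorted and averaged as numbers
sample_df['Ratings'] = sample_df['Ratings'].str.rstrip('%').astype(int)

print(sample_df.dtypes)
print(sample_df['Ratings'].mean())
```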

In [78]:
games2 = pd.read_csv('action?page=2.csv')

In [79]:
games2

|  | Game | Technology | Ratings | Players |
|---|---|---|---|---|
| 0 | Protektor | HTML5 | 88% | 18425 |
| 1 | Rum & Gun | WebGL | 76% | 103865 |
| 2 | Dead Zed | HTML5 | 85% | 5170996 |
| 3 | The Patriots: Fight and Freedom | WebGL | 81% | 24524 |
| 4 | Vex 3 | HTML5 | 81% | 34569171 |
| ... | ... | ... | ... | ... |
| 59 | Streets Of Anarchy: Fists Of War | WebGL | 73% | 43437 |
| 60 | Squid Escape: Bloody Revenge | WebGL | 90% | 18542 |
| 61 | Zombie Apocalypse Tunnel Survival | WebGL | 92% | 50030 |
| 62 | Inferno Meltdown | HTML5 | 88% | 121292 |

The image below shows the contents of the `action?page=2.csv` file.
![img](https://i.imgur.com/ZZ1MRXb.jpg)

In [80]:
games3 = pd.read_csv('action?page=3.csv')

In [81]:
games3

|  | Game | Technology | Ratings | Players |
|---|---|---|---|---|
| 0 | Jungle Treasure | HTML5 | 75% | 27812 |
| 1 | Chaos Roadkill | WebGL | 71% | 17814 |
| 2 | Battle Royale Gangs | WebGL | 90% | 851571 |
| 3 | Amazing Crime Strange Stickman - Rope Vice Vegas | WebGL | 87% | 1026302 |
| 4 | Polygon Royale Shooter | WebGL | 81% | 15200 |
| ... | ... | ... | ... | ... |
| 59 | Piratebattle io | HTML5 | 73% | 233186 |
| 60 | Crime City 3D 2 | WebGL | 89% | 8813506 |
| 61 | Jack O Gunner | HTML5 | 60% | 7424 |
| 62 | Ponypocalypsis | WebGL | 74% | 16417 |

The image below shows the contents of the `action?page=3.csv` file.
![img](https://i.imgur.com/f325nEb.jpg)

In [82]:
games4 = pd.read_csv('action?page=4.csv')

In [83]:
games4

|  | Game | Technology | Ratings | Players |
|---|---|---|---|---|
| 0 | Black Panther: Jungle Pursuit | HTML5 | 83% | 36488 |
| 1 | Die Alone | HTML5 | 76% | 29816 |
| 2 | Metal Guns Fury | HTML5 | 82% | 164027 |
| 3 | Drakensang Online | HTML5 | 85% | 2943379 |
| 4 | The Office Guy | HTML5 | 69% | 19373 |
| ... | ... | ... | ... | ... |
| 59 | Bomb It 3 | HTML5 | 90% | 19419511 |
| 60 | C-Zero | HTML5 | 89% | 13946 |
| 61 | Go Repo | HTML5 | 94% | 13024 |
| 62 | Stickman Armed Assassin: Cold Space | WebGL | 90% | 16833 |

The image below shows the contents of the `action?page=4.csv` file.

![img](https://i.imgur.com/TvQHlLI.jpg)

In [84]:
games5 = pd.read_csv('action?page=5.csv')

In [85]:
games5

|  | Game | Technology | Ratings | Players |
|---|---|---|---|---|
| 0 | Sun Wukong vs Robot | HTML5 | 72% | 7911 |
| 1 | Waaaar io | HTML5 | 66% | 134093 |
| 2 | Day of Danger - Henry Danger | HTML5 | 85% | 15828 |
| 3 | Lords of the Arena | HTML5 | 86% | 230382 |
| 4 | City Siege 4 - Alien Siege | HTML5 | 93% | 187962 |
| ... | ... | ... | ... | ... |
| 59 | ForceZ io | WebGL | 87% | 126231 |
| 60 | Stones of Thanos | HTML5 | 69% | 51689 |
| 61 | Labyrneath II | HTML5 | 85% | 47536 |
| 62 | Carrot Mania Pirates | HTML5 | 69% | 20568 |

The image below shows the contents of the `action?page=5.csv` file.
![img](https://i.imgur.com/HDixTfG.jpg)

In [86]:
games6 = pd.read_csv('action?page=6.csv')

In [87]:
games6

|  | Game | Technology | Ratings | Players |
|---|---|---|---|---|
| 0 | Captain Rogers: Defense of Karmax-3 | HTML5 | 57% | 104413 |
| 1 | Furious Road | HTML5 | 78% | 121741 |
| 2 | Burnin' Rubber | HTML5 | 76% | 42342 |
| 3 | Real Shooting FPS Strike | WebGL | 79% | 30391 |
| 4 | Block Pixel Cops | WebGL | 77% | 28810 |
| ... | ... | ... | ... | ... |
| 59 | Way of Hero | HTML5 | 79% | 794608 |
| 60 | WW2 Modern War Tanks 1942 | WebGL | 83% | 165457 |
| 61 | Apple & Onion The Floor is Lava! | HTML5 | 78% | 872937 |
| 62 | Urban Counter Terrorist Warfare | WebGL | 80% | 97683 |

The image below shows the contents of the `action?page=6.csv` file.
![img](https://i.imgur.com/Kk9BDNz.jpg)

In [88]:
games7 = pd.read_csv('action?page=7.csv')

In [89]:
games7

|  | Game | Technology | Ratings | Players |
|---|---|---|---|---|
| 0 | CAD War 4 | WebGL | 81% | 30564 |
| 1 | Tank Arena Game | HTML5 | 78% | 142183 |
| 2 | Army Combat 3D | WebGL | 85% | 266130 |
| 3 | Jet Boi | HTML5 | 71% | 295187 |
| 4 | FZ Tap Touch Run | HTML5 | 48% | 13007 |
| ... | ... | ... | ... | ... |
| 59 | Wile Coyote and Road Runner | Unity 3D | 84% | 33200 |
| 60 | Seafight | HTML5 | 83% | 899257 |
| 61 | Flick Ninja 3D | HTML5 | 70% | 22211 |
| 62 | SnowBrawl Fight 3 | Unity 3D | 88% | 25586 |

The image below shows the contents of the `action?page=7.csv` file.
![img](https://i.imgur.com/HfaDaLp.jpg)

In [90]:
games8 = pd.read_csv('action?page=8.csv')

In [91]:
games8

|  | Game | Technology | Ratings | Players |
|---|---|---|---|---|
| 0 | Ben 10: The Lost World | Unity 3D | 85% | 122730 |
| 1 | Colossal Catastrophe – Tom and Jerry | Unity 3D | 91% | 41225 |
| 2 | Thor's Hammer | Unity 3D | 67% | 7671 |
| 3 | Stephen Karsch | HTML5 | 82% | 13611 |
| 4 | Crazy Car Driver | WebGL | 80% | 101402 |
| ... | ... | ... | ... | ... |
| 59 | Black Hole Escape | WebGL | 80% | 4020 |
| 60 | Robot Escape | WebGL | 80% | 5647 |
| 61 | Spaceman in the Wizard Alien Nebula | WebGL | 70% | 3860 |
| 62 | Discontinuum | WebGL | 91% | 4304 |

The image below shows the contents of the `action?page=8.csv` file.
![img](https://i.imgur.com/JiGFdic.jpg)

In [92]:
games9 = pd.read_csv('action?page=9.csv')

In [93]:
games9

|  | Game | Technology | Ratings | Players |
|---|---|---|---|---|
| 0 | Space Ripper | WebGL | 87% | 4374 |
| 1 | Pick Your Poison Remix | HTML5 | 81% | 3527 |
| 2 | Marine Invaders | WebGL | 68% | 30873 |
| 3 | Me and Dungeons | WebGL | 82% | 11030 |
| 4 | Air Warfare | HTML5 | 76% | 60140 |
| ... | ... | ... | ... | ... |
| 59 | Cactus Mayhem | WebGL | 71% | 6165 |
| 60 | Warzone Getaway 2020 | HTML5 | 81% | 300328 |
| 61 | Swords of Brim | HTML5 | 63% | 16285 |
| 62 | Alpha E Corp | WebGL | 72% | 7059 |

The image below shows the contents of the `action?page=9.csv` file.
![img](https://i.imgur.com/NvVen3E.jpg)

In [94]:
games10 = pd.read_csv('action?page=10.csv')

In [95]:
games10

|  | Game | Technology | Ratings | Players |
|---|---|---|---|---|
| 0 | Mission Ammunition | WebGL | 84% | 13035 |
| 1 | City Escape 2 | WebGL | 56% | 4187 |
| 2 | T-Rex N.Y Online | HTML5 | 89% | 25957 |
| 3 | Chop Chop Ninja Academy | HTML5 | 81% | 12378 |
| 4 | Aliens Gone Wild | HTML5 | 88% | 10184 |
| ... | ... | ... | ... | ... |
| 59 | The Last Man | HTML5 | 82% | 26564 |
| 60 | 2Doom | HTML5 | 91% | 12942 |
| 61 | Food Gang Run | HTML5 | 85% | 4969 |
| 62 | Zedbeard | HTML5 | 40% | 6408 |

The image below shows the contents of the `action?page=10.csv` file.
![img](https://i.imgur.com/I2yl4tk.jpg)

In [96]:
concat_file = pd.concat([games1, games2, games3, games4, games5, games6, games7, games8, games9, games10], ignore_index=True)

The `.csv` files for games1, games2, games3, games4, games5, games6, games7, games8, games9, and games10:

![img](https://i.imgur.com/Tjb8ObC.jpg)
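Reading each page into its own variable works, but the same result can be built from a list of data frames. The sketch below is a minimal illustration: in-memory strings stand in for the per-page CSV files, with one row each copied from the outputs above. Real code would pass the file paths to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Stand-ins for the per-page CSV files
page_texts = [
    'Game,Technology,Ratings,Players\nMini Samurai: Kurofune,WebGL,87%,18023\n',
    'Game,Technology,Ratings,Players\nProtektor,HTML5,88%,18425\n',
    'Game,Technology,Ratings,Players\nJungle Treasure,HTML5,75%,27812\n',
]

# One data frame per page, then a single concatenation
frames = [pd.read_csv(io.StringIO(text)) for text in page_texts]
combined = pd.concat(frames, ignore_index=True)

print(combined)
print(len(combined))
```

`ignore_index=True` renumbers the rows from 0, so the combined frame has one continuous index instead of repeating each page's own.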

In [97]:
concat_file

|  | Game | Technology | Ratings | Players |
|---|---|---|---|---|
| 0 | Mini Samurai: Kurofune | WebGL | 87% | 18023 |
| 1 | Diary of a Wimpy Kid: The Meltdown | HTML5 | 88% | 116225 |
| 2 | Stained Act 1 | WebGL | 81% | 137812 |
| 3 | LA Shark | WebGL | 76% | 1454426 |
| 4 | Madness: Off-Color | HTML5 | 87% | 216682 |
| ... | ... | ... | ... | ... |
| 635 | The Last Man | HTML5 | 82% | 26564 |
| 636 | 2Doom | HTML5 | 91% | 12942 |
| 637 | Food Gang Run | HTML5 | 85% | 4969 |
| 638 | Zedbeard | HTML5 | 40% | 6408 |


### Summary

So far we have:

1. Downloaded the web page using `requests`.
2. Extracted information from the HTML using Beautiful Soup.
3. Used Beautiful Soup to parse and extract information: (i) extracted the name, technology, ratings, and players of each game, and (ii) compiled the extracted information into lists and dictionaries.
4. Read and saved the extracted information as CSV files, with (i) the complete reusable function code in a single cell.
5. Combined data from multiple pages using pandas.

### Upcoming

* Different categories of Gaming
* Output additional information and data about each game

### Reference Topics

* [Web Scraping](https://www.edureka.co/blog/web-scraping-with-python)

* [requests](https://requests.readthedocs.io/en/latest/)

* [BeautifulSoup](https://brightdata.com/blog/how-tos/how-to-use-beautiful-soup-for-web-scraping-with-python)

* [Pandas](https://www.geeksforgeeks.org/introduction-to-pandas-in-python/)

In [98]:
import jovian

In [99]:
jovian.commit()

<IPython.core.display.Javascript object>

[jovian] Updating notebook "venkatasubramanian-natesan/online-pc-games-s" on https://jovian.com
[jovian] Committed successfully! https://jovian.com/venkatasubramanian-natesan/online-pc-games-s


'https://jovian.com/venkatasubramanian-natesan/online-pc-games-s'