### Web Scraping using BeautifulSoup

The goal of this project is to collect the top repositories for the top 20 topics on GitHub.
First, we will find the top topics on GitHub.

https://github.com/ > Top 20 Topics > Extracting top 20 repositories from each topic.

The following modules are used to scrape GitHub:
requests: used to get the content of the web page.
html5lib
BeautifulSoup: html5lib and BeautifulSoup are used to parse the HTML.

In [1]:
import pandas as pd
import numpy as np

In [2]:
!pip install requests



In [3]:
!pip install beautifulsoup4



In [4]:
!pip install html5lib



In [5]:
import requests
from bs4 import BeautifulSoup

In [6]:
url = 'https://github.com/topics'

In [7]:
r = requests.get(url)

In [8]:
htmlContent = r.content

In [None]:
htmlContent
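
Before parsing, it can help to confirm that the request actually succeeded. The following is a minimal sketch, assuming a hypothetical helper named `fetch_html` and an arbitrary 10-second timeout; neither is part of the original notebook.

```python
import requests


def fetch_html(url, timeout=10):
    """Download a page and return its body as bytes.

    `fetch_html` and the 10-second default are illustrative choices,
    not something the notebook defines.
    """
    response = requests.get(url, timeout=timeout)
    # Raise requests.HTTPError for 4xx/5xx responses instead of
    # silently handing an error page to the parser.
    response.raise_for_status()
    return response.content
```

With this helper, the two cells above could be written as `htmlContent = fetch_html('https://github.com/topics')`.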

In [10]:
# Now parsing the html content using beautifulsoup
soup = BeautifulSoup(htmlContent, 'html.parser')

In [None]:
print(soup)

### We will prepare a CSV containing every topic's name, URL, and description


#### Topic Title tags

In [None]:
title_class = 'f3 lh-condensed mb-0 mt-1 Link--primary'
title_tags = soup.find_all('p', {'class': title_class})

In [13]:
title_tags

[<p class="f3 lh-condensed mb-0 mt-1 Link--primary">3D</p>,
 <p class="f3 lh-condensed mb-0 mt-1 Link--primary">Ajax</p>,
 <p class="f3 lh-condensed mb-0 mt-1 Link--primary">Algorithm</p>,
 <p class="f3 lh-condensed mb-0 mt-1 Link--primary">Amp</p>,
 <p class="f3 lh-condensed mb-0 mt-1 Link--primary">Android</p>,
 <p class="f3 lh-condensed mb-0 mt-1 Link--primary">Angular</p>,
 <p class="f3 lh-condensed mb-0 mt-1 Link--primary">Ansible</p>,
 <p class="f3 lh-condensed mb-0 mt-1 Link--primary">API</p>,
 <p class="f3 lh-condensed mb-0 mt-1 Link--primary">Arduino</p>,
 <p class="f3 lh-condensed mb-0 mt-1 Link--primary">ASP.NET</p>,
 <p class="f3 lh-condensed mb-0 mt-1 Link--primary">Atom</p>,
 <p class="f3 lh-condensed mb-0 mt-1 Link--primary">Awesome Lists</p>,
 <p class="f3 lh-condensed mb-0 mt-1 Link--primary">Amazon Web Services</p>,
 <p class="f3 lh-condensed mb-0 mt-1 Link--primary">Azure</p>,
 <p class="f3 lh-condensed mb-0 mt-1 Link--primary">Babel</p>,
 <p class="f3 lh-condensed m

In [14]:
len(title_tags)

30
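
Passing the full class string to `find_all` matches only tags whose `class` attribute is exactly that string, in that order. The sketch below contrasts this with a CSS selector on a small, hypothetical HTML snippet (the markup is invented for illustration and only loosely modelled on the output above).

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second title lists the same classes in a
# different order.
sample_html = """
<p class="f3 lh-condensed mb-0 mt-1 Link--primary">3D</p>
<p class="Link--primary f3 lh-condensed mb-0 mt-1">Ajax</p>
<p class="f5 color-fg-muted mb-0 mt-1">Not a title</p>
"""

sample_soup = BeautifulSoup(sample_html, "html.parser")

# Exact match on the whole class string: class order matters.
exact = sample_soup.find_all(
    "p", {"class": "f3 lh-condensed mb-0 mt-1 Link--primary"}
)

# CSS selector: matches any <p> carrying both classes, in any order.
by_selector = sample_soup.select("p.f3.Link--primary")

exact_titles = [tag.text for tag in exact]
selector_titles = [tag.text for tag in by_selector]
print(exact_titles)
print(selector_titles)
```

The selector form may be somewhat more tolerant of small changes to the page's class lists, although any scraper tied to CSS classes can break when the site is redesigned.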

#### Creating a list that will contain title

In [45]:
topic_title_list = [title_tag.text for title_tag in title_tags]
topic_title_list

['3D',
 'Ajax',
 'Algorithm',
 'Amp',
 'Android',
 'Angular',
 'Ansible',
 'API',
 'Arduino',
 'ASP.NET',
 'Atom',
 'Awesome Lists',
 'Amazon Web Services',
 'Azure',
 'Babel',
 'Bash',
 'Bitcoin',
 'Bootstrap',
 'Bot',
 'C',
 'Chrome',
 'Chrome extension',
 'Command line interface',
 'Clojure',
 'Code quality',
 'Code review',
 'Compiler',
 'Continuous integration',
 'COVID-19',
 'C++']

#### Topic description tags

In [15]:
description_class = 'f5 color-fg-muted mb-0 mt-1'
description_tags = soup.find_all('p', {'class': description_class})

In [17]:
description_tags[:5]

[<p class="f5 color-fg-muted mb-0 mt-1">
           3D refers to the use of three-dimensional graphics, modeling, and animation in various industries.
         </p>,
 <p class="f5 color-fg-muted mb-0 mt-1">
           Ajax is a technique for creating interactive web applications.
         </p>,
 <p class="f5 color-fg-muted mb-0 mt-1">
           Algorithms are self-contained sequences that carry out a variety of tasks.
         </p>,
 <p class="f5 color-fg-muted mb-0 mt-1">
           Amp is a non-blocking concurrency library for PHP.
         </p>,
 <p class="f5 color-fg-muted mb-0 mt-1">
           Android is an operating system built by Google designed for mobile devices.
         </p>]

In [18]:
len(description_tags)

30

#### Topic Link tags

In [19]:
anchor_class = 'no-underline flex-1 d-flex flex-column' 
anchor_tags = soup.find_all('a', {'class': anchor_class})

In [20]:
anchor_tags[:5]

[<a class="no-underline flex-1 d-flex flex-column" href="/topics/3d">
 <p class="f3 lh-condensed mb-0 mt-1 Link--primary">3D</p>
 <p class="f5 color-fg-muted mb-0 mt-1">
           3D refers to the use of three-dimensional graphics, modeling, and animation in various industries.
         </p>
 </a>,
 <a class="no-underline flex-1 d-flex flex-column" href="/topics/ajax">
 <p class="f3 lh-condensed mb-0 mt-1 Link--primary">Ajax</p>
 <p class="f5 color-fg-muted mb-0 mt-1">
           Ajax is a technique for creating interactive web applications.
         </p>
 </a>,
 <a class="no-underline flex-1 d-flex flex-column" href="/topics/algorithm">
 <p class="f3 lh-condensed mb-0 mt-1 Link--primary">Algorithm</p>
 <p class="f5 color-fg-muted mb-0 mt-1">
           Algorithms are self-contained sequences that carry out a variety of tasks.
         </p>
 </a>,
 <a class="no-underline flex-1 d-flex flex-column" href="/topics/amphp">
 <p class="f3 lh-condensed mb-0 mt-1 Link--primary">Amp</p>
 <p cl

#### Creating list that will contain topics links

In [33]:
topic_link_list = []
for anchor in  anchor_tags:
    base_url = 'https://github.com'
    topic_url = base_url + anchor.get('href')
    topic_link_list.append(topic_url)


In [34]:
print(topic_link_list)

['https://github.com/topics/3d', 'https://github.com/topics/ajax', 'https://github.com/topics/algorithm', 'https://github.com/topics/amphp', 'https://github.com/topics/android', 'https://github.com/topics/angular', 'https://github.com/topics/ansible', 'https://github.com/topics/api', 'https://github.com/topics/arduino', 'https://github.com/topics/aspnet', 'https://github.com/topics/atom', 'https://github.com/topics/awesome', 'https://github.com/topics/aws', 'https://github.com/topics/azure', 'https://github.com/topics/babel', 'https://github.com/topics/bash', 'https://github.com/topics/bitcoin', 'https://github.com/topics/bootstrap', 'https://github.com/topics/bot', 'https://github.com/topics/c', 'https://github.com/topics/chrome', 'https://github.com/topics/chrome-extension', 'https://github.com/topics/cli', 'https://github.com/topics/clojure', 'https://github.com/topics/code-quality', 'https://github.com/topics/code-review', 'https://github.com/topics/compiler', 'https://github.com/t

In [27]:
url = 'https://github.com'+anchor_tags[0]['href']

In [29]:
print(url)

https://github.com/topics/3d
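
String concatenation works here because every `href` starts with `/`. As an alternative sketch, `urllib.parse.urljoin` from the standard library also copes with hrefs that are already absolute. The `BASE_URL` constant and the sample hrefs below are illustrative.

```python
from urllib.parse import urljoin

BASE_URL = "https://github.com"

# Example href values as they appear in the anchor tags above.
hrefs = ["/topics/3d", "/topics/ajax"]
links = [urljoin(BASE_URL, href) for href in hrefs]
print(links)

# An href that is already absolute is returned unchanged.
absolute = urljoin(BASE_URL, "https://github.com/topics/android")
print(absolute)
```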


#### Topic description tags

In [37]:
topic_description_class = 'f5 color-fg-muted mb-0 mt-1'
topic_description_tags = soup.find_all('p', {'class': topic_description_class})

In [42]:
topic_description_tags[:5]

[<p class="f5 color-fg-muted mb-0 mt-1">
           3D refers to the use of three-dimensional graphics, modeling, and animation in various industries.
         </p>,
 <p class="f5 color-fg-muted mb-0 mt-1">
           Ajax is a technique for creating interactive web applications.
         </p>,
 <p class="f5 color-fg-muted mb-0 mt-1">
           Algorithms are self-contained sequences that carry out a variety of tasks.
         </p>,
 <p class="f5 color-fg-muted mb-0 mt-1">
           Amp is a non-blocking concurrency library for PHP.
         </p>,
 <p class="f5 color-fg-muted mb-0 mt-1">
           Android is an operating system built by Google designed for mobile devices.
         </p>]

In [39]:
len(topic_description_tags)

30

#### Creating a list that will contain topic description

In [41]:
topic_desc_list = []
for topic_description_tag in topic_description_tags:
    topic_desc_list.append(topic_description_tag.text.strip())
print(topic_desc_list)

['3D refers to the use of three-dimensional graphics, modeling, and animation in various industries.', 'Ajax is a technique for creating interactive web applications.', 'Algorithms are self-contained sequences that carry out a variety of tasks.', 'Amp is a non-blocking concurrency library for PHP.', 'Android is an operating system built by Google designed for mobile devices.', 'Angular is an open source web application platform.', 'Ansible is a simple and powerful automation engine.', 'An API (Application Programming Interface) is a collection of protocols and subroutines for building software.', 'Arduino is an open source platform for building electronic devices.', 'ASP.NET is a web framework for building modern web apps and services.', 'Atom is a open source text editor built with web technologies.', 'An awesome list is a list of awesome things curated by the community.', 'Amazon Web Services provides on-demand cloud computing platforms on a subscription basis.', 'Azure is a cloud co

#### Creating a list that will contain topics

In [32]:
topic_title_list = []
for title_tag in title_tags:
    topic_title_list.append(title_tag.text)
print(topic_title_list)

['3D', 'Ajax', 'Algorithm', 'Amp', 'Android', 'Angular', 'Ansible', 'API', 'Arduino', 'ASP.NET', 'Atom', 'Awesome Lists', 'Amazon Web Services', 'Azure', 'Babel', 'Bash', 'Bitcoin', 'Bootstrap', 'Bot', 'C', 'Chrome', 'Chrome extension', 'Command line interface', 'Clojure', 'Code quality', 'Code review', 'Compiler', 'Continuous integration', 'COVID-19', 'C++']
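
The three lists above come from separate `find_all` calls, so they only line up if every topic card contains a title, a description, and a link. One alternative sketch walks each anchor once and reads all three fields from it. The function name `parse_topic_cards` and the two-card sample markup are assumptions made for illustration.

```python
from bs4 import BeautifulSoup

# Trimmed sample of two topic cards, based on the anchor tags shown earlier.
sample_html = """
<a class="no-underline flex-1 d-flex flex-column" href="/topics/3d">
<p class="f3 lh-condensed mb-0 mt-1 Link--primary">3D</p>
<p class="f5 color-fg-muted mb-0 mt-1">
   3D refers to the use of three-dimensional graphics, modeling, and animation in various industries.
</p>
</a>
<a class="no-underline flex-1 d-flex flex-column" href="/topics/ajax">
<p class="f3 lh-condensed mb-0 mt-1 Link--primary">Ajax</p>
<p class="f5 color-fg-muted mb-0 mt-1">
   Ajax is a technique for creating interactive web applications.
</p>
</a>
"""


def parse_topic_cards(soup, base_url="https://github.com"):
    """Return one dict per topic card, keeping the fields aligned."""
    anchor_class = "no-underline flex-1 d-flex flex-column"
    rows = []
    for anchor in soup.find_all("a", {"class": anchor_class}):
        paragraphs = anchor.find_all("p")
        if len(paragraphs) < 2:
            continue  # skip cards missing a title or a description
        rows.append({
            "Topic Name": paragraphs[0].text.strip(),
            "Topic Description": paragraphs[1].text.strip(),
            "Topic Links": base_url + anchor.get("href"),
        })
    return rows


rows = parse_topic_cards(BeautifulSoup(sample_html, "html.parser"))
print(rows[0]["Topic Name"], rows[0]["Topic Links"])
```

A list of dicts like `rows` can be passed directly to `pd.DataFrame`.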


#### Now creating a DataFrame that contains all the items: topic titles, topic descriptions, topic links

In [46]:
# Named topics_dict to avoid shadowing the built-in dict type
topics_dict = {'Topic Name': topic_title_list, 'Topic Description': topic_desc_list, 'Topic Links': topic_link_list}

In [48]:
topics_df = pd.DataFrame(topics_dict)

In [49]:
topics_df

| | Topic Name | Topic Description | Topic Links |
|---|---|---|---|
| 0 | 3D | 3D refers to the use of three-dimensional grap... | https://github.com/topics/3d |
| 1 | Ajax | Ajax is a technique for creating interactive w... | https://github.com/topics/ajax |
| 2 | Algorithm | Algorithms are self-contained sequences that c... | https://github.com/topics/algorithm |
| 3 | Amp | Amp is a non-blocking concurrency library for ... | https://github.com/topics/amphp |
| 4 | Android | Android is an operating system built by Google... | https://github.com/topics/android |
| 5 | Angular | Angular is an open source web application plat... | https://github.com/topics/angular |
| 6 | Ansible | Ansible is a simple and powerful automation en... | https://github.com/topics/ansible |
| 7 | API | An API (Application Programming Interface) is ... | https://github.com/topics/api |
| 8 | Arduino | Arduino is an open source platform for buildin... | https://github.com/topics/arduino |
| 9 | ASP.NET | ASP.NET is a web framework for building modern... | https://github.com/topics/aspnet |


##### From here we can export the DataFrame to a separate CSV file.
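
As a rough sketch of that export step, the example below builds a two-row stand-in for the topics DataFrame and writes it out with `DataFrame.to_csv`. It writes to an in-memory buffer so it runs without touching the disk; the name `sample_topics_df` is illustrative.

```python
import io

import pandas as pd

# Two-row stand-in for the full topics DataFrame.
sample_topics_df = pd.DataFrame({
    "Topic Name": ["3D", "Ajax"],
    "Topic Description": [
        "3D refers to the use of three-dimensional graphics, modeling, "
        "and animation in various industries.",
        "Ajax is a technique for creating interactive web applications.",
    ],
    "Topic Links": [
        "https://github.com/topics/3d",
        "https://github.com/topics/ajax",
    ],
})

# index=False leaves out the 0, 1, 2, ... row labels.
buffer = io.StringIO()
sample_topics_df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)

# Reading the text back shows that the columns survive the round trip.
round_trip = pd.read_csv(io.StringIO(csv_text))
```

For a real file, the same call takes a path instead of a buffer, for example `topics_df.to_csv('topics.csv', index=False)`, where `topics.csv` is a placeholder file name.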

#### Now get the top repositories for each topic.

##### Let's say we want to get the top repos under the topic 3D

In [52]:
url_3d = topic_link_list[0]

In [53]:
r = requests.get(url_3d)

In [54]:
htmlContent = r.content

In [None]:
htmlContent

In [56]:
soup = BeautifulSoup(htmlContent, 'html.parser')

In [None]:
print(soup)

In [58]:
# Save the raw HTML of the 3d page to a file for later inspection
with open('3d_content.html', 'w', encoding='utf-8') as f:
    f.write(r.text)

#### Getting the repo tags
###### Info to collect for each repo: repo name, username, link, stars

In [59]:
repo_class = 'f3 color-fg-muted text-normal lh-condensed'
repo_tags = soup.find_all('h3', {'class': repo_class})

In [60]:
repo_tags[:5]

[<h3 class="f3 color-fg-muted text-normal lh-condensed">
 <a data-hydro-click='{"event_type":"explore.click","payload":{"click_context":"REPOSITORY_CARD","click_target":"OWNER","click_visual_representation":"REPOSITORY_OWNER_HEADING","actor_id":null,"record_id":97088,"originating_url":"https://github.com/topics/3d","user_id":null}}' data-hydro-click-hmac="4bdbc49d3c05ae7f70b531fbce709a384200b0768554e0172950286a8db30940" data-turbo="false" data-view-component="true" href="/mrdoob">
             mrdoob
 </a>          /
           <a class="text-bold wb-break-word" data-hydro-click='{"event_type":"explore.click","payload":{"click_context":"REPOSITORY_CARD","click_target":"REPOSITORY","click_visual_representation":"REPOSITORY_NAME_HEADING","actor_id":null,"record_id":576201,"originating_url":"https://github.com/topics/3d","user_id":null}}' data-hydro-click-hmac="517d3d5cb9d89752156923904a4238816bc9b51ab7772f3e3644ce897d8dd4e5" data-turbo="false" data-view-component="true" href="/mrdoob/thr

In [62]:
repo_tags[0]

<h3 class="f3 color-fg-muted text-normal lh-condensed">
<a data-hydro-click='{"event_type":"explore.click","payload":{"click_context":"REPOSITORY_CARD","click_target":"OWNER","click_visual_representation":"REPOSITORY_OWNER_HEADING","actor_id":null,"record_id":97088,"originating_url":"https://github.com/topics/3d","user_id":null}}' data-hydro-click-hmac="4bdbc49d3c05ae7f70b531fbce709a384200b0768554e0172950286a8db30940" data-turbo="false" data-view-component="true" href="/mrdoob">
            mrdoob
</a>          /
          <a class="text-bold wb-break-word" data-hydro-click='{"event_type":"explore.click","payload":{"click_context":"REPOSITORY_CARD","click_target":"REPOSITORY","click_visual_representation":"REPOSITORY_NAME_HEADING","actor_id":null,"record_id":576201,"originating_url":"https://github.com/topics/3d","user_id":null}}' data-hydro-click-hmac="517d3d5cb9d89752156923904a4238816bc9b51ab7772f3e3644ce897d8dd4e5" data-turbo="false" data-view-component="true" href="/mrdoob/three.js

In [63]:
repo_tags[0].find_all('a')

[<a data-hydro-click='{"event_type":"explore.click","payload":{"click_context":"REPOSITORY_CARD","click_target":"OWNER","click_visual_representation":"REPOSITORY_OWNER_HEADING","actor_id":null,"record_id":97088,"originating_url":"https://github.com/topics/3d","user_id":null}}' data-hydro-click-hmac="4bdbc49d3c05ae7f70b531fbce709a384200b0768554e0172950286a8db30940" data-turbo="false" data-view-component="true" href="/mrdoob">
             mrdoob
 </a>,
 <a class="text-bold wb-break-word" data-hydro-click='{"event_type":"explore.click","payload":{"click_context":"REPOSITORY_CARD","click_target":"REPOSITORY","click_visual_representation":"REPOSITORY_NAME_HEADING","actor_id":null,"record_id":576201,"originating_url":"https://github.com/topics/3d","user_id":null}}' data-hydro-click-hmac="517d3d5cb9d89752156923904a4238816bc9b51ab7772f3e3644ce897d8dd4e5" data-turbo="false" data-view-component="true" href="/mrdoob/three.js">
             three.js
 </a>]

In [68]:
a = repo_tags[0].find_all('a')

In [72]:
a

[<a data-hydro-click='{"event_type":"explore.click","payload":{"click_context":"REPOSITORY_CARD","click_target":"OWNER","click_visual_representation":"REPOSITORY_OWNER_HEADING","actor_id":null,"record_id":97088,"originating_url":"https://github.com/topics/3d","user_id":null}}' data-hydro-click-hmac="4bdbc49d3c05ae7f70b531fbce709a384200b0768554e0172950286a8db30940" data-turbo="false" data-view-component="true" href="/mrdoob">
             mrdoob
 </a>,
 <a class="text-bold wb-break-word" data-hydro-click='{"event_type":"explore.click","payload":{"click_context":"REPOSITORY_CARD","click_target":"REPOSITORY","click_visual_representation":"REPOSITORY_NAME_HEADING","actor_id":null,"record_id":576201,"originating_url":"https://github.com/topics/3d","user_id":null}}' data-hydro-click-hmac="517d3d5cb9d89752156923904a4238816bc9b51ab7772f3e3644ce897d8dd4e5" data-turbo="false" data-view-component="true" href="/mrdoob/three.js">
             three.js
 </a>]

##### username of the repo

In [75]:
a[0].text.strip() #this is the username

'mrdoob'

##### Name of the repo

In [76]:
a[1].text.strip() #this is the name of the repo.

'three.js'

##### Repo link

In [79]:
#link of the repo
a[1].get('href')

'/mrdoob/three.js'

In [80]:
base_url = 'https://github.com'
repo_link = base_url+a[1].get('href')
print(repo_link)

https://github.com/mrdoob/three.js


##### Stars of the repos

In [82]:
stars_id = 'repo-stars-counter-star'
stars_id_tag = soup.find_all('span', {'id': stars_id})

In [84]:
len(stars_id_tag)

20

In [87]:
stars_3d = stars_id_tag[0].text

In [89]:
stars_3d

'90.6k'

In [90]:
stars_3d[-1]

'k'

In [91]:
stars_3d[:-1]

'90.6'

In [88]:
type(stars_3d)

str

In [94]:
# Convert a star-count string to an integer, e.g. '90.6k' -> 90600
def parsing_star_count(star):
    star = star.strip()
    if star[-1] == 'k':
        # A trailing 'k' means thousands
        return int(float(star[:-1]) * 1000)
    # No 'k' suffix: treat the text as a plain integer
    return int(star)

In [95]:
parsing_star_count(stars_3d)

90600
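
A slightly more defensive variant of this conversion might also accept thousands separators and an `m` suffix. The sketch below uses `Decimal` to avoid floating-point rounding. The function name `parse_count` and the `m` suffix are assumptions; the notebook output only shows values with a `k` suffix.

```python
from decimal import Decimal

# Assumed suffixes; only 'k' appears in the scraped output.
SUFFIX_MULTIPLIERS = {"k": 1_000, "m": 1_000_000}


def parse_count(text):
    """Convert strings such as '90.6k' or '1,234' to an int."""
    cleaned = text.strip().lower().replace(",", "")
    if not cleaned:
        raise ValueError("empty count string")
    multiplier = SUFFIX_MULTIPLIERS.get(cleaned[-1], 1)
    if multiplier != 1:
        cleaned = cleaned[:-1]  # drop the suffix character
    # Decimal keeps '90.6' exact, so the product is exactly 90600.
    return int(Decimal(cleaned) * multiplier)


print(parse_count("90.6k"))
print(parse_count(" 1,234 "))
```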

#### Now creating a function that fetches the username, repo name, repo link, and star count
#### The function takes two arguments: a repo tag (one entry from the list of repos under a specific topic) and the matching star tag


In [96]:
def get_repo_info(repo_tags, star_tags):
    a_tags = repo_tags.find_all('a')
    username = a_tags[0].text.strip()
    repo_name = a_tags[1].text.strip()
    base_url = 'https://github.com'
    repo_url = base_url+a_tags[1].get('href')
    stars = parsing_star_count(star_tags.text.strip())
    return username, repo_name, repo_url, stars

In [99]:
get_repo_info(repo_tags[1], stars_id_tag[1])

('pmndrs',
 'react-three-fiber',
 'https://github.com/pmndrs/react-three-fiber',
 22100)
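
The same extraction can be tried offline on a small piece of markup. The sketch below uses a trimmed copy of the first `h3` tag shown earlier (tracking attributes removed) and returns a named tuple so that fields can be read by name. `RepoInfo` and `extract_repo_info` are hypothetical names, and the star count is left out to keep the example short.

```python
from collections import namedtuple

from bs4 import BeautifulSoup

RepoInfo = namedtuple("RepoInfo", ["username", "repo_name", "repo_url"])

# Trimmed copy of the first repo heading from the 3d topic page.
sample_h3 = """
<h3 class="f3 color-fg-muted text-normal lh-condensed">
<a href="/mrdoob">
            mrdoob
</a>          /
          <a class="text-bold wb-break-word" href="/mrdoob/three.js">
            three.js
</a>
</h3>
"""


def extract_repo_info(h3_tag, base_url="https://github.com"):
    """Read the owner and repository links from one repo heading."""
    anchors = h3_tag.find_all("a")
    if len(anchors) < 2:
        raise ValueError("expected an owner link and a repository link")
    owner_link, repo_link = anchors[0], anchors[1]
    return RepoInfo(
        username=owner_link.text.strip(),
        repo_name=repo_link.text.strip(),
        repo_url=base_url + repo_link.get("href"),
    )


h3 = BeautifulSoup(sample_h3, "html.parser").find("h3")
info = extract_repo_info(h3)
print(info.username, info.repo_name, info.repo_url)
```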

##### Now we will create a dict that contains all the repos pertaining to a specific topic, with the repo name, username, repo link, and star count for each

In [100]:
repo_dict = {
    'Repo Name':[],
    'Username':[],
    'Repo URL': [],
    'Stars Count': []
}
for i in range(len(repo_tags)):
    repo_info = get_repo_info(repo_tags[i], stars_id_tag[i])
    print(repo_info)

('mrdoob', 'three.js', 'https://github.com/mrdoob/three.js', 90600)
('pmndrs', 'react-three-fiber', 'https://github.com/pmndrs/react-three-fiber', 22100)
('libgdx', 'libgdx', 'https://github.com/libgdx/libgdx', 21300)
('BabylonJS', 'Babylon.js', 'https://github.com/BabylonJS/Babylon.js', 19800)
('ssloy', 'tinyrenderer', 'https://github.com/ssloy/tinyrenderer', 16600)
('aframevr', 'aframe', 'https://github.com/aframevr/aframe', 15200)
('lettier', '3d-game-shaders-for-beginners', 'https://github.com/lettier/3d-game-shaders-for-beginners', 14900)
('FreeCAD', 'FreeCAD', 'https://github.com/FreeCAD/FreeCAD', 13700)
('CesiumGS', 'cesium', 'https://github.com/CesiumGS/cesium', 10200)
('metafizzy', 'zdog', 'https://github.com/metafizzy/zdog', 9700)
('timzhang642', '3D-Machine-Learning', 'https://github.com/timzhang642/3D-Machine-Learning', 8800)
('isl-org', 'Open3D', 'https://github.com/isl-org/Open3D', 8400)
('blender', 'blender', 'https://github.com/blender/blender', 8100)
('a1studmuffin', '

In [105]:
repo_dict = {
    'Repo Name':[],
    'Username':[],
    'Repo URL': [],
    'Stars Count': []
}
for i in range(len(repo_tags)):
    repo_info = get_repo_info(repo_tags[i], stars_id_tag[i])
    repo_dict['Repo Name'].append(repo_info[1])
    repo_dict['Username'].append(repo_info[0])
    repo_dict['Repo URL'].append(repo_info[2])
    repo_dict['Stars Count'].append(repo_info[3])
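
An alternative to appending to four parallel lists is to build one record per repository. The sketch below uses two tuples copied from the printed output above, in the order the function returns them. The names `COLUMNS` and `records` are illustrative.

```python
# Column names in the same order as the returned tuples.
COLUMNS = ("Username", "Repo Name", "Repo URL", "Stars Count")

# Two tuples copied from the printed output above.
repo_infos = [
    ("mrdoob", "three.js", "https://github.com/mrdoob/three.js", 90600),
    ("pmndrs", "react-three-fiber",
     "https://github.com/pmndrs/react-three-fiber", 22100),
]

# Pair each value with its column name: one dict per repository.
records = [dict(zip(COLUMNS, info)) for info in repo_infos]
print(records[0])
```

Passing `records` to `pd.DataFrame` gives one row per repository, and this form avoids indexing the tuple by position.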

In [106]:
repo_dict

{'Repo Name': ['three.js',
  'react-three-fiber',
  'libgdx',
  'Babylon.js',
  'tinyrenderer',
  'aframe',
  '3d-game-shaders-for-beginners',
  'FreeCAD',
  'cesium',
  'zdog',
  '3D-Machine-Learning',
  'Open3D',
  'blender',
  'SpaceshipGenerator',
  'BlenderGIS',
  'Fyrox',
  'openscad',
  'model-viewer',
  'spritejs',
  'webglstudio.js'],
 'Username': ['mrdoob',
  'pmndrs',
  'libgdx',
  'BabylonJS',
  'ssloy',
  'aframevr',
  'lettier',
  'FreeCAD',
  'CesiumGS',
  'metafizzy',
  'timzhang642',
  'isl-org',
  'blender',
  'a1studmuffin',
  'domlysz',
  'FyroxEngine',
  'openscad',
  'google',
  'spritejs',
  'jagenjo'],
 'Repo URL': ['https://github.com/mrdoob/three.js',
  'https://github.com/pmndrs/react-three-fiber',
  'https://github.com/libgdx/libgdx',
  'https://github.com/BabylonJS/Babylon.js',
  'https://github.com/ssloy/tinyrenderer',
  'https://github.com/aframevr/aframe',
  'https://github.com/lettier/3d-game-shaders-for-beginners',
  'https://github.com/FreeCAD/FreeCAD

In [107]:
topic_repos_df = pd.DataFrame(repo_dict)

In [108]:
topic_repos_df

| | Repo Name | Username | Repo URL | Stars Count |
|---|---|---|---|---|
| 0 | three.js | mrdoob | https://github.com/mrdoob/three.js | 90600 |
| 1 | react-three-fiber | pmndrs | https://github.com/pmndrs/react-three-fiber | 22100 |
| 2 | libgdx | libgdx | https://github.com/libgdx/libgdx | 21300 |
| 3 | Babylon.js | BabylonJS | https://github.com/BabylonJS/Babylon.js | 19800 |
| 4 | tinyrenderer | ssloy | https://github.com/ssloy/tinyrenderer | 16600 |
| 5 | aframe | aframevr | https://github.com/aframevr/aframe | 15200 |
| 6 | 3d-game-shaders-for-beginners | lettier | https://github.com/lettier/3d-game-shaders-for... | 14900 |
| 7 | FreeCAD | FreeCAD | https://github.com/FreeCAD/FreeCAD | 13700 |
| 8 | cesium | CesiumGS | https://github.com/CesiumGS/cesium | 10200 |
| 9 | zdog | metafizzy | https://github.com/metafizzy/zdog | 9700 |


#### So far we have covered the 3D topic and gathered all the info pertaining to it. We will now do the same for the other topics.

In [118]:
def get_topic_page(topic_url):
    # Download a single topic page and return it as parsed HTML
    r = requests.get(topic_url)
    if r.status_code != 200:
        raise Exception("Failed to load page {}".format(topic_url))
    htmlContent = r.content
    soup = BeautifulSoup(htmlContent, 'html.parser')
    return soup

def get_repo_info(repo_tags, star_tags):
    a_tags = repo_tags.find_all('a')
    username = a_tags[0].text.strip()
    repo_name = a_tags[1].text.strip()
    base_url = 'https://github.com'
    repo_url = base_url+a_tags[1].get('href')
    stars = parsing_star_count(star_tags.text.strip())
    return username, repo_name, repo_url, stars

def get_topic_repos(soup):
    repo_class = 'f3 color-fg-muted text-normal lh-condensed'
    repo_tags = soup.find_all('h3', {'class': repo_class})
    stars_id = 'repo-stars-counter-star'
    stars_id_tag = soup.find_all('span', {'id': stars_id})
    
    repo_dict = {
        'Repo Name':[],
        'Username':[],
        'Repo URL': [],
        'Stars Count': []
    }
    for i in range(len(repo_tags)):
        repo_info = get_repo_info(repo_tags[i], stars_id_tag[i])
        repo_dict['Repo Name'].append(repo_info[1])
        repo_dict['Username'].append(repo_info[0])
        repo_dict['Repo URL'].append(repo_info[2])
        repo_dict['Stars Count'].append(repo_info[3])
    return pd.DataFrame(repo_dict)

In [121]:
df_topic_1 = get_topic_repos(get_topic_page(topic_link_list[0]))

In [122]:
df_topic_1

| | Repo Name | Username | Repo URL | Stars Count |
|---|---|---|---|---|
| 0 | three.js | mrdoob | https://github.com/mrdoob/three.js | 90600 |
| 1 | react-three-fiber | pmndrs | https://github.com/pmndrs/react-three-fiber | 22100 |
| 2 | libgdx | libgdx | https://github.com/libgdx/libgdx | 21300 |
| 3 | Babylon.js | BabylonJS | https://github.com/BabylonJS/Babylon.js | 19800 |
| 4 | tinyrenderer | ssloy | https://github.com/ssloy/tinyrenderer | 16600 |
| 5 | aframe | aframevr | https://github.com/aframevr/aframe | 15200 |
| 6 | 3d-game-shaders-for-beginners | lettier | https://github.com/lettier/3d-game-shaders-for... | 14900 |
| 7 | FreeCAD | FreeCAD | https://github.com/FreeCAD/FreeCAD | 13700 |
| 8 | cesium | CesiumGS | https://github.com/CesiumGS/cesium | 10200 |
| 9 | zdog | metafizzy | https://github.com/metafizzy/zdog | 9700 |


In [125]:
df_topic_2 = get_topic_repos(get_topic_page(topic_link_list[1]))

In [126]:
df_topic_2

| | Repo Name | Username | Repo URL | Stars Count |
|---|---|---|---|---|
| 0 | Blog | ljianshu | https://github.com/ljianshu/Blog | 7500 |
| 1 | infinite-scroll | metafizzy | https://github.com/metafizzy/infinite-scroll | 7200 |
| 2 | unfetch | developit | https://github.com/developit/unfetch | 5500 |
| 3 | tabulator | olifolkerd | https://github.com/olifolkerd/tabulator | 5300 |
| 4 | form | jquery-form | https://github.com/jquery-form/form | 5200 |
| 5 | elFinder | Studio-42 | https://github.com/Studio-42/elFinder | 4400 |
| 6 | wretch | elbywan | https://github.com/elbywan/wretch | 3800 |
| 7 | learn-to-send-email-via-google-script-html-no-... | dwyl | https://github.com/dwyl/learn-to-send-email-vi... | 3000 |
| 8 | reqwest | ded | https://github.com/ded/reqwest | 2900 |
| 9 | bliss | LeaVerou | https://github.com/LeaVerou/bliss | 2400 |


Export the resulting DataFrame to a CSV file

In [None]:
df_topic_2.to_csv("name_of_file.csv", index=False)
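
To cover every topic rather than one at a time, the per-topic steps can be wrapped in a loop that writes one CSV per topic. The following is a sketch under several assumptions: the helper names `topic_slug` and `scrape_topics`, the `topic_csvs` output folder, and the one-second pause between requests are all illustrative, and the scraping function is passed in as an argument so the loop itself stays independent of the network.

```python
import time
from pathlib import Path


def topic_slug(topic_url):
    """Return the last path segment of a topic URL, e.g. '3d'."""
    return topic_url.rstrip("/").rsplit("/", 1)[-1]


def scrape_topics(topic_urls, scrape_one, out_dir="topic_csvs",
                  delay_seconds=1.0):
    """Scrape each topic URL and save the result as <slug>.csv.

    `scrape_one` is any callable that takes a topic URL and returns
    an object with a pandas-style `to_csv` method.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written = []
    for url in topic_urls:
        topic_df = scrape_one(url)
        csv_file = out_path / f"{topic_slug(url)}.csv"
        topic_df.to_csv(csv_file, index=False)
        written.append(csv_file)
        time.sleep(delay_seconds)  # pause between requests
    return written
```

With the functions defined earlier, a call could look like `scrape_topics(topic_link_list[:20], lambda url: get_topic_repos(get_topic_page(url)))`, which would write files such as `topic_csvs/3d.csv`.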