In [1]:
from google.colab import drive
drive.mount('/gdrive', force_remount=True)

Mounted at /gdrive


- HTML web scraping using the **BeautifulSoup** Python library (!pip3 install bs4).
- First, send an HTTP request to the URL of the webpage you want to access. The server responds by returning the HTML content of the page. For this task, we will use the third-party HTTP library **requests** (!pip3 install requests).
- To work with the HTML, we need it as a string. The get() function in the **requests** module returns a Response object whose text attribute holds that string.
- Once the HTML has been fetched with requests, the next step is to **parse** it (!pip3 install html5lib).

- A nice feature of the BeautifulSoup library is that it is built on top of HTML parsing libraries such as **html5lib, lxml, and html.parser**, so the parser can be specified at the same time the BeautifulSoup object is created.
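
The steps listed above (request, get the HTML as a string, parse) can be sketched as one small helper. This is a minimal sketch, not the notebook's own code: the function name fetch_soup, the 10-second timeout, and the sample HTML string are assumptions made for illustration, and the built-in html.parser is used so that the offline demonstration needs no extra parser package.

```python
import requests
from bs4 import BeautifulSoup


def fetch_soup(url, parser="html.parser", timeout=10):
    """Hypothetical helper: download a page and return it as a BeautifulSoup object."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # raise an exception on 4xx/5xx responses
    return BeautifulSoup(response.text, parser)


# Offline demonstration of the parsing step on a made-up HTML string
sample_html = "<html><body><h1>Today's Price</h1><p>Sample page</p></body></html>"
soup_demo = BeautifulSoup(sample_html, "html.parser")
title = soup_demo.h1.get_text()
print(title)
```

Calling fetch_soup(url, parser="html5lib") would use the same parser as the cell below, provided html5lib is installed.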

In [21]:
import pandas as pd
import requests
from bs4 import BeautifulSoup
url = "http://www.nepalstock.com/todaysprice"
# Make the request and check object type
r = requests.get(url)
type(r)

# Extract the HTML from the Response object as a string
html = r.text

# Convert to a beautiful soup object
soup = BeautifulSoup(html, "html5lib")


In [12]:
# Verifying tables and their classes
print('Classes of each table:')
for table in soup.find_all('table'):
    print(table.get('class'))


Classes of each table:
['table', 'table-condensed', 'table-hover']


The pandas function read_html returns a list of DataFrames, one for each HTML table that satisfies our attribute specification. In this case, we are looking for the table whose class attribute is 'table table-condensed table-hover'.
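
To see the attribute filter in isolation, here is a minimal offline sketch that runs read_html on a made-up HTML string instead of the live page. The sample markup, the "summary" class, and the variable names are assumptions for illustration. No flavor is passed, so pandas uses whichever supported parser is installed (lxml, or bs4 with html5lib).

```python
from io import StringIO

import pandas as pd

# Made-up page with two tables; only the second carries the target class string
sample_html = """
<table class="summary"><tr><th>ignored</th></tr><tr><td>0</td></tr></table>
<table class="table table-condensed table-hover">
  <thead><tr><th>Traded Companies</th><th>Closing Price</th></tr></thead>
  <tbody>
    <tr><td>Api Power Company Ltd.</td><td>293.20</td></tr>
    <tr><td>Arun Kabeli Power Ltd.</td><td>467.00</td></tr>
  </tbody>
</table>
"""

# attrs keeps only tables whose class attribute equals this exact string
tables = pd.read_html(
    StringIO(sample_html),
    attrs={"class": "table table-condensed table-hover"},
)
prices = tables[0]
print(len(tables))
print(prices)
```

Because the result is always a list, the DataFrame of interest is taken by position, here tables[0].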

In [14]:
# Creating list with all tables
tables = soup.find_all('table')

# Looking for the table with the classes 'table', 'table-condensed' and 'table-hover'
table = soup.find('table', class_='table table-condensed table-hover')
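
A note on how the class lookup behaves, as a minimal sketch on made-up HTML (the sample markup and variable names are assumptions). Passing the full string to class_ matches only that exact attribute value, passing a single class matches any table that carries it, and a CSS selector matches regardless of class order.

```python
from bs4 import BeautifulSoup

# Two made-up tables that share some classes
sample_html = """
<table class="table table-condensed table-hover"><tr><td>prices</td></tr></table>
<table class="table-hover table"><tr><td>other</td></tr></table>
"""
soup_demo = BeautifulSoup(sample_html, "html.parser")

# Exact string: matches only the table whose class attribute is written this way
exact = soup_demo.find_all("table", class_="table table-condensed table-hover")

# Single class: matches every table that has 'table' among its classes
any_table_class = soup_demo.find_all("table", class_="table")

# CSS selector: requires both classes, in any order
by_selector = soup_demo.select("table.table-condensed.table-hover")

print(len(exact), len(any_table_class), len(by_selector))
```

The selector form is the more forgiving choice if the site ever reorders the classes in its markup.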

In [16]:
# Add thousands='.' to the call if the numbers need it
df_pandas = pd.read_html(url, attrs={'class': 'table table-condensed table-hover'}, flavor='bs4')
#print(df_pandas)


In [18]:
data = df_pandas[0][1:50]
data


Unnamed: 0,0,1,2,3,4,5,6,7,8,9
1,S.N.,Traded Companies,No. Of Transaction,Max Price,Min Price,Closing Price,Traded Shares,Amount,Previous Closing,Difference Rs.
2,1,10% Sanima Bank Limited Debenture,1,946.00,946.00,946.00,25.00,23650.00,965.30,-19.30
3,2,10.5 % NEPAL INVESTMENT DEBENTURE 2082,1,958.60,958.60,958.60,25.00,23965.00,965.00,-6.40
4,3,8.5% Nepal Investment Bank Debenture 2084,1,850.00,850.00,850.00,35.00,29750.00,850.00,0.00
5,4,Aarambha Chautari Laghubitta Bittiya Sanstha L...,25,909.00,896.00,897.50,734.00,662025.00,909.00,-11.50
6,5,Agricultural Development Bank Limited,139,367.00,363.00,364.90,10347.00,3767210.40,367.00,-2.10
7,6,Ajod Insurance Limited,64,540.60,512.00,525.00,3144.00,1637161.50,530.00,-5.00
8,7,Ankhu Khola Jalvidhyut Company Ltd,170,234.50,225.40,229.00,28945.00,6630448.20,230.00,-1.00
9,8,Api Power Company Ltd.,409,303.00,291.10,293.20,91619.00,26953505.30,301.00,-7.80
10,9,Arun Kabeli Power Ltd.,524,485.00,461.00,467.00,81485.00,38225562.70,479.00,-12.00
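
In the output above, the real column names (S.N., Traded Companies, ...) sit in the first kept row as ordinary data, and the columns are labelled 0 to 9. One way to promote that row to a header and convert a price column to numbers is sketched below on a hypothetical three-row miniature of the frame; the variable names raw and tidy are assumptions, not part of the notebook.

```python
import pandas as pd

# Hypothetical miniature of the scraped frame: integer column labels,
# with the header text stored in the first row
raw = pd.DataFrame(
    [
        ["S.N.", "Traded Companies", "Closing Price"],
        ["1", "Api Power Company Ltd.", "293.20"],
        ["2", "Arun Kabeli Power Ltd.", "467.00"],
    ]
)

tidy = raw.iloc[1:].copy()           # data rows only, as an independent copy
tidy.columns = raw.iloc[0].tolist()  # first row becomes the column names
tidy = tidy.reset_index(drop=True)

# Values scraped alongside a text header arrive as strings; convert them
tidy["Closing Price"] = pd.to_numeric(tidy["Closing Price"])
print(tidy)
```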


In [10]:
# Drop the first column (S.N.). Reassigning instead of using inplace=True avoids
# a SettingWithCopyWarning, because data is a slice of df_pandas[0].
data = data.drop(columns=data.columns[0])
data


Unnamed: 0,1,2,3,4,5,6,7,8,9
1,Traded Companies,No. Of Transaction,Max Price,Min Price,Closing Price,Traded Shares,Amount,Previous Closing,Difference Rs.
2,10% Sanima Bank Limited Debenture,1,946.00,946.00,946.00,25.00,23650.00,965.30,-19.30
3,10.5 % NEPAL INVESTMENT DEBENTURE 2082,1,958.60,958.60,958.60,25.00,23965.00,965.00,-6.40
4,8.5% Nepal Investment Bank Debenture 2084,1,850.00,850.00,850.00,35.00,29750.00,850.00,0.00
5,Aarambha Chautari Laghubitta Bittiya Sanstha L...,25,909.00,896.00,897.50,734.00,662025.00,909.00,-11.50
6,Agricultural Development Bank Limited,139,367.00,363.00,364.90,10347.00,3767210.40,367.00,-2.10
7,Ajod Insurance Limited,64,540.60,512.00,525.00,3144.00,1637161.50,530.00,-5.00
8,Ankhu Khola Jalvidhyut Company Ltd,170,234.50,225.40,229.00,28945.00,6630448.20,230.00,-1.00
9,Api Power Company Ltd.,409,303.00,291.10,293.20,91619.00,26953505.30,301.00,-7.80
10,Arun Kabeli Power Ltd.,524,485.00,461.00,467.00,81485.00,38225562.70,479.00,-12.00


In [None]:
# saving the dataframe
data.to_csv('nepse.csv')
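
By default, to_csv also writes the row index as the first column of the file. If that column is not wanted, index=False leaves it out. A minimal sketch, writing a made-up two-row frame to an in-memory buffer rather than to nepse.csv:

```python
from io import StringIO

import pandas as pd

# Made-up frame standing in for the scraped data
sample = pd.DataFrame(
    {
        "Traded Companies": ["Api Power Company Ltd.", "Arun Kabeli Power Ltd."],
        "Closing Price": [293.2, 467.0],
    }
)

buffer = StringIO()
sample.to_csv(buffer, index=False)  # omit the row index from the output
lines = buffer.getvalue().splitlines()
print(lines[0])
```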