# Benjamin Graham and Warren Buffett Model Stock Exchange

## Introduction:
* About 4000 stocks are actively traded on the BSE and NSE stock exchanges.
* We can extract public financial data from financial websites to find which stocks are fundamentally strong.
* Which stocks would Benjamin Graham, the father of value investing, and Warren Buffett, one of the most successful investors in the world, invest in?

## Benjamin Graham and Warren Buffett Model:
1. **Step 1:** Filtering out all companies with sales less than Rs 250 cr. Companies with sales lower than this are very small companies and might not have the business stability and access to finance that is required for a safe investment. This eliminates the basic business risk.
2. **Step 2:** Filtering out all companies with debt to equity greater than 30%. Companies with low leverage are safer.
3. **Step 3:** Filtering out all companies with interest coverage ratio of less than 4. Companies with high interest coverage ratio have a highly reduced bankruptcy risk.
4. **Step 4:** Filtering out all companies with ROE less than 15% since they are earning less than their cost of capital. Companies with high ROE typically have a robust business model that generates increased earnings.
5. **Step 5:** Filtering out all companies with PE ratio greater than 25 since they are too expensive even for a high-quality company. This enables us to pick companies that are relatively cheap compared with their actual value.

Applying these filters reduces, and in some cases eliminates, many fundamental risks while ensuring a robust business model, strong earning potential and a good buying price.
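
The five steps can be sketched as a single check on one company's figures. This is a minimal illustration, not the notebook's implementation: the dictionary keys and the sample figures are made up, and ratios such as debt to equity and ROE are assumed to be stored as fractions (0.30 for 30%).

```python
# Thresholds taken from the five steps above.
MIN_SALES_CR = 250.0       # Step 1: sales of at least Rs 250 cr.
MAX_DEBT_TO_EQUITY = 0.30  # Step 2: debt to equity of at most 30%
MIN_INTEREST_COVER = 4.0   # Step 3: interest coverage ratio of at least 4
MIN_ROE = 0.15             # Step 4: ROE of at least 15%
MAX_PE = 25.0              # Step 5: PE ratio of at most 25


def passes_screen(company):
    """Return True when a company survives all five filters."""
    return (
        company["sales_cr"] >= MIN_SALES_CR
        and company["debt_to_equity"] <= MAX_DEBT_TO_EQUITY
        and company["interest_cover"] >= MIN_INTEREST_COVER
        and company["roe"] >= MIN_ROE
        and company["pe"] <= MAX_PE
    )


# Hypothetical companies with made-up figures, for illustration only.
sample = [
    {"name": "A", "sales_cr": 1200.0, "debt_to_equity": 0.10,
     "interest_cover": 12.0, "roe": 0.22, "pe": 18.0},
    {"name": "B", "sales_cr": 180.0, "debt_to_equity": 0.05,
     "interest_cover": 30.0, "roe": 0.25, "pe": 12.0},   # fails Step 1
    {"name": "C", "sales_cr": 5000.0, "debt_to_equity": 0.00,
     "interest_cover": 90.0, "roe": 0.30, "pe": 40.0},   # fails Step 5
]

survivors = [c["name"] for c in sample if passes_screen(c)]
print(survivors)  # ['A']
```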

## Implementation of the Model.

In [80]:
# importing required libraries

import requests
import re
from bs4 import BeautifulSoup
import pandas as pd
import numpy as np

In [81]:
# Read the data from the csv files

data=pd.read_csv("Companies.csv")
data.head()

| | Company | Link |
|---|---|---|
| 0 | ADANIPORTS | https://www.moneycontrol.com/financials/adanip... |
| 1 | ASIANPAINT | https://www.moneycontrol.com/financials/asianp... |
| 2 | BAJAJ-AUTO | https://www.moneycontrol.com/financials/bajaja... |
| 3 | BAJFINANCE | https://www.moneycontrol.com/financials/bajajf... |
| 4 | BHARTIARTL | https://www.moneycontrol.com/financials/bharti... |


### Analysing the dataset / Data cleaning.

In [82]:
# Checking for null values

data.isnull().sum()

Company    0
Link       0
dtype: int64

In [83]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 41 entries, 0 to 40
Data columns (total 2 columns):
 #   Column   Non-Null Count  Dtype 
---  ------   --------------  ----- 
 0   Company  41 non-null     object
 1   Link     41 non-null     object
dtypes: object(2)
memory usage: 784.0+ bytes


In [84]:
data.describe()

| | Company | Link |
|---|---|---|
| count | 41 | 41 |
| unique | 41 | 41 |
| top | DRREDDY | https://www.moneycontrol.com/financials/bharti... |
| freq | 1 | 1 |


* The data looks clean and ready to be used.

### Extracting the links of the companies from the data.

**Note:** The data is being extracted from [Money Control Website](https://www.moneycontrol.com/)

In [85]:
companynames=list(data.iloc[:,0])
links=list(data.iloc[:,1])

# links for the old-format pages: drop the first 'VI' marker from each URL
linkso=[link.replace('VI','',1) for link in links]

# links for the profit-loss pages: swap 'balance-sheet' for 'profit-loss'
linkso_pl=[link.replace('balance-sheet','profit-loss',1) for link in linkso]

## Implementation of Step - 1 

### Scraping Net Sales
<br>

**Note:** 
* As the details of net sales were not provided, the following banks were excluded from the file:
> HDFC Bank, ICICI Bank, IndusInd Bank, Kotak Mahindra Bank, Yes Bank, SBI
* As the interest cover ratio was not provided, the following were also excluded:
> Bajaj Finserv and Infosys


In [None]:
netsales_list=[]
for i in range(0,len(linkso_pl)):
    pageold=requests.get(linkso_pl[i])
    soup = BeautifulSoup(pageold.text, 'html.parser')
    about=soup.findAll('table')
    about=str(about)
    a=re.findall('<td>Net Sales</td>\n.*</td>',about)
    netsales=re.findall('>[-+]?[0-9].*<',str(a))
    netsales=netsales[0][1:len(netsales[0])-1]
    # strip every thousands separator before converting to float
    netsales_list.append(float(netsales.replace(',','')))

* The Net Sales of each company are stored in netsales_list, in crores.
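
The extraction pattern used above (find the row label, then read the first value cell after it) can be tried offline on a small HTML fragment. The sketch below is a hedged illustration: `extract_row_value` is a hypothetical helper, and `sample_html` is a made-up fragment that assumes the label and value sit in adjacent `<td>` cells, as the regular expression in the notebook suggests. The real page markup may differ.

```python
import re


def extract_row_value(html, label):
    """Return the first numeric cell after the <td>label</td> cell, or None."""
    pattern = (
        r'<td>' + re.escape(label) + r'</td>\s*'
        r'<td>\s*([-+]?[0-9][0-9,]*\.?[0-9]*)\s*</td>'
    )
    match = re.search(pattern, html)
    if match is None:
        return None
    # Remove every thousands separator before converting to float.
    return float(match.group(1).replace(',', ''))


# Made-up fragment for illustration; not copied from the website.
sample_html = """
<table>
<tr><td>Net Sales</td>
<td>1,09,609.42</td><td>98,000.10</td></tr>
<tr><td>Debt Equity Ratio</td>
<td>0.02</td></tr>
</table>
"""

print(extract_row_value(sample_html, 'Net Sales'))          # 109609.42
print(extract_row_value(sample_html, 'Debt Equity Ratio'))  # 0.02
print(extract_row_value(sample_html, 'Interest Cover'))     # None
```

Returning `None` for a missing row lets the caller decide how to treat companies for which a figure is not published.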

In [None]:
# values in cr.
netsales_list

#### Filtering. 
* The companies with less than 250 cr. Net Sales are to be filtered out/removed.

In [None]:
# keep, by position, only the companies whose net sales are not below Rs 250 cr.
keep=[i for i in range(0,len(netsales_list)) if not(netsales_list[i]<250.00)]
netsales_list=[netsales_list[i] for i in keep]
companynames=[companynames[i] for i in keep]
links=[links[i] for i in keep]
linkso=[linkso[i] for i in keep]
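
Because the company names, links and figures live in separate lists, every filter has to keep them aligned. One way to do that is to filter by position rather than by value; this matters in the later steps, where ratio lists can contain repeated values such as 0.0 and `list.remove` would delete the first matching value rather than the intended position. The helper name `filter_parallel` and the figures below are made up for illustration.

```python
def filter_parallel(keep, *lists):
    """Return new lists holding only the positions where keep[i] is True."""
    return [[value for value, k in zip(lst, keep) if k] for lst in lists]


# Made-up figures; note the repeated 0.0 in the debt-equity list.
names = ["A", "B", "C", "D"]
net_sales = [100.0, 5000.0, 300.0, 900.0]
debt_equity = [0.0, 0.5, 0.0, 0.1]

# Step 1 rule: keep companies with net sales of at least Rs 250 cr.
keep = [sales >= 250.0 for sales in net_sales]
names, net_sales, debt_equity = filter_parallel(keep, names, net_sales, debt_equity)

print(names)        # ['B', 'C', 'D']
print(net_sales)    # [5000.0, 300.0, 900.0]
print(debt_equity)  # [0.5, 0.0, 0.1]
```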

## Implementation of Step - 2

### Scraping debt equity Ratio

* **Some of the data to be extracted is in the old page format.**
* **The links to the old-format ratio pages are stored in the linkso_r list.**

In [None]:
#links for ratio page
linkso_r=[]
for i in range(0,len(linkso)):
    a=linkso[i].find('balance-sheet')
    b=linkso[i][:a]+'ratios'+linkso[i][a+13:]
    linkso_r.append(b)
    
debtequity_ratio=[]
for i in range(0,len(linkso_r)):
    pageold=requests.get(linkso_r[i])
    soup = BeautifulSoup(pageold.text, 'html.parser')
    about=soup.findAll('table')
    about=str(about)
    a=re.findall('<td>Debt Equity Ratio</td>\n.*</td>',about)
    der=re.findall('>[-+]?[0-9].*<',str(a))
    if(len(der)==0):
        der=0
    else:
        der=der[0][1:len(der[0])-1]
    debtequity_ratio.append(float(der))


In [None]:
debtequity_ratio

#### Companies with Debt Equity Ratio greater than 0.3 are filtered/removed.

In [None]:
# keep, by position, only the companies whose debt-equity ratio is not above 0.3
keep=[i for i in range(0,len(debtequity_ratio)) if not(debtequity_ratio[i]>0.3)]
debtequity_ratio=[debtequity_ratio[i] for i in keep]
companynames=[companynames[i] for i in keep]
links=[links[i] for i in keep]
linkso=[linkso[i] for i in keep]
netsales_list=[netsales_list[i] for i in keep]


## Implementation of Step - 3

### Scraping Interest Coverage Ratio.

In [None]:
#links for ratio page
linkso_r=[]

for i in range(0,len(linkso)):
    a=linkso[i].find('balance-sheet')
    b=linkso[i][:a]+'ratios'+linkso[i][a+13:]
    linkso_r.append(b)

icr_list=[]

for i in range(0,len(linkso_r)): 
    pageold=requests.get(linkso_r[i])
    soup = BeautifulSoup(pageold.text, 'html.parser')
    about=soup.findAll('table')
    about=str(about)
    a=re.findall('<td>Interest Cover</td>\n.*</td>',about)
    #icr = re.findall("[^a-zA-Z:]([-+]?\d+[\.]?\d*)", str(a))
    icr=re.findall('>[-+]?[0-9].*<',str(a))
    icr=icr[0][1:len(icr[0])-1]
    # strip every thousands separator before converting to float
    icr_list.append(float(icr.replace(',','')))

In [None]:
icr_list

#### Filtering out all companies with interest coverage ratio of less than 4.

In [None]:
# keep, by position, only the companies whose interest coverage ratio is not below 4
keep=[i for i in range(0,len(icr_list)) if not(icr_list[i]<4)]
icr_list=[icr_list[i] for i in keep]
companynames=[companynames[i] for i in keep]
links=[links[i] for i in keep]
linkso=[linkso[i] for i in keep]
netsales_list=[netsales_list[i] for i in keep]
debtequity_ratio=[debtequity_ratio[i] for i in keep]

## Implementation of Step - 4

* ROE stands for **Return on Equity.**
* ROE is calculated by using the formula shown below.

![roe-return-on-equity.png](attachment:roe-return-on-equity.png)

* Net Income can be scraped directly from the website.

**Scraping Total Income**

In [None]:
# links for profit loss page is stored in links_pl
links_pl=[]
for i in range(0,len(links)):
    a=links[i].find('balance-sheet')
    b=links[i][:a]+'profit-loss'+links[i][a+13:]
    links_pl.append(b)

ti_list=[]
for i in range(0,len(links_pl)):
    pageold=requests.get(links_pl[i])
    soup = BeautifulSoup(pageold.text, 'html.parser')
    about=soup.findAll('table')
    about=str(about)
    a=re.findall('<td>Total Revenue</td>\n.*</td>',about)
    ti=re.findall('>[0-9].*<',str(a))
    ti=ti[0][1:len(ti[0])-1]
    # strip every thousands separator before converting to float
    ti_list.append(float(ti.replace(',','')))


* Total Income has been scraped and stored in the ti_list variable.

In [None]:
ti_list

**Scraping Shareholders' Equity**

* Total assets can be extracted directly from the website.
* Total Liabilities is **Total Non-Current Liabilities + Total Current Liabilities.**
* Total Non-Current Liabilities and Total Current Liabilities can be extracted from the website and later used in the formula.
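
Written out, the relationship these bullets describe is the following. It restates the standard ROE definition, with liabilities split the same way as in the bullets:

$$
\text{ROE} = \frac{\text{Net Income}}{\text{Shareholders' Equity}}, \qquad
\text{Shareholders' Equity} = \text{Total Assets} - \left(\text{Total Non-Current Liabilities} + \text{Total Current Liabilities}\right)
$$

As an illustration with made-up figures: with Total Assets of 1000 cr., Total Non-Current Liabilities of 200 cr. and Total Current Liabilities of 300 cr., Shareholders' Equity is 1000 - (200 + 300) = 500 cr. A Net Income of 100 cr. then gives ROE = 100 / 500 = 0.20, or 20%, which would pass the 15% threshold of Step 4. Note that the calculation cell further down uses the scraped "Total Revenue" figure as the numerator.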

In [13]:
tot_assets=[]
current_lia=[]
noncurrent_lia=[]

for i in range(0,len(links)):
    pageold=requests.get(links[i])
    soup = BeautifulSoup(pageold.text, 'html.parser')
    about=soup.findAll('table')
    about=str(about)
    a=re.findall('<td>Total Assets</td>\n.*</td>',about)
    totas=re.findall('>[-+]?[0-9].*<',str(a))
    totas=totas[0][1:len(totas[0])-1]
    # strip every thousands separator before converting to float
    tot_assets.append(float(totas.replace(',','')))
    
    a=re.findall('<td>Total Non-Current Liabilities</td>\n.*</td>',about)
    totncl=re.findall('>[-+]?[0-9].*<',str(a))
    totncl=totncl[0][1:len(totncl[0])-1]
    # strip every thousands separator before converting to float
    noncurrent_lia.append(float(totncl.replace(',','')))
    
    a=re.findall('<td>Total Current Liabilities</td>\n.*</td>',about)
    totcl=re.findall('>[-+]?[0-9].*<',str(a))
    totcl=totcl[0][1:len(totcl[0])-1]
    # strip every thousands separator before converting to float
    current_lia.append(float(totcl.replace(',','')))
    


**Calculating ROE from the scraped data.**

In [None]:
roe_list=np.array(ti_list)/(np.array(tot_assets)-np.array(current_lia)-np.array(noncurrent_lia))
roe_list=roe_list.tolist()

### Filtering ROE values.
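
No filtering cell is shown for this step. A minimal sketch of what it could look like, following the Step 4 rule (drop companies with ROE below 15%) and assuming ROE is stored as a fraction as in roe_list, is given below. The function name and the demo lists are hypothetical; in the notebook, the parallel lists would be companynames, links, linkso, netsales_list, debtequity_ratio, icr_list and roe_list.

```python
ROE_MIN = 0.15  # Step 4 threshold: 15% expressed as a fraction


def filter_by_roe(rows, roe_index, minimum=ROE_MIN):
    """Keep only the rows (tuples) whose ROE is at least `minimum`."""
    return [row for row in rows if row[roe_index] >= minimum]


# Made-up values for illustration only.
companynames_demo = ["A", "B", "C"]
roe_demo = [0.22, 0.09, 0.15]

# Zip the parallel lists into rows, filter, then unzip back into lists.
rows = list(zip(companynames_demo, roe_demo))
kept = filter_by_roe(rows, roe_index=1)

# Unzipping assumes at least one row survived the filter.
companynames_demo, roe_demo = (list(column) for column in zip(*kept))

print(companynames_demo)  # ['A', 'C']
print(roe_demo)           # [0.22, 0.15]
```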

## Implementation of Step - 5

In [14]:
''' Scraping for PE'''

links_r=[]
for i in range(0,len(links)):
    a=links[i].find('balance-sheet')
    b=links[i][:a]+'ratios'+links[i][a+13:]
    links_r.append(b)

pricbv=[]
bkvshr=[]
for i in range(0,len(links_r)):
    pageold=requests.get(links_r[i])
    soup = BeautifulSoup(pageold.text, 'html.parser')
    about=soup.findAll('table')
    about=str(about)
    a=re.findall('<td>Price/BV.*</td>\n.*</td>',about)
    pbv=re.findall('>[-+]?[0-9].*<',str(a))
    pbv=pbv[0][1:len(pbv[0])-1]
    # strip every thousands separator before converting to float
    pricbv.append(float(pbv.replace(',','')))
        
    a=re.findall('<td>Book Value.*/Share.*</td>\n.*</td>',about)
    a=a[0]
    pbvs=re.findall('>[-+]?[0-9].*<',str(a))
    pbvs=pbvs[0][1:len(pbvs[0])-1]
    # strip every thousands separator before converting to float
    bkvshr.append(float(pbvs.replace(',','')))

eps_list=[]
for i in range(0,len(links_pl)):
    pageold=requests.get(links_pl[i])
    soup = BeautifulSoup(pageold.text, 'html.parser')
    about=soup.findAll('table')
    about=str(about)
    a=re.findall('<td>Basic EPS.*</td>\n.*</td>',about)
    eps=re.findall('>[-+]?[0-9].*<',str(a))
    eps=eps[0][1:len(eps[0])-1]
    # strip every thousands separator before converting to float
    eps_list.append(float(eps.replace(',','')))


pe=np.array(bkvshr)*np.array(pricbv)/np.array(eps_list)
pe=pe.tolist()
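
The P/E ratio above is not scraped directly. It is rebuilt from three scraped figures: Price/BV multiplied by book value per share gives the share price, and dividing that by basic EPS gives P/E. A small sketch of the same arithmetic for a single company follows; the function name, the zero-EPS guard and the figures are illustrative assumptions, not part of the notebook.

```python
def pe_ratio(price_to_book, book_value_per_share, eps):
    """P/E rebuilt from Price/BV, book value per share and EPS.

    Returns None when EPS is zero, since the ratio is undefined there.
    """
    if eps == 0:
        return None
    price = price_to_book * book_value_per_share  # implied share price
    return price / eps


# Made-up figures: Price/BV of 4, book value of Rs 250 per share, EPS of Rs 50.
print(pe_ratio(4.0, 250.0, 50.0))  # 20.0
print(pe_ratio(4.0, 250.0, 0.0))   # None
```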

In [15]:
''' Filter PE values '''
# keep, by position, only the companies whose P/E ratio is not above 25
keep=[i for i in range(0,len(pe)) if not(pe[i]>25)]
netsales_list=[netsales_list[i] for i in keep]
debtequity_ratio=[debtequity_ratio[i] for i in keep]
icr_list=[icr_list[i] for i in keep]
roe_list=[roe_list[i] for i in keep]
pe=[pe[i] for i in keep]
companynames=[companynames[i] for i in keep]
links=[links[i] for i in keep]

In [16]:
final=list(zip(companynames,netsales_list,debtequity_ratio,icr_list,roe_list,pe))

filtered_list=pd.DataFrame(final,columns=['Company','Net Sales in cr.','Debt to Equity Ratio','Interest Coverage Ratio','Return On Equity (ROE)','P/E Ratio'])


In [17]:
print(filtered_list)

       Company  Net Sales in cr.  Debt to Equity Ratio  \
0   BAJAJ-AUTO          30249.96                  0.02   
1        CIPLA          12374.01                  0.00   
2    COALINDIA            934.30                  0.00   
3         GAIL          75126.30                  0.00   
4      HCLTECH          26012.00                  0.00   
5   HEROMOTOCO          33650.54                  0.07   
6     INFRATEL           6821.70                  0.00   
7          M&M          53614.00                  0.11   
8         ONGC         109609.42                  0.00   
9        TECHM          27219.60                  0.00   
10       WIPRO          48123.80                  0.10   

    Interest Coverage Ratio  Return On Equity (ROE)  P/E Ratio  
0                   1420.90                1.464620  18.024956  
1                    147.90                0.820657  22.553348  
2                    586.50                0.820364  14.040622  
3                     68.93                