# Value Investing Program
# Introduction

Inspired by Sean Seah Book -- Gone Fishing with Warren Buffetthttp://www.aceprofitsacademy.com/wp-content/uploads/2016/09/Gone-Fishing-with-Buffett.pdf

In here we are going to try to scrape financial data:
Input: List of the companies

Web scraping: 
* Find the shareprice by year and the following metrics:
    * EPS
    * ROE
    * ROA
    * Long term debt
    * Total Income
    * Debt to Equity
    * Interest Coverage Ratio

Methods:
* Given list of the companies, find out the feasibility to invest
    * Been in market minimal 10 years
    * Have the track records (EPS per year)
    * Have efficiency (ROE > 15%) -- Net income / shareholder equity
    * Determine manipulation (ROA > 7%) -- Net income / Total Asset
    * Have small long term debt (Long term debt <5* total income)
    * Low Debt to Equity
    * Ability to pay interest: (Interest Coverage Ratio >3) -- EBIT / Interest expenses

Outputs:
* Ranking of each company in terms of return rate given the value investing methodology
    * Find EPS Annual Compounded Growth Rate
    * Estimate EPS 10 years from now
    * Estimate stock price 10 years from now (Stock Price EPS * Average PE)
    * Determine target by price today based on returns(discount rate 15%/20%)
    * Add margin of safety (Safety net 15%)

Additional:
* Qualitative Assessment of the companies
    * Advantages in business (product differentiation, branding, low price producer, high switching cost, legal barriers to entry)
    * Ability of foolhardy management (even a fool can run)
    * Avoid price competitive business    

# Web scraping Russell Data Using Beautiful Soup

In [15]:
# Reading russel stocks plans and getting the 2000 stocks
dfrussel = pd.read_html('http://www.beatthemarketanalyzer.com/blog/russell-2000-stock-tickers-list/')[0];

In [24]:
dfrussel.columns=['tickers','names']

In [29]:
tickersrussel = dfrussel.tickers
tickersshortrussel = tickers[:10]
tickersmediumrussel = tickers[:50]
tickersmediumlargerussel = tickers[:300]

## Scraping Wikipedia SP500 Data Using Beautiful Soup

In [84]:
import bs4 as bs
import pickle
import requests
import pandas as pd

# This will keep tickers + gics industries & sub industries
def save_sp500_stocks_info():
    resp = requests.get('http://en.wikipedia.org/wiki/List_of_S%26P_500_companies')
    soup = bs.BeautifulSoup(resp.text, 'lxml')
    table = soup.find('table', {'class': 'wikitable sortable'})
    stocks_info=[]
    tickers = []
    securities = []
    gics_industries = []
    gics_sub_industries = []
    for row in table.findAll('tr')[1:]:
        ticker = row.findAll('td')[0].text
        security = row.findAll('td')[1].text
        gics_industry = row.findAll('td')[3].text
        gics_sub_industry = row.findAll('td')[4].text

        tickers.append(ticker.lower())
        securities.append(security)
        gics_industries.append(gics_industry.lower())
        gics_sub_industries.append(gics_sub_industry.lower())
    
    stocks_info.append(tickers)
    stocks_info.append(securities)
    stocks_info.append(gics_industries)
    stocks_info.append(gics_sub_industries)
    return stocks_info

stocks_info = save_sp500_stocks_info()
stocks_info_df = pd.DataFrame(stocks_info).T
stocks_info_df.columns=['tickers','security','gics_industry','gics_sub_industry']
stocks_info_df.set_index('tickers',inplace=True)

# Extract just the tickers list
tickers= stocks_info[0]

In [85]:
stocks_info_df['labels'] = stocks_info_df[['security', 'gics_industry','gics_sub_industry']].apply(lambda x: ' '.join(x), axis=1)

In [86]:
stocks_info_df

Unnamed: 0_level_0,security,gics_industry,gics_sub_industry,labels
tickers,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
mmm,3M Company,industrials,industrial conglomerates,3M Company industrials industrial conglomerates
abt,Abbott Laboratories,health care,health care equipment,Abbott Laboratories health care health care eq...
abbv,AbbVie Inc.,health care,pharmaceuticals,AbbVie Inc. health care pharmaceuticals
acn,Accenture plc,information technology,it consulting & other services,Accenture plc information technology it consul...
atvi,Activision Blizzard,information technology,home entertainment software,Activision Blizzard information technology hom...
ayi,Acuity Brands Inc,industrials,electrical components & equipment,Acuity Brands Inc industrials electrical compo...
adbe,Adobe Systems Inc,information technology,application software,Adobe Systems Inc information technology appli...
amd,Advanced Micro Devices Inc,information technology,semiconductors,Advanced Micro Devices Inc information technol...
aap,Advance Auto Parts,consumer discretionary,automotive retail,Advance Auto Parts consumer discretionary auto...
aes,AES Corp,utilities,independent power producers & energy traders,AES Corp utilities independent power producers...


In [87]:
# Put into csv
stocks_info_df.to_csv('D:\Investment\SP500andData.csv')

In [88]:
# Create a list of dict based on tickers and labels
dictlist = []
for index, row in stocks_info_df.iterrows():
    dictlist.append({'value':index, 'label':row['labels']})

In [89]:
dictlist

[{'label': u'3M Company industrials industrial conglomerates',
  'value': u'mmm'},
 {'label': u'Abbott Laboratories health care health care equipment',
  'value': u'abt'},
 {'label': u'AbbVie Inc. health care pharmaceuticals', 'value': u'abbv'},
 {'label': u'Accenture plc information technology it consulting & other services',
  'value': u'acn'},
 {'label': u'Activision Blizzard information technology home entertainment software',
  'value': u'atvi'},
 {'label': u'Acuity Brands Inc industrials electrical components & equipment',
  'value': u'ayi'},
 {'label': u'Adobe Systems Inc information technology application software',
  'value': u'adbe'},
 {'label': u'Advanced Micro Devices Inc information technology semiconductors',
  'value': u'amd'},
 {'label': u'Advance Auto Parts consumer discretionary automotive retail',
  'value': u'aap'},
 {'label': u'AES Corp utilities independent power producers & energy traders',
  'value': u'aes'},
 {'label': u'Aetna Inc health care managed health car

In [90]:
tickersnotfound= [u'aet',
u'amg',
 u'afl',
 u'are',
 u'all',
 u'axp',
 u'aig',
 u'amt',
 u'amp',
 u'abc',
 u'antm',
 u'aon',
 u'aiv',u'ajg', u'aiz', u'avb',u'bac',
 u'bk',
 u'bbt',
 u'blk',
 u'bxp',
 u'cof',
 u'cboe',
 u'cnc',
 u'schw',
 u'chtr',
 u'chk',
 u'cb',
 u'ci',
 u'cinf',
 u'c',
 u'cfg',
 u'cme',
 u'cma',
 u'cxo',
 u'cci',
 u'xray',
 u'dvn',
 u'dlr',
 u'dfs',
 u'dxc',
 u'etfc',
 u'ebay',
 u'ea',
 u'eqix',
 u'eqr',
 u'ess',
 u're',
 u'exr',
 u'fb',
 u'frt',
 u'fitb',
 u'fe',
 u'ben',
 u'ggp',
 u'gs',
 u'hig',
 u'hcp',
 u'holx',
 u'hst',
 u'hum',
 u'hban',
 u'info',
 u'ice',
 u'incy',
 u'ivz',
 u'irm',
 u'jci',
 u'jpm',
 u'key',
 u'kim',
 u'lnc',
 u'l',
 u'mtb',
 u'mac',
 u'mmc',
 u'met',
 u'maa','ms','ndac','navi',
 u'ntrs',
 u'nrg',
 u'oxy',
 u'pbct',
 u'prgo',
 u'pnc',
 u'pfg']

tickers = [x for x in tickers if x not in tickersnotfound]
tickersshort = tickers[:10]
tickersmedium = tickers[:50]
tickersmediumlarge = tickers[:300]

## Scraping marketwatch Data Using Beautiful Soup

### Formatting all the values to numerical

In [91]:
def format(list):
    newlist=[]
    posornegnumber = 1
    for text in list:
        if text.endswith(')'):
            text = text[1:-1] # remove the parentheses
            posornegnumber = -1
            
        if text.endswith('%'):
#             Then please make it into comma float
            endtext = float(text[:-1])/100.0 * posornegnumber 
        elif text.endswith('B'):
#             Then please times 1000000000
#             Change it into integer
            endtext = int(float(text[:-1])*1000000000)* posornegnumber 
        elif text.endswith('M'):
#             Then please times 1000000
#             Change it into integer
            endtext = int(float(text[:-1])*1000000)* posornegnumber 
        elif ',' in text:
#             Then please remove the ,
#             Then change it into int
            endtext = int(float(text.replace(",","")))* posornegnumber 

        elif text.endswith('-'):
#             Insert 0
            endtext = 0
        else:
#             change to float
            endtext = float(text)* posornegnumber 
        newlist.append(endtext)
    return newlist   

### Extracting Financial Reporting (Balance Sheet and Income Statement)

In [92]:
%%time
import pandas as pd
from urllib import urlopen
from bs4 import BeautifulSoup

dflist = []
tickersnotfound =[] 
counter = 0

for ticker in tickersmediumlarge: 
    try:
        urlfinancials = 'http://www.marketwatch.com/investing/stock/'+ticker+'/financials'
        urlbalancesheet = 'http://www.marketwatch.com/investing/stock/'+ticker+'/financials/balance-sheet'

        text_soup_financials = BeautifulSoup(urlopen(urlfinancials).read()) #read in
        text_soup_balancesheet = BeautifulSoup(urlopen(urlbalancesheet).read()) #read in

        # Income statement
        titlesfinancials = text_soup_financials.findAll('td', {'class': 'rowTitle'})
        epslist=[]
        netincomelist = []
        longtermdebtlist = [] 
        interestexpenselist = []
        ebitdalist= []

        for title in titlesfinancials:
            if 'EPS (Basic)' in title.text:
                epslist.append ([td.text for td in title.findNextSiblings(attrs={'class': 'valueCell'}) if td.text])
            if 'Net Income' in title.text:
                netincomelist.append ([td.text for td in title.findNextSiblings(attrs={'class': 'valueCell'}) if td.text])
            if 'Interest Expense' in title.text:
                interestexpenselist.append ([td.text for td in title.findNextSiblings(attrs={'class': 'valueCell'}) if td.text])
            if 'EBITDA' in title.text:
                ebitdalist.append ([td.text for td in title.findNextSiblings(attrs={'class': 'valueCell'}) if td.text])


        # Balance sheet
        titlesbalancesheet = text_soup_balancesheet.findAll('td', {'class': 'rowTitle'})
        equitylist=[]
        for title in titlesbalancesheet:
            if 'Total Shareholders\' Equity' in title.text:
                equitylist.append( [td.text for td in title.findNextSiblings(attrs={'class': 'valueCell'}) if td.text])
            if 'Long-Term Debt' in title.text:
                longtermdebtlist.append( [td.text for td in title.findNextSiblings(attrs={'class': 'valueCell'}) if td.text])

        # Variables        
        eps = epslist[0]
        epsgrowth = epslist[1]
        netincome = netincomelist[0]
        shareholderequity = equitylist[0]
        roa = equitylist[1]

        longtermdebt = longtermdebtlist[0]
        interestexpense = interestexpenselist[0]
        ebitda = ebitdalist[0]
        # Don't forget to add in roe, interest coverage ratio

        ## Make it into Dataframes
        df= pd.DataFrame({'eps': eps,'epsgrowth': epsgrowth,'netincome': netincome,'shareholderequity': shareholderequity,'roa': 
                      roa,'longtermdebt': longtermdebt,'interestexpense': interestexpense,'ebitda': ebitda},index=[2012,2013,2014,2015,2016])

        # Format all the number in dataframe
        dfformatted = df.apply(format)

        # Adding roe, interest coverage ratio
        dfformatted['roe'] = dfformatted.netincome/dfformatted.shareholderequity
        dfformatted['interestcoverageratio'] = dfformatted.ebitda/dfformatted.interestexpense

    #     Insert ticker and df
        dflist.append((ticker,dfformatted))
        
        counter+=1
        print(ticker, ' has been processed')
    except:
        tickersnotfound.append(ticker)
        print(ticker,' ticker is not found')

(u'ms', ' ticker is not found')
(u'ndaq', ' ticker is not found')
(u'navi', ' ticker is not found')
(u'ntrs', ' ticker is not found')
(u'nrg', ' ticker is not found')
(u'oxy', ' ticker is not found')
(u'pbct', ' ticker is not found')
(u'prgo', ' ticker is not found')
(u'pnc', ' ticker is not found')
(u'pfg', ' ticker is not found')
Wall time: 12min 9s


In [118]:
len(dflist)

290

In [120]:
dflist[0][1].reset_index()

Unnamed: 0,index,ebitda,eps,epsgrowth,interestexpense,longtermdebt,netincome,roa,shareholderequity,roe,interestcoverageratio
0,2012,7760000000,6.4,0.0,221000000,4990000000,4510000000,0.5188,17580000000,0.256542,35.113122
1,2013,8029999999,6.83,0.0672,0,4380000000,4720000000,0.5217,17500000000,0.269714,inf
2,2014,8529999999,7.63,0.1171,101000000,6790000000,5000000000,0.4192,13110000000,0.381388,84.455446
3,2015,8310000000,7.73,0.0125,229000000,8800000000,4840000000,0.3476,11430000000,0.423447,36.28821
4,2016,8490000000,8.35,0.0809,207000000,10720000000,5060000000,0.313,10300000000,0.491262,41.014493


In [121]:
len(dflist)

290

## Time
40 seconds for 10 tickers

3 minutes for 50 tickers


## Determining legibility
Find whether this particular stocks is legitimate using this and filter accordingly
    1. EPS increases over the year (consistent)
    2. ROE > 0.15
    3. ROA > 0.07 (also consider debt to equity cause Assets = liabilities + equity)
    4. Long term debt < 5 * income
    5. Interest Coverage Ratio > 3

In [122]:
def eligibilitycheck(df):
    ticker,dfformatted = df
    
    legiblestock = True
    reasonlist=[]

    # EPS increases over the year (consistent)
#     Counting 2 or more negative growth
    countnegativegrowth =0
    for growth in dfformatted.epsgrowth:
        if growth<0:
            countnegativegrowth+=1
        if countnegativegrowth>=2:
            legiblestock = False
            reasonlist.append('there are 2 negative growth '+str(growth))
            break
    # ROE > 0.15
    if dfformatted.roe.mean()<0.15:
            legiblestock = False
            reasonlist.append('roe mean is less than 0.13 '+ str(dfformatted.roe.mean()))
    # ROA > 0.07 (also consider debt to equity cause Assets = liabilities + equity)
    if dfformatted.roa.mean()<0.07:
            legiblestock = False
            reasonlist.append('roa mean is less than 0.07 ' + str(dfformatted.roa.mean()))
    # Long term debt < 5 * income
    if dfformatted.longtermdebt.tail(1).values[0]>5*dfformatted.netincome.tail(1).values[0]:
            legiblestock = False
            reasonlist.append('longtermdebt is 5 times the netincome ')
    # Interest Coverage Ratio > 3
    if dfformatted.interestcoverageratio.tail(1).values[0]<3:
            legiblestock = False
            reasonlist.append('Interestcoverageratio is less than 3 ')
#     print ticker,legiblestock,reasonlist
    return ticker,legiblestock

In [123]:
dflist[:2]

[(u'mmm',
            ebitda   eps  epsgrowth  interestexpense  longtermdebt   netincome  \
  2012  7760000000  6.40     0.0000        221000000    4990000000  4510000000   
  2013  8029999999  6.83     0.0672                0    4380000000  4720000000   
  2014  8529999999  7.63     0.1171        101000000    6790000000  5000000000   
  2015  8310000000  7.73     0.0125        229000000    8800000000  4840000000   
  2016  8490000000  8.35     0.0809        207000000   10720000000  5060000000   
  
           roa  shareholderequity       roe  interestcoverageratio  
  2012  0.5188        17580000000  0.256542              35.113122  
  2013  0.5217        17500000000  0.269714                    inf  
  2014  0.4192        13110000000  0.381388              84.455446  
  2015  0.3476        11430000000  0.423447              36.288210  
  2016  0.3130        10300000000  0.491262              41.014493  ),
 (u'abt',
            ebitda   eps  epsgrowth  interestexpense  longtermdebt   

In [124]:
selectiondflist = []
for df in dflist:
    if eligibilitycheck(df)[1]:
        selectiondflist.append(df)

In [125]:
len(dflist)

290

In [126]:
len(selectiondflist)

69

In [127]:
# What are the tickers of these?
tickersselections = [x[0] for x in selectiondflist]

In [128]:
tickersselections

[u'mmm',
 u'acn',
 u'ayi',
 u'alk',
 u'alb',
 u'algn',
 u'mo',
 u'ame',
 u'amgn',
 u'aph',
 u'an',
 u'avy',
 u'bbby',
 u'biib',
 u'ba',
 u'bf.b',
 u'chrw',
 u'cah',
 u'cbg',
 u'cmg',
 u'chd',
 u'ctas',
 u'csco',
 u'ctsh',
 u'cost',
 u'dlph',
 u'dg',
 u'dd',
 u'ecl',
 u'ew',
 u'emr',
 u'efx',
 u'el',
 u'expd',
 u'ffiv',
 u'fast',
 u'fisv',
 u'fl',
 u'ftv',
 u'gpc',
 u'gild',
 u'gt',
 u'gww',
 u'has',
 u'hsic',
 u'hsy',
 u'hd',
 u'hon',
 u'hrl',
 u'iff',
 u'intu',
 u'isrg',
 u'jbht',
 u'jnj',
 u'lly',
 u'low',
 u'lyb',
 u'ma',
 u'mtd',
 u'kors',
 u'msft',
 u'mnst',
 u'noc',
 u'orly',
 u'omc',
 u'payx',
 u'pypl',
 u'pcln',
 u'pg']

## Scraping for latest shareprice using selectiondflist

In [129]:
import pandas as pd
import datetime
import pandas_datareader.data as web
from pandas import Series, DataFrame


days_per_year = 365.24

# start = datetime.datetime.now()-datetime.timedelta(days=(5*days_per_year))
start = datetime.datetime.now()-datetime.timedelta(days=2)
end = datetime.datetime.now()

# To pull individual stock
# df = web.DataReader("AAPL", 'google', start, end)
# df.tail()

# To pull group stocks
dfcomp = web.DataReader(tickersselections,'google',
                               start=start, 
                               end=end)['Close']

In [130]:
dfcomp

Unnamed: 0_level_0,acn,alb,algn,alk,ame,amgn,an,aph,avy,ayi,...,mo,msft,mtd,noc,omc,orly,payx,pcln,pg,pypl
Date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
2017-07-14,126.7,117.57,154.3,92.63,61.9,177.13,42.01,74.55,94.02,204.7,...,73.93,72.78,604.76,264.83,81.28,184.22,57.1,1950.87,87.1,57.16


# Using selectiondflist to calculate stocks price value

Outputs:
1. Ranking of each company in terms of return rate given the value investing methodology
    a. Find EPS Annual Compounded Growth Rate
    b. Estimate EPS 10 years from now
    c. Estimate stock price 10 years from now (Stock Price EPS * Average PE)
    d. Determine target by price today based on returns(discount rate 15%/20%)
    e. Add margin of safety (Safety net 15%)


In [131]:
selectiondflist

[(u'mmm',
            ebitda   eps  epsgrowth  interestexpense  longtermdebt   netincome  \
  2012  7760000000  6.40     0.0000        221000000    4990000000  4510000000   
  2013  8029999999  6.83     0.0672                0    4380000000  4720000000   
  2014  8529999999  7.63     0.1171        101000000    6790000000  5000000000   
  2015  8310000000  7.73     0.0125        229000000    8800000000  4840000000   
  2016  8490000000  8.35     0.0809        207000000   10720000000  5060000000   
  
           roa  shareholderequity       roe  interestcoverageratio  
  2012  0.5188        17580000000  0.256542              35.113122  
  2013  0.5217        17500000000  0.269714                    inf  
  2014  0.4192        13110000000  0.381388              84.455446  
  2015  0.3476        11430000000  0.423447              36.288210  
  2016  0.3130        10300000000  0.491262              41.014493  ),
 (u'acn',
            ebitda   eps  epsgrowth  interestexpense  longtermdebt   

In [132]:
import numpy as np
dfprice = pd.DataFrame(columns =['ticker','annualgrowthrate','lasteps','futureeps'])
i=0
for tuple in selectiondflist:
    ticker, df = tuple
    
    # Find EPS Annual Compounded Growth Rate
    annualgrowthrate =  df.epsgrowth.mean() #growth rate
    
    # Estimate stock price 10 years from now (Stock Price EPS * Average PE)
    lasteps = df.eps.tail(1).values[0] #presentvalue
    years  = 10 #period
    
    futureeps = abs(np.fv(annualgrowthrate,years,0,lasteps))
        
    dfprice.loc[i] = [ticker,annualgrowthrate,lasteps,futureeps]
    i+=1
    
dfprice.set_index('ticker',inplace=True)

In [133]:
dfprice['lastshareprice']=dfcomp.tail().T
dfprice['peratio'] = dfprice['lastshareprice']/dfprice['lasteps']
dfprice['futureshareprice'] = dfprice['futureeps']*dfprice['peratio']

In [134]:
discountrate = 0.2
margin = 0.15

dfprice['presentshareprice'] = abs(np.pv(discountrate,years,0,fv=dfprice['futureshareprice']))
dfprice['marginalizedprice'] = dfprice['presentshareprice']*(1-0.15) 

In [135]:
dfprice

Unnamed: 0_level_0,annualgrowthrate,lasteps,futureeps,lastshareprice,peratio,futureshareprice,presentshareprice,marginalizedprice
ticker,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1
mmm,0.05554,8.35,14.336180,211.77,25.361677,363.589551,58.721742,49.913481
acn,0.11874,6.58,20.207731,126.70,19.255319,389.106314,62.842842,53.416416
ayi,0.20242,6.68,42.202519,204.70,30.643713,1293.241867,208.865782,177.535914
alk,0.26626,6.59,69.841619,92.63,14.056146,981.703965,158.550671,134.768070
alb,0.12390,4.53,14.567161,117.57,25.953642,378.070882,61.060558,51.901474
algn,0.33502,2.38,42.801075,154.30,64.831933,2774.876413,448.158033,380.934328
mo,0.39988,7.28,210.396964,73.93,10.155220,2136.627406,345.077255,293.315666
ame,0.03336,2.20,3.054493,61.90,28.136364,85.942328,13.880166,11.798141
amgn,0.13682,10.32,37.204610,177.13,17.163760,638.570979,103.132778,87.662862
aph,0.09334,2.67,6.517239,74.55,27.921348,181.970094,29.389186,24.980808


In [136]:
stocks_info_df

Unnamed: 0_level_0,security,gics_industry,gics_sub_industry,labels
tickers,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
mmm,3M Company,industrials,industrial conglomerates,3M Company industrials industrial conglomerates
abt,Abbott Laboratories,health care,health care equipment,Abbott Laboratories health care health care eq...
abbv,AbbVie Inc.,health care,pharmaceuticals,AbbVie Inc. health care pharmaceuticals
acn,Accenture plc,information technology,it consulting & other services,Accenture plc information technology it consul...
atvi,Activision Blizzard,information technology,home entertainment software,Activision Blizzard information technology hom...
ayi,Acuity Brands Inc,industrials,electrical components & equipment,Acuity Brands Inc industrials electrical compo...
adbe,Adobe Systems Inc,information technology,application software,Adobe Systems Inc information technology appli...
amd,Advanced Micro Devices Inc,information technology,semiconductors,Advanced Micro Devices Inc information technol...
aap,Advance Auto Parts,consumer discretionary,automotive retail,Advance Auto Parts consumer discretionary auto...
aes,AES Corp,utilities,independent power producers & energy traders,AES Corp utilities independent power producers...


In [137]:
mergestocksdf = pd.merge(stocks_info_df, dfprice, how='inner', left_index=True,right_index=True)

In [138]:
mergestocksdf['Buy'] = mergestocksdf.lastshareprice<=mergestocksdf.marginalizedprice

In [139]:
mergestocksdf.head()

Unnamed: 0,security,gics_industry,gics_sub_industry,labels,annualgrowthrate,lasteps,futureeps,lastshareprice,peratio,futureshareprice,presentshareprice,marginalizedprice,Buy
mmm,3M Company,industrials,industrial conglomerates,3M Company industrials industrial conglomerates,0.05554,8.35,14.33618,211.77,25.361677,363.589551,58.721742,49.913481,False
acn,Accenture plc,information technology,it consulting & other services,Accenture plc information technology it consul...,0.11874,6.58,20.207731,126.7,19.255319,389.106314,62.842842,53.416416,False
ayi,Acuity Brands Inc,industrials,electrical components & equipment,Acuity Brands Inc industrials electrical compo...,0.20242,6.68,42.202519,204.7,30.643713,1293.241867,208.865782,177.535914,False
alk,Alaska Air Group Inc,industrials,airlines,Alaska Air Group Inc industrials airlines,0.26626,6.59,69.841619,92.63,14.056146,981.703965,158.550671,134.76807,True
alb,Albemarle Corp,materials,specialty chemicals,Albemarle Corp materials specialty chemicals,0.1239,4.53,14.567161,117.57,25.953642,378.070882,61.060558,51.901474,False


In [140]:
mergestocksdf.to_csv('D:\Investment\stocksanalysis.csv')

In [141]:
mergestocksdf[mergestocksdf.Buy==True].sort_values('marginalizedprice',ascending=True)

Unnamed: 0,security,gics_industry,gics_sub_industry,labels,annualgrowthrate,lasteps,futureeps,lastshareprice,peratio,futureshareprice,presentshareprice,marginalizedprice,Buy
alk,Alaska Air Group Inc,industrials,airlines,Alaska Air Group Inc industrials airlines,0.26626,6.59,69.841619,92.63,14.056146,981.703965,158.550671,134.76807,True
pypl,PayPal,information technology,data processing & outsourced services,PayPal information technology data processing ...,0.34664,1.16,22.749565,57.16,49.275862,1121.004418,181.048472,153.891201,True
ew,Edwards Lifesciences,health care,health care equipment,Edwards Lifesciences health care health care e...,0.2624,2.67,27.44613,116.92,43.790262,1201.873239,194.109238,164.992852,True
mo,Altria Group Inc,consumer staples,tobacco,Altria Group Inc consumer staples tobacco,0.39988,7.28,210.396964,73.93,10.15522,2136.627406,345.077255,293.315666,True
cah,Cardinal Health Inc.,health care,health care distributors,Cardinal Health Inc. health care health care d...,0.41348,4.36,138.797987,78.11,17.915138,2486.585038,401.597366,341.357761,True
biib,Biogen Inc.,health care,biotechnology,Biogen Inc. health care biotechnology,0.2552,16.95,164.550425,280.81,16.566962,2726.100581,440.280463,374.238394,True
algn,Align Technology,health care,health care supplies,Align Technology health care health care supplies,0.33502,2.38,42.801075,154.3,64.831933,2774.876413,448.158033,380.934328,True
gild,Gilead Sciences,health care,biotechnology,Gilead Sciences health care biotechnology,0.7003,10.08,2035.710798,70.57,7.000992,14251.995141,2301.776783,1956.510265,True
ftv,Fortive Corp,industrials,industrial machinery,Fortive Corp industrials industrial machinery,0.88332,2.52,1414.626291,64.56,25.619048,36241.378306,5853.184928,4975.207189,True
gt,Goodyear Tire & Rubber,consumer discretionary,tires & rubber,Goodyear Tire & Rubber consumer discretionary ...,1.46786,4.81,40304.196238,36.27,7.540541,303915.425684,49084.037974,41721.432278,True


In [117]:
dfprice.reset_index().to_csv('D:\Investment\russel\russelanalysis.csv')

IOError: [Errno 22] invalid mode ('w') or filename: 'D:\\Investment\russel\russelanalysis.csv'