<a id='top'></a>
# How to slice and dice the data
Below is a series of examples showing how to slice and dice the data stored in the *.sqlite* file generated by the [MorningStar.com](https://www.morningstar.com) scraper.

##### NOTE: 
- The data used by the code below comes from the *.sqlite* file that the scraper generates once it has been installed and run locally on your machine. See the [README]() for instructions on how to install and run the scraper.
- Navigation links, such as those in the list of contents below and elsewhere throughout this document, only work if you view this document in [Jupyter](https://jupyter.org/).


**Content** 

1. [Required modules and matplotlib backend](#modules)
1. [Creating a master (bridge table) DataFrame instance using the DataFrames class](#master)
1. [Creating DataFrame instances with the dataframes methods](#methods)
1. [Data statistics and sample code](#stats)
1. [Applying value investing criteria to filter common stocks](#value) *(in progress)*
1. [Additional sample / test code](#additional) *(in progress)*

<a id="modules"></a>
## Required modules and matplotlib backend

In [1]:
%matplotlib notebook

In [2]:
import matplotlib.pyplot as plt
import matplotlib

In [3]:
from importlib import reload
import pandas as pd
import numpy as np

# Import dataframes module from project folder.
# This module contains a class that reads the database tables and assigns the data to pandas.DataFrame objects
import dataframes
reload(dataframes) #reload if changes have been made to module file

<module 'dataframes' from '/home/cbrandao/lib/python/msTables/dataframes.py'>

[return to the top](#top)
<a id="master"></a>
## Creating a master DataFrame instance using the DataFrames class
The `DataFrames` class is part of the [dataframes module](dataframes.py).

In [4]:
db_file_name = 'mstables2' # Change the file name here as needed
df = dataframes.DataFrames('db/{}.sqlite'.format(db_file_name))

Creating intial DataFrames from file db/mstables2.sqlite...
Creating DataFrame 'colheaders' ...
Creating DataFrame 'timerefs' ...
Creating DataFrame 'urls' ...
Creating DataFrame 'securitytypes' ...
Creating DataFrame 'tickers' ...
Creating DataFrame 'sectors' ...
Creating DataFrame 'industries' ...
Creating DataFrame 'stockstyles' ...
Creating DataFrame 'exchanges' ...
Creating DataFrame 'countries' ...
Creating DataFrame 'companies' ...
Creating DataFrame 'currencies' ...
Creating DataFrame 'stocktypes' ...
Creating DataFrame 'master' ...
Initial DataFrames created.
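
The log above lists one DataFrame per database table. As a rough illustration of that idea (not the actual `DataFrames` implementation), the sketch below loads every table of a SQLite database into a dictionary of `pandas.DataFrame` objects. The in-memory database, the table names `Tickers` and `Exchanges`, and the `frames` dictionary are assumptions made for this example only.

```python
import sqlite3
import pandas as pd

# Hypothetical in-memory database standing in for the scraper's .sqlite file
conn = sqlite3.connect(':memory:')
conn.executescript("""
CREATE TABLE Tickers (id INTEGER PRIMARY KEY, ticker TEXT);
INSERT INTO Tickers VALUES (1, 'OGCP'), (2, 'FISK');
CREATE TABLE Exchanges (id INTEGER PRIMARY KEY, exchange_sym TEXT);
INSERT INTO Exchanges VALUES (374, 'ARCX');
""")

# List the tables, then read each one into its own DataFrame
table_names = pd.read_sql_query(
    "SELECT name FROM sqlite_master WHERE type='table'", conn)['name'].tolist()
frames = {name.lower(): pd.read_sql_query(f'SELECT * FROM "{name}"', conn)
          for name in table_names}
conn.close()

print(frames['tickers'])
```
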


### Creating Master DataFrame instance from reference tables
The master DataFrame is created by merging `df.master` (the *Master* bridge table) with the other reference tables (e.g. `df.tickers`, `df.exchanges`, etc.).
##### DataFrame Instance

In [5]:
# Merge Tables
df_master0 = (df.master
# Ticker Symbols
 .merge(df.tickers, left_on='ticker_id', right_on='id').drop(['id'], axis=1)
# Company / Security Name
 .merge(df.companies, left_on='company_id', right_on='id').drop(['id', 'company_id'], axis=1)
# Exchanges
 .merge(df.exchanges, left_on='exchange_id', right_on='id').drop(['id'], axis=1)
# Industries
 .merge(df.industries, left_on='industry_id', right_on='id').drop(['id', 'industry_id'], axis=1)
# Sectors
 .merge(df.sectors, left_on='sector_id', right_on='id').drop(['id', 'sector_id'], axis=1)
# Countries
 .merge(df.countries, left_on='country_id', right_on='id').drop(['id', 'country_id'], axis=1)
# Security Types
 .merge(df.securitytypes, left_on='security_type_id', right_on='id').drop(['id', 'security_type_id'], axis=1)
# Stock Types
 .merge(df.stocktypes, left_on='stock_type_id', right_on='id').drop(['id', 'stock_type_id'], axis=1)
# Stock Style Types
 .merge(df.styles, left_on='style_id', right_on='id').drop(['id', 'style_id'], axis=1)
# Quote Header Info
 .merge(df.quoteheader(), on=['ticker_id', 'exchange_id']).rename(columns={'fpe':'Forward_PE'})
# Currency
 .merge(df.currencies, left_on='currency_id', right_on='id').drop(['id', 'currency_id'], axis=1)
# Fiscal Year End
 .merge(df.timerefs, left_on='fyend_id', right_on='id').drop(['fyend_id'], axis=1)
             .rename(columns={'dates':'fy_end'})
# Updated Date
 .merge(df.timerefs, left_on='update_date_id', right_on='id').drop(['update_date_id'], axis=1)
             .rename(columns={'dates':'updated_date'})
)

# Convert date columns to datetime
df_master0['fy_end'] = pd.to_datetime(df_master0['fy_end'])
df_master0['updated_date'] = pd.to_datetime(df_master0['updated_date'])

# Create df_master and apply filters
df_master = df_master0.copy()
df_master[['lastprice', 'day_hi', 'day_lo', '_52wk_hi', '_52wk_lo']] = (
    df_master[['lastprice', 'day_hi', 'day_lo', '_52wk_hi', '_52wk_lo']]
    .fillna(value=0.0)
)
df_master = (df_master
             .where((df_master['openprice'] > 0.0) &
                    (df_master['lastprice'] > 0.0)
                   ).dropna(axis=0, how='all'))
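
The merge chain above follows one pattern throughout: join the bridge table to a reference table on its id, then drop the redundant `id` column. A minimal sketch of that pattern on toy frames is shown here. The three small frames are stand-ins built for this example, and the `validate='many_to_one'` check is an optional addition that is not part of the cell above.

```python
import pandas as pd

# Toy stand-ins for df.master, df.tickers and df.exchanges
master = pd.DataFrame({'ticker_id': [1, 2, 18686],
                       'exchange_id': [374, 374, 302]})
tickers = pd.DataFrame({'id': [1, 2, 18686],
                        'ticker': ['OGCP', 'FISK', 'ARE']})
exchanges = pd.DataFrame({'id': [374, 302],
                          'exchange_sym': ['ARCX', 'XNYS']})

merged = (master
          # validate raises MergeError if a reference id is duplicated
          .merge(tickers, left_on='ticker_id', right_on='id',
                 validate='many_to_one').drop(columns='id')
          .merge(exchanges, left_on='exchange_id', right_on='id',
                 validate='many_to_one').drop(columns='id'))

print(merged)
```
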

In [6]:
df_master.head()

Unnamed: 0,ticker_id,exchange_id,ticker,company,exchange,exchange_sym,industry,sector,country,country_c2,...,aprvol,avevol,Forward_PE,pb,ps,pc,currency,currency_code,fy_end,updated_date
0,1.0,374.0,OGCP,Empire State Realty OP LP Operating Partnershi...,NYSE ARCA,ARCX,REIT - Diversified,Real Estate,United States,US,...,8736.0,3548.0,,2.4,6.4,16.9,United States Dollar,USD,2019-12-31,2019-03-27
1,2.0,374.0,FISK,Empire State Realty OP LP Operating Partnershi...,NYSE ARCA,ARCX,REIT - Diversified,Real Estate,United States,US,...,1389.0,2647.0,,2.4,6.4,16.8,United States Dollar,USD,2019-12-31,2019-03-27
2,3.0,374.0,ESBA,Empire State Realty OP LP Operating Partnershi...,NYSE ARCA,ARCX,REIT - Diversified,Real Estate,United States,US,...,17845.0,21680.0,,2.3,6.3,16.6,United States Dollar,USD,2019-12-31,2019-03-27
3,18686.0,302.0,ARE,Alexandria Real Estate Equities Inc,"NEW YORK STOCK EXCHANGE, INC.",XNYS,REIT - Office,Real Estate,United States,US,...,416784.0,721792.0,56.2,2.2,11.0,25.7,United States Dollar,USD,2019-12-31,2019-03-27
4,19275.0,302.0,LPT,Liberty Property Trust,"NEW YORK STOCK EXCHANGE, INC.",XNYS,REIT - Office,Real Estate,United States,US,...,627477.0,887562.0,,2.1,10.1,18.7,United States Dollar,USD,2019-12-31,2019-03-27
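
In the preview above, `ticker_id` and `exchange_id` appear as floats (`1.0`, `374.0`). One likely cause is the `where(...).dropna(how='all')` filter: `where` fills non-matching rows with `NaN`, which upcasts integer columns. The sketch below, on a small toy frame, compares that approach with plain boolean indexing, which selects the same rows while keeping the original dtypes. The frame `quotes` and the variable names are assumptions for illustration.

```python
import pandas as pd

quotes = pd.DataFrame({
    'ticker_id': [1, 4, 5],
    'openprice': [15.84, 94.99, 0.0],
    'lastprice': [15.82, 94.77, 0.0],
})
has_price = (quotes['openprice'] > 0.0) & (quotes['lastprice'] > 0.0)

# where() blanks failing rows to NaN, so integer columns are upcast to float
via_where = quotes.where(has_price).dropna(axis=0, how='all')

# Boolean indexing keeps the same rows and leaves the dtypes untouched
via_mask = quotes[has_price]

print(via_where.dtypes)
print(via_mask.dtypes)
```
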


##### DataFrame Length

In [7]:
print('Master DataFrame contains {:,.0f} records.'.format(len(df_master)))

Master DataFrame contains 103,507 records.


##### DataFrame Columns

In [8]:
df_master.columns

Index(['ticker_id', 'exchange_id', 'ticker', 'company', 'exchange',
       'exchange_sym', 'industry', 'sector', 'country', 'country_c2',
       'country_c3', 'security_type_code', 'security_type', 'stock_type',
       'style', 'openprice', 'lastprice', 'day_hi', 'day_lo', '_52wk_hi',
       '_52wk_lo', 'yield', 'aprvol', 'avevol', 'Forward_PE', 'pb', 'ps', 'pc',
       'currency', 'currency_code', 'fy_end', 'updated_date'],
      dtype='object')

<br></br>
[return to the top](#top)
<a id='methods'></a>
## Creating DataFrame instances with dataframes methods
The `DataFrames` class from [dataframes.py](dataframes.py) contains the following methods, each of which returns a `pd.DataFrame` object for the specified database table:

- `quoteheader` - [MorningStar (MS) Quote Header](#quote)
- `valuation` - [MS Valuation table with Price Ratios (P/E, P/S, P/B, P/C) for the past 10 yrs](#val)
- `keyratios` - [MS Ratio - Key Financial Ratios & Values](#keyratios)
- `finhealth` - [MS Ratio - Financial Health](#finhealth)
- `profitability` - [MS Ratio - Profitability](#prof)
- `growth` - [MS Ratio - Growth](#growth)
- `cfhealth` - [MS Ratio - Cash Flow Health](#cfh)
- `efficiency` - [MS Ratio - Efficiency](#eff)
- `annualIS` - [MS Annual Income Statements](#isa)
- `quarterlyIS` - [MS Quarterly Income Statements](#isq)
- `annualBS` - [MS Annual Balance Sheets](#bsa)
- `quarterlyBS` - [MS Quarterly Balance Sheets](#bsq)
- `annualCF` - [MS Annual Cash Flow Statements](#cfa)
- `quarterlyCF` - [MS Quarterly Cash Flow Statements](#cfq)
- `pricehistory` - MS Price History

[return to the top](#top)
<a id='quote'></a>
### Quote Header 
##### DataFrame Instance

In [102]:
df_quote = df.quoteheader()

In [103]:
df_quote.head()

Unnamed: 0,ticker_id,exchange_id,openprice,lastprice,day_hi,day_lo,_52wk_hi,_52wk_lo,yield,aprvol,avevol,fpe,pb,ps,pc,currency_id
0,1,374,15.84,15.82,15.84,15.82,17.72,12.16,2.65,8736.0,3548.0,,2.4,6.4,16.9,104.0
1,2,374,15.73,15.73,15.73,15.73,17.68,13.68,2.67,1389.0,2647.0,,2.4,6.4,16.8,104.0
2,3,374,15.65,15.72,15.72,15.56,17.79,11.99,2.69,17845.0,21680.0,,2.3,6.3,16.6,104.0
3,4,482,94.99,94.77,95.22,93.06,115.36,87.87,1.26,933703.0,819856.0,19.6,3.3,3.9,20.0,104.0
4,5,1,0.0,0.0,0.0,0.0,0.0,0.0,,184.0,184.0,,,,,104.0


##### DataFrame Length

In [104]:
print('DataFrame contains {:,.0f} records.'.format(len(df_quote)))

DataFrame contains 117,590 records.


<a id='val'></a>
[return to list of methods](#methods),
[return to the top](#top)
### Price Ratios (P/E, P/S, P/B, P/C)
##### DataFrame Instance

In [9]:
df_valuation = df.valuation().reset_index()

In [10]:
df_valuation.head()

Unnamed: 0,exchange_id,ticker_id,PE_2009,PE_2010,PE_2011,PE_2012,PE_2013,PE_2014,PE_2015,PE_2016,...,PC_2010,PC_2011,PC_2012,PC_2013,PC_2014,PC_2015,PC_2016,PC_2017,PC_2018,PC_TTM
0,374,1,,,,,,18.6,69.0,57.8,...,,,,48.3,-153.8,24.3,25.8,26.3,21.5,16.9
1,374,2,,,,,,18.2,69.0,58.8,...,,,,50.5,-151.5,24.3,26.2,26.4,20.3,16.8
2,374,3,,,,,,18.3,69.4,58.8,...,,,,48.3,-153.8,24.4,26.2,26.7,20.6,16.8
3,482,4,,22.2,17.4,16.6,27.0,29.7,26.6,31.7,...,16.2,11.1,12.9,19.6,23.1,19.9,26.8,52.9,21.8,20.0
4,1,5,,,,,,,,,...,,,,,,,,,,


##### DataFrame Length

In [11]:
print('DataFrame contains {:,.0f} records.'.format(len(df_valuation)))

DataFrame contains 79,934 records.


##### DataFrame Columns

In [12]:
df_valuation.columns

Index(['exchange_id', 'ticker_id', 'PE_2009', 'PE_2010', 'PE_2011', 'PE_2012',
       'PE_2013', 'PE_2014', 'PE_2015', 'PE_2016', 'PE_2017', 'PE_2018',
       'PE_TTM', 'PS_2009', 'PS_2010', 'PS_2011', 'PS_2012', 'PS_2013',
       'PS_2014', 'PS_2015', 'PS_2016', 'PS_2017', 'PS_2018', 'PS_TTM',
       'PB_2009', 'PB_2010', 'PB_2011', 'PB_2012', 'PB_2013', 'PB_2014',
       'PB_2015', 'PB_2016', 'PB_2017', 'PB_2018', 'PB_TTM', 'PC_2009',
       'PC_2010', 'PC_2011', 'PC_2012', 'PC_2013', 'PC_2014', 'PC_2015',
       'PC_2016', 'PC_2017', 'PC_2018', 'PC_TTM'],
      dtype='object')
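
The valuation table is wide: one column per ratio and period (`PE_2017`, `PS_TTM`, and so on). For plotting or grouping by year, a long layout is often easier to work with. The following is a minimal sketch of one way to reshape such columns, using a toy frame with made-up values. The names `wide`, `long_df` and `tidy` are introduced here for illustration.

```python
import pandas as pd

# Toy frame with the same column naming scheme; the values are made up
wide = pd.DataFrame({
    'exchange_id': [374, 482],
    'ticker_id': [1, 4],
    'PE_2017': [57.8, 31.7],
    'PE_2018': [50.0, 20.1],
    'PS_2017': [6.4, 3.9],
    'PS_2018': [6.3, 4.0],
})

# Stack all ratio columns into (column, value) pairs
long_df = wide.melt(id_vars=['exchange_id', 'ticker_id'],
                    var_name='column', value_name='value')

# Split 'PE_2017' into ratio 'PE' and period '2017'
long_df[['ratio', 'period']] = long_df['column'].str.split('_', n=1, expand=True)

# One row per security and period, one column per ratio
tidy = (long_df
        .pivot_table(index=['exchange_id', 'ticker_id', 'period'],
                     columns='ratio', values='value')
        .reset_index())

print(tidy)
```
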

<a id='keyratios'></a>
[return to list of methods](#methods),
[return to the top](#top)
### Key Ratios
##### DataFrame Instance

In [13]:
df_keyratios = (df_master.merge(df.keyratios(), on=['ticker_id', 'exchange_id']))

In [14]:
df_keyratios.head()

Unnamed: 0,ticker_id,exchange_id,ticker,company,exchange,exchange_sym,industry,sector,country,country_c2,...,Y1,Y2,Y3,Y4,Y5,Y6,Y7,Y8,Y9,Y10
0,1.0,374.0,OGCP,Empire State Realty OP LP Operating Partnershi...,NYSE ARCA,ARCX,REIT - Diversified,Real Estate,United States,US,...,2010-12-01,2011-12-01,2012-12-01,2013-12-01,2014-12-01,2015-12-01,2016-12-01,2017-12-01,2018-12-01,TTM
1,2.0,374.0,FISK,Empire State Realty OP LP Operating Partnershi...,NYSE ARCA,ARCX,REIT - Diversified,Real Estate,United States,US,...,2010-12-01,2011-12-01,2012-12-01,2013-12-01,2014-12-01,2015-12-01,2016-12-01,2017-12-01,2018-12-01,TTM
2,3.0,374.0,ESBA,Empire State Realty OP LP Operating Partnershi...,NYSE ARCA,ARCX,REIT - Diversified,Real Estate,United States,US,...,2010-12-01,2011-12-01,2012-12-01,2013-12-01,2014-12-01,2015-12-01,2016-12-01,2017-12-01,2018-12-01,TTM
3,18686.0,302.0,ARE,Alexandria Real Estate Equities Inc,"NEW YORK STOCK EXCHANGE, INC.",XNYS,REIT - Office,Real Estate,United States,US,...,2010-12-01,2011-12-01,2012-12-01,2013-12-01,2014-12-01,2015-12-01,2016-12-01,2017-12-01,2018-12-01,TTM
4,19275.0,302.0,LPT,Liberty Property Trust,"NEW YORK STOCK EXCHANGE, INC.",XNYS,REIT - Office,Real Estate,United States,US,...,2010-12-01,2011-12-01,2012-12-01,2013-12-01,2014-12-01,2015-12-01,2016-12-01,2017-12-01,2018-12-01,TTM


##### DataFrame Length

In [15]:
print('DataFrame contains {:,.0f} records.'.format(len(df_keyratios)))

DataFrame contains 68,307 records.


##### DataFrame Columns

In [16]:
df_labels_kratios = (df_keyratios
                     .loc[0, [col for col in df_keyratios.columns if 'Y' not in col and col.startswith('i')]]
                     .replace(df.colheaders['header']))
df_labels_kratios

industry          REIT - Diversified
i0                           Revenue
i1                      Gross_Margin
i2                  Operating_Income
i3                  Operating_Margin
i4                        Net_Income
i5                Earnings_Per_Share
i6                         Dividends
i91                     Payout_Ratio
i7                            Shares
i8              Book_Value_Per_Share
i9               Operating_Cash_Flow
i10                     Cap_Spending
i11                   Free_Cash_Flow
i90         Free_Cash_Flow_Per_Share
i80                  Working_Capital
Name: 0, dtype: object
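
The cell above translates coded columns (`i0`, `i1`, ...) into readable labels by looking up the ids they hold in the `colheaders` table. The sketch below shows the same lookup on toy data using `Series.map`, which is a substitute for the `replace` call in the cell. The ids `10`, `11`, `12` and the frame contents are hypothetical.

```python
import pandas as pd

# Hypothetical header table indexed by id, like df.colheaders
colheaders = pd.DataFrame({'header': ['Revenue', 'Gross_Margin', 'Net_Income']},
                          index=pd.Index([10, 11, 12], name='id'))

# One row of coded columns, each holding a header id
row = pd.Series({'i0': 10, 'i1': 11, 'i4': 12})

# Look up each id in the header table
labels = row.map(colheaders['header'])

print(labels)
```

Unlike `replace`, `map` returns `NaN` for ids that are missing from the header table, which makes gaps easy to spot.
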

<a id='finhealth'></a>
[return to list of methods](#methods),
[return to the top](#top)
### Financial Health
##### DataFrame Instance

In [17]:
df_finhealth = df.finhealth()

In [59]:
df_finhealth.head()

Unnamed: 0,ticker_id,exchange_id,fh_balsheet,fh_Y0,fh_Y1,fh_Y2,fh_Y3,fh_Y4,fh_Y5,fh_Y6,...,i68_lfh_Y1,i68_lfh_Y2,i68_lfh_Y3,i68_lfh_Y4,i68_lfh_Y5,i68_lfh_Y6,i68_lfh_Y7,i68_lfh_Y8,i68_lfh_Y9,i68_lfh_Y10
0,1,374,324,40,41,42,43,44,11,12,...,,,,1.2,1.17,1.19,0.81,0.85,0.96,0.96
1,2,374,324,40,41,42,43,44,11,12,...,,,,1.2,1.17,1.19,0.81,0.85,0.96,0.96
2,3,374,324,40,41,42,43,44,11,12,...,,,,1.2,1.17,1.19,0.81,0.85,0.96,0.96
3,4,482,324,40,41,42,43,44,11,12,...,,,,,,,,0.4,0.28,0.28
4,5,1,324,697,698,699,700,701,668,669,...,,,,,4.34,3.53,2.07,,0.28,1.34


##### DataFrame Length

In [60]:
print('DataFrame contains {:,.0f} records.'.format(len(df_finhealth)))

DataFrame contains 77,177 records.


##### DataFrame Columns

In [63]:
(df_finhealth.loc[0, [col for col in df_finhealth.columns if 'Y' not in col and '_id' not in col]]
 .replace(df.colheaders['header']))

fh_balsheet         Balance Sheet Items (in %)
i45              Cash & Short-Term Investments
i46                        Accounts Receivable
i47                                  Inventory
i48                       Other Current Assets
i49                       Total Current Assets
i50                                   Net PP&E
i51                                Intangibles
i52                     Other Long-Term Assets
i53                               Total Assets
i54                           Accounts Payable
i55                            Short-Term Debt
i56                              Taxes Payable
i57                        Accrued Liabilities
i58               Other Short-Term Liabilities
i59                  Total Current Liabilities
i60                             Long-Term Debt
i61                Other Long-Term Liabilities
i62                          Total Liabilities
i63                  Total Stockholders Equity
i64                 Total Liabilities & Equity
lfh_liquidity

<a id='prof'></a>
[return to list of methods](#methods),
[return to the top](#top)
### Profitability
##### DataFrame Instance

In [64]:
df_profitability = df.profitability()

In [65]:
df_profitability.head()

Unnamed: 0,ticker_id,exchange_id,pr_margins,pr_Y0,pr_Y1,pr_Y2,pr_Y3,pr_Y4,pr_Y5,pr_Y6,...,i95_pr_pro_Y1,i95_pr_pro_Y2,i95_pr_pro_Y3,i95_pr_pro_Y4,i95_pr_pro_Y5,i95_pr_pro_Y6,i95_pr_pro_Y7,i95_pr_pro_Y8,i95_pr_pro_Y9,i95_pr_pro_Y10
0,1,374,279,40,41,42,43,44,11,12,...,,2.05,1.89,4.52,2.13,2.24,2.59,2.82,2.53,2.53
1,2,374,279,40,41,42,43,44,11,12,...,,2.05,1.89,4.52,2.13,2.24,2.59,2.82,2.53,2.53
2,3,374,279,40,41,42,43,44,11,12,...,,2.05,1.89,4.52,2.13,2.24,2.59,2.82,2.53,2.53
3,4,482,279,40,41,42,43,44,11,12,...,184.73,274.13,,,,,,8.85,15.1,15.1
4,5,1,279,697,698,699,700,701,668,669,...,,,,,,,,,,


##### DataFrame Length

In [66]:
print('DataFrame contains {:,.0f} records.'.format(len(df_profitability)))


##### DataFrame Columns

In [67]:
# Profitability DataFrame Columns
(df_profitability.loc[0, [col for col in df_profitability.columns if 'Y' not in col and '_id' not in col]]
 .replace(df.colheaders['header']))

pr_margins              Margins % of Sales
i12                                Revenue
i13                                   COGS
i14                           Gross Margin
i15                                   SG&A
i16                                    R&D
i17                                  Other
i18                       Operating Margin
i19                    Net Int Inc & Other
i20                             EBT Margin
pr_profit                    Profitability
i21                             Tax Rate %
i22                           Net Margin %
i23               Asset Turnover (Average)
i24                     Return on Assets %
i25           Financial Leverage (Average)
i26                     Return on Equity %
i27           Return on Invested Capital %
i95                      Interest Coverage
Name: 0, dtype: object

<a id='growth'></a>
[return to list of methods](#methods),
[return to the top](#top)
### Growth
##### DataFrame Instance

In [68]:
df_growth = df.growth()

In [69]:
df_growth.head()

Unnamed: 0,ticker_id,exchange_id,gr_Y0,gr_Y1,gr_Y2,gr_Y3,gr_Y4,gr_Y5,gr_Y6,gr_Y7,...,i39_gr_Y1,i39_gr_Y2,i39_gr_Y3,i39_gr_Y4,i39_gr_Y5,i39_gr_Y6,i39_gr_Y7,i39_gr_Y8,i39_gr_Y9,i39_gr_Y10
0,1,374,40,41,42,43,44,11,12,13,...,,,,,,,,,,
1,2,374,40,41,42,43,44,11,12,13,...,,,,,,,,,,
2,3,374,40,41,42,43,44,11,12,13,...,,,,,,,,,,
3,4,482,40,41,42,43,44,11,12,13,...,,,,,,,,,11.73,
4,5,1,697,698,699,700,701,668,669,670,...,,,,,,,,,,


##### DataFrame Length

In [71]:
print('DataFrame contains {:,.0f} records.'.format(len(df_growth)))

DataFrame contains 77,177 records.


##### DataFrame Columns

In [72]:
# Growth DataFrame Columns
(df_growth.loc[0, [col for col in df_growth.columns if 'Y' not in col and '_id' not in col]]
 .replace(df.colheaders['header']))

gr_revenue               Revenue %
i28                 Year over Year
i29                 3-Year Average
i30                 5-Year Average
i31                10-Year Average
gr_operating    Operating Income %
i32                 Year over Year
i33                 3-Year Average
i34                 5-Year Average
i35                10-Year Average
gr_ni                 Net Income %
i81                 Year over Year
i82                 3-Year Average
i83                 5-Year Average
i84                10-Year Average
gr_eps                       EPS %
i36                 Year over Year
i37                 3-Year Average
i38                 5-Year Average
i39                10-Year Average
Name: 0, dtype: object

<a id='cfh'></a>
[return to list of methods](#methods),
[return to the top](#top)
### Cash Flow Health
##### DataFrame Instance

In [120]:
df_cfhealth = df.cfhealth()

In [121]:
df_cfhealth.head()

Unnamed: 0,ticker_id,exchange_id,cf_cashflow,cf_Y0,cf_Y1,cf_Y2,cf_Y3,cf_Y4,cf_Y5,cf_Y6,...,i44_cf_Y1,i44_cf_Y2,i44_cf_Y3,i44_cf_Y4,i44_cf_Y5,i44_cf_Y6,i44_cf_Y7,i44_cf_Y8,i44_cf_Y9,i44_cf_Y10
0,1,374,318,40,41,42,43,44,11,12,...,,-0.22,0.14,-0.91,-2.41,0.77,0.34,-0.27,2.4,2.4
1,2,374,318,40,41,42,43,44,11,12,...,,-0.22,0.14,-0.91,-2.41,0.77,0.34,-0.27,2.4,2.4
2,3,374,318,40,41,42,43,44,11,12,...,,-0.22,0.14,-0.91,-2.41,0.77,0.34,-0.27,2.4,2.4
3,4,482,318,40,41,42,43,44,11,12,...,1.12,1.25,1.08,1.11,1.12,1.0,0.99,0.84,1.17,1.17
4,5,1,318,697,698,699,700,701,668,669,...,,,,1.0,0.8,1.13,0.5,1.16,0.83,0.5


##### DataFrame Length

In [122]:
print('DataFrame contains {:,.0f} records.'.format(len(df_cfhealth)))

DataFrame contains 77,177 records.


##### DataFrame Columns

In [123]:
# Cash Flow Health DataFrame Columns
(df_cfhealth.loc[0, [col for col in df_cfhealth.columns if 'Y' not in col and '_id' not in col]]
 .replace(df.colheaders['header']))

cf_cashflow                    Cash Flow Ratios
i40            Operating Cash Flow Growth % YOY
i41                 Free Cash Flow Growth % YOY
i42                      Cap Ex as a % of Sales
i43                      Free Cash Flow/Sales %
i44                   Free Cash Flow/Net Income
Name: 0, dtype: object

<a id='eff'></a>
[return to list of methods](#methods),
[return to the top](#top)
### Efficiency
##### DataFrame Instance

In [78]:
df_efficiency = df.efficiency()

In [79]:
df_efficiency.head()

Unnamed: 0,ticker_id,exchange_id,ef_efficiency,ef_Y0,ef_Y1,ef_Y2,ef_Y3,ef_Y4,ef_Y5,ef_Y6,...,i76_ef_Y1,i76_ef_Y2,i76_ef_Y3,i76_ef_Y4,i76_ef_Y5,i76_ef_Y6,i76_ef_Y7,i76_ef_Y8,i76_ef_Y9,i76_ef_Y10
0,1,374,350,40,41,42,43,44,11,12,...,,,0.25,0.18,0.22,0.2,0.19,0.18,0.18,0.18
1,2,374,350,40,41,42,43,44,11,12,...,,,0.25,0.18,0.22,0.2,0.19,0.18,0.18,0.18
2,3,374,350,40,41,42,43,44,11,12,...,,,0.25,0.18,0.22,0.2,0.19,0.18,0.18,0.18
3,4,482,350,40,41,42,43,44,11,12,...,1.06,1.75,1.54,1.46,1.5,1.65,1.53,0.78,0.52,0.52
4,5,1,350,697,698,699,700,701,668,669,...,,,,0.03,0.32,0.22,0.3,,0.03,0.06


##### DataFrame Length

In [80]:
print('DataFrame contains {:,.0f} records.'.format(len(df_efficiency)))

DataFrame contains 77,177 records.


##### DataFrame Columns

In [81]:
# Efficiency DataFrame Columns
(df_efficiency.loc[0, [col for col in df_efficiency.columns if 'Y' not in col and '_id' not in col]]
 .replace(df.colheaders['header']))

ef_efficiency                Efficiency
i69              Days Sales Outstanding
i70                      Days Inventory
i71                     Payables Period
i72               Cash Conversion Cycle
i73                Receivables Turnover
i74                  Inventory Turnover
i75               Fixed Assets Turnover
i76                      Asset Turnover
Name: 0, dtype: object

<a id='isa'></a>
[return to list of methods](#methods),
[return to the top](#top)
### Annual Income Statement
##### DataFrame Instance

In [29]:
df_annualIS0 = df.annualIS()

In [30]:
df_annualIS = (df_master 
 .merge(df_annualIS0, on=['ticker_id', 'exchange_id'])
 .merge(df.timerefs, left_on='Year_Y_6', right_on='id').drop('Year_Y_6', axis=1).rename(columns={'dates':'Y6'})
 .merge(df.timerefs, left_on='Year_Y_5', right_on='id').drop('Year_Y_5', axis=1).rename(columns={'dates':'Y5'})
 .merge(df.timerefs, left_on='Year_Y_4', right_on='id').drop('Year_Y_4', axis=1).rename(columns={'dates':'Y4'})
 .merge(df.timerefs, left_on='Year_Y_3', right_on='id').drop('Year_Y_3', axis=1).rename(columns={'dates':'Y3'})
 .merge(df.timerefs, left_on='Year_Y_2', right_on='id').drop('Year_Y_2', axis=1).rename(columns={'dates':'Y2'})
 .merge(df.timerefs, left_on='Year_Y_1', right_on='id').drop('Year_Y_1', axis=1).rename(columns={'dates':'Y1'})
)
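# Convert Y5..Y1 to datetime; Y6 is excluded because it can hold the text 'TTM'
date_cols = ['Y5', 'Y4', 'Y3', 'Y2', 'Y1']
df_annualIS[date_cols] = df_annualIS[date_cols].apply(pd.to_datetime)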

In [31]:
df_annualIS.head()

Unnamed: 0,ticker_id,exchange_id,ticker,company,exchange,exchange_sym,industry,sector,country,country_c2,...,label_tts4,label_tts5,currency_id,fye_month,Y6,Y5,Y4,Y3,Y2,Y1
0,1.0,374.0,OGCP,Empire State Realty OP LP Operating Partnershi...,NYSE ARCA,ARCX,REIT - Diversified,Real Estate,United States,US,...,,,104.0,12.0,TTM,2018-12-01,2017-12-01,2016-12-01,2015-12-01,2014-12-01
1,2.0,374.0,FISK,Empire State Realty OP LP Operating Partnershi...,NYSE ARCA,ARCX,REIT - Diversified,Real Estate,United States,US,...,,,104.0,12.0,TTM,2018-12-01,2017-12-01,2016-12-01,2015-12-01,2014-12-01
2,3.0,374.0,ESBA,Empire State Realty OP LP Operating Partnershi...,NYSE ARCA,ARCX,REIT - Diversified,Real Estate,United States,US,...,,,104.0,12.0,TTM,2018-12-01,2017-12-01,2016-12-01,2015-12-01,2014-12-01
3,18686.0,302.0,ARE,Alexandria Real Estate Equities Inc,"NEW YORK STOCK EXCHANGE, INC.",XNYS,REIT - Office,Real Estate,United States,US,...,,,,,TTM,2018-12-01,2017-12-01,2016-12-01,2015-12-01,2014-12-01
4,19275.0,302.0,LPT,Liberty Property Trust,"NEW YORK STOCK EXCHANGE, INC.",XNYS,REIT - Office,Real Estate,United States,US,...,,,,,TTM,2018-12-01,2017-12-01,2016-12-01,2015-12-01,2014-12-01
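
The six near-identical `merge` calls above each translate one `Year_Y_n` id column into a date column. As an alternative sketch (using `Series.map` in a loop rather than `merge`), the same translation can be written more compactly. The toy `timerefs` ids and the `stmt` frame are hypothetical and only illustrate the pattern.

```python
import pandas as pd

# Hypothetical time-reference table: id -> date string
timerefs = pd.DataFrame({'dates': ['2017-12', '2018-12', 'TTM']},
                        index=pd.Index([11, 12, 13], name='id'))

stmt = pd.DataFrame({'ticker_id': [1, 2],
                     'Year_Y_1': [11, 11],
                     'Year_Y_2': [12, 12],
                     'Year_Y_3': [13, 13]})

# Replace each id column with the date string it refers to
for n in (1, 2, 3):
    stmt[f'Y{n}'] = stmt.pop(f'Year_Y_{n}').map(timerefs['dates'])

# Convert only the real dates; Y3 holds the text 'TTM' in this toy data
date_cols = ['Y1', 'Y2']
stmt[date_cols] = stmt[date_cols].apply(pd.to_datetime)

print(stmt)
```
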


##### DataFrame Length

In [32]:
print('DataFrame contains {:,.0f} records.'.format(len(df_annualIS)))

DataFrame contains 68,309 records.


##### DataFrame Columns

In [33]:
labels = [col for col in df_annualIS if 'label' in col]
labels = [[label, header] for label in labels 
          for header in df_annualIS[label].unique().tolist() if pd.notna(header)]

df_labels_aIS = (pd.DataFrame(labels, columns=['header', 'value'])
                 .set_index('header')
                 .astype('int')
                )

df_labels_aIS['value'] = df_labels_aIS['value'].replace(df.colheaders['header'])
df_labels_aIS[df_labels_aIS['value'].astype('str').str.contains('ncome')].sort_values(by='value')

df_labels_aIS.sort_values(by='value')

Unnamed: 0_level_0,value
header,Unnamed: 1_level_1
label_i46,Advertising and market...
label_i36,Advertising and market...
label_i15,Advertising and promot...
label_i47,Amortization of intang...
label_i24,Asset impairment
label_i4,Asset mgmt and securit...
label_i85,Basic
label_i83,Basic
label_s2,"Benefits, claims and e..."
label_i14,Borrowed funds


<a id='isq'></a>
[return to list of methods](#methods),
[return to the top](#top)
### Quarterly Income Statements
##### DataFrame Instance

In [None]:
df_quarterlyIS = df.quarterlyIS()

In [None]:
df_quarterlyIS.head()

##### DataFrame Length

In [None]:
print('DataFrame contains {:,.0f} records.'.format(len(df_quarterlyIS)))

##### DataFrame Columns

In [None]:
# Collect the distinct header ids found in each 'label' column
labels = [col for col in df_quarterlyIS if 'label' in col]
labels = [[label, header] for label in labels 
          for header in df_quarterlyIS[label].unique().tolist() if pd.notna(header)]

df_labels_qIS = (pd.DataFrame(labels, columns=['header', 'value'])
                 .set_index('header')
                 .astype('int')
                )

# Replace header ids with their text from the colheaders table
df_labels_qIS['value'] = df_labels_qIS['value'].replace(df.colheaders['header'])
df_labels_qIS[df_labels_qIS['value'].astype('str').str.contains('ncome')].sort_values(by='value')

df_labels_qIS.sort_values(by='value')

<a id='bsa'></a>
[return to list of methods](#methods),
[return to the top](#top)
### Annual Balance Sheet
##### DataFrame Instance

In [45]:
df_bsa0 = df.annualBS()

In [46]:
df_bsa0.head()

Unnamed: 0,ticker_id,exchange_id,Year_Y_1,Year_Y_2,Year_Y_3,Year_Y_4,Year_Y_5,data_g1_Y_1,data_g1_Y_2,data_g1_Y_3,...,label_ttg2,label_ttg5,label_ttg8,label_ttgg1,label_ttgg2,label_ttgg3,label_ttgg5,label_ttgg6,label_tts1,label_tts2
0,1,374,2014-12,2015-12,2016-12,2017-12,2018-12,,,,...,,Total liabilities,Total stockholders eq...,,,,,,Total assets,Total liabilities and ...
1,2,374,2014-12,2015-12,2016-12,2017-12,2018-12,,,,...,,Total liabilities,Total stockholders eq...,,,,,,Total assets,Total liabilities and ...
2,3,374,2014-12,2015-12,2016-12,2017-12,2018-12,,,,...,,Total liabilities,Total stockholders eq...,,,,,,Total assets,Total liabilities and ...
3,4,482,2014-12,2015-12,2016-12,2017-12,2018-12,,,,...,,Total liabilities,Total stockholders eq...,,,,,,Total assets,Total liabilities and ...
4,5,1,2003-12,2004-12,2005-12,2006-06,2007-06,0.0,19946.0,167141.0,...,Total non-current asse...,Total liabilities,Total stockholders eq...,Total cash,"Net property, plant an...",,Total current liabilit...,Total non-current liab...,Total assets,Total liabilities and ...


##### DataFrame Length

In [47]:
print('DataFrame contains {:,.0f} records.'.format(len(df_bsa0)))

DataFrame contains 76,311 records.


##### DataFrame Columns

In [48]:
# Annual Balance Sheet DataFrame Columns
(df_bsa0.loc[0, [col for col in df_bsa0.columns if 'Y' not in col and '_id' not in col]]
 .replace(df.colheaders['header']))

label_g1                             NaN
label_g2                             NaN
label_g5                     Liabilities
label_g8             Stockholders equity
label_gg1                            NaN
label_gg2                            NaN
label_gg3                            NaN
label_gg5                            NaN
label_gg6                            NaN
label_i1          Real estate properties
label_i10                            NaN
label_i11                            NaN
label_i12                            NaN
label_i13                            NaN
label_i14                            NaN
label_i15                            NaN
label_i16                            NaN
label_i17                            NaN
label_i18                            NaN
label_i19                            NaN
label_i2       Accumulated depreciati...
label_i21                            NaN
label_i3       Real estate properties...
label_i30                            NaN
label_i4       C

<a id='bsq'></a>
[return to list of methods](#methods),
[return to the top](#top)
### Quarterly Balance Sheet
##### DataFrame Instance

In [49]:
df_bsq0 = df.quarterlyBS()

In [50]:
df_bsq0.head()

Unnamed: 0,ticker_id,exchange_id,Year_Y_1,Year_Y_2,Year_Y_3,Year_Y_4,Year_Y_5,data_g1_Y_1,data_g1_Y_2,data_g1_Y_3,...,label_ttg2,label_ttg5,label_ttg8,label_ttgg1,label_ttgg2,label_ttgg3,label_ttgg5,label_ttgg6,label_tts1,label_tts2
0,1,374,2017-12,2018-03,2018-06,2018-09,2018-12,,,,...,,Total liabilities,Total stockholders eq...,,,,,,Total assets,Total liabilities and ...
1,2,374,2017-12,2018-03,2018-06,2018-09,2018-12,,,,...,,Total liabilities,Total stockholders eq...,,,,,,Total assets,Total liabilities and ...
2,3,374,2017-12,2018-03,2018-06,2018-09,2018-12,,,,...,,Total liabilities,Total stockholders eq...,,,,,,Total assets,Total liabilities and ...
3,4,482,2017-12,2018-03,2018-06,2018-09,2018-12,,,,...,,Total liabilities,Total stockholders eq...,,,,,,Total assets,Total liabilities and ...
4,5,1,2007-03,2007-06,2007-09,2007-12,2008-03,1100958.0,669928.0,272986.0,...,Total non-current asse...,Total liabilities,Total stockholders eq...,Total cash,"Net property, plant an...",,Total current liabilit...,Total non-current liab...,Total assets,Total liabilities and ...


##### DataFrame Length

In [51]:
print('DataFrame contains {:,.0f} records.'.format(len(df_bsq0)))

DataFrame contains 76,216 records.


##### DataFrame Columns

In [52]:
# Quarterly Balance Sheet DataFrame Columns
(df_bsq0.loc[0, [col for col in df_bsq0.columns if 'Y' not in col and '_id' not in col]]
 .replace(df.colheaders['header']))

label_g1                             NaN
label_g2                             NaN
label_g5                     Liabilities
label_g8             Stockholders equity
label_gg1                            NaN
label_gg2                            NaN
label_gg3                            NaN
label_gg5                            NaN
label_gg6                            NaN
label_i1          Real estate properties
label_i10                            NaN
label_i11                            NaN
label_i12                            NaN
label_i13                            NaN
label_i14                            NaN
label_i15                            NaN
label_i16                            NaN
label_i17                            NaN
label_i18                            NaN
label_i19                            NaN
label_i2       Accumulated depreciati...
label_i21                            NaN
label_i3       Real estate properties...
label_i30                            NaN
label_i4       C

<a id='cfa'></a>
[return to list of methods](#methods),
[return to the top](#top)
### Annual Cash Flow Statement
##### DataFrame Instance

In [53]:
df_cfa0 = df.annualCF()

In [54]:
df_cfa0.head()

Unnamed: 0,ticker_id,exchange_id,Year_Y_1,Year_Y_2,Year_Y_3,Year_Y_4,Year_Y_5,Year_Y_6,data_i1_Y_1,data_i1_Y_2,...,label_i96,label_i97,label_i98,label_i99,label_s1,label_s2,label_s3,label_tts1,label_tts2,label_tts3
0,1,374,2014-12,2015-12,2016-12,2017-12,2018-12,TTM,,,...,Capital expenditure,Free cash flow,Cash paid for income t...,Cash paid for interest,Cash Flows From Operat...,Cash Flows From Invest...,Cash Flows From Financ...,Net cash provided by o...,Net cash used for inve...,Net cash provided by (...
1,2,374,2014-12,2015-12,2016-12,2017-12,2018-12,TTM,,,...,Capital expenditure,Free cash flow,Cash paid for income t...,Cash paid for interest,Cash Flows From Operat...,Cash Flows From Invest...,Cash Flows From Financ...,Net cash provided by o...,Net cash used for inve...,Net cash provided by (...
2,3,374,2014-12,2015-12,2016-12,2017-12,2018-12,TTM,,,...,Capital expenditure,Free cash flow,Cash paid for income t...,Cash paid for interest,Cash Flows From Operat...,Cash Flows From Invest...,Cash Flows From Financ...,Net cash provided by o...,Net cash used for inve...,Net cash provided by (...
3,4,482,2014-12,2015-12,2016-12,2017-12,2018-12,TTM,,,...,Capital expenditure,Free cash flow,Cash paid for income t...,Cash paid for interest,Cash Flows From Operat...,Cash Flows From Invest...,Cash Flows From Financ...,Net cash provided by o...,Net cash used for inve...,Net cash provided by (...
4,5,1,2003-12,2004-12,2005-12,2006-06,2007-06,TTM,-129926.0,-209185.0,...,Capital expenditure,Free cash flow,,,Cash Flows From Operat...,Cash Flows From Invest...,Cash Flows From Financ...,Net cash provided by o...,Net cash used for inve...,Net cash provided by (...


##### DataFrame Length

In [55]:
print('DataFrame contains {:,.0f} records.'.format(len(df_cfa0)))

DataFrame contains 75,831 records.


##### DataFrame Columns

In [56]:
# Annual Cash Flow Statement DataFrame Columns
(df_cfa0.loc[0, [col for col in df_cfa0.columns if 'Y' not in col and '_id' not in col]]
 .replace(df.colheaders['header']))

label_g7                 Free Cash Flow
label_g8      Supplemental schedule ...
label_i1                     Net income
label_i10     Stock based compensati...
label_i100          Operating cash flow
label_i11                           NaN
label_i12                           NaN
label_i13                           NaN
label_i15                           NaN
label_i16           Accounts receivable
label_i17                     Inventory
label_i18              Prepaid expenses
label_i19              Accounts payable
label_i2      Depreciation & amortiz...
label_i20           Accrued liabilities
label_i21              Interest payable
label_i22          Income taxes payable
label_i23         Other working capital
label_i24                           NaN
label_i25                           NaN
label_i26                           NaN
label_i27                           NaN
label_i28                           NaN
label_i29                           NaN
label_i3      Amortization of debt d...


<a id='cfq'></a>
[return to list of methods](#methods),
[return to the top](#top)
### Quarterly Cash Flow Statement
##### DataFrame Instance

In [57]:
df_cfq0 = df.quarterlyCF()

In [58]:
df_cfq0.head()

Unnamed: 0,ticker_id,exchange_id,Year_Y_1,Year_Y_2,Year_Y_3,Year_Y_4,Year_Y_5,Year_Y_6,data_i1_Y_1,data_i1_Y_2,...,label_i96,label_i97,label_i98,label_i99,label_s1,label_s2,label_s3,label_tts1,label_tts2,label_tts3
0,1,374,2017-12,2018-03,2018-06,2018-09,2018-12,TTM,,,...,Capital expenditure,Free cash flow,Cash paid for income t...,Cash paid for interest,Cash Flows From Operat...,Cash Flows From Invest...,Cash Flows From Financ...,Net cash provided by o...,Net cash used for inve...,Net cash provided by (...
1,2,374,2017-12,2018-03,2018-06,2018-09,2018-12,TTM,,,...,Capital expenditure,Free cash flow,Cash paid for income t...,Cash paid for interest,Cash Flows From Operat...,Cash Flows From Invest...,Cash Flows From Financ...,Net cash provided by o...,Net cash used for inve...,Net cash provided by (...
2,3,374,2017-12,2018-03,2018-06,2018-09,2018-12,TTM,,,...,Capital expenditure,Free cash flow,Cash paid for income t...,Cash paid for interest,Cash Flows From Operat...,Cash Flows From Invest...,Cash Flows From Financ...,Net cash provided by o...,Net cash used for inve...,Net cash provided by (...
3,4,482,2017-12,2018-03,2018-06,2018-09,2018-12,TTM,,,...,Capital expenditure,Free cash flow,Cash paid for income t...,Cash paid for interest,Cash Flows From Operat...,Cash Flows From Invest...,Cash Flows From Financ...,Net cash provided by o...,Net cash used for inve...,Net cash provided by (...
4,5,1,2007-03,2007-06,2007-09,2007-12,2008-03,TTM,-429976.0,-980707.0,...,Capital expenditure,Free cash flow,,,Cash Flows From Operat...,Cash Flows From Invest...,Cash Flows From Financ...,Net cash provided by o...,Net cash used for inve...,Net cash provided by (...


##### DataFrame Length

In [59]:
print('DataFrame contains {:,.0f} records.'.format(len(df_cfq0)))

DataFrame contains 76,331 records.


##### DataFrame Columns

In [60]:
# Quarterly Cash Flow Statement DataFrame Columns
(df_cfq0.loc[0, [col for col in df_cfq0.columns if 'Y' not in col and '_id' not in col]]
 .replace(df.colheaders['header']))

label_g7                 Free Cash Flow
label_g8      Supplemental schedule ...
label_i1                     Net income
label_i10     Stock based compensati...
label_i100          Operating cash flow
label_i11                           NaN
label_i12                           NaN
label_i13                           NaN
label_i15                           NaN
label_i16           Accounts receivable
label_i17                     Inventory
label_i18              Prepaid expenses
label_i19              Accounts payable
label_i2      Depreciation & amortiz...
label_i20           Accrued liabilities
label_i21              Interest payable
label_i22          Income taxes payable
label_i23         Other working capital
label_i24                           NaN
label_i25                           NaN
label_i26                           NaN
label_i27                           NaN
label_i28                           NaN
label_i29                           NaN
label_i3      Amortization of debt d...


<a id="stats"></a>
[return to list of methods](#methods),
[return to the top](#top)
## Data statistics and sample code

### Record Count
**1.** Total number of records **before** merging reference tables *(length of `df.master`)*

In [350]:
print('DataFrame df.master contains {:,.0f} records.'.format(len(df.master)))

DataFrame df.master contains 117,711 records.


**2.** Total number of records **after** merging reference tables *(length of `df_master0`)*

In [351]:
print('DataFrame df_master0 contains {:,.0f} records.'.format(len(df_master0)))

DataFrame df_master0 contains 93 records.


**3.** Total number of records **after** merging reference tables where the following filters apply:
- $openprice > 0$
- $lastprice > 0$

In [352]:
print('DataFrame df_master contains {:,.0f} records.'.format(len(df_master)))

DataFrame df_master contains 18 records.


### Last Updated Dates
List of dates (as a pd.DataFrame object) on which the database records were last updated. 
The `ticker_count` values indicate the number of records updated on each date.

In [353]:
(df_master[['updated_date', 'ticker']].groupby(by='updated_date').count().sort_index(ascending=False)
 .rename(columns={'ticker':'ticker_count'}))

updated_date,ticker_count
2019-03-28,18


### Number of records by Type

In [354]:
(df_master[['security_type', 'ticker']].groupby(by='security_type').count()
 .rename(columns={'ticker':'ticker_count'}))

security_type,ticker_count
Exchange-Traded Fund,1
Stock,17


### Number of records by Country, based on the location of exchanges *(see next table)*

In [164]:
(df_master[['country', 'a3_un', 'ticker']]
 .groupby(by=['country', 'a3_un']).count().rename(columns={'ticker':'ticker_count'})
)

country,a3_un,ticker_count
Australia,AUS,2153
Belgium,BEL,169
Canada,CAN,4184
China,CHN,3844
Finland,FIN,2
France,FRA,1201
Germany,DEU,36942
Hong Kong,HKG,2375
Luxembourg,LUX,949
Netherlands,NLD,225


### Number of records per Exchange
Where $ticker\_count > 100$

In [147]:
cols = ['country', 'a3_un', 'exchange', 'exchange_sym', 'ticker']
df_exchanges = df_master[cols].groupby(by=cols[:-1]).count().rename(columns={'ticker':'ticker_count'})
df_exchanges[df_exchanges['ticker_count'] > 100]

country,a3_un,exchange,exchange_sym,ticker_count
Australia,AUS,ASX - ALL MARKETS,XASX,2153
Belgium,BEL,EURONEXT - EURONEXT BRUSSELS,XBRU,169
Canada,CAN,CANADIAN NATIONAL STOCK EXCHANGE,XCNQ,418
Canada,CAN,TORONTO STOCK EXCHANGE,XTSE,2021
Canada,CAN,TSX VENTURE EXCHANGE,XTSX,1685
China,CHN,SHANGHAI STOCK EXCHANGE,XSHG,1608
China,CHN,SHENZHEN STOCK EXCHANGE,XSHE,2236
France,FRA,EURONEXT - EURONEXT PARIS,XPAR,1201
Germany,DEU,BOERSE BERLIN,XBER,8182
Germany,DEU,BOERSE DUESSELDORF,XDUS,2170


### Number of Stocks by Country of Exchange

In [170]:
(df_master
 .where(df_master['security_type'] == 'Stock').dropna(axis=0, how='all')[['country', 'a3_un', 'ticker']]
 .groupby(by=['country', 'a3_un']).count().rename(columns={'ticker':'ticker_count'})
)

country,a3_un,ticker_count
Australia,AUS,1842
Belgium,BEL,168
Canada,CAN,3372
China,CHN,3649
France,FRA,838
Germany,DEU,36293
Hong Kong,HKG,2280
Luxembourg,LUX,453
Netherlands,NLD,121
Portugal,PRT,53
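
Several cells in this section select rows with `.where(condition).dropna(axis=0, how='all')`. As a side note, plain boolean indexing selects the same rows (assuming no row was entirely empty to begin with) and leaves integer columns as integers, which may explain why ID columns appear as floats such as `1.0` in some later outputs. The sketch below uses a small made-up DataFrame, not the scraper data; the column names only mirror those of `df_master`.

```python
import pandas as pd

# Toy stand-in for df_master; the values are made up for illustration.
toy = pd.DataFrame({
    'ticker_id': [1, 2, 3],
    'security_type': ['Stock', 'Exchange-Traded Fund', 'Stock'],
    'country': ['Germany', 'Germany', 'Canada'],
})

is_stock = toy['security_type'] == 'Stock'

# Pattern used in this notebook: mask non-matching rows with NaN, then drop them.
via_where = toy.where(is_stock).dropna(axis=0, how='all')

# Boolean indexing keeps the same rows without creating intermediate NaN rows.
via_mask = toy[is_stock]

# The masked version upcasts the integer column to float; the indexed one does not.
print(pd.api.types.is_float_dtype(via_where['ticker_id']))
print(pd.api.types.is_integer_dtype(via_mask['ticker_id']))
```

Either form works for the counts shown here; the difference mainly matters when the ID columns are later used as merge keys or displayed.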


### Number of Stocks by Sector

In [148]:
(df_master[['sector', 'ticker']]
 .where(df_master['security_type'] == 'Stock')
 .where(df_master['sector'] != '—')
 .dropna(axis=0, how='all')
 .groupby(by='sector').count().rename(columns={'ticker':'stock_count'}))

sector,stock_count
Basic Materials,11512
Communication Services,1460
Consumer Cyclical,8792
Consumer Defensive,3697
Energy,3827
Financial Services,7921
Healthcare,7520
Industrials,9784
Real Estate,3607
Technology,9570


### Number of Stocks by Industry

In [149]:
(df_master[['sector', 'industry', 'ticker']]
 .where(df_master['security_type'] == 'Stock')
 .where(df_master['industry'] != '—')
 .dropna(axis=0, how='all')
 .groupby(by=['sector', 'industry']).count().rename(columns={'ticker':'stock_count'}))

sector,industry,stock_count
Basic Materials,Agricultural Inputs,288
Basic Materials,Aluminum,139
Basic Materials,Building Materials,818
Basic Materials,Chemicals,713
Basic Materials,Coal,375
Basic Materials,Copper,289
Basic Materials,Gold,1844
Basic Materials,Industrial Metals & Minerals,5248
Basic Materials,Lumber & Wood Production,137
Basic Materials,Paper & Paper Products,292


### Mean Price Ratio of Common Stocks by Sectors

In [153]:
df_mean_vals = (df_master
                .where(df_master['security_type'] == 'Stock').dropna(axis=0, how='all')
                .merge(df_valuation, on=['ticker_id', 'exchange_id'])
                .drop(['ticker_id', 'exchange_id'], axis=1)
               )

In [154]:
df_mean_vals.columns

Index(['ticker', 'company', 'exchange', 'exchange_sym', 'industry', 'sector',
       'country', 'a2_iso', 'a3_un', 'security_type_code', 'security_type',
       'stock_type', 'style', 'openprice', 'lastprice', 'day_hi', 'day_lo',
       '_52wk_hi', '_52wk_lo', 'yield', 'aprvol', 'avevol', 'Forward_PE', 'pb',
       'ps', 'pc', 'currency', 'currency_code', 'fy_end', 'updated_date',
       'PE_2009', 'PE_2010', 'PE_2011', 'PE_2012', 'PE_2013', 'PE_2014',
       'PE_2015', 'PE_2016', 'PE_2017', 'PE_2018', 'PE_TTM', 'PS_2009',
       'PS_2010', 'PS_2011', 'PS_2012', 'PS_2013', 'PS_2014', 'PS_2015',
       'PS_2016', 'PS_2017', 'PS_2018', 'PS_TTM', 'PB_2009', 'PB_2010',
       'PB_2011', 'PB_2012', 'PB_2013', 'PB_2014', 'PB_2015', 'PB_2016',
       'PB_2017', 'PB_2018', 'PB_TTM', 'PC_2009', 'PC_2010', 'PC_2011',
       'PC_2012', 'PC_2013', 'PC_2014', 'PC_2015', 'PC_2016', 'PC_2017',
       'PC_2018', 'PC_TTM'],
      dtype='object')

**For all stocks ...**

In [168]:
print('For a total of {:,.0f} stock records:'.format(len(df_mean_vals)))

(df_mean_vals[['sector', 'company']].groupby('sector').count()
 .rename(columns={'company':'count'})
 .merge(df_mean_vals[['Forward_PE', 'PE_TTM', 'PB_TTM', 'PS_TTM', 'PC_TTM', 
                      'PE_2018', 'PB_2018', 'PS_2018', 'PC_2018',
                      'PE_2017', 'PB_2017', 'PS_2017', 'PC_2017',
                      'lastprice', '_52wk_hi', '_52wk_lo', 'yield', 'sector']]
        .groupby('sector').mean().round(1), on='sector'))

For a total of 69,634 stock records:


sector,count,Forward_PE,PE_TTM,PB_TTM,PS_TTM,PC_TTM,PE_2018,PB_2018,PS_2018,PC_2018,PE_2017,PB_2017,PS_2017,PC_2017,lastprice,_52wk_hi,_52wk_lo,yield
Basic Materials,11512,17.0,28.4,494.6,334.2,26.1,28.3,1.1,48.1,-3.8,34.6,2.1,73.3,-14.2,27.1,29.4,18.7,3.4
Communication Services,1460,29.8,41.8,3.5,5.2,45.2,23.7,3.1,4.4,3.5,40.9,12.9,11.9,-2.5,36.9,89.3,26.5,4.5
Consumer Cyclical,8792,24.7,34.8,5.2,32.3,27.2,28.8,1.1,8.8,14.1,38.8,1.2,8.3,4.5,53.4,112.7,39.5,3.7
Consumer Defensive,3697,21.0,33.3,5.2,23.7,52.6,27.2,5.4,7.6,12.4,36.9,0.8,9.4,7.7,230.4,258.8,190.2,2.9
Energy,3827,48.3,30.7,4.8,193.9,12.7,20.1,0.9,19.4,-7.2,64.5,4.1,7.1,-1.0,18.9,27.1,13.9,6.0
Financial Services,7921,13.6,29.9,41.9,411.0,265.4,20.1,1.7,9.6,0.0,35.7,0.6,7.6,-0.9,361.0,575.1,265.3,4.3
Healthcare,7517,42.2,68.1,18.6,676.1,59.5,57.2,4.3,93.6,-10.2,74.7,5.2,111.8,-2.1,33.0,47.1,21.6,2.3
Industrials,9783,19.7,48.3,9.2,7587.1,57.4,33.3,3.0,8.1,4.2,37.8,3.4,11.6,3.4,68.8,88.2,55.6,3.2
Real Estate,3607,24.4,33.7,22.4,35.1,139.2,22.5,3.1,8.5,34.5,30.6,5.7,24.1,40.5,67.7,146.3,53.2,5.4
Technology,9570,42.6,61.4,10.5,401.9,134.8,56.5,1.1,23.6,-2.2,64.2,4.6,22.2,15.5,37.2,54.8,22.1,3.5
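
The cell above builds the per-sector count and the per-sector means separately and then merges them. A possible alternative, sketched here on made-up data, is pandas named aggregation, which computes both in a single `groupby`. The toy values and the reduced column list are assumptions for illustration only.

```python
import pandas as pd

# Made-up records; column names mirror a subset of df_mean_vals.
toy = pd.DataFrame({
    'sector': ['Energy', 'Energy', 'Technology'],
    'company': ['A', 'B', 'C'],
    'PE_TTM': [10.0, 20.0, 40.0],
    'PB_TTM': [1.0, 3.0, 5.0],
})

# One pass: count companies and average each ratio per sector.
summary = (toy.groupby('sector')
           .agg(count=('company', 'count'),
                PE_TTM=('PE_TTM', 'mean'),
                PB_TTM=('PB_TTM', 'mean'))
           .round(1))
print(summary)
```

Swapping `'mean'` for `'median'` in the same call would give a summary that is less sensitive to extreme ratios, if that is of interest.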


**For USA stocks only ...**

In [167]:
df_mean_vals_us = df_mean_vals.where(df_mean_vals['a3_un'] == 'USA').dropna(axis=0, how='all')
print('For a total of {:,.0f} stock records:'.format(len(df_mean_vals_us)))

(df_mean_vals_us[['sector', 'company']].groupby('sector').count()
 .rename(columns={'company':'count'})
 .merge(df_mean_vals_us[['Forward_PE', 'PE_TTM', 'PB_TTM', 'PS_TTM', 'PC_TTM', 
                      'PE_2018', 'PB_2018', 'PS_2018', 'PC_2018',
                      'PE_2017', 'PB_2017', 'PS_2017', 'PC_2017',
                      'lastprice', '_52wk_hi', '_52wk_lo', 'yield', 'sector']]
        .groupby('sector').mean().round(1), on='sector'))

For a total of 16,559 stock records:


sector,count,Forward_PE,PE_TTM,PB_TTM,PS_TTM,PC_TTM,PE_2018,PB_2018,PS_2018,PC_2018,PE_2017,PB_2017,PS_2017,PC_2017,lastprice,_52wk_hi,_52wk_lo,yield
Basic Materials,2323,17.1,23.6,2676.9,373.1,33.9,27.8,-2.8,56.4,-9.8,32.8,-0.2,86.3,-9.0,44.8,50.0,41.1,3.6
Communication Services,318,34.6,38.4,5.3,11.5,27.9,23.4,6.1,10.6,-7.2,29.9,-1.3,36.3,-10.2,79.4,83.8,59.4,5.1
Consumer Cyclical,1842,23.5,35.5,6.7,128.8,24.0,25.6,-3.0,10.3,3.6,40.2,-1.1,24.9,-6.5,47.4,55.6,41.1,3.8
Consumer Defensive,883,21.0,34.0,10.2,84.7,74.1,24.7,18.9,24.5,-27.6,38.9,-1.0,27.0,9.0,288.3,296.2,266.2,3.7
Energy,986,24.9,40.9,5.4,180.0,21.4,18.2,-0.9,22.4,-9.4,69.3,5.9,20.9,0.8,13.5,19.9,10.9,10.9
Financial Services,2511,14.7,29.0,5.4,266.2,34.9,22.1,7.7,5.6,2.3,31.6,-6.1,10.2,-10.5,244.2,285.4,223.2,4.0
Healthcare,1968,48.7,66.1,36.7,881.9,35.2,76.2,13.5,121.4,-27.6,117.5,5.8,157.7,0.9,22.2,30.4,16.6,3.4
Industrials,2243,21.6,55.5,20.4,42.9,164.0,25.3,-1.2,16.8,-20.2,34.0,-1.7,25.8,-9.6,107.2,125.0,99.2,3.5
Real Estate,939,26.4,42.9,1.9,10.9,36.2,25.1,1.0,-7.6,11.9,40.9,20.1,10.0,31.7,85.1,89.5,76.8,7.3
Technology,2064,42.3,66.5,23.2,1782.6,98.4,46.7,-7.7,62.6,-25.2,59.2,0.8,53.3,4.6,26.2,34.0,18.5,6.8


**For DEU (Germany) stocks only ...**

In [166]:
df_mean_vals_de = df_mean_vals.where(df_mean_vals['a3_un'] == 'DEU').dropna(axis=0, how='all')
print('For a total of {:,.0f} stock records:'.format(len(df_mean_vals_de)))

(df_mean_vals_de[['sector', 'company']].groupby('sector').count()
 .rename(columns={'company':'count'})
 .merge(df_mean_vals_de[['Forward_PE', 'PE_TTM', 'PB_TTM', 'PS_TTM', 'PC_TTM', 
                      'PE_2018', 'PB_2018', 'PS_2018', 'PC_2018',
                      'PE_2017', 'PB_2017', 'PS_2017', 'PC_2017',
                      'lastprice', '_52wk_hi', '_52wk_lo', 'yield', 'sector']]
        .groupby('sector').mean().round(1), on='sector'))

For a total of 36,290 stock records:


sector,count,Forward_PE,PE_TTM,PB_TTM,PS_TTM,PC_TTM,PE_2018,PB_2018,PS_2018,PC_2018,PE_2017,PB_2017,PS_2017,PC_2017,lastprice,_52wk_hi,_52wk_lo,yield
Basic Materials,5880,17.1,22.3,4.6,126.0,18.2,24.5,1.9,50.4,0.6,30.3,0.8,78.3,-14.3,15.4,20.0,9.0,3.5
Communication Services,910,27.7,39.5,3.0,3.8,55.2,21.1,2.4,3.2,7.2,44.9,14.9,6.3,-2.3,20.8,101.7,14.0,4.4
Consumer Cyclical,4624,27.0,31.4,4.4,11.7,19.5,24.5,3.1,10.2,16.7,35.0,2.3,4.0,6.8,43.7,139.6,27.1,3.8
Consumer Defensive,1953,21.0,26.2,3.7,6.5,35.5,22.2,1.5,2.3,16.6,29.3,-1.1,3.5,19.1,204.6,238.7,152.7,2.9
Energy,2054,51.1,23.6,5.2,165.3,9.1,18.0,1.4,12.0,-4.3,68.5,4.2,4.6,-2.3,13.3,21.4,8.5,4.7
Financial Services,3698,12.9,29.1,2.2,8.7,27.8,16.3,1.3,6.0,-2.4,35.4,2.3,8.1,3.1,558.1,626.7,373.9,4.2
Healthcare,4225,41.9,74.8,7.8,676.7,40.0,55.3,-1.7,85.1,-0.5,54.2,3.2,101.0,-7.8,30.4,44.2,17.0,2.2
Industrials,5009,18.7,45.4,6.4,155.4,23.6,27.4,3.6,3.3,10.2,31.6,3.9,9.0,11.8,41.0,57.6,26.9,3.2
Real Estate,1737,25.7,30.0,1.6,45.1,193.1,18.0,1.3,11.3,47.8,28.2,2.1,32.0,64.2,55.7,199.9,34.6,5.0
Technology,5171,45.2,53.5,6.9,72.7,161.9,53.6,2.9,14.8,2.3,58.9,5.6,13.5,10.6,35.3,48.8,17.2,3.6


### Stocks in the Cannabis Industry
Using stocks listed on [marijuanaindex.com](http://marijuanaindex.com/stock-quotes/north-american-marijuana-index/) under North America

In [98]:
import json

with open('input/pot_stocks.json') as file:
    pot_symbols = json.loads(file.read())
    
pot_stocks = (pd.DataFrame(pot_symbols, columns=['ticker', 'country_c3'])
               .merge(df_master, how='left', on=['ticker', 'country_c3']).drop('country', axis=1)
               .rename(columns={'country_c3':'country', 'exchange_sym':'exch'})
             )
pot_stocks = (pot_stocks
               .where(((pot_stocks['country'] == 'USA') | (pot_stocks['country'] == 'CAN')) &
                      (pot_stocks['sector'] != '—')
                     )
               .dropna(axis=0, how='all')
               .sort_values(by='ticker')
               [['country', 'ticker', 'exch', 'company', 'sector', 'industry']])

In [101]:
msg = 'Below are the {} stocks listed on marijuanaindex.com for North America.'
print(msg.format(len(pot_stocks['company'].unique())))
pot_stocks

Below are the 46 stocks listed on marijuanaindex.com for North America.


Unnamed: 0,country,ticker,exch,company,sector,industry
0,CAN,ACB,XTSE,Aurora Cannabis Inc,Healthcare,Drug Manufacturers - Specialty & Generic
2,CAN,ACRG.U,XCNQ,Acreage Holdings Inc Ordinary Shares (Sub Voting),Healthcare,Drug Manufacturers - Specialty & Generic
1,USA,ACRGF,PINX,Acreage Holdings Inc Ordinary Shares (Sub Voting),Healthcare,Drug Manufacturers - Specialty & Generic
37,USA,APHA,XNYS,Aphria Inc,Healthcare,Drug Manufacturers - Specialty & Generic
38,USA,CGC,XNYS,Canopy Growth Corp,Healthcare,Drug Manufacturers - Specialty & Generic
3,CAN,CL,XCNQ,Cresco Labs Inc,Healthcare,Drug Manufacturers - Specialty & Generic
4,CAN,CNNX,XCNQ,Cannex Capital Holdings Inc,Healthcare,Drug Manufacturers - Specialty & Generic
39,USA,CRON,XNAS,Cronos Group Inc,Healthcare,Drug Manufacturers - Specialty & Generic
5,CAN,CURA,XCNQ,Curaleaf Holdings Inc,Healthcare,Drug Manufacturers - Specialty & Generic
40,USA,CVSI,PINX,CV Sciences Inc,Healthcare,Drug Manufacturers - Specialty & Generic


<a id="value"></a>
[return to the top](#top)

## Applying value investing criteria to filter common stocks

- **[Rule 1](#rule1): No earnings deficit (loss) for past 5 years**
- **[Rule 2](#rule2): Uninterrupted and increasing Dividends for past 7 yrs**
- **[Rule 3](#rule3): P/E Ratio of 25 or less for the past 7 yrs and less than 20 for TTM**
- **[Rule 4](#rule4): P/B Ratio of 1 or less for TTM**
- **[Rule 5](#rule5): Filter for "bargain issues"**
- **[Rule 6](#rule5):**
- **[Rule 7](#rule5):**
- **[Rule 8](#rule5):**
<a id="rule1"></a>

### Rule 1. No earnings deficit (loss) for past 5 years

**a. Identify *Net Income* column labels in** `df_annualIS`

In [94]:
data = 'Net income'
df_labels = df_labels_aIS[df_labels_aIS['value'] == data].sort_values(by='value')
df_labels

header,value
label_i50,Net income
label_i70,Net income
label_i80,Net income


**b. Get column headers for 'Net income' values for the past 5 yrs**

In [95]:
i_ids = [(label[-3:] + '_') for label in df_labels.index]

def get_icols(col):
    for i_id in i_ids:
        if i_id in col:
            return True
    return False

main_cols1 = ['ticker_id', 'exchange_id', 
             'country', 'exchange_sym', 'ticker', 'company', 
             'sector', 'industry', 'stock_type', 'style', 
             'Y6', 'Y5', 'Y4', 'Y3', 'Y2', 'Y1']
data_cols = sorted(list(filter(get_icols, df_annualIS.columns)), key=lambda r: (r[-1], r[5:8]), reverse=True)
print('The following columns contain \'{}\' values:\n{}'.format(data, data_cols))

The following columns contain 'Net income' values:
['data_i80_Y_6', 'data_i70_Y_6', 'data_i50_Y_6', 'data_i80_Y_5', 'data_i70_Y_5', 'data_i50_Y_5', 'data_i80_Y_4', 'data_i70_Y_4', 'data_i50_Y_4', 'data_i80_Y_3', 'data_i70_Y_3', 'data_i50_Y_3', 'data_i80_Y_2', 'data_i70_Y_2', 'data_i50_Y_2', 'data_i80_Y_1', 'data_i70_Y_1', 'data_i50_Y_1']


**c. Create 'Net Income' DataFrame**

In [96]:
df_netinc5 = (df_annualIS
              .where((df_annualIS['security_type'] == 'Stock') & 
                     (df_annualIS['Y5'] >= pd.to_datetime('2018-01')))
              .dropna(axis=0, how='all')
              .drop(['country'], axis=1)
              .rename(columns={'country_c3':'country'})
             )[main_cols1 + data_cols]

np_netinc = df_netinc5[data_cols].values
netinc_cols = [('Net_Income_Y' + data_cols[i * 3][-1], (i * 3, i * 3 + 1, i * 3 + 2))
               for i in range(int(len(data_cols)/3))]

vals = []
for row in np_netinc:
    row_vals = []
    for i in range(len(netinc_cols)):
        val = None
        for col in netinc_cols[i][1]:
            if not np.isnan(row[col]):
                val = row[col]
                break
        row_vals.append(val)
    vals.append(row_vals)
    
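# Reuse df_netinc5's index so the join below aligns row by row after filtering
df_vals = pd.DataFrame(vals, columns=list(zip(*netinc_cols))[0], index=df_netinc5.index)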
df_netinc5 = df_netinc5[main_cols1].join(df_vals)

In [97]:
df_rule1 = df_netinc5.where((df_netinc5['Net_Income_Y6'] > 0) & 
                            ((df_netinc5['Net_Income_Y5'] > 0) | (df_netinc5['Net_Income_Y5'].isna())) & 
                            ((df_netinc5['Net_Income_Y4'] > 0) | (df_netinc5['Net_Income_Y4'].isna())) & 
                            ((df_netinc5['Net_Income_Y3'] > 0) | (df_netinc5['Net_Income_Y3'].isna())) & 
                            ((df_netinc5['Net_Income_Y2'] > 0) | (df_netinc5['Net_Income_Y2'].isna())) & 
                            ((df_netinc5['Net_Income_Y1'] > 0) | (df_netinc5['Net_Income_Y1'].isna()))
                           ).dropna(axis=0, how='all')

In [98]:
df_rule1

Unnamed: 0,ticker_id,exchange_id,country,exchange_sym,ticker,company,sector,industry,stock_type,style,...,Y4,Y3,Y2,Y1,Net_Income_Y6,Net_Income_Y5,Net_Income_Y4,Net_Income_Y3,Net_Income_Y2,Net_Income_Y1
0,1.0,374.0,USA,ARCX,OGCP,Empire State Realty OP LP Operating Partnershi...,Real Estate,REIT - Diversified,Hard Asset,Mid Core,...,2017-12-01,2016-12-01,2015-12-01,2014-12-01,1.164020e+08,1.164020e+08,1.182530e+08,1.072500e+08,7.992800e+07,7.021000e+07
1,2.0,374.0,USA,ARCX,FISK,Empire State Realty OP LP Operating Partnershi...,Real Estate,REIT - Diversified,Hard Asset,Mid Core,...,2017-12-01,2016-12-01,2015-12-01,2014-12-01,1.164020e+08,1.164020e+08,1.182530e+08,1.072500e+08,7.992800e+07,7.021000e+07
2,3.0,374.0,USA,ARCX,ESBA,Empire State Realty OP LP Operating Partnershi...,Real Estate,REIT - Diversified,Hard Asset,Mid Core,...,2017-12-01,2016-12-01,2015-12-01,2014-12-01,1.164020e+08,1.164020e+08,1.182530e+08,1.072500e+08,7.992800e+07,7.021000e+07
4,19275.0,302.0,USA,XNYS,LPT,Liberty Property Trust,Real Estate,REIT - Office,Hard Asset,Mid Core,...,2017-12-01,2016-12-01,2015-12-01,2014-12-01,4.796070e+08,4.796070e+08,2.823400e+08,3.568170e+08,2.380390e+08,2.179100e+08
6,19849.0,302.0,USA,XNYS,DEI,Douglas Emmett Inc,Real Estate,REIT - Office,Hard Asset,Mid Core,...,2017-12-01,2016-12-01,2015-12-01,2014-12-01,1.160860e+08,1.160860e+08,9.444300e+07,8.539700e+07,5.838400e+07,4.462100e+07
7,19240.0,302.0,USA,XNYS,KRC,Kilroy Realty Corp,Real Estate,REIT - Office,Hard Asset,Mid Core,...,2017-12-01,2016-12-01,2015-12-01,2014-12-01,2.584150e+08,2.584150e+08,1.646120e+08,2.937880e+08,2.340810e+08,1.802190e+08
8,19993.0,302.0,USA,XNYS,PEB,Pebblebrook Hotel Trust,Real Estate,REIT - Hotel & Motel,Hard Asset,Mid Core,...,2017-12-01,2016-12-01,2015-12-01,2014-12-01,1.339300e+07,1.339300e+07,9.988800e+07,7.370400e+07,9.466800e+07,7.286600e+07
9,19335.0,302.0,USA,XNYS,MAA,Mid-America Apartment Communities Inc,Real Estate,REIT - Residential,Hard Asset,Mid Core,...,2017-12-01,2016-12-01,2015-12-01,2014-12-01,2.228990e+08,2.228990e+08,3.283790e+08,2.122220e+08,3.322870e+08,1.479800e+08
10,19694.0,302.0,USA,XNYS,UDR,UDR Inc,Real Estate,REIT - Residential,Hard Asset,Mid Core,...,2017-12-01,2016-12-01,2015-12-01,2014-12-01,2.031060e+08,2.031060e+08,1.215580e+08,2.927180e+08,3.403830e+08,1.543340e+08
11,19021.0,302.0,USA,XNYS,EGP,EastGroup Properties Inc,Real Estate,REIT - Industrial,Hard Asset,Mid Core,...,2017-12-01,2016-12-01,2015-12-01,2014-12-01,8.850600e+07,8.850600e+07,8.318300e+07,9.550900e+07,4.786600e+07,4.794100e+07
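
The nested loop in step c picks, for each period, the first non-NaN value among the three candidate 'Net income' columns. A vectorized sketch of the same idea is shown below: back-fill across the columns and take the first one. The data is made up, and only one period (`Y_6`) is shown; the column order follows the priority used in `data_cols` (i80, then i70, then i50).

```python
import numpy as np
import pandas as pd

# Toy data: three candidate 'Net income' columns for a single period.
toy = pd.DataFrame({
    'data_i80_Y_6': [np.nan, 5.0, np.nan],
    'data_i70_Y_6': [2.0, 7.0, np.nan],
    'data_i50_Y_6': [3.0, np.nan, np.nan],
})

# Back-fill along each row, then keep the first column:
# this yields the first non-NaN value per row, or NaN if all are missing.
net_income_y6 = toy.bfill(axis=1).iloc[:, 0].rename('Net_Income_Y6')
print(net_income_y6)
```

Because the result keeps the index of the input DataFrame, it can be joined back to the identifying columns without any index bookkeeping.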


[return to top of this section](#value),
[return to the top](#top)
<a id="rule2"></a>
### Rule 2. Uninterrupted and increasing *Dividends* for past 7 yrs

**a. Identify *Dividends* column label in** `df_keyratios`

In [109]:
icol = df_labels_kratios[df_labels_kratios == 'Dividends'].index[0]
icol

'i6'

**b. Get column headers for *Dividends* for TTM and the past 7 yrs**

In [110]:
main_cols2 = ['ticker_id', 'exchange_id', 
             #'country_c3', 'exchange_sym', 'ticker', 'company', 
             #'sector', 'industry', 'stock_type', 'style', 
             'Y10', 'Y9', 'Y8', 'Y7', 'Y6', 'Y5']
icols = sorted([col for col in df_keyratios.columns if icol + '_' in col], 
               key=lambda col: int(col[4:]), reverse=True)[:8]
icols

['i6_Y10', 'i6_Y9', 'i6_Y8', 'i6_Y7', 'i6_Y6', 'i6_Y5', 'i6_Y4', 'i6_Y3']

**c. Create 'Dividends' DataFrame**

In [111]:
df_rule2 = (df_keyratios
            .where((df_keyratios['security_type'] == 'Stock') & 
                   (df_keyratios['Y9'] >= pd.to_datetime('2018-01')) & 
                   (df_keyratios['i6_Y10'].notna()) & (df_keyratios['i6_Y9'].notna()) &
                   (df_keyratios['i6_Y8'].notna()) & (df_keyratios['i6_Y7'].notna()) &
                   (df_keyratios['i6_Y6'].notna()) & (df_keyratios['i6_Y5'].notna()) & 
                   (df_keyratios['i6_Y4'].notna()) & (df_keyratios['i6_Y3'].notna()) & 
                   (df_keyratios['i6_Y10'] >= df_keyratios['i6_Y9']) & 
                   (df_keyratios['i6_Y9'] >= df_keyratios['i6_Y8']) & 
                   (df_keyratios['i6_Y8'] >= df_keyratios['i6_Y7']) & 
                   (df_keyratios['i6_Y7'] >= df_keyratios['i6_Y6']) & 
                   (df_keyratios['i6_Y6'] >= df_keyratios['i6_Y5']) & 
                   (df_keyratios['i6_Y5'] >= df_keyratios['i6_Y4']) & 
                   (df_keyratios['i6_Y4'] >= df_keyratios['i6_Y3']))
            .dropna(axis=0, how='all').sort_values(by='Y9', ascending=False))[main_cols2 + icols]

df_rule2.columns = main_cols2 + [col.replace('i6', 'Dividend') for col in icols]

In [112]:
df_rule2

Unnamed: 0,ticker_id,exchange_id,Y10,Y9,Y8,Y7,Y6,Y5,Dividend_Y10,Dividend_Y9,Dividend_Y8,Dividend_Y7,Dividend_Y6,Dividend_Y5,Dividend_Y4,Dividend_Y3
63894,19249.0,302.0,TTM,2019-01-01,2018-01-01,2017-01-01,2016-01-01,2015-01-01,2.44,2.44,2.20,2.00,1.80,1.56,1.40,1.28
68154,19649.0,141.0,TTM,2019-01-01,2018-01-01,2017-01-01,2016-01-01,2015-01-01,2.15,2.15,1.95,1.75,1.58,1.48,1.34,1.25
34106,19106.0,302.0,TTM,2019-01-01,2018-01-01,2017-01-01,2016-01-01,2015-01-01,0.97,0.97,0.92,0.92,0.92,0.88,0.70,0.50
34105,19393.0,302.0,TTM,2019-01-01,2018-01-01,2017-01-01,2016-01-01,2015-01-01,1.48,1.48,1.48,1.48,1.48,1.32,1.20,1.08
34103,18701.0,302.0,TTM,2019-01-01,2018-01-01,2017-01-01,2016-01-01,2015-01-01,0.55,0.55,0.50,0.50,0.50,0.50,0.38,0.33
63895,19276.0,302.0,TTM,2019-01-01,2018-01-01,2017-01-01,2016-01-01,2015-01-01,2.40,2.40,2.40,2.40,2.00,1.36,1.20,1.00
63900,18891.0,302.0,TTM,2019-01-01,2018-01-01,2017-01-01,2016-01-01,2015-01-01,0.34,0.34,0.33,0.32,0.31,0.30,0.24,0.21
68156,19649.0,142.0,TTM,2019-01-01,2018-01-01,2017-01-01,2016-01-01,2015-01-01,2.15,2.15,1.95,1.75,1.58,1.48,1.34,1.25
68153,19649.0,17.0,TTM,2019-01-01,2018-01-01,2017-01-01,2016-01-01,2015-01-01,2.15,2.15,1.95,1.75,1.58,1.48,1.34,1.25
68152,19649.0,16.0,TTM,2019-01-01,2018-01-01,2017-01-01,2016-01-01,2015-01-01,2.15,2.15,1.95,1.75,1.58,1.48,1.34,1.25
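
The Rule 2 filter chains one comparison per pair of adjacent periods. Note that it uses `>=`, so a flat dividend from one year to the next also passes. The sketch below expresses the same check by comparing each column with the next-older one in a single array operation. The data is made up and only four of the eight columns are used.

```python
import numpy as np
import pandas as pd

# Columns ordered newest to oldest, as in icols.
cols = ['i6_Y10', 'i6_Y9', 'i6_Y8', 'i6_Y7']
toy = pd.DataFrame(
    [[2.44, 2.44, 2.20, 2.00],     # never decreases over time
     [1.00, 1.20, 1.10, 1.00],     # latest payout dropped
     [0.50, np.nan, 0.40, 0.30]],  # one period missing
    columns=cols)

values = toy[cols].to_numpy()
newer, older = values[:, :-1], values[:, 1:]

# Comparisons involving NaN evaluate to False, so rows with a
# missing dividend are excluded without a separate notna() check.
keeps = pd.Series((newer >= older).all(axis=1), index=toy.index)
filtered = toy[keeps]
print(filtered)
```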


[return to top of this section](#value),
[return to the top](#top)
<a id="rule3"></a>
### Rule 3. P/E Ratio of 25 or less for the past 7 yrs and less than 20 for TTM

In [113]:
pe_cols = [col for col in df_valuation.columns if 'PE_' in col]
pe_cols = pe_cols[::-1][:8]  # newest first: TTM, then the seven most recent years
pe_cols

['PE_TTM',
 'PE_2018',
 'PE_2017',
 'PE_2016',
 'PE_2015',
 'PE_2014',
 'PE_2013',
 'PE_2012']

In [114]:
df_rule3 = (df_valuation[['ticker_id', 'exchange_id'] + pe_cols]
            .where((df_valuation['PE_TTM'] <= 20) & (df_valuation['PE_2018'] <= 25) &
                   (df_valuation['PE_2017'] <= 25) & (df_valuation['PE_2016'] <= 25) &
                   (df_valuation['PE_2015'] <= 25) & (df_valuation['PE_2014'] <= 25) &
                   (df_valuation['PE_2013'] <= 25) & (df_valuation['PE_2012'] <= 25)).dropna(axis=0, how='all'))

In [115]:
df_rule3

Unnamed: 0,ticker_id,exchange_id,PE_TTM,PE_2018,PE_2017,PE_2016,PE_2015,PE_2014,PE_2013,PE_2012
643,644.0,1.0,14.3,15.0,15.2,15.1,13.6,17.7,11.8,10.4
700,704.0,1.0,6.4,5.4,9.0,15.5,14.5,14.7,17.1,16.2
878,885.0,1.0,17.2,16.2,20.4,21.4,18.3,15.2,14.1,14.5
888,895.0,1.0,8.1,7.3,8.2,7.2,11.7,14.9,24.3,17.8
946,954.0,1.0,5.1,4.4,14.3,9.1,3.5,6.4,4.5,5.2
1054,1063.0,17.0,5.6,4.3,9.4,11.0,6.6,6.4,6.0,12.4
1055,1063.0,141.0,5.3,4.2,8.8,10.6,6.3,6.9,6.7,13.6
1080,1087.0,1.0,3.4,4.1,7.4,5.8,4.7,5.0,6.4,6.0
1144,1152.0,1.0,11.0,12.2,12.9,13.3,18.9,8.5,4.4,4.7
1149,1158.0,1.0,14.1,10.9,16.5,15.9,16.1,15.3,13.6,8.0
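
The Rule 3 condition repeats the same `<= 25` test for every annual column. A more compact form, sketched here on made-up data with a shortened column list, applies the threshold to all annual columns at once and combines it with the TTM test.

```python
import numpy as np
import pandas as pd

# Shortened, hypothetical version of pe_cols (TTM first, then annual columns).
pe_cols = ['PE_TTM', 'PE_2018', 'PE_2017']
toy = pd.DataFrame(
    [[14.3, 15.0, 15.2],     # passes both tests
     [22.0, 15.0, 15.2],     # TTM above 20
     [10.0, 30.0, 12.0],     # one year above 25
     [10.0, np.nan, 12.0]],  # missing year fails the comparison
    columns=pe_cols)

# TTM must be 20 or less; every annual value must be 25 or less.
passes = (toy['PE_TTM'] <= 20) & toy[pe_cols[1:]].le(25).all(axis=1)
df_rule3_toy = toy[passes]
print(df_rule3_toy)
```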


[return to top of this section](#value),
[return to the top](#top)
<a id="rule4"></a>
### Rule 4. P/B Ratio of 1 or less for TTM

In [117]:
pb_cols = [col for col in df_valuation.columns if 'PB_' in col]
pb_cols = pb_cols[::-1][:6]  # newest first: TTM, then the five most recent years
pb_cols

['PB_TTM', 'PB_2018', 'PB_2017', 'PB_2016', 'PB_2015', 'PB_2014']

In [118]:
df_rule4 = (df_valuation[['ticker_id', 'exchange_id'] + pb_cols]
            .where(df_valuation['PB_TTM'] <= 1).dropna(axis=0))

In [119]:
df_rule4

Unnamed: 0,ticker_id,exchange_id,PB_TTM,PB_2018,PB_2017,PB_2016,PB_2015,PB_2014
144,139.0,1.0,0.6,0.6,0.7,0.2,0.6,0.8
148,143.0,1.0,0.1,0.1,0.3,0.6,0.2,0.5
308,302.0,1.0,0.9,1.0,3.6,-14.1,0.9,0.1
340,335.0,1.0,0.7,0.6,1.8,0.3,0.2,0.8
366,361.0,1.0,0.5,0.5,1.2,0.9,2.1,8.5
400,396.0,1.0,0.5,0.6,1.4,2.8,5.1,4.5
601,602.0,1.0,0.3,0.4,0.9,2.4,0.2,0.2
672,675.0,1.0,0.7,0.6,1.5,2.1,1.7,3.1
686,689.0,1.0,0.2,0.1,0.8,0.9,0.7,1.0
688,691.0,1.0,0.3,0.3,0.7,0.9,1.4,1.1
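
Note that a `PB_TTM <= 1` test also admits negative P/B values, and the -14.1 entry under `PB_2016` above shows that negative ratios do occur in this data. If those cases should be excluded, one possible variation is to add a lower bound. This is an assumption about the intent of the rule, shown here on made-up values:

```python
import pandas as pd

# Toy values only; ticker 11 has a negative P/B
toy_pb = pd.DataFrame({
    'ticker_id': [10, 11, 12, 13],
    'PB_TTM':    [0.6, -2.5, 1.4, 1.0],
})

# Keep rows where 0 < P/B <= 1
in_range = (toy_pb['PB_TTM'] > 0) & (toy_pb['PB_TTM'] <= 1)
positive_low_pb = toy_pb.loc[in_range]
print(positive_low_pb)
```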


[return to top of this section](#value),
[return to the top](#top)
<a id="rule5"></a>
### Rule 5. Filtering for "bargain issues" as described by Benjamin Graham in *The Intelligent Investor*

In his book, Graham defines bargain issues as common stocks where $\text{Price} < \text{net working capital}$.

By "net working capital", Graham means a company's current assets (e.g. cash, marketable securities, and inventory) minus its total liabilities (including preferred stock and long-term debt).
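
A minimal sketch of this test on made-up numbers follows. The column names are placeholders and not the scraper's actual schema. Comparing the share price with net working capital per share is an assumption about how the inequality would be applied.

```python
import pandas as pd

# Hypothetical inputs; column names and values are for illustration only
toy_balance = pd.DataFrame({
    'ticker': ['AAA', 'BBB', 'CCC'],
    'price': [4.00, 12.50, 9.00],
    'current_assets': [900.0, 500.0, 1200.0],
    'total_liabilities': [300.0, 450.0, 200.0],
    'shares_outstanding': [100.0, 50.0, 100.0],
})

# Net working capital as defined above: current assets minus total liabilities
nwc = toy_balance['current_assets'] - toy_balance['total_liabilities']
toy_balance['nwc_per_share'] = nwc / toy_balance['shares_outstanding']

# Bargain issue: price below net working capital per share
bargains = toy_balance.loc[toy_balance['price'] < toy_balance['nwc_per_share']]
print(bargains[['ticker', 'price', 'nwc_per_share']])
```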

[return to top of this section](#value),
[return to the top](#top)
<a id="rule6"></a>
### Rule 6.

[return to top of this section](#value),
[return to the top](#top)
<a id="rule7"></a>
### Rule 7.

### Merging DataFrames

In [49]:
df_rules = (df_rule1
            .merge(df_rule2, on=['ticker_id', 'exchange_id'])
            .merge(df_rule3, on=['ticker_id', 'exchange_id'])
            .merge(df_rule4, on=['ticker_id', 'exchange_id'])
           )

In [50]:
df_rules.columns.values

array(['ticker_id', 'exchange_id', 'country', 'exchange_sym', 'ticker',
       'company', 'sector', 'industry', 'stock_type', 'style', 'Y6_x',
       'Y5_x', 'Y4', 'Y3', 'Y2', 'Y1', 'Net_Income_Y6', 'Net_Income_Y5',
       'Net_Income_Y4', 'Net_Income_Y3', 'Net_Income_Y2', 'Net_Income_Y1',
       'Y10', 'Y9', 'Y8', 'Y7', 'Y6_y', 'Y5_y', 'Dividend_Y10',
       'Dividend_Y9', 'Dividend_Y8', 'Dividend_Y7', 'Dividend_Y6',
       'Dividend_Y5', 'PE_TTM', 'PE_2018', 'PE_2017', 'PE_2016',
       'PE_2015', 'PE_2014', 'PB_TTM', 'PB_2018', 'PB_2017', 'PB_2016',
       'PB_2015', 'PB_2014'], dtype=object)
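
The `Y6_x`/`Y6_y` and `Y5_x`/`Y5_y` names in the output above are the default suffixes pandas adds when a non-key column exists in both frames being merged. The sketch below uses toy frames, and the suffix names `_income` and `_dividend` are my own choice. It shows how the `suffixes=` argument of `merge` can give such columns more descriptive names:

```python
import pandas as pd

keys = ['ticker_id', 'exchange_id']

# Toy frames that share the non-key column 'Y6'
toy_rule1 = pd.DataFrame({'ticker_id': [1, 2], 'exchange_id': [1, 1],
                          'Y6': ['2013', '2013'], 'Net_Income_Y6': [5.0, 7.0]})
toy_rule2 = pd.DataFrame({'ticker_id': [1, 3], 'exchange_id': [1, 1],
                          'Y6': ['2013', '2013'], 'Dividend_Y6': [0.4, 0.9]})

# Default behaviour: overlapping columns become Y6_x and Y6_y
default_merge = toy_rule1.merge(toy_rule2, on=keys)

# Explicit suffixes make the origin of each column visible
named_merge = toy_rule1.merge(toy_rule2, on=keys, suffixes=('_income', '_dividend'))

print(default_merge.columns.tolist())
print(named_merge.columns.tolist())
```

Since `merge` defaults to an inner join, only keys present in both frames survive. In the toy data that is the single row for `ticker_id` 1.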

In [51]:
df_rules.groupby('company').count()['ticker_id']

company
AP (Thailand) PCL DR                                                     2
APT Satellite Holdings Ltd                                               1
Acme United Corp                                                         1
Air Lease Corp Class A                                                   2
Aircastle Ltd                                                            3
Akbank TAS ADR                                                           3
Anhui Expressway Co Ltd Class H                                          4
Anhui Expressway Co Ltd H Shares                                         1
Apollo Commercial Real Estate Finance Inc                                1
Ares Capital Corp                                                        1
Ares Commercial Real Estate Corp                                         1
Associated Banc-Corp                                                     1
Assured Guaranty Ltd                                                     4
BTB Real Estate I

<a id="additional"></a>
[return to the top](#top)

## Additional sample / test code

##### Global Price Ratios

In [88]:
val_sector_mean = df_val.groupby(by='sector').mean()
val_sector_mean

sector,PE_Y0,PE_Y1,PE_Y2,PE_Y3,PE_Y4,PE_Y5,PE_Y6,PE_Y7,PE_Y8,PE_Y9,...,PC_Y1,PC_Y2,PC_Y3,PC_Y4,PC_Y5,PC_Y6,PC_Y7,PC_Y8,PC_Y9,PC_Y10
Basic Materials,61.579721,55.173533,33.150601,37.892019,40.270251,75.634843,55.824362,66.400121,34.528865,28.095283,...,-22.888388,-11.204057,-0.992312,-2.604594,0.376739,2.811162,-7.049953,-13.555288,-7.437073,24.716683
Communication Services,27.280032,23.509114,32.292725,20.279396,29.823804,40.925704,31.437312,37.990069,40.31033,23.593796,...,10.681604,5.663848,7.061435,1.865088,5.571925,-6.217938,8.607053,-8.395761,-19.03525,44.971531
Consumer Cyclical,46.345072,45.667061,29.209846,36.320366,57.654242,36.053718,43.791304,50.365307,37.998469,28.859865,...,7.077191,7.040202,8.756523,-1.477731,-6.665751,9.056978,8.552634,3.388435,5.751244,28.451575
Consumer Defensive,54.303199,36.345419,27.909314,27.053295,41.59689,33.028225,39.559253,38.191858,37.73671,27.347293,...,-0.433539,-0.836272,8.936629,5.585188,7.072185,16.315822,9.393806,7.750529,12.969358,584.717393
Energy,43.555902,40.025847,32.462549,34.151984,40.244201,30.825918,55.653275,47.53867,63.279322,20.067568,...,-5.177682,-1.013004,-2.634143,7.87505,8.242134,0.998608,3.229006,1.646553,-8.042885,14.287151
Financial Services,47.73487,30.955263,20.962768,31.641858,27.185534,26.733373,22.350217,32.79756,35.635731,19.810838,...,3.187986,1.366544,0.861333,-16.042334,6.710047,0.982878,2.835551,-1.005149,4.879832,141.788565
Healthcare,38.761228,31.956026,29.584983,44.143259,59.232556,44.167312,62.997914,74.878086,75.230029,57.101916,...,2.287013,-13.197037,-5.701759,-13.623159,-7.84004,3.809005,-10.14674,-0.229029,-9.861485,63.67107
Industrials,46.85261,41.802904,26.371065,31.160932,30.645437,35.36228,42.961363,55.039857,37.352636,33.357437,...,15.766111,8.818,-1.703175,2.753959,-0.62719,4.978782,4.737996,2.67766,10.096189,37.827617
Real Estate,65.544753,40.331667,30.670026,36.808816,61.408539,32.006535,31.813958,31.938044,30.417213,21.927673,...,-1.096933,-8.506704,9.698204,-11.249846,27.771158,9.313883,0.496038,41.120451,34.998174,142.405966
Technology,56.893969,42.048801,49.258669,49.56396,53.950418,56.536949,66.926293,71.903875,64.234473,58.127225,...,-3.401651,2.176702,-4.565596,21.516915,2.797546,10.007432,-0.73232,15.332165,-0.346728,133.715475
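
Sector means can be dominated by a few extreme ratios, and the 584.7 value for Consumer Defensive in the last column above may be such a case. As an optional cross-check, the sketch below contrasts mean and median on made-up data. `numeric_only=True` is passed because recent pandas versions no longer silently drop non-numeric columns in `groupby().mean()`.

```python
import pandas as pd

# Toy data: one extreme P/E in the Energy sector
toy_val = pd.DataFrame({
    'sector': ['Energy', 'Energy', 'Energy', 'Utilities', 'Utilities'],
    'company': ['A', 'B', 'C', 'D', 'E'],
    'PE_TTM': [10.0, 12.0, 500.0, 18.0, 22.0],
})

by_sector = toy_val.groupby('sector')
sector_mean = by_sector.mean(numeric_only=True)      # pulled up by the outlier
sector_median = by_sector.median(numeric_only=True)  # robust to the outlier

print(sector_mean.join(sector_median, lsuffix='_mean', rsuffix='_median'))
```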


##### USA Price Ratios

In [123]:
val_usa_sector_mean = df_val_usa.groupby(by='sector').mean()
val_usa_sector_mean

sector,PE_Y0,PE_Y1,PE_Y2,PE_Y3,PE_Y4,PE_Y5,PE_Y6,PE_Y7,PE_Y8,PE_Y9,...,PC_Y1,PC_Y2,PC_Y3,PC_Y4,PC_Y5,PC_Y6,PC_Y7,PC_Y8,PC_Y9,PC_Y10
Basic Materials,65.624731,59.718588,31.665778,24.384635,33.998869,77.820082,51.960808,58.569565,32.919213,27.633333,...,-37.3679,-29.170923,-2.486114,2.65673,-12.38098,9.401498,-22.462067,-14.417306,-9.475509,32.796386
Communication Services,34.987719,38.2856,25.385714,17.111864,48.698171,47.364458,27.544578,39.36519,29.24269,23.28024,...,8.717127,1.112209,7.260479,-9.371749,-28.823207,-77.65,14.804583,-10.185425,-7.272199,27.55935
Consumer Cyclical,71.262791,37.674878,21.027796,33.532143,42.279977,30.6875,32.418817,41.749531,39.487674,25.317296,...,-43.31009,-11.706897,-23.813626,-23.978261,-5.282775,-15.467108,1.228756,-6.28054,1.365096,24.597731
Consumer Defensive,30.797037,44.207576,20.652294,19.965476,23.334346,36.093563,37.436259,40.154902,38.32423,24.461884,...,-76.193805,-49.448758,2.02244,-10.22073,-1.151406,9.417035,4.780404,9.208145,-24.062557,59.4725
Energy,51.645673,31.254779,31.239474,51.812338,42.106596,31.825995,46.442248,41.940964,67.644062,18.044693,...,-19.045788,-17.919176,-24.887695,-8.569126,0.009799,-1.398716,-0.387808,0.643394,-10.145397,20.875896
Financial Services,34.431461,26.950427,22.463145,26.516349,30.255868,24.804731,22.508942,33.238128,31.506003,22.069916,...,1.870047,8.264575,-3.899252,0.636416,8.38439,4.792059,1.742151,-10.174777,1.475148,36.699611
Healthcare,28.098519,33.437621,29.017763,33.393603,67.000781,45.667593,52.547564,75.603023,116.855605,75.821489,...,-25.464269,-31.231338,-1.857688,-38.811381,-49.436965,-17.288018,-20.319986,4.256539,-26.064767,34.049344
Industrials,48.088075,31.683244,27.164495,21.752222,27.962811,27.100986,36.902823,41.125393,33.982917,24.681114,...,-10.155146,-17.953752,-21.434585,-8.25625,-19.058246,-24.097502,-29.318594,-11.848051,-18.14049,28.612509
Real Estate,42.975595,40.25367,46.349057,49.812556,78.229874,41.171191,42.346543,38.386967,40.777728,25.057642,...,-4.476852,-32.933657,45.4225,-4.522072,15.041684,10.918526,6.105534,31.067082,11.790175,38.890641
Technology,38.395082,30.352737,48.925468,56.195951,51.115868,55.516591,48.624279,55.943921,58.744138,46.438121,...,-12.702081,-3.697797,-30.179163,-2.668127,-17.873145,-12.487411,-17.608983,2.302521,-19.429303,96.206475


#### TTM P/E by Sector
*Global*

In [285]:
pe = val_sector_mean.iloc[:, 0:11]
pe.columns = val_yr_cols
pe['TTM']

sector
Basic Materials           27.407019
Communication Services    40.627828
Consumer Cyclical         33.935426
Consumer Defensive        32.322798
Energy                    29.279468
Financial Services        29.090206
Healthcare                66.213090
Industrials               46.009728
Real Estate               31.627990
Technology                59.974015
Utilities                 33.861731
Name: TTM, dtype: float64

In [284]:
fig_pe, ax_pe = plt.subplots(1)
x = [x*3 for x in range(len(pe['TTM']))]
y = pe['TTM'].sort_values(ascending=True)
bars = ax_pe.bar(x, y, width=2)
plt.xticks(ticks=x, labels=y.index.tolist(), fontsize=9)
for tick in ax_pe.xaxis.get_ticklabels():
    tick.set_rotation(90)
plt.subplots_adjust(bottom=0.4)
plt.title('Average TTM P/E by Sector for all Database Records', fontsize=10, fontweight='bold')
plt.yticks([])
for bar in bars:
    ax_pe.text(bar.get_x()+1, bar.get_height()+1, '{:.1f}'.format(bar.get_height()), 
               color='black', ha='center', fontsize=9)
plt.axis([-3, len(x)*3, 0, 80])
# Hide the left, right and top spines by name rather than by child index
for side in ('left', 'right', 'top'):
    ax_pe.spines[side].set_visible(False)

*[Interactive figure: bar chart "Average TTM P/E by Sector for all Database Records"]*

*USA*

In [288]:
pe = val_usa_sector_mean.iloc[:, 0:11]
pe.columns = val_yr_cols
pe['TTM']

sector
Basic Materials           23.903800
Communication Services    38.718408
Consumer Cyclical         34.866476
Consumer Defensive        32.394366
Energy                    39.252530
Financial Services        28.249057
Healthcare                64.388257
Industrials               53.404394
Real Estate               44.450980
Technology                54.644835
Utilities                 48.827458
Name: TTM, dtype: float64

In [291]:
fig_pe, ax_pe = plt.subplots(1)
x = [x*3 for x in range(len(pe['TTM']))]
y = pe['TTM'].sort_values(ascending=True)
bars = ax_pe.bar(x, y, width=2)
plt.xticks(ticks=x, labels=y.index.tolist(), fontsize=9)
for tick in ax_pe.xaxis.get_ticklabels():
    tick.set_rotation(90)
plt.subplots_adjust(bottom=0.4)
plt.title('Average TTM P/E by Sector for USA Equities', fontsize=10, fontweight='bold')
plt.yticks([])
for bar in bars:
    ax_pe.text(bar.get_x()+1, bar.get_height()+1, '{:.1f}'.format(bar.get_height()), 
               color='black', ha='center', fontsize=9)
plt.axis([-3, len(x)*3, 0, 80])
# Hide the left, right and top spines by name rather than by child index
for side in ('left', 'right', 'top'):
    ax_pe.spines[side].set_visible(False)

*[Interactive figure: bar chart "Average TTM P/E by Sector for USA Equities"]*
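
The global and USA charts use the same plotting steps, so they could be wrapped in one helper. The following is a sketch only: the function name `plot_sector_bars` and its parameters are hypothetical, and the three sample values are rounded from the global output above. The `matplotlib.use('Agg')` line is there so the sketch runs outside Jupyter and should be dropped when using `%matplotlib notebook`.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; omit inside a notebook
import matplotlib.pyplot as plt
import pandas as pd


def plot_sector_bars(values, title, spacing=3, width=2):
    """Draw a sorted, labelled bar chart of a sector-indexed Series."""
    y = values.sort_values(ascending=True)
    x = [i * spacing for i in range(len(y))]
    fig, ax = plt.subplots(1)
    bars = ax.bar(x, y, width=width)
    ax.set_xticks(x)
    ax.set_xticklabels(y.index.tolist(), fontsize=9, rotation=90)
    ax.set_yticks([])
    ax.set_title(title, fontsize=10, fontweight='bold')
    # Print each bar's value just above it
    for bar in bars:
        ax.text(bar.get_x() + width / 2, bar.get_height() + 1,
                '{:.1f}'.format(bar.get_height()), ha='center', fontsize=9)
    # Remove the frame lines that add no information
    for side in ('left', 'right', 'top'):
        ax.spines[side].set_visible(False)
    fig.subplots_adjust(bottom=0.4)
    return fig, ax, bars


toy_pe = pd.Series({'Energy': 29.3, 'Healthcare': 66.2, 'Technology': 60.0})
fig, ax, bars = plot_sector_bars(toy_pe, 'Average TTM P/E by Sector (sample values)')
```

With such a helper, each chart reduces to a single call, for example `plot_sector_bars(pe['TTM'], 'Average TTM P/E by Sector for USA Equities')`.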

In [292]:
val_indus_mean = df_val.groupby(by=['sector', 'industry']).mean()
val_indus_mean

sector,industry,PE_Y0,PE_Y1,PE_Y2,PE_Y3,PE_Y4,PE_Y5,PE_Y6,PE_Y7,PE_Y8,PE_Y9,...,PC_Y1,PC_Y2,PC_Y3,PC_Y4,PC_Y5,PC_Y6,PC_Y7,PC_Y8,PC_Y9,PC_Y10
Basic Materials,Agricultural Inputs,68.052212,87.797458,28.250350,26.244828,31.131293,38.530000,75.002174,89.248026,33.946853,65.704478,...,-13.307738,-18.397849,54.282673,-13.779426,2.534667,29.732456,-0.830833,-9.387500,-11.983525,23.737423
Basic Materials,Aluminum,75.782759,103.396970,28.358065,38.950000,91.714516,59.498437,31.629268,69.584706,54.081308,32.617895,...,15.737079,11.605714,4.730909,32.500000,27.807500,23.708264,28.816031,3.904380,2.174809,15.041228
Basic Materials,Building Materials,88.626667,40.467742,25.664987,46.976617,34.894575,33.315278,45.419596,45.234791,36.582962,42.196754,...,28.927349,0.255621,16.223242,11.704821,15.068823,20.012215,14.939036,-14.737796,26.068667,37.734912
Basic Materials,Chemicals,102.706667,57.232888,72.826980,49.987147,62.100787,34.430529,48.203645,61.462582,37.333829,27.178559,...,16.028605,14.708539,15.487473,11.039474,20.943774,21.890807,-22.269479,-64.233118,-13.075079,25.043696
Basic Materials,Coal,33.486555,52.674026,43.208475,16.841714,33.952000,42.122556,29.026214,90.610345,20.030806,15.373391,...,-20.701230,-3.560432,6.404923,0.003488,-18.925915,2.095115,14.102594,-3.615718,0.375066,7.878547
Basic Materials,Copper,32.019565,43.222892,9.058974,33.418681,23.748571,72.233333,34.339706,24.883582,21.339175,11.291729,...,-50.259043,-29.797512,-14.318140,-6.816450,-3.845417,-8.341202,-5.349012,-15.540000,-3.026071,11.660390
Basic Materials,Gold,45.518142,124.040708,23.555037,50.198437,76.589825,70.100755,114.246037,40.616377,34.480034,40.550313,...,-29.339207,-19.235643,-2.459709,-4.614908,-1.034414,-7.468410,-15.625234,-7.444552,-4.338038,23.172675
Basic Materials,Industrial Metals & Minerals,51.678821,44.970560,23.232277,37.714985,24.055596,228.227671,63.354662,133.887201,35.629881,18.184137,...,-22.672515,-22.150267,-11.100607,-9.016420,-2.005977,0.044345,-13.867060,-19.209248,-9.167106,33.956090
Basic Materials,Lumber & Wood Production,31.855263,29.464407,194.637500,39.387037,55.386301,41.403571,34.130120,36.367470,72.190217,27.428448,...,11.409677,3.505376,7.713483,8.803158,13.708108,-2.620513,0.985593,6.218254,-30.417778,10.555645
Basic Materials,Paper & Paper Products,39.872816,28.722561,18.726875,23.620482,29.601818,35.770918,30.137173,23.411163,29.439744,19.548291,...,-246.265482,21.784360,21.172642,15.696364,8.739831,2.184188,16.902893,14.926744,-107.933333,19.234109
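
Grouping by two columns produces a frame indexed by a `(sector, industry)` MultiIndex. A short sketch, on toy data rather than the database values, of how rows can be pulled out of such a frame with `.loc`:

```python
import pandas as pd

# Toy data with the same grouping columns as df_val
toy = pd.DataFrame({
    'sector': ['Basic Materials', 'Basic Materials', 'Basic Materials', 'Energy'],
    'industry': ['Gold', 'Gold', 'Copper', 'Oil & Gas Drilling'],
    'PE_Y0': [40.0, 50.0, 30.0, 20.0],
})
toy_indus_mean = toy.groupby(['sector', 'industry']).mean(numeric_only=True)

# All industries of one sector (the 'sector' level is dropped from the result)
basic_materials = toy_indus_mean.loc['Basic Materials']

# A single value addressed by a (sector, industry) tuple
gold_pe = toy_indus_mean.loc[('Basic Materials', 'Gold'), 'PE_Y0']

print(basic_materials)
print(gold_pe)
```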


In [16]:
df = None  # Drop the reference to the DataFrames instance, which closes the db connection

Database connection for file db/mstables2.sqlite closed.
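
The message above suggests that the `DataFrames` class closes its SQLite connection when the instance is garbage-collected. The implementation of that class is not shown here, so the following is only a hypothetical sketch of the pattern; the `ToyDB` class and its methods are invented for illustration. It also adds an explicit `close()` method, which does not depend on when the interpreter runs the finalizer.

```python
import sqlite3


class ToyDB:
    """Hypothetical stand-in for a class that owns a SQLite connection."""

    def __init__(self, path):
        self.path = path
        self.conn = sqlite3.connect(path)
        self.closed = False

    def close(self):
        # Safe to call more than once
        if not self.closed:
            self.conn.close()
            self.closed = True
            print('Database connection for file {} closed.'.format(self.path))

    def __del__(self):
        # Finalizer: runs when the last reference to the object goes away
        self.close()


db = ToyDB(':memory:')
db.close()   # explicit and deterministic
db = None    # the finalizer still runs, but close() is now a no-op
```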
