# Building a value-weighted index

In this notebook you build a **value-weighted index**. The index combines the market-cap data contained in the stock exchange listings, which is used to calculate the weights, with 2016 stock price information. You then compare the index against benchmarks to evaluate its performance.

## Table of Contents

- [Select index components & import data](#intro)

In [2]:
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

path = "data/dc27/"

---
<a id='intro'></a>

## Select index components & import data

### Explore and clean company listing information

To get started with the construction of a **market-value-based index**, you'll work with the combined listing info for the three largest US stock exchanges: the NYSE, the NASDAQ, and the AMEX.

In this and the next exercise, you will calculate **market-cap weights** for these stocks.
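
As a minimal sketch of what a market-cap weight is, the example below divides each company's market capitalization by the total. The tickers and values are made up for illustration and do not come from `listings.csv`.

```python
import pandas as pd

# Toy market capitalizations in USD millions (hypothetical tickers and values)
market_cap = pd.Series(
    {"AAA": 600.0, "BBB": 300.0, "CCC": 100.0},
    name="Market Capitalization",
)

# Each weight is the company's share of the total market capitalization
weights = market_cap / market_cap.sum()
print(weights)
```

By construction the weights sum to 1, so larger companies have a proportionally larger influence on the index.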

pandas is already imported as `pd`. The cell below loads the `listings` data set, which contains listing information from the NYSE, NASDAQ, and AMEX. The `Market Capitalization` column is already expressed in millions of USD.

In [5]:
listings = pd.read_csv(path+'listings.csv', index_col=0)
listings.head()

Unnamed: 0,Exchange,Stock Symbol,Company Name,Last Sale,Market Capitalization,IPO Year,Sector,Industry
0,amex,XXII,"22nd Century Group, Inc",1.33,120.62849,,Consumer Non-Durables,Farming/Seeds/Milling
1,amex,FAX,Aberdeen Asia-Pacific Income Fund Inc,5.0,1266.332595,1986.0,,
2,amex,IAF,Aberdeen Australia Equity Fund Inc,6.15,139.865305,,,
3,amex,CH,"Aberdeen Chile Fund, Inc.",7.2201,67.563458,,,
4,amex,ABE,Aberdeen Emerging Markets Smaller Company Oppo...,13.36,128.842972,,,


In [6]:
# Inspect listings (info() prints its summary directly and returns None)
listings.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 6674 entries, 0 to 6673
Data columns (total 8 columns):
Exchange                 6674 non-null object
Stock Symbol             6674 non-null object
Company Name             6674 non-null object
Last Sale                6590 non-null float64
Market Capitalization    6674 non-null float64
IPO Year                 2852 non-null float64
Sector                   5182 non-null object
Industry                 5182 non-null object
dtypes: float64(3), object(5)
memory usage: 469.3+ KB


In [7]:
# Move 'Stock Symbol' into the index
listings.set_index('Stock Symbol', inplace=True)
listings.head()

Stock Symbol,Exchange,Company Name,Last Sale,Market Capitalization,IPO Year,Sector,Industry
XXII,amex,"22nd Century Group, Inc",1.33,120.62849,,Consumer Non-Durables,Farming/Seeds/Milling
FAX,amex,Aberdeen Asia-Pacific Income Fund Inc,5.0,1266.332595,1986.0,,
IAF,amex,Aberdeen Australia Equity Fund Inc,6.15,139.865305,,,
CH,amex,"Aberdeen Chile Fund, Inc.",7.2201,67.563458,,,
ABE,amex,Aberdeen Emerging Markets Smaller Company Oppo...,13.36,128.842972,,,
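
Once the ticker is the index, rows can be looked up by label. The sketch below illustrates this on a small made-up frame (the tickers and values are hypothetical, not taken from `listings.csv`).

```python
import pandas as pd

# Toy listings with made-up values (not the real listings.csv)
toy = pd.DataFrame({
    "Stock Symbol": ["AAA", "BBB"],
    "Exchange": ["amex", "nyse"],
    "Market Capitalization": [120.0, 950.0],
})

# set_index returns a new DataFrame unless inplace=True is passed
toy = toy.set_index("Stock Symbol")

# Rows can now be selected by ticker label
cap_bbb = toy.loc["BBB", "Market Capitalization"]
print(cap_bbb)
```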


In [8]:
# Drop rows with missing 'Sector' data
listings.dropna(subset=['Sector'], inplace=True)
listings.head()

Stock Symbol,Exchange,Company Name,Last Sale,Market Capitalization,IPO Year,Sector,Industry
XXII,amex,"22nd Century Group, Inc",1.33,120.62849,,Consumer Non-Durables,Farming/Seeds/Milling
ACU,amex,Acme United Corporation.,27.39,91.138992,1988.0,Capital Goods,Industrial Machinery/Components
AIII,amex,"ACRE Realty Investors, Inc.",1.16,23.768939,,Consumer Services,Real Estate Investment Trusts
ATNM,amex,"Actinium Pharmaceuticals, Inc.",1.47,82.037381,,Health Care,Major Pharmaceuticals
AE,amex,"Adams Resources & Energy, Inc.",37.8,159.425129,,Energy,Oil Refining/Marketing
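
The `subset` argument restricts the missing-value check to the named columns, which is why rows with a missing `IPO Year` survive this step. A small sketch on made-up data (hypothetical tickers and values):

```python
import numpy as np
import pandas as pd

# Toy frame: AAA lacks an IPO Year, BBB lacks a Sector (made-up values)
toy = pd.DataFrame(
    {
        "Sector": ["Energy", np.nan, "Finance"],
        "IPO Year": [np.nan, 1986.0, 2014.0],
    },
    index=["AAA", "BBB", "CCC"],
)

# Only a missing 'Sector' removes a row; the missing 'IPO Year' for AAA is ignored
cleaned = toy.dropna(subset=["Sector"])
print(cleaned)
```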


In [9]:
# Select companies with IPO Year before 2019
# (rows with a missing IPO Year are dropped too, because NaN < 2019 evaluates to False)
listings = listings.loc[listings['IPO Year'] < 2019]
listings.head()

Stock Symbol,Exchange,Company Name,Last Sale,Market Capitalization,IPO Year,Sector,Industry
ACU,amex,Acme United Corporation.,27.39,91.138992,1988.0,Capital Goods,Industrial Machinery/Components
AAU,amex,"Almaden Minerals, Ltd.",1.72,154.891745,2015.0,Basic Industries,Precious Metals
USAS,amex,Americas Silver Corporation,3.05,120.694838,2017.0,Basic Industries,Precious Metals
AINC,amex,Ashford Inc.,57.3373,115.550771,2014.0,Consumer Services,Professional Services
AUXO,amex,"Auxilio, Inc.",6.3043,59.131037,2017.0,Miscellaneous,Business Services
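
Note that a comparison against `NaN` is always `False`, so this filter also removes companies without a recorded IPO year. The sketch below shows the effect on made-up data (hypothetical tickers and years).

```python
import numpy as np
import pandas as pd

# Toy frame: BBB has no IPO Year, CCC went public in 2019 (made-up values)
toy = pd.DataFrame(
    {"IPO Year": [1988.0, np.nan, 2019.0, 2017.0]},
    index=["AAA", "BBB", "CCC", "DDD"],
)

# NaN < 2019 is False, so BBB is excluded along with CCC
mask = toy["IPO Year"] < 2019
filtered = toy.loc[mask]
print(filtered)
```

If companies with an unknown IPO year should be kept, the mask would need an explicit `| toy["IPO Year"].isna()` term.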


In [10]:
# Inspect the new listings data (info() prints its summary directly and returns None)
listings.info()

<class 'pandas.core.frame.DataFrame'>
Index: 2349 entries, ACU to ZTO
Data columns (total 7 columns):
Exchange                 2349 non-null object
Company Name             2349 non-null object
Last Sale                2349 non-null float64
Market Capitalization    2349 non-null float64
IPO Year                 2349 non-null float64
Sector                   2349 non-null object
Industry                 2349 non-null object
dtypes: float64(3), object(4)
memory usage: 146.8+ KB


In [11]:
# Show the number of companies per sector
print(listings.groupby('Sector').size().sort_values(ascending=False))

Sector
Health Care              445
Consumer Services        402
Technology               386
Finance                  351
Energy                   144
Capital Goods            143
Public Utilities         104
Basic Industries         104
Consumer Non-Durables     89
Miscellaneous             68
Transportation            58
Consumer Durables         55
dtype: int64
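
For reference, `value_counts()` on the column is a common alternative that returns the same counts, already sorted in descending order. A sketch on made-up sector labels (the ordering of tied counts may differ between the two approaches):

```python
import pandas as pd

# Toy sector column (made-up rows, not the real listings)
toy = pd.DataFrame(
    {"Sector": ["Energy", "Finance", "Energy", "Technology", "Energy", "Finance"]}
)

# Approach used above: group, count rows, sort descending
by_groupby = toy.groupby("Sector").size().sort_values(ascending=False)

# Alternative: value_counts() counts and sorts in one call
by_value_counts = toy["Sector"].value_counts()
print(by_value_counts)
```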



In [None]:
<img src="images/ts2_001.png" alt="" style="width: 400px;"/>