# Tabular Data

## DataFrames

### Exercise 7
Use the same data as in Exercise 6, but in a fresh notebook. Perform the following:  

a. Download a table (cross-sectional) containing reference data for each ticker. Reference data should include the full company name and any other useful column data you can find.  
b. Create a ‘transactions’ table, which has four columns: Date, Ticker, Amount, and BuySell. Add a bunch of transactions (at least ten) of your favorite stocks – both buys and sells (all long positions, no shorts, no day trading). Transactions should span a period of two years. Keep them self-consistent but somewhat arbitrary.  
c. Create a function that takes a date and returns a (cross-sectional) table containing a snapshot of your portfolio at that point in time (Ticker, Position).  
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; i. The function should join the table with your stock reference data, to enrich with the full company name, etc.  
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ii. The function should join the table with the stock data table, to retrieve the open price, volume, and return for the given date, for each ticker.  
d. Create a ‘scaffold’ table, which is the concatenation of the above function, for every business date over the two-year period. Be sure to add a ‘Date’ column.
e. ‘PnL’ (Profit and Loss) is your total portfolio profit/loss at any given point in time. For a given stock, the daily marked-to-market PnL is simply the return*position. Add a column containing the daily PnL for each ticker.  
f. Add a column containing the running cumulative PnL per ticker.  
g. Add a column containing the running cumulative PnL for your entire portfolio.  
h. Create a summary table of your average daily PnL, total PnL, and overall yield per ticker.  
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; i. ‘Yield’ is PnL/Position.  
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ii. You should exclude days when you had no position in that stock, from the average.  
i. Display your total PnL at the end of the two years.  

In [1]:
from IPython.core.interactiveshell import InteractiveShell

InteractiveShell.ast_node_interactivity = "all"

#### a. Download a table (cross-sectional) containing reference data for each ticker. Reference data should include the full company name and any other useful column data you can find.

In [2]:
# Reload data from 8.2.6
%store -r dataReference
dataReference = dataReference.rename(columns = {"Symbol":"Ticker"})
dataReference.head()

%store -r dataDaily
hist = dataDaily
hist.tail()

Unnamed: 0,Ticker,Security,SEC filings,GICS Sector,GICS Sub-Industry,Headquarters Location,Date first added,CIK,Founded
0,MMM,3M Company,reports,Industrials,Industrial Conglomerates,"St. Paul, Minnesota",1976-08-09,66740,1902
1,ABT,Abbott Laboratories,reports,Health Care,Health Care Equipment,"North Chicago, Illinois",1964-03-31,1800,1888
2,ABBV,AbbVie Inc.,reports,Health Care,Pharmaceuticals,"North Chicago, Illinois",2012-12-31,1551152,2013 (1888)
3,ABMD,ABIOMED Inc,reports,Health Care,Health Care Equipment,"Danvers, Massachusetts",2018-05-31,815094,1981
4,ACN,Accenture plc,reports,Information Technology,IT Consulting & Other Services,"Dublin, Ireland",2011-07-06,1467373,1989


Unnamed: 0,Date,Ticker,Close,Dividends,High,Low,Open,Stock Splits,Volume,Daily Return,1-week MA Daily Return,Rolling Cumulative Volume
3906315,2020-10-30,YUM,93.330002,0.0,95.209999,92.349998,94.360001,0.0,2139000.0,0.071035,0.634927,18110940000000.0
3906316,2020-10-30,ZBH,132.100006,0.0,134.820007,130.050003,133.490005,0.0,1256400.0,0.415408,0.863042,18110940000000.0
3906317,2020-10-30,ZBRA,283.640015,0.0,290.970001,281.019989,290.0,0.0,304300.0,1.147161,1.003142,18110950000000.0
3906318,2020-10-30,ZION,32.27,0.0,32.310001,31.24,31.33,0.0,1732100.0,-0.886229,0.952237,18110950000000.0
3906319,2020-10-30,ZTS,158.550003,0.0,161.320007,156.25,160.020004,0.0,2078300.0,3.913232,0.932121,18110950000000.0


#### b. Create a ‘transactions’ table, which has four columns: Date, Ticker, Amount, and BuySell. Add a bunch of transactions (at least ten) of your favorite stocks – both buys and sells (all long positions, no shorts, no day trading). Transactions should span a period of two years. Keep them self-consistent but somewhat arbitrary.

In [3]:
import pandas as pd
import datetime

# Build one ledger row
# Sells are stored as negative amounts, so a position is the sum of the amounts
# Raise ValueError if buysell is not "Buy" or "Sell"
def addTransaction(date, ticker, amount, buysell):
    if buysell not in ('Buy', 'Sell'):
        raise ValueError('Invalid buysell parameter, only "Buy" or "Sell"')
    if buysell == 'Sell':
        amount = -amount

    return pd.Series({'Date': date,
                      'Ticker': ticker,
                      'Amount': amount,
                      'BuySell': buysell})

########################
# Generate transaction header
transactionsHeader = ['Date', 'Ticker', 'Amount', 'BuySell']

# Collect the transactions, then build the ledger in one step
# (DataFrame.append was removed in pandas 2.0)
transactions = [
    addTransaction('2018-01-10', 'MMM', 15000, 'Buy'),
    addTransaction('2018-01-11', 'ABT', 20000, 'Buy'),
    addTransaction('2018-01-17', 'MMM', 20000, 'Buy'),
    addTransaction('2018-01-24', 'YUM', 22000, 'Buy'),
    addTransaction('2018-02-06', 'MMM', 3000, 'Buy'),
    addTransaction('2018-02-08', 'AMZN', 7000, 'Buy'),
    addTransaction('2018-02-12', 'AMZN', 1000, 'Sell'),
    addTransaction('2018-06-13', 'AAPL', 9000, 'Buy'),
    addTransaction('2018-08-13', 'ZTS', 7000, 'Buy'),
    addTransaction('2018-10-02', 'YUM', 10000, 'Sell'),
    addTransaction('2019-01-07', 'ACN', 20000, 'Buy'),
    addTransaction('2019-02-20', 'SCHW', 20000, 'Buy'),
    addTransaction('2019-03-20', 'YUM', 5000, 'Sell'),
    addTransaction('2019-04-01', 'YUM', 7000, 'Sell'),
    addTransaction('2019-05-09', 'SCHW', 10000, 'Sell'),
    addTransaction('2019-06-14', 'SCHW', 10000, 'Sell'),
    addTransaction('2019-07-22', 'MMM', 2000, 'Buy'),
    addTransaction('2019-11-12', 'AAPL', 8000, 'Buy'),
    addTransaction('2019-11-15', 'MMM', 2000, 'Buy'),
    addTransaction('2019-12-11', 'ZTS', 2000, 'Sell'),
]
ledger = pd.DataFrame(transactions, columns=transactionsHeader)
ledger['Amount'] = ledger['Amount'].astype(float) # Keep amounts as floats

ledger['Date'] = pd.to_datetime(ledger['Date'], format='%Y-%m-%d') # Convert date to datetime
ledger

Unnamed: 0,Date,Ticker,Amount,BuySell
0,2018-01-10,MMM,15000.0,Buy
1,2018-01-11,ABT,20000.0,Buy
2,2018-01-17,MMM,20000.0,Buy
3,2018-01-24,YUM,22000.0,Buy
4,2018-02-06,MMM,3000.0,Buy
5,2018-02-08,AMZN,7000.0,Buy
6,2018-02-12,AMZN,-1000.0,Sell
7,2018-06-13,AAPL,9000.0,Buy
8,2018-08-13,ZTS,7000.0,Buy
9,2018-10-02,YUM,-10000.0,Sell
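
The exercise asks for long-only, self-consistent transactions. One way to check this is to accumulate the signed amounts per ticker in date order and look for any point where the running position drops below zero. The sketch below assumes sells are stored as negative amounts, as in the ledger above. The helper names `running_positions` and `find_short_positions` and the toy values are hypothetical and not part of the original notebook.

```python
import pandas as pd

def running_positions(ledger: pd.DataFrame) -> pd.DataFrame:
    """Return the ledger in date order with a running position per ticker."""
    out = ledger.sort_values("Date", kind="stable").copy()
    out["RunningPosition"] = out.groupby("Ticker")["Amount"].cumsum()
    return out

def find_short_positions(ledger: pd.DataFrame) -> pd.DataFrame:
    """Return the rows where the running position falls below zero."""
    positions = running_positions(ledger)
    return positions[positions["RunningPosition"] < 0]

# Toy ledger (hypothetical values); the last YUM sell exceeds the holding
toy_ledger = pd.DataFrame({
    "Date": pd.to_datetime(["2018-01-10", "2018-02-01", "2018-03-01", "2018-03-05"]),
    "Ticker": ["MMM", "MMM", "YUM", "YUM"],
    "Amount": [15000.0, -5000.0, 1000.0, -2000.0],
})
violations = find_short_positions(toy_ledger)
print(violations)
```

An empty result means every sell was covered by earlier buys. Calling `find_short_positions(ledger)` on the ledger above should return no rows, since each ticker's sells never exceed its earlier buys.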


#### c. Create a function that takes a date and returns a (cross-sectional) table containing a snapshot of your portfolio at that point in time (Ticker, Position).

In [4]:
# Function to view a snapshot of your portfolio
# Can view Ticker and Cumulative position by date
def viewSnapshot(date):
    snapshot = ledger[ledger['Date'] <= date].groupby(by=['Ticker'], as_index=False).agg({'Amount': 'sum'}).rename(columns = {'Amount':'Position'})
    snapshot['Date'] = pd.to_datetime(date)
    
    return snapshot

viewSnapshot('2019-11-15')

Unnamed: 0,Ticker,Position,Date
0,AAPL,17000.0,2019-11-15
1,ABT,20000.0,2019-11-15
2,ACN,20000.0,2019-11-15
3,AMZN,6000.0,2019-11-15
4,MMM,42000.0,2019-11-15
5,SCHW,0.0,2019-11-15
6,YUM,0.0,2019-11-15
7,ZTS,7000.0,2019-11-15


##### i. The function should join the table with your stock reference data, to enrich with the full company name, etc.

In [5]:
# Function to view a snapshot of your portfolio
# Can view cumulative share by ticker
# Can also view other info about the underlying company
def viewSnapshot(date):
    snapshot = ledger[ledger['Date'] <= date].groupby(by=['Ticker'], as_index=False).agg({'Amount': 'sum'}).rename(columns = {'Amount':'Position'})
    snapshot = snapshot.join(dataReference.set_index('Ticker'), how='inner', on='Ticker')
    snapshot['Date'] = pd.to_datetime(date)
    
    return snapshot

viewSnapshot('2018-01-10')

Unnamed: 0,Ticker,Position,Security,SEC filings,GICS Sector,GICS Sub-Industry,Headquarters Location,Date first added,CIK,Founded,Date
0,MMM,15000.0,3M Company,reports,Industrials,Industrial Conglomerates,"St. Paul, Minnesota",1976-08-09,66740,1902,2018-01-10


##### ii. The function should join the table with the stock data table, to retrieve the open price, volume, and return for the given date, for each ticker.

In [6]:
# Function to view a snapshot of your portfolio
# Can view cumulative position by ticker
# Can also view other info about the underlying company
# Can view the corresponding daily stock data (Open, Close, etc.)
def viewSnapshot(date):
    snapshot = ledger[ledger['Date'] <= date].groupby(by=['Ticker'], as_index=False).agg({'Amount': 'sum'}).rename(columns = {'Amount':'Position'})
    snapshot = snapshot.join(dataReference.set_index('Ticker'), how='inner', on='Ticker')
    snapshot['Date'] = pd.to_datetime(date)
    snapshot = snapshot.join(hist.set_index(['Date', 'Ticker']), how='left', on=['Date', 'Ticker'])
    
    return snapshot

viewSnapshot('2019-11-15')


Unnamed: 0,Ticker,Position,Security,SEC filings,GICS Sector,GICS Sub-Industry,Headquarters Location,Date first added,CIK,Founded,...,Close,Dividends,High,Low,Open,Stock Splits,Volume,Daily Return,1-week MA Daily Return,Rolling Cumulative Volume
0,AAPL,17000.0,Apple Inc.,reports,Information Technology,"Technology Hardware, Storage & Peripherals","Cupertino, California",1982-11-30,320193,1977,...,65.984779,0.0,65.989742,65.30199,65.468338,0.0,100206400.0,-0.590965,0.892755,17450950000000.0
1,ABT,20000.0,Abbott Laboratories,reports,Health Care,Health Care Equipment,"North Chicago, Illinois",1964-03-31,1800,1888,...,84.3797,0.0,84.3797,83.090035,83.355841,0.0,5453500.0,-0.622006,0.135429,17450970000000.0
2,ACN,20000.0,Accenture plc,reports,Information Technology,IT Consulting & Other Services,"Dublin, Ireland",2011-07-06,1467373,1989,...,193.70047,0.0,194.43862,192.71627,192.804844,0.0,2426200.0,1.295581,0.512738,17450970000000.0
3,AMZN,6000.0,Amazon.com Inc.,reports,Consumer Discretionary,Internet & Direct Marketing Retail,"Seattle, Washington",2005-11-18,1018724,1994,...,1739.48999,0.0,1761.680054,1732.859985,1760.050049,0.0,3927600.0,7.297552,2.021281,17451120000000.0
4,MMM,42000.0,3M Company,reports,Industrials,Industrial Conglomerates,"St. Paul, Minnesota",1976-08-09,66740,1902,...,165.708633,0.0,166.489548,164.898783,165.496531,0.0,2616900.0,0.598208,0.238312,17452060000000.0
5,SCHW,0.0,Charles Schwab Corporation,reports,Financials,Investment Banking & Brokerage,"San Francisco, California",1997-06-02,316709,1971,...,43.721573,0.0,43.810298,43.011778,43.15965,0.0,7064400.0,-0.472582,0.055246,17452430000000.0
6,YUM,0.0,Yum! Brands Inc,reports,Consumer Discretionary,Restaurants,"Louisville, Kentucky",1997-10-06,1041061,1997,...,96.443382,0.0,97.012545,96.021418,96.78684,0.0,2575800.0,0.256501,0.110592,17452730000000.0
7,ZTS,7000.0,Zoetis,reports,Health Care,Pharmaceuticals,"Florham Park, New Jersey",2013-06-21,1555280,1952,...,117.248283,0.0,117.248283,115.548025,116.055122,0.0,3429200.0,1.434601,0.407224,17452740000000.0
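
Because the reference join uses `how='inner'`, a ticker that is missing from the reference table would silently disappear from the snapshot. A possible way to surface such cases is sketched below with `DataFrame.merge`, using `indicator=True` to flag unmatched rows and `validate='many_to_one'` to confirm that the reference table has one row per ticker. The toy frames and the ticker `XYZ` are hypothetical.

```python
import pandas as pd

# Toy data (hypothetical): XYZ has no entry in the reference table
positions = pd.DataFrame({
    "Ticker": ["MMM", "ABT", "XYZ"],
    "Position": [15000.0, 20000.0, 500.0],
})
reference = pd.DataFrame({
    "Ticker": ["MMM", "ABT"],
    "Security": ["3M Company", "Abbott Laboratories"],
})

# A left join keeps every position; the _merge column records the match status
enriched = positions.merge(
    reference,
    on="Ticker",
    how="left",
    validate="many_to_one",  # raises if a ticker appears twice in reference
    indicator=True,
)
missing = enriched.loc[enriched["_merge"] == "left_only", "Ticker"].tolist()
print(missing)
```

Whether to drop, keep, or report unmatched tickers is a design choice; the inner join in `viewSnapshot` drops them.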


#### d. Create a ‘scaffold’ table, which is the concatenation of the above function, for every business date over the two-year period. Be sure to add a ‘Date’ column.

In [7]:
# Return the most recent transaction date on or before the given date
def recentTransactionDate(date, ledger):
    temp = []
    for i in list(ledger['Date']):
        if date >= i:
            temp.append(i)
    
    return max(temp)


# Create scaffold portfolio
def create_scaffoldPortfolio():
    # Generate transaction header
    scaffoldPortfolioHeader = ['Date',
                               'Ticker',
                               'Position',
                               'Security',
                               'SEC filings',
                               'GICS Sector',
                               'GICS Sub-Industry',
                               'Headquarters Location',
                               'Date first added',
                               'CIK',
                               'Founded',
                               'Close',
                               'Dividends',
                               'High',
                               'Low',
                               'Open',
                               'Stock Splits',
                               'Volume',
                               'Daily Return',
                               '1-week MA Daily Return',
                               'Rolling Cumulative Volume']

    # Init the empty df 
    scaffoldPortfolio = pd.DataFrame (columns = scaffoldPortfolioHeader, dtype=float)

    # Get list of business date for the period
    start_date = '2018-01-10'
    end_date = '2020-01-10'
    dates = list(pd.bdate_range(start=start_date, end=end_date))
    
    print(f'Generating scaffold portfolio from {start_date} to {end_date}. This will take a while.')
    
    # Iterate through the dates and call viewSnapshot for each one to get the portfolio info,
    # then add the snapshot to the existing scaffold.
    # If viewSnapshot raises a KeyError for a date, fall back to the snapshot for the most recent
    # transaction date on or before it and add that to the existing scaffold instead.
    for date in dates:
        try:
            scaffoldPortfolio = pd.concat([scaffoldPortfolio, viewSnapshot(date)], axis=0)
        except KeyError:
            scaffoldPortfolio = pd.concat([scaffoldPortfolio, viewSnapshot(recentTransactionDate(date, ledger))], axis=0)
    
    return scaffoldPortfolio

scaffoldPortfolio = create_scaffoldPortfolio()

scaffoldPortfolio.head(10)

Generating scaffold portfolio from 2018-01-10 to 2020-01-10. This will take a while.


Unnamed: 0,Date,Ticker,Position,Security,SEC filings,GICS Sector,GICS Sub-Industry,Headquarters Location,Date first added,CIK,...,Close,Dividends,High,Low,Open,Stock Splits,Volume,Daily Return,1-week MA Daily Return,Rolling Cumulative Volume
0,2018-01-10,MMM,15000.0,3M Company,reports,Industrials,Industrial Conglomerates,"St. Paul, Minnesota",1976-08-09,66740.0,...,220.988754,0.0,222.299261,219.971513,220.860454,0.0,1640900.0,1.876795,0.350461,16445070000000.0
0,2018-01-11,ABT,20000.0,Abbott Laboratories,reports,Health Care,Health Care Equipment,"North Chicago, Illinois",1964-03-31,1800.0,...,56.328888,0.28,56.338442,55.698234,56.042228,0.0,4240900.0,-0.731959,0.201628,16445880000000.0
1,2018-01-11,MMM,15000.0,3M Company,reports,Industrials,Industrial Conglomerates,"St. Paul, Minnesota",1976-08-09,66740.0,...,222.06102,0.0,222.088512,219.714952,220.622226,0.0,1487700.0,1.895749,0.353683,16447020000000.0
0,2018-01-12,ABT,20000.0,Abbott Laboratories,reports,Health Care,Health Care Equipment,"North Chicago, Illinois",1964-03-31,1800.0,...,56.22377,0.0,56.605986,56.175994,56.338434,0.0,6320900.0,-0.738628,0.205469,16447810000000.0
1,2018-01-12,MMM,15000.0,3M Company,reports,Industrials,Industrial Conglomerates,"St. Paul, Minnesota",1976-08-09,66740.0,...,224.040512,0.0,225.442654,222.235137,222.757511,0.0,1974300.0,1.895828,0.356432,16449070000000.0
0,2018-01-15,ABT,20000.0,Abbott Laboratories,reports,Health Care,Health Care Equipment,"North Chicago, Illinois",1964-03-31,1800.0,...,,,,,,,,,,
1,2018-01-15,MMM,15000.0,3M Company,reports,Industrials,Industrial Conglomerates,"St. Paul, Minnesota",1976-08-09,66740.0,...,,,,,,,,,,
0,2018-01-16,ABT,20000.0,Abbott Laboratories,reports,Health Care,Health Care Equipment,"North Chicago, Illinois",1964-03-31,1800.0,...,55.841557,0.0,56.510433,55.640895,56.271549,0.0,7175000.0,-0.743952,0.216451,16449920000000.0
1,2018-01-16,MMM,15000.0,3M Company,reports,Industrials,Industrial Conglomerates,"St. Paul, Minnesota",1976-08-09,66740.0,...,224.287949,0.0,226.533208,223.453991,224.801149,0.0,2411100.0,1.91615,0.358569,16451500000000.0
0,2018-01-17,ABT,20000.0,Abbott Laboratories,reports,Health Care,Health Care Equipment,"North Chicago, Illinois",1964-03-31,1800.0,...,56.367104,0.0,56.47221,56.02311,56.147327,0.0,5037500.0,-0.748507,0.220962,16452550000000.0
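
The loop above calls `viewSnapshot` once per business date, which is why it is slow. The Date/Ticker/Position part of the scaffold could instead be built in one pass, as sketched below: pivot the signed amounts, accumulate them, reindex onto the business-day calendar with forward fill, and stack back to long form. This is an alternative sketch, not the notebook's method; the function name `build_position_scaffold` and the toy ledger are hypothetical. Unlike the loop, it also emits rows with a zero position for tickers that have not been bought yet.

```python
import pandas as pd

def build_position_scaffold(ledger: pd.DataFrame, start: str, end: str) -> pd.DataFrame:
    """Long table of (Date, Ticker, Position) for every business day."""
    days = pd.bdate_range(start=start, end=end)
    # Net signed amount per (date, ticker); same-day trades are summed
    daily = ledger.pivot_table(index="Date", columns="Ticker",
                               values="Amount", aggfunc="sum", fill_value=0.0)
    # Running position per ticker on each trade date
    running = daily.sort_index().cumsum()
    # Carry the last known position forward onto every business day
    wide = running.reindex(days, method="ffill").fillna(0.0)
    wide.index.name = "Date"
    return wide.stack().rename("Position").reset_index()

# Toy ledger (hypothetical values), sells stored as negative amounts
toy_ledger = pd.DataFrame({
    "Date": pd.to_datetime(["2018-01-10", "2018-01-11", "2018-01-12"]),
    "Ticker": ["MMM", "ABT", "MMM"],
    "Amount": [15000.0, 20000.0, -5000.0],
})
scaffold = build_position_scaffold(toy_ledger, "2018-01-10", "2018-01-15")
print(scaffold)
```

The resulting table can then be merged once with the reference data on `Ticker` and once with the price history on `Date` and `Ticker`, rather than joining inside the loop.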


#### e. ‘PnL’ (Profit and Loss) is your total portfolio profit/loss at any given point in time. For a given stock, the daily marked-to-market PnL is simply the return*position. Add a column containing the daily PnL for each ticker.

In [8]:
scaffoldPortfolio['Daily P&L'] = scaffoldPortfolio['Daily Return'] * scaffoldPortfolio['Position']
scaffoldPortfolio

Unnamed: 0,Date,Ticker,Position,Security,SEC filings,GICS Sector,GICS Sub-Industry,Headquarters Location,Date first added,CIK,...,Dividends,High,Low,Open,Stock Splits,Volume,Daily Return,1-week MA Daily Return,Rolling Cumulative Volume,Daily P&L
0,2018-01-10,MMM,15000.0,3M Company,reports,Industrials,Industrial Conglomerates,"St. Paul, Minnesota",1976-08-09,66740.0,...,0.00,222.299261,219.971513,220.860454,0.0,1640900.0,1.876795,0.350461,1.644507e+13,28151.927647
0,2018-01-11,ABT,20000.0,Abbott Laboratories,reports,Health Care,Health Care Equipment,"North Chicago, Illinois",1964-03-31,1800.0,...,0.28,56.338442,55.698234,56.042228,0.0,4240900.0,-0.731959,0.201628,1.644588e+13,-14639.173012
1,2018-01-11,MMM,15000.0,3M Company,reports,Industrials,Industrial Conglomerates,"St. Paul, Minnesota",1976-08-09,66740.0,...,0.00,222.088512,219.714952,220.622226,0.0,1487700.0,1.895749,0.353683,1.644702e+13,28436.232002
0,2018-01-12,ABT,20000.0,Abbott Laboratories,reports,Health Care,Health Care Equipment,"North Chicago, Illinois",1964-03-31,1800.0,...,0.00,56.605986,56.175994,56.338434,0.0,6320900.0,-0.738628,0.205469,1.644781e+13,-14772.556368
1,2018-01-12,MMM,15000.0,3M Company,reports,Industrials,Industrial Conglomerates,"St. Paul, Minnesota",1976-08-09,66740.0,...,0.00,225.442654,222.235137,222.757511,0.0,1974300.0,1.895828,0.356432,1.644907e+13,28437.418212
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3,2020-01-10,AMZN,6000.0,Amazon.com Inc.,reports,Consumer Discretionary,Internet & Direct Marketing Retail,"Seattle, Washington",2005-11-18,1018724.0,...,0.00,1906.939941,1880.000000,1905.369995,0.0,2853700.0,7.267550,1.952383,1.751834e+13,43605.302282
4,2020-01-10,MMM,42000.0,3M Company,reports,Industrials,Industrial Conglomerates,"St. Paul, Minnesota",1976-08-09,66740.0,...,0.00,177.159388,175.175614,176.605104,0.0,2103800.0,0.598660,0.252569,1.751918e+13,25143.699270
5,2020-01-10,SCHW,0.0,Charles Schwab Corporation,reports,Financials,Investment Banking & Brokerage,"San Francisco, California",1997-06-02,316709.0,...,0.00,47.477588,46.945239,47.448014,0.0,7452900.0,-0.469068,0.057612,1.751950e+13,-0.000000
6,2020-01-10,YUM,0.0,Yum! Brands Inc,reports,Consumer Discretionary,Restaurants,"Louisville, Kentucky",1997-10-06,1041061.0,...,0.00,101.401455,100.070987,101.243774,0.0,1462400.0,0.266653,0.134937,1.751977e+13,0.000000
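
As a quick check of the formula, the first two rows of the output above can be reproduced by hand from the displayed (rounded) values. The result is in whatever units the `Daily Return` column from Exercise 6 carries.

$$
\text{PnL}_{\text{MMM},\ 2018\text{-}01\text{-}10} = \text{return} \times \text{position} = 1.876795 \times 15000 \approx 28151.9
$$

$$
\text{PnL}_{\text{ABT},\ 2018\text{-}01\text{-}11} = -0.731959 \times 20000 \approx -14639.2
$$

Rows with a position of zero, such as SCHW and YUM on 2020-01-10, contribute a PnL of zero regardless of the return.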


#### f. Add a column containing the running cumulative PnL per ticker.

In [9]:
scaffoldPortfolio['Cumulative Daily P&L'] = scaffoldPortfolio.groupby('Ticker', as_index=False)['Daily P&L'].cumsum()
scaffoldPortfolio.head(10)

Unnamed: 0,Date,Ticker,Position,Security,SEC filings,GICS Sector,GICS Sub-Industry,Headquarters Location,Date first added,CIK,...,High,Low,Open,Stock Splits,Volume,Daily Return,1-week MA Daily Return,Rolling Cumulative Volume,Daily P&L,Cumulative Daily P&L
0,2018-01-10,MMM,15000.0,3M Company,reports,Industrials,Industrial Conglomerates,"St. Paul, Minnesota",1976-08-09,66740.0,...,222.299261,219.971513,220.860454,0.0,1640900.0,1.876795,0.350461,16445070000000.0,28151.927647,28151.927647
0,2018-01-11,ABT,20000.0,Abbott Laboratories,reports,Health Care,Health Care Equipment,"North Chicago, Illinois",1964-03-31,1800.0,...,56.338442,55.698234,56.042228,0.0,4240900.0,-0.731959,0.201628,16445880000000.0,-14639.173012,-14639.173012
1,2018-01-11,MMM,15000.0,3M Company,reports,Industrials,Industrial Conglomerates,"St. Paul, Minnesota",1976-08-09,66740.0,...,222.088512,219.714952,220.622226,0.0,1487700.0,1.895749,0.353683,16447020000000.0,28436.232002,56588.159649
0,2018-01-12,ABT,20000.0,Abbott Laboratories,reports,Health Care,Health Care Equipment,"North Chicago, Illinois",1964-03-31,1800.0,...,56.605986,56.175994,56.338434,0.0,6320900.0,-0.738628,0.205469,16447810000000.0,-14772.556368,-29411.72938
1,2018-01-12,MMM,15000.0,3M Company,reports,Industrials,Industrial Conglomerates,"St. Paul, Minnesota",1976-08-09,66740.0,...,225.442654,222.235137,222.757511,0.0,1974300.0,1.895828,0.356432,16449070000000.0,28437.418212,85025.577862
0,2018-01-15,ABT,20000.0,Abbott Laboratories,reports,Health Care,Health Care Equipment,"North Chicago, Illinois",1964-03-31,1800.0,...,,,,,,,,,,
1,2018-01-15,MMM,15000.0,3M Company,reports,Industrials,Industrial Conglomerates,"St. Paul, Minnesota",1976-08-09,66740.0,...,,,,,,,,,,
0,2018-01-16,ABT,20000.0,Abbott Laboratories,reports,Health Care,Health Care Equipment,"North Chicago, Illinois",1964-03-31,1800.0,...,56.510433,55.640895,56.271549,0.0,7175000.0,-0.743952,0.216451,16449920000000.0,-14879.035491,-44290.764871
1,2018-01-16,MMM,15000.0,3M Company,reports,Industrials,Industrial Conglomerates,"St. Paul, Minnesota",1976-08-09,66740.0,...,226.533208,223.453991,224.801149,0.0,2411100.0,1.91615,0.358569,16451500000000.0,28742.247836,113767.825698
0,2018-01-17,ABT,20000.0,Abbott Laboratories,reports,Health Care,Health Care Equipment,"North Chicago, Illinois",1964-03-31,1800.0,...,56.47221,56.02311,56.147327,0.0,5037500.0,-0.748507,0.220962,16452550000000.0,-14970.142119,-59260.90699


#### g. Add a column containing the running cumulative PnL for your entire portfolio.

In [10]:
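# Note: this sums 'Daily P&L' across tickers for each date, so every row carries
# that date's portfolio total; the totals are not accumulated across dates.
scaffoldPortfolio['Cumulative Daily Portfolio P&L'] = scaffoldPortfolio['Daily P&L']
scaffoldPortfolio['Cumulative Daily Portfolio P&L'] = scaffoldPortfolio.groupby('Date', as_index=False).transform('sum')[['Cumulative Daily Portfolio P&L']]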
scaffoldPortfolio.head(10)

Unnamed: 0,Date,Ticker,Position,Security,SEC filings,GICS Sector,GICS Sub-Industry,Headquarters Location,Date first added,CIK,...,Low,Open,Stock Splits,Volume,Daily Return,1-week MA Daily Return,Rolling Cumulative Volume,Daily P&L,Cumulative Daily P&L,Cumulative Daily Portfolio P&L
0,2018-01-10,MMM,15000.0,3M Company,reports,Industrials,Industrial Conglomerates,"St. Paul, Minnesota",1976-08-09,66740.0,...,219.971513,220.860454,0.0,1640900.0,1.876795,0.350461,16445070000000.0,28151.927647,28151.927647,28151.9
0,2018-01-11,ABT,20000.0,Abbott Laboratories,reports,Health Care,Health Care Equipment,"North Chicago, Illinois",1964-03-31,1800.0,...,55.698234,56.042228,0.0,4240900.0,-0.731959,0.201628,16445880000000.0,-14639.173012,-14639.173012,13797.1
1,2018-01-11,MMM,15000.0,3M Company,reports,Industrials,Industrial Conglomerates,"St. Paul, Minnesota",1976-08-09,66740.0,...,219.714952,220.622226,0.0,1487700.0,1.895749,0.353683,16447020000000.0,28436.232002,56588.159649,13797.1
0,2018-01-12,ABT,20000.0,Abbott Laboratories,reports,Health Care,Health Care Equipment,"North Chicago, Illinois",1964-03-31,1800.0,...,56.175994,56.338434,0.0,6320900.0,-0.738628,0.205469,16447810000000.0,-14772.556368,-29411.72938,13664.9
1,2018-01-12,MMM,15000.0,3M Company,reports,Industrials,Industrial Conglomerates,"St. Paul, Minnesota",1976-08-09,66740.0,...,222.235137,222.757511,0.0,1974300.0,1.895828,0.356432,16449070000000.0,28437.418212,85025.577862,13664.9
0,2018-01-15,ABT,20000.0,Abbott Laboratories,reports,Health Care,Health Care Equipment,"North Chicago, Illinois",1964-03-31,1800.0,...,,,,,,,,,,0.0
1,2018-01-15,MMM,15000.0,3M Company,reports,Industrials,Industrial Conglomerates,"St. Paul, Minnesota",1976-08-09,66740.0,...,,,,,,,,,,0.0
0,2018-01-16,ABT,20000.0,Abbott Laboratories,reports,Health Care,Health Care Equipment,"North Chicago, Illinois",1964-03-31,1800.0,...,55.640895,56.271549,0.0,7175000.0,-0.743952,0.216451,16449920000000.0,-14879.035491,-44290.764871,13863.2
1,2018-01-16,MMM,15000.0,3M Company,reports,Industrials,Industrial Conglomerates,"St. Paul, Minnesota",1976-08-09,66740.0,...,223.453991,224.801149,0.0,2411100.0,1.91615,0.358569,16451500000000.0,28742.247836,113767.825698,13863.2
0,2018-01-17,ABT,20000.0,Abbott Laboratories,reports,Health Care,Health Care Equipment,"North Chicago, Illinois",1964-03-31,1800.0,...,56.02311,56.147327,0.0,5037500.0,-0.748507,0.220962,16452550000000.0,-14970.142119,-59260.90699,53157.9
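
The column shown above holds the portfolio total for each date (for example, 13797.1 on 2018-01-11 is the sum of the ABT and MMM rows for that day). To turn this into a running total across dates, one possible approach is to sum per date, take a cumulative sum over the sorted dates, and map the result back onto the rows. The sketch below uses a toy frame with hypothetical values, and the column names `Portfolio Daily P&L` and `Portfolio Cumulative P&L` are assumptions.

```python
import pandas as pd

# Toy scaffold (hypothetical values); one row per (Date, Ticker)
scaffold = pd.DataFrame({
    "Date": pd.to_datetime(["2018-01-10", "2018-01-11", "2018-01-11",
                            "2018-01-12", "2018-01-12"]),
    "Ticker": ["MMM", "ABT", "MMM", "ABT", "MMM"],
    "Daily P&L": [100.0, -40.0, 60.0, None, 30.0],
})

# 1) Portfolio total per date (missing values are skipped by sum)
daily_total = scaffold.groupby("Date")["Daily P&L"].sum()
# 2) Running total over the dates in chronological order
running_total = daily_total.sort_index().cumsum()
# 3) Broadcast both back onto every row of that date
scaffold["Portfolio Daily P&L"] = scaffold["Date"].map(daily_total)
scaffold["Portfolio Cumulative P&L"] = scaffold["Date"].map(running_total)
print(scaffold)
```

With this layout, the last value of `running_total` is the total portfolio PnL at the end of the period.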


#### h. Create a summary table of your average daily PnL, total PnL, and overall yield per ticker.  
##### i. ‘Yield’ is PnL/Position.  
##### ii. You should exclude days when you had no position in that stock, from the average.

In [11]:
import numpy as np

# Note: this is an alias, not a copy, so the changes below also apply to scaffoldPortfolio
tempPortfolio = scaffoldPortfolio

# Add 'Yield' value per row. 'Yield' = 'Daily P&L' / 'Position'
tempPortfolio['Yield'] = tempPortfolio['Daily P&L'] / tempPortfolio['Position']

# Replace 0 with NaN in Daily P&L so that those days are excluded from the average
tempPortfolio['Daily P&L'] = tempPortfolio['Daily P&L'].replace(0, np.nan)

# Create a summary table of average 'Daily P&L', 'Total P&L' and 'Yield'
# 'Total P&L' = sum('Daily P&L')
# Named aggregation gives the output columns their final names directly
summaryPortfolio = tempPortfolio.groupby(by=['Ticker'], as_index=False).agg(
    **{'Average Daily P&L': ('Daily P&L', 'mean'),
       'Total P&L': ('Daily P&L', 'sum'),
       'Yield': ('Yield', 'sum')}
)

summaryPortfolio

Unnamed: 0,Ticker,Average Daily P&L,Total P&L,Yield
0,AAPL,-6530.059376,-2598964.0,-268.439727
1,ABT,-14667.919404,-7377963.0,-368.898173
2,ACN,25179.295709,6445900.0,322.294985
3,AMZN,56595.565707,27392250.0,4562.203015
4,MMM,42780.571413,21561410.0,569.108765
5,SCHW,-6727.528223,-538202.3,-32.495296
6,YUM,2814.899935,836025.3,54.593719
7,ZTS,9160.454303,3261122.0,475.512331
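
Replacing zero PnL values with NaN excludes days without a position, but it would also exclude any day on which a held stock happened to have a return of exactly zero. A possible alternative is to mask on the position itself, as sketched below with a toy frame. The values and the `Held P&L` column name are hypothetical.

```python
import pandas as pd

# Toy data (hypothetical): MMM is held for two days, one with zero P&L
toy = pd.DataFrame({
    "Ticker": ["MMM", "MMM", "MMM", "YUM", "YUM"],
    "Position": [100.0, 100.0, 0.0, 50.0, 0.0],
    "Daily P&L": [10.0, 0.0, 0.0, -5.0, 0.0],
})

# Keep P&L only on days with a non-zero position; other days become NaN
held = toy["Position"] != 0
toy["Held P&L"] = toy["Daily P&L"].where(held)

# mean and sum skip NaN, so unheld days do not affect the average
summary = toy.groupby("Ticker", as_index=False).agg(
    **{"Average Daily P&L": ("Held P&L", "mean"),
       "Total P&L": ("Held P&L", "sum")}
)
print(summary)
```

Here the MMM average is taken over both held days, including the flat one, whereas a zero-to-NaN replacement would average over the first day only.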


#### i. Display your total PnL at the end of the two years.

In [12]:
print(f'Total P&L = {summaryPortfolio["Total P&L"].sum()}')

Total P&L = 48981579.158350095


In [15]:
scaffoldPortfolio.to_csv('scaffoldPortfolio_data.csv')  # Save to CSV
summaryPortfolio.to_csv('summaryPortfolio_data.csv')

In [14]:
# To repurpose this table in other notebooks
scaffoldData = scaffoldPortfolio
%store scaffoldData
del scaffoldData

Stored 'scaffoldData' (DataFrame)
