# Stock Price Data

I created a dictionary with stock symbols as the keys and, as the values, DataFrames containing each symbol's trading information (trade date, closing price, opening price, price high, price low, and volume sold).

I read in the data from a folder of CSV files containing stock data.

In [35]:
import pandas as pd
import os

stockprices = {}

# Each file name minus its 4-character extension (e.g. ".csv") is used as the key
for fn in os.listdir("prices"):
    df = pd.read_csv(os.path.join("prices", fn))
    stockprices[fn[:-4]] = df
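
The slice `fn[:-4]` works as long as every file in the folder ends in a four-character extension such as `.csv`. A minimal sketch of a more defensive way to derive the symbol, assuming the folder might also hold other files; the helper name `symbol_from_filename` and the file names are hypothetical and not part of the original notebook:

```python
import os

def symbol_from_filename(filename):
    """Return the stock symbol for a CSV file name, or None for other files."""
    stem, ext = os.path.splitext(filename)
    if ext.lower() != ".csv":
        return None  # skip anything that is not a CSV file
    return stem

# Illustrative file names only; they are not taken from the real folder.
names = ["amzn.csv", "blfs.csv", "notes.txt", ".DS_Store"]
symbols = [s for s in map(symbol_from_filename, names) if s is not None]
print(symbols)
```

With a helper like this, the loading loop could skip any name for which the helper returns `None` before calling `pd.read_csv`.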




# Computing Aggregates

## Computing Average Closing Prices

To compute average closing prices, I built a dictionary where the keys are the stock symbols and the values are the average closing price of each stock.

In [36]:
avg_stockprices = {}

# Reuse the data frames loaded above instead of reading each CSV again
for symbol, df in stockprices.items():
    avg_stockprices[symbol] = df['close'].mean()


## Computing Minimum Closing Prices

I found the stock symbol with the lowest average closing price.

In [37]:
close_min = min(avg_stockprices.values())

for key,value in avg_stockprices.items():
    if value == close_min:
        print(key,value)

blfs 0.8122763011583011


## Computing Maximum Closing Prices

I found the stock symbol with the highest average closing price.

In [38]:
close_max = max(avg_stockprices.values())

for key,value in avg_stockprices.items():
    if value == close_max:
        print(key,value)

amzn 275.13407757104255
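
The two lookups above first compute the extreme value and then scan the dictionary for a matching key. As a sketch of a shorter alternative, `min` and `max` can compare the keys by their values directly. The `blfs` and `amzn` values below are copied from the outputs above; the `xyz` entry is made up for illustration:

```python
# 'blfs' and 'amzn' come from the notebook outputs; 'xyz' is hypothetical.
avg_stockprices = {
    "blfs": 0.8122763011583011,
    "xyz": 42.0,
    "amzn": 275.13407757104255,
}

# dict.get maps each key to its value, so min/max compare by average price
lowest = min(avg_stockprices, key=avg_stockprices.get)
highest = max(avg_stockprices, key=avg_stockprices.get)

print(lowest, avg_stockprices[lowest])
print(highest, avg_stockprices[highest])
```

One difference to keep in mind: this returns a single key even when several stocks tie, whereas the loops above print every symbol that matches.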


# Organizing The Trades Per Day

To organize the trades per day, I built a dictionary where the keys are the trading days and the values are lists of [stocksymbol, volume] pairs, one for each stock traded on that day.

In [39]:
bydate = {}
            
for stocksymbol, data in stockprices.items():
    for date_index, date in enumerate(data['date']):
        if date not in bydate:
            bydate[date]=[]
        bydate[date].append([stocksymbol,data['volume'][date_index]])
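
The `if date not in bydate` check above can also be handled by `collections.defaultdict`, which creates the empty list the first time a date is seen. A small sketch on made-up rows (the symbols, dates, and volumes are hypothetical stand-ins for the data frames):

```python
from collections import defaultdict

# Hypothetical (symbol, date, volume) rows standing in for the data frames.
rows = [
    ("aaa", "2007-01-03", 100),
    ("bbb", "2007-01-03", 250),
    ("aaa", "2007-01-04", 300),
]

bydate = defaultdict(list)  # a missing date starts out as an empty list
for symbol, date, volume in rows:
    bydate[date].append([symbol, volume])

print(dict(bydate))
```

Note that reading a missing key from a `defaultdict` silently creates it, so converting the result with `dict(bydate)` once the grouping is done may be the safer choice.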

In [40]:
bydate['2007-01-03']

[['dgica', 30100],
 ['bdge', 100],
 ['cvco', 36500],
 ['blkb', 365800],
 ['bbox', 108200],
 ['ffbc', 192600],
 ['fbiz', 400],
 ['ffic', 46300],
 ['bdsi', 29500],
 ['amgn', 12908400],
 ['expe', 1703300],
 ['expd', 2343300],
 ['clct', 25100],
 ['alny', 269000],
 ['evol', 72900],
 ['ahgp', 127900],
 ['dfbg', 4700],
 ['afsi', 1312400],
 ['chy', 203200],
 ['bmrn', 1946200],
 ['agys', 993300],
 ['adrd', 19700],
 ['drrx', 380300],
 ['crus', 1216400],
 ['brew', 15500],
 ['fbms', 0],
 ['emcf', 1600],
 ['bsqr', 63000],
 ['csfl', 4300],
 ['car', 1146900],
 ['cmcsa', 39543600],
 ['cmtl', 241800],
 ['elos', 820400],
 ['eltk', 11600],
 ['agii', 151000],
 ['coke', 23300],
 ['egan', 4000],
 ['cpss', 114000],
 ['adtn', 2596900],
 ['ffiv', 1740400],
 ['cspi', 3500],
 ['bwen', 2700],
 ['cgnx', 434200],
 ['cdns', 5240400],
 ['egt', 10600],
 ['cray', 485600],
 ['arcw', 500],
 ['bncn', 27400],
 ['admp', 1100],
 ['cnsl', 205100],
 ['abax', 193000],
 ['aris', 2000],
 ['cyrn', 257600],
 ['asys', 8500],
 ['bosc

# Finding The Most Traded Stock Each Day

To find the most traded stock for each day, I created a dictionary where the keys are the trading days and each value is a one-element list holding the [stock symbol, volume] pair with the highest volume on that day.


In [41]:
most_traded = {}
for k, value in bydate.items():
    # Sort the day's [symbol, volume] pairs by volume, largest first
    ordered = sorted(value, key=lambda i: i[1], reverse=True)
    # Each date appears only once in bydate, so assign the top pair directly
    most_traded[k] = [ordered[0]]

In [42]:
print(most_traded['2007-01-03'])
print(most_traded['2007-01-04'])
print(most_traded['2007-01-05'])
print(most_traded['2007-01-08'])
print(most_traded['2007-01-09'])

[['aapl', 309579900]]
[['aapl', 211815100]]
[['aapl', 208685400]]
[['aapl', 199276700]]
[['aapl', 837324600]]
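
Sorting each day's list only to read off its first element does more work than needed. A sketch of the same lookup with `max`, which scans each list once; the symbols and volumes are hypothetical, and the pair is stored directly rather than wrapped in a one-element list:

```python
# Hypothetical volumes; the structure mirrors bydate above.
bydate = {
    "2007-01-03": [["aaa", 100], ["bbb", 250]],
    "2007-01-04": [["aaa", 300], ["bbb", 50]],
}

# max scans each day's list once instead of sorting it
most_traded = {
    date: max(trades, key=lambda pair: pair[1])
    for date, trades in bydate.items()
}
print(most_traded)
```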


# Searching For High Volume Days

To find the days with the highest volumes, I built a list of (total volume, date) pairs, one per trading day, sorted it in descending order, and looked at the top ten.

In [43]:
total_volume = []

for date in bydate:
    volumes_sum = sum([volume for stocksymbol, volume in bydate[date]])
    total_volume.append((volumes_sum, date))

total_volume.sort(reverse=True)

total_volume[:10]

[(1964583900, '2008-01-23'),
 (1770266900, '2008-10-10'),
 (1611272800, '2007-07-26'),
 (1599183500, '2008-10-08'),
 (1578877700, '2008-01-22'),
 (1559032100, '2008-02-07'),
 (1555072400, '2008-09-29'),
 (1553880500, '2007-11-08'),
 (1536176400, '2008-01-16'),
 (1533363200, '2008-01-24')]
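
When only the top few days are needed, `heapq.nlargest` is one alternative to sorting the whole list. A minimal sketch on made-up data (symbols, dates, and volumes are hypothetical):

```python
import heapq

# Hypothetical volumes; the structure mirrors bydate above.
bydate = {
    "2007-01-03": [["aaa", 100], ["bbb", 250]],
    "2007-01-04": [["aaa", 300], ["bbb", 150]],
    "2007-01-05": [["aaa", 10], ["bbb", 20]],
}

# One (total volume, date) pair per trading day
total_volume = [
    (sum(volume for _, volume in trades), date)
    for date, trades in bydate.items()
]

# The two largest totals, returned in descending order
top_two = heapq.nlargest(2, total_volume)
print(top_two)
```

Because the tuples start with the volume, they compare by volume first and fall back to the date string only on ties, the same ordering the `sort(reverse=True)` call above relies on.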

# Finding Profitable Stocks

To find the most profitable stocks, I calculated the percentage change between each stock's first and last closing price over the period covered (2007-2017) and stored the results in a list of (percentage, stock symbol) pairs.

In [44]:
percentages = []

for stocksymbol in stockprices:
    prices = stockprices[stocksymbol]
    initial = prices.loc[0, "close"]
    final = prices.loc[prices.shape[0] - 1, "close"]
    percentage = 100 * (final - initial) / initial
    percentages.append((percentage, stocksymbol))

percentages.sort(reverse=True)

percentages[:10]
                                        

[(7483.8389225948395, 'admp'),
 (4005.0000000000005, 'adxs'),
 (3898.6004898285596, 'arcw'),
 (2437.4365640858978, 'blfs'),
 (2230.7234281466817, 'amzn'),
 (1707.3554472785036, 'anip'),
 (1549.6700659868027, 'apdn'),
 (1525.162516251625, 'cui'),
 (1339.2137535980346, 'bcli'),
 (1330.0000666666667, 'achc')]
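
The growth figure is the relative change from the first close to the last, expressed as a percentage. A small worked sketch of that formula; the function name `percent_growth` and the sample prices are hypothetical, and the zero check is an added safeguard that the notebook code does not include:

```python
def percent_growth(initial, final):
    """Percentage change from the initial price to the final price."""
    if initial == 0:
        # Added safeguard: a zero starting price would divide by zero
        raise ValueError("initial price must be non-zero")
    return 100 * (final - initial) / initial

print(percent_growth(10.0, 25.0))  # 150.0
print(percent_growth(20.0, 15.0))  # -25.0
```

A result of 150.0 means the price ended at 2.5 times its starting value, and a negative result means the stock lost value over the period.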