# Analyzing Stock Prices

We'll work with stock market data downloaded from [Yahoo Finance](https://finance.yahoo.com/) using the [yahoo_finance](https://pypi.python.org/pypi/yahoo-finance) Python package. The data consists of daily stock prices from 2007-01-01 to 2017-04-17 for several hundred stock symbols traded on the NASDAQ stock exchange, stored in the `prices` folder. The `download_data.py` script, located in the same folder as the Jupyter notebook, was used to download all of the stock price data. Each file in the `prices` folder is named for a specific stock symbol and contains the following columns:

- `date` -- the date the data is from.
- `close` -- the closing price on that day, which is the price when the trading day ends.
- `open` -- the opening price on that day, which is the price when the trading day starts.
- `high` -- the highest price the stock reached during trading.
- `low` -- the lowest price the stock reached during trading.
- `volume` -- the number of shares that were traded during the day.

In [35]:
import pandas as pd
import os

In [36]:
stock_prices = {}

In [37]:
for file_name in os.listdir("prices"):
    # Each file holds one stock's history; key the DataFrame by its ticker symbol.
    df = pd.read_csv(os.path.join("prices", file_name))
    stock_prices[os.path.splitext(file_name)[0]] = df

In [38]:
stock_prices["aapl"].head()

| Unnamed: 0 | date | close | open | high | low | volume |
|---|---|---|---|---|---|---|
| 0 | 2007-01-03 | 83.800002 | 86.289999 | 86.579999 | 81.899999 | 309579900 |
| 1 | 2007-01-04 | 85.659998 | 84.050001 | 85.949998 | 83.820003 | 211815100 |
| 2 | 2007-01-05 | 85.049997 | 85.77 | 86.199997 | 84.400002 | 208685400 |
| 3 | 2007-01-08 | 85.47 | 85.959998 | 86.529998 | 85.280003 | 199276700 |
| 4 | 2007-01-09 | 92.570003 | 86.450003 | 92.979999 | 85.15 | 837324600 |
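
The `Unnamed: 0` column in the output above is what pandas produces when a CSV's first column has no header. Most likely it is a row index that was written out when the files were saved, though that is an assumption because the download script is not shown here. A minimal sketch of loading such a file so that the column becomes the index and `date` is parsed as a datetime, using an inline sample built from the first two AAPL rows rather than a real file:

```python
import io

import pandas as pd

# Inline stand-in for prices/aapl.csv, built from the rows shown above.
# The empty first header field is an assumption about how the files were saved.
sample_csv = io.StringIO(
    ",date,close,open,high,low,volume\n"
    "0,2007-01-03,83.800002,86.289999,86.579999,81.899999,309579900\n"
    "1,2007-01-04,85.659998,84.050001,85.949998,83.820003,211815100\n"
)

# index_col=0 uses the unnamed first column as the index instead of keeping it
# as data; parse_dates converts the date strings to datetime64 values.
df = pd.read_csv(sample_csv, index_col=0, parse_dates=["date"])

print(df.dtypes)
```

Parsed dates are optional for this analysis, since ISO-formatted date strings already sort and compare correctly as text.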


## Minimum and Maximum Average Closing Prices

In [39]:
closing_prices = {}
min_avg_closing_price = float("inf")
max_avg_closing_price = 0

In [40]:
for stock, df in stock_prices.items():
    # Average closing price over the stock's full history.
    avg_closing_price = df["close"].mean()
    # Track the running minimum and maximum across all stocks.
    if avg_closing_price < min_avg_closing_price:
        min_avg_closing_price = avg_closing_price
    if avg_closing_price > max_avg_closing_price:
        max_avg_closing_price = avg_closing_price
    closing_prices[stock] = avg_closing_price

closing_prices

{'dgica': 14.986583006177607,
 'bdge': 24.12035132432432,
 'cvco': 53.36543631042471,
 'blkb': 33.75537838185328,
 'bbox': 25.997579137451737,
 'ffbc': 16.002316604633204,
 'fbiz': 22.958876448262547,
 'ffic': 16.59364864787645,
 'bdsi': 4.8207065644787646,
 'amgn': 92.2331003965251,
 'expe': 53.78315830308881,
 'expd': 42.86821235366795,
 'cur': 1.907691699604743,
 'clct': 14.4366796011583,
 'alny': 39.171486488030894,
 'evol': 5.701853281853282,
 'ahgp': 38.20530885868726,
 'dfbg': 1.4005010393822395,
 'afsi': 26.69982658918919,
 'chy': 12.45603860888031,
 'bmrn': 50.52171040733592,
 'agys': 10.303613901544402,
 'adrd': 22.51748262046332,
 'drrx': 2.352779922779923,
 'crus': 20.50549421969112,
 'brew': 8.094903474517373,
 'fbms': 14.224830092664092,
 'emcf': 22.219915054826256,
 'bsqr': 4.305370656370656,
 'csfl': 11.947644780694981,
 'car': 24.829617774131272,
 'cmcsa': 35.904505792277995,
 'capr': 2.473247462919594,
 'cmtl': 30.96300771621621,
 'elos': 11.39328957027027,
 'cplp': 9

In [41]:
print(min_avg_closing_price)
print(max_avg_closing_price)

0.8122763011583011
275.13407757104255
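
The two numbers above do not say which stocks they belong to. One way to recover the symbols is to take the minimum and maximum over the dictionary keys, ranked by value. The sketch below uses only a four-entry subset of the averages printed earlier, so its results apply to that subset and not to the full dataset:

```python
# Subset of the averages printed above; the full dict has several hundred entries.
closing_prices = {
    "dgica": 14.986583006177607,
    "amgn": 92.2331003965251,
    "cur": 1.907691699604743,
    "dfbg": 1.4005010393822395,
}

# min/max iterate over the keys and rank each key by its average price.
lowest = min(closing_prices, key=closing_prices.get)
highest = max(closing_prices, key=closing_prices.get)

print(lowest, closing_prices[lowest])
print(highest, closing_prices[highest])
```

Run against the full `closing_prices` dictionary, the same two calls would name the stocks behind the minimum and maximum averages.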


In [42]:
trades_by_day = {}

In [43]:
for stock, df in stock_prices.items():
    for _, row in df.iterrows():
        # Store (volume, symbol) so that pairs compare by volume first.
        pair = (row["volume"], stock)
        trades_by_day.setdefault(row["date"], []).append(pair)

In [None]:
trades_by_day
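
Iterating row by row with `iterrows` can be slow when there are several hundred frames with a decade of daily rows each. A possible alternative, sketched here on two tiny frames with hypothetical symbols and volumes, is to stack everything into one table and group by date:

```python
import pandas as pd

# Hypothetical miniature stand-in for the stock_prices dict built earlier.
stock_prices = {
    "aaa": pd.DataFrame({"date": ["2007-01-03", "2007-01-04"], "volume": [100, 300]}),
    "bbb": pd.DataFrame({"date": ["2007-01-03", "2007-01-04"], "volume": [250, 50]}),
}

# Stack every frame into one long table, tagging each row with its symbol.
combined = pd.concat(
    [df.assign(symbol=symbol) for symbol, df in stock_prices.items()],
    ignore_index=True,
)

# Group rows by date and collect (volume, symbol) pairs, mirroring trades_by_day.
trades_by_day = {
    day: list(zip(group["volume"], group["symbol"]))
    for day, group in combined.groupby("date")
}

print(trades_by_day)
```

The resulting dictionary has the same shape as the one built by the loop, so the later cells should work on it unchanged.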

## Finding the Most Traded Stock Each Day

In [22]:
most_traded_by_day = {}

In [24]:
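for day, trades in trades_by_day.items():
    # Each entry is a (volume, symbol) pair, so compare on the volume (x[0]).
    # Keying on x[1] would pick the alphabetically last symbol instead, which
    # is why the saved output below lists 'fmbi' for every day.
    most_traded_by_day[day] = max(trades, key=lambda x: x[0])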

In [25]:
most_traded_by_day

{'2007-01-03': (275000, 'fmbi'),
 '2007-01-04': (208700, 'fmbi'),
 '2007-01-05': (321300, 'fmbi'),
 '2007-01-08': (222000, 'fmbi'),
 '2007-01-09': (192500, 'fmbi'),
 '2007-01-10': (124400, 'fmbi'),
 '2007-01-11': (82700, 'fmbi'),
 '2007-01-12': (99600, 'fmbi'),
 '2007-01-16': (196300, 'fmbi'),
 '2007-01-17': (118700, 'fmbi'),
 '2007-01-18': (248600, 'fmbi'),
 '2007-01-19': (189900, 'fmbi'),
 '2007-01-22': (134200, 'fmbi'),
 '2007-01-23': (87700, 'fmbi'),
 '2007-01-24': (434400, 'fmbi'),
 '2007-01-25': (312500, 'fmbi'),
 '2007-01-26': (305200, 'fmbi'),
 '2007-01-29': (251200, 'fmbi'),
 '2007-01-30': (230700, 'fmbi'),
 '2007-01-31': (213500, 'fmbi'),
 '2007-02-01': (190800, 'fmbi'),
 '2007-02-02': (203200, 'fmbi'),
 '2007-02-05': (246600, 'fmbi'),
 '2007-02-06': (140800, 'fmbi'),
 '2007-02-07': (158800, 'fmbi'),
 '2007-02-08': (158800, 'fmbi'),
 '2007-02-09': (176800, 'fmbi'),
 '2007-02-12': (139500, 'fmbi'),
 '2007-02-13': (159900, 'fmbi'),
 '2007-02-14': (155600, 'fmbi'),
 '2007-02-15'
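
The entries in `trades_by_day` are `(volume, symbol)` tuples, so the `key` passed to `max` decides whether the result is the highest-volume stock or simply the alphabetically last symbol. A small sketch with hypothetical pairs for a single day shows the difference:

```python
# Hypothetical (volume, symbol) pairs for a single day.
trades = [(100, "zzz"), (250, "bbb"), (175, "mmm")]

# Comparing on the first element selects the highest volume.
by_volume = max(trades, key=lambda pair: pair[0])

# Comparing on the second element selects the alphabetically last symbol.
by_symbol = max(trades, key=lambda pair: pair[1])

print(by_volume)
print(by_symbol)
```

Plain `max(trades)` also ranks by volume, because tuples compare element by element and the symbol only breaks ties.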

## Searching for High Volume Days

In [47]:
daily_volumes = []

In [48]:
for day in trades_by_day:
    # Total shares traded across all stocks on this day.
    day_volume = sum(volume for volume, _ in trades_by_day[day])
    daily_volumes.append((day_volume, day))

In [49]:
daily_volumes.sort()

In [50]:
daily_volumes[-10:]

[(1533363200, '2008-01-24'),
 (1536176400, '2008-01-16'),
 (1553880500, '2007-11-08'),
 (1555072400, '2008-09-29'),
 (1559032100, '2008-02-07'),
 (1578877700, '2008-01-22'),
 (1599183500, '2008-10-08'),
 (1611272800, '2007-07-26'),
 (1770266900, '2008-10-10'),
 (1964583900, '2008-01-23')]
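
Sorting the whole list and slicing works well at this size. When only the top few entries are needed, `heapq.nlargest` from the standard library is another option: it returns them in descending order and leaves the original list untouched. A sketch with hypothetical totals:

```python
import heapq

# Hypothetical (total_volume, day) pairs standing in for daily_volumes.
daily_volumes = [
    (500, "2008-01-02"),
    (900, "2008-01-03"),
    (300, "2008-01-04"),
    (700, "2008-01-07"),
]

# nlargest returns the top entries, largest first, without sorting in place.
top_two = heapq.nlargest(2, daily_volumes)

print(top_two)
```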

## Finding Profitable Stocks

In [51]:
percentages = []

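for stock_sym in stock_prices:
    prices = stock_prices[stock_sym]
    # First and last closing prices; assumes rows are sorted by date,
    # as in the AAPL sample shown earlier.
    initial = prices["close"].iloc[0]
    final = prices["close"].iloc[-1]
    # Percentage change in closing price over the whole period.
    percentage = 100 * (final - initial) / initial
    percentages.append((percentage, stock_sym))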

percentages.sort()

percentages[-10:]

[(1330.0000666666667, 'achc'),
 (1339.2137535980346, 'bcli'),
 (1525.162516251625, 'cui'),
 (1549.6700659868027, 'apdn'),
 (1707.3554472785036, 'anip'),
 (2230.7234281466817, 'amzn'),
 (2437.4365640858978, 'blfs'),
 (3898.6004898285596, 'arcw'),
 (4005.0000000000005, 'adxs'),
 (7483.8389225948395, 'admp')]
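
The percentage calculation takes the first and last rows of each file as the earliest and latest prices, which only holds if the rows are in chronological order. A defensive sketch that sorts by date first, using two AAPL closing prices from the sample shown earlier and deliberately listing them out of order:

```python
import pandas as pd

# Two AAPL closes from the sample shown earlier, deliberately out of order.
prices = pd.DataFrame({
    "date": ["2007-01-09", "2007-01-03"],
    "close": [92.570003, 83.800002],
})

# Sort chronologically; ISO-formatted date strings sort correctly as text.
ordered = prices.sort_values("date")

# Positional access picks the earliest and latest rows after sorting.
initial = ordered["close"].iloc[0]
final = ordered["close"].iloc[-1]
percentage = 100 * (final - initial) / initial

print(round(percentage, 2))
```

Positional access with `iloc` is used here because sorting keeps the original index labels, so `loc[0, "close"]` would still return the row that was first before sorting.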