# Analyzing Stock Prices

In this project, we'll work with stock market data downloaded from Yahoo Finance using the yahoo_finance Python package. The data consists of daily stock prices from 2007-01-01 to 2017-04-17 for several hundred stock symbols traded on the NASDAQ stock exchange, stored in the prices folder.

The download_data.py script, located in the same folder as the Jupyter notebook, was used to download all of the stock price data. Each file in the prices folder is named for a specific stock symbol and contains the following columns:

* date -- date that the data is from.
* close -- the closing price on that day, which is the price when the trading day ends.
* open -- the opening price on that day, which is the price when the trading day starts.
* high -- the highest price the stock reached during trading.
* low -- the lowest price the stock reached during trading.
* volume -- the number of shares that were traded during the day.

## Stock Price Data

We'll load every file in the prices folder into a dictionary of DataFrames keyed by stock symbol, then preview the first rows for aapl.

In [3]:
import pandas as pd
import os

stocks = {}

for fn in os.listdir("prices"):
    # Each file is named <symbol>.csv, so the part before the first '.' is the stock symbol
    stocks[fn.split('.')[0]] = pd.read_csv(os.path.join("prices", fn))
    
stocks['aapl'].head()

Unnamed: 0,date,close,open,high,low,volume
0,2007-01-03,83.800002,86.289999,86.579999,81.899999,309579900
1,2007-01-04,85.659998,84.050001,85.949998,83.820003,211815100
2,2007-01-05,85.049997,85.77,86.199997,84.400002,208685400
3,2007-01-08,85.47,85.959998,86.529998,85.280003,199276700
4,2007-01-09,92.570003,86.450003,92.979999,85.15,837324600
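
The Unnamed: 0 column in the preview suggests, though the source does not confirm it, that the files were written with their row index included. A minimal sketch of how read_csv treats such a file, using a small made-up CSV string rather than a real file from the prices folder:

```python
import io

import pandas as pd

# Hypothetical two-row CSV whose first column has an empty header,
# which is what a saved row index looks like.
csv_text = ",date,close\n0,2007-01-03,83.800002\n1,2007-01-04,85.659998\n"

# Default read: the unnamed first column is kept as a regular column.
default_read = pd.read_csv(io.StringIO(csv_text))

# index_col=0 tells pandas to use the first column as the row index instead.
indexed_read = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(default_read.columns))
print(list(indexed_read.columns))
```

Either form works for the analysis that follows, since none of the later steps use the extra column.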


## Minimum and Maximum Average Closing Prices

We'll compute the following aggregates:

* The average closing price of each stock.
* The stock with the lowest average closing price.
* The stock with the highest average closing price.

In [7]:
average_closing = {}

for stock in stocks:
    average_closing[stock] = stocks[stock]['close'].mean()
    
# min/max with key=average_closing.get return the stock symbol (the dictionary key),
# not the price itself, so the price is looked up separately when printing.
min_avg_closing_stock = min(average_closing, key=average_closing.get)
max_avg_closing_stock = max(average_closing, key=average_closing.get)

print('Minimum: ' + str(min_avg_closing_stock))
print(average_closing[min_avg_closing_stock])
print('Maximum: ' + str(max_avg_closing_stock))
print(average_closing[max_avg_closing_stock])

Minimum: blfs
0.8122763011583011
Maximum: amzn
275.13407757104255
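
The same aggregates can also be expressed with a pandas Series, whose idxmin and idxmax methods return the label of the smallest and largest value. The sketch below is an alternative formulation, not the notebook's own method, and it uses a hypothetical toy_stocks dictionary with made-up symbols and prices in place of the real stocks dictionary.

```python
import pandas as pd

# Hypothetical toy data standing in for the `stocks` dictionary.
toy_stocks = {
    "aaa": pd.DataFrame({"close": [1.0, 2.0, 3.0]}),
    "bbb": pd.DataFrame({"close": [10.0, 20.0, 30.0]}),
    "ccc": pd.DataFrame({"close": [5.0, 5.0, 5.0]}),
}

# One Series indexed by symbol, holding each stock's mean closing price.
average_closing = pd.Series(
    {symbol: df["close"].mean() for symbol, df in toy_stocks.items()}
)

# idxmin/idxmax return the index label (the symbol), not the value.
lowest_symbol = average_closing.idxmin()
highest_symbol = average_closing.idxmax()

print(lowest_symbol, average_closing[lowest_symbol])
print(highest_symbol, average_closing[highest_symbol])
```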


## Grouping Trades per Day

We'll group the data by date for later analysis: for each trading day, we collect one (volume, stock) pair per stock.

In [9]:
trades_by_day = {}
    
for stock in stocks:
    for idx, row in stocks[stock].iterrows():
        day = row["date"]
        volume = row["volume"]
        pair = (volume, stock)
        if day not in trades_by_day:
            trades_by_day[day] = []
        trades_by_day[day].append(pair)
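
A possible variation on this grouping step, sketched on made-up data: defaultdict removes the need for the membership check, and zipping the two columns avoids constructing a Series for every row, which is what iterrows does. The toy_stocks dictionary and its symbols are hypothetical.

```python
from collections import defaultdict

import pandas as pd

# Hypothetical toy data standing in for the `stocks` dictionary.
toy_stocks = {
    "aaa": pd.DataFrame({"date": ["2007-01-03", "2007-01-04"], "volume": [100, 300]}),
    "bbb": pd.DataFrame({"date": ["2007-01-03", "2007-01-04"], "volume": [250, 50]}),
}

# Missing days start out as an empty list automatically.
trades_by_day = defaultdict(list)

for symbol, df in toy_stocks.items():
    # tolist() converts the columns to plain Python values before pairing them up.
    for day, volume in zip(df["date"].tolist(), df["volume"].tolist()):
        trades_by_day[day].append((volume, symbol))

print(dict(trades_by_day))
```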

## Finding the Most Traded Stock Each Day

Next, we'll find the most traded stock on each day, meaning the stock with the highest volume.

In [14]:
most_traded_by_day = {}

for day in trades_by_day:
    # Tuples sort by volume first, then by symbol, so the last entry has the highest volume.
    trades_by_day[day].sort()
    most_traded_by_day[day] = trades_by_day[day][-1]

In [15]:
print(most_traded_by_day['2007-01-03'])
print(most_traded_by_day['2007-01-04'])
print(most_traded_by_day['2007-01-05'])
print(most_traded_by_day['2007-01-08'])

(309579900, 'aapl')
(211815100, 'aapl')
(208685400, 'aapl')
(199276700, 'aapl')
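
If only the top entry per day is needed, max gives the same result as sorting and taking the last element, without reordering the lists. This is a sketch on a hypothetical toy_trades_by_day dictionary; note that the notebook's in-place sort has a side effect (each day's list ends up ordered by volume) that max does not reproduce.

```python
# Hypothetical (volume, symbol) pairs per day, standing in for `trades_by_day`.
toy_trades_by_day = {
    "2007-01-03": [(100, "aaa"), (250, "bbb"), (0, "ccc")],
    "2007-01-04": [(300, "aaa"), (50, "bbb"), (300, "ccc")],
}

# max compares tuples element by element: volume first, then symbol.
# On a volume tie, the alphabetically later symbol wins, as with sort()[-1].
most_traded_by_day = {day: max(pairs) for day, pairs in toy_trades_by_day.items()}

print(most_traded_by_day)
```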


## Searching for High Volume Days

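We will now find the 10 days with the highest total trading volume across all stocks. First, a quick look at the sorted entries for a single day shows that some stocks have a volume of 0 on that day.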

In [16]:
print(trades_by_day['2007-01-03'][0:5])

[(0, 'apps'), (0, 'asrvp'), (0, 'atlo'), (0, 'aubn'), (0, 'banfp')]


In [19]:
volume_per_day = []

for day in trades_by_day:
    day_volume = sum([volume for volume, _ in trades_by_day[day]])
    volume_per_day.append((day_volume, day))

volume_per_day.sort()

volume_per_day[-10:]

[(1533363200, '2008-01-24'),
 (1536176400, '2008-01-16'),
 (1553880500, '2007-11-08'),
 (1555072400, '2008-09-29'),
 (1559032100, '2008-02-07'),
 (1578877700, '2008-01-22'),
 (1599183500, '2008-10-08'),
 (1611272800, '2007-07-26'),
 (1770266900, '2008-10-10'),
 (1964583900, '2008-01-23')]
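
When only the top few days matter, heapq.nlargest from the standard library is one way to select them without sorting the whole list. The sketch below uses a hypothetical toy_trades_by_day dictionary with made-up volumes.

```python
import heapq

# Hypothetical (volume, symbol) pairs per day, standing in for `trades_by_day`.
toy_trades_by_day = {
    "2008-01-22": [(500, "aaa"), (200, "bbb")],
    "2008-01-23": [(900, "aaa"), (400, "bbb")],
    "2008-01-24": [(100, "aaa"), (150, "bbb")],
}

# Total volume per day, stored as (total, day) so tuples compare by total first.
volume_per_day = [
    (sum(volume for volume, _ in pairs), day)
    for day, pairs in toy_trades_by_day.items()
]

# The two days with the highest total volume, largest first.
top_two = heapq.nlargest(2, volume_per_day)

print(top_two)
```

Unlike the slice of a sorted list, which is in ascending order, nlargest returns its results in descending order.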

## Finding Profitable Stocks

Next, we'll find the most profitable stocks over the period by:

* Subtracting the initial close price (first row) from the final close price (last row), then computing a percentage relative to the initial price. This will tell us how much our initial investment would have grown or shrunk.
* Sorting all of the percentages.
* Finding the ten stocks that grew the most in the time period.

In [22]:
percentages = []

for stock in stocks:
    prices = stocks[stock]
    # Positional indexing picks the first and last rows regardless of the index labels.
    initial = prices["close"].iloc[0]
    final = prices["close"].iloc[-1]
    percentage = 100 * (final - initial) / initial
    percentages.append((percentage, stock))

percentages.sort()

percentages[-10:]

[(1330.0000666666667, 'achc'),
 (1339.2137535980346, 'bcli'),
 (1525.162516251625, 'cui'),
 (1549.6700659868027, 'apdn'),
 (1707.3554472785036, 'anip'),
 (2230.7234281466817, 'amzn'),
 (2437.4365640858978, 'blfs'),
 (3898.6004898285596, 'arcw'),
 (4005.0000000000005, 'adxs'),
 (7483.8389225948395, 'admp')]
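
To make the percentage calculation easier to check by hand, it can be isolated in a small function. The percent_change helper below is a hypothetical name introduced for illustration, the prices are made up, and the zero check is an added assumption rather than something the notebook does.

```python
def percent_change(initial: float, final: float) -> float:
    """Return the percentage change from initial to final."""
    if initial == 0:
        # Assumption: a zero starting price has no meaningful percentage change.
        raise ValueError("initial price must be non-zero")
    return 100 * (final - initial) / initial


# Hypothetical prices, not taken from the dataset.
gain = percent_change(50.0, 75.0)
loss = percent_change(80.0, 20.0)

print(gain)  # 50.0
print(loss)  # -75.0
```

Read this way, a value of 100 means the price doubled, and the 7483.84 reported for admp above corresponds to a final close roughly 75.8 times the initial close.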