---
__Analyzing Stock Prices__

---

In this project, we'll look at stock market data downloaded with the yahoo_finance Python package.

This data consists of the daily stock prices from 2007-01-01 to 2017-04-17 for several hundred stock symbols traded on the NASDAQ stock exchange, stored in the prices folder.

The download_data.py script in the same folder as the Jupyter notebook was used to download all of the stock price data. 

Each file in the prices folder is named for a specific stock symbol and contains the following columns:

- date -- the date that the data is from.
- close -- the closing price on that day, which is the price when the trading day ends.
- open -- the opening price on that day, which is the price when the trading day starts.
- high -- the highest price the stock reached during trading.
- low -- the lowest price the stock reached during trading.
- volume -- the number of shares that were traded during the day.
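As a minimal sketch of how one of these files could be loaded with the date column parsed, the example below reads two rows copied from the aapl preview shown later in this notebook. The exact header layout and the path `prices/aapl.csv` are assumptions; the in-memory buffer only stands in for a real file.

```python
import io

import pandas as pd

# Two rows copied from the aapl preview in this notebook. The header layout is
# an assumption; a real run would pass a path such as "prices/aapl.csv"
# (hypothetical file name) instead of this in-memory buffer.
sample_csv = io.StringIO(
    "date,close,open,high,low,volume\n"
    "2007-01-03,83.800002,86.289999,86.579999,81.899999,309579900\n"
    "2007-01-04,85.659998,84.050001,85.949998,83.820003,211815100\n"
)

# parse_dates converts the date strings into datetime64 values, which makes
# later date filtering and sorting more reliable than comparing strings.
prices = pd.read_csv(sample_csv, parse_dates=["date"])

print(prices.dtypes)
```

The notebook itself keeps the dates as plain strings, which also sort correctly because they use the YYYY-MM-DD format.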

In [1]:
import os

import pandas as pd

In [2]:
print("Files in directory: ", len(os.listdir('prices/')))

Files in directory:  560


In [3]:
# Total number of rows across every file in the prices folder
total_rows = sum(pd.read_csv(os.path.join('prices', f)).shape[0]
                 for f in os.listdir('prices'))
print('Rows across all directory files:', total_rows)

Rows across all directory files: 1449763


In [4]:
# Create dictionary of dataframes for directory files
directory = 'prices'
stocks = {}
for file in os.listdir(directory):
    key = file.split('.')[0]
    val = pd.read_csv(os.path.join(directory, file))
    stocks[key] = val

In [5]:
stocks['aapl'].head()

Unnamed: 0,date,close,open,high,low,volume
0,2007-01-03,83.800002,86.289999,86.579999,81.899999,309579900
1,2007-01-04,85.659998,84.050001,85.949998,83.820003,211815100
2,2007-01-05,85.049997,85.77,86.199997,84.400002,208685400
3,2007-01-08,85.47,85.959998,86.529998,85.280003,199276700
4,2007-01-09,92.570003,86.450003,92.979999,85.15,837324600


__Closing Prices__

In [6]:
# Average closing price for each stock symbol
closing_mean = {symbol: df['close'].mean() for symbol, df in stocks.items()}

# One row per symbol, sorted from highest to lowest average close
closing_mean = pd.DataFrame({'close_mean': pd.Series(closing_mean)})
closing_mean = closing_mean.sort_values(by='close_mean', ascending=False)

In [7]:
print('Top 5') 
closing_mean.head().round(2)

Top 5


Unnamed: 0,close_mean
amzn,275.13
aapl,257.18
cme,230.29
atri,228.39
fcnca,200.25


In [8]:
print('Bottom 5')
closing_mean.tail().round(2)

Bottom 5


Unnamed: 0,close_mean
cyrx,1.16
bcli,1.0
bmra,0.9
apdn,0.82
blfs,0.81


__Trades Per Day__

In [9]:
# Map each date to a list of (volume, symbol) pairs, one pair per stock
day_trades = {}

for symbol, df in stocks.items():
    # Iterating over the two columns avoids the slow DataFrame.iterrows()
    for day, vol in zip(df['date'], df['volume']):
        day_trades.setdefault(day, []).append((vol, symbol))

__Most Traded Stock Per Day__

In [10]:
# For each date, keep the (volume, symbol) pair with the largest volume
top_daily = {day: max(pairs) for day, pairs in day_trades.items()}
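The same question can also be answered directly in pandas. The sketch below is one possible alternative, not the approach used in this notebook: it stacks a hypothetical two-stock stand-in for the `stocks` dictionary (`stocks_demo`, with made-up symbols and volumes) into one long table and uses `groupby` with `idxmax` to find the most traded symbol on each date.

```python
import pandas as pd

# Hypothetical miniature stand-in for the `stocks` dictionary; the symbols
# and volumes are made up for illustration.
stocks_demo = {
    "aaa": pd.DataFrame({"date": ["2007-01-03", "2007-01-04"], "volume": [100, 400]}),
    "bbb": pd.DataFrame({"date": ["2007-01-03", "2007-01-04"], "volume": [300, 200]}),
}

# Stack every frame into one long table with a `symbol` column.
combined = pd.concat(
    [df.assign(symbol=symbol) for symbol, df in stocks_demo.items()],
    ignore_index=True,
)

# For each date, idxmax returns the row label of the largest volume.
top_rows = combined.loc[combined.groupby("date")["volume"].idxmax()]
top_daily_demo = top_rows.set_index("date")["symbol"].to_dict()

print(top_daily_demo)
```

One difference to keep in mind: when two stocks tie on volume, `idxmax` keeps the first row it encounters, while the tuple comparison above picks the symbol that sorts last alphabetically.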

__High Volume Days__

In [11]:
# Total volume traded across all stocks on each date
vols = [(sum(vol for vol, _ in pairs), day) for day, pairs in day_trades.items()]

# The ten dates with the highest total volume, in ascending order
vols.sort()
vols[-10:]

[(1533363200, '2008-01-24'),
 (1536176400, '2008-01-16'),
 (1553880500, '2007-11-08'),
 (1555072400, '2008-09-29'),
 (1559032100, '2008-02-07'),
 (1578877700, '2008-01-22'),
 (1599183500, '2008-10-08'),
 (1611272800, '2007-07-26'),
 (1770266900, '2008-10-10'),
 (1964583900, '2008-01-23')]
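For comparison, the daily totals can be computed with a pandas `groupby` followed by `nlargest`. This is only a sketch on a hypothetical long table (`combined`, with made-up symbols and volumes), assuming one row per date and symbol.

```python
import pandas as pd

# Hypothetical long table with one row per (date, symbol) pair; the volumes
# are made up for illustration.
combined = pd.DataFrame({
    "date": ["2008-01-22", "2008-01-22", "2008-01-23", "2008-01-23", "2008-01-24"],
    "symbol": ["aaa", "bbb", "aaa", "bbb", "aaa"],
    "volume": [500, 700, 900, 1100, 300],
})

# Sum the volume of every stock on each date.
total_volume = combined.groupby("date")["volume"].sum()

# Keep the two busiest dates, largest first.
busiest = total_volume.nlargest(2)

print(busiest)
```

Note that `nlargest` returns the dates in descending order of volume, the reverse of the sorted list shown above.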

__Most Profitable Stocks__

In [12]:
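# Percentage change in closing price from the first row to the last row of
# each file (assumes rows are sorted by date, oldest first)
pct_change = []
for symbol, data in stocks.items():
    start = data['close'].iloc[0]
    end = data['close'].iloc[-1]
    pct = round(100 * (end - start) / start, 2)
    pct_change.append((pct, symbol))

# The ten largest gains, in ascending order
pct_change.sort()
pct_change[-10:]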

[(1330.0, 'achc'),
 (1339.21, 'bcli'),
 (1525.16, 'cui'),
 (1549.67, 'apdn'),
 (1707.36, 'anip'),
 (2230.72, 'amzn'),
 (2437.44, 'blfs'),
 (3898.6, 'arcw'),
 (4005.0, 'adxs'),
 (7483.84, 'admp')]

Within this dataset, the most profitable stock to have purchased was ADMP, whose closing price grew by about 7,484% over the period.
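To make the percentages easier to interpret, the short sketch below restates the formula used in the cell above as a function and converts a percentage gain into a growth multiple. The helper names `percent_change` and `growth_multiple` and the example prices are hypothetical; only the 7483.84 figure comes from the output above.

```python
def percent_change(start: float, end: float) -> float:
    """Percentage change from start to end, as computed in the notebook."""
    return 100 * (end - start) / start


def growth_multiple(pct: float) -> float:
    """Convert a percentage gain into end price divided by start price."""
    return 1 + pct / 100


# Hypothetical prices: a stock bought at 2.00 that ends the period at 10.00.
example_gain = percent_change(2.00, 10.00)

# The ADMP figure reported above, expressed as a multiple of the first close.
admp_multiple = growth_multiple(7483.84)

print(example_gain, round(admp_multiple, 2))
```

Under this formula, a gain of 7,483.84% means the final closing price was roughly 75.8 times the first closing price in the file.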

Potential further analysis:
- Which stocks would have been best to short at the start of the period?
- Which stocks have the most after-hours trading, and show the biggest changes between the closing price and the next day open?
- Can technical indicators like Bollinger Bands help us forecast the market?
- What time periods have resulted in steady increases in prices, and what periods have resulted in steady declines?
- Based on price, what was the optimal day to buy each stock if we wanted to hold them until now?
- On days with high trading volume, do stocks move in one direction (up or down) more than the other one?
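As a possible starting point for the overnight-change question in the list above, the sketch below computes the gap between each day's open and the previous day's close, using the five aapl rows from the preview earlier in this notebook. The column names `overnight_gap` and `overnight_gap_pct` are made up for this example, and treating the gap as a proxy for after-hours activity is an assumption.

```python
import pandas as pd

# The five aapl rows shown in the preview earlier in this notebook.
aapl = pd.DataFrame({
    "date": ["2007-01-03", "2007-01-04", "2007-01-05", "2007-01-08", "2007-01-09"],
    "close": [83.800002, 85.659998, 85.049997, 85.47, 92.570003],
    "open": [86.289999, 84.050001, 85.77, 85.959998, 86.450003],
})

# shift(1) moves each close down one row so it lines up with the next open.
prev_close = aapl["close"].shift(1)

# Absolute and percentage gap between the previous close and today's open.
aapl["overnight_gap"] = aapl["open"] - prev_close
aapl["overnight_gap_pct"] = 100 * aapl["overnight_gap"] / prev_close

print(aapl[["date", "overnight_gap", "overnight_gap_pct"]])
```

The first row has no previous close, so its gap is NaN. Applying the same steps to every frame in `stocks` and averaging the absolute percentage gap would give one way to rank the symbols.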

