# Imports

In [6]:
import yfinance as yf
import pandas as pd
import os
from pystock.portfolio import Stock

In [2]:
DATA_DIR = 'Data'

# Downloading Data

The `pystock` module loads data from a local directory and assumes it is in a standard format: a CSV file with the following columns:
1. **Date**: Date of the data
2. **Close**: Closing price of the stock (**Adj Close** also works.)

These two columns are the minimum requirement. You can also include additional columns, for example:
1. **Open**: Opening price of the stock
2. **High**: Highest price of the stock
3. **Low**: Lowest price of the stock
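
The minimum requirement described above can be checked before handing a file to `pystock`. The following is a minimal sketch using only the standard library; `has_minimum_columns` is a hypothetical helper and not part of `pystock`, and it only inspects the header row.

```python
import csv
import os
import tempfile


def has_minimum_columns(csv_path):
    """Return True if the CSV header has Date and Close (or Adj Close)."""
    with open(csv_path, newline="") as handle:
        header = next(csv.reader(handle), [])
    names = {name.strip() for name in header}
    return "Date" in names and bool({"Close", "Adj Close"} & names)


# Demo on a throwaway file holding two S&P 500 closing prices.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "snp_demo.csv")
with open(demo_path, "w", newline="") as handle:
    handle.write("Date,Close\n2010-01-04,1132.98999\n2010-01-05,1136.52002\n")

print(has_minimum_columns(demo_path))
```

A file that lacks either **Date** or a closing-price column would make the helper return `False`.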

A convenient way to get data in this format is to download it from Yahoo Finance with the `yfinance` module. The following code downloads the data and saves it in the `data` directory.

```python
import os

import yfinance as yf

# Download daily price data for Apple
data = yf.download('AAPL', start='2010-01-01', end='2020-01-01')

# Create the target directory if needed, then save the data
os.makedirs('data', exist_ok=True)
data.to_csv('data/AAPL.csv')
```

Below are some examples of how to use the `yfinance` module. For details, refer to the `yfinance` [repository](https://github.com/ranaroussi/yfinance).

## Downloading Indices

### Downloading Single Indices

In [12]:
start = '2010-01-01'
end = '2022-12-30'

snp = yf.download('^GSPC', start=start, end=end)
snp.head()

[*********************100%***********************]  1 of 1 completed


| Date | Open | High | Low | Close | Adj Close | Volume |
| --- | --- | --- | --- | --- | --- | --- |
| 2010-01-04 | 1116.560059 | 1133.869995 | 1116.560059 | 1132.98999 | 1132.98999 | 3991400000 |
| 2010-01-05 | 1132.660034 | 1136.630005 | 1129.660034 | 1136.52002 | 1136.52002 | 2491020000 |
| 2010-01-06 | 1135.709961 | 1139.189941 | 1133.949951 | 1137.140015 | 1137.140015 | 4972660000 |
| 2010-01-07 | 1136.27002 | 1142.459961 | 1131.319946 | 1141.689941 | 1141.689941 | 5270680000 |
| 2010-01-08 | 1140.52002 | 1145.390015 | 1136.219971 | 1144.97998 | 1144.97998 | 4389590000 |


Let's save this:

In [5]:
snp.to_csv(os.path.join(DATA_DIR, 'snp.csv'))

Now, you can easily load the data using `pystock`:

In [14]:
snp = Stock("S&P", os.path.join(DATA_DIR, 'snp.csv'))
snp

Stock(name=S&P)

In [16]:
snp.load_data(columns=['Adj Close'], rename_cols=['Close'], frequency='D')

| Date | Close |
| --- | --- |
| 2010-01-04 | 1132.989990 |
| 2010-01-05 | 1136.520020 |
| 2010-01-06 | 1137.140015 |
| 2010-01-07 | 1141.689941 |
| 2010-01-08 | 1144.979980 |
| ... | ... |
| 2022-12-26 | 3844.820068 |
| 2022-12-27 | 3829.250000 |
| 2022-12-28 | 3783.219971 |
| 2022-12-29 | 3849.280029 |
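
To make the effect of `columns` and `rename_cols` concrete, here is a rough pandas equivalent of the selection and renaming seen in the output above. It is a sketch, not `pystock`'s implementation: it assumes the call only selects **Adj Close** and renames it to **Close**, and it leaves out the `frequency` argument because the notebook does not describe how it is handled. The two rows are copied from the S&P 500 download so the snippet runs offline.

```python
import io

import pandas as pd

# First two rows of the S&P 500 download, inlined so no network is needed.
raw_csv = io.StringIO(
    "Date,Open,High,Low,Close,Adj Close,Volume\n"
    "2010-01-04,1116.560059,1133.869995,1116.560059,1132.98999,1132.98999,3991400000\n"
    "2010-01-05,1132.660034,1136.630005,1129.660034,1136.52002,1136.52002,2491020000\n"
)

# Keep only the date and adjusted close, using the date as the index.
prices = pd.read_csv(
    raw_csv,
    usecols=["Date", "Adj Close"],
    index_col="Date",
    parse_dates=True,
)

# Rename the selected column, mirroring rename_cols=['Close'].
prices = prices.rename(columns={"Adj Close": "Close"})
print(prices)
```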


### Downloading Multiple Indices

This can be done by passing a list of indices to the `tickers` argument:

In [21]:
indices = ["^DJI", "^FTSE"]

data = yf.download(indices, start=start, end=end, group_by='ticker')

[*********************100%***********************]  2 of 2 completed


You also need to pass `group_by='ticker'` to `yf.download`, as in the call above. This groups the columns by ticker:

In [23]:
data.head()

| Date | ^FTSE Open | ^FTSE High | ^FTSE Low | ^FTSE Close | ^FTSE Adj Close | ^FTSE Volume | ^DJI Open | ^DJI High | ^DJI Low | ^DJI Close | ^DJI Adj Close | ^DJI Volume |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2010-01-04 | 5412.899902 | 5500.299805 | 5410.799805 | 5500.299805 | 5500.299805 | 750942000.0 | 10430.69043 | 10604.969727 | 10430.69043 | 10583.959961 | 10583.959961 | 179780000.0 |
| 2010-01-05 | 5500.299805 | 5536.399902 | 5480.700195 | 5522.5 | 5522.5 | 1149301000.0 | 10584.55957 | 10584.55957 | 10522.519531 | 10572.019531 | 10572.019531 | 188540000.0 |
| 2010-01-06 | 5522.5 | 5536.5 | 5497.700195 | 5530.0 | 5530.0 | 998295300.0 | 10564.719727 | 10594.990234 | 10546.549805 | 10573.679688 | 10573.679688 | 186040000.0 |
| 2010-01-07 | 5530.0 | 5551.700195 | 5499.799805 | 5526.700195 | 5526.700195 | 1162934000.0 | 10571.110352 | 10612.370117 | 10505.209961 | 10606.860352 | 10606.860352 | 217390000.0 |
| 2010-01-08 | 5526.700195 | 5549.299805 | 5494.799805 | 5534.200195 | 5534.200195 | 1006421000.0 | 10606.400391 | 10619.400391 | 10554.330078 | 10618.19043 | 10618.19043 | 172710000.0 |


Now, you can separate the data by the ticker:

In [25]:
dji = data["^DJI"]
ftse = data["^FTSE"]
display(dji.head())
display(ftse.head())

| Date | Open | High | Low | Close | Adj Close | Volume |
| --- | --- | --- | --- | --- | --- | --- |
| 2010-01-04 | 10430.69043 | 10604.969727 | 10430.69043 | 10583.959961 | 10583.959961 | 179780000.0 |
| 2010-01-05 | 10584.55957 | 10584.55957 | 10522.519531 | 10572.019531 | 10572.019531 | 188540000.0 |
| 2010-01-06 | 10564.719727 | 10594.990234 | 10546.549805 | 10573.679688 | 10573.679688 | 186040000.0 |
| 2010-01-07 | 10571.110352 | 10612.370117 | 10505.209961 | 10606.860352 | 10606.860352 | 217390000.0 |
| 2010-01-08 | 10606.400391 | 10619.400391 | 10554.330078 | 10618.19043 | 10618.19043 | 172710000.0 |


| Date | Open | High | Low | Close | Adj Close | Volume |
| --- | --- | --- | --- | --- | --- | --- |
| 2010-01-04 | 5412.899902 | 5500.299805 | 5410.799805 | 5500.299805 | 5500.299805 | 750942000.0 |
| 2010-01-05 | 5500.299805 | 5536.399902 | 5480.700195 | 5522.5 | 5522.5 | 1149301000.0 |
| 2010-01-06 | 5522.5 | 5536.5 | 5497.700195 | 5530.0 | 5530.0 | 998295300.0 |
| 2010-01-07 | 5530.0 | 5551.700195 | 5499.799805 | 5526.700195 | 5526.700195 | 1162934000.0 |
| 2010-01-08 | 5526.700195 | 5549.299805 | 5494.799805 | 5534.200195 | 5534.200195 | 1006421000.0 |
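
Because `pystock` reads from files, each per-ticker frame needs to be written to its own CSV before it can be loaded. The sketch below shows one way to do that. `save_per_ticker` is a hypothetical helper, the file-naming scheme is an assumption, and the small frame is a stand-in that mimics the ticker-grouped layout using a few values from the output above, so the snippet runs without a network connection.

```python
import os
import tempfile

import pandas as pd


def save_per_ticker(data, out_dir):
    """Write one CSV per top-level ticker of a ticker-grouped frame."""
    os.makedirs(out_dir, exist_ok=True)
    paths = {}
    for ticker in data.columns.get_level_values(0).unique():
        # Drop the '^' prefix, which is awkward in file names.
        file_name = ticker.strip("^").lower() + ".csv"
        path = os.path.join(out_dir, file_name)
        # Selecting by ticker leaves a flat frame (Open, Close, ...).
        data[ticker].to_csv(path)
        paths[ticker] = path
    return paths


# Stand-in for the frame returned by yf.download(..., group_by='ticker').
columns = pd.MultiIndex.from_product([["^DJI", "^FTSE"], ["Open", "Close"]])
index = pd.DatetimeIndex(["2010-01-04", "2010-01-05"], name="Date")
demo = pd.DataFrame(
    [
        [10430.69043, 10583.959961, 5412.899902, 5500.299805],
        [10584.55957, 10572.019531, 5500.299805, 5522.5],
    ],
    index=index,
    columns=columns,
)

paths = save_per_ticker(demo, tempfile.mkdtemp())
print(sorted(paths))
```

Each resulting file has a **Date** column followed by the price columns, which matches the standard format described at the start of this notebook.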


## Downloading Stocks and Other Securities

The same procedure can be used to download stocks and other securities. For example, let's download the data for a few stocks:

In [27]:
tickers = ['AAPL', 'MSFT', 'AMZN', 'GOOG']
data = yf.download(tickers, start='2010-01-01', end='2020-01-01', group_by='ticker')

[*********************100%***********************]  4 of 4 completed


In [29]:
amazon = data["AMZN"]
display(amazon.head())

| Date | Open | High | Low | Close | Adj Close | Volume |
| --- | --- | --- | --- | --- | --- | --- |
| 2010-01-04 | 6.8125 | 6.8305 | 6.657 | 6.695 | 6.695 | 151998000 |
| 2010-01-05 | 6.6715 | 6.774 | 6.5905 | 6.7345 | 6.7345 | 177038000 |
| 2010-01-06 | 6.73 | 6.7365 | 6.5825 | 6.6125 | 6.6125 | 143576000 |
| 2010-01-07 | 6.6005 | 6.616 | 6.44 | 6.5 | 6.5 | 220604000 |
| 2010-01-08 | 6.528 | 6.684 | 6.4515 | 6.676 | 6.676 | 196610000 |


> For the time being, `pystock` requires you to pass the stock data as a file path. In the future, we'll add the ability to pass the data as a `pandas` DataFrame.