# 5.1 Downloading adjustments
Adjustment factors are stored in the Polygon folder, one file per ticker, for example <code>raw/adjustments/AAPL.csv</code>. If a ticker is recycled, the file contains the adjustments for all the companies that have used that ticker. To get the adjustments, we use the <code>Stock Splits</code> and <code>Dividends</code> endpoints through the SDK. These are <code>list_splits</code> and <code>list_dividends</code>.

The ex-dividend date is the first date on which a buyer of the stock does *not* get the dividend. An investor who held the stock before that date does. So on the ex-dividend date the stock drops, on average, by the amount of the dividend. To correct for this, we either add the dividends back to the prices from the ex-dividend date onwards, or subtract them from the prices before the ex-dividend date. Most platforms adjust backwards, so I will do the same. The advantage is that the most recent prices in the data stay unadjusted and thus equal the actual market prices.
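As an illustration of the backward adjustment described above, here is a minimal sketch that subtracts a cash dividend from all prices before the ex-dividend date. The function name `adjust_for_dividend` and the price series are made up for this example, and the plain subtraction is only one simple reading of the text; it is not necessarily the adjustment method used later in this project.

```python
import pandas as pd


def adjust_for_dividend(close: pd.Series, ex_date: str, amount: float) -> pd.Series:
    """Subtract a cash dividend from all prices strictly before the ex-dividend date."""
    adjusted = close.copy()
    before_ex_date = adjusted.index < pd.Timestamp(ex_date)
    adjusted.loc[before_ex_date] -= amount
    return adjusted


# Hypothetical closing prices around an ex-dividend date of 2023-06-29
close = pd.Series(
    [10.50, 10.60, 10.25, 10.30],
    index=pd.to_datetime(["2023-06-27", "2023-06-28", "2023-06-29", "2023-06-30"]),
)

adjusted = adjust_for_dividend(close, ex_date="2023-06-29", amount=0.35)
print(adjusted)
```

Prices on and after the ex-dividend date are left untouched, so the latest price still equals the market price.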

The execution date of a split is the first date on which the stock trades at its post-split price: the split has taken effect just before market open. So all prices before the execution date should be adjusted. If the split is 10-to-1, 10 shares have become 1 (a reverse split), so all prices before the execution date must be multiplied by 10. If the split is 1-to-5, 1 share has been split into 5, so all prices before the execution date must be divided by 5.
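The two split cases above can be sketched as a single multiplication of all prices before the execution date by a factor: 10 for the 10-to-1 reverse split and 1/5 for the 1-to-5 split. The function name `adjust_for_split` and the prices are hypothetical and only serve to illustrate the arithmetic.

```python
import pandas as pd


def adjust_for_split(close: pd.Series, execution_date: str, factor: float) -> pd.Series:
    """Multiply all prices strictly before the execution date by the split factor."""
    adjusted = close.copy()
    before_execution = adjusted.index < pd.Timestamp(execution_date)
    adjusted.loc[before_execution] *= factor
    return adjusted


dates = pd.to_datetime(["2020-08-27", "2020-08-28", "2020-08-31", "2020-09-01"])

# 1-to-5 split: 1 share becomes 5, so earlier prices are divided by 5
forward_close = pd.Series([500.0, 510.0, 100.0, 102.0], index=dates)
forward_adjusted = adjust_for_split(forward_close, "2020-08-31", factor=1 / 5)

# 10-to-1 reverse split: 10 shares become 1, so earlier prices are multiplied by 10
reverse_close = pd.Series([1.00, 1.10, 10.50, 10.40], index=dates)
reverse_adjusted = adjust_for_split(reverse_close, "2020-08-31", factor=10)

print(forward_adjusted)
print(reverse_adjusted)
```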

Each file in the <code>adjustments</code> folder has the following columns: <code>['date', 'type', 'subtype', 'amount']</code>. The date is the ex-dividend or execution date. Type is 'DIV' or 'SPLIT'. Subtype is 'CD', 'SC', 'LT' or 'ST' for dividends and 'R' (reverse) or 'N' (regular) for splits. Amount is the USD amount for dividends and a fraction for splits. A 10-to-1 reverse split is 10. A 1-to-5 split is 0.2.
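Because the split amount is stored as a multiplier, several splits can be combined by multiplying their amounts. The sketch below reads the TSLA rows shown later in this section from an in-memory string and computes, for each split, the factor that applies to prices before it. The column name `cumulative_factor` is an assumption made for this example and is not part of the stored file.

```python
import io

import pandas as pd

# Same layout as the files in raw/adjustments/ (TSLA rows from this section)
csv_text = """date,type,subtype,amount
2020-08-31,SPLIT,N,0.2
2022-08-25,SPLIT,N,0.333333
"""

adjustments = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])
splits = (
    adjustments.loc[adjustments["type"] == "SPLIT"]
    .sort_values("date")
    .reset_index(drop=True)
)

# Prices before a split are affected by that split and by every later split,
# so take the cumulative product from the most recent split backwards.
splits["cumulative_factor"] = splits["amount"][::-1].cumprod()[::-1]

print(splits)
```

Prices before the first split are multiplied by the product of both amounts, while prices between the two splits are only multiplied by the amount of the second one.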

*Note: dividends are [already adjusted](https://polygon.io/knowledge-base/article/does-polygon-adjust-historic-dividends-for-splits) for splits. So we don't have to do this again.*

In [1]:
from polygon.rest import RESTClient
from datetime import datetime, date, time, timedelta
from times import first_trading_date_after_equal, last_trading_date_before_equal
from tickers import get_tickers
import os
import pandas as pd
import numpy as np

In [2]:
POLYGON_DATA_PATH = "../data/polygon/"

START_DATE = date(2019, 6, 1)
END_DATE = date(2024, 3, 1)

CLEAN_DOWNLOAD = True # If False, only update existing data to the END_DATE. If True, remove all files in adjustments/.

with open(POLYGON_DATA_PATH + "secret.txt") as f:
    KEY = next(f).strip()

client = RESTClient(api_key=KEY)

In [None]:
# Get all tickers to loop through
tickers_v3 = get_tickers(v=3)

# To keep track of which dates the adjustments folder covers, we save the end date to _end_date.txt.

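ADJUSTMENTS_PATH = POLYGON_DATA_PATH + "raw/adjustments/"

if CLEAN_DOWNLOAD:
    # Remove all existing files. Otherwise a rerun would append duplicate rows to them.
    os.makedirs(ADJUSTMENTS_PATH, exist_ok=True)
    for file_name in os.listdir(ADJUSTMENTS_PATH):
        if os.path.isfile(ADJUSTMENTS_PATH + file_name):
            os.remove(ADJUSTMENTS_PATH + file_name)
else:
    # Use a different START_DATE if we only want to update
    if os.path.isfile(ADJUSTMENTS_PATH + "_end_date.txt"):
        with open(ADJUSTMENTS_PATH + "_end_date.txt") as f:
            end_date_of_data = next(f).strip()
        START_DATE = first_trading_date_after_equal(date.fromisoformat(end_date_of_data) + timedelta(days=1))
    else:
        raise FileNotFoundError('There is no _end_date.txt file!')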
        
for index, row in tickers_v3.iterrows():
    ticker = row["ticker"]
    start_date = max(row["start_date"], START_DATE) # Trim to START_DATE if ticker existed for longer
    end_date = min(row["end_date"], END_DATE) # Trim to END_DATE. But use end date of the ticker if delisted.

    # Tickers that do not need downloading/updating. This happens when the ticker is delisted before START_DATE.
    if end_date < START_DATE:
        continue

    # Get data
    try:
        splits = pd.DataFrame(client.list_splits(ticker=ticker, execution_date_gte=start_date, execution_date_lte=end_date))
        dividends = pd.DataFrame(client.list_dividends(ticker=ticker, ex_dividend_date_gte=start_date, ex_dividend_date_lte=end_date))
    except Exception as e:
        print(repr(e))
        continue

    # Get correct format
    if not dividends.empty:
        dividends = dividends.rename(columns={'ex_dividend_date': 'date', 'dividend_type':'subtype', 'cash_amount': 'amount'})
        dividends['type'] = 'DIV'
        dividends = dividends[['date', 'type', 'subtype', 'amount']]

    if not splits.empty:
        splits = splits.rename(columns={'execution_date': 'date'})
        splits['type'] = 'SPLIT'
        splits['subtype'] = np.where(splits['split_from'] > splits['split_to'], 'R', 'N')
        splits['amount'] = splits['split_from'] / splits['split_to']
        splits = splits[['date', 'type', 'subtype', 'amount']]
    
    # Skip loop if no data
    if splits.empty and dividends.empty:
        continue

    # Merge dividends and splits
    adjustments = pd.concat([dividends, splits])
    adjustments = adjustments.sort_values(by='date').reset_index(drop=True)
    adjustments['date'] = pd.to_datetime(adjustments['date']).dt.date

    # Stop if any adjustment has a missing value
    if adjustments.isnull().values.any():
        raise ValueError(f"There are missing values for {ticker} at index {index}.")

    # Save or update
    path = POLYGON_DATA_PATH + f'raw/adjustments/{ticker}.csv'

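    # If file exists, add new adjustments. This happens if a ticker is recycled or we are updating and the stock already has a history of adjustments.
    if os.path.isfile(path):
        old_adjustments = pd.read_csv(path)
        # Convert to date objects so old and new rows have the same type and can be sorted together
        old_adjustments['date'] = pd.to_datetime(old_adjustments['date']).dt.date
        all_adjustments = pd.concat([old_adjustments, adjustments])
        all_adjustments = all_adjustments.sort_values(by='date').reset_index(drop=True)
        all_adjustments.to_csv(path, index=False)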
    else:
        adjustments.to_csv(path, index=False)
    
    print(index)

# Save END_DATE to _end_date.txt
with open(POLYGON_DATA_PATH + "raw/adjustments/_end_date.txt", 'w') as f:
    f.write(f'{END_DATE}\n')

Some examples.

In [11]:
pd.read_csv(POLYGON_DATA_PATH + "raw/adjustments/MFA.csv", index_col="date").tail(7)

| date | type | subtype | amount |
|---|---|---|---|
| 2022-06-29 | DIV | CD | 0.44 |
| 2022-09-29 | DIV | CD | 0.44 |
| 2022-12-29 | DIV | CD | 0.35 |
| 2023-03-30 | DIV | CD | 0.35 |
| 2023-06-29 | DIV | CD | 0.35 |
| 2023-09-29 | DIV | CD | 0.35 |
| 2023-12-28 | DIV | CD | 0.35 |


In [12]:
pd.read_csv(POLYGON_DATA_PATH + "raw/adjustments/TSLA.csv", index_col="date")

| date | type | subtype | amount |
|---|---|---|---|
| 2020-08-31 | SPLIT | N | 0.2 |
| 2022-08-25 | SPLIT | N | 0.333333 |


### Updates
Simply rerun the notebook with the new <code>END_DATE</code> and <code>CLEAN_DOWNLOAD</code> set to <code>False</code>. The start date is then read from <code>_end_date.txt</code>, so only adjustments after the previously downloaded end date are requested.
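The bookkeeping behind an update can be sketched as follows: read the last downloaded end date from <code>_end_date.txt</code> and start the next download one day later. The function name `next_start_date` is hypothetical, and the sketch writes to a temporary folder instead of the real data folder. It also leaves out the step where the notebook moves the result to the next trading day with `first_trading_date_after_equal`.

```python
import os
import tempfile
from datetime import date, timedelta


def next_start_date(end_date_file: str) -> date:
    """Return the day after the end date stored on the first line of the file."""
    with open(end_date_file) as f:
        last_end_date = date.fromisoformat(next(f).strip())
    return last_end_date + timedelta(days=1)


# Simulate a previous run that downloaded data up to and including 2024-03-01
with tempfile.TemporaryDirectory() as tmp_dir:
    end_date_file = os.path.join(tmp_dir, "_end_date.txt")
    with open(end_date_file, "w") as f:
        f.write(f"{date(2024, 3, 1)}\n")

    new_start = next_start_date(end_date_file)

print(new_start)
```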