## SILVER TO GOLD LAYER

### GOLD LAYER - PROCESS DIVIDEND HISTORY


### Description

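This notebook processes stock dividend data: it merges dividend events with holding data, calculates dividend amounts, and saves the result to a CSV file. It follows these steps:

1. Load holding data from the GOLD layer and stock event data from the SILVER layer.
2. Keep only the dividend events.
3. Left-merge the holdings with the dividend events on `date` and `symbol`.
4. Compute `dividend_amount` as the event `value` multiplied by `holding_quantity`, and drop rows where the amount is 0.
5. Derive the `financial_year` column from `date`.
6. Align the result with the GOLD Dividend data contract and write it to the GOLD layer as a CSV file.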
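
The merge-and-multiply flow described above can be tried on a toy example. This is only a sketch: the symbols, dates, and quantities below are made up for illustration, and only the column names (`date`, `symbol`, `holding_quantity`, `value`, `dividend_amount`) are taken from the notebook.

```python
import pandas as pd

# Toy holdings (illustrative values): one row per (date, symbol).
holdings = pd.DataFrame(
    {
        "date": pd.to_datetime(["2024-07-01", "2024-07-02", "2024-07-02"]),
        "symbol": ["AAA", "AAA", "BBB"],
        "holding_quantity": [10.0, 10.0, 5.0],
    }
)

# Toy dividend events (illustrative values): payout stored in "value".
events = pd.DataFrame(
    {
        "date": pd.to_datetime(["2024-07-02"]),
        "symbol": ["AAA"],
        "value": [2.5],
    }
)

# Left merge keeps every holding row; rows without an event get NaN in "value".
merged = holdings.merge(events, on=["date", "symbol"], how="left")

# NaN becomes 0, so holdings without a dividend event yield an amount of 0.
merged["dividend_amount"] = merged["value"].fillna(0) * merged["holding_quantity"]

# Only rows that actually received a dividend remain.
payouts = merged[merged["dividend_amount"] != 0].reset_index(drop=True)
print(payouts[["date", "symbol", "dividend_amount"]])
```

In this toy data only AAA on 2024-07-02 has an event, so a single row remains with an amount of 2.5 * 10 = 25.0.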


In [1]:
# Import pandas and the GlobalPath helper
import pandas as pd

from StockETL import GlobalPath

In [2]:
# Run the common utility notebook to load shared helper functions
%run ../COMMON/common_utility.ipynb

USERNAME = 'ptprashanttripathi'


In [3]:
# Instantiate GlobalPath
holding_gold_file_path = GlobalPath(f"DATA/GOLD/Holding/{USERNAME}/Holding_data.csv")
stockevents_silver_file_path = GlobalPath(
    "DATA/SILVER/StockEvents/StockEvents_data.csv"
)
dividend_gold_file_path = GlobalPath(f"DATA/GOLD/Dividend/{USERNAME}/Dividend_data.csv")
dividend_gold_schema_file_path = GlobalPath("CONFIG/DATA_CONTRACTS/GOLD/Dividend.json")

In [4]:
# Load holding data from the GOLD layer
df_holding = pd.read_csv(holding_gold_file_path)
df_holding.info()
df_holding["date"] = pd.to_datetime(df_holding["date"])
print(f"Loaded GOLD Layer holding data from => {holding_gold_file_path}")

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8478 entries, 0 to 8477
Data columns (total 17 columns):
 #   Column            Non-Null Count  Dtype  
---  ------            --------------  -----  
 0   username          0 non-null      float64
 1   date              8478 non-null   object 
 2   segment           8478 non-null   object 
 3   exchange          8478 non-null   object 
 4   symbol            8478 non-null   object 
 5   scrip_name        8478 non-null   object 
 6   holding_quantity  8478 non-null   float64
 7   avg_price         8478 non-null   float64
 8   holding_amount    8478 non-null   float64
 9   open_price        8475 non-null   float64
 10  open_amount       8475 non-null   float64
 11  high_price        8475 non-null   float64
 12  high_amount       8475 non-null   float64
 13  low_price         8475 non-null   float64
 14  low_amount        8475 non-null   float64
 15  close_price       8475 non-null   float64
 16  close_amount      8475 non-null   float64
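
The `info()` output above shows `date` as `object` because it is printed before the conversion. As a possible alternative (a sketch, not what the notebook does), the dates can be parsed while reading the file through the `parse_dates` argument of `pd.read_csv`. The in-memory CSV below is made-up sample data.

```python
import io

import pandas as pd

# Made-up sample rows standing in for the holdings CSV.
csv_text = (
    "date,symbol,holding_quantity\n"
    "2024-07-01,AAA,10\n"
    "2024-07-02,AAA,10\n"
)

# parse_dates converts the listed columns to datetime64 during loading.
df_sample = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])
print(df_sample.dtypes)
```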


In [5]:
# Load dividend data from the SILVER layer
df_dividends = pd.read_csv(stockevents_silver_file_path)
df_dividends["date"] = pd.to_datetime(df_dividends["date"])
print(f"Loaded SILVER Layer stock dividend data from => {stockevents_silver_file_path}")

Loaded SILVER Layer stock dividend data from => /home/runner/work/PortfolioTracker/PortfolioTracker/DATA/SILVER/StockEvents/StockEvents_data.csv


In [6]:
# Filter for dividend events only
df_dividends = df_dividends[df_dividends["event"].str.upper() == "DIVIDENDS"]

# Merge dividend data with holding data
df_dividend = pd.merge(df_holding, df_dividends, on=["date", "symbol"], how="left")

# Calculate the dividend amount
df_dividend["dividend_amount"] = (
    df_dividend["value"].fillna(0) * df_dividend["holding_quantity"]
)

# Keep only rows with a non-zero dividend amount
# (.copy() avoids a SettingWithCopyWarning on the assignment below)
df_dividend = df_dividend[df_dividend["dividend_amount"] != 0].copy()

# Derive the 'financial_year' column; 'date' is already a datetime column
df_dividend["financial_year"] = df_dividend["date"].apply(get_financial_year)
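
`get_financial_year` comes from the common utility notebook and is not shown here. The sketch below is a hypothetical stand-in, assuming a financial year that runs from April to March and an assumed label format such as `FY2023-24`. The real helper may use a different period or format.

```python
import pandas as pd


def financial_year_sketch(ts: pd.Timestamp) -> str:
    """Hypothetical stand-in for get_financial_year (April-to-March year)."""
    # Dates from January to March belong to the year that started last April.
    start_year = ts.year if ts.month >= 4 else ts.year - 1
    # Assumed label format: FY<start year>-<last two digits of end year>.
    return f"FY{start_year}-{str(start_year + 1)[-2:]}"


# Made-up dates on both sides of the 1 April boundary.
dates = pd.Series(pd.to_datetime(["2024-03-31", "2024-04-01"]))
labels = dates.apply(financial_year_sketch)
print(labels.tolist())
```

Under these assumptions, 31 March 2024 falls in `FY2023-24` and 1 April 2024 in `FY2024-25`.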

In [7]:
# Align DataFrame with DataContract
df_dividend = align_with_datacontract(df_dividend, dividend_gold_schema_file_path)

# Save the result to a new CSV file
df_dividend.to_csv(dividend_gold_file_path, index=False)
print(
    f"GOLD Layer CSV file for Dividend successfully created at: {dividend_gold_file_path}"
)

# Display DataFrame information
df_dividend.info()

DataContract loaded from => /home/runner/work/PortfolioTracker/PortfolioTracker/CONFIG/DATA_CONTRACTS/GOLD/Dividend.json
GOLD Layer CSV file for Dividend successfully created at: /home/runner/work/PortfolioTracker/PortfolioTracker/DATA/GOLD/Dividend/ptprashanttripathi/Dividend_data.csv
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20 entries, 0 to 19
Data columns (total 6 columns):
 #   Column           Non-Null Count  Dtype         
---  ------           --------------  -----         
 0   username         0 non-null      string        
 1   financial_year   20 non-null     string        
 2   date             20 non-null     datetime64[ns]
 3   segment          20 non-null     string        
 4   symbol           20 non-null     string        
 5   dividend_amount  20 non-null     float64       
dtypes: datetime64[ns](1), float64(1), string(4)
memory usage: 1.1 KB
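
The output above shows the effect of `align_with_datacontract`: six columns in a fixed order, with contract dtypes and a fresh `RangeIndex`. The helper itself and the layout of `Dividend.json` are not shown in this notebook, so the sketch below is only a guess at the idea. It assumes the contract can be reduced to a hypothetical mapping of column name to pandas dtype, and `align_sketch` is an illustrative name, not the project's function.

```python
import pandas as pd


def align_sketch(df: pd.DataFrame, contract: dict) -> pd.DataFrame:
    """Hypothetical alignment: select, order, and cast columns per contract."""
    out = df.copy()
    # Add any contract column that is missing, filled with missing values.
    for column in contract:
        if column not in out.columns:
            out[column] = pd.NA
    # Keep only contract columns, in contract order, then cast the dtypes.
    out = out[list(contract)].astype(contract)
    return out.reset_index(drop=True)


# Made-up frame with an extra column and a non-default index.
raw = pd.DataFrame(
    {
        "symbol": ["AAA", "BBB"],
        "dividend_amount": [25, 12],
        "value": [2.5, 2.4],
    },
    index=[5, 9],
)

# Assumed contract shape: column name -> pandas dtype string.
contract = {"username": "string", "symbol": "string", "dividend_amount": "float64"}

aligned = align_sketch(raw, contract)
aligned.info()
```

In this sketch the extra `value` column is dropped, the missing `username` column is added as all-null, and the index is reset.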
