## SILVER TO GOLD LAYER

### GOLD LAYER - PROCESS DIVIDEND HISTORY

### Description
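This notebook builds the GOLD layer dividend history. It loads holding data from the GOLD layer and stock event data from the SILVER layer, keeps only the dividend events, merges them with the holdings on `date` and `symbol`, calculates the dividend amount as the per-share value multiplied by the holding quantity, and saves the result to a CSV file. The cells below carry out these steps in order.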

In [1]:
# Import necessary libraries
import pandas as pd
from StockETL import GlobalPath

In [2]:
# Run the common utility notebook to load shared helper functions
%run ../COMMON/common_utility.ipynb

In [3]:
# Define input, output, and schema file paths with GlobalPath
holding_gold_file_path = GlobalPath("DATA/GOLD/Holding/Holding_data.csv")
stockevents_silver_file_path = GlobalPath(
    "DATA/SILVER/StockEvents/StockEvents_data.csv"
)
dividend_gold_file_path = GlobalPath("DATA/GOLD/Dividend/Dividend_data.csv")
dividend_gold_schema_file_path = GlobalPath("CONFIG/DATA_CONTRACTS/GOLD/Dividend.json")

In [4]:
# Load holding data from the GOLD layer
df_holding = pd.read_csv(holding_gold_file_path)
df_holding.info()
df_holding["date"] = pd.to_datetime(df_holding["date"])
print(f"Loaded GOLD Layer holding data from: {holding_gold_file_path}")

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6254 entries, 0 to 6253
Data columns (total 16 columns):
 #   Column            Non-Null Count  Dtype  
---  ------            --------------  -----  
 0   date              6254 non-null   object 
 1   segment           6254 non-null   object 
 2   exchange          6254 non-null   object 
 3   symbol            6254 non-null   object 
 4   scrip_name        6254 non-null   object 
 5   holding_quantity  6254 non-null   float64
 6   avg_price         6254 non-null   float64
 7   holding_amount    6254 non-null   float64
 8   open_price        6251 non-null   float64
 9   open_amount       6251 non-null   float64
 10  high_price        6251 non-null   float64
 11  high_amount       6251 non-null   float64
 12  low_price         6251 non-null   float64
 13  low_amount        6251 non-null   float64
 14  close_price       6251 non-null   float64
 15  close_amount      6251 non-null   float64
dtypes: float64(11), object(5)
memory usage: 78
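
The `info()` output above lists `date` as `object` because it is printed before the `pd.to_datetime` conversion runs. As a small illustration, the sketch below applies the same conversion to an in-memory CSV. The rows are made up and only the column names follow the notebook.

```python
import io

import pandas as pd

# Made-up CSV content standing in for the holding file
csv_text = "date,symbol,holding_quantity\n2024-01-10,AAA,10\n2024-01-11,AAA,10\n"

df = pd.read_csv(io.StringIO(csv_text))
# Straight after read_csv the dates are still plain text, not datetimes
was_datetime = pd.api.types.is_datetime64_any_dtype(df["date"])

# Same conversion as in the cell above
df["date"] = pd.to_datetime(df["date"])
is_datetime = pd.api.types.is_datetime64_any_dtype(df["date"])

print(was_datetime, is_datetime)
```

Converting both inputs to a datetime dtype before the merge matters because `pd.merge` matches keys by value and dtype, so a text date on one side would not line up with a datetime on the other.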

In [5]:
# Load dividend data from the SILVER layer
df_dividends = pd.read_csv(stockevents_silver_file_path)
df_dividends["date"] = pd.to_datetime(df_dividends["date"])
print(f"Loaded SILVER Layer stock dividend data from: {stockevents_silver_file_path}")

Loaded SILVER Layer stock dividend data from: C:\Users\prashant.tripathi\Code\PortfolioTracker\DATA\SILVER\StockEvents\StockEvents_data.csv


In [6]:
# Filter for dividend events only
df_dividends = df_dividends[df_dividends["event"].str.upper() == "DIVIDENDS"]

# Merge dividend data with holding data
df_dividend = pd.merge(df_holding, df_dividends, on=["date", "symbol"], how="left")

# Calculate the dividend amount
df_dividend["dividend_amount"] = (
    df_dividend["value"].fillna(0) * df_dividend["holding_quantity"]
)
df_dividend["dividend_amount"] = df_dividend["dividend_amount"].round(2)

# Filter out rows where dividend amount is 0
df_dividend = df_dividend[df_dividend["dividend_amount"] != 0]
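
The filter, merge, and multiply steps above can be traced on a small, made-up dataset. The sketch below uses invented symbols and values (none come from the project's data) and assumes the same column names as the cell above: `date`, `symbol`, `holding_quantity`, `event`, and `value`.

```python
import pandas as pd

# Toy holdings (invented values): two symbols on one date, one symbol on the next
holdings = pd.DataFrame(
    {
        "date": pd.to_datetime(["2024-01-10", "2024-01-10", "2024-01-11"]),
        "symbol": ["AAA", "BBB", "AAA"],
        "holding_quantity": [10.0, 5.0, 10.0],
    }
)

# Toy stock events: one dividend and one non-dividend event
events = pd.DataFrame(
    {
        "date": pd.to_datetime(["2024-01-10", "2024-01-11"]),
        "symbol": ["AAA", "AAA"],
        "event": ["Dividends", "Stock Splits"],
        "value": [1.5, 2.0],
    }
)

# Keep dividend events only (case-insensitive match, as in the notebook)
dividends = events[events["event"].str.upper() == "DIVIDENDS"]

# Left merge keeps every holding row; rows with no dividend get NaN in "value"
merged = pd.merge(holdings, dividends, on=["date", "symbol"], how="left")

# Per-share value times quantity; a missing value counts as zero
merged["dividend_amount"] = (
    merged["value"].fillna(0) * merged["holding_quantity"]
).round(2)

# Drop rows where no dividend was paid
result = merged[merged["dividend_amount"] != 0].reset_index(drop=True)
print(result[["date", "symbol", "dividend_amount"]])
```

Only the AAA holding on 2024-01-10 survives: 1.5 per share times 10 shares gives 15.0. The BBB row and the second AAA row have no matching dividend, so `fillna(0)` turns their amounts into zero and the final filter removes them.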

In [7]:
# Align DataFrame with schema
df_dividend = align_with_schema(df_dividend, dividend_gold_schema_file_path)

# Sort the DataFrame and reset its index
df_dividend = df_dividend.sort_values(by=["date", "segment", "symbol"]).reset_index(
    drop=True
)

# Save the result to a new CSV file
df_dividend.to_csv(dividend_gold_file_path, index=False)
print(
    f"GOLD Layer CSV file for Dividend successfully created at: {dividend_gold_file_path}"
)

# Display DataFrame information
df_dividend.info()

GOLD Layer CSV file for Dividend successfully created at: C:\Users\prashant.tripathi\Code\PortfolioTracker\DATA\GOLD\Dividend\Dividend_data.csv
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 16 entries, 0 to 15
Data columns (total 4 columns):
 #   Column           Non-Null Count  Dtype         
---  ------           --------------  -----         
 0   date             16 non-null     datetime64[ns]
 1   segment          16 non-null     string        
 2   symbol           16 non-null     string        
 3   dividend_amount  16 non-null     float64       
dtypes: datetime64[ns](1), float64(1), string(2)
memory usage: 644.0 bytes
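
`align_with_schema` comes from the common utility notebook, and its implementation is not shown here. Judging only from the output above (four columns with fixed dtypes), a helper of this kind might select the contracted columns and cast each one. The following is a hypothetical sketch, not the project's code: the function name `align_to_columns`, the dict-based schema format, and the sample row are all assumptions made for illustration.

```python
import pandas as pd


def align_to_columns(df: pd.DataFrame, schema: dict) -> pd.DataFrame:
    """Hypothetical helper: keep only the schema's columns, in order, with its dtypes."""
    missing = [col for col in schema if col not in df.columns]
    if missing:
        raise KeyError(f"Missing columns: {missing}")
    # Select columns in schema order, then cast each to its declared dtype
    return df[list(schema)].copy().astype(schema)


# Invented input with an extra column and columns out of order
raw = pd.DataFrame(
    {
        "symbol": ["AAA"],
        "extra": ["not in schema"],
        "date": pd.to_datetime(["2024-01-10"]),
        "segment": ["EQ"],
        "dividend_amount": [15],
    }
)

# Assumed schema mirroring the four columns in the output above
schema = {
    "date": "datetime64[ns]",
    "segment": "string",
    "symbol": "string",
    "dividend_amount": "float64",
}

aligned = align_to_columns(raw, schema)
aligned.info()
```

Raising on a missing column is one possible design choice; the real helper may instead add empty columns or read its rules from the JSON data contract.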
