## SILVER TO GOLD LAYER

### GOLD LAYER - PROCESS DIVIDEND HISTORY


### Description

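This notebook processes stock dividend data. It loads holding data from the GOLD layer and stock events from the SILVER layer, keeps only the dividend events, merges them with the holdings on `date` and `symbol`, calculates the dividend amount for each holding, and saves the result to a CSV file in the GOLD layer.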
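
The merge-and-multiply logic can be sketched on toy data before running it against the real files. This is a minimal illustration, not the notebook's own code: the symbols `AAA` and `BBB`, the quantities, and the per-share values are made up, and only the column names (`date`, `symbol`, `event`, `value`, `holding_quantity`) are taken from the cells that follow.

```python
import pandas as pd

# Toy holdings: one row per (date, symbol) with the quantity held that day.
holdings = pd.DataFrame(
    {
        "date": pd.to_datetime(["2024-01-10", "2024-01-10", "2024-01-11"]),
        "symbol": ["AAA", "BBB", "AAA"],
        "holding_quantity": [10.0, 5.0, 10.0],
    }
)

# Toy stock events: `value` is the per-share amount for the event.
events = pd.DataFrame(
    {
        "date": pd.to_datetime(["2024-01-10", "2024-01-11"]),
        "symbol": ["AAA", "BBB"],
        "event": ["Dividends", "Splits"],
        "value": [2.5, 2.0],
    }
)

# 1. Keep dividend events only.
dividends = events[events["event"].str.upper() == "DIVIDENDS"]

# 2. Attach each dividend to the holding on the same date and symbol.
merged = holdings.merge(dividends, on=["date", "symbol"], how="left")

# 3. Dividend amount = per-share value x quantity held (0 when no event matched).
merged["dividend_amount"] = merged["value"].fillna(0) * merged["holding_quantity"]

# 4. Drop the rows that received no dividend.
result = merged.loc[
    merged["dividend_amount"] != 0, ["date", "symbol", "dividend_amount"]
]
print(result)
```

Only the `AAA` holding on 2024-01-10 survives, with a dividend amount of 25.0 (2.5 per share times 10 shares). The `BBB` event is a split, so it is filtered out before the merge.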


In [1]:
# Import pandas and the project's GlobalPath helper
import pandas as pd

from StockETL import GlobalPath

In [2]:
# Run the shared utility notebook to load the common helper functions
%run ../COMMON/common_utility.ipynb

USERNAME = 'ptprashanttripathi'


In [3]:
# Instantiate GlobalPath
holding_gold_file_path = GlobalPath(f"DATA/GOLD/Holding/{USERNAME}/Holding_data.csv")
stockevents_silver_file_path = GlobalPath(
    "DATA/SILVER/StockEvents/StockEvents_data.csv"
)
dividend_gold_file_path = GlobalPath(f"DATA/GOLD/Dividend/{USERNAME}/Dividend_data.csv")
dividend_gold_schema_file_path = GlobalPath("CONFIG/DATA_CONTRACTS/GOLD/Dividend.json")
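
The printed paths further down suggest that `GlobalPath` turns a project-relative string into an absolute path under the repository root. The sketch below imitates that behaviour with `pathlib` only; `project_path` and `PROJECT_ROOT` are hypothetical stand-ins, and the actual `StockETL.GlobalPath` implementation may do more (for example, creating directories).

```python
from pathlib import Path

# Assumed project root, copied from the paths printed later in this notebook.
PROJECT_ROOT = Path("/home/runner/work/PortfolioTracker/PortfolioTracker")
USERNAME = "ptprashanttripathi"


def project_path(relative: str, root: Path = PROJECT_ROOT) -> Path:
    """Join a project-relative path onto the root (stand-in, not GlobalPath itself)."""
    return root / relative


# The per-user output path is built with an f-string, as in the cell above.
dividend_path = project_path(f"DATA/GOLD/Dividend/{USERNAME}/Dividend_data.csv")
print(dividend_path)
```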

In [4]:
# Load holding data from the GOLD layer
df_holding = pd.read_csv(holding_gold_file_path)
df_holding.info()
df_holding["date"] = pd.to_datetime(df_holding["date"])
print(f"Loaded GOLD Layer holding data from => {holding_gold_file_path}")

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7695 entries, 0 to 7694
Data columns (total 16 columns):
 #   Column            Non-Null Count  Dtype  
---  ------            --------------  -----  
 0   date              7695 non-null   object 
 1   segment           7695 non-null   object 
 2   exchange          7695 non-null   object 
 3   symbol            7695 non-null   object 
 4   scrip_name        7695 non-null   object 
 5   holding_quantity  7695 non-null   float64
 6   avg_price         7695 non-null   float64
 7   holding_amount    7695 non-null   float64
 8   open_price        7692 non-null   float64
 9   open_amount       7692 non-null   float64
 10  high_price        7692 non-null   float64
 11  high_amount       7692 non-null   float64
 12  low_price         7692 non-null   float64
 13  low_amount        7692 non-null   float64
 14  close_price       7692 non-null   float64
 15  close_amount      7692 non-null   float64
dtypes: float64(11), object(5)
memory usage: 96
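
The `date` column is listed as `object` above because `info()` runs before the `pd.to_datetime` conversion. One alternative, shown here as a sketch on a made-up two-row extract rather than the real holding file, is to let `pd.read_csv` parse the column while loading via its `parse_dates` parameter.

```python
import io

import pandas as pd

# Hypothetical two-row extract with a subset of the holding columns.
csv_text = io.StringIO(
    "date,symbol,holding_quantity\n"
    "2024-01-10,AAA,10.0\n"
    "2024-01-11,AAA,10.0\n"
)

# parse_dates converts the named columns to datetimes while reading.
df_sample = pd.read_csv(csv_text, parse_dates=["date"])
print(df_sample.dtypes)
```

With this approach the `date` column already has a datetime dtype when `info()` is called, so the separate conversion step is not needed.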

In [5]:
# Load dividend data from the SILVER layer
df_dividends = pd.read_csv(stockevents_silver_file_path)
df_dividends["date"] = pd.to_datetime(df_dividends["date"])
print(f"Loaded SILVER Layer stock dividend data from => {stockevents_silver_file_path}")

Loaded SILVER Layer stock dividend data from => /home/runner/work/PortfolioTracker/PortfolioTracker/DATA/SILVER/StockEvents/StockEvents_data.csv


In [6]:
# Filter for dividend events only
df_dividends = df_dividends[df_dividends["event"].str.upper() == "DIVIDENDS"]

# Merge dividend data with holding data
df_dividend = pd.merge(df_holding, df_dividends, on=["date", "symbol"], how="left")

# Calculate the dividend amount
df_dividend["dividend_amount"] = (
    df_dividend["value"].fillna(0) * df_dividend["holding_quantity"]
)

# Keep only rows with a non-zero dividend amount
# (.copy() avoids a SettingWithCopyWarning when adding columns below)
df_dividend = df_dividend[df_dividend["dividend_amount"] != 0].copy()

# Apply the function to create the 'financial_year' column
df_dividend["financial_year"] = pd.to_datetime(df_dividend["date"]).apply(
    get_financial_year
)
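
`get_financial_year` is not defined in this notebook; it presumably comes from `common_utility.ipynb`. The sketch below shows one way such a helper could work. Both the April-to-March year and the `2023-24` label format are assumptions, and `get_financial_year_sketch` is a hypothetical name, not the project's function.

```python
import pandas as pd


def get_financial_year_sketch(date: pd.Timestamp) -> str:
    """Return a label such as '2023-24' for an April-to-March year (assumed)."""
    # Dates from April onward belong to the year starting in that calendar year;
    # January to March belong to the year that started the previous April.
    start_year = date.year if date.month >= 4 else date.year - 1
    end_year_short = (start_year + 1) % 100
    return f"{start_year}-{end_year_short:02d}"


dates = pd.Series(pd.to_datetime(["2024-03-31", "2024-04-01"]))
labels = dates.apply(get_financial_year_sketch)
print(labels.tolist())  # ['2023-24', '2024-25']
```

The two sample dates sit on either side of the assumed year boundary, so they land in different financial years.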

In [7]:
# Align DataFrame with the data contract
df_dividend = align_with_datacontract(df_dividend, dividend_gold_schema_file_path)

# Save the result to a new CSV file
df_dividend.to_csv(dividend_gold_file_path, index=False)
print(
    f"GOLD Layer CSV file for Dividend successfully created at: {dividend_gold_file_path}"
)

# Display DataFrame information
df_dividend.info()

DataContract loaded from => /home/runner/work/PortfolioTracker/PortfolioTracker/CONFIG/DATA_CONTRACTS/GOLD/Dividend.json
GOLD Layer CSV file for Dividend successfully created at: /home/runner/work/PortfolioTracker/PortfolioTracker/DATA/GOLD/Dividend/ptprashanttripathi/Dividend_data.csv
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 19 entries, 0 to 18
Data columns (total 5 columns):
 #   Column           Non-Null Count  Dtype         
---  ------           --------------  -----         
 0   financial_year   19 non-null     string        
 1   date             19 non-null     datetime64[ns]
 2   segment          19 non-null     string        
 3   symbol           19 non-null     string        
 4   dividend_amount  19 non-null     float64       
dtypes: datetime64[ns](1), float64(1), string(3)
memory usage: 892.0 bytes
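
The output above has exactly five columns, contract-style dtypes, and a fresh `RangeIndex`, which hints at what `align_with_datacontract` does. The sketch below is a guess at that behaviour, not the project's implementation: `align_sketch` is a hypothetical function, and the `contract` dictionary is an assumed layout, since the real structure of `Dividend.json` is not shown here.

```python
import pandas as pd

# Hypothetical contract: ordered column names mapped to pandas dtypes.
# The real Dividend.json layout is not shown in this notebook and may differ.
contract = {
    "financial_year": "string",
    "date": "datetime64[ns]",
    "segment": "string",
    "symbol": "string",
    "dividend_amount": "float64",
}


def align_sketch(df: pd.DataFrame, contract: dict) -> pd.DataFrame:
    """Keep only the contract columns, in contract order, cast to contract dtypes."""
    missing = [col for col in contract if col not in df.columns]
    if missing:
        raise KeyError(f"Columns missing from DataFrame: {missing}")
    aligned = df[list(contract)].astype(contract)
    return aligned.reset_index(drop=True)


raw = pd.DataFrame(
    {
        "symbol": ["AAA"],
        "date": pd.to_datetime(["2024-01-10"]),
        "dividend_amount": [25],
        "segment": ["EQ"],
        "financial_year": ["2023-24"],
        "value": [2.5],  # extra column that the contract drops
    }
)
aligned = align_sketch(raw, contract)
print(aligned.dtypes)
```

Failing loudly on missing columns is a design choice in this sketch: it surfaces upstream schema drift instead of silently writing an incomplete CSV.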
