## SILVER TO GOLD LAYER

### GOLD LAYER - PROCESS DIVIDEND DATA

### Description
This notebook processes stock dividend data: it merges dividend events with holdings data, calculates dividend amounts, and saves the result to a CSV file. It follows these steps:

1. **Load Holdings Data**: Reads holdings data from the GOLD layer CSV file and converts the 'date' column to datetime format for consistency.

2. **Load Dividend Data**: Reads stock event data from the SILVER layer CSV file, converts the 'date' column to datetime format, and keeps only 'DIVIDENDS' events (matched case-insensitively).

3. **Merge Data**: Left-joins the dividend data onto the holdings data on the shared 'date' and 'symbol' columns, so every holdings row is kept.

4. **Calculate Dividend Amount**: Computes the dividend amount by multiplying the dividend 'value' by the 'holding_quantity' (a missing 'value' is treated as 0) and rounds the result to two decimal places.

5. **Filter and Format Data**: Removes rows where the calculated dividend amount is zero, sorts the data by 'date', 'segment', and 'symbol', and keeps only the 'date', 'segment', 'symbol', and 'dividend_amount' columns.

6. **Save Data**: Saves the processed data to a new CSV file in the GOLD layer, without the DataFrame index.

7. **Logging and Display**: Prints the file path of the saved data and displays the DataFrame's summary information for verification.

The output is a GOLD layer dividend table with one dividend amount per 'date', 'segment', and 'symbol', ready for further analysis or reporting.
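
The steps above can be sketched on a small in-memory example. This is a minimal illustration, not the notebook's own code: the `compute_dividends` helper name and all sample rows are made up, and it assumes that `segment` and `holding_quantity` come from the holdings data while `event` and `value` come from the stock events data. Only the column names are taken from the notebook.

```python
import pandas as pd


def compute_dividends(holdings: pd.DataFrame, events: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper mirroring the notebook steps on in-memory frames."""
    # Keep dividend events only (case-insensitive match on the event name)
    dividends = events[events["event"].str.upper() == "DIVIDENDS"]

    # Left merge keeps every holdings row; unmatched rows get NaN in "value"
    merged = pd.merge(holdings, dividends, on=["date", "symbol"], how="left")

    # Per-share dividend value times quantity held, rounded to two decimals
    merged["dividend_amount"] = (
        merged["value"].fillna(0) * merged["holding_quantity"]
    ).round(2)

    # Drop rows with no dividend, then sort and select the output columns
    merged = merged[merged["dividend_amount"] != 0]
    merged = merged.sort_values(by=["date", "segment", "symbol"]).reset_index(drop=True)
    return merged[["date", "segment", "symbol", "dividend_amount"]]


# Toy data, made up for illustration
holdings = pd.DataFrame(
    {
        "date": pd.to_datetime(["2024-01-10", "2024-01-10", "2024-01-11"]),
        "segment": ["EQ", "EQ", "EQ"],
        "symbol": ["AAA", "BBB", "AAA"],
        "holding_quantity": [10, 5, 10],
    }
)
events = pd.DataFrame(
    {
        "date": pd.to_datetime(["2024-01-10", "2024-01-11"]),
        "symbol": ["AAA", "AAA"],
        "event": ["Dividends", "Stock Splits"],
        "value": [2.5, 2.0],
    }
)

result = compute_dividends(holdings, events)
print(result)
```

In this toy run only the `AAA` holding on 2024-01-10 matches a dividend event, so the result has a single row with a `dividend_amount` of 10 × 2.5 = 25.0. The `BBB` row has no event, and the 2024-01-11 event is not a dividend, so both are filtered out.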


In [29]:
# Import necessary libraries and utility functions
import pandas as pd

from PortfolioTracker.globalpath import GlobalPath

In [30]:
# Instantiate GlobalPath for the project root
global_path = GlobalPath("PortfolioTracker")
# Input paths (GOLD holdings, SILVER stock events) and output path (GOLD dividends)
holdings_gold_file_path = global_path.joinpath(
    "DATA/GOLD/Holdings/Holdings_data.csv"
)
stockevents_silver_file_path = global_path.joinpath(
    "DATA/SILVER/StockEvents/StockEvents_data.csv"
)
dividend_gold_file_path = global_path.joinpath(
    "DATA/GOLD/Dividend/Dividend_data.csv"
)

In [31]:
# Load holdings data from the GOLD layer
df_holdings = pd.read_csv(holdings_gold_file_path)
df_holdings["date"] = pd.to_datetime(df_holdings["date"])
print(f"Loaded GOLD Layer holdings data from: {holdings_gold_file_path}")

Loaded GOLD Layer holdings data from: C:\Users\prashant.tripathi\Code\PortfolioTracker\DATA\GOLD\Holdings\Holdings_data.csv


In [32]:
# Load dividend data from the SILVER layer
df_dividends = pd.read_csv(stockevents_silver_file_path)
df_dividends["date"] = pd.to_datetime(df_dividends["date"])
print(
    f"Loaded SILVER Layer stock dividend data from: {stockevents_silver_file_path}"
)

Loaded SILVER Layer stock dividend data from: C:\Users\prashant.tripathi\Code\PortfolioTracker\DATA\SILVER\StockEvents\StockEvents_data.csv


In [33]:
# Filter for dividend events only
df_dividends = df_dividends[df_dividends["event"].str.upper() == "DIVIDENDS"]

# Merge dividend data with holdings data
df_dividend = pd.merge(
    df_holdings, df_dividends, on=["date", "symbol"], how="left"
)

# Calculate the dividend amount
df_dividend["dividend_amount"] = (
    df_dividend["value"].fillna(0) * df_dividend["holding_quantity"]
)
df_dividend["dividend_amount"] = df_dividend["dividend_amount"].round(2)

# Filter out rows where dividend amount is 0
df_dividend = df_dividend[df_dividend["dividend_amount"] != 0]
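
One thing the merge above relies on is that each `(date, symbol)` pair appears at most once in the dividend data. If it appeared twice, the left merge would duplicate the holdings row and the dividend would be counted twice. The notebook does not say whether the SILVER stock events file can contain such duplicates, so the following is only a sketch of an optional guard, using made-up rows and the `validate` parameter of `pd.merge`.

```python
import pandas as pd
from pandas.errors import MergeError

# Toy data, made up for illustration
holdings = pd.DataFrame(
    {
        "date": pd.to_datetime(["2024-01-10"]),
        "symbol": ["AAA"],
        "holding_quantity": [10],
    }
)
# Two dividend rows for the same (date, symbol): a duplicated event record
dividends = pd.DataFrame(
    {
        "date": pd.to_datetime(["2024-01-10", "2024-01-10"]),
        "symbol": ["AAA", "AAA"],
        "value": [2.5, 2.5],
    }
)

# Without validation the single holdings row is silently duplicated
unchecked = pd.merge(holdings, dividends, on=["date", "symbol"], how="left")

# With validate="many_to_one" pandas raises instead of duplicating rows
try:
    pd.merge(
        holdings,
        dividends,
        on=["date", "symbol"],
        how="left",
        validate="many_to_one",
    )
    duplicate_keys_detected = False
except MergeError:
    duplicate_keys_detected = True

print(len(unchecked), duplicate_keys_detected)
```

Here `many_to_one` states the expectation that many holdings rows may share a key but the dividend side has unique keys; whether that expectation fits the real data is for the notebook's author to confirm.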

In [34]:
# Sort and format the DataFrame
df_dividend = df_dividend.sort_values(
    by=["date", "segment", "symbol"]
).reset_index(drop=True)
df_dividend = df_dividend[["date", "segment", "symbol", "dividend_amount"]]

# Save the result to a new CSV file
df_dividend.to_csv(dividend_gold_file_path, index=False)
print(
    f"GOLD Layer CSV file for Dividend successfully created at: {dividend_gold_file_path}"
)

# Display DataFrame information
df_dividend.info()

GOLD Layer CSV file for Dividend successfully created at: C:\Users\prashant.tripathi\Code\PortfolioTracker\DATA\GOLD\Dividend\Dividend_data.csv
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 14 entries, 0 to 13
Data columns (total 4 columns):
 #   Column           Non-Null Count  Dtype         
---  ------           --------------  -----         
 0   date             14 non-null     datetime64[ns]
 1   segment          14 non-null     object        
 2   symbol           14 non-null     object        
 3   dividend_amount  14 non-null     float64       
dtypes: datetime64[ns](1), float64(1), object(2)
memory usage: 580.0+ bytes
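
The `datetime64[ns]` dtype shown above lives only in memory: a CSV file stores dates as text. A downstream notebook that reads the file therefore needs to convert the column again, as the earlier cells do with `pd.to_datetime`, or pass `parse_dates` to `pd.read_csv`. The sketch below shows the difference with a made-up one-row frame and an in-memory buffer in place of the real GOLD file.

```python
import io

import pandas as pd
from pandas.api.types import is_datetime64_any_dtype

# Toy frame with the same columns as the saved dividend data
df = pd.DataFrame(
    {
        "date": pd.to_datetime(["2024-01-10"]),
        "segment": ["EQ"],
        "symbol": ["AAA"],
        "dividend_amount": [25.0],
    }
)

# Write to an in-memory buffer instead of the real GOLD file path
buffer = io.StringIO()
df.to_csv(buffer, index=False)

# Plain read: "date" comes back as text
buffer.seek(0)
plain = pd.read_csv(buffer)

# Read with parse_dates: "date" is restored to a datetime column
buffer.seek(0)
parsed = pd.read_csv(buffer, parse_dates=["date"])

print(plain["date"].dtype, parsed["date"].dtype)
```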
