# EBIT Calculation

This notebook calculates EBIT from other metric values extracted from the EDGAR database. The formula is: <br><br>
$\text{EBIT} = \text{Revenue} - \text{Cost of Goods and Services Sold} - \text{Selling, General and Administrative Expenses}$
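
As a quick illustration of the formula, here is a minimal sketch on made-up numbers. The column names (`revenue`, `cogs`, `sga`) and the figures are hypothetical and are not taken from the EDGAR extracts.

```python
import pandas as pd

# Hypothetical figures for two filers (illustrative only, not EDGAR data)
toy = pd.DataFrame({
    "cik": [1001, 1002],
    "year": [2022, 2022],
    "revenue": [500.0, 320.0],
    "cogs": [300.0, 200.0],
    "sga": [120.0, 90.0],
})

# EBIT = revenue - cost of goods and services sold - SG&A
toy["ebit"] = toy["revenue"] - toy["cogs"] - toy["sga"]
print(toy[["cik", "year", "ebit"]])
```

With these inputs the first filer has an EBIT of 80 and the second an EBIT of 30.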

## Import packages

In [3]:
import os
import json
from tqdm import tqdm
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import re 

## Calculate and extract EBIT file

In [4]:
# Define the directory where the metric CSV files are stored
metric_data_directory = '../../data/00_raw/metric_data'

# Define the filename for the EBIT CSV file
ebit_output_file = f'{metric_data_directory}/EBIT.csv'

# List of metrics required for EBIT calculation
metrics_for_ebit = [
    'RevenueFromContractWithCustomerExcludingAssessedTax',
    'CostOfGoodsAndServicesSold',
    'SellingGeneralAndAdministrativeExpense'
]

# Initialize an empty DataFrame for EBIT
ebit_df = None

# Loop through the metrics for EBIT calculation
for metric in metrics_for_ebit:
    # Load the metric-specific CSV file
    metric_file = os.path.join(metric_data_directory, f'{metric}.csv')
    df = pd.read_csv(metric_file)
    
    # Pivot the data to have metrics as columns, indexed by 'year' and 'cik'
    df_pivot = df.pivot(index=['year', 'cik'], columns='metric', values='val')
    
    # If this is the first metric, set the EBIT DataFrame to the pivot table
    if ebit_df is None:
        ebit_df = df_pivot.copy()
    else:
        # Add the metric as a new column aligned on ('year', 'cik').
        # Using .sub(..., fill_value=0) here would negate the cost columns,
        # so the subtraction below would end up adding them to revenue.
        ebit_df = ebit_df.join(df_pivot, how='outer')
    
# Calculate EBIT as the difference between specific columns
ebit_df['val'] = ebit_df['Revenue'] - ebit_df['Cost of Goods Sold'] - ebit_df['Selling, General and Administrative Expenses']
ebit_df['metric'] = 'EBIT'
# Reset the index
ebit_df.reset_index(inplace=True)

# Save the EBIT DataFrame to a CSV file
ebit_df.to_csv(ebit_output_file, index=False)

# Print a completion message
print("EBIT calculation and export completed.")

EBIT calculation and export completed.
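
To see how the pivot-and-align step behaves when a filer is missing one of the inputs, here is a minimal sketch on toy data. It assumes each metric CSV is in long format with the columns `year`, `cik`, `metric`, and `val`, and that the `metric` labels match the column names referenced in the cell above. The helper `long_frame` and all values are hypothetical.

```python
import pandas as pd

# Column labels referenced by the calculation cell above
REV = "Revenue"
COGS = "Cost of Goods Sold"
SGA = "Selling, General and Administrative Expenses"


def long_frame(metric, rows):
    """Build a toy long-format frame with the columns year, cik, metric, val."""
    return pd.DataFrame(
        [{"year": y, "cik": c, "metric": metric, "val": v} for y, c, v in rows]
    )


# Hypothetical values, not EDGAR data; cik 1002 has no SG&A row
frames = [
    long_frame(REV, [(2022, 1001, 500.0), (2022, 1002, 320.0)]),
    long_frame(COGS, [(2022, 1001, 300.0), (2022, 1002, 200.0)]),
    long_frame(SGA, [(2022, 1001, 120.0)]),
]

# Pivot each metric to a single column and align them on (year, cik)
wide = pd.concat(
    [f.pivot(index=["year", "cik"], columns="metric", values="val") for f in frames],
    axis=1,
)
wide["val"] = wide[REV] - wide[COGS] - wide[SGA]

# Rows with a missing input come out as NaN rather than a partial result
incomplete = wide[wide["val"].isna()].reset_index()
print(wide["val"])
print(f"{len(incomplete)} (year, cik) pair(s) lack at least one input")
```

A check like this can be run against the real data before export to decide whether incomplete rows should be dropped or kept as missing values.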


## Next step

This is the last step of getting and cleaning the raw data. Next, we will start training the FLS classifiers in the **1_fls_classifiers** folder.