# Financial Analysis of Top German Companies

In this notebook, we load and analyze key financial metrics for several major German companies.
We transform the data, compute summary statistics, group by business sector, and plot trends in
revenue, net income, return on assets (ROA), and return on equity (ROE).

In [1]:
%use dataframe(0.16.0-dev-6098)

In [2]:
%useLatestDescriptors
%use kandy

In [3]:
// Read data from a CSV file into a DataFrame.
// After renameToCamelCase() the ratio columns are named "roa" and "roe",
// so those are the names to rename into "ROA" and "ROE"
val dataFrame = DataFrame.read("top_12_german_companies.csv")
    .renameToCamelCase().rename("roa", "roe").into("ROA", "ROE")

In [None]:
dataFrame.head()

## Data Preparation: Formatting and Categorization

In this step, we prepare and clean the data for analysis.

- Custom Date Format: We define a custom date format (M/DD/YYYY, with no zero-padding for the month) to parse the "period" column into `LocalDate`.
- Business Sectors: We create an `enum` to classify companies into sectors such as Automotive, Banking, IT, and others.
- Data Transformation:
    - Convert the "period" column to `LocalDate` using the custom format.
    - Parse the "percentageDebtToEquity" column by removing the percentage sign, replacing the decimal comma with a dot, and converting it to a `Double`.
    - Convert the "ROA" and "ROE" columns to `Double` after stripping the dots from their string values.
    - Sort the data by "company" and "period".
    - Add a new column, "sector", which assigns companies to specific business sectors based on their names.


This step ensures the dataset is well-structured and categorized for further analysis.
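
To make the parsing rules above concrete, here is a minimal sketch in plain Kotlin. It is not the notebook's own code: it uses `java.time` from the JVM standard library as a stand-in for the `kotlinx.datetime` format, and the helper names `parsePeriod` and `parsePercent` as well as the sample values are made up for illustration.

```kotlin
import java.time.LocalDate
import java.time.format.DateTimeFormatter

// Assumed equivalent of the custom format: month without zero-padding,
// two-digit day, four-digit year, separated by slashes
val periodFormat: DateTimeFormatter = DateTimeFormatter.ofPattern("M/dd/yyyy")

// Hypothetical helper: parse a "period" string into a date
fun parsePeriod(raw: String): LocalDate = LocalDate.parse(raw, periodFormat)

// Hypothetical helper: strip a trailing "%" and accept a decimal comma
fun parsePercent(raw: String): Double =
    raw.removeSuffix("%").replace(',', '.').toDouble()

fun main() {
    // Sample inputs are invented; they are not taken from the CSV file
    println(parsePeriod("3/31/2023"))
    println(parsePercent("45,3%"))
}
```

Keeping each rule in a small function like this makes it easy to check the parsing on a few hand-written strings before applying it to a whole column.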

In [None]:
import kotlinx.datetime.format.Padding
import kotlinx.datetime.format.char

// Define a custom date format without zero-padding for the month,
// separating month/day/year with slashes
val format = LocalDate.Format {
    monthNumber(Padding.NONE)
    char('/')
    dayOfMonth()
    char('/')
    year()
}

// Enum of Business Sectors
enum class BusinessSector(val simpleName: String) {
    AUTOMOTIVE("Automotive"),
    BANKING("Banking"),
    INDUSTRIAL_TECH("Industrial"),
    INSURANCE_FINANCE("Insurance"),
    TELECOMMUNICATIONS("Telecom"),
    IT_SOFTWARE("IT"),
    PHARMA_CHEMICAL("Pharma"),
    OTHER("Other")
}

// Create a new DataFrame by converting the "period" column to LocalDate using the custom format,
// converting "percentageDebtToEquity" to Double (dropping "%" and using a decimal dot),
// converting "ROA" and "ROE" to Double after removing the dots from their string values,
// then sorting by "company" and "period", and finally adding a "sector" column
// depending on the company name
val companiesDf = dataFrame
    .convert { period }.with { LocalDate.parse(it, format) }
    .convert { percentageDebtToEquity }.with { it.removeSuffix("%").replace(',', '.').toDouble() }
    .convert { ROA and ROE }.with { it.replace(".", "").toDouble() }
    .sortBy { company and period }
    .add("sector") {
        when (company) {
            "Volkswagen AG", "BMW AG", "Daimler AG", "Porsche AG" -> BusinessSector.AUTOMOTIVE
            "Siemens AG", "BASF SE" -> BusinessSector.INDUSTRIAL_TECH
            "Allianz SE" -> BusinessSector.INSURANCE_FINANCE
            "Deutsche Bank AG" -> BusinessSector.BANKING
            "Deutsche Telekom AG" -> BusinessSector.TELECOMMUNICATIONS
            "SAP SE" -> BusinessSector.IT_SOFTWARE
            "Bayer AG", "Merck KGaA" -> BusinessSector.PHARMA_CHEMICAL
            else -> BusinessSector.OTHER
        }
    }

### Aggregating Financial Data

These steps group the data by company, and then by sector, and calculate summary statistics (mean, median, standard deviation, min, max) for financial columns such as revenue, net income, and the ratio columns.
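
As a reference for what these five statistics mean, the following plain-Kotlin sketch computes them for a list of numbers and then per group. The names `Summary`, `summarize`, and `summarizeByGroup` are hypothetical, and the sketch assumes the sample standard deviation (dividing by n - 1); the library's own `std()` may be configured differently.

```kotlin
import kotlin.math.sqrt

// Hypothetical container for the five statistics named above
data class Summary(
    val mean: Double,
    val median: Double,
    val std: Double,
    val min: Double,
    val max: Double
)

fun summarize(values: List<Double>): Summary {
    require(values.isNotEmpty()) { "values must not be empty" }
    val sorted = values.sorted()
    val n = sorted.size
    val mean = sorted.sum() / n
    // Median: middle element, or the average of the two middle elements
    val median = if (n % 2 == 1) sorted[n / 2] else (sorted[n / 2 - 1] + sorted[n / 2]) / 2
    // Assumption: sample standard deviation (n - 1 in the denominator)
    val std = if (n > 1) sqrt(sorted.sumOf { (it - mean) * (it - mean) } / (n - 1)) else 0.0
    return Summary(mean, median, std, sorted.first(), sorted.last())
}

// Group (key, value) pairs by key and summarize each group
fun summarizeByGroup(rows: List<Pair<String, Double>>): Map<String, Summary> =
    rows.groupBy({ it.first }, { it.second }).mapValues { (_, values) -> summarize(values) }
```

For example, `summarizeByGroup(listOf("A" to 1.0, "A" to 3.0, "B" to 5.0))` returns one `Summary` for group "A" and one for group "B".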

In [None]:
// Group by "company" and aggregate key financial columns
companiesDf.groupBy { company }.aggregate {
    val financeColumns = it.select { revenue and netIncome and liabilities and assets and equity and ROA and ROE and debtToEquity and percentageDebtToEquity }
    financeColumns.mean() into "mean"
    financeColumns.median() into "median"
    financeColumns.std() into "std"
    financeColumns.min() into "min"
    financeColumns.max() into "max"
}

In [None]:
// Group by "sector" and aggregate key financial columns
companiesDf.groupBy { sector }.aggregate {
    val financeColumns = it.select { revenue and netIncome and liabilities and assets and equity and ROA and ROE and debtToEquity and percentageDebtToEquity }
    financeColumns.mean() into "mean"
    financeColumns.median() into "median"
    financeColumns.std() into "std"
    financeColumns.min() into "min"
    financeColumns.max() into "max"
}

In [None]:
companiesDf.groupBy { sector }.aggregate {
    revenue.mean() into "Avg revenue"
    revenue.sum() into "Total revenue"
    netIncome.mean() into "Avg Net Income"
    netIncome.sum() into "Sum Net Income"
    ROA.mean() into "Avg ROA"
    ROE.mean() into "Avg ROE"
}.sortBy { sector }

In [None]:
// Group by "period" and "sector" then compute total revenue and net income
val timeSerDf = companiesDf.groupBy { period and sector }.aggregate {
    revenue.sum() into "totalRevenue"
    netIncome.sum() into "totalNetIncome"
}

// List of business sectors
val listOfSectors = listOf(
    BusinessSector.AUTOMOTIVE,
    BusinessSector.BANKING,
    BusinessSector.INSURANCE_FINANCE,
    BusinessSector.INDUSTRIAL_TECH,
    BusinessSector.TELECOMMUNICATIONS,
    BusinessSector.IT_SOFTWARE,
    BusinessSector.PHARMA_CHEMICAL
)

// Matching colors for each sector
val listOfSectorColors = listOf(
    Color.hex("#ffaf00"),
    Color.hex("#f46920"),
    Color.hex("#f53255"),
    Color.hex("#f857c1"),
    Color.hex("#29bdfd"),
    Color.hex("#00cbbf"),
    Color.hex("#01c159")
)

## Visualizing Revenue and Net Income by Sector

1. Revenue by Sector:
    - A line chart shows total revenue over time, grouped by business sector.
    - Points highlight specific values, and each sector is color-coded using a predefined palette.
    - The chart includes a legend for sector identification.
2. Net Income by Sector:
    - A similar line chart displays total net income over time for each sector.
    - Points and color-coding are used to enhance clarity, with a legend indicating the sectors.

These visualizations help analyze trends and compare financial performance across sectors over time.
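
Each line in these charts is one sector's series of per-period totals. The sketch below shows, in plain Kotlin and independently of the DataFrame API, how such series could be assembled. The `Record` type, the `totalsBySector` function, and the use of ISO date strings for the period are assumptions made for illustration.

```kotlin
// Hypothetical row type; period is an ISO date string so that
// lexicographic order matches chronological order
data class Record(val period: String, val sector: String, val revenue: Double)

// For each sector, sum revenue per period and order the points by period
fun totalsBySector(records: List<Record>): Map<String, List<Pair<String, Double>>> =
    records.groupBy { it.sector }
        .mapValues { (_, rows) ->
            rows.groupBy { it.period }
                .map { (period, samePeriod) -> period to samePeriod.sumOf { it.revenue } }
                .sortedBy { it.first }
        }
```

The result maps each sector name to the list of (period, total) points that one line of the chart would connect.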

In [None]:
// Plot total revenue by period and sector
timeSerDf.plot {
    // Map the x-axis to the "period" column
    x(period) { axis.name = "Date" }
    // Map the y-axis to the aggregated "totalRevenue"
    y(totalRevenue) { axis.name = "Revenue" }

    // Draw a line chart
    line {
        // Color lines by the "sector" column
        color(sector) {
            // Use a categorical color scale with predefined colors and sectors
            scale = categorical(range = listOfSectorColors, domain = listOfSectors)
            // Configure and label the legend
            legend {
                name = "Sector"
                this.breaksLabeled(
                    BusinessSector.AUTOMOTIVE to BusinessSector.AUTOMOTIVE.simpleName,
                    BusinessSector.BANKING to BusinessSector.BANKING.simpleName,
                    BusinessSector.INSURANCE_FINANCE to BusinessSector.INSURANCE_FINANCE.simpleName,
                    BusinessSector.INDUSTRIAL_TECH to BusinessSector.INDUSTRIAL_TECH.simpleName,
                    BusinessSector.TELECOMMUNICATIONS to BusinessSector.TELECOMMUNICATIONS.simpleName,
                    BusinessSector.IT_SOFTWARE to BusinessSector.IT_SOFTWARE.simpleName,
                    BusinessSector.PHARMA_CHEMICAL to BusinessSector.PHARMA_CHEMICAL.simpleName
                )
            }
        }
    }
    // Add points on top of the line chart
    points {
        size = 3.0
        color(sector) { scale = categorical(range = listOfSectorColors, domain = listOfSectors) }
    }

    // Adjust the layout and overall plot appearance
    layout {
        title = "Revenue by Sector"
        size = 875 to 500
    }
}

In [None]:
// Plot total net income by period and sector
timeSerDf.plot {
    // Map the x-axis to the "period" column
    x(period) { axis.name = "Date" }
    // Map the y-axis to the aggregated "totalNetIncome"
    y(totalNetIncome) { axis.name = "Net Income" }

    // Draw a line chart
    line {
        // Color lines by the "sector" column
        color(sector) {
            // Use the same categorical color scale and sector list
            scale = categorical(range = listOfSectorColors, domain = listOfSectors)
            // Configure and label the legend
            legend {
                name = "Sector"
                this.breaksLabeled(
                    BusinessSector.AUTOMOTIVE to BusinessSector.AUTOMOTIVE.simpleName,
                    BusinessSector.BANKING to BusinessSector.BANKING.simpleName,
                    BusinessSector.INSURANCE_FINANCE to BusinessSector.INSURANCE_FINANCE.simpleName,
                    BusinessSector.INDUSTRIAL_TECH to BusinessSector.INDUSTRIAL_TECH.simpleName,
                    BusinessSector.TELECOMMUNICATIONS to BusinessSector.TELECOMMUNICATIONS.simpleName,
                    BusinessSector.IT_SOFTWARE to BusinessSector.IT_SOFTWARE.simpleName,
                    BusinessSector.PHARMA_CHEMICAL to BusinessSector.PHARMA_CHEMICAL.simpleName
                )
            }
        }
    }

    // Add points on top of the line chart
    points {
        size = 3.0
        color(sector) { scale = categorical(range = listOfSectorColors, domain = listOfSectors) }
    }

    // Adjust the layout and overall plot appearance
    layout {
        title = "Net Income by Sector"
        size = 875 to 500
    }
}

## ROA and ROE Analysis by Sector

1. Computing Averages and Standard Deviations:
    - Group the data by sector and calculate the mean and standard deviation for Return on Assets (ROA) and Return on Equity (ROE).
    - This creates a summarized dataset for sector-level performance comparison.
2. Visualizing ROA by Sector:
    - A bar chart displays the average ROA for each sector.
    - Error bars represent one standard deviation, showing the variability within each sector.
3. Visualizing ROE by Sector:
    - A similar bar chart illustrates the average ROE across sectors.
    - Error bars provide insight into the standard deviation of ROE within each sector.

These charts help compare sector-level profitability metrics and assess consistency within sectors.
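
The error bars described above span from the mean minus one standard deviation to the mean plus one standard deviation. A minimal sketch of that calculation in plain Kotlin follows; `ErrorBar` and `errorBars` are hypothetical names, not part of Kandy or the DataFrame library.

```kotlin
// Hypothetical holder for one error bar: lower and upper bound for a category
data class ErrorBar(val label: String, val yMin: Double, val yMax: Double)

// Build mean ± one standard deviation bounds for each category
fun errorBars(labels: List<String>, means: List<Double>, stds: List<Double>): List<ErrorBar> {
    require(labels.size == means.size && means.size == stds.size) {
        "labels, means and stds must have the same length"
    }
    return labels.indices.map { i ->
        ErrorBar(labels[i], means[i] - stds[i], means[i] + stds[i])
    }
}
```

Note that the lower bound can fall below zero when the standard deviation exceeds the mean, which matters if the y-axis is clipped at zero.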

In [None]:
// Group data by sector to compute average and standard deviations of ROA and ROE
val roeAndRoaDf = companiesDf.groupBy { sector }.aggregate {
    ROA.mean() into "Avg ROA"
    ROA.std() into "Std ROA"
    ROE.mean() into "Avg ROE"
    ROE.std() into "Std ROE"
}

roeAndRoaDf

In [None]:
// Plot average ROA by sector with error bars representing one standard deviation
roeAndRoaDf.plot {
    // Set the x-axis to the sector names
    x(sector.map { it.simpleName }) { axis.name = "Sector of Business" }

    bars {
        // Use the "Avg ROA" column for the bar heights
        y(`Avg ROA`) { scale = continuous(min = .0, max = 4.5e+9) }
        // Fill bars with a chosen color
        fillColor = Color.hex("#ffaf00")
    }
    lineRanges {
        // Calculate the error bar bounds: Avg ROA minus and plus one Std ROA
        yMin(`Avg ROA`.toList().zip(`Std ROA`.toList()).map { it.first - it.second })
        yMax(`Avg ROA`.toList().zip(`Std ROA`.toList()).map { it.first + it.second })
        // Color the line of the ranges
        borderLine.color = Color.GREY
    }

    // Adjust layout options such as title and overall size
    layout {
        title = "Average ROA By Sector With Standard Deviation"
        size = 875 to 500
    }
}

In [None]:
// Plot average ROE by sector with error bars representing one standard deviation
roeAndRoaDf.plot {
    // Set the x-axis to the sector names
    x(sector.map { it.simpleName }) { axis.name = "Sector of Business" }

    bars {
        // Use the "Avg ROE" column for the bar heights
        y(`Avg ROE`)
        // Fill bars with a chosen color
        fillColor = Color.hex("#ffaf00")
    }
    lineRanges {
        // Calculate the error bar bounds: Avg ROE minus and plus one Std ROE
        yMin(`Avg ROE`.toList().zip(`Std ROE`.toList()).map { it.first - it.second })
        yMax(`Avg ROE`.toList().zip(`Std ROE`.toList()).map { it.first + it.second })
        // Color the line of the ranges
        borderLine.color = Color.GREY
    }

    // Adjust layout options such as title and overall size
    layout {
        title = "Average ROE By Sector With Standard Deviation"
        size = 875 to 500
    }
}