## Loan Distribution by Category
Data source (Bank of Lithuania): https://www.lb.lt/lt/m_statistika/t-paskolos-pagal-rezidentiskuma

In [3]:
import pandas as pd
import plotly.graph_objects as go
import plotly.io as pio

# Open figures in the default web browser instead of inline
pio.renderers.default = 'browser'

## Data Loading, Review, and Cleaning

In [4]:
file_path = 'paskolos_pagal_rezidentiskuma.csv' 
data = pd.read_csv(file_path, delimiter=';')

print(data.columns)

# Check for duplicate rows
duplicates = data.duplicated().sum()
print(f'Total duplicate rows: {duplicates}')

# Check for missing values
missing_values = data.isnull().sum().sum()
print(f'Total missing (empty) values: {missing_values}')

# Show detailed missing values for each column 
print("Missing values per column:")
print(data.isnull().sum())
print(data.head())


Index(['code', 'en_long_title', 'lt_long_title', 'date', 'value', 'power'], dtype='object')
Total duplicate rows: 0
Total missing (empty) values: 0
Missing values per column:
code             0
en_long_title    0
lt_long_title    0
date             0
value            0
power            0
dtype: int64
                                       code  \
0  BPS.M.A20.A.1.00.000.000.LT.N.PAB__.E.SR   
1  BPS.M.A20.A.1.00.000.000.LT.N.PAB__.E.SR   
2  BPS.M.A20.A.1.00.000.000.LT.N.PAB__.E.SR   
3  BPS.M.A20.A.1.00.000.000.LT.N.PAB__.E.SR   
4  BPS.M.A20.A.1.00.000.000.LT.N.PAB__.E.SR   

                                       en_long_title  \
0  Monetary financial institutions balance sheet ...   
1  Monetary financial institutions balance sheet ...   
2  Monetary financial institutions balance sheet ...   
3  Monetary financial institutions balance sheet ...   
4  Monetary financial institutions balance sheet ...   

                                       lt_long_title     date   value    power

The dataset is cleanly structured (no missing values and no duplicate rows) and consists of the following columns:

code: The identifier of the statistical series. It is not unique per row; the first five rows shown above all share the same code.

en_long_title: The full English name of the indicator that the series measures.

lt_long_title: The same indicator name as "en_long_title," in Lithuanian.

date: The month the observation refers to, in year-month format (YYYY-MM).

value: The reported loan amount, before any scaling.

power: A numerical scaling factor that accompanies "value"; the analysis below multiplies "value" by "power" to obtain the adjusted amount.
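Whether `power` is a plain multiplier or a power-of-ten exponent is worth confirming before scaling, because the two readings give very different totals. The sketch below compares both on made-up rows (not values from the dataset). The helper `scale_values` and its `treat_as_exponent` flag are hypothetical names introduced for illustration; the correct reading should be checked against the data provider's metadata.

```python
import pandas as pd


def scale_values(df: pd.DataFrame, treat_as_exponent: bool) -> pd.Series:
    """Scale `value` by `power` (hypothetical helper, for illustration only)."""
    if treat_as_exponent:
        # Reading 1 (assumption): power is an exponent of ten, e.g. 6 = millions
        return df["value"] * 10.0 ** df["power"]
    # Reading 2 (assumption): power is a direct multiplier
    return df["value"] * df["power"]


# Made-up sample rows, not taken from the real file
sample = pd.DataFrame({"value": [1.5, 2.0], "power": [6, 6]})

# Inspecting the distinct values is a quick first sanity check
print(sample["power"].unique())

as_multiplier = scale_values(sample, treat_as_exponent=False)
as_exponent = scale_values(sample, treat_as_exponent=True)
print(as_multiplier.tolist(), as_exponent.tolist())
```

If the real column holds only small integers such as 3 or 6, the exponent reading may be the more plausible one, but that is a guess until the metadata confirms it.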

We first convert the 'date' column to datetime format and compute 'adjusted_value' as 'value' multiplied by 'power'. Loan categories are then extracted from the 'lt_long_title' column, with each category determined by the first word of the description; the category 'Pinigų' is relabeled 'Value'. Finally, we build a single interactive figure that holds two traces per category, a bar and a line, so the loan distribution over time can be viewed as a bar chart, as a line chart, or as both.
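The date parsing and first-word category extraction described above can be tried on a toy frame before running them on the full dataset. The titles below are invented placeholders, not actual rows from the file.

```python
import pandas as pd

# Invented placeholder rows (assumption: real titles are space-separated text)
toy = pd.DataFrame({
    "date": ["2005-01", "2005-02"],
    "lt_long_title": [
        "Paskolos namų ūkiams",
        "Pinigų finansų įstaigų paskolos",
    ],
})

# 'YYYY-MM' strings become timestamps on the first day of each month
toy["date"] = pd.to_datetime(toy["date"], format="%Y-%m")

# The first whitespace-separated word serves as the category label
toy["category"] = toy["lt_long_title"].str.split(" ").str[0]

print(toy[["date", "category"]])
```

Taking only the first word is a coarse grouping: two unrelated series whose titles start with the same word end up in one category.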

In [5]:
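# Parse 'YYYY-MM' strings into timestamps (first day of each month)
data['date'] = pd.to_datetime(data['date'], format='%Y-%m')

# Scale the reported value by the accompanying 'power' column
data['adjusted_value'] = data['value'] * data['power']

# Use the first word of the Lithuanian title as the loan category
data['category'] = data['lt_long_title'].str.split(' ').str[0]

# Relabel the 'Pinigų' category for display
data['category'] = data['category'].replace({'Pinigų': 'Value'})

fig = go.Figure()

categories = data['category'].unique()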


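# Add one bar trace and one line trace per category.
# The dropdown relies on this alternating order: even index = bar, odd index = line.
for category in categories:
    category_data = data[data['category'] == category]

    # Bar chart trace (shown by default)
    fig.add_trace(
        go.Bar(
            x=category_data['date'],
            y=category_data['adjusted_value'],
            name=f'{category} (Bar)',
            visible=True,
        )
    )

    # Line chart trace (hidden until selected in the dropdown)
    fig.add_trace(
        go.Scatter(
            x=category_data['date'],
            y=category_data['adjusted_value'],
            mode='lines+markers',
            name=f'{category} (Line)',
            visible=False,
        )
    )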


# Traces alternate bar, line, bar, line, ... so even indices are bars
n_traces = 2 * len(categories)
bar_only = [i % 2 == 0 for i in range(n_traces)]
line_only = [i % 2 == 1 for i in range(n_traces)]

dropdown_buttons = [
    {
        'label': 'Bar Chart',
        'method': 'update',
        'args': [
            {'visible': bar_only},
            {'title': 'Loan Distribution by Category - Bar Chart'}
        ]
    },
    {
        'label': 'Line Chart',
        'method': 'update',
        'args': [
            {'visible': line_only},
            {'title': 'Loan Distribution by Category - Line Chart'}
        ]
    },
    {
        'label': 'Both',
        'method': 'update',
        'args': [
            {'visible': [True] * n_traces},
            {'title': 'Loan Distribution by Category - Both Charts'}
        ]
    }
]


fig.update_layout(
    updatemenus=[{
        'buttons': dropdown_buttons,
        'direction': 'down',
        'showactive': True,
    }],
    title="Loan Distribution by Category Over Time",
    xaxis_title="Date",
    yaxis_title="Total Loan Amount (€)"
)


fig.show()
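
The dropdown works because each button passes one visibility flag per trace, in the order the traces were added. A minimal sketch of that bookkeeping in plain Python, using a hypothetical helper `visibility_mask` (not part of Plotly) and assuming traces are added as (bar, line) pairs per category:

```python
def visibility_mask(n_categories, mode):
    """Return one visibility flag per trace for (bar, line) trace pairs.

    Hypothetical helper for illustration; `mode` is 'bar', 'line', or 'both'.
    """
    if mode not in ("bar", "line", "both"):
        raise ValueError(f"unknown mode: {mode!r}")
    flags = []
    for _ in range(n_categories):
        flags.append(mode in ("bar", "both"))   # flag for the bar trace
        flags.append(mode in ("line", "both"))  # flag for the line trace
    return flags


print(visibility_mask(2, "bar"))   # [True, False, True, False]
print(visibility_mask(2, "line"))  # [False, True, False, True]
print(visibility_mask(2, "both"))  # [True, True, True, True]
```

Building the flags per pair rather than by index parity makes the dependence on trace order explicit, which helps if a third trace type is ever added per category.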


In [6]:
# Export to HTML
pio.write_html(fig, file="loan_distribution.html", auto_open=False)

### Data Interpretation and Conclusions

Overall Loan Growth:

Both the bar and line charts show substantial loan growth between 2005 and 2023. Loan amounts rose most noticeably after 2010, which is consistent with the economic recovery that followed the global financial crisis. The steady rise points to growing demand for credit, plausibly driven by improving macroeconomic conditions, more favorable lending terms, and greater consumer and business need for investment capital.

Fluctuations and Distinct Phases:

2008 Financial Crisis: Between 2008 and 2010, loan volumes stagnated and at times declined, likely because of the financial instability caused by the global economic downturn.

2011–2020 Period: Loan amounts increased steadily after the crisis, reflecting economic recovery and rising confidence in financial markets.

2020 and the Pandemic: Some fluctuations appear from 2020 onward. A likely explanation is that businesses and individuals sought additional financing amid the economic uncertainty brought about by COVID-19.

Economic Significance:
An increase in lending typically signals growing economic activity. The loan growth observed here can be associated with:

Rising Investments: Businesses borrow more capital for investments, which fuels economic growth.

Increasing Consumption: Individuals may borrow larger amounts to finance major purchases (e.g., homes).

Macroeconomic Conditions: Favorable macroeconomic factors, such as lower interest rates, may have encouraged borrowers to take on more loans.

Taken together, these factors suggest that the overall loan growth reflects broader economic recovery and the financial system's ability to support both corporate and personal borrowing needs.
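
The growth and stagnation phases described above are read off the charts by eye. One way to check them numerically is to compute the year-over-year change of a monthly series. The sketch below uses synthetic numbers (not the dataset's values) and assumes one observation per month in chronological order.

```python
import pandas as pd

# Synthetic monthly series: flat at 100 for one year, then flat at 125
dates = pd.date_range("2005-01-01", periods=24, freq="MS")
series = pd.Series([100.0] * 12 + [125.0] * 12, index=dates)

# Percentage change against the same month one year earlier;
# the first 12 months have no reference value and come out as NaN
yoy = series.pct_change(periods=12) * 100

print(yoy.dropna().round(1))
```

Comparing each month with the same month a year earlier also removes any within-year seasonal pattern, so a sustained negative value would indicate an actual contraction rather than a seasonal dip.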

### About the Visualization

An interactive visualization was created to analyze the distribution of loans by category over time. It combines bar and line charts, and a dropdown menu lets the user switch between display modes. The charts show total loan amounts; users can filter categories, choose the chart type, and inspect individual values.

Interactivity:
Dropdown Menu: The user can choose one of three options:

Bar chart – displays only the bars.

Line chart – displays only the lines.

Both – displays both bars and lines simultaneously.

Tooltips: Hovering over any bar or line point shows a tooltip displaying the exact date and loan amount at that time.

Zoom and Pan: Users can zoom into a region of the chart and pan along the time axis to examine a specific interval.

Legend Filter: By clicking on a legend item, users can add or remove the corresponding category from the chart.






