<a href="https://colab.research.google.com/github/ariadnemnp/ariadnemnp/blob/main/Deposit_Trend_Indonesia.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 📊 Deposit Trends and the Changing Financial Habits of Indonesians

The dataset for this analysis comes from Indonesian Financial and Economic Statistics and covers deposits in commercial and rural banks from 2010 to 2024. One data-wrangling step I did was converting data types: `Year` to object (string) and the other columns to integer.

Deposit amounts increase consistently, in line with the number of accounts, which suggests growing financial engagement in Indonesia.

What does this trend say about the financial habits of Indonesians? Next, I'll explore whether people prefer to save in short-term or long-term instruments.
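
As a first pass at the short-term versus long-term question, here is a minimal sketch that compares time-deposit tenors for two years. The figures are copied from the 2010 and 2019 rows of the table in Part 2. The cut-off (3 months or less counts as "short-term") is my own assumption for illustration, and the `Others` and `Matured Deposits` columns are left out.

```python
import pandas as pd

# Figures copied from the 2010 and 2019 rows of the table shown in Part 2.
tenors = pd.DataFrame(
    {
        "1 Month Time Deposit": [542497, 967214],
        "3 Month Time Deposit": [323790, 794936],
        "6 Month Time Deposit": [91176, 289853],
        "12 Month Time Deposit": [93488, 196029],
        "24 Month Time Deposit": [1609, 63535],
    },
    index=pd.Index(["2010", "2019"], name="Year"),
)

# Assumption: "short-term" means a tenor of 3 months or less (illustrative cut-off).
short_cols = ["1 Month Time Deposit", "3 Month Time Deposit"]
long_cols = [c for c in tenors.columns if c not in short_cols]

summary = pd.DataFrame(
    {
        "short_term": tenors[short_cols].sum(axis=1),
        "long_term": tenors[long_cols].sum(axis=1),
    }
)
total = summary["short_term"] + summary["long_term"]
summary["short_term_share_pct"] = (summary["short_term"] / total * 100).round(1)

print(summary)
```

A falling `short_term_share_pct` between the two years would hint at a shift toward longer tenors, though two data points are not enough to call it a trend.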




## Part 1 — Find a Dataset (Upload)

In [8]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [9]:
data = '/content/drive/MyDrive/Colab Notebooks/Rev Banking Data Indonesia.csv'

## Part 2 — Load & Inspect

In [10]:
import pandas as pd

df = pd.read_csv(data)
rows, cols = df.shape
rows, cols

(15, 13)

In [11]:
df.head(15)

Unnamed: 0,Year,Total Deposit,Demand Deposit,Saving Deposit,Time Deposit,1 Month Time Deposit,3 Month Time Deposit,6 Month Time Deposit,12 Month Time Deposit,24 Month Time Deposit,Others,Matured Deposits,Number of Account
0,2010,2304875,504096,738695,1062084,542497,323790,91176,93488,1609,8822,702,97204863
1,2011,2736415,605085,905700,1225630,609109,373948,123882,94166,5497,18351,677,101503564
2,2012,3163519,710400,1085743,1367377,596108,425804,170200,153227,3137,18616,285,119917930
3,2013,3578208,797243,1222029,1558935,758313,428975,188930,132338,30286,15101,4991,147626510
4,2014,4013817,845667,1298326,1869823,921542,497681,203592,175828,44023,26440,717,160881757
5,2015,4335269,937089,1411620,1986560,883859,626063,227586,155825,49202,41151,2875,175994476
6,2016,4749400,1065072,1569943,2114385,958102,570157,240312,200813,45070,28828,71103,199301222
7,2017,5142991,1153733,1720959,2268299,942449,666230,292500,194294,57619,32252,82954,242396164
8,2018,5457228,1213887,1847928,2395414,1016271,737593,269701,165002,37295,26402,143151,275764037
9,2019,5805078,1351901,1968855,2484323,967214,794936,289853,196029,63535,28107,144648,301697958


## Part 3 — Explore & Clean

In [12]:
df.dtypes




Unnamed: 0,0
Year,int64
Total Deposit,object
Demand Deposit,object
Saving Deposit,object
Time Deposit,object
1 Month Time Deposit,object
3 Month Time Deposit,object
6 Month Time Deposit,object
12 Month Time Deposit,object
24 Month Time Deposit,object


In [18]:

df = pd.read_csv(data)

cols_to_convert = [
    'Total Deposit', 'Demand Deposit', 'Saving Deposit', 'Time Deposit',
    '1 Month Time Deposit', '3 Month Time Deposit', '6 Month Time Deposit',
    '12 Month Time Deposit', '24 Month Time Deposit', 'Others',
    'Matured Deposits', 'Number of Account'
]

for col in cols_to_convert:
    # Step 1: Remove commas and spaces
    df[col] = df[col].str.replace(',', '').str.strip()

    # Step 2: Convert to numeric (float), coerce errors to NaN
    df[col] = pd.to_numeric(df[col], errors='coerce')

    # Step 3: If there are no NaNs, convert to integer
    if df[col].isnull().sum() == 0:
        df[col] = df[col].astype(int)
    else:
        print(f"⚠️ Column '{col}' contains missing values; kept as float.")

df['Year'] = df['Year'].astype(str)
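
To see what each cleaning step does in isolation, here is a small sketch on a toy Series. The raw strings are hypothetical values I made up for illustration; they are not taken from the dataset. It also shows pandas' nullable `Int64` dtype as one possible alternative to leaving a column as float when values are missing.

```python
import pandas as pd

# Hypothetical raw strings, invented for illustration (not from the dataset).
raw = pd.Series(["2,304,875", " 2736415 ", "n/a"])

# Step 1: drop thousands separators and surrounding whitespace.
cleaned = raw.str.replace(",", "", regex=False).str.strip()

# Step 2: parse to numbers; anything unparseable becomes NaN.
numeric = pd.to_numeric(cleaned, errors="coerce")

# Optional: the nullable integer dtype keeps whole numbers and marks gaps as <NA>.
as_int = numeric.astype("Int64")

print(numeric.tolist())
print(as_int.tolist())
```

Using `errors="coerce"` silently turns bad values into missing ones, so it is worth counting the NaNs afterwards, as the loop above does.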

In [19]:
df.dtypes

Unnamed: 0,0
Year,object
Total Deposit,int64
Demand Deposit,int64
Saving Deposit,int64
Time Deposit,int64
1 Month Time Deposit,int64
3 Month Time Deposit,int64
6 Month Time Deposit,int64
12 Month Time Deposit,int64
24 Month Time Deposit,int64


In [21]:
df.isnull().sum().sort_values(ascending=False).head(10)

Unnamed: 0,0
Year,0
Total Deposit,0
Demand Deposit,0
Saving Deposit,0
Time Deposit,0
1 Month Time Deposit,0
3 Month Time Deposit,0
6 Month Time Deposit,0
12 Month Time Deposit,0
24 Month Time Deposit,0
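
Beyond missing values, a quick consistency check is whether the three deposit types add up to `Total Deposit`. The sketch below uses the 2010 and 2011 rows copied from the table in Part 2; I am assuming the total is meant to be the sum of demand, saving, and time deposits, which holds for these two rows.

```python
import pandas as pd

# Figures copied from the 2010 and 2011 rows shown in Part 2.
check = pd.DataFrame(
    {
        "Year": ["2010", "2011"],
        "Total Deposit": [2304875, 2736415],
        "Demand Deposit": [504096, 605085],
        "Saving Deposit": [738695, 905700],
        "Time Deposit": [1062084, 1225630],
    }
)

parts = ["Demand Deposit", "Saving Deposit", "Time Deposit"]

# Gap between the reported total and the sum of its components.
check["gap"] = check["Total Deposit"] - check[parts].sum(axis=1)
mismatched = check.loc[check["gap"] != 0, ["Year", "gap"]]

print("All rows reconcile." if mismatched.empty else mismatched)
```

Running the same check on the full `df` would flag any year where the components and the total disagree.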


## Part 4 — Summarization

In [23]:
category_col = 'Year'
numeric_cols = ['Demand Deposit', 'Saving Deposit', 'Time Deposit']

by_cat = df.groupby(category_col)[numeric_cols].sum().sort_values(by=numeric_cols, ascending=False)
by_cat.head(10)

Year,Demand Deposit,Saving Deposit,Time Deposit
2024,2579530,2868611,3124509
2023,2524880,2691261,3019807
2022,2429991,2637474,2864725
2021,2006916,2454070,2792277
2020,1580036,2196025,2689041
2019,1351901,1968855,2484323
2018,1213887,1847928,2395414
2017,1153733,1720959,2268299
2016,1065072,1569943,2114385
2015,937089,1411620,1986560
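
Raw sums make it hard to see how the mix of deposit types changes, so one option is to convert each year to percentage shares. This is a minimal sketch using the 2015 and 2024 rows from the summary output above; `mix` and `shares` are names I introduce here.

```python
import pandas as pd

# Figures copied from the 2015 and 2024 rows of the summary table above.
mix = pd.DataFrame(
    {
        "Demand Deposit": [937089, 2579530],
        "Saving Deposit": [1411620, 2868611],
        "Time Deposit": [1986560, 3124509],
    },
    index=pd.Index(["2015", "2024"], name="Year"),
)

# Divide each row by its own total to get the share of each deposit type.
shares = mix.div(mix.sum(axis=1), axis=0).mul(100).round(1)

print(shares)
```

Since each year appears only once in the data, the grouped sums equal the original values, so the same `div` pattern should work directly on `df[numeric_cols]`.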


## Part 5 — Visualization (Pandas + Plotly)

In [24]:
import plotly.graph_objects as go
from plotly.subplots import make_subplots

fig = make_subplots()
fig.add_trace(go.Scatter(x=df['Year'], y=df['Number of Account'], name='Number of Account', mode="lines"))
fig.show()

In [25]:
fig = make_subplots(specs=[[{"secondary_y": True}]])
fig.add_trace(
    go.Scatter(x=df['Year'], y=df['Number of Account'], name='Number of Account', mode="lines"),
    secondary_y=True,
)

fig.add_trace(
    go.Bar(x=df['Year'], y=df['Total Deposit'], name='Total Deposit'),
    secondary_y=False,
)

fig.update_layout(title_text="Number of Accounts and Total Deposit Over Time")
fig.update_xaxes(title_text="Year")
fig.update_yaxes(title_text="Total Deposit", secondary_y=False)
fig.update_yaxes(title_text="Number of Account", secondary_y=True)

fig.show()
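
The chart suggests deposits and accounts rise together. A rough way to put numbers on that is year-over-year growth plus a simple correlation. The sketch below uses only the 2010 to 2019 rows visible in the Part 2 table, so it does not cover the full 2010 to 2024 range.

```python
import pandas as pd

# Figures copied from the 2010-2019 rows of the table shown in Part 2.
sample = pd.DataFrame(
    {
        "Year": [2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019],
        "Total Deposit": [
            2304875, 2736415, 3163519, 3578208, 4013817,
            4335269, 4749400, 5142991, 5457228, 5805078,
        ],
        "Number of Account": [
            97204863, 101503564, 119917930, 147626510, 160881757,
            175994476, 199301222, 242396164, 275764037, 301697958,
        ],
    }
).set_index("Year")

# Year-over-year growth in percent (the first year has no prior value, so it is NaN).
growth = sample.pct_change().mul(100).round(1)

# True only if both series increase in every single year.
always_rising = bool((sample.diff().dropna() > 0).all().all())

# Pearson correlation between the two series.
corr = sample["Total Deposit"].corr(sample["Number of Account"])

print(growth)
print(f"Both rise every year: {always_rising}; correlation: {corr:.3f}")
```

A high correlation between two upward-trending series is expected and does not by itself show that one drives the other.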

In [None]:
import plotly.express as px

bar_df = by_cat.head(10).reset_index()
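# Plot all three deposit types side by side (numeric_cols is the list defined in Part 4).
fig = px.bar(bar_df, x=category_col, y=numeric_cols, barmode='group', title=f'Top 10 {category_col} (interactive)')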
fig.show()

## Part 6 — Reflection
- What needed adjusting when you switched datasets?
- What was easier/harder vs the workbook?
- What context/limitations should a policymaker know?


## Part 7 — ⚡ Lightning Chart Demos — Teach-Out Instructions

### What is a Lightning Demo?
A **rapid-fire mini-presentation**: 2 minutes per group. Show **one chart** + **one insight**, not your whole notebook.

### Demo Format (2 minutes)
1. **Context (15 sec):** Dataset name + 1-sentence purpose.  
2. **Chart + Insight (75–90 sec):** Show interactive Plotly chart; state one clear finding; name one wrangling step.  
3. **Caveat or Next Step (15–30 sec):** One limitation OR next idea.  
4. **Pass the mic (10 sec):** End with *“Any questions about our chart?”*

### Unique Angles
- Aggregation choice (sum vs mean, raw vs normalized).  
- Data cleaning step (conversion, deduping).  
- Encoding (bar vs line vs scatter).  
- Interactivity (facet, hover, filter).  
- Comparability (per capita vs raw counts).  
- Temporal nuance (trend vs snapshot).  

👉 If another group showed your angle, pick a different one.

### Flow for the Day
- Groups present **clustered by chart type** (bars, lines, scatter, other).  
- Instructor summarizes similarities/differences after each cluster.  
- 15 groups × 2 min = ~30 minutes; with transitions, ~40 minutes.

### Presenter Checklist
- [ ] Dataset URL loads in Colab  
- [ ] Columns chosen make sense  
- [ ] One **interactive Plotly figure** with title + labels  
- [ ] Insight stated in one sentence  
- [ ] One cleaning step mentioned  

### Audience Task
After each cluster, jot 1 sentence:  
- What was **similar** across charts?  
- What was **different** (aggregation, encoding, interactivity)?  
- Which chart best supported a **policy decision**—and why?

---
### Quick Rubric (3 pts)
- **Clarity of claim (1 pt)**  
- **Method fit (1 pt)**  
- **Insight/caveat (1 pt)**  
(Bonus +0.5 for meaningful interactivity)

---
### Minimal Code Patterns

**Top-N bar:**
```python
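import plotly.express as px

# category_col and numeric_col are placeholder names; set them to columns in your DataFrame.
bar_df = df.groupby(category_col)[numeric_col].sum().nlargest(10).reset_index()
fig = px.bar(bar_df, x=category_col, y=numeric_col,
             title=f"Top 10 {category_col} by {numeric_col}")
fig.show()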
```

**Time series:**
```python
ts = (df.groupby(time_col)[numeric_col]
        .sum()
        .reset_index()
        .sort_values(time_col))
fig = px.line(ts, x=time_col, y=numeric_col, markers=True,
              title=f"{numeric_col} over time")
fig.show()
```

**Scatter:**
```python
# trendline="ols" requires the statsmodels package to be installed.
fig = px.scatter(df, x=x_col, y=y_col, trendline="ols",
                 title=f"{y_col} vs {x_col}")
fig.show()
```

---

In [None]:
csv_url = "PASTE-RAW-CSV-URL-HERE"
csv_url