# Corruption Perception Index Dashboard #

## Concise Overview ##

This Python/Dash application visualizes both current (2024) and historical CPI data:

1. **Data Loading & Preprocessing**
   - Reads `CPI2024.csv` and `CPI-historical.csv`.
   - Imputes missing numeric values with the column mean, standardizes numeric features, and label-encodes categorical columns.

2. **Feature Engineering**
   - Computes quartile-based corruption levels.
   - Applies PCA to reduce numeric data to two components.
   - Selects the 5 numeric features with the highest ANOVA F-scores (`SelectKBest` with `f_classif`) against the corruption levels.

3. **Static Visualizations**
   - ***Histogram***: Distribution of CPI scores.
   - ***Scatter Plot***: CPI Score vs. Rank by region.
   - ***Choropleth Map***: 2024 CPI ranking by country.
   - ***PCA Plot***: 2D projection colored by corruption quartiles.
   - All use a dark Plotly template for consistency.

4. **Historical Trends**
   - ***Line Chart***: Global average CPI over years.
   - ***Text Block***: Average per-country change and the top 5 improvers and decliners, measured from each country's earliest to latest record.

5. **Animated Map**
   - ***Animated Choropleth***: Year-by-year CPI changes with slider controls.

6. **Dashboard Layout**
   - Dark-themed Bootstrap (`CYBORG`) with tabs for each section.
   - Uses `dcc.Graph`, `dash_table.DataTable`, and HTML components for text insights.

This concise structure highlights how corruption perception has evolved globally and allows interactive exploration of the data.

In [1]:
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
from dash import Dash, html, dcc, dash_table
import dash_bootstrap_components as dbc
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.decomposition import PCA
from sklearn.impute import SimpleImputer
from sklearn.feature_selection import SelectKBest, f_classif



In [2]:
cpi_current = pd.read_csv('CPI2024.csv')
cpi_current

Unnamed: 0,Country / Territory,ISO3,Region,CPI 2024 score,Rank,standard error 2024,Number of sources,Lower CI,Upper CI,African Development Bank CPIA,...,Economist Intelligence Unit Country Ratings,Freedom House Nations In Transit,S&P / Global Insights Country Risk Ratings,IMD World Competitiveness Yearbook,PERC Asia Risk Guide,PRS International Country Risk Guide,Varieties of Democracy Project,World Bank CPIA,World Economic Forum EOS,World Justice Project Rule of Law Index
0,Denmark,DNK,WE/EU,90,1,1.98,8,86.75,93.25,,...,83.0,,85.0,97.0,,100.0,75.0,,95.0,87.0
1,Finland,FIN,WE/EU,88,2,1.83,8,85.00,91.00,,...,83.0,,85.0,93.0,,96.0,74.0,,92.0,84.0
2,Singapore,SGP,AP,84,3,1.48,9,81.57,86.43,,...,83.0,,85.0,82.0,89.0,78.0,74.0,,100.0,84.0
3,New Zealand,NZL,AP,83,4,1.77,8,80.10,85.90,,...,83.0,,85.0,87.0,,96.0,74.0,,72.0,83.0
4,Luxembourg,LUX,WE/EU,81,5,2.00,7,77.72,84.28,,...,83.0,,72.0,77.0,,87.0,73.0,,92.0,79.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
175,Yemen,YEM,MENA,13,173,1.72,7,10.18,15.82,,...,18.0,,6.0,,,15.0,11.0,3.0,21.0,
176,Syria,SYR,MENA,12,177,1.83,5,9.00,15.00,,...,18.0,,6.0,,,15.0,14.0,,,
177,Venezuela,VEN,AME,10,178,1.27,8,7.92,12.08,,...,18.0,,6.0,20.0,,6.0,9.0,,7.0,8.0
178,Somalia,SOM,SSA,9,179,2.30,6,5.23,12.77,0.0,...,,,19.0,,,6.0,16.0,3.0,,


In [3]:
cpi_current.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 180 entries, 0 to 179
Data columns (total 22 columns):
 #   Column                                               Non-Null Count  Dtype  
---  ------                                               --------------  -----  
 0   Country / Territory                                  180 non-null    object 
 1   ISO3                                                 180 non-null    object 
 2   Region                                               180 non-null    object 
 3   CPI 2024 score                                       180 non-null    int64  
 4   Rank                                                 180 non-null    int64  
 5   standard error 2024                                  180 non-null    float64
 6   Number of sources                                    180 non-null    int64  
 7   Lower CI                                             180 non-null    float64
 8   Upper CI                                             180 non-null    f

In [4]:
cpi_historical = pd.read_csv('CPI-historical.csv')
cpi_historical

Unnamed: 0,Country / Territory,ISO3,Year,Region,CPI score,Rank,Standard error,Number of sources,Lower CI,Upper CI
0,Afghanistan,AFG,2012,AP,8,174,3.30,3,2,13
1,Afghanistan,AFG,2013,AP,8,175,3.30,3,3,13
2,Afghanistan,AFG,2014,AP,12,172,1.29,4,10,14
3,Afghanistan,AFG,2015,AP,11,166,3.49,4,5,17
4,Afghanistan,AFG,2016,AP,15,169,1.74,5,12,17
...,...,...,...,...,...,...,...,...,...,...
2307,Zimbabwe,ZWE,2020,SSA,24,157,1.35,9,22,26
2308,Zimbabwe,ZWE,2021,SSA,23,157,1.52,8,21,25
2309,Zimbabwe,ZWE,2022,SSA,23,157,1.53,8,20,26
2310,Zimbabwe,ZWE,2023,SSA,24,149,1.30,9,22,26


In [5]:
cpi_historical.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2312 entries, 0 to 2311
Data columns (total 10 columns):
 #   Column               Non-Null Count  Dtype  
---  ------               --------------  -----  
 0   Country / Territory  2312 non-null   object 
 1   ISO3                 2312 non-null   object 
 2   Year                 2312 non-null   int64  
 3   Region               2312 non-null   object 
 4   CPI score            2312 non-null   int64  
 5   Rank                 2312 non-null   int64  
 6   Standard error       2312 non-null   float64
 7   Number of sources    2312 non-null   int64  
 8   Lower CI             2312 non-null   int64  
 9   Upper CI             2312 non-null   int64  
dtypes: float64(1), int64(6), object(3)
memory usage: 180.8+ KB


In [6]:
# Average trend over time
def get_average_historical_figure():
    df_avg = cpi_historical.groupby('Year')['CPI score'].mean().reset_index()
    fig = px.line(
        df_avg,
        x='Year',
        y='CPI score',
        markers=True,
        title='Average Global CPI Score Over Time'
    )
    fig.update_traces(line=dict(color='lightgreen', width=3))
    fig.update_layout(template='plotly_dark')
    return fig

In [7]:
# Historical insights text block
def get_historical_insights_block():
    cpi_delta = (
        cpi_historical.sort_values('Year')
        .groupby('Country / Territory')
        # Rows are sorted by year, so 'first'/'last' are each country's earliest and latest scores
        .agg(start_score=('CPI score', 'first'), end_score=('CPI score', 'last'))
        .assign(change=lambda df: df['end_score'] - df['start_score'])
        .reset_index()
    )
    top_improved = cpi_delta.nlargest(5, 'change')
    top_declined = cpi_delta.nsmallest(5, 'change')
    avg_change = round(cpi_delta['change'].mean(), 2)

    return html.Div([
        html.H4('Historical Trends Insights', style={'color': 'white'}),
        html.P(f"On average, a country's CPI score changed by {avg_change} points between its earliest and latest records.", style={'color': 'lightgray'}),
        html.P(f"Top improving countries: {', '.join(top_improved['Country / Territory'])}", style={'color': 'lightgreen'}),
        html.P(f"Top declining countries: {', '.join(top_declined['Country / Territory'])}", style={'color': 'salmon'}),
    ], style={'padding': '1rem', 'backgroundColor': '#111'})
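
To make the per-country change calculation concrete, here is a small sketch on a toy frame. The countries `A` and `B` and their scores are hypothetical placeholders, and the snippet selects the score column before aggregating, which is a slight variation on the cell above rather than a copy of it.

```python
import pandas as pd

# Hypothetical toy data, deliberately out of chronological order
toy = pd.DataFrame({
    'Country / Territory': ['A', 'B', 'A', 'B', 'A'],
    'Year': [2014, 2012, 2012, 2014, 2013],
    'CPI score': [40, 50, 30, 45, 35],
})

delta = (
    toy.sort_values('Year')                         # earliest rows come first within each country
    .groupby('Country / Territory')['CPI score']
    .agg(start_score='first', end_score='last')     # named aggregation on a single column
    .assign(change=lambda d: d['end_score'] - d['start_score'])
    .reset_index()
)

avg_change = round(delta['change'].mean(), 2)
print(delta)
print(avg_change)
```

Country `A` moves from 30 to 40 and country `B` from 50 to 45, so the mean of the per-country changes is 2.5. This is an average over countries, not the change in the global average score.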

In [8]:
# Insight text from CPI 2024
def get_insights(data):
    avg_score = round(data['CPI 2024 score'].mean(), 2)
    best_country = data.loc[data['CPI 2024 score'].idxmax(), 'Country / Territory']
    worst_country = data.loc[data['CPI 2024 score'].idxmin(), 'Country / Territory']
    return (f"Average CPI Score: {avg_score}. Highest score: {best_country}. "
            f"Lowest score: {worst_country}. A higher CPI score means a better (lower) rank.")

In [9]:
# Processing for Dashboard (CPI2024)
df = cpi_current.copy()
num_cols = df.select_dtypes(include=np.number).columns
cat_cols = df.select_dtypes(include='object').columns

In [10]:
imputer = SimpleImputer(strategy='mean')
df[num_cols] = imputer.fit_transform(df[num_cols])

In [11]:
scaler = StandardScaler()
df[num_cols] = scaler.fit_transform(df[num_cols])
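
The imputation and scaling cells above can also be chained into a single scikit-learn pipeline. This is a sketch on a hypothetical two-column frame, shown as one possible arrangement rather than the notebook's method.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical numeric columns with gaps, standing in for the source-score columns
toy = pd.DataFrame({
    'a': [1.0, 2.0, np.nan, 4.0],
    'b': [10.0, np.nan, 30.0, 40.0],
})

# Mean imputation followed by standardization, fitted in one step
prep = make_pipeline(SimpleImputer(strategy='mean'), StandardScaler())
toy_scaled = pd.DataFrame(prep.fit_transform(toy), columns=toy.columns)
print(toy_scaled)
```

A mean-imputed entry ends up at 0 after standardization, so countries with many missing sources are pulled toward the origin in any later projection.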

In [12]:
le = LabelEncoder()
for col in cat_cols:
    df[col] = le.fit_transform(df[col].astype(str))
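
Because the loop above reuses one `LabelEncoder`, only the mapping for the last column remains available afterwards. If the original labels are needed again (for example as hover text), one option is to keep an encoder per column. The sketch below assumes a small frame whose values are taken from the `CPI2024.csv` preview; the `encoders` dictionary is an illustrative name.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

sample = pd.DataFrame({
    'Region': ['WE/EU', 'AP', 'WE/EU', 'SSA'],
    'ISO3': ['DNK', 'SGP', 'FIN', 'SOM'],
})

encoders = {}                 # one fitted encoder per categorical column
encoded = sample.copy()
for col in sample.columns:
    encoders[col] = LabelEncoder()
    encoded[col] = encoders[col].fit_transform(sample[col].astype(str))

# Recover the original labels from the integer codes
decoded_regions = encoders['Region'].inverse_transform(encoded['Region'])
print(encoded)
print(list(decoded_regions))
```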

In [13]:
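# qcut depends only on rank order, so binning the standardized scores yields the same
# quartiles as the raw scores; the lowest-scoring quartile is the most corrupt ('High').
df['Corruption_Level'] = pd.qcut(df['CPI 2024 score'], q=4, labels=['High', 'Moderate', 'Low', 'Very Low'])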
le_corr = LabelEncoder()
y = le_corr.fit_transform(df['Corruption_Level'])
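
The quartile labels are assigned after the scores have been standardized. The sketch below, on a hypothetical series of eight scores, checks that this ordering does not change which quartile each score falls into.

```python
import pandas as pd

# Hypothetical scores; the standardization matches StandardScaler (population std)
raw = pd.Series([10, 20, 30, 40, 50, 60, 70, 80], name='score')
scaled = (raw - raw.mean()) / raw.std(ddof=0)

labels = ['High', 'Moderate', 'Low', 'Very Low']   # lowest scores = highest corruption
levels_raw = pd.qcut(raw, q=4, labels=labels)
levels_scaled = pd.qcut(scaled, q=4, labels=labels)

print(pd.DataFrame({'score': raw, 'raw_bins': levels_raw, 'scaled_bins': levels_scaled}))
```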

In [14]:
pca = PCA(n_components=2)
X_pca = pca.fit_transform(df[num_cols])
df['PCA1'], df['PCA2'] = X_pca[:, 0], X_pca[:, 1]
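
A two-component projection is easier to interpret when the share of variance it retains is known. The sketch below uses synthetic data (an assumption, not the CPI columns) to show how `explained_variance_ratio_` can be read from a fitted `PCA`.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic data: two strongly correlated columns plus one independent column
rng = np.random.default_rng(0)
base = rng.normal(size=(100, 1))
X = np.hstack([
    base + 0.1 * rng.normal(size=(100, 1)),
    base + 0.1 * rng.normal(size=(100, 1)),
    rng.normal(size=(100, 1)),
])

X_std = StandardScaler().fit_transform(X)
pca_demo = PCA(n_components=2).fit(X_std)
ratios = pca_demo.explained_variance_ratio_   # fraction of total variance per component
print(ratios, ratios.sum())
```

In the notebook itself, the same attribute on `pca` would report how much of the variance in `df[num_cols]` the `PCA1`/`PCA2` plot captures.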

In [15]:
selector = SelectKBest(score_func=f_classif, k=5)
X_selected = selector.fit_transform(df[num_cols], y)
important_features = df[num_cols].columns[selector.get_support()].tolist()
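
`SelectKBest` with `f_classif` ranks each feature by a one-way ANOVA F-statistic across the classes. The sketch below uses synthetic features (the names `signal`, `noise_a`, and `noise_b` are hypothetical) to show how the scores behind the selection can be inspected.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import SelectKBest, f_classif

rng = np.random.default_rng(0)
y_demo = np.repeat([0, 1, 2, 3], 25)                  # four classes, like the quartile labels
X_demo = pd.DataFrame({
    'signal': y_demo + 0.1 * rng.normal(size=100),    # tracks the class closely
    'noise_a': rng.normal(size=100),                  # unrelated to the class
    'noise_b': rng.normal(size=100),
})

selector_demo = SelectKBest(score_func=f_classif, k=1).fit(X_demo, y_demo)
scores = pd.Series(selector_demo.scores_, index=X_demo.columns).sort_values(ascending=False)
selected = X_demo.columns[selector_demo.get_support()].tolist()
print(scores)
print(selected)
```

In the notebook, `num_cols` includes `CPI 2024 score` and `Rank`, from which the corruption levels are derived, so those two columns can be expected to rank near the top of `important_features`.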

In [16]:
# Figures
fig_dist = px.histogram(
    cpi_current,
    x='CPI 2024 score',
    nbins=30,
    title='Distribution of CPI Scores'
)
fig_dist.update_layout(template='plotly_dark')

fig_scatter = px.scatter(
    cpi_current,
    x='CPI 2024 score',
    y='Rank',
    color='Region',
    title='CPI Score vs Rank by Region',
    hover_name='Country / Territory'
)
fig_scatter.update_layout(template='plotly_dark')

fig_choropleth = px.choropleth(
    cpi_current,
    color='Rank',
    locations='ISO3',
    hover_name='Country / Territory',
    title='CPI 2024 Ranking',
    color_continuous_scale='Viridis'
)
fig_choropleth.update_layout(template='plotly_dark')

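fig_pca = px.scatter(
    df,
    x='PCA1',
    y='PCA2',
    color='Corruption_Level',
    # df's country column is label-encoded, so take readable names from cpi_current
    hover_name=cpi_current['Country / Territory'],
    title='PCA Projection of CPI Numeric Features',
    # Corruption_Level is categorical, so a continuous color scale would be ignored;
    # fix the legend order instead
    category_orders={'Corruption_Level': ['High', 'Moderate', 'Low', 'Very Low']}
)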
fig_pca.update_layout(template='plotly_dark')

fig_trend_all = get_average_historical_figure()

In [17]:
# Dash App
app = Dash(__name__, external_stylesheets=[dbc.themes.CYBORG])

app.layout = dbc.Container([
    html.H1(
        'Corruption Perception Index Dashboard',
        className='text-center my-4'),
    dcc.Tabs([
        dcc.Tab(label='Overview', children=[
            html.Br(),
            dcc.Markdown(
                get_insights(cpi_current),
                style={
                    'backgroundColor': '#222',
                    'padding': '1rem'
                }
            ),
            dcc.Graph(figure=fig_dist),
            dcc.Graph(figure=fig_scatter)
        ]),

        dcc.Tab(label='Choropleth Map', children=[
            html.Br(),
            dcc.Graph(figure=fig_choropleth)
        ]),

        dcc.Tab(label='PCA Visualization', children=[
            html.Br(),
            dcc.Graph(figure=fig_pca)
        ]),

        dcc.Tab(label='Data Table', children=[
            html.Br(),
            dash_table.DataTable(
                columns=[{'name': i, 'id': i} for i in cpi_current.columns],
                data=cpi_current.to_dict('records'),
                page_size=15,
                style_table={'overflowX': 'auto'},
                style_header={
                    'backgroundColor': '#222',
                    'color': 'white'
                },
                style_cell={
                    'backgroundColor': '#111',
                    'color': 'white'
                }
            )
        ]),

        dcc.Tab(label='Historical Trends', children=[
            html.Br(),
            html.P(
                'Average CPI Score Over Time',
                style={
                    'color': 'white',
                    'fontSize': '18px'
                }
            ),
            dcc.Graph(figure=fig_trend_all),
            get_historical_insights_block()
        ])
    ])
], fluid=True)

In [18]:
if __name__ == '__main__':
    app.run(debug=False)

Dash is running on http://127.0.0.1:8050/

 * Serving Flask app '__main__'
 * Debug mode: off


 * Running on http://127.0.0.1:8050
Press CTRL+C to quit