# Visualizing Cybersecurity Incidents
### Goal: transform numbers into impactful visuals.
### Uses:
* Plotly's Dash (for creating local dashboards)
* KaggleHub (for data)

### More about Dash:
* [Dash App Examples](https://plotly.com/examples/)
* [User Guides](https://dash.plotly.com/minimal-app)
* [More about Jupyter Support for Dash](https://github.com/plotly/jupyter-dash?tab=readme-ov-file)
* [Dash Bootstrap Themes](https://hellodash.pythonanywhere.com/adding-themes/color-modes)

Note: as of Dash v2.11, Jupyter support is built into the main Dash package.

## Environment Setup

In [None]:
# Installations
%pip install -q pandas dash kagglehub "plotly[express]" ipywidgets nbformat dash-bootstrap-components

Note: you may need to restart the kernel to use updated packages.


In [47]:
%pip install dash-bootstrap-templates

Collecting dash-bootstrap-templates
  Downloading dash_bootstrap_templates-2.1.0-py3-none-any.whl.metadata (17 kB)
Downloading dash_bootstrap_templates-2.1.0-py3-none-any.whl (100 kB)
Installing collected packages: dash-bootstrap-templates
Successfully installed dash-bootstrap-templates-2.1.0
Note: you may need to restart the kernel to use updated packages.


In [48]:
# Libraries
import kagglehub
import dash
# dcc (Dash Core Components) provides dcc.Graph, which renders interactive graphs.
from dash import Dash, html, dcc, callback, Output, Input
# plotly.express builds the interactive figures.
import plotly.express as px
import pandas as pd
import nbformat
import dash_bootstrap_components as dbc
import plotly.io as pio
from dash_bootstrap_templates import load_figure_template

In [None]:
# Data download
from pathlib import Path
# Download the latest dataset version; dataset_download returns the local directory.
path = kagglehub.dataset_download("atharvasoundankar/global-cybersecurity-threats-2015-2024")
df = pd.read_csv(Path(path) / "Global_Cybersecurity_Threats_2015-2024.csv")
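
The file name in the cell above is hardcoded, so the read fails if the dataset author renames the CSV. As a hedged alternative, the sketch below searches the downloaded directory for a single CSV file. `find_single_csv` is a hypothetical helper written for this notebook, not part of kagglehub.

```python
from pathlib import Path


def find_single_csv(directory):
    """Return the only CSV file under `directory`; raise if there is not exactly one."""
    csv_files = sorted(Path(directory).rglob("*.csv"))
    if len(csv_files) != 1:
        raise FileNotFoundError(
            f"Expected exactly one CSV in {directory}, found {len(csv_files)}"
        )
    return csv_files[0]
```

With this helper, the read becomes `df = pd.read_csv(find_single_csv(path))`.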

## Brief Data Exploration and Understanding

In [5]:
df.head(n=5)

| | Country | Year | Attack Type | Target Industry | Financial Loss (in Million $) | Number of Affected Users | Attack Source | Security Vulnerability Type | Defense Mechanism Used | Incident Resolution Time (in Hours) |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | China | 2019 | Phishing | Education | 80.53 | 773169 | Hacker Group | Unpatched Software | VPN | 63 |
| 1 | China | 2019 | Ransomware | Retail | 62.19 | 295961 | Hacker Group | Unpatched Software | Firewall | 71 |
| 2 | India | 2017 | Man-in-the-Middle | IT | 38.65 | 605895 | Hacker Group | Weak Passwords | VPN | 20 |
| 3 | UK | 2024 | Ransomware | Telecommunications | 41.44 | 659320 | Nation-state | Social Engineering | AI-based Detection | 7 |
| 4 | Germany | 2018 | Man-in-the-Middle | IT | 74.41 | 810682 | Insider | Social Engineering | VPN | 68 |


In [6]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3000 entries, 0 to 2999
Data columns (total 10 columns):
 #   Column                               Non-Null Count  Dtype  
---  ------                               --------------  -----  
 0   Country                              3000 non-null   object 
 1   Year                                 3000 non-null   int64  
 2   Attack Type                          3000 non-null   object 
 3   Target Industry                      3000 non-null   object 
 4   Financial Loss (in Million $)        3000 non-null   float64
 5   Number of Affected Users             3000 non-null   int64  
 6   Attack Source                        3000 non-null   object 
 7   Security Vulnerability Type          3000 non-null   object 
 8   Defense Mechanism Used               3000 non-null   object 
 9   Incident Resolution Time (in Hours)  3000 non-null   int64  
dtypes: float64(1), int64(3), object(6)
memory usage: 234.5+ KB


In [5]:
df.describe()

| | Year | Financial Loss (in Million $) | Number of Affected Users | Incident Resolution Time (in Hours) |
|---|---|---|---|---|
| count | 3000.0 | 3000.0 | 3000.0 | 3000.0 |
| mean | 2019.570333 | 50.49297 | 504684.136333 | 36.476 |
| std | 2.857932 | 28.791415 | 289944.084972 | 20.570768 |
| min | 2015.0 | 0.5 | 424.0 | 1.0 |
| 25% | 2017.0 | 25.7575 | 255805.25 | 19.0 |
| 50% | 2020.0 | 50.795 | 504513.0 | 37.0 |
| 75% | 2022.0 | 75.63 | 758088.5 | 55.0 |
| max | 2024.0 | 99.99 | 999635.0 | 72.0 |


In [6]:
df.nunique()

Country                                  10
Year                                     10
Attack Type                               6
Target Industry                           7
Financial Loss (in Million $)          2536
Number of Affected Users               2998
Attack Source                             4
Security Vulnerability Type               4
Defense Mechanism Used                    5
Incident Resolution Time (in Hours)      72
dtype: int64

In [5]:
df['Year'].value_counts()

Year
2017    319
2022    318
2023    315
2020    315
2018    310
2024    299
2021    299
2016    285
2015    277
2019    263
Name: count, dtype: int64
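
`value_counts()` orders the result by frequency, which makes trends over time hard to read. A minimal sketch of the chronological view follows; the toy series stands in for `df['Year']` and its values are illustrative only.

```python
import pandas as pd

# Toy stand-in for df["Year"]; the real column has 3,000 entries.
years = pd.Series([2017, 2015, 2017, 2016, 2017, 2015], name="Year")

# value_counts() sorts by count; sort_index() reorders the result by year.
counts_by_year = years.value_counts().sort_index()
print(counts_by_year)
```

On the real data the equivalent call would be `df['Year'].value_counts().sort_index()`.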

In [22]:
dff = df[df.Year == 2024]
dff.groupby(['Country']).sum(numeric_only=True).reset_index()

| | Country | Year | Financial Loss (in Million $) | Number of Affected Users | Incident Resolution Time (in Hours) |
|---|---|---|---|---|---|
| 0 | Australia | 68816 | 2046.45 | 15811664 | 1141 |
| 1 | Brazil | 70840 | 1844.04 | 21346703 | 1277 |
| 2 | China | 74888 | 1463.52 | 19273789 | 1388 |
| 3 | France | 52624 | 1334.94 | 14178073 | 890 |
| 4 | Germany | 58696 | 1369.53 | 16238260 | 959 |
| 5 | India | 60720 | 1418.26 | 12512325 | 1067 |
| 6 | Japan | 50600 | 1454.21 | 12756378 | 866 |
| 7 | Russia | 68816 | 2016.54 | 16431091 | 1204 |
| 8 | UK | 54648 | 1368.33 | 13198173 | 1103 |
| 9 | USA | 44528 | 1118.47 | 11334861 | 848 |
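
Note that `sum(numeric_only=True)` also adds up the `Year` column, so a value such as 68816 for Australia is just 2024 × 34 incidents and carries no meaning. One way to avoid this, sketched here on toy rows with made-up values, is to select the columns worth totalling before aggregating.

```python
import pandas as pd

# Toy rows mimicking the dataset's columns; the values are illustrative only.
toy = pd.DataFrame({
    "Country": ["UK", "UK", "USA"],
    "Year": [2024, 2024, 2024],
    "Financial Loss (in Million $)": [10.0, 5.5, 7.0],
    "Number of Affected Users": [100, 200, 300],
})

# Columns for which a per-country total is meaningful.
sum_columns = ["Financial Loss (in Million $)", "Number of Affected Users"]

totals = (
    toy[toy["Year"] == 2024]
    .groupby("Country", as_index=False)[sum_columns]
    .sum()
)
print(totals)
```

The same pattern should apply to `df` unchanged, since the column names match the dataset.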


## Dash App
(Local Dashboard Creation)

In [134]:
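# Sorted list of the years present in the data (for the year selector).
years = sorted(df.Year.unique().tolist())

# Use Plotly's built-in dark template for all figures.
# pio.templates.default = "cyborg" would only work after load_figure_template("cyborg").
pio.templates.default = "plotly_dark"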

app = Dash(__name__, external_stylesheets=[dbc.themes.CYBORG])

app.layout = html.Div([
    html.H1(
        "The Results of Cybersecurity Failures",
        style={
            "color": "#FFFFFF",
            "fontSize": "60px",
            "fontFamily": "Verdana",
            "textAlign": "center",
            "padding": "10px",
            "fontVariant": "small-caps",  # Dash style keys are camelCase
        },
    ),

    html.H2(
        "2015-2024",
        style={
            "color": "#efefef",
            "fontSize": "30px",
            "fontFamily": "sans-serif",
            "textAlign": "center",
            "padding": "0px",
            "backgroundColor": "purple",
        },
    ),

    # html.P("Select a year:"),
    # dcc.RadioItems(
    #     id='year',
    #     options=years,
    #     value=2024,
    #     inline=True
    # ),
    # dcc.Graph(id="map_graph", style={"width": "100%", "height": "700px"}),
])


# @app.callback(
#     Output("map_graph", "figure"),
#     Input("year", "value"))
# def display_choropleth(year):
#     dff = df[df.Year == year]
#
#     fig = px.choropleth(
#         dff.groupby(['Country']).sum(numeric_only=True).reset_index(),
#         locations='Country',                    # Column with country names
#         locationmode='country names',
#         color='Financial Loss (in Million $)',  # Column with data values
#         color_continuous_scale='Viridis',       # Color scale
#         projection='equirectangular',           # Map projection style
#     ).update_layout(title_x=0.4)
#
#     return fig


app.run(jupyter_mode="tab", debug=True)

Dash app running on http://127.0.0.1:8050/

