# Visualizing Cybersecurity Incidents
### Goal: transform numbers into impactful visuals.
### Uses:
* Plotly's Dash (for creating local dashboards)
* Hugging Face Hub (for data, loaded with pandas)

### More about Dash:
* [Dash App Examples](https://plotly.com/examples/)
* [User Guides](https://dash.plotly.com/minimal-app)
* [More about Jupyter Support for Dash](https://github.com/plotly/jupyter-dash?tab=readme-ov-file)
* [Dash Bootstrap Themes](https://hellodash.pythonanywhere.com/adding-themes/color-modes)

Note: as of Dash v2.11, Jupyter support is built into the main Dash package.

### Other data sets
https://huggingface.co/datasets/vinitvek/cybersecurityattacks


## Environment Setup

In [10]:
# Install dependencies (-q keeps pip's output quiet)
%pip install -q pandas dash "plotly[express]" ipywidgets nbformat dash-bootstrap-components fsspec huggingface_hub dash-bootstrap-templates

Note: you may need to restart the kernel to use updated packages.


In [141]:
# Libraries
import dash
# dcc (Dash Core Components) provides dcc.Graph, which renders interactive graphs.
from dash import Dash, html, dcc, callback, Output, Input, dash_table
# plotly.express builds the interactive figures passed to dcc.Graph.
import plotly.express as px
import pandas as pd
import nbformat
import dash_bootstrap_components as dbc
import plotly.io as pio
from dash_bootstrap_templates import load_figure_template

In [12]:
# Load the data set from Hugging Face with pandas (the hf:// path relies on huggingface_hub)
df = pd.read_csv("hf://datasets/vinitvek/cybersecurityattacks/collab dataset.csv")

## Brief Data Exploration, Understanding

In [13]:
df.head(n=5)

Unnamed: 0,slug,event_date,event_year,affected_country,affected_organization,affected_industry,afftected_industry_code,event_type,event_subtype,motive,description,actor,actor_type,actor_country,source_url
0,babb843cbce5db9e,2023-12-31 00:00:00,2023,United Kingdom of Great Britain and Northern I...,Radioactive Waste Management,Administrative and Support and Waste Managemen...,56,Undetermined,Undetermined,Undetermined,Threat actors try to break into Radioactive Wa...,Undetermined,Criminal,Undetermined,https://www.theguardian.com/business/2023/dec/...
1,581e011d5c37c281,2023-12-31 00:00:00,2023,Belarus,BelTA,Information,51,Disruptive,Undetermined,Protest,Belarusian hacktivists from the Belarusian Cyb...,Belarusian Cyber-Partisans,Hacktivist,Belarus,https://www.bankinfosecurity.com/hacktivists-s...
2,fa79c150aac3cf77,2023-12-30 00:00:00,2023,United States of America,Xerox Business Solutions,Administrative and Support and Waste Managemen...,56,Mixed,Exploitation of Application Server,Financial,The U.S. division of Xerox Business Solutions ...,INC Ransom,Criminal,Undetermined,https://www.bleepingcomputer.com/news/security...
3,4d12747a4dd52156,2023-12-30 00:00:00,2023,Iran (Islamic Republic of),SnappFood,Accommodation and Food Services,72,Mixed,Exploitation of Application Server,Financial,Irleaks claims to have broken into the systems...,Irleaks,Criminal,Undetermined,https://www.darkreading.com/cyberattacks-data-...
4,1079752e8fe90b4d,2023-12-29 00:00:00,2023,Canada,Memorial University of Newfoundland,Educational Services,61,Disruptive,Undetermined,Financial,Memorial University of Newfoundland (MUN) is h...,Undetermined,Criminal,Undetermined,https://www.bleepingcomputer.com/news/security...


In [14]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 13407 entries, 0 to 13406
Data columns (total 15 columns):
 #   Column                   Non-Null Count  Dtype 
---  ------                   --------------  ----- 
 0   slug                     13407 non-null  object
 1   event_date               13407 non-null  object
 2   event_year               13407 non-null  int64 
 3   affected_country         13407 non-null  object
 4   affected_organization    13407 non-null  object
 5   affected_industry        13407 non-null  object
 6   afftected_industry_code  13407 non-null  int64 
 7   event_type               13407 non-null  object
 8   event_subtype            13407 non-null  object
 9   motive                   13407 non-null  object
 10  description              13407 non-null  object
 11  actor                    13407 non-null  object
 12  actor_type               13407 non-null  object
 13  actor_country            13407 non-null  object
 14  source_url               13407 non-null  object
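
The `df.info()` output lists `event_date` as `object`, meaning the dates are held as plain strings. A minimal sketch of converting such a column to real datetimes follows. It uses a two-row toy frame named `sample` (an illustrative stand-in, not the real data) and assumes every value follows the `YYYY-MM-DD HH:MM:SS` pattern seen in `df.head()`.

```python
import pandas as pd

# Toy frame mimicking the string format seen in df.head(); values are illustrative.
sample = pd.DataFrame({"event_date": ["2023-12-31 00:00:00", "2023-12-30 00:00:00"]})

# Parse the strings into datetime64 values so the .dt accessors become available.
sample["event_date"] = pd.to_datetime(sample["event_date"], format="%Y-%m-%d %H:%M:%S")

# Example of a derived column that only works once the dates are parsed.
sample["event_month"] = sample["event_date"].dt.month

print(sample.dtypes)
```

Applied to the real frame, the same `pd.to_datetime` call would make it possible to group incidents by month or weekday, though that is not something the notebook does.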

In [15]:
df.describe()

Unnamed: 0,event_year,afftected_industry_code
count,13407.0,13407.0
mean,2019.703886,63.197434
std,2.803879,18.867469
min,2014.0,11.0
25%,2017.0,51.0
50%,2020.0,61.0
75%,2022.0,81.0
max,2023.0,99.0


In [16]:
df.nunique()

slug                       13407
event_date                  3130
event_year                    10
affected_country             163
affected_organization      12252
affected_industry             22
afftected_industry_code       42
event_type                     4
event_subtype                 86
motive                        10
description                11693
actor                       1135
actor_type                     6
actor_country                 82
source_url                 10768
dtype: int64

## Dash App
(Local Dashboard Creation)

In [28]:
# Count incidents per affected country for a single year (2023).
test = df[df.event_year == 2023]
done = test['affected_country'].value_counts().reset_index()
done

Unnamed: 0,affected_country,count
0,United States of America,1304
1,Italy,168
2,United Kingdom of Great Britain and Northern I...,105
3,Canada,60
4,Undetermined,55
...,...,...
92,Cambodia,1
93,Austria,1
94,Sri Lanka,1
95,Iraq,1


In [61]:
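import numpy as np

# Largest share of non-US incidents held by a single country, per year.
years = sorted(df.event_year.unique().tolist())
for year in years:
    test = df[df.event_year == year]
    done = test['affected_country'].value_counts().reset_index()
    # Exclude the United States before computing each country's share.
    done = done[done['affected_country'] != 'United States of America']
    done['Percent'] = np.round((done['count'] / done['count'].sum()) * 100, 2)
    print(done['Percent'].max())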

8.86
15.72
8.33
14.41
14.71
10.78
15.48
12.93
14.41
15.7
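
The per-year loop above can also be written as a single grouped computation. The sketch below is one possible equivalent, shown on a small toy frame named `toy` (hypothetical data, not the real counts). It assumes the same two columns, `event_year` and `affected_country`, and the same rule of dropping the United States before computing shares.

```python
import pandas as pd

# Toy stand-in for df; only the two columns the loop uses.
toy = pd.DataFrame({
    "event_year": [2022, 2022, 2022, 2022, 2023, 2023, 2023],
    "affected_country": [
        "United States of America", "Italy", "Italy", "Canada",
        "United States of America", "Canada", "Italy",
    ],
})

# Drop the United States, then count incidents per (year, country).
non_us = toy[toy["affected_country"] != "United States of America"]
counts = non_us.groupby(["event_year", "affected_country"]).size()

# Convert counts to a percentage of each year's non-US total.
yearly_total = counts.groupby(level="event_year").transform("sum")
percent = (counts / yearly_total * 100).round(2)

# Largest single-country share per year.
max_share = percent.groupby(level="event_year").max()
print(max_share)
```

This yields one Series indexed by year instead of printed values, which may be easier to plot or join back onto other tables.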


In [75]:
# Define your color palette and map it to countries
from itertools import cycle
from plotly.express.colors import qualitative

unique_countries = df['affected_country'].unique()
palette = cycle(qualitative.Alphabet)  # use a large qualitative palette

color_map = {country: color for country, color in zip(unique_countries, palette)}
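
One consequence of pairing `cycle` with `zip` is worth spelling out: `qualitative.Alphabet` holds 26 colors while the data has 163 distinct countries, so colors repeat across countries. The sketch below illustrates the wrap-around with a hypothetical three-color palette and five country names, both chosen only to keep the example short.

```python
from itertools import cycle

# Hypothetical three-color palette and five countries, chosen to force a wrap-around.
palette = ["#AA0DFE", "#3283FE", "#85660D"]
countries = ["Canada", "Italy", "Belarus", "Iraq", "Austria"]

# zip stops at the shorter iterable; cycle never ends, so every country gets a color.
color_map = dict(zip(countries, cycle(palette)))

# The fourth country reuses the first color because the palette has wrapped.
print(color_map["Canada"] == color_map["Iraq"])  # → True
```

If distinct colors per country matter, a continuous color scale keyed on the incident count may be a better fit than a discrete palette.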

In [47]:
import numpy as np

In [146]:
# Sorted list of years for the radio buttons.
years = sorted(df.event_year.unique().tolist())

# Use Plotly's built-in dark template for all figures.
pio.templates.default = "plotly_dark"

app = Dash(__name__, external_stylesheets=[dbc.themes.CYBORG])

app.layout = html.Div([
    html.H1("The Cybersecurity Failures", style={
            "color": "#FFFFFF",
            "fontSize": "60px",
            "fontFamily": "Verdana",
            "textAlign": "center",
            "marginTop": "10px",
            "marginBottom": "0px",
            "fontVariant": "small-caps"
        }),

    html.H2("Data collected from 2014-2023",
            style={
                "color": "#efefef",
                "fontSize": "25px",
                "fontFamily": "sans-serif",
                "textAlign": "center",
                "padding": "0px",
                "marginBottom": "25px",
                "backgroundColor": "#574E92"}
        ),

    html.Div([
        html.Div([
            dcc.RadioItems(
                id='event_year',
                options=years,
                value=2023,
                labelStyle={'display': 'inline-block', 'marginRight': '15px'}
            )
        ], style={'textAlign': 'center', 'marginTop': '10px'}),
        html.Div([
            dcc.Graph(id="map_graph", 
                style={
                    'padding': '0px',
                    'margin': '0px',
                    'height': '550px',  # Optional: control height directly
                    'width': '100%'     # Optional: make full width
                    }
                ),
        ], style={'width': '65%', 'display': 'inline-block', 'verticalAlign': 'top'}),
        
        html.Div([
            dash_table.DataTable(
                id='my-table',
                columns=[{"name": i, "id": i} for i in df.columns],
                data=df.to_dict('records'),
                style_table={'overflowX': 'auto'},
                style_cell={'padding': '5px', 'textAlign': 'left'}
            )
        ], style={'width': '34%', 'display': 'inline-block', 'verticalAlign': 'top', 'paddingLeft': '10px'}),
        
        html.Div(
            "Custom text info goes here.",
            style={
                'position': 'absolute',
                'top': '10px',
                'right': '50px',
                'backgroundColor': '#2E294F',
                'padding': '10px',
                'borderRadius': '5px',
                'boxShadow': '0 0 10px rgba(0,0,0,0.2)',
                'outlineStyle': 'outset',
            }
        )
    ], style={'position': 'relative'}),

])


@app.callback(
    Output("map_graph", "figure"),
    Input("event_year", "value"))
def display_choropleth(year):
    # Count incidents per affected country for the selected year.
    dff = df[df.event_year == year]
    dff = dff['affected_country'].value_counts().reset_index()

    fig = px.choropleth(
        dff,
        locations='affected_country',     # Column with country names
        locationmode='country names',
        color='affected_country',         # One discrete color per country
        hover_data={'count': True},       # Show the incident count on hover
        projection='equirectangular',     # Map projection style
        color_discrete_map=color_map
    )

    fig.update_layout(showlegend=False, margin=dict(l=0, r=0, t=0, b=0))

    return fig


app.run(jupyter_mode="inline", debug=True)
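
In the layout above, `my-table` always shows the full DataFrame, whatever year is selected. One possible extension, offered as an assumption rather than something this notebook does, is a second callback targeting `Output("my-table", "data")` that returns only the selected year's rows. The filtering step could look like the sketch below, where `rows_for_year` is a hypothetical helper name and `toy` is illustrative data.

```python
import pandas as pd


def rows_for_year(frame: pd.DataFrame, year: int) -> list[dict]:
    """Return one year's rows as a list of dicts, the shape DataTable's data prop accepts."""
    return frame[frame["event_year"] == year].to_dict("records")


# Toy stand-in for df with just enough columns to show the filtering.
toy = pd.DataFrame({
    "event_year": [2022, 2023, 2023],
    "affected_country": ["Italy", "Canada", "Belarus"],
})

rows = rows_for_year(toy, 2023)
print(rows)
```

Keeping the filter in a plain function makes it testable without starting the Dash server; the callback body would then only need to call it with `df` and the radio button value.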