# [DOCUMENTATION] MASHUP PHASE (b)
#### **[WHAT]**
This Jupyter Notebook analyses the mashup datasets for "EmpowerItaly", an open data project on the presence of foreign workers in Italy. <br>

This notebook covers the last part of the divergent phase of the double diamond. Some assumptions were already made while scraping the data; in this phase the main objective is to provide a potential answer to our research question. <br>

- **DEFINITION OF *ACTIVITY RATE*** : 
- **DEFINITION OF *UNEMPLOYMENT RATE*** : 
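
As a working reference for the two terms, the sketch below assumes the standard labour-force-survey definitions, which may differ in detail from the ones the project finally adopts: the activity rate is the labour force (employed plus unemployed) as a share of the working-age population, and the unemployment rate is the unemployed as a share of the labour force. The function names and the counts are hypothetical and only illustrate the arithmetic.

```python
def activity_rate(employed, unemployed, working_age_population):
    """Labour force (employed + unemployed) as a percentage of the working-age population."""
    labour_force = employed + unemployed
    return 100.0 * labour_force / working_age_population


def unemployment_rate(employed, unemployed):
    """Unemployed as a percentage of the labour force (employed + unemployed)."""
    labour_force = employed + unemployed
    return 100.0 * unemployed / labour_force


# Hypothetical counts, not taken from the EmpowerItaly data
employed, unemployed, working_age = 540, 60, 1000

act = activity_rate(employed, unemployed, working_age)
unemp = unemployment_rate(employed, unemployed)
print(f"activity rate: {act:.1f}%")
print(f"unemployment rate: {unemp:.1f}%")
```

Note that the two rates use different denominators, so a group can be both more active and more unemployed than another.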


#### **[HOW]**
This phase attempts to verify whether there is a correlation between ACTIVITY RATE and UNEMPLOYMENT RATE, both for foreigners and for native citizens.
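
One possible way to check for such a correlation is a Pearson coefficient computed separately for each citizenship group. The sketch below is only an illustration: the table is made up, and the column names `activity_rate` and `unemployment_rate` are placeholders rather than the actual columns of the mashup CSV.

```python
import pandas as pd

# Made-up illustrative values, not taken from the mashup dataset
toy = pd.DataFrame({
    "Citizenship": ["italian"] * 4 + ["foreign"] * 4,
    "activity_rate": [60.0, 62.0, 64.0, 66.0, 70.0, 71.0, 72.0, 73.0],
    "unemployment_rate": [12.0, 11.0, 10.0, 9.0, 14.0, 15.0, 16.0, 17.0],
})

# Pearson correlation between the two rates within each citizenship group
corr_by_group = {
    name: group["activity_rate"].corr(group["unemployment_rate"])
    for name, group in toy.groupby("Citizenship")
}

for name, r in corr_by_group.items():
    print(f"{name}: r = {r:+.2f}")
```

A coefficient close to +1 or -1 would point to a strong linear relation, but with few observations per group any such value should be read with caution.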

#### **[WHY]**
This first assumption caught our interest when we tried to visualize the data for this specific purpose: analyzing the relation between a person's activity and their actual employment condition (without gender distinction).

#### install packages

In [6]:
# install packages
!pip install plotly
!pip install chart_studio



In [7]:
# import packages
import pandas as pd
import numpy as np
import scipy as sp
import plotly.express as px
import chart_studio.plotly as py
import plotly.graph_objects as go

# INVESTIGATION NO.1 - MAIN
TOTAL ACTIVITY RATE X UNEMPLOYMENT RATE
Here the discriminant factor is the activity rate, i.e. the effort to search for opportunities - formally, the activity rate is [TBT definition on GDrive]

In [8]:
# WHAT: ANALYSIS OF THE TOTAL UNEMPLOYMENT X ACTIVITY RATE (i.e. WITHOUT PAYING ATTENTION TO THE EDUCATIONAL LEVEL)
# INSIGHT: The level of activity has more or less the same starting rate as the second level.
# The higher the educational level, the more clearly the unemployment rate shows an upper-bounded span in the line.
# Furthermore, we notice how in 2022 the unemployment rate drops and the activity rate rises significantly.

dunnoDf = pd.read_csv('https://raw.githubusercontent.com/openaccesstoimmigrants/openaccesstoimmigrants/main/vizEnvironment/dunnoMashup.csv')
# dunnoDf = dunnoDf.replace('italian',0)
# dunnoDf = dunnoDf.replace('foreign',1)

fig2 = px.scatter(
    dunnoDf, #dataframe
    x="Year", #year (2018-2022)
    y="total_y", #activity rate
    size="total_x", #bubble size, proportional to total_x (presumably the unemployment rate)
    color="Citizenship",#foreign/italian color relation
    color_continuous_scale=px.colors.sequential.Plotly3, #color theme
    marginal_x="box",
    title="Unemployment Rate x Activity Rate by Region (from 2018 to 2022)", #chart title
)
fig2.update_layout(
    xaxis_tickangle=30,#angle of the tick on x-axis
    title=dict(x=0.5), #set the title in center
    xaxis_tickfont=dict(size=9), #set the font for x-axis
    yaxis_tickfont=dict(size=9), #set the font for y-axis
    margin=dict(l=500, r=20, t=50, b=20), #set the margin
    paper_bgcolor="LightSteelblue", #set the background color for chart
)

# INVESTIGATION NO.1 - COLLATERAL STUDIES

In [9]:
# >>> COLLATERAL STUDY >>> LEVEL OF EDUCATION 1 >>>  NO TERRITORY DISCRIMINANT
# Here we compare, by year, the activity rate and the unemployment rate for the first level of education.
# NB: for the sake of the visualization, now the ACTIVITY RATE IS ON THE X AXIS.
# What do we notice? - Over the years foreigners have always been more active than natives, regardless of their unemployment rate.
# At the same time, zooming in on the boxplots we may notice some non-significant outliers in the red box, showing that some natives have also been more inactive and unemployed than the foreigners.

# ***see below for the comparison with the second and third educational levels.***

fig3 = px.scatter(dunnoDf, x="ACT_ED_1", y="UNEMP_ED_1", color="Citizenship", facet_col="Year",
                  marginal_x="box")
fig3.show()


In [10]:
# >>> COLLATERAL STUDY >>> LEVEL OF EDUCATION 2 >>>  NO TERRITORY DISCRIMINANT

fig4 = px.scatter(dunnoDf, x="ACT_ED_2", y="UNEMP_ED_2", color="Citizenship", facet_col="Year",
                  marginal_x="box")
fig4.show()


In [11]:
# >>> COLLATERAL STUDY >>> LEVEL OF EDUCATION 3 >>> NO TERRITORY DISCRIMINANT

fig = px.scatter(dunnoDf, x="ACT_ED_3", y="UNEMP_ED_3", color="Citizenship", facet_col="Year",
                  marginal_x="box")
fig.show()


In [12]:
# >>> COLLATERAL STUDY >>> UNEMPLOYMENT X ACTIVITY RATE ACCORDING TO THE FIRST LEVEL OF EDUCATION
# INSIGHT: In the Mezzogiorno area the unemployment rate is higher, but so is the activity rate.
# This result goes along with the Italian NEET trend (not in education, employment, or training), together with the general sentiment of not being an active part of society (disillusionment).
# Looking at the `unemployment rate` for the most basic (i.e. the lowest) level of education, we can see that, paradoxically, the unemployment rate is higher for the native/resident population.
# This suggests that Italians are potentially more 'desperate' in terms of unemployment than foreigners, while at the same time being less active.

dunnoDf = pd.read_csv('https://raw.githubusercontent.com/openaccesstoimmigrants/openaccesstoimmigrants/main/vizEnvironment/dunnoMashup.csv')

fig2 = px.scatter(
    dunnoDf, #dataframe
    x="Territory", #x
    y="UNEMP_ED_1", #y
    size="Year", #bubble size
    color="ACT_ED_1",#bubble color
    color_continuous_scale=px.colors.sequential.Plotly3, #color theme
    title="Unemployment Rate x Activity Rate by (Macro)Region", #chart title
)
fig2.update_layout(
    xaxis_tickangle=30,#angle of the tick on x-axis
    title=dict(x=0.5), #set the title in center
    xaxis_tickfont=dict(size=9), #set the font for x-axis
    yaxis_tickfont=dict(size=9), #set the font for y-axis
    margin=dict(l=500, r=20, t=50, b=20), #set the margin
    paper_bgcolor="LightSteelblue", #set the background color for chart
)

# FINAL CONSIDERATIONS (FOR NOW)

## > WE NEED
1. For comment and documentation purposes, we need the exact number of foreigners and natives per year (2018-2022).
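
Once a source with head counts is available, the yearly totals could be tabulated along these lines. This is only a sketch: the table, its `population` column, and all the numbers are hypothetical placeholders, since the mashup CSV is not known to carry absolute counts.

```python
import pandas as pd

# Hypothetical head counts per year, territory and citizenship (placeholder numbers)
counts = pd.DataFrame({
    "Year": [2018, 2018, 2018, 2018, 2019, 2019, 2019, 2019],
    "Territory": ["North", "North", "South", "South"] * 2,
    "Citizenship": ["italian", "foreign"] * 4,
    "population": [100, 10, 80, 5, 101, 11, 79, 6],
})

# One row per year, one column per citizenship group, summed over territories
per_year = counts.pivot_table(
    index="Year", columns="Citizenship", values="population", aggfunc="sum"
)
print(per_year)
```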


In [13]:
!pip install dash-bootstrap-components


In [None]:
#DASHBOARD BOILERPLATE
import dash
import dash_bootstrap_components as dbc
from dash import dcc, html  # the standalone dash_html_components package is deprecated since Dash 2.0
from dash.dependencies import Input, Output

In [None]:
app = dash.Dash(__name__)
app.layout = html.Div(
    children=[        html.Div(
            children=[
                html.P(children="🚓", style={'fontSize': "30px",'textAlign': 'center'}, className="header-emoji"), #emoji
                html.H1(
                    children="EmpowerItaly",style={'textAlign': 'center'}, className="header-title"
                ), #Header title
                html.H2(
                    children="Analyze the activity and unemployment rates"
                    " of foreign and native workers in Italy"
                    " between 2018 and 2022",
                    className="header-description", style={'textAlign': 'center'},
                ),
            ],
            className="header",style={'backgroundColor':'#F5F5F5'},
        ), #Description below the header


        html.Div(
            children=[
                html.Div(children = 'Year', style={'fontSize': "24px"},className = 'menu-title'),
                dcc.Dropdown(
                    id = 'year-filter',
                    options = [
                        {'label': Year, 'value':Year}
                        for Year in dunnoDf.Year.unique()
                    ], #'Year' is the filter
                    value = int(dunnoDf.Year.max()), #default to the most recent year in the data
                    clearable = False,
                    searchable = False,
                    className = 'dropdown', style={'fontSize': "24px",'textAlign': 'center'},
                ),
            ],
            className = 'menu',
        ), #the dropdown function

        html.Div(
            children=[
                html.Div(
                children = dcc.Graph(
                    id = 'scatter',
                    figure = fig2,
                  #  config={"displayModeBar": False},
                ),
                style={'width': '50%', 'display': 'inline-block'},
            ),
                html.Div(
                children = dcc.Graph(
                    id = 'bar',
                    figure = fig2,
                    #config={"displayModeBar": False},
                ),
                style={'width': '50%', 'display': 'inline-block'},
            ),
                html.Div(
                children = dcc.Graph(
                    id = 'bibar',
                    figure = fig3,
                    #config={"displayModeBar": False},
                ),
                style={'width': '50%', 'display': 'inline-block'},
            ),
                html.Div(
                children = dcc.Graph(
                    id = 'barscene',
                    figure = fig4,
                    #config={"displayModeBar": False},
                ),
                style={'width': '50%', 'display': 'inline-block'},
            ),
        ],
        className = 'double-graph',
        ),
    ]
) #Four graphs
app

In [None]:
# first
@app.callback(
    Output("scatter", "figure"), #the output is the scatterchart
    [Input("year-filter", "value")], #the input is the year-filter
)
def update_charts(Year):
    filtered_data = dunnoDf[dunnoDf["Year"] == Year] #the dataframe is filtered by "Year"
    scatter = px.scatter(
        filtered_data,
        x="UNEMP_ED_1", #column name as used in the collateral studies above
        y="Territory",
        size="total_y",
        color="Citizenship",
        color_continuous_scale=px.colors.sequential.Plotly3,
        title="Unemployment Rate (Education Level 1) by Territory",
    )
    )
    scatter.update_layout(
        xaxis_tickangle=30,
        title=dict(x=0.5),
        xaxis_tickfont=dict(size=9),
        yaxis_tickfont=dict(size=9),
        margin=dict(l=500, r=20, t=50, b=20),
        paper_bgcolor="LightSteelblue",
    )
    return scatter #return the scatterchart according to the filter

update_charts

In [None]:
# second
@app.callback(
    Output("bar", "figure"),
    [Input("year-filter", "value")],
)
def update_charts(Year):
    filtered_data = dunnoDf[dunnoDf["Year"] == Year]
    bar = px.bar(
        filtered_data,
        #NOTE: the "Total" column comes from the dashboard template and is not used anywhere else in this notebook
        x=filtered_data.groupby("UNEMP_ED_1")["Total"].agg(sum),
        y=filtered_data["UNEMP_ED_1"].unique(),
        color=filtered_data.groupby("UNEMP_ED_1")["Total"].agg(sum),
        color_continuous_scale=px.colors.sequential.RdBu,
        text=filtered_data.groupby("UNEMP_ED_1")["Total"].agg(sum),
        title="Total by UNEMP_ED_1",
        orientation="h",
    )
    bar.update_layout(
        title=dict(x=0.5), margin=dict(l=550, r=20, t=60, b=20), paper_bgcolor="#D6EAF8"
    )
    bar.update_traces(texttemplate="%{text:.2s}")
    return bar

@app.callback(
    Output("bibar", "figure"),
    [Input("year-filter", "value")],
)
def update_charts(Year):
    #keep only the rows for foreign citizens, then filter them by the selected year
    foreign_df = dunnoDf.loc[dunnoDf['Citizenship'] == 'foreign']
    filtered_dunnoDf = foreign_df[foreign_df["Year"] == Year]
    trace1 = go.Bar(
        x=filtered_dunnoDf["Location"].unique(),
        y=filtered_dunnoDf.groupby("Location")["Total"].agg(sum),
        text=filtered_dunnoDf.groupby("Location")["Total"].agg(sum),
        textposition="outside",
        marker_color=px.colors.qualitative.Dark24[0],
        name="Resolved",
    )
    trace2 = go.Bar(
        x=filtered_dunnoDf["Location"].unique(),
        y=filtered_dunnoDf.groupby("Location")["Total"].agg(sum),
        text=filtered_dunnoDf.groupby("Location")["Total"].agg(sum),
        textposition="outside",
        marker_color=px.colors.qualitative.Dark24[1],
        name="Unresolved",
    )
    data = [trace1, trace2]
    layout = go.Layout(barmode="group", title="Resolved vs Unresolved")
    bibar = go.Figure(data=data, layout=layout)
    bibar.update_layout(
        title=dict(x=0.5),
        xaxis_title="District",
        yaxis_title="Total",
        paper_bgcolor="aliceblue",
        margin=dict(l=20, r=20, t=60, b=20),
    )
    bibar.update_traces(texttemplate="%{text:.2s}")
    return bibar

In [None]:
# third

@app.callback(
    Output("barscene", "figure"),
    [Input("year-filter", "value")],
)
def update_charts(Year):
    filtered_data = dunnoDf[dunnoDf["Year"] == Year]
    barscene = px.bar(
        filtered_data,
        x=filtered_data.groupby("Scene")["Total"].agg(sum),
        y=filtered_data["Scene"].unique(),
        labels={"x": "Total Recorded", "y": "Scene"},
        color=filtered_data.groupby("Scene")["Total"].agg(sum),
        color_continuous_scale=px.colors.sequential.Sunset,
        # color_discrete_sequence=['rgb(253,180,98)','rgb(190,186,218)'],
        text=filtered_data.groupby("Scene")["Total"].agg(sum),
        title="Recorded Crime by Scene",
        # ,barmode = 'group'
        orientation="h",
    )
    barscene.update_layout(title=dict(x=0.5), paper_bgcolor="#BDBDBD")
    barscene.update_traces(texttemplate="%{text:.2s}")
    return barscene

In [None]:
app.run(jupyter_mode='inline') #requires Dash >= 2.11; run_server(mode='inline') belongs to the older JupyterDash class
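
The callbacks above all start by filtering `dunnoDf` on the value selected in the dropdown. A dropdown value can arrive as a string while the `Year` column holds integers, in which case the comparison silently matches nothing and the chart comes back empty. A small helper makes the coercion explicit; the name `filter_by_year` and the toy table are hypothetical and only illustrate the idea.

```python
import pandas as pd


def filter_by_year(df, year):
    """Return the rows of df whose Year equals the selected year (given as str or int)."""
    return df[df["Year"] == int(year)]


# Toy data standing in for dunnoDf; Year is stored as integers
toy = pd.DataFrame({"Year": [2018, 2019, 2019], "value": [1.0, 2.0, 3.0]})

# With coercion, the string "2019" matches the integer years
print(len(filter_by_year(toy, "2019")))

# Without coercion, comparing integers to a string matches no rows
print(len(toy[toy["Year"] == "2019"]))
```

Each callback could then call the helper on its first line instead of repeating the comparison.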