<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Visualization" data-toc-modified-id="Visualization-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Visualization</a></span><ul class="toc-item"><li><span><a href="#How-successful-is-the-company-in-connecting-lenders-and-borrowers?" data-toc-modified-id="How-successful-is-the-company-in-connecting-lenders-and-borrowers?-1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>How successful is the company in connecting lenders and borrowers?</a></span><ul class="toc-item"><li><span><a href="#To-note" data-toc-modified-id="To-note-1.1.1"><span class="toc-item-num">1.1.1&nbsp;&nbsp;</span>To note</a></span></li></ul></li><li><span><a href="#Project-success-rate-by-location" data-toc-modified-id="Project-success-rate-by-location-1.2"><span class="toc-item-num">1.2&nbsp;&nbsp;</span>Project success rate by location</a></span></li><li><span><a href="#Who-is-receiving-how-much-funding?" data-toc-modified-id="Who-is-receiving-how-much-funding?-1.3"><span class="toc-item-num">1.3&nbsp;&nbsp;</span>Who is receiving how much funding?</a></span><ul class="toc-item"><li><span><a href="#To-note" data-toc-modified-id="To-note-1.3.1"><span class="toc-item-num">1.3.1&nbsp;&nbsp;</span>To note</a></span></li></ul></li><li><span><a href="#Is-there-a-difference-in-funding-rate-between-borrower-types?" 
data-toc-modified-id="Is-there-a-difference-in-funding-rate-between-borrower-types?-1.4"><span class="toc-item-num">1.4&nbsp;&nbsp;</span>Is there a difference in funding rate between borrower types?</a></span><ul class="toc-item"><li><span><a href="#To-note" data-toc-modified-id="To-note-1.4.1"><span class="toc-item-num">1.4.1&nbsp;&nbsp;</span>To note</a></span></li></ul></li></ul></li><li><span><a href="#Dashboard" data-toc-modified-id="Dashboard-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Dashboard</a></span><ul class="toc-item"><li><span><a href="#Dashboard-No.1" data-toc-modified-id="Dashboard-No.1-2.1"><span class="toc-item-num">2.1&nbsp;&nbsp;</span>Dashboard No.1</a></span></li><li><span><a href="#Dashboard-No.-2" data-toc-modified-id="Dashboard-No.-2-2.2"><span class="toc-item-num">2.2&nbsp;&nbsp;</span>Dashboard No. 2</a></span></li></ul></li></ul></div>

In [1]:
# import required libraries

# data import and wrangling
import numpy as np
import pandas as pd 

#visualization
import plotly.express as px 

# for dash-app
#!pip install jupyter-dash
from jupyter_dash import JupyterDash
from dash import dcc
from dash import html
from dash.dependencies import Input, Output
from dash import no_update

from plotly.offline import init_notebook_mode, iplot 
init_notebook_mode(connected=True)

In [2]:
# import data

df = pd.read_pickle('df_crowdsourcing_after_preprocessing.pkl')

In [3]:
df.head()

Unnamed: 0,funded_amount,loan_amount,activity,sector,country_code,country,currency,term_in_months,lender_count,borrower_genders,repayment_interval,borrower_type,borrower_type_general,%funded,funding_status,invested_per_lender,continent_code,continent
0,300.0,300.0,Fruits & Vegetables,Food,PK,Pakistan,PKR,12.0,12,female,irregular,Individual Female,female,1.0,fully funded,25.0,AS,Asia
1,575.0,575.0,Rickshaw,Transportation,PK,Pakistan,PKR,11.0,14,"female, female",irregular,Small female group,female,1.0,fully funded,41.1,AS,Asia
2,150.0,150.0,Transportation,Transportation,IN,India,INR,43.0,6,female,bullet,Individual Female,female,1.0,fully funded,25.0,AS,Asia
3,200.0,200.0,Embroidery,Arts,PK,Pakistan,PKR,11.0,8,female,irregular,Individual Female,female,1.0,fully funded,25.0,AS,Asia
4,400.0,400.0,Milk Sales,Food,PK,Pakistan,PKR,14.0,16,female,monthly,Individual Female,female,1.0,fully funded,25.0,AS,Asia


## Visualization


A first impression of the project locations:

In [4]:
# Prepare data
df_group_perc_funded = df.groupby(by="country", as_index=False).agg(func={"%funded": "mean","funded_amount":"count"})

In [5]:
# Overview of project locations
fig_success = px.scatter_geo(
                        round(df_group_perc_funded.loc[df_group_perc_funded['%funded']>0.1, :],2), 
                        locations="country",
                        locationmode="country names",
                        color="%funded",
                        color_continuous_scale="RdYlGn",
                        hover_name="country", 
                        size="funded_amount", 
                        projection="natural earth",
                        title="# projects and funding rate by country",
                        labels={"%funded":"Funding rate in %", "funded_amount":"# projects", "country":"Country"}
                        )

fig_success.update_layout(title_x=0.45) 
fig_success.show()

**To note**

- Investments are spread over multiple continents; Australia is not present in the data.
- Based on its unusually low funding rate and its number of projects, the USA stands out; it also differs from the other countries in characteristics that are not contained in the dataset, e.g., GDP.
- Most projects are in the Philippines and Kenya.
- The mean funding rate in the Philippines is 99%.
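
The observations above can be checked directly on a per-country summary. The following is a minimal sketch, assuming the column names `country`, `funded_amount` and `%funded` shown in `df.head()`; the helper name `summarize_countries` and the toy rows are made up for illustration and are not values from the dataset.

```python
import pandas as pd


def summarize_countries(loans: pd.DataFrame) -> pd.DataFrame:
    """Return the number of projects and the mean funding rate per country."""
    return (
        loans.groupby("country", as_index=False)
        .agg(n_projects=("funded_amount", "count"),
             mean_rate=("%funded", "mean"))
        .sort_values("n_projects", ascending=False)
        .reset_index(drop=True)
    )


# Toy rows (hypothetical values, only used to show the mechanics)
toy_loans = pd.DataFrame({
    "country": ["Philippines", "Philippines", "Kenya", "United States"],
    "funded_amount": [300.0, 250.0, 400.0, 1000.0],
    "%funded": [1.0, 1.0, 0.8, 0.5],
})

country_summary = summarize_countries(toy_loans)
print(country_summary)
```

Calling `summarize_countries(df)` on the real data would list the countries with the most projects first, next to their mean funding rate.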


### How successful is the company in connecting lenders and borrowers?
The company acts as an intermediary between lenders, who want to invest in and support a good cause, and borrowers, who need funding for their business or their personal development goals. The crowdfunding company profits from each investment placed on its website. Because an audit conducted by the lending platform is a prerequisite for borrowers to place their investment on the website, it is assumed that borrowers are more likely to use the platform and go through the audit process if their chances of getting funded are high. Hence, it is assumed that the higher the percentage of successfully funded projects, the better the company's position in the market and the more attractive the company becomes for potential borrowers. The company therefore has an intrinsic motivation to ensure that the right lenders and borrowers are connected and that projects published on the website not only support borrowers but also have a fair chance of getting fully funded.

The next step prepares the data to visualize how many investments get fully funded and how many do not, including the intermediate funding levels.
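
The `funding_status` column was created during preprocessing, which is not part of this notebook. As a rough sketch of how such levels could be derived from the `%funded` ratio: the status labels are the ones used in the plots, but the exact thresholds and the helper name `funding_status_from_rate` are assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def funding_status_from_rate(rate: pd.Series) -> pd.Series:
    """Map a funding ratio (funded_amount / loan_amount) to a status label.

    The bin edges are assumed; the original preprocessing may differ.
    """
    # np.select picks the label of the first condition that is True
    conditions = [
        rate <= 0,
        rate <= 0.2,
        rate <= 0.4,
        rate <= 0.6,
        rate <= 0.8,
        rate < 1,
        rate == 1,
    ]
    labels = [
        "not funded",
        "up to 20% funded",
        ">20% funded",
        ">40% funded",
        ">60% funded",
        ">80% funded",
        "fully funded",
    ]
    status = np.select(conditions, labels, default="overfunded")
    return pd.Series(status, index=rate.index, name="funding_status")


toy_rate = pd.Series([0.0, 0.1, 0.5, 0.95, 1.0, 1.2])
toy_status = funding_status_from_rate(toy_rate)
print(toy_status.tolist())
```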

In [6]:
# create new dataframe
df_group_funding_status = df.groupby(by='funding_status', as_index=False).agg(count=('funding_status','count'))

# format column
df_group_funding_status['%status']  = round((df_group_funding_status['count'] / df_group_funding_status['count'].sum())*100,2).astype(str)+'%'

In [7]:
#define treemap plot
funding_status_treemap = px.treemap(
    data_frame=df_group_funding_status,
    path=['funding_status'],
    values='count',
    color='funding_status',
    color_discrete_map={
        'fully funded': 'darkseagreen',
        'not funded': 'salmon',
        'up to 20% funded': 'peachpuff',
        '>20% funded': 'moccasin',
        '>40% funded': 'palegoldenrod',
        '>60% funded': 'darkkhaki',
        '>80% funded': 'olivedrab',
        'overfunded': 'darkolivegreen'
    },
    custom_data=df_group_funding_status[['funding_status', '%status']],
    title='Overview of funding status')

# set plots background color
funding_status_treemap.update_traces(root_color='white')

# adjust treemap layout
funding_status_treemap.update_layout(title='<b>Status of loans in 2016</b>',
                                     title_x=0.03,
                                     title_y=0.95,
                                     title_font_color='#283747',
                                     hoverlabel=dict(font_size=13,
                                                     font_family="Calibri"),
                                     paper_bgcolor="#d6dbdf",
                                     margin=dict(t=50, l=25, r=25, b=25))

# adjust hover text
funding_status_treemap.data[0].hovertemplate = (
    '<b>%{customdata[1]}</b> of loans <br>were <b>%{customdata[0]}.</b>'
    '<br>' + '<br>')

# add annotation
funding_status_treemap.add_annotation(x=1.0051,
                                      y=-0.045,
                                      showarrow=False,
                                      font_size=8)

#display graph
funding_status_treemap.show()

#### To note
The treemap shows that the company's business model appears to have been healthy in 2016, as approx. 93% of the investments published on the website were fully funded. This indicates that the company is successful in connecting borrowers to lenders who are willing to invest their money to support another person or group.

Potential areas to look into are differences in project success 
- by location
- by gender
- by project category (agriculture vs retail etc.)
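
Each of these breakdowns follows the same pattern: the share of fully funded loans per value of one column. A minimal sketch, assuming the `funding_status` label `'fully funded'` from the data above; the helper name `fully_funded_share_by` and the toy rows are hypothetical.

```python
import pandas as pd


def fully_funded_share_by(loans: pd.DataFrame, column: str) -> pd.Series:
    """Return the percentage of fully funded loans per value of `column`."""
    is_fully_funded = loans["funding_status"].eq("fully funded")
    # The mean of a boolean Series is the share of True values
    return is_fully_funded.groupby(loans[column]).mean().mul(100).round(2)


# Toy rows (hypothetical values)
toy_loans = pd.DataFrame({
    "sector": ["Food", "Food", "Arts"],
    "funding_status": ["fully funded", "not funded", "fully funded"],
})

share_by_sector = fully_funded_share_by(toy_loans, "sector")
print(share_by_sector)
```

On the real data, the same helper could be called with `'country'`, `'borrower_type_general'` or `'sector'` as the column.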

### Project success rate by location

To get a better overview of the projects and the borrowers' locations, the data will be visualized geographically.

In [8]:
df_country_funding = df.groupby(by='country', as_index=False).agg(mean_fulfilled=('%funded', 'mean'))

# Adjust numbers for label in next visual
df_country_funding['mean_fulfilled'] = df_country_funding['mean_fulfilled']*100


In [9]:
# choropleth plot showing the funding success rates
fig = px.choropleth(df_country_funding, 
                    locations="country",
                    locationmode="country names", 
                    labels={'mean_fulfilled':'Funding rate [%]'}, 
                    color="mean_fulfilled",
                    color_continuous_scale="RdYlGn",
                    title='Mean of funding rate in % by country',
                    range_color=(70,100),
                    width=1000
                    )
# display figure
fig.show()

**To note**

- average funding rate in the US (approx. 70%) is significantly lower than in other countries (> 85%)

Usually, the next step would be to dive deeper into the potential reasons. In this project, however, I will instead focus on differences in funding rate by borrower type.

### Who is receiving how much funding? 

Knowing that most investment requests on the platform receive the entire amount that they applied for, the next graph aims to derive some insights into the type of borrower on the platform. This includes dividing borrowers into categories to better understand which type (based on gender and number of borrowers) receives how much funding.

The borrower dimension 'borrower gender' was picked because the company mentions that part of its mission is to support female-owned businesses; focusing on this dimension makes it possible to assess whether that claim is supported by the data. Moreover, lenders on the platform can choose which investment they want to support. If the achieved funding amount and/or rate differs between the sexes, it could indicate that the gender of the borrower influences the funding success of an investment.

In the following graph 
- 'small group' is defined as 2 to 5 individuals
- 'large group' is defined as more than 5 individuals.
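
The category labels were assigned during preprocessing. As a sketch of how the definitions above could be applied to the `borrower_genders` string: the function name `classify_borrower` and the parsing rules are assumptions, and only the thresholds (2 to 5, more than 5) and the label spelling come from this notebook.

```python
def classify_borrower(borrower_genders: str) -> str:
    """Derive a borrower type such as 'Small female group' from a
    comma-separated gender string such as 'female, female'."""
    genders = [g.strip() for g in borrower_genders.split(",") if g.strip()]
    if not genders:
        raise ValueError("borrower_genders must contain at least one entry")

    # Gender composition of the borrower(s)
    unique_genders = set(genders)
    if unique_genders == {"female"}:
        general = "female"
    elif unique_genders == {"male"}:
        general = "male"
    else:
        general = "mixed"

    # Group size: 1 = individual, 2 to 5 = small group, more than 5 = large group
    n_borrowers = len(genders)
    if n_borrowers == 1:
        return f"Individual {general.capitalize()}"
    size = "Small" if n_borrowers <= 5 else "Large"
    return f"{size} {general} group"


print(classify_borrower("female"))
print(classify_borrower("female, female"))
print(classify_borrower("male, female, male, male, male, female"))
```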


Please note, it is not possible to prove a causal relationship between gender and funding rate nor is it aimed for in the following visualization. Other factors such as country, sector or a factor that is not documented in this dataset might be the real cause of differences in the success rate.  

In [10]:
# prepare the data

df_group_loanamount = df.groupby(by=['borrower_type','borrower_type_general'],as_index=False).agg(sum_funded=('funded_amount','sum'),count_borrower=('borrower_type','count'))

In [11]:

#define scatter plot
loan_group_scatter = px.scatter(
    df_group_loanamount,
    y="count_borrower",
    x="sum_funded",
    color="borrower_type_general",
    color_discrete_map={
        'female': 'deeppink',
        'male': 'blue',
        'female and male': 'darkorange'
    },
    symbol="borrower_type",
    symbol_map={
        'Individual Female': 'circle',
        'Individual Male': 'circle',
        'Large female group': 'square',
        'Large male group': 'square',
        'Large mixed group': 'square',
        'Small female group': 'diamond',
        'Small male group': 'diamond',
        'Small mixed group': 'diamond'
    },
    labels={
        'borrower_type': 'Type of borrower',
        'count_borrower': '# borrowers',
        'sum_funded': 'Total amount funded',
        'borrower_type_general': 'Gender',
        
    },
    
    custom_data=df_group_loanamount[[
        'count_borrower', 'sum_funded', 'borrower_type_general',
        'borrower_type'
    ]])

# set marker size of the symbols
loan_group_scatter.update_traces(marker_size=8)

# adjust scatterplot's layout
loan_group_scatter.update_layout(
    title='<b>Number of loans and funding amount per borrower type in 2016</b>',
    xaxis_title='Total funding in USD',
    yaxis_title='# of borrowers',
    title_x=0.06,
    title_y=0.95,
    template='simple_white',
    font_size=10,
    title_font_color='#283747',
    hoverlabel=dict(font_size=13, font_family="Calibri"),
    paper_bgcolor="#d6dbdf",
    margin=dict(t=50, l=25, r=25, b=25))


#display scatter plot
loan_group_scatter.show()

#### To note

The scatterplot shows the number of borrowers and the total amount of funding in USD per borrower group. Based on the assumption that the company's revenue depends on both the number of investments and their individual amounts (the more investments and the higher the funding, the higher the revenue), the optimal point on the scatterplot is the far upper right corner. Borrower types in this corner not only place many investment requests on the website but also receive high individual investment amounts.

As can be seen in the scatterplot, the borrower category with by far the highest funding amount and the highest number of borrowers is 'individual female'. In contrast, the category with the lowest funding amount and the lowest number of borrowers is 'large male group'. This supports the company's claim that female-owned businesses are one of its target borrower groups.

It can also be observed that individuals are the most common borrower group and also receive the largest share of the overall funding. Moreover, based on their position in the scatterplot, it seems that on average male individuals receive more funding than female individuals.

Additionally, the average funded amount for large mixed and large female groups seems to be the highest.

To summarize

- individuals are the company's largest customer group
- most borrowers are female
- average funded sum (defined as the amount funded divided by number of borrowers) is highest for large female and large mixed groups. 
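
The average funded sum used in the last point can be computed from the aggregated frame. A minimal sketch, assuming the columns `sum_funded` and `count_borrower` created in the preparation cell above; the toy values and the new column name `avg_funded` are made up for illustration.

```python
import pandas as pd

# Toy aggregate with the same columns as df_group_loanamount (hypothetical values)
toy_groups = pd.DataFrame({
    "borrower_type": ["Individual Female", "Large mixed group"],
    "sum_funded": [1000.0, 900.0],
    "count_borrower": [10, 3],
})

# Average funded sum = total amount funded / number of borrowers
toy_groups["avg_funded"] = toy_groups["sum_funded"] / toy_groups["count_borrower"]

print(toy_groups.sort_values("avg_funded", ascending=False))
```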

Based on the correlation table, displayed above, it can be derived that on average large mixed and large female groups also apply for the highest loans.



If desired, there seems to be room to expand investments of male borrowers. For instance, if the company does not plan to expand geographically any further, it could grow in regions where borrowers are already present by targeting male borrowers and encouraging them to place an investment on the platform.

However, to assess whether this is advisable, the funding success rate per gender category is analyzed next. If men have a lower funding rate (funding amount / loan amount) than women, it might not be advisable to include more male-owned businesses on the platform, as the current overall funding rate needs to be maintained or, if possible, increased in order to stay attractive for new borrowers.
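
The funding rate mentioned above can be compared between gender categories with a short aggregation. This is a sketch under the assumption that the rate is computed per loan and then averaged per group; the helper name `mean_funding_rate` and the toy rows are hypothetical.

```python
import pandas as pd


def mean_funding_rate(loans: pd.DataFrame,
                      by: str = "borrower_type_general") -> pd.Series:
    """Return the mean per-loan funding rate (funded / requested) per group."""
    rate = loans["funded_amount"] / loans["loan_amount"]
    return rate.groupby(loans[by]).mean()


# Toy rows (hypothetical values)
toy_loans = pd.DataFrame({
    "borrower_type_general": ["female", "female", "male"],
    "funded_amount": [300.0, 100.0, 150.0],
    "loan_amount": [300.0, 200.0, 300.0],
})

rate_by_gender = mean_funding_rate(toy_loans)
print(rate_by_gender)
```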

### Is there a difference in funding rate between borrower types?

To assess whether the funding rate differs between men and women, the loan status is shown per borrower type as a percentage of all loans of that type. Normalizing the data is necessary because the number of female borrowers exceeds the number of male borrowers by far.

In [12]:
#prepare the data

# count loans per borrower type and funding status
df_funding_borrower_stacked = (
    df.groupby(['borrower_type', 'funding_status'])
      .size()
      .reset_index(name='Counts')
)

# share of each funding status within its borrower type, in percent
# (transform avoids the group_keys FutureWarning raised by groupby().apply())
df_funding_borrower_stacked['Percentage'] = (
    100 * df_funding_borrower_stacked['Counts']
    / df_funding_borrower_stacked.groupby('borrower_type')['Counts'].transform('sum')
).round(2)



In [13]:
# sort values

df_funding_borrower_stacked.sort_values('Percentage',ascending=False, inplace=True)

In [14]:
#define bar plot

borrower_type_bar = px.bar(
    df_funding_borrower_stacked,
    x='borrower_type',
    y='Percentage',
    color='funding_status',
    barmode='stack',
    color_discrete_map={
        'fully funded': 'darkseagreen',
        'not funded': 'salmon',
        'up to 20% funded': 'peachpuff',
        '>20% funded': 'moccasin',
        '>40% funded': 'palegoldenrod',
        '>60% funded': 'darkkhaki',
        '>80% funded': 'olivedrab',
        'overfunded': 'darkolivegreen'
    },
    text=df_funding_borrower_stacked['Percentage'].astype(str) + '%',
    custom_data=df_funding_borrower_stacked[[
        'borrower_type', 'Percentage', 'funding_status'
    ]],
    title='Overview of funding status',
    labels={'funding_status': 'loan status'})

# adjust bar plot's layout
borrower_type_bar.update_layout(
    title='<b>Status of loans per borrower type in 2016</b>',
    xaxis_title='Type of borrower based on borrower count and gender',
    yaxis_title='Percentage of loan status',
    title_x=0.03,
    title_y=0.95,
    yaxis_range=[60,103],
    template='simple_white',
    font_size=9,
    title_font_color='#283747',
    hoverlabel=dict(font_size=13, font_family="Calibri"),
    paper_bgcolor="#d6dbdf",
    margin=dict(t=50, l=25, r=25, b=25))

# set hover text for all traces (one trace per funding status)
borrower_type_bar.update_traces(hovertemplate=(
    '<b>%{customdata[1]}</b>% of loans <br>were <b>%{customdata[2]}</b> for <b>%{customdata[0]}s.</b>'
    '<br><br>'))

#display bar plot
borrower_type_bar.show()

#### To note

It can be observed that female borrowers are most successful at reaching their funding goal, followed by mixed groups and male borrowers. There are multiple potential reasons for this, e.g.

- motivation of the lender
- difference in sector of investment
- difference in loan amount
- difference in repayment interval.

For instance, 
- lenders consciously choose female-owned businesses as their investment because this supports their values, or
- investments by male borrowers are systematically in different sectors that are not as attractive to lenders.

As could be seen in the scatter plot above, the average investment of female borrowers tends to be smaller than that of male borrowers. On the platform, lenders can start investing at 25 USD, so the chances of getting fully funded are higher if the loan amount is set lower. As a next step, the median (or mean) loan amount by borrower type could therefore be analyzed to see whether this aspect contributes to the high funding rate of female borrowers.
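
The suggested next step could look like the following sketch. It assumes the columns `borrower_type` and `loan_amount` from the dataset; the toy rows are made up and do not reflect the real distribution.

```python
import pandas as pd

# Toy rows (hypothetical values)
toy_loans = pd.DataFrame({
    "borrower_type": ["Individual Female", "Individual Female", "Individual Female",
                      "Individual Male", "Individual Male"],
    "loan_amount": [100.0, 200.0, 600.0, 400.0, 800.0],
})

# Median and mean requested loan amount per borrower type
loan_size = toy_loans.groupby("borrower_type")["loan_amount"].agg(["median", "mean"])
print(loan_size)
```

Comparing median and mean side by side also hints at skew: a mean far above the median points to a few very large loans in that group.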

Overall, it can be concluded that the company's business model of bringing together borrowers and lenders is successful. Women are not only its largest customer group but also have a comparably higher funding rate, defined as the funded amount divided by the loan amount. This is another indication that the company's business model is working.

## Dashboard

### Dashboard No.1

The dashboard shows the funding success of individual loans (funded amount versus loan amount) at the most granular level: each loan from the dataset is visualized. The filters for gender category and borrower type allow the data to be analyzed further.
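
The second filter depends on the first one: only borrower types that exist for the selected gender category are offered. The lookup behind this can be sketched as a plain function that is testable without Dash; the helper name `borrower_type_options` and the toy rows are hypothetical.

```python
import pandas as pd


def borrower_type_options(loans: pd.DataFrame, general_category: str) -> list:
    """Return dropdown options for all borrower types of one gender category."""
    mask = loans["borrower_type_general"].eq(general_category)
    borrower_types = loans.loc[mask, "borrower_type"].unique()
    return [{"label": t, "value": t} for t in sorted(borrower_types)]


# Toy rows (hypothetical values)
toy_loans = pd.DataFrame({
    "borrower_type_general": ["female", "female", "male", "female"],
    "borrower_type": ["Small female group", "Individual Female",
                      "Individual Male", "Individual Female"],
})

female_options = borrower_type_options(toy_loans, "female")
print(female_options)
```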

In [15]:
# Create app
my_crowdsourcing_app = JupyterDash(__name__)

# Layout
my_crowdsourcing_app.layout = html.Div([

    # Set header
    html.H1(id='title',
            children=
            'Loan amount and funded amount by gender and number of borrower',
            style={
                'textAlign': 'center',
                'font-family': 'Arial',
                'font-size': '21px',
                'color': '#082b51'
            }),

    # add a line break
    html.Br(),

    # add label for the radio buttons
    html.Label("Select gender category of borrower:",
               style={
                   'font-family': 'Arial',
                   'font-size': '13px',
                   'color': '#428bca'
               }),
    # Radio buttons for broader category of borrowers' gender
    dcc.RadioItems(id="ri_borrower_type_general",
                   options=[{
                       'label': oberkat,
                       'value': oberkat
                   } for oberkat in df.loc[:,
                                           'borrower_type_general'].unique()],
                   value='female',
                   style={
                       'font-family': 'Arial',
                       'font-size': '13px',
                       "padding": "10px",
                       "max-width": "800px",
                       'display': 'inline-block',
                       'margin-left': '7px'
                   },
                   labelStyle={
                       'display': 'inline-block',
                       'font-weight': '300',
                   }),

    # add a line break
    html.Br(),

    # add label for second dropdown
    html.Label(
        "Specify type of borrower:",
        style={
            #       'font-weight': 'bold',
            'font-family': 'Arial',
            'font-size': '13px',
            'color': '#428bca'
        }),

    # Dropdown for more detailed category of borrowers' gender
    dcc.Dropdown(id='dd_borrower_type',
                 options=[],
                 multi=True,
                 style={
                     'width': 660,
                     'font-family': 'Arial',
                     'font-size': '13px'
                 },
                 value=[],
                 optionHeight=20,
                 placeholder="Please select a borrower category."),
    dcc.Graph(id='my_scatter', figure={})
])

# define callbacks

## first callback: based on the value selected in the radio buttons, provide the filtered options for the dropdown


@my_crowdsourcing_app.callback(
    Output(component_id='dd_borrower_type', component_property='options'),
    Input(component_id='ri_borrower_type_general', component_property='value'))
def set_general_gender(general_gender_cat):

    df_broader_gender = df.loc[df['borrower_type_general'] ==
                               general_gender_cat]

    return [{
        'label': detailed_gender,
        'value': detailed_gender
    } for detailed_gender in sorted(
        df_broader_gender.loc[:, 'borrower_type'].unique())]


## second callback: preselect all detailed gender categories that are available for the chosen general category


@my_crowdsourcing_app.callback(
    Output(component_id='dd_borrower_type', component_property='value'),
    Input(component_id='dd_borrower_type', component_property='options'))
def set_detailed_gender(detailed_gender_cat):

    return [gender['value'] for gender in detailed_gender_cat]


## third callback: Create scatter plot based on selected value(s) in detailed gender category


@my_crowdsourcing_app.callback(
    Output(component_id='my_scatter', component_property='figure'),
    Input(component_id='dd_borrower_type', component_property='value'),
    Input(component_id='ri_borrower_type_general', component_property='value'))

def update_graph(selected_unterkat, selected_oberkat):

    if selected_unterkat is None:

        return no_update

    else:
        # select data
        df_final = df.loc[
            (df.loc[:, 'borrower_type'].isin(selected_unterkat)) &
            (df.loc[:, 'borrower_type_general'] == selected_oberkat), :]

        # create scatter plot
        my_scatter = px.scatter(
            data_frame=df_final,
            x="loan_amount",
            y="funded_amount",
            color="borrower_type",
            color_discrete_map={
                'Individual Female': 'deeppink',
                'Individual Male': 'blue',
                'Large female group': '#6b5b95',
                'Large male group': '#034f84',
                'Large mixed group': '#ff7b25',
                'Small female group': 'palevioletred',
                'Small male group': '#80ced6',
                'Small mixed group': '#f2ae72'
            },
            symbol="borrower_type",
            symbol_map={
                'Individual Female': 'circle',
                'Individual Male': 'circle',
                'Large female group': 'square',
                'Large male group': 'square',
                'Large mixed group': 'square',
                'Small female group': 'diamond',
                'Small male group': 'diamond',
                'Small mixed group': 'diamond'
            },
            #  color="borrower_type",
            labels={
                "loan_amount": "Loan amount in USD",
                "funded_amount": "Funded amount in USD",
                "borrower_type": "Borrower gender category",
                'borrower_type_general': 'borrower type'
            },
            template="simple_white")

        my_scatter.update_traces(marker=dict(size=8))

    return my_scatter


# Run app
if __name__ == '__main__':
    my_crowdsourcing_app.run_server(mode='inline', port=8092)

### Dashboard No. 2

The dashboard shows the average funded amount per sector for a selected continent.
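
The aggregation behind the bar chart can be sketched as a standalone function, which makes it easy to check outside the dashboard. The helper name `mean_funded_by_sector` and the toy rows are assumptions for illustration; the column names are taken from the dataset.

```python
import pandas as pd


def mean_funded_by_sector(loans: pd.DataFrame, continent: str,
                          sectors: list) -> pd.DataFrame:
    """Return the mean funded amount per selected sector on one continent."""
    mask = loans["continent"].eq(continent) & loans["sector"].isin(sectors)
    return (
        loans.loc[mask]
        .groupby("sector", as_index=False)["funded_amount"]
        .mean()
        .sort_values("funded_amount", ascending=False)
        .reset_index(drop=True)
    )


# Toy rows (hypothetical values)
toy_loans = pd.DataFrame({
    "continent": ["Asia", "Asia", "Asia", "Africa"],
    "sector": ["Food", "Food", "Arts", "Food"],
    "funded_amount": [300.0, 500.0, 200.0, 1000.0],
})

asia_summary = mean_funded_by_sector(toy_loans, "Asia", ["Food", "Arts"])
print(asia_summary)
```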



In [17]:
# Create app
my_crowdsourcing_app_2 = JupyterDash(__name__)

# Layout
my_crowdsourcing_app_2.layout = html.Div([

    # Set header
    html.H1(id='title',
            children=
            'Average amount funded per sector and continent',
            style={
                'textAlign': 'center',
                'font-family': 'Arial',
                'font-size': '21px',
                'color': '#082b51'
            }),

    # add a line break
    html.Br(),
  

    # add label for the radio buttons
    html.Label("Select continent:",
               style={
                   'font-family': 'Arial',
                   'font-size': '13px',
                   'color': '#428bca'
               }),
    # Radio buttons for the continent
    dcc.RadioItems(id="ri_continent",
                   options=[{
                       'label': oberkat,
                       'value': oberkat
                   } for oberkat in df.loc[:,
                                           'continent'].unique()],
                   value='Asia',
                   style={
                       'font-family': 'Arial',
                       'font-size': '13px',
                       "padding": "10px",
                       "max-width": "800px",
                       'display': 'inline-block',
                       'margin-left': '7px'
                   },
                   labelStyle={
                       'display': 'inline-block',
                       'font-weight': '300',
                   }),

    # add a line break
    html.Br(),

    # add label for second dropdown
    html.Label(
        "Specify sector:",
        style={
            #       'font-weight': 'bold',
            'font-family': 'Arial',
            'font-size': '13px',
            'color': '#428bca'
        }),

    # Dropdown
    dcc.Dropdown(id='dd_sector',
                 options=[],
                 multi=True,
                 style={
                     'width': 660,
                     'font-family': 'Arial',
                     'font-size': '13px'
                 },
                 value=df['sector'].unique().tolist(),
                 optionHeight=20,
                 placeholder="Please select a sector."),
    dcc.Graph(id='my_bar', figure={})
])

# define callbacks

## first callback


@my_crowdsourcing_app_2.callback(
    Output(component_id='dd_sector', component_property='options'),
    Input(component_id='ri_continent', component_property='value'))

def set_continent(continent_cat):

    df_continent = df.loc[df['continent'] ==
                               continent_cat]

    return [{
        'label': sector,
        'value': sector
    } for sector in sorted(
        df_continent.loc[:, 'sector'].unique())]


## second callback


@my_crowdsourcing_app_2.callback(
    Output(component_id='dd_sector', component_property='value'),
    Input(component_id='dd_sector', component_property='options'))

def set_sector(sector_cat):

    return [sector['value'] for sector in sector_cat]


## third callback

@my_crowdsourcing_app_2.callback(
    Output(component_id='my_bar', component_property='figure'),
    Input(component_id='dd_sector', component_property='value'),
    Input(component_id='ri_continent', component_property='value'))

def update(selected_unterkat, selected_oberkat):

    if selected_unterkat is None:

        return no_update

    else:
        # select data
        df_final = df.loc[
            (df.loc[:, 'sector'].isin(selected_unterkat)) &
            (df.loc[:, 'continent'] == selected_oberkat), :]
        

    # group data
    df_group = df_final.groupby(by=['sector'], as_index=False)["funded_amount"].mean()
    
    # sort data
    df_group = df_group.sort_values(by="funded_amount", ascending=False)
    
    # create bar plot
    my_bar = px.bar(data_frame=df_group,
                    x='funded_amount',
                    y='sector',
                    color='sector',
                    color_discrete_sequence=px.colors.qualitative.Set3,
                    template='simple_white',
                    labels={'sector':'Sector', 'funded_amount': 'Average amount funded'})
    

    return my_bar


# Run app
if __name__ == '__main__':
    my_crowdsourcing_app_2.run_server(mode='inline', port=8091)