# <u>Crowdfunding</u>

### Background
You are a data analyst at a crowdfunding site. Next quarter, your company will run a marketing campaign. The marketing manager wants to target the segments that donated the most in the past year, and she has asked you to help her prepare for her upcoming meeting with the CEO.

### Challenge
Create a single visualization that the marketing manager can use to explore the data. It should answer the following questions:
   - What are the top three categories in terms of total donations?
   - What device type has historically provided the most contributions?
   - What age bracket should the campaign target?

In [50]:
# import libraries
import pandas as pd
import plotly.express as px

%matplotlib inline

In [2]:
df = pd.read_csv('crowdfunding.csv')        # read the data

In [3]:
df.head()

|   | category | device | gender | age | amount |
|---|---|---|---|---|---|
| 0 | Fashion | iOS | F | 45-54 | 61.0 |
| 1 | Sports | android | M | 18-24 | 31.0 |
| 2 | Technology | android | M | 18-24 | 39.0 |
| 3 | Technology | iOS | M | 18-24 | 36.0 |
| 4 | Sports | android | M | 18-24 | 40.0 |


In [4]:
df.shape

(20658, 5)

In [5]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20658 entries, 0 to 20657
Data columns (total 5 columns):
 #   Column    Non-Null Count  Dtype  
---  ------    --------------  -----  
 0   category  20658 non-null  object 
 1   device    20658 non-null  object 
 2   gender    20658 non-null  object 
 3   age       20658 non-null  object 
 4   amount    20658 non-null  float64
dtypes: float64(1), object(4)
memory usage: 807.1+ KB


In [7]:
df.describe()

|   | amount |
|---|---|
| count | 20658.0 |
| mean | 39.407009 |
| std | 14.913658 |
| min | 1.0 |
| 25% | 29.0 |
| 50% | 39.0 |
| 75% | 50.0 |
| max | 101.0 |


The initial inspection shows that the dataset has no missing values. It has 5 features, of which 4 are categorical (`category`, `device`, `gender`, `age`) and 1 is numerical (`amount`). The summary statistics show that `amount` ranges from 1 to 101, with a mean of about 39.4 and a median of 39.
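The checks behind these statements can be made explicit rather than read off `df.info()`. The following is a minimal sketch that assumes only the five rows shown by `df.head()`; the names `sample`, `missing_per_column`, `categorical_columns`, and `numeric_columns` are illustrative and not part of the original notebook. In the notebook itself, the same calls would be run on `df`.

```python
import pandas as pd

# Illustrative sample: the five rows printed by df.head() above.
sample = pd.DataFrame({
    "category": ["Fashion", "Sports", "Technology", "Technology", "Sports"],
    "device": ["iOS", "android", "android", "iOS", "android"],
    "gender": ["F", "M", "M", "M", "M"],
    "age": ["45-54", "18-24", "18-24", "18-24", "18-24"],
    "amount": [61.0, 31.0, 39.0, 36.0, 40.0],
})

# Count missing values per column; all zeros means nothing is missing.
missing_per_column = sample.isna().sum()

# Split the columns into numeric and non-numeric (categorical) features.
numeric_columns = sample.select_dtypes(include="number").columns.tolist()
categorical_columns = sample.select_dtypes(exclude="number").columns.tolist()

# List the distinct levels of each categorical feature.
levels = {col: sorted(sample[col].unique()) for col in categorical_columns}

print(missing_per_column)
print(categorical_columns, numeric_columns)
print(levels)
```

Listing the levels of each categorical feature is also a cheap way to spot inconsistent labels before grouping on them.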

### 1. What are the top three categories in terms of total donations?

In [34]:
df_cat = df[['category', 'amount']].groupby('category').sum().reset_index().sort_values(by='amount', ascending=False)
df_cat

|   | category | amount |
|---|---|---|
| 2 | Games | 165483.0 |
| 3 | Sports | 163528.0 |
| 4 | Technology | 162731.0 |
| 0 | Environment | 162376.0 |
| 1 | Fashion | 159952.0 |


In [49]:
px.pie(df_cat, values='amount', names='category', color_discrete_sequence=px.colors.sequential.RdBu, title='Category-wise Amount Distribution')
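
Because the question asks for the top three specifically, it can help to extract them directly and to express each total as a share of all donations. This is a sketch under the assumption that the totals are exactly those printed in the table above; `category_totals`, `top_three`, and `share_pct` are illustrative names that do not appear in the original notebook.

```python
import pandas as pd

# Category totals copied from the df_cat output above.
category_totals = pd.DataFrame({
    "category": ["Games", "Sports", "Technology", "Environment", "Fashion"],
    "amount": [165483.0, 163528.0, 162731.0, 162376.0, 159952.0],
})

# Keep the three rows with the largest total donation amount.
top_three = category_totals.nlargest(3, "amount")

# Express every category as a percentage of all donations.
grand_total = category_totals["amount"].sum()
category_totals["share_pct"] = (category_totals["amount"] / grand_total * 100).round(1)

print(top_three)
print(category_totals)
```

The percentage column makes it easier to see how close the categories are than the raw totals or the pie slices do.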

### 2. What device type has historically provided the most contributions?

In [48]:
df_device = df[['device', 'amount']].groupby('device').sum().reset_index().sort_values(by='amount', ascending=False)
df_device

|   | device | amount |
|---|---|---|
| 1 | iOS | 530525.0 |
| 0 | android | 283545.0 |


In [46]:
px.pie(df_device, values='amount', names='device', color_discrete_sequence=px.colors.sequential.RdBu_r, title='Device-wise Amount Distribution')

### 3. What age bracket should the campaign target?

In [43]:
df_age = df[['age', 'amount']].groupby('age').sum().reset_index().sort_values(by='amount', ascending=False)
df_age

|   | age | amount |
|---|---|---|
| 0 | 18-24 | 411077.0 |
| 2 | 35-44 | 105597.0 |
| 1 | 25-34 | 99763.0 |
| 4 | 55+ | 98938.0 |
| 3 | 45-54 | 98695.0 |


In [45]:
px.pie(df_age, values='amount', names='age', color_discrete_sequence=px.colors.sequential.BuGn_r, title='Age-wise Amount Distribution')

### 4. Single Visualization to summarize the above findings

In [33]:
df_top = df[['age', 'category', 'device', 'amount']].groupby(['age', 'category', 'device']).sum().sort_values(by='amount', ascending=False).reset_index()
px.sunburst(df_top, path=['age', 'category', 'device'], title='Crowdfunding Distribution', values='amount', height=700)

This sunburst summarizes all of the findings above by showing how much each group contributed to the total raised. Some key takeaways:

   1. The `18-24` age group contributes about half of the total amount raised.

   2. `iOS` is more popular than `android` in every category.

   3. All the categories raised almost the same amount, with `Games` performing marginally better.

   4. `Technology` is the top category among donors aged `25-44`, whereas `Environment` is favored by donors aged `55+`.
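
Takeaways such as the fourth one can be checked numerically by finding the top category within each age bracket instead of reading it off the chart. The following is a minimal sketch that assumes only the five rows shown by `df.head()`, so its result says nothing about the full dataset; `sample`, `totals`, and `top_category_by_age` are illustrative names.

```python
import pandas as pd

# Illustrative sample: the five rows printed by df.head() earlier.
sample = pd.DataFrame({
    "category": ["Fashion", "Sports", "Technology", "Technology", "Sports"],
    "age": ["45-54", "18-24", "18-24", "18-24", "18-24"],
    "amount": [61.0, 31.0, 39.0, 36.0, 40.0],
})

# Total amount for every (age, category) pair.
totals = sample.groupby(["age", "category"], as_index=False)["amount"].sum()

# For each age bracket, pick the row whose total is the largest.
top_rows = totals.loc[totals.groupby("age")["amount"].idxmax()]

# Map each age bracket to its top category.
top_category_by_age = dict(zip(top_rows["age"], top_rows["category"]))
print(top_category_by_age)
```

Running the same steps on `df` would give one top category per age bracket for the whole dataset.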

### Other Insights

In [52]:
df_gender = df[['gender', 'amount']].groupby('gender').sum().reset_index().sort_values(by='amount', ascending=False)
df_gender

|   | gender | amount |
|---|---|---|
| 0 | F | 378779.0 |
| 1 | M | 377553.0 |
| 2 | U | 57738.0 |


In [54]:
px.pie(df_gender, values='amount', names='gender', color_discrete_sequence=px.colors.sequential.OrRd, title='Gender-wise Amount Distribution')

### 5. Summary

From the tables and charts above, we can conclude that:

1. Games is the top category, followed by Sports and Technology, although all five categories raised nearly the same amount.
2. iOS received almost double the donations received by android.
3. Young adults aged 18-24 contributed the most.
4. The sunburst gives a clear view of how donations were distributed last year and shows how much each segment contributed. It also makes clear that the marketing campaign should target young adults, as they are the most promising donors.
5. Finally, female donors contributed marginally more than male donors (378,779 vs. 377,553).
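
The comparisons in this summary follow from simple arithmetic on the totals printed earlier. As a rough check, the sketch below recomputes them in plain Python, assuming the figures are exactly those shown in the output tables; the variable names are illustrative.

```python
# Totals copied from the df_device, df_age, and df_gender outputs above.
device_totals = {"iOS": 530525.0, "android": 283545.0}
age_totals = {
    "18-24": 411077.0,
    "35-44": 105597.0,
    "25-34": 99763.0,
    "55+": 98938.0,
    "45-54": 98695.0,
}
gender_totals = {"F": 378779.0, "M": 377553.0, "U": 57738.0}

# How many times larger the iOS total is than the android total.
ios_to_android = device_totals["iOS"] / device_totals["android"]

# Fraction of all donations that came from the 18-24 bracket.
young_share = age_totals["18-24"] / sum(age_totals.values())

# Absolute gap between female and male totals.
female_minus_male = gender_totals["F"] - gender_totals["M"]

print(f"iOS / android ratio: {ios_to_android:.2f}")
print(f"18-24 share of total: {young_share:.1%}")
print(f"F minus M: {female_minus_male:.0f}")
```

The gender gap is small relative to either total, which is why it is better described as marginal than as a clear difference.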