# Crowdfunding Analysis

# Table of contents
1. [Introduction](#introduction)
2. [Data Wrangling](#datawrangling)
    1. [Visual Assessment](#visualassessment)
    2. [Programmatic Assessment](#programmaticassessment)
    3. [Issues Summary](#issuessummary)
    4. [Cleaning Data](#cleaningdata)
3. [Exploratory Analysis](#exploratoryanalysis)


# Introduction <a name="introduction"></a>
Some introduction text, formatted in heading 2 style


In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Data Wrangling <a name="datawrangling"></a>

In [14]:
# read all 56 CSV files into a single dataframe
file_name = './data/Kickstarter_{}.csv'
kickstarter = pd.concat([pd.read_csv(file_name.format(i)) for i in range(56)])
kickstarter.head()


Unnamed: 0,backers_count,blurb,category,converted_pledged_amount,country,created_at,creator,currency,currency_symbol,currency_trailing_code,...,slug,source_url,spotlight,staff_pick,state,state_changed_at,static_usd_rate,urls,usd_pledged,usd_type
0,21,2006 was almost 7 years ago.... Can you believ...,"{""id"":43,""name"":""Rock"",""slug"":""music/rock"",""po...",802,US,1387659690,"{""id"":1495925645,""name"":""Daniel"",""is_registere...",USD,$,True,...,new-final-round-album,https://www.kickstarter.com/discover/categorie...,True,False,successful,1391899046,1.0,"{""web"":{""project"":""https://www.kickstarter.com...",802.0,international
1,97,An adorable fantasy enamel pin series of princ...,"{""id"":54,""name"":""Mixed Media"",""slug"":""art/mixe...",2259,US,1549659768,"{""id"":1175589980,""name"":""Katherine"",""slug"":""fr...",USD,$,True,...,princess-pals-enamel-pin-series,https://www.kickstarter.com/discover/categorie...,True,False,successful,1551801611,1.0,"{""web"":{""project"":""https://www.kickstarter.com...",2259.0,international
2,88,Helping a community come together to set the s...,"{""id"":280,""name"":""Photobooks"",""slug"":""photogra...",29638,US,1477242384,"{""id"":1196856269,""name"":""MelissaThomas"",""is_re...",USD,$,True,...,their-life-through-their-lens-the-amish-and-me...,https://www.kickstarter.com/discover/categorie...,True,True,successful,1480607932,1.0,"{""web"":{""project"":""https://www.kickstarter.com...",29638.0,international
3,193,Every revolution starts from the bottom and we...,"{""id"":266,""name"":""Footwear"",""slug"":""fashion/fo...",49158,IT,1540369920,"{""id"":1569700626,""name"":""WAO"",""slug"":""wearewao...",EUR,€,False,...,wao-the-eco-effect-shoes,https://www.kickstarter.com/discover/categorie...,True,False,successful,1544309940,1.136525,"{""web"":{""project"":""https://www.kickstarter.com...",49075.152523,international
4,20,Learn to build 10+ Applications in this comple...,"{""id"":51,""name"":""Software"",""slug"":""technology/...",549,US,1425706517,"{""id"":1870845385,""name"":""Kalpit Jain"",""is_regi...",USD,$,True,...,apple-watch-development-course,https://www.kickstarter.com/discover/categorie...,False,False,failed,1428511019,1.0,"{""web"":{""project"":""https://www.kickstarter.com...",549.0,domestic
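The `kickstarter.info()` output further down reports 209222 entries with index labels running only from 0 to 964, which suggests that each file's row labels were carried over and now repeat. The following is a minimal sketch of one way to load the files with a unique index. The function name `read_numbered_csvs` and the `source_file` column are hypothetical and not part of the original notebook, and the demonstration uses two tiny toy files rather than the real data.

```python
import tempfile
from pathlib import Path

import pandas as pd


def read_numbered_csvs(pattern, count):
    """Read `count` CSV files named by `pattern.format(i)` into one DataFrame."""
    frames = []
    for i in range(count):
        frame = pd.read_csv(pattern.format(i))
        # hypothetical helper column: remembers which file a row came from
        frame["source_file"] = i
        frames.append(frame)
    # ignore_index=True assigns fresh labels 0..n-1 instead of repeating each file's labels
    return pd.concat(frames, ignore_index=True)


# Toy demonstration: two small files written to a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    for i in range(2):
        pd.DataFrame({"id": [10 * i + 1, 10 * i + 2]}).to_csv(
            Path(tmp) / f"Kickstarter_{i}.csv", index=False
        )
    demo = read_numbered_csvs(str(Path(tmp) / "Kickstarter_{}.csv"), 2)

print(demo)
```

With the real data the call would presumably be `read_numbered_csvs('./data/Kickstarter_{}.csv', 56)`. A unique index matters later on, because index-based operations such as `join` misbehave when labels repeat.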


## Visual Assessment <a name="visualassessment"></a>

In [15]:
kickstarter.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 209222 entries, 0 to 964
Data columns (total 37 columns):
backers_count               209222 non-null int64
blurb                       209214 non-null object
category                    209222 non-null object
converted_pledged_amount    209222 non-null int64
country                     209222 non-null object
created_at                  209222 non-null int64
creator                     209222 non-null object
currency                    209222 non-null object
currency_symbol             209222 non-null object
currency_trailing_code      209222 non-null bool
current_currency            209222 non-null object
deadline                    209222 non-null int64
disable_communication       209222 non-null bool
friends                     300 non-null object
fx_rate                     209222 non-null float64
goal                        209222 non-null float64
id                          209222 non-null int64
is_backing                  300 non-null object
...
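The non-null counts above already hint at sparsely filled columns (for example, `friends` has 300 non-null values out of 209222 rows). As a rough sketch, missing values can also be summarised per column directly. The toy frame below stands in for `kickstarter`, and the names `toy` and `missing` are illustrative only.

```python
import pandas as pd

# Toy stand-in for the real dataframe
toy = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "blurb": ["a", None, "c", "d"],
    "friends": [None, None, None, "[]"],
})

# Count missing values per column and express them as a share of all rows
missing = (
    toy.isna().sum()
    .to_frame("n_missing")
    .assign(share_missing=lambda d: d["n_missing"] / len(toy))
    .sort_values("n_missing", ascending=False)
)
print(missing)
```

Sorting by the missing count puts the columns that are candidates for dropping at the top.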

## Programmatic Assessment <a name="programmaticassessment"></a>

In [19]:
# find duplicated projects
kickstarter[kickstarter.id.duplicated()].info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 26958 entries, 1150 to 964
Data columns (total 37 columns):
backers_count               26958 non-null int64
blurb                       26958 non-null object
category                    26958 non-null object
converted_pledged_amount    26958 non-null int64
country                     26958 non-null object
created_at                  26958 non-null int64
creator                     26958 non-null object
currency                    26958 non-null object
currency_symbol             26958 non-null object
currency_trailing_code      26958 non-null bool
current_currency            26958 non-null object
deadline                    26958 non-null int64
disable_communication       26958 non-null bool
friends                     136 non-null object
fx_rate                     26958 non-null float64
goal                        26958 non-null float64
id                          26958 non-null int64
is_backing                  136 non-null object
...
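Note that `duplicated()` with its default `keep='first'` flags only the second and later occurrences of an `id`, so the 26958 rows above are the "extra" copies, not every row of an affected project. A small sketch on toy data (the values are made up for illustration) shows the difference between the default and `keep=False`:

```python
import pandas as pd

# Toy data: project 2 appears twice, project 3 appears three times
toy = pd.DataFrame({
    "id": [1, 2, 2, 3, 3, 3],
    "state": ["live", "live", "successful", "live", "live", "failed"],
})

# Default keep='first': every occurrence after the first is flagged
n_extra_rows = toy["id"].duplicated().sum()

# keep=False flags all copies, which is handy for inspecting them side by side
all_copies = toy[toy["id"].duplicated(keep=False)].sort_values("id")

# Number of distinct projects that occur more than once
n_repeated_projects = all_copies["id"].nunique()

print(n_extra_rows, len(all_copies), n_repeated_projects)
```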

## Issues Summary <a name="issuessummary"></a>
### Tidiness Issues
- The data is split across 56 separate tables, which should be joined into one table
- `category` contains a JSON-formatted string that bundles several variables: category id, category name, and category slug (the part of a URL that identifies a particular page on a website)
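One possible way to address the `category` issue is to parse each JSON string and spread its fields into separate columns. This is a sketch on toy data, not the notebook's actual cleaning step: the second category string is invented for illustration, and the `category_` prefix is an assumed naming choice.

```python
import json

import pandas as pd

# Toy stand-in: the first string mirrors the fields visible in the preview above
toy = pd.DataFrame({
    "id": [1, 2],
    "category": [
        '{"id":43,"name":"Rock","slug":"music/rock"}',
        '{"id":99,"name":"Toy Category","slug":"toy/example"}',  # made-up values
    ],
})

# Parse every JSON string into a dict, then build one column per field
parsed = pd.DataFrame(toy["category"].map(json.loads).tolist(), index=toy.index)

# Keep the three variables named in the issue list and prefix them to avoid
# clashing with the project's own `id` column
parsed = parsed[["id", "name", "slug"]].add_prefix("category_")

tidy = toy.drop(columns="category").join(parsed)
print(tidy)
```

The `join` aligns on the index, so this sketch assumes unique row labels. On the concatenated data, where labels repeat across files, a `reset_index(drop=True)` beforehand would likely be needed.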

### Quality Issues
- `created_at` is stored as Unix time (seconds since the epoch) instead of a datetime
- The same project appears at multiple stages as separate observations (26958 rows repeat an `id` that already occurred)
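The Unix-time issue can be handled with `pd.to_datetime` and `unit='s'`. The sketch below uses two `created_at` values from the preview above. Extending the conversion to `deadline` and `state_changed_at` is an assumption based on their integer dtype and similar magnitude, and the toy `deadline` values are made up.

```python
import pandas as pd

# Toy stand-in with integer second-resolution timestamps
toy = pd.DataFrame({
    "created_at": [1387659690, 1549659768],
    "deadline": [1391899046, 1551801611],  # toy values for illustration
})

# Convert each timestamp column from seconds since the epoch to UTC datetimes
for column in ["created_at", "deadline"]:
    toy[column] = pd.to_datetime(toy[column], unit="s", utc=True)

print(toy.dtypes)
print(toy)
```

Passing `utc=True` makes the result timezone-aware, which avoids ambiguity if the dates are later compared with local times.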


## Cleaning Data <a name="cleaningdata"></a>
Some introduction text, formatted in heading 2 style
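For the duplicated projects, one option is to keep a single row per `id`. The sketch below keeps the row with the latest `state_changed_at`, on the assumption that it reflects the most recent stage of the project. That rule is a guess rather than something the data documentation confirms, and the toy values are invented.

```python
import pandas as pd

# Toy data: project 2 was captured twice, at different stages
toy = pd.DataFrame({
    "id": [1, 2, 2, 3],
    "state": ["failed", "live", "successful", "live"],
    "state_changed_at": [100, 200, 300, 150],
})

deduped = (
    toy.sort_values("state_changed_at")         # oldest snapshot first
    .drop_duplicates(subset="id", keep="last")  # keep the newest row per project
    .sort_values("id")
    .reset_index(drop=True)
)
print(deduped)
```

If the copies turn out to be exact duplicates rather than different stages, a plain `drop_duplicates(subset="id")` would be enough.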

# Exploratory Analysis <a name="exploratoryanalysis"></a>

Possible things to analyse:
- Important projects
- Successful projects
- Characteristics of successful projects
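As a starting point for the question about successful projects, a success rate per group can be computed from the `state` column. This is only a sketch on toy data. Grouping by `country` and restricting the comparison to `successful` versus `failed` are assumptions, and other groupings (for example by category) would follow the same pattern.

```python
import pandas as pd

# Toy stand-in for the cleaned dataframe
toy = pd.DataFrame({
    "country": ["US", "US", "US", "IT", "IT", "IT"],
    "state": ["successful", "failed", "successful", "successful", "failed", "live"],
})

# Only projects with a final outcome are comparable
finished = toy[toy["state"].isin(["successful", "failed"])]

# The mean of a boolean Series is the share of True values, i.e. the success rate
success_rate = (
    finished["state"].eq("successful")
    .groupby(finished["country"])
    .mean()
    .sort_values(ascending=False)
)
print(success_rate)
```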
