# Kickstarter Initial Data Exploration

* **Data Source**: https://webrobots.io/kickstarter-datasets/

**NOTE 1**: The variables incorporated into the model must not introduce data leakage. For example, the staff pick variable should be left out, since staff are potentially picking projects that they believe are going to succeed.
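A minimal sketch of how the leakage note could be applied in practice. Only `staff_pick` is named in the note; the other entries in `LEAKY_COLUMNS` are assumptions (columns that appear to be known only during or after a campaign) and should be checked against the data dictionary. The helper name `drop_leaky_columns` and the example frame are hypothetical.

```python
import pandas as pd

# Assumption: apart from staff_pick, these columns are listed here only as
# plausible leakage candidates; verify each one against the data dictionary.
LEAKY_COLUMNS = [
    "staff_pick",
    "spotlight",
    "backers_count",
    "pledged",
    "usd_pledged",
    "converted_pledged_amount",
    "state_changed_at",
]


def drop_leaky_columns(df, leaky=LEAKY_COLUMNS):
    """Return a copy of df without the columns suspected of leaking the outcome."""
    present = [col for col in leaky if col in df.columns]
    return df.drop(columns=present)


# Hypothetical two-row frame, used only to illustrate the call.
example = pd.DataFrame(
    {
        "goal": [1000, 5000],
        "staff_pick": [True, False],
        "pledged": [1200.0, 80.0],
        "state": ["successful", "failed"],
    }
)
features = drop_leaky_columns(example)
print(list(features.columns))
```

Keeping the list in one place makes it easy to revisit the decision for each column as the data dictionary is reviewed.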

**NOTE 2**: A data dictionary for the Kickstarter dataset is available in the references folder.

## INTRODUCTION
Kickstarter is a US-based global crowdfunding platform focused on bringing funding to creative projects.
Since the platform’s launch in 2009, the site has hosted over 159,000 successfully funded projects with over
15 million unique backers. Kickstarter uses an “all-or-nothing” funding system. This means that funds are
only disbursed for projects that meet the original funding goal set by the creator.

## PROJECT OBJECTIVE
Kickstarter earns a 5% commission on projects that are successfully funded. Currently, fewer than 40% of
projects on the platform succeed. My objective is to predict which projects are likely to succeed so that these projects can be highlighted on the site through either 'staff picks' or 'featured product' lists.

In [1]:
import pandas as pd
import numpy as np 
import matplotlib.pyplot as plt
import seaborn as sns

In [28]:
kickstarter1 = pd.read_csv('../../data/01_raw/Kickstarter_2019-06-13T03_20_35_801Z/Kickstarter.csv')

In [29]:
kickstarter2 = pd.read_csv('../../data/01_raw/Kickstarter_2019-06-13T03_20_35_801Z/Kickstarter055.csv')
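The two cells above read files from the same download folder one at a time. If the folder holds more files with the same naming pattern (an assumption based on the `Kickstarter.csv` and `Kickstarter055.csv` names), they could be loaded together. This is a sketch only; the helper name `load_kickstarter_dir` is hypothetical.

```python
from pathlib import Path

import pandas as pd


def load_kickstarter_dir(data_dir, pattern="Kickstarter*.csv"):
    """Read every CSV in data_dir that matches pattern into one DataFrame."""
    paths = sorted(Path(data_dir).glob(pattern))
    if not paths:
        raise FileNotFoundError(f"no files matching {pattern!r} in {data_dir}")
    frames = [pd.read_csv(path) for path in paths]
    # ignore_index gives the combined frame a fresh 0..n-1 index.
    return pd.concat(frames, ignore_index=True)
```

It would be called with the same folder used above, for example `load_kickstarter_dir('../../data/01_raw/Kickstarter_2019-06-13T03_20_35_801Z')`.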

In [33]:
len(kickstarter1)

3768

In [35]:
kickstarter1.columns

Index(['backers_count', 'blurb', 'category', 'converted_pledged_amount',
       'country', 'created_at', 'creator', 'currency', 'currency_symbol',
       'currency_trailing_code', 'current_currency', 'deadline',
       'disable_communication', 'friends', 'fx_rate', 'goal', 'id',
       'is_backing', 'is_starrable', 'is_starred', 'launched_at', 'location',
       'name', 'permissions', 'photo', 'pledged', 'profile', 'slug',
       'source_url', 'spotlight', 'staff_pick', 'state', 'state_changed_at',
       'static_usd_rate', 'unread_messages_count', 'unseen_activity_count',
       'urls', 'usd_pledged', 'usd_type'],
      dtype='object')

In [61]:
kickstarter1.iloc[500]

backers_count                                                             182
blurb                       The Moonhouse is the first art project on the ...
category                    {"id":340,"name":"Space Exploration","slug":"t...
converted_pledged_amount                                                 5011
country                                                                    US
created_at                                                         1400074893
creator                     {"id":1317303162,"name":"Mikael Genberg","is_r...
currency                                                                  USD
currency_symbol                                                             $
currency_trailing_code                                                   True
current_currency                                                          USD
deadline                                                           1414537140
disable_communication                                           
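The `created_at` and `deadline` values in the record above look like Unix timestamps in seconds. Under that assumption (worth confirming in the data dictionary), they can be converted to datetimes as sketched below. The `sample` frame and the `days_created_to_deadline` column name are hypothetical.

```python
import pandas as pd

# The two values shown for row 500, assumed to be seconds since the Unix epoch.
sample = pd.DataFrame({"created_at": [1400074893], "deadline": [1414537140]})

for col in ["created_at", "deadline"]:
    sample[col] = pd.to_datetime(sample[col], unit="s", utc=True)

# Whole days between project creation and the funding deadline.
sample["days_created_to_deadline"] = (
    sample["deadline"] - sample["created_at"]
).dt.days
print(sample)
```

The same loop could cover `launched_at` and `state_changed_at` if they use the same encoding.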

In [60]:
kickstarter1.source_url[500]

'https://www.kickstarter.com/discover/categories/technology/space%20exploration'

In [62]:
kickstarter1.head()

| | backers_count | blurb | category | converted_pledged_amount | country | created_at | creator | currency | currency_symbol | currency_trailing_code | ... | spotlight | staff_pick | state | state_changed_at | static_usd_rate | unread_messages_count | unseen_activity_count | urls | usd_pledged | usd_type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 740 | A 40 page (now 60!) coloring book of surreal n... | {"id":22,"name":"Illustration","slug":"art/ill... | 18559 | US | 1448408106 | {"id":518230978,"name":"Andy Swartz","slug":"s... | USD | $ | True | ... | True | False | successful | 1456790340 | 1.0 | | | {"web":{"project":"https://www.kickstarter.com... | 18559.0 | domestic |
| 1 | 3 | One of the first of it's kind, a story that wi... | {"id":29,"name":"Animation","slug":"film & vid... | 80 | US | 1491015630 | {"id":2031793373,"name":"Julian Mason","is_reg... | USD | $ | True | ... | False | False | failed | 1496113200 | 1.0 | | | {"web":{"project":"https://www.kickstarter.com... | 80.0 | domestic |
| 2 | 1752 | A limited-edition Blu-ray release of Sonoda Ke... | {"id":29,"name":"Animation","slug":"film & vid... | 135589 | US | 1435491807 | {"id":1216340396,"name":"Robert J Woodhead","s... | USD | $ | True | ... | True | True | successful | 1459886413 | 1.0 | | | {"web":{"project":"https://www.kickstarter.com... | 135589.94 | domestic |
| 3 | 185 | A YA, sword & sorcery fantasy novel! | {"id":328,"name":"Young Adult","slug":"publish... | 4652 | US | 1452147235 | {"id":672576444,"name":"Tristan J Tarwater","i... | USD | $ | True | ... | True | True | successful | 1464670751 | 1.0 | | | {"web":{"project":"https://www.kickstarter.com... | 4652.0 | domestic |
| 4 | 136 | Production of a feature film from the twisted ... | {"id":11,"name":"Film & Video","slug":"film & ... | 14568 | AU | 1532058959 | {"id":1622887785,"name":"Jaan Ranniko","is_reg... | AUD | $ | True | ... | False | False | live | 1558319564 | 0.686201 | | | {"web":{"project":"https://www.kickstarter.com... | 14474.246942 | domestic |
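The preview above includes a project whose `state` is `live`, so its outcome is not yet known. One possible way to build a binary target for the stated objective is to keep only finished projects, as sketched here. The `is_successful` column name is hypothetical, and the full dataset may contain other states that are not visible in these five rows.

```python
import pandas as pd

# The state values from the five preview rows above.
example = pd.DataFrame(
    {"state": ["successful", "failed", "successful", "successful", "live"]}
)

# Keep only projects with a known outcome, then encode success as 1 and failure as 0.
finished = example[example["state"].isin(["successful", "failed"])].copy()
finished["is_successful"] = (finished["state"] == "successful").astype(int)
print(finished)
```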


In [67]:
kickstarter1.category[2]

'{"id":29,"name":"Animation","slug":"film & video/animation","position":2,"parent_id":11,"color":16734574,"urls":{"web":{"discover":"http://www.kickstarter.com/discover/categories/film%20&%20video/animation"}}}'
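The `category` value above is a JSON string, so its fields can be extracted with the standard `json` module. The sketch below pulls out the category name and the parent category, taking the parent from the part of `slug` before the `/` (an assumption about how slugs are structured, based on this one example). The function and output column names are hypothetical.

```python
import json

import pandas as pd

# The category string printed above for row 2.
raw = (
    '{"id":29,"name":"Animation","slug":"film & video/animation",'
    '"position":2,"parent_id":11,"color":16734574,"urls":{"web":{"discover":'
    '"http://www.kickstarter.com/discover/categories/film%20&%20video/animation"}}}'
)


def parse_category(value):
    """Extract the category name and the top-level category from a JSON string."""
    record = json.loads(value)
    slug = record.get("slug", "")
    return pd.Series(
        {
            "category_name": record.get("name"),
            # Assumption: the text before the first "/" is the parent category.
            "parent_category": slug.split("/")[0],
        }
    )


categories = pd.Series([raw]).apply(parse_category)
print(categories)
```

Applied to the real data, this would be `kickstarter1.category.apply(parse_category)`, assuming every row holds valid JSON.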

**IDEAS**:

1. Look at language complexity in blurbs and see whether it is related to campaign success.
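As a starting point for the blurb idea, the sketch below computes two very rough proxies for language complexity: word count and average word length. These proxies are an assumption, not an established measure of complexity; punctuation is counted as part of each word, and the function name `blurb_features` is hypothetical.

```python
import pandas as pd


def blurb_features(blurbs):
    """Compute simple length-based text features for a Series of blurbs."""
    text = blurbs.fillna("").astype(str)
    words = text.str.split()  # whitespace tokenization only
    word_count = words.str.len()
    char_count = words.apply(lambda tokens: sum(len(tok) for tok in tokens))
    # Avoid division by zero for empty blurbs by falling back to 0.0.
    avg_word_length = (char_count / word_count).where(word_count > 0, 0.0)
    return pd.DataFrame(
        {"word_count": word_count, "avg_word_length": avg_word_length}
    )


# One blurb taken from the preview above, plus a missing value.
features = blurb_features(pd.Series(["A YA, sword & sorcery fantasy novel!", None]))
print(features)
```

These columns could then be compared across successful and failed campaigns before investing in richer readability measures.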