## Identification of sustainability-focused campaigns on the Kickstarter crowdfunding platform using NLP and ML boosted with swarm intelligence
--- ------------------

### A. Introduction
--- -------------------

The aim of this project is to study how crowdfunding campaigns support sustainable initiatives. In particular, it focuses on campaigns on the [Kickstarter](https://www.kickstarter.com/) platform and explores a dataset of ca. 184,186 initiatives from different domains (e.g., Technology, Music, Publishing). The goal of the analyses is to identify the features most relevant to initiatives that are both sustainable and profitable. The analyses also explore how these features relate to one another, and aim to yield insights into the success or failure prospects of current and future environment-focused crowdfunded initiatives.


#### B. Details of dataset:
-- -------------------
1. Source: [Kickstarter_File.xlsx](Kickstarter_File.xlsx)
2. Generation mode: provided by researcher
3. Time period considered: 04-2009 to 05-2021 (ca. 146 months).
4. Total entries: 184,186
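
The month count quoted in the list can be checked with a little calendar arithmetic. The snippet below is a minimal sketch; the helper name `months_inclusive` is hypothetical, and it assumes both the first and the last month are counted.

```python
from datetime import date


def months_inclusive(start: date, end: date) -> int:
    """Count calendar months from start to end, including both endpoints."""
    return (end.year - start.year) * 12 + (end.month - start.month) + 1


# April 2009 through May 2021, both months included.
n_months = months_inclusive(date(2009, 4, 1), date(2021, 5, 1))
print(n_months)  # 146
```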

The initial data preparation consisted of examining the features, eliminating redundant ones, renaming and reordering the remaining features, and saving the resulting dataframe (see [Notebook_Data_Prep](./01.Dataset_Prep.ipynb)).

#### C. Imports:
-- ----------

In [5]:
import pandas as pd
pd.set_option('display.max_colwidth', None)

In [8]:
df = pd.read_excel('Kickstarter_File.xlsx')
df.sample(10)
#df.to_csv('dataframe_raw.csv', index=False)

Unnamed: 0,blurb,Environmental,Social,state,Subcategory,Unnamed: 5,converted_pledged_amount,country,country_displayable_name,created_at,...,launched_at,duration,name,pledged,slug,staff_pick,state.1,static_usd_rate,usd_exchange_rate,usd_pledged
15976,"On Aug 28 2010, maniac Glen Beck plans on mocking the memory of Martin Luther King. We will be there to Celebrate the Dream & Reject the Nightmare.",,,failed,Sculpture,Art,190.0,US,the United States,2010-08-24 15:51:08,...,2010-08-24 18:50:22,11.0,Celebrate MLK's Dream - Reject Beck's Nightmare,190.0,celebrate-mlks-dream-reject-becks-nightmare,0.0,failed,1.0,1.0,190.0
37755,"""The uprising of white power in Sweden"" is a book that supposedly will bring some light on the hidden anger of the white power youth in",,,failed,Academic,Publishing,0.0,SE,Sweden,2016-02-18 11:23:35,...,2016-02-23 14:32:39,56.826632,The uprising of white power in Sweden,0.0,the-uprising-of-white-power-in-sweden,0.0,failed,0.118218,0.123711,0.0
328472,,,,,,,,,,NaT,...,NaT,,,,,,,,,
25064,"We're building a real astronomical observatory for Burning Man 2014 and beyond, complete with giant telescope and science exhibits.",,,successful,Public Art,Art,28118.0,US,the United States,2014-04-11 21:56:30,...,2014-06-17 08:10:28,30.0,Black Rock Observatory,28118.82,black-rock-observatory,1.0,successful,1.0,1.0,28118.82
26936,Haley Dreis will be recording her folk-pop-space travel album at Nashville studio 'Forty-one Fifteen' and it's going to be amazing!,,,successful,Indie Rock,Music,15037.0,US,the United States,2016-08-04 18:37:38,...,2016-09-06 14:28:48,30.0,Help Haley record her 4th album with lush strings!,15037.0,help-haley-record-her-4th-album-with-lush-strings,0.0,successful,1.0,1.0,15037.0
361227,,,,,,,,,,NaT,...,NaT,,,,,,,,,
1039302,,,,,,,,,,NaT,...,NaT,,,,,,,,,
28014,A book about travels across the nation in search of the best ballpark hotdog.,,,successful,Nonfiction,Publishing,2069.0,US,the United States,2014-06-11 18:45:20,...,2014-06-16 19:40:48,59.966806,Gone to the Dogs-book about the search for the best hot dog,2069.13,gone-to-the-dogs-book-about-the-search-for-the-bes,1.0,successful,1.0,1.0,2069.13
618930,,,,,,,,,,NaT,...,NaT,,,,,,,,,
776573,,,,,,,,,,NaT,...,NaT,,,,,,,,,


In [22]:
print(df.info())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1048575 entries, 0 to 1048574
Data columns (total 24 columns):
 #   Column                    Non-Null Count   Dtype         
---  ------                    --------------   -----         
 0   blurb                     184184 non-null  object        
 1   Environmental             2053 non-null    object        
 2   Social                    2053 non-null    object        
 3   state                     184186 non-null  object        
 4   Subcategory               184186 non-null  object        
 5   Unnamed: 5                176465 non-null  object        
 6   converted_pledged_amount  184186 non-null  float64       
 7   country                   184186 non-null  object        
 8   country_displayable_name  184186 non-null  object        
 9   created_at                184186 non-null  datetime64[ns]
 10  currency                  184186 non-null  object        
 11  deadline                  184186 non-null  datetime64[ns]
 12  
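
The `RangeIndex` above reports 1,048,575 entries, while columns such as `state` hold only 184,186 non-null values. That row count matches the Excel worksheet limit of 1,048,576 rows minus one header row, which suggests most rows are empty padding from the spreadsheet. This is also consistent with the all-`NaN` rows in the sample. A minimal sketch of how such rows could be removed is shown below. It runs on a small synthetic frame, not on the real file, and the names `raw` and `clean` are illustrative.

```python
import pandas as pd

# Synthetic stand-in for the raw sheet: two populated rows, two fully empty rows.
raw = pd.DataFrame({
    "blurb": ["A book about hot dogs", None, "An observatory", None],
    "state": ["successful", None, "successful", None],
    "pledged": [2069.13, None, 28118.82, None],
})

# Drop rows in which every column is missing, then renumber the index.
clean = raw.dropna(how="all").reset_index(drop=True)
print(len(raw), "->", len(clean))
```

Using `how="all"` keeps rows that are only partially filled, such as campaigns with a missing `blurb`, so no populated entry is discarded at this stage.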

In [52]:
pd.set_option('display.max_rows', 20)
print(df['slug'].value_counts(), df['slug'].count())

slug
travel-sax-the-smallest-electronic-saxophone-in-th    2
fuego-lento-novela-grafica-de-ricardo-pelaez          2
tooro-hybrid-polyphonic-supersynth                    2
ag-silver-ions-protective-jacket-0                    2
yokai-cyberpunk-illuminated-jacket                    2
                                                     ..
officer-voof-illustrated-childrens-book               1
help-the-mass-chaos-cd-production                     1
nobody-essays-from-a-lifer-skater                     1
harcelement-benson-et-ses-dames                       1
grandmas-are-life                                     1
Name: count, Length: 183994, dtype: int64 184186
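
The output above lists 183,994 distinct slugs among 184,186 non-null values, so 192 entries repeat a slug that already occurs elsewhere. The following sketch shows one way to inspect and, if desired, drop such repeats. It uses a small synthetic frame; whether repeated slugs are true duplicates that should be removed is an assumption that is not settled by the output shown here.

```python
import pandas as pd

# Synthetic example: the slug "alpha" appears twice.
demo = pd.DataFrame({
    "slug": ["alpha", "beta", "alpha", "gamma"],
    "state": ["failed", "successful", "failed", "successful"],
})

# Number of entries that repeat an already-seen slug.
n_repeats = demo["slug"].count() - demo["slug"].nunique()

# All rows that share a slug with at least one other row, grouped for inspection.
dupes = demo[demo["slug"].duplicated(keep=False)].sort_values("slug")

# Keep only the first occurrence of each slug.
deduped = demo.drop_duplicates(subset="slug", keep="first")

print(n_repeats, len(dupes), len(deduped))
```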


In [27]:
unique_value_counts_environmental = df['Environmental'].value_counts()
unique_value_counts_environmental

Environmental
No     2010
Yes      43
Name: count, dtype: int64


In [19]:
unique_value_counts_social = df['Social'].value_counts()
unique_value_counts_social

Social
No     2027
Yes      26
Name: count, dtype: int64
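
Only 2,053 entries carry an `Environmental` or `Social` label, and the `Yes` class is rare in both. The short sketch below makes that imbalance explicit using the counts printed above; the helper name `positive_share` is hypothetical.

```python
# Counts copied from the value_counts() outputs above.
environmental = {"No": 2010, "Yes": 43}
social = {"No": 2027, "Yes": 26}


def positive_share(counts: dict) -> float:
    """Fraction of labelled entries whose label is 'Yes'."""
    return counts["Yes"] / sum(counts.values())


for name, counts in (("Environmental", environmental), ("Social", social)):
    total = sum(counts.values())
    print(f"{name}: {positive_share(counts):.2%} labelled Yes out of {total}")
```

A positive class this small usually needs attention in any later classification step, for example stratified splitting or class weighting.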

In [33]:
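import nltk
nltk.download('stopwords')

from rake_nltk import Rake

# Assumption: `blurb_is_environmental` is not defined in the cells shown here;
# it is taken to be the list of blurbs whose 'Environmental' label is 'Yes'.
blurb_is_environmental = df.loc[df['Environmental'] == 'Yes', 'blurb'].dropna().tolist()

# Uses English stopwords from NLTK and all punctuation characters by default.
# min_length=1 and max_length=1 restrict the extracted phrases to single words.
r = Rake(min_length=1, max_length=1)

# Extract keywords from a list of strings, where each string is a sentence.
r.extract_keywords_from_sentences(blurb_is_environmental)

# Keyword phrases ranked from highest to lowest score.
r.get_ranked_phrases()

# Keyword phrases ranked from highest to lowest, with their scores.
r.get_ranked_phrases_with_scores()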

state
successful    108423
failed         75763
Name: count, dtype: int64 184186
