



# Determining Success and Failure Factors for Startups
***

## Introduction

The purpose of this project is to find out the success and failure factors behind startup companies. To that end, I am going to investigate a dataset provided by [Metric.am](https://metric.am/), containing inforation about 472 startups and their status: *'Success'* or *'Failed'*. 

## Methodology

For the sake of this project, we will consider two classes of startups: *failed* and *successful*. As already mentioned above, we are interested in the factors behind the success or failure of the startup. If we tried to map the problem to the data, we would be interested in the importance of each feature or attribute while predicting the status of the company. Thus, it would be most meaningful and comfortable to solve the problem using a model which would provide us with this information. 

A model that seems suitable under the mentioned constraints is Logstic Regression. It is a classification algorithm, so it can trained to predict the status of the company. It also associates weights with idividual features, which we can use to determine the importance of the feature, as well as point out if its effect is positive or negative.

So this point onwards, we are going to pursue the following steps:
- [Retrieve the data](#Retrieving-data)
- [Explore the data](#Exploring-data)
- [Prepare the data]()
- [Build the classifier]()
    - [Split the data into train-test sets]()
    - [Build the model]()
    - [Train the model]()
- [Test the accuracy]()
- [Analyze the calculated weights]()
- [Report the conclusions](#Report)

***
## Retrieving data

First of all, let's download the data into a *.csv* file. I will use [wget](https://www.gnu.org/software/wget/) to download raw data from the [GitHub repository](https://github.com/Metricam/Internship_tasks/tree/master/Startup_Success).

In [617]:
!wget -O data.csv https://raw.githubusercontent.com/Metricam/Internship_tasks/master/Startup_Success/data.csv

--2020-05-19 20:22:05--  https://raw.githubusercontent.com/Metricam/Internship_tasks/master/Startup_Success/data.csv
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.16.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.16.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 315563 (308K) [text/plain]
Saving to: ‘data.csv’


2020-05-19 20:22:06 (14.2 MB/s) - ‘data.csv’ saved [315563/315563]



We are also provided a *dictionary* dataset, which explains some of the features provided in the main dataset. Let's get that too.

In [618]:
!wget -O dictionary.csv https://raw.githubusercontent.com/Metricam/Internship_tasks/master/Startup_Success/dictionary.csv

--2020-05-19 20:22:07--  https://raw.githubusercontent.com/Metricam/Internship_tasks/master/Startup_Success/dictionary.csv
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.16.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.16.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 5183 (5.1K) [text/plain]
Saving to: ‘dictionary.csv’


2020-05-19 20:22:07 (42.5 MB/s) - ‘dictionary.csv’ saved [5183/5183]



***
## Exploring data

Now let's understand the data. For that we will need to import some libraries. (Notice, you might need to install the libraries. If that's the case, un-comment the first four lines and then run the block.)

In [619]:
# import sys
# !{sys.executable} -m pip install numpy
# !{sys.executable} -m pip install pandas
# !{sys.executable} -m pip install matplotlib

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import preprocessing
import re

### About dataset

What are the attributes?

In [620]:
data_dict = pd.read_csv('dictionary.csv')
pd.options.display.max_rows = None
pd.options.display.max_columns = None
display(data_dict)

Unnamed: 0,Variable,Description
0,Company_Name,
1,Dependent-Company Status,Dependent variable indicating if company succe...
2,year of founding,
3,Age of company in years,
4,Internet Activity Score,How much company is acgtive on social media
5,Short Description of company profile,
6,Industry of company,
7,Focus functions of company,
8,Investors,List of investors
9,Employee Count,


Let's take a look at our data. 

In [621]:
#df = pd.read_csv('data.csv')
#df.head()

The default encoder in read_csv() function is *UTF-8*, however, as we can see from the above error, the data in the *data.csv* file in not UTF-8 encoded. Let's try several other encodings to see if any of them succeeds to decode the data. Here are some: *latin1*, *iso-8859-1*, *cp1252*.

In [622]:
df = pd.read_csv('data.csv', encoding='latin1')
df.head()
#display(df)

Unnamed: 0,Company_Name,Dependent-Company Status,year of founding,Age of company in years,Internet Activity Score,Short Description of company profile,Industry of company,Focus functions of company,Investors,Employee Count,Employees count MoM change,Has the team size grown,Est. Founding Date,Last Funding Date,Last Funding Amount,Country of company,Continent of company,Number of Investors in Seed,Number of Investors in Angel and or VC,Number of Co-founders,Number of of advisors,Team size Senior leadership,Team size all employees,Presence of a top angel or venture fund in previous round of investment,Number of of repeat investors,Number of Sales Support material,Worked in top companies,Average size of companies worked for in the past,Have been part of startups in the past?,Have been part of successful startups in the past?,Was he or she partner in Big 5 consulting?,Consulting experience?,Product or service company?,Catering to product/service across verticals,Focus on private or public data?,Focus on consumer data?,Focus on structured or unstructured data,Subscription based business,Cloud or platform based serive/product?,Local or global player,Linear or Non-linear business model,"Capital intensive business e.g. e-commerce, Engineering products and operations can also cause a business to be capital intensive",Number of of Partners of company,Crowdsourcing based business,Crowdfunding based business,Machine Learning based business,Predictive Analytics business,Speech analytics business,Prescriptive analytics business,Big Data Business,Cross-Channel Analytics/ marketing channels,Owns data or not? (monetization of data) e.g. Factual,Is the company an aggregator/market place? e.g. Bluekai,Online or offline venture - physical location based business or online venture?,B2C or B2B venture?,Top forums like 'Tech crunch' or 'Venture beat' talking about the company/model - How much is it being talked about?,Average Years of experience for founder and co founder,Exposure across the globe,Breadth of experience across verticals,Highest education,Years of education,Specialization of highest education,Relevance of education to venture,Relevance of experience to venture,Degree from a Tier 1 or Tier 2 university?,Renowned in professional circle,Experience in selling and building products,Experience in Fortune 100 organizations,Experience in Fortune 500 organizations,Experience in Fortune 1000 organizations,Top management similarity,Number of Recognitions for Founders and Co-founders,Number of of Research publications,Skills score,Team Composition score,Dificulty of Obtaining Work force,Pricing Strategy,Hyper localisation,Time to market service or product,Employee benefits and salary structures,Long term relationship with other founders,Proprietary or patent position (competitive position),Barriers of entry for the competitors,Company awards,Controversial history of founder or co founder,Legal risk and intellectual property,Client Reputation,google page rank of company website,Technical proficiencies to analyse and interpret unstructured data,Solutions offered,Invested through global incubation competitions?,Industry trend in investing,Disruptiveness of technology,Number of Direct competitors,Employees per year of company existence,Last round of funding received (in milionUSD),"Survival through recession, based on existence of the company through recession times",Time to 1st investment (in months),"Avg time to investment - average across all rounds, measured from previous investment",Gartner hype cycle stage,Time to maturity of technology (in years),Percent_skill_Entrepreneurship,Percent_skill_Operations,Percent_skill_Engineering,Percent_skill_Marketing,Percent_skill_Leadership,Percent_skill_Data Science,Percent_skill_Business Strategy,Percent_skill_Product Management,Percent_skill_Sales,Percent_skill_Domain,Percent_skill_Law,Percent_skill_Consulting,Percent_skill_Finance,Percent_skill_Investment,Renown score
0,Company1,Success,No Info,No Info,-1.0,Video distribution,,operation,KPCB Holdings|Draper Fisher Jurvetson (DFJ)|Kl...,3.0,0.0,No,,5/26/2013,450000.0,United States,North America,2,0,1,2,2,15,Yes,4,Nothing,No,Small,No,No,No,No,Service,No,Private,No,Both,Yes,Platform,Global,Linear,Yes,,No,No,No,No,No,No,No,No,No,Yes,Online,B2C,High,High,Yes,Low,Masters,21,business,Yes,Yes,Tier_1,500,Medium,0,0,0,,0,,0.0,Low,Low,Yes,No,High,No Info,No,No,Yes,No,No,No,No Info,9626884,No,Yes,No,2.0,Low,0,1.5,0.45,No Info,No Info,11.56,,,0.0,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0,0,0,0,0,0
1,Company2,Success,2011,3,125.0,,Market Research|Marketing|Crowdfunding,"Marketing, sales",,,,No,,,,United States,North America,5,0,2,0,4,20,No,0,medium,Yes,Large,Yes,Yes,No,No,Product,No,Public,Yes,Both,No,Platform,Local,Non-Linear,No,Few,Yes,No,Yes,Yes,No,No,Yes,Yes,Yes,No,Online,B2C,Low,High,Yes,High,Masters,21,Supply Chain Management & Entrepreneurship,Yes,Yes,Tier_1,500,High,0,0,0,Medium,13,,34.0,High,Medium,Yes,No,Low,No Info,No,Yes,Yes,No,No,Yes,Medium,1067034,Yes,Yes,No,3.0,Medium,0,6.666666667,5.0,Not Applicable,10,9.0,Trough,2 to 5,15.88235294,11.76470588,15.0,12.94117647,0,8.823529412,21.76470588,10.88235294,2.941176471,0.0,0,0,0,0,8
2,Company3,Success,2011,3,455.0,Event Data Analytics API,Analytics|Cloud Computing|Software Development,operations,TechStars|Streamlined Ventures|Amplify Partner...,14.0,0.0,No,12/1/2011,10/23/2013,2350000.0,United States,North America,15,0,3,0,7,10,No,0,low,Yes,Medium,No,No,No,No,Both,Yes,Private,Yes,Both,Yes,cloud,Local,Non-Linear,No,Few,No,No,No,Yes,No,No,Yes,No,No,No,Online,B2B,Low,Medium,Yes,Low,Bachelors,18,General,Yes,Yes,Tier_2,500,High,0,0,1,Medium,18,,36.0,High,Medium,Yes,No,Low,No Info,Yes,Yes,Yes,No,No,No,Low,71391,Yes,Yes,Yes,3.0,Medium,0,3.333333333,2.35,Not Applicable,2,7.344444444,Trough,2 to 5,9.401709402,0.0,57.47863248,0.0,0,3.846153846,17.09401709,9.401709402,0.0,2.777777778,0,0,0,0,9
3,Company4,Success,2009,5,-99.0,The most advanced analytics for mobile,Mobile|Analytics,Marketing & Sales,Michael Birch|Max Levchin|Sequoia Capital|Keit...,45.0,10.0,No,6/20/2009,5/10/2012,10250000.0,United States,North America,6,0,2,0,4,50,Yes,0,low,No,Large,Yes,Yes,No,No,Product,Yes,Public,Yes,Structured,Yes,Platform,Local,Non-Linear,No,Few,Yes,No,No,No,No,No,No,No,No,No,Online,B2C,Medium,Medium,Yes,Low,Bachelors,18,Computer Systems Engineering,Yes,Yes,Tier_2,No Info,Low,0,0,0,Medium,2,,15.5,Medium,Medium,Yes,No,Low,Good,No,Yes,Yes,No,No,No,Low,11847,No,Yes,Yes,4.0,Medium,2,10.0,10.25,Not Applicable,1,8.7,Trough,2 to 5,0.0,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0,0,0,0,0,5
4,Company5,Success,2010,4,496.0,The Location-Based Marketing Platform,Analytics|Marketing|Enterprise Software,Marketing & Sales,DFJ Frontier|Draper Nexus Ventures|Gil Elbaz|A...,39.0,3.0,No,4/1/2010,12/11/2013,5500000.0,United States,North America,7,0,1,1,8,40,No,0,high,No,Small,No,No,No,No,Product,Yes,Public,Yes,Both,No,Platform,Local,Non-Linear,Yes,Few,No,No,No,No,No,No,Yes,No,No,No,Online,B2B,Low,High,Yes,Medium,Bachelors,18,Industrial Engineering and Computer Science,Yes,Yes,,500,High,0,0,0,Low,5,Few,23.0,Medium,Medium,Yes,No,Low,Bad,Yes,Yes,Yes,No,No,No,Low,201814,Yes,Yes,No,3.0,Medium,0,10.0,5.5,Not Applicable,13,9.822222222,,,0.0,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0,0,0,0,0,6


Lucky for us, *latin1* was the real encoding!

***
## Preparing data

Now let's prepare our data for classification and separate the features from the labels. 

**Let's print the shape of our dataset before cleaning and hope that we'll not destroy it comletely after cleaning. :D**

In [623]:
print("Shape before cleaning: ", df.shape)

Shape before cleaning:  (472, 116)


### Choosing the feature set...

First of all, it seems like following columns don't give too much value for our purpose: *'Company_Name', 'Short Description of company profile', 'Last Funding Date', 'Est. Founding Date', 'Age of company in years', 'Gartner hype cycle stage', 'Time to maturity of technology (in years)'*, so let's drop them!

Also, the status column should not be in the feature set.

In [624]:
features = df.drop(['Dependent-Company Status', 'Company_Name', 'Short Description of company profile', 
                    'Last Funding Date', 'Est. Founding Date', 'Age of company in years', 
                    'Gartner hype cycle stage', 'Time to maturity of technology (in years)'], axis=1)
features.head()

Unnamed: 0,year of founding,Internet Activity Score,Industry of company,Focus functions of company,Investors,Employee Count,Employees count MoM change,Has the team size grown,Last Funding Amount,Country of company,Continent of company,Number of Investors in Seed,Number of Investors in Angel and or VC,Number of Co-founders,Number of of advisors,Team size Senior leadership,Team size all employees,Presence of a top angel or venture fund in previous round of investment,Number of of repeat investors,Number of Sales Support material,Worked in top companies,Average size of companies worked for in the past,Have been part of startups in the past?,Have been part of successful startups in the past?,Was he or she partner in Big 5 consulting?,Consulting experience?,Product or service company?,Catering to product/service across verticals,Focus on private or public data?,Focus on consumer data?,Focus on structured or unstructured data,Subscription based business,Cloud or platform based serive/product?,Local or global player,Linear or Non-linear business model,"Capital intensive business e.g. e-commerce, Engineering products and operations can also cause a business to be capital intensive",Number of of Partners of company,Crowdsourcing based business,Crowdfunding based business,Machine Learning based business,Predictive Analytics business,Speech analytics business,Prescriptive analytics business,Big Data Business,Cross-Channel Analytics/ marketing channels,Owns data or not? (monetization of data) e.g. Factual,Is the company an aggregator/market place? e.g. Bluekai,Online or offline venture - physical location based business or online venture?,B2C or B2B venture?,Top forums like 'Tech crunch' or 'Venture beat' talking about the company/model - How much is it being talked about?,Average Years of experience for founder and co founder,Exposure across the globe,Breadth of experience across verticals,Highest education,Years of education,Specialization of highest education,Relevance of education to venture,Relevance of experience to venture,Degree from a Tier 1 or Tier 2 university?,Renowned in professional circle,Experience in selling and building products,Experience in Fortune 100 organizations,Experience in Fortune 500 organizations,Experience in Fortune 1000 organizations,Top management similarity,Number of Recognitions for Founders and Co-founders,Number of of Research publications,Skills score,Team Composition score,Dificulty of Obtaining Work force,Pricing Strategy,Hyper localisation,Time to market service or product,Employee benefits and salary structures,Long term relationship with other founders,Proprietary or patent position (competitive position),Barriers of entry for the competitors,Company awards,Controversial history of founder or co founder,Legal risk and intellectual property,Client Reputation,google page rank of company website,Technical proficiencies to analyse and interpret unstructured data,Solutions offered,Invested through global incubation competitions?,Industry trend in investing,Disruptiveness of technology,Number of Direct competitors,Employees per year of company existence,Last round of funding received (in milionUSD),"Survival through recession, based on existence of the company through recession times",Time to 1st investment (in months),"Avg time to investment - average across all rounds, measured from previous investment",Percent_skill_Entrepreneurship,Percent_skill_Operations,Percent_skill_Engineering,Percent_skill_Marketing,Percent_skill_Leadership,Percent_skill_Data Science,Percent_skill_Business Strategy,Percent_skill_Product Management,Percent_skill_Sales,Percent_skill_Domain,Percent_skill_Law,Percent_skill_Consulting,Percent_skill_Finance,Percent_skill_Investment,Renown score
0,No Info,-1.0,,operation,KPCB Holdings|Draper Fisher Jurvetson (DFJ)|Kl...,3.0,0.0,No,450000.0,United States,North America,2,0,1,2,2,15,Yes,4,Nothing,No,Small,No,No,No,No,Service,No,Private,No,Both,Yes,Platform,Global,Linear,Yes,,No,No,No,No,No,No,No,No,No,Yes,Online,B2C,High,High,Yes,Low,Masters,21,business,Yes,Yes,Tier_1,500,Medium,0,0,0,,0,,0.0,Low,Low,Yes,No,High,No Info,No,No,Yes,No,No,No,No Info,9626884,No,Yes,No,2.0,Low,0,1.5,0.45,No Info,No Info,11.56,0.0,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0,0,0,0,0,0
1,2011,125.0,Market Research|Marketing|Crowdfunding,"Marketing, sales",,,,No,,United States,North America,5,0,2,0,4,20,No,0,medium,Yes,Large,Yes,Yes,No,No,Product,No,Public,Yes,Both,No,Platform,Local,Non-Linear,No,Few,Yes,No,Yes,Yes,No,No,Yes,Yes,Yes,No,Online,B2C,Low,High,Yes,High,Masters,21,Supply Chain Management & Entrepreneurship,Yes,Yes,Tier_1,500,High,0,0,0,Medium,13,,34.0,High,Medium,Yes,No,Low,No Info,No,Yes,Yes,No,No,Yes,Medium,1067034,Yes,Yes,No,3.0,Medium,0,6.666666667,5.0,Not Applicable,10,9.0,15.88235294,11.76470588,15.0,12.94117647,0,8.823529412,21.76470588,10.88235294,2.941176471,0.0,0,0,0,0,8
2,2011,455.0,Analytics|Cloud Computing|Software Development,operations,TechStars|Streamlined Ventures|Amplify Partner...,14.0,0.0,No,2350000.0,United States,North America,15,0,3,0,7,10,No,0,low,Yes,Medium,No,No,No,No,Both,Yes,Private,Yes,Both,Yes,cloud,Local,Non-Linear,No,Few,No,No,No,Yes,No,No,Yes,No,No,No,Online,B2B,Low,Medium,Yes,Low,Bachelors,18,General,Yes,Yes,Tier_2,500,High,0,0,1,Medium,18,,36.0,High,Medium,Yes,No,Low,No Info,Yes,Yes,Yes,No,No,No,Low,71391,Yes,Yes,Yes,3.0,Medium,0,3.333333333,2.35,Not Applicable,2,7.344444444,9.401709402,0.0,57.47863248,0.0,0,3.846153846,17.09401709,9.401709402,0.0,2.777777778,0,0,0,0,9
3,2009,-99.0,Mobile|Analytics,Marketing & Sales,Michael Birch|Max Levchin|Sequoia Capital|Keit...,45.0,10.0,No,10250000.0,United States,North America,6,0,2,0,4,50,Yes,0,low,No,Large,Yes,Yes,No,No,Product,Yes,Public,Yes,Structured,Yes,Platform,Local,Non-Linear,No,Few,Yes,No,No,No,No,No,No,No,No,No,Online,B2C,Medium,Medium,Yes,Low,Bachelors,18,Computer Systems Engineering,Yes,Yes,Tier_2,No Info,Low,0,0,0,Medium,2,,15.5,Medium,Medium,Yes,No,Low,Good,No,Yes,Yes,No,No,No,Low,11847,No,Yes,Yes,4.0,Medium,2,10.0,10.25,Not Applicable,1,8.7,0.0,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0,0,0,0,0,5
4,2010,496.0,Analytics|Marketing|Enterprise Software,Marketing & Sales,DFJ Frontier|Draper Nexus Ventures|Gil Elbaz|A...,39.0,3.0,No,5500000.0,United States,North America,7,0,1,1,8,40,No,0,high,No,Small,No,No,No,No,Product,Yes,Public,Yes,Both,No,Platform,Local,Non-Linear,Yes,Few,No,No,No,No,No,No,Yes,No,No,No,Online,B2B,Low,High,Yes,Medium,Bachelors,18,Industrial Engineering and Computer Science,Yes,Yes,,500,High,0,0,0,Low,5,Few,23.0,Medium,Medium,Yes,No,Low,Bad,Yes,Yes,Yes,No,No,No,Low,201814,Yes,Yes,No,3.0,Medium,0,10.0,5.5,Not Applicable,13,9.822222222,0.0,0.0,0.0,0.0,0,0.0,0.0,0.0,0.0,0.0,0,0,0,0,6


In [625]:
statuses = df['Dependent-Company Status']
statuses.head()

0    Success
1    Success
2    Success
3    Success
4    Success
Name: Dependent-Company Status, dtype: object

### Handling NaN values...

Before we rush to drop the rows that have at least one NaN value, let's see which features have the most NaN values. Maybe the presence of those features is less important compared to the amount of data we'll lose...

In [626]:
nan_df = pd.DataFrame(features.isna().sum().sort_values(ascending=False))
print(nan_df[nan_df[0] != 0].count())
nan_df[nan_df[0] != 0]

0    12
dtype: int64


Unnamed: 0,0
Employees count MoM change,205
Employee Count,166
Last Funding Amount,160
Investors,140
Industry of company,124
Specialization of highest education,97
Industry trend in investing,82
Continent of company,71
Country of company,71
Internet Activity Score,65


The overall picture is not that bad, from 116 features, only 12 have NaN values, the rest are either fully filled in or the NaN values are expressed verbally, like 'no info'. Let's now quickly go over the list and see if it makes sense to keep these features.

- **'Employees count MoM change'** makes nearly the half of the dataset drop, so let's just get rid of that feature, as I don't even know what MoM is (Ministry of Manpower?).
- **'Employee Count'** who doesn't have this record? It seems like such a  primitive question! Anyways, this is an important data, and, sadly, we have to drop 166 lines!
- **'Last Funding Amount'** let's just drop this column.
- **'Investors'** this one is different, my intuition is that the rows that don't have investors have NaN values, so instead of dropping them, I'm going to substitute those with empty strings. Later, when I will make this variable binary, those rows will just have 0s for all the Investors. 
- **'Industry of company'** is pretty important in my opinion, so I'm going to drop the rows that have NaN.
- **'Specialization of highest education'** is a crazy field: some have meaningful answers and some have things like _"PhD"_ or _"INDUSTRI"_ or _"computers"_. Plus, this field is not directly connected to the success, in my opinion, the connection between the field of the startup and the specialization is what matters. We could filter all the fields out from specialization in complex ways and then compare with the fields of the startup, but if we take a quick look at the dataset, we'll see that the answers vary a lot and maybe it's best in this case to just discard this column and not over-engineer.
- The rest are fairly important and don't have drastic results, so we'll just drom the NaN rows.

In [627]:
# Drop the discussed columns.
features = features.drop(['Employees count MoM change', 'Last Funding Amount', 
                          'Specialization of highest education'], axis=1)

# Sustitute NaNs with empty strings.
features['Investors'] = features['Investors'].apply(lambda x: x if isinstance(x, str) else '')

# Drop the rest of the rows.
features = features.dropna()

features.head()

Unnamed: 0,year of founding,Internet Activity Score,Industry of company,Focus functions of company,Investors,Employee Count,Has the team size grown,Country of company,Continent of company,Number of Investors in Seed,Number of Investors in Angel and or VC,Number of Co-founders,Number of of advisors,Team size Senior leadership,Team size all employees,Presence of a top angel or venture fund in previous round of investment,Number of of repeat investors,Number of Sales Support material,Worked in top companies,Average size of companies worked for in the past,Have been part of startups in the past?,Have been part of successful startups in the past?,Was he or she partner in Big 5 consulting?,Consulting experience?,Product or service company?,Catering to product/service across verticals,Focus on private or public data?,Focus on consumer data?,Focus on structured or unstructured data,Subscription based business,Cloud or platform based serive/product?,Local or global player,Linear or Non-linear business model,"Capital intensive business e.g. e-commerce, Engineering products and operations can also cause a business to be capital intensive",Number of of Partners of company,Crowdsourcing based business,Crowdfunding based business,Machine Learning based business,Predictive Analytics business,Speech analytics business,Prescriptive analytics business,Big Data Business,Cross-Channel Analytics/ marketing channels,Owns data or not? (monetization of data) e.g. Factual,Is the company an aggregator/market place? e.g. Bluekai,Online or offline venture - physical location based business or online venture?,B2C or B2B venture?,Top forums like 'Tech crunch' or 'Venture beat' talking about the company/model - How much is it being talked about?,Average Years of experience for founder and co founder,Exposure across the globe,Breadth of experience across verticals,Highest education,Years of education,Relevance of education to venture,Relevance of experience to venture,Degree from a Tier 1 or Tier 2 university?,Renowned in professional circle,Experience in selling and building products,Experience in Fortune 100 organizations,Experience in Fortune 500 organizations,Experience in Fortune 1000 organizations,Top management similarity,Number of Recognitions for Founders and Co-founders,Number of of Research publications,Skills score,Team Composition score,Dificulty of Obtaining Work force,Pricing Strategy,Hyper localisation,Time to market service or product,Employee benefits and salary structures,Long term relationship with other founders,Proprietary or patent position (competitive position),Barriers of entry for the competitors,Company awards,Controversial history of founder or co founder,Legal risk and intellectual property,Client Reputation,google page rank of company website,Technical proficiencies to analyse and interpret unstructured data,Solutions offered,Invested through global incubation competitions?,Industry trend in investing,Disruptiveness of technology,Number of Direct competitors,Employees per year of company existence,Last round of funding received (in milionUSD),"Survival through recession, based on existence of the company through recession times",Time to 1st investment (in months),"Avg time to investment - average across all rounds, measured from previous investment",Percent_skill_Entrepreneurship,Percent_skill_Operations,Percent_skill_Engineering,Percent_skill_Marketing,Percent_skill_Leadership,Percent_skill_Data Science,Percent_skill_Business Strategy,Percent_skill_Product Management,Percent_skill_Sales,Percent_skill_Domain,Percent_skill_Law,Percent_skill_Consulting,Percent_skill_Finance,Percent_skill_Investment,Renown score
2,2011,455.0,Analytics|Cloud Computing|Software Development,operations,TechStars|Streamlined Ventures|Amplify Partner...,14.0,No,United States,North America,15,0,3,0,7,10,No,0,low,Yes,Medium,No,No,No,No,Both,Yes,Private,Yes,Both,Yes,cloud,Local,Non-Linear,No,Few,No,No,No,Yes,No,No,Yes,No,No,No,Online,B2B,Low,Medium,Yes,Low,Bachelors,18,Yes,Yes,Tier_2,500,High,0,0,1,Medium,18,,36.0,High,Medium,Yes,No,Low,No Info,Yes,Yes,Yes,No,No,No,Low,71391,Yes,Yes,Yes,3.0,Medium,0,3.333333333,2.35,Not Applicable,2,7.344444444,9.401709402,0,57.47863248,0.0,0.0,3.846153846,17.09401709,9.401709402,0.0,2.777777778,0,0,0,0,9
3,2009,-99.0,Mobile|Analytics,Marketing & Sales,Michael Birch|Max Levchin|Sequoia Capital|Keit...,45.0,No,United States,North America,6,0,2,0,4,50,Yes,0,low,No,Large,Yes,Yes,No,No,Product,Yes,Public,Yes,Structured,Yes,Platform,Local,Non-Linear,No,Few,Yes,No,No,No,No,No,No,No,No,No,Online,B2C,Medium,Medium,Yes,Low,Bachelors,18,Yes,Yes,Tier_2,No Info,Low,0,0,0,Medium,2,,15.5,Medium,Medium,Yes,No,Low,Good,No,Yes,Yes,No,No,No,Low,11847,No,Yes,Yes,4.0,Medium,2,10.0,10.25,Not Applicable,1,8.7,0.0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0,0,5
4,2010,496.0,Analytics|Marketing|Enterprise Software,Marketing & Sales,DFJ Frontier|Draper Nexus Ventures|Gil Elbaz|A...,39.0,No,United States,North America,7,0,1,1,8,40,No,0,high,No,Small,No,No,No,No,Product,Yes,Public,Yes,Both,No,Platform,Local,Non-Linear,Yes,Few,No,No,No,No,No,No,Yes,No,No,No,Online,B2B,Low,High,Yes,Medium,Bachelors,18,Yes,Yes,,500,High,0,0,0,Low,5,Few,23.0,Medium,Medium,Yes,No,Low,Bad,Yes,Yes,Yes,No,No,No,Low,201814,Yes,Yes,No,3.0,Medium,0,10.0,5.5,Not Applicable,13,9.822222222,0.0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0,0,6
5,2010,106.0,Food & Beverages|Hospitality,analytics,Pritzker Group Venture Capital|Excelerate Labs...,14.0,No,United States,North America,2,0,4,0,4,14,No,2,medium,No,Medium,No,No,No,No,Service,No,Public,Yes,Both,No,Platform,Local,Non-Linear,No,Few,No,No,No,No,No,No,Yes,No,Yes,No,Online,B2B,Low,High,Yes,Medium,Masters,21,No,No,,500,Low,0,0,0,Low,5,,25.0,Medium,Medium,Yes,No,Low,Good,No,Yes,No,Yes,No,No,High,591816,Yes,Yes,Yes,4.0,High,0,3.5,1.0,Not Applicable,12,9.322222222,6.25,0,3.125,15.625,9.375,3.125,6.25,3.125,3.125,0.0,0,0,0,0,6
6,2011,39.0,Analytics,Research,Plug & Play Ventures|Correlation Ventures|Cros...,7.0,No,United States,North America,7,0,2,9,2,15,No,4,medium,No,Small,Yes,Yes,No,No,Both,Yes,Public,Yes,Unstructured,Yes,Platform,Local,Non-Linear,No,Few,No,No,No,Yes,No,No,Yes,No,No,No,Online,B2B,Low,High,Yes,Medium,PhD,25,Yes,Yes,,No Info,Low,0,0,0,Low,1,,21.0,Medium,Medium,Yes,No,Medium,No Info,No,No,Yes,No,No,Yes,Low,2345574,Yes,Yes,No,3.0,Medium,0,5.0,2.0,Not Applicable,11,7.311111111,0.0,0,66.66666667,5.555555556,0.0,22.22222222,0.0,0.0,0.0,5.555555556,0,0,0,0,0


In [628]:
features.shape

(239, 105)

In [629]:
display(features)

Unnamed: 0,year of founding,Internet Activity Score,Industry of company,Focus functions of company,Investors,Employee Count,Has the team size grown,Country of company,Continent of company,Number of Investors in Seed,Number of Investors in Angel and or VC,Number of Co-founders,Number of of advisors,Team size Senior leadership,Team size all employees,Presence of a top angel or venture fund in previous round of investment,Number of of repeat investors,Number of Sales Support material,Worked in top companies,Average size of companies worked for in the past,Have been part of startups in the past?,Have been part of successful startups in the past?,Was he or she partner in Big 5 consulting?,Consulting experience?,Product or service company?,Catering to product/service across verticals,Focus on private or public data?,Focus on consumer data?,Focus on structured or unstructured data,Subscription based business,Cloud or platform based serive/product?,Local or global player,Linear or Non-linear business model,"Capital intensive business e.g. e-commerce, Engineering products and operations can also cause a business to be capital intensive",Number of of Partners of company,Crowdsourcing based business,Crowdfunding based business,Machine Learning based business,Predictive Analytics business,Speech analytics business,Prescriptive analytics business,Big Data Business,Cross-Channel Analytics/ marketing channels,Owns data or not? (monetization of data) e.g. Factual,Is the company an aggregator/market place? e.g. Bluekai,Online or offline venture - physical location based business or online venture?,B2C or B2B venture?,Top forums like 'Tech crunch' or 'Venture beat' talking about the company/model - How much is it being talked about?,Average Years of experience for founder and co founder,Exposure across the globe,Breadth of experience across verticals,Highest education,Years of education,Relevance of education to venture,Relevance of experience to venture,Degree from a Tier 1 or Tier 2 university?,Renowned in professional circle,Experience in selling and building products,Experience in Fortune 100 organizations,Experience in Fortune 500 organizations,Experience in Fortune 1000 organizations,Top management similarity,Number of Recognitions for Founders and Co-founders,Number of of Research publications,Skills score,Team Composition score,Dificulty of Obtaining Work force,Pricing Strategy,Hyper localisation,Time to market service or product,Employee benefits and salary structures,Long term relationship with other founders,Proprietary or patent position (competitive position),Barriers of entry for the competitors,Company awards,Controversial history of founder or co founder,Legal risk and intellectual property,Client Reputation,google page rank of company website,Technical proficiencies to analyse and interpret unstructured data,Solutions offered,Invested through global incubation competitions?,Industry trend in investing,Disruptiveness of technology,Number of Direct competitors,Employees per year of company existence,Last round of funding received (in milionUSD),"Survival through recession, based on existence of the company through recession times",Time to 1st investment (in months),"Avg time to investment - average across all rounds, measured from previous investment",Percent_skill_Entrepreneurship,Percent_skill_Operations,Percent_skill_Engineering,Percent_skill_Marketing,Percent_skill_Leadership,Percent_skill_Data Science,Percent_skill_Business Strategy,Percent_skill_Product Management,Percent_skill_Sales,Percent_skill_Domain,Percent_skill_Law,Percent_skill_Consulting,Percent_skill_Finance,Percent_skill_Investment,Renown score
2,2011,455.0,Analytics|Cloud Computing|Software Development,operations,TechStars|Streamlined Ventures|Amplify Partner...,14.0,No,United States,North America,15,0,3,0,7,10,No,0,low,Yes,Medium,No,No,No,No,Both,Yes,Private,Yes,Both,Yes,cloud,Local,Non-Linear,No,Few,No,No,No,Yes,No,No,Yes,No,No,No,Online,B2B,Low,Medium,Yes,Low,Bachelors,18,Yes,Yes,Tier_2,500,High,0,0,1,Medium,18,,36.0,High,Medium,Yes,No,Low,No Info,Yes,Yes,Yes,No,No,No,Low,71391,Yes,Yes,Yes,3.0,Medium,0,3.333333333,2.35,Not Applicable,2,7.344444444,9.401709402,0.0,57.47863248,0.0,0.0,3.846153846,17.09401709,9.401709402,0.0,2.777777778,0.0,0.0,0.0,0.0,9
3,2009,-99.0,Mobile|Analytics,Marketing & Sales,Michael Birch|Max Levchin|Sequoia Capital|Keit...,45.0,No,United States,North America,6,0,2,0,4,50,Yes,0,low,No,Large,Yes,Yes,No,No,Product,Yes,Public,Yes,Structured,Yes,Platform,Local,Non-Linear,No,Few,Yes,No,No,No,No,No,No,No,No,No,Online,B2C,Medium,Medium,Yes,Low,Bachelors,18,Yes,Yes,Tier_2,No Info,Low,0,0,0,Medium,2,,15.5,Medium,Medium,Yes,No,Low,Good,No,Yes,Yes,No,No,No,Low,11847,No,Yes,Yes,4.0,Medium,2,10,10.25,Not Applicable,1,8.7,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,5
4,2010,496.0,Analytics|Marketing|Enterprise Software,Marketing & Sales,DFJ Frontier|Draper Nexus Ventures|Gil Elbaz|A...,39.0,No,United States,North America,7,0,1,1,8,40,No,0,high,No,Small,No,No,No,No,Product,Yes,Public,Yes,Both,No,Platform,Local,Non-Linear,Yes,Few,No,No,No,No,No,No,Yes,No,No,No,Online,B2B,Low,High,Yes,Medium,Bachelors,18,Yes,Yes,,500,High,0,0,0,Low,5,Few,23.0,Medium,Medium,Yes,No,Low,Bad,Yes,Yes,Yes,No,No,No,Low,201814,Yes,Yes,No,3.0,Medium,0,10,5.5,Not Applicable,13,9.822222222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,6
5,2010,106.0,Food & Beverages|Hospitality,analytics,Pritzker Group Venture Capital|Excelerate Labs...,14.0,No,United States,North America,2,0,4,0,4,14,No,2,medium,No,Medium,No,No,No,No,Service,No,Public,Yes,Both,No,Platform,Local,Non-Linear,No,Few,No,No,No,No,No,No,Yes,No,Yes,No,Online,B2B,Low,High,Yes,Medium,Masters,21,No,No,,500,Low,0,0,0,Low,5,,25.0,Medium,Medium,Yes,No,Low,Good,No,Yes,No,Yes,No,No,High,591816,Yes,Yes,Yes,4.0,High,0,3.5,1,Not Applicable,12,9.322222222,6.25,0.0,3.125,15.625,9.375,3.125,6.25,3.125,3.125,0.0,0.0,0.0,0.0,0.0,6
6,2011,39.0,Analytics,Research,Plug & Play Ventures|Correlation Ventures|Cros...,7.0,No,United States,North America,7,0,2,9,2,15,No,4,medium,No,Small,Yes,Yes,No,No,Both,Yes,Public,Yes,Unstructured,Yes,Platform,Local,Non-Linear,No,Few,No,No,No,Yes,No,No,Yes,No,No,No,Online,B2B,Low,High,Yes,Medium,PhD,25,Yes,Yes,,No Info,Low,0,0,0,Low,1,,21.0,Medium,Medium,Yes,No,Medium,No Info,No,No,Yes,No,No,Yes,Low,2345574,Yes,Yes,No,3.0,Medium,0,5,2,Not Applicable,11,7.311111111,0.0,0.0,66.66666667,5.555555556,0.0,22.22222222,0.0,0.0,0.0,5.555555556,0.0,0.0,0.0,0.0,0
7,2010,139.0,Cloud Computing|Network / Hosting / Infrastruc...,Computing,Norwest Venture Partners|Bessemer Venture Part...,29.0,No,United States,North America,0,0,3,4,4,40,No,0,medium,No,Large,Yes,Yes,No,No,Both,Yes,Private,No,Both,No,Platform,Local,Non-Linear,No,Few,No,No,No,Yes,No,Yes,Yes,No,No,No,Online,B2B,Medium,High,Yes,Medium,Masters,21,Yes,Yes,Tier_1,500,Medium,0,0,0,High,5,Few,4.5,Medium,Medium,Yes,No,Low,Bad,No,No,Yes,No,No,No,Low,1015027,Yes,Yes,No,3.0,Medium,0,10,6.7,Not Applicable,20,6.4,0.0,0.0,100.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2
8,2011,306.0,Analytics|Mobile|Marketing,Marketing,Promus Ventures|SoftTech VC|Costanoa Venture C...,16.0,No,United States,North America,13,0,2,0,2,50,No,0,medium,Yes,Medium,Yes,Yes,No,No,Product,Yes,Public,Yes,Both,No,Platform,Local,Non-Linear,No,Few,Yes,No,No,No,No,No,Yes,No,Yes,No,Online,B2B,Medium,High,Yes,High,Masters,21,Yes,Yes,Tier_1,500,High,0,0,0,Medium,13,,48.0,Low,Medium,No,No,Low,No Info,No,Yes,No,No,No,No,Low,256921,Yes,Yes,No,3.0,Medium,0,16.66666667,11,Not Applicable,18,12,8.333333333,0.0,46.73202614,5.718954248,8.333333333,0.0,19.77124183,2.777777778,2.777777778,0.0,0.0,0.0,0.0,5.555555556,5
9,2013,53.0,Healthcare|Pharmaceuticals|Analytics,Research,Khosla Ventures,3.0,No,United States,North America,0,0,3,4,3,3,No,0,Nothing,Yes,Medium,Yes,Yes,No,No,Product,No,Public,Yes,Structured,No,Platform,Global,Non-Linear,No,,No,No,No,Yes,No,No,Yes,No,No,No,Online,B2B,Low,High,Yes,High,PhD,25,Yes,Yes,Tier_1,500,High,0,0,0,Medium,1,,25.5,Medium,High,Yes,No,Medium,No Info,Yes,No,Yes,No,No,Yes,Low,3639779,No,Yes,No,1.0,Low,0,3,3,Not Applicable,5,5,8.333333333,0.0,27.08333333,19.79166667,0.0,23.95833333,0.0,0.0,0.0,20.83333333,0.0,0.0,0.0,0.0,4
10,2011,762.0,Analytics|Enterprise Software,"Sales, marketing",Redpoint Ventures|First Round Capital|PivotNor...,34.0,No,United States,North America,0,0,2,1,8,50,No,0,medium,No,Small,Yes,Yes,No,No,Both,No,Public,Yes,Both,No,Platform,Global,Non-Linear,No,Many,No,No,No,No,No,No,Yes,No,No,No,Online,B2B,Medium,High,Yes,High,Bachelors,18,No,Yes,,500,High,0,0,1,,4,,14.5,High,Medium,Yes,No,Low,No Info,No,Yes,Yes,No,No,No,High,142907,Yes,Yes,No,4.0,Medium,4,25,16,Not Applicable,24,2.666666667,3.846153846,0.0,26.92307692,0.0,3.846153846,3.846153846,7.692307692,0.0,3.846153846,0.0,0.0,0.0,0.0,0.0,6
12,2010,115.0,Media|Finance|Marketing,Marketing \nsales,Battery Ventures|In-Q-Tel|Jump Capital|SAP Ven...,47.0,No,United States,North America,0,0,0,1,8,50,No,1,high,No,Medium,Yes,Yes,No,Yes,Product,Yes,Private,Yes,Structured,No,Cloud,Local,Non-Linear,Yes,Few,No,No,No,No,No,No,No,Yes,No,No,Online,B2B,Medium,High,Yes,Medium,Bachelors,18,Yes,Yes,,500,High,1,0,0,,0,Few,0.0,High,Medium,Yes,No,Low,Good,No,Yes,Yes,Yes,No,No,Low,172054,No,Yes,No,3.0,Medium,0,16.6,11.5,Not Applicable,12,6,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0


### Handling not NaN empty values...

Although we removed a bunch of rows with NaN values, there are still some records that in principle are not NaN but have the same effect. As those values can be anything, I am going to manually look into the dataset to detect them. To, nevertheless, ease the job, I'll look into the set of unique values of each column. 

In [630]:
col_sets = []
for c in features.columns:
    col_sets.append(c)
    col_sets.append(features[c].unique())
    
#col_sets
col_sets[20:26] # Just showing a sample batch.

['Number of Investors in Angel and or VC',
 array(['0', '2', '3', '6', '1', 'No Info', '4', '5', '7', '9', '8'],
       dtype=object),
 'Number of Co-founders',
 array([3, 2, 1, 4, 0, 5, 7, 6]),
 'Number of of advisors',
 array([ 0,  1,  9,  4,  8,  2,  6,  3,  7,  5, 11])]

 **Based on my obervations, here are the NaN value strings:**

- 'No Info' 
<!-- 'Number of Investors in Seed', 'Number of Investors in Angel and or VC', 'Presence of a top angel or venture fund in previous round of investment', 'year of founding', 'Team size all employees', 'Presence of a top angel or venture fund in previous round of investment', 'Subscription based business', 'Local or global player', 'Linear or Non-linear business model', 'Capital intensive business e.g. e-commerce, Engineering products and operations can also cause a business to be capital intensive', 'Number of  of Partners of company', 'Online or offline venture - physical location based business or online venture?', 'Exposure across the globe', 'Renowned in professional circle', 'Breadth of experience across verticals', 'Highest education', 'Years of education', 'Relevance of education to venture', 'Relevance of experience to venture', 'Degree from a Tier 1 or Tier 2 university?', 'Employees per year of company existence', 'Time to 1st investment (in months)', 'Avg time to investment - average across all rounds, measured from previous investment', 'Team Composition score', 'Pricing Strategy', 'Time to market service or product', 'Employee benefits and salary structures', 'Long term relationship with other founders', 'Client Reputation', 'Solutions offered', 'Invested through global incubation competitions?', 'Survival through recession, based on existence of the company through recession times' -->
- '\\\\' 
- 'N'
- 'unknown amount'
 

Let's see what are the losses of cleaning in this case.

In [631]:
no_info_df = pd.DataFrame(features.apply(lambda x: (x == "No Info") | (x == "\\") 
                        | (x == "N") | (x == "unknown amount"))
                          .sum().sort_values(ascending=False))

print(no_info_df[no_info_df[0] != 0].count())
no_info_df[no_info_df[0] != 0]

0    36
dtype: int64


  result = method(y)


Unnamed: 0,0
Employee benefits and salary structures,156
Client Reputation,103
Last round of funding received (in milionUSD),43
Presence of a top angel or venture fund in previous round of investment,36
google page rank of company website,26
Invested through global incubation competitions?,21
Years of education,13
Highest education,13
Employees per year of company existence,13
Number of of Partners of company,11


Let's drop the first 3 columns, as they cause major data loss, and clear the rest.

In [632]:
features = features.drop(['Employee benefits and salary structures', 'Client Reputation', 
                          'Presence of a top angel or venture fund in previous round of investment'], axis=1)

In [633]:
for col in features.columns:
    features = features[(features[col] != "No Info") & (features[col] != "\\") 
                        & (features[col] != "N") & (features[col] != "unknown amount")]
    
print(features.shape)
features = features.reset_index(drop=True)
features.head()

(118, 102)


Unnamed: 0,year of founding,Internet Activity Score,Industry of company,Focus functions of company,Investors,Employee Count,Has the team size grown,Country of company,Continent of company,Number of Investors in Seed,Number of Investors in Angel and or VC,Number of Co-founders,Number of of advisors,Team size Senior leadership,Team size all employees,Number of of repeat investors,Number of Sales Support material,Worked in top companies,Average size of companies worked for in the past,Have been part of startups in the past?,Have been part of successful startups in the past?,Was he or she partner in Big 5 consulting?,Consulting experience?,Product or service company?,Catering to product/service across verticals,Focus on private or public data?,Focus on consumer data?,Focus on structured or unstructured data,Subscription based business,Cloud or platform based serive/product?,Local or global player,Linear or Non-linear business model,"Capital intensive business e.g. e-commerce, Engineering products and operations can also cause a business to be capital intensive",Number of of Partners of company,Crowdsourcing based business,Crowdfunding based business,Machine Learning based business,Predictive Analytics business,Speech analytics business,Prescriptive analytics business,Big Data Business,Cross-Channel Analytics/ marketing channels,Owns data or not? (monetization of data) e.g. Factual,Is the company an aggregator/market place? e.g. Bluekai,Online or offline venture - physical location based business or online venture?,B2C or B2B venture?,Top forums like 'Tech crunch' or 'Venture beat' talking about the company/model - How much is it being talked about?,Average Years of experience for founder and co founder,Exposure across the globe,Breadth of experience across verticals,Highest education,Years of education,Relevance of education to venture,Relevance of experience to venture,Degree from a Tier 1 or Tier 2 university?,Renowned in professional circle,Experience in selling and building products,Experience in Fortune 100 organizations,Experience in Fortune 500 organizations,Experience in Fortune 1000 organizations,Top management similarity,Number of Recognitions for Founders and Co-founders,Number of of Research publications,Skills score,Team Composition score,Dificulty of Obtaining Work force,Pricing Strategy,Hyper localisation,Time to market service or product,Long term relationship with other founders,Proprietary or patent position (competitive position),Barriers of entry for the competitors,Company awards,Controversial history of founder or co founder,Legal risk and intellectual property,google page rank of company website,Technical proficiencies to analyse and interpret unstructured data,Solutions offered,Invested through global incubation competitions?,Industry trend in investing,Disruptiveness of technology,Number of Direct competitors,Employees per year of company existence,Last round of funding received (in milionUSD),"Survival through recession, based on existence of the company through recession times",Time to 1st investment (in months),"Avg time to investment - average across all rounds, measured from previous investment",Percent_skill_Entrepreneurship,Percent_skill_Operations,Percent_skill_Engineering,Percent_skill_Marketing,Percent_skill_Leadership,Percent_skill_Data Science,Percent_skill_Business Strategy,Percent_skill_Product Management,Percent_skill_Sales,Percent_skill_Domain,Percent_skill_Law,Percent_skill_Consulting,Percent_skill_Finance,Percent_skill_Investment,Renown score
0,2011,455.0,Analytics|Cloud Computing|Software Development,operations,TechStars|Streamlined Ventures|Amplify Partner...,14.0,No,United States,North America,15,0,3,0,7,10,0,low,Yes,Medium,No,No,No,No,Both,Yes,Private,Yes,Both,Yes,cloud,Local,Non-Linear,No,Few,No,No,No,Yes,No,No,Yes,No,No,No,Online,B2B,Low,Medium,Yes,Low,Bachelors,18,Yes,Yes,Tier_2,500,High,0,0,1,Medium,18,,36.0,High,Medium,Yes,No,Low,Yes,Yes,Yes,No,No,No,71391,Yes,Yes,Yes,3.0,Medium,0,3.333333333,2.35,Not Applicable,2,7.344444444,9.401709402,0,57.47863248,0.0,0.0,3.846153846,17.09401709,9.401709402,0.0,2.777777778,0,0,0,0.0,9
1,2010,496.0,Analytics|Marketing|Enterprise Software,Marketing & Sales,DFJ Frontier|Draper Nexus Ventures|Gil Elbaz|A...,39.0,No,United States,North America,7,0,1,1,8,40,0,high,No,Small,No,No,No,No,Product,Yes,Public,Yes,Both,No,Platform,Local,Non-Linear,Yes,Few,No,No,No,No,No,No,Yes,No,No,No,Online,B2B,Low,High,Yes,Medium,Bachelors,18,Yes,Yes,,500,High,0,0,0,Low,5,Few,23.0,Medium,Medium,Yes,No,Low,Yes,Yes,Yes,No,No,No,201814,Yes,Yes,No,3.0,Medium,0,10.0,5.5,Not Applicable,13,9.822222222,0.0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0,0.0,6
2,2010,106.0,Food & Beverages|Hospitality,analytics,Pritzker Group Venture Capital|Excelerate Labs...,14.0,No,United States,North America,2,0,4,0,4,14,2,medium,No,Medium,No,No,No,No,Service,No,Public,Yes,Both,No,Platform,Local,Non-Linear,No,Few,No,No,No,No,No,No,Yes,No,Yes,No,Online,B2B,Low,High,Yes,Medium,Masters,21,No,No,,500,Low,0,0,0,Low,5,,25.0,Medium,Medium,Yes,No,Low,No,Yes,No,Yes,No,No,591816,Yes,Yes,Yes,4.0,High,0,3.5,1.0,Not Applicable,12,9.322222222,6.25,0,3.125,15.625,9.375,3.125,6.25,3.125,3.125,0.0,0,0,0,0.0,6
3,2010,139.0,Cloud Computing|Network / Hosting / Infrastruc...,Computing,Norwest Venture Partners|Bessemer Venture Part...,29.0,No,United States,North America,0,0,3,4,4,40,0,medium,No,Large,Yes,Yes,No,No,Both,Yes,Private,No,Both,No,Platform,Local,Non-Linear,No,Few,No,No,No,Yes,No,Yes,Yes,No,No,No,Online,B2B,Medium,High,Yes,Medium,Masters,21,Yes,Yes,Tier_1,500,Medium,0,0,0,High,5,Few,4.5,Medium,Medium,Yes,No,Low,No,No,Yes,No,No,No,1015027,Yes,Yes,No,3.0,Medium,0,10.0,6.7,Not Applicable,20,6.4,0.0,0,100.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0,0.0,2
4,2011,306.0,Analytics|Mobile|Marketing,Marketing,Promus Ventures|SoftTech VC|Costanoa Venture C...,16.0,No,United States,North America,13,0,2,0,2,50,0,medium,Yes,Medium,Yes,Yes,No,No,Product,Yes,Public,Yes,Both,No,Platform,Local,Non-Linear,No,Few,Yes,No,No,No,No,No,Yes,No,Yes,No,Online,B2B,Medium,High,Yes,High,Masters,21,Yes,Yes,Tier_1,500,High,0,0,0,Medium,13,,48.0,Low,Medium,No,No,Low,No,Yes,No,No,No,No,256921,Yes,Yes,No,3.0,Medium,0,16.66666667,11.0,Not Applicable,18,12.0,8.333333333,0,46.73202614,5.718954248,8.333333333,0.0,19.77124183,2.777777778,2.777777778,0.0,0,0,0,5.555555556,5


### Converting list values into binary variables...

Let's now convert the following values into lists, so that we can then make them binary variables.

 - *'Industry of company'*
 - *'Investors'*
 - *'Focus functions of company'*

In [634]:
features['Industry of company'] = features['Industry of company'].apply(lambda x: x.lower().strip().split('|'))
features['Investors'] = features['Investors'].apply(lambda x: x.lower().strip().split('|'))

def process_str(x) :
    str_list = re.split(',|&|\n|\|', x.lower())
    str_list = list(map(str.strip, str_list))
    return str_list

features['Focus functions of company'] = features['Focus functions of company'].apply(lambda x: process_str(x))

features.head()

Unnamed: 0,year of founding,Internet Activity Score,Industry of company,Focus functions of company,Investors,Employee Count,Has the team size grown,Country of company,Continent of company,Number of Investors in Seed,Number of Investors in Angel and or VC,Number of Co-founders,Number of of advisors,Team size Senior leadership,Team size all employees,Number of of repeat investors,Number of Sales Support material,Worked in top companies,Average size of companies worked for in the past,Have been part of startups in the past?,Have been part of successful startups in the past?,Was he or she partner in Big 5 consulting?,Consulting experience?,Product or service company?,Catering to product/service across verticals,Focus on private or public data?,Focus on consumer data?,Focus on structured or unstructured data,Subscription based business,Cloud or platform based serive/product?,Local or global player,Linear or Non-linear business model,"Capital intensive business e.g. e-commerce, Engineering products and operations can also cause a business to be capital intensive",Number of of Partners of company,Crowdsourcing based business,Crowdfunding based business,Machine Learning based business,Predictive Analytics business,Speech analytics business,Prescriptive analytics business,Big Data Business,Cross-Channel Analytics/ marketing channels,Owns data or not? (monetization of data) e.g. Factual,Is the company an aggregator/market place? e.g. Bluekai,Online or offline venture - physical location based business or online venture?,B2C or B2B venture?,Top forums like 'Tech crunch' or 'Venture beat' talking about the company/model - How much is it being talked about?,Average Years of experience for founder and co founder,Exposure across the globe,Breadth of experience across verticals,Highest education,Years of education,Relevance of education to venture,Relevance of experience to venture,Degree from a Tier 1 or Tier 2 university?,Renowned in professional circle,Experience in selling and building products,Experience in Fortune 100 organizations,Experience in Fortune 500 organizations,Experience in Fortune 1000 organizations,Top management similarity,Number of Recognitions for Founders and Co-founders,Number of of Research publications,Skills score,Team Composition score,Dificulty of Obtaining Work force,Pricing Strategy,Hyper localisation,Time to market service or product,Long term relationship with other founders,Proprietary or patent position (competitive position),Barriers of entry for the competitors,Company awards,Controversial history of founder or co founder,Legal risk and intellectual property,google page rank of company website,Technical proficiencies to analyse and interpret unstructured data,Solutions offered,Invested through global incubation competitions?,Industry trend in investing,Disruptiveness of technology,Number of Direct competitors,Employees per year of company existence,Last round of funding received (in milionUSD),"Survival through recession, based on existence of the company through recession times",Time to 1st investment (in months),"Avg time to investment - average across all rounds, measured from previous investment",Percent_skill_Entrepreneurship,Percent_skill_Operations,Percent_skill_Engineering,Percent_skill_Marketing,Percent_skill_Leadership,Percent_skill_Data Science,Percent_skill_Business Strategy,Percent_skill_Product Management,Percent_skill_Sales,Percent_skill_Domain,Percent_skill_Law,Percent_skill_Consulting,Percent_skill_Finance,Percent_skill_Investment,Renown score
0,2011,455.0,"[analytics, cloud computing, software developm...",[operations],"[techstars, streamlined ventures, amplify part...",14.0,No,United States,North America,15,0,3,0,7,10,0,low,Yes,Medium,No,No,No,No,Both,Yes,Private,Yes,Both,Yes,cloud,Local,Non-Linear,No,Few,No,No,No,Yes,No,No,Yes,No,No,No,Online,B2B,Low,Medium,Yes,Low,Bachelors,18,Yes,Yes,Tier_2,500,High,0,0,1,Medium,18,,36.0,High,Medium,Yes,No,Low,Yes,Yes,Yes,No,No,No,71391,Yes,Yes,Yes,3.0,Medium,0,3.333333333,2.35,Not Applicable,2,7.344444444,9.401709402,0,57.47863248,0.0,0.0,3.846153846,17.09401709,9.401709402,0.0,2.777777778,0,0,0,0.0,9
1,2010,496.0,"[analytics, marketing, enterprise software]","[marketing, sales]","[dfj frontier, draper nexus ventures, gil elba...",39.0,No,United States,North America,7,0,1,1,8,40,0,high,No,Small,No,No,No,No,Product,Yes,Public,Yes,Both,No,Platform,Local,Non-Linear,Yes,Few,No,No,No,No,No,No,Yes,No,No,No,Online,B2B,Low,High,Yes,Medium,Bachelors,18,Yes,Yes,,500,High,0,0,0,Low,5,Few,23.0,Medium,Medium,Yes,No,Low,Yes,Yes,Yes,No,No,No,201814,Yes,Yes,No,3.0,Medium,0,10.0,5.5,Not Applicable,13,9.822222222,0.0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0,0.0,6
2,2010,106.0,"[food & beverages, hospitality]",[analytics],"[pritzker group venture capital, excelerate la...",14.0,No,United States,North America,2,0,4,0,4,14,2,medium,No,Medium,No,No,No,No,Service,No,Public,Yes,Both,No,Platform,Local,Non-Linear,No,Few,No,No,No,No,No,No,Yes,No,Yes,No,Online,B2B,Low,High,Yes,Medium,Masters,21,No,No,,500,Low,0,0,0,Low,5,,25.0,Medium,Medium,Yes,No,Low,No,Yes,No,Yes,No,No,591816,Yes,Yes,Yes,4.0,High,0,3.5,1.0,Not Applicable,12,9.322222222,6.25,0,3.125,15.625,9.375,3.125,6.25,3.125,3.125,0.0,0,0,0,0.0,6
3,2010,139.0,"[cloud computing, network / hosting / infrastr...",[computing],"[norwest venture partners, bessemer venture pa...",29.0,No,United States,North America,0,0,3,4,4,40,0,medium,No,Large,Yes,Yes,No,No,Both,Yes,Private,No,Both,No,Platform,Local,Non-Linear,No,Few,No,No,No,Yes,No,Yes,Yes,No,No,No,Online,B2B,Medium,High,Yes,Medium,Masters,21,Yes,Yes,Tier_1,500,Medium,0,0,0,High,5,Few,4.5,Medium,Medium,Yes,No,Low,No,No,Yes,No,No,No,1015027,Yes,Yes,No,3.0,Medium,0,10.0,6.7,Not Applicable,20,6.4,0.0,0,100.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0,0.0,2
4,2011,306.0,"[analytics, mobile, marketing]",[marketing],"[promus ventures, softtech vc, costanoa ventur...",16.0,No,United States,North America,13,0,2,0,2,50,0,medium,Yes,Medium,Yes,Yes,No,No,Product,Yes,Public,Yes,Both,No,Platform,Local,Non-Linear,No,Few,Yes,No,No,No,No,No,Yes,No,Yes,No,Online,B2B,Medium,High,Yes,High,Masters,21,Yes,Yes,Tier_1,500,High,0,0,0,Medium,13,,48.0,Low,Medium,No,No,Low,No,Yes,No,No,No,No,256921,Yes,Yes,No,3.0,Medium,0,16.66666667,11.0,Not Applicable,18,12.0,8.333333333,0,46.73202614,5.718954248,8.333333333,0.0,19.77124183,2.777777778,2.777777778,0.0,0,0,0,5.555555556,5


In [635]:
# For every row in the dataframe, iterate through the list of industries 
# and place a 1 into the corresponding column.
for index, row in features.iterrows():
    for industry in row['Industry of company']:
        features.at[index, industry] = 1
        
# Filling in the NaN values with 0 to show that a company doesn't belong to that column's industry.
features = features.fillna(0)

# Same for investors.
for index, row in features.iterrows():
    for investor in row['Investors']:
        features.at[index, investor] = 1

features = features.fillna(0)

# And for focus functions of company.
for index, row in features.iterrows():
    for focus in row['Focus functions of company']:
        features.at[index, focus] = 1
        
features = features.fillna(0)
        
features.head()

Unnamed: 0,year of founding,Internet Activity Score,Industry of company,Focus functions of company,Investors,Employee Count,Has the team size grown,Country of company,Continent of company,Number of Investors in Seed,Number of Investors in Angel and or VC,Number of Co-founders,Number of of advisors,Team size Senior leadership,Team size all employees,Number of of repeat investors,Number of Sales Support material,Worked in top companies,Average size of companies worked for in the past,Have been part of startups in the past?,Have been part of successful startups in the past?,Was he or she partner in Big 5 consulting?,Consulting experience?,Product or service company?,Catering to product/service across verticals,Focus on private or public data?,Focus on consumer data?,Focus on structured or unstructured data,Subscription based business,Cloud or platform based serive/product?,Local or global player,Linear or Non-linear business model,"Capital intensive business e.g. e-commerce, Engineering products and operations can also cause a business to be capital intensive",Number of of Partners of company,Crowdsourcing based business,Crowdfunding based business,Machine Learning based business,Predictive Analytics business,Speech analytics business,Prescriptive analytics business,Big Data Business,Cross-Channel Analytics/ marketing channels,Owns data or not? (monetization of data) e.g. Factual,Is the company an aggregator/market place? e.g. Bluekai,Online or offline venture - physical location based business or online venture?,B2C or B2B venture?,Top forums like 'Tech crunch' or 'Venture beat' talking about the company/model - How much is it being talked about?,Average Years of experience for founder and co founder,Exposure across the globe,Breadth of experience across verticals,Highest education,Years of education,Relevance of education to venture,Relevance of experience to venture,Degree from a Tier 1 or Tier 2 university?,Renowned in professional circle,Experience in selling and building products,Experience in Fortune 100 organizations,Experience in Fortune 500 organizations,Experience in Fortune 1000 organizations,Top management similarity,Number of Recognitions for Founders and Co-founders,Number of of Research publications,Skills score,Team Composition score,Dificulty of Obtaining Work force,Pricing Strategy,Hyper localisation,Time to market service or product,Long term relationship with other founders,Proprietary or patent position (competitive position),Barriers of entry for the competitors,Company awards,Controversial history of founder or co founder,Legal risk and intellectual property,google page rank of company website,Technical proficiencies to analyse and interpret unstructured data,Solutions offered,Invested through global incubation competitions?,Industry trend in investing,Disruptiveness of technology,Number of Direct competitors,Employees per year of company existence,Last round of funding received (in milionUSD),"Survival through recession, based on existence of the company through recession times",Time to 1st investment (in months),"Avg time to investment - average across all rounds, measured from previous investment",Percent_skill_Entrepreneurship,Percent_skill_Operations,Percent_skill_Engineering,Percent_skill_Marketing,Percent_skill_Leadership,Percent_skill_Data Science,Percent_skill_Business Strategy,Percent_skill_Product Management,Percent_skill_Sales,Percent_skill_Domain,Percent_skill_Law,Percent_skill_Consulting,Percent_skill_Finance,Percent_skill_Investment,Renown score,analytics,cloud computing,software development,marketing,enterprise software,food & beverages,hospitality,network / hosting / infrastructure,mobile,healthcare,pharmaceuticals,media,finance,e-commerce,gaming,email,security,publishing,education,advertising,entertainment,transportation,retail,social networking,search,insurance,cleantech,energy,market research,deals,telecommunications,music,techstars,streamlined ventures,amplify partners,rincon venture partners,pelion venture partners,500 startups,loren siebert,jason seats,xg ventures,george karidis,sam choi,morris wheeler,data collective,pejman nozad,ullas naik,dirk elmendorf,galvanize,pat matthews,paul kedrosky,matt ocko,cloud power capital,jared kopf,anne johnson,issac roth,george karutz,jim deters,zachary aarons,zack bogue,dfj frontier,draper nexus ventures,gil elbaz,auren hoffman,walter kortschak,mi ventures,brand ventures,daher capital,double m partners,gold hill capital,clark landry,draper associates,mi ventures llc,signia venture partners,pritzker group venture capital,excelerate labs,hyde park venture partners,chicago ventures,amicus capital,ideo,olive ventures,kd capital l.l.c,norwest venture partners,bessemer venture partners,atlas venture,promus ventures,softtech vc,costanoa venture capital,lee linden,chamath palihapitiya,raj de datta,tim kendall,omar siddiqui,david vivero,sequoia capital,khosla ventures,redpoint ventures,first round capital,pivotnorth capital,battery ventures,in-q-tel,jump capital,sap ventures,northwestern university,harvard business school angels,tech coast angels,hummer winblad venture partners,us venture partners,presidio ventures,mohr davidow ventures,grotech ventures,access venture partners,ffp holdings,matrix partners,allan zeise,thomas shannon,allen zeise,wisconsin investment partners,silicon pastures,thomas vincent,ross bjella,david lisle,ed karrels,jeff harris,tom demell,thomas demell,jeffery harris,michael kluiber,peter skanaivs,andy wojack,crista wojack,eric kessenich,john schmidt,angels on the water,ea media syndicate i llc,glen surnamer,jeff schweiger,defense advanced research projects agency,sv angel,harrison metal capital,baseline ventures,greylock partners,dick costolo,reid hoffman,jeff jordan,harrison metal,bain capital ventures,square 1 bank,high line venture partners,allen debevoise,matt coffin,paul bricault,jonah goodhart,dean gilbert,tony nethercutt,firstmark capital,ben ling,lerer ventures,cit gap funds,jaffray woodriff,hyde park angels,i2a fund,social leverage,morningstar,reed elsevier ventures,active up,guess,audilion,jean-luc halleux,dany donnen,startupinvest,launchub,launch hub,Unnamed: 276,boulas ventures,seed4soft,seventure partners,elaia partners,isai,capital spreads,rockaway capital,stephen bullock,chris underhill,scottish equity partners,seedcamp,innovation warehouse,stefan glaenzer,sherry coutu,ab banerjee,shamil chandaria,anish chandaria,bill emmott,ditlev schwanenflugel,ken olisa,azeem azhar,anthemis group,meridian venture partners,tom glocer,sanford dickert,alastair mitchell,andy mcloughlin,tim jackson,phil wilkinson,guy westlake,sean cornwell,ned cranborne,shan drummond,innova kapital,cantabria capital,voyager capital,vegastechfund,lars-henrik friis molin,amicus group,kolind a/s,appian ventures,core capital partners,qed investors,tech wildcatters,start-up chile,aleph,capital innovators,cultivation capital,eric kwan,google ventures,kenny van zant,franklyn chien,ryan merket,bobby goodlatte,yishan wong,brian mcclendon,charles river ventures,new enterprise associates,y combinator,gabor cselle,george zachary,jim young,alison rosenthal,jerry yang,paul buchheit,tie angels,hub angels investment group,jit saxena,rob soni,john simon,gaugarin oliver,prakash khot,launchcapital,chris devore,tom peterson,aaron bird,founder's co-op,howard lindzon,friends and family,jeremie berrebi,sutter hill ventures,crunchfund,rtp ventures,almaz capital,ru-net holdings,nick rau,bob jacobson,jimmy barge,jor law,lux capital,fidelity biosciences,venrock,highland capital partners,gerson lehrman group,jonathan bush,ed park,john goldsmith,james golden,ia ventures,digital sky technologies,raymond tonsing,accel partners,start fund,sam altman,ashton kutcher,guy oseary,ron garret,karl jacob,marco bergmann,mark larosa,tom mcinerney,janis krums,cross atlantic capital partners,edison ventures,rikki tahta,simon murdoch,roland beaulieu,gary mueller,spring lake equity partners,epic ventures,kevin o'connor,omidyar network,marc andreessen,ron conway,al avery,floodgate,svb financial group,kevin rose,floodgate fund,resolute.vc,founder collective,thrive capital,alexis ohanian,bubba murarka,reinmkr capital,shasta ventures,felicis ventures,jeffrey p. parker,tom falus,matt dwyer,sam wohlstadter,high peaks venture partners,kec ventures,softbank capital,jeff parker,fintech collective,intel capital,north bridge venture partners,vision ridge partners,green tree equity,jove equity partners,david cohen,brett jackson,bart lorang,walt winshall,valero capital,five mill ventures,westly group,the menlo park,javelin venture partners,alireza masrour,plug & play ventures,united parcel service,osage university partners,scott mcnealy,andreessen horowitz,point nine capital,sand hill angels,dingman center angels,interwest partners,tenoneten ventures,foundry group,upfront ventures,kbs+ ventures,neu venture capital,apricot capital,globespan capital partners,menlo ventures,flybridge capital partners,commonwealth capital ventures,national science foundation,michael chaney,frederick farrar,t. mills kelly,wayne buder,gregory t stern,kenneth gilbert,gregory t.stern,peter j. toren,marco giberti,paolo rubatto,greg stern,peter toren,david neithercutt,investment group of santa barbara,madrona venture group,mhs capital,rob glaser,kleiner perkins caufield & byers,dag ventures,austin ventures,contour venture partners,allegro venture partners,mack capital,brett hurt,sam decker,adam ross,tom meredith,dean drako,vulcan capital,geoff entress,bloomberg beta,ignition partners,next world capital,software ag,workday,citi ventures,august capital,morado venture partners,ame cloud ventures,sopris partners,voodoo ventures,safeguard scientifics,charles f. dolan,golden seeds,springboard enterprises,miramar venture partners,ata ventures,foundation capital,stanford university,e.on,zig capital,jcb investments,moore venture partners,la jolla holding,tomorrowventures,venture capitals,mehdi daoudi,boldstart ventures,oak investment partners,diamondhead ventures,ampersand capital partners,qualcomm ventures,american express ventures,maine technology institute,libra future fund,i2e,seedstep angels group,divergent ventures,john ives,lew moorman,john engates,revel partners,commonangels,boston seed capital,dharmesh shah,waikit lau,nick ducoff,jacob perkins,dan casey,crosslink capital,kapor capital,giza venture capital,endeavor partners,naval ravikant,mark goines,josh james,ecosystem ventures,paige craig,correlation ventures,quest venture partners,sigma prime ventures,kepha partners,sigma partners,deutsche telekom,index ventures,t-venture,gerard govaerts,fortino,vendep oy,eden ventures,pentech ventures,oxford technology management,imperial innovations,bootstrap incubation,startup bootcamp,startupbootcamp,dan somers,technology strategy board,christoph janz,alexander bruehl,dn capital,slow ventures,draper fisher jurvetson (dfj),mayfield fund,draper richards,mangrove capital partners,lehman brothers,sierra ventures,team europe,fabrice grinda,jose marin,james gutierrez,embarcadero ventures,inovia capital,greycroft partners,brian s. cohen,phil grieshaber,john taysom,justin siegel,jeffrey silverman,jerry newman,bruno bowden,jeff hammerbacher,michael abbott,andrew mccollum,ed roberts,jean hammond,quotidian ventures,general catalyst partners,lowercase capital,lightbank,babak nivi,steamboat ventures,the mail room fund,launchpad la,at&t,stage one capital,mercury fund,ff venture capital,dundee venture capital,mfi capital,jim pallotta,josh mailman,radcliff group,hadi partovi,ali partovi,e.ventures,satya patel,redpoint eventures,split rock partners,ggv capital,canopy ventures,allegis capital,trident capital,canopy group,dry canyon holdings,cross creek capital,indous venture partners,triplepoint capital,jim clark,mike ramsay,gabriell weinberg,david cancel,joshua schachter,roy rodenstein,project 11 ventures,meakem becker venture capital,sunstone capital,operations,sales,computing,research,database management,technology,analytic,software,social media management,customer servce,customer service,app revenue,data visualization,service,operation,accounting,training,risk,data collection,social news,consumer web,data management,strategy,tool,inventory management,energy saving,optimization,crm,pricing,customer targeting,search enginenoptimization,customer engagement,social media analytics,content marketing,presentations,social media,dashboards,localized behaviour,wireless,sale,social network,music intelligece,network optimization
0,2011,455.0,"[analytics, cloud computing, software developm...",[operations],"[techstars, streamlined ventures, amplify part...",14.0,No,United States,North America,15,0,3,0,7,10,0,low,Yes,Medium,No,No,No,No,Both,Yes,Private,Yes,Both,Yes,cloud,Local,Non-Linear,No,Few,No,No,No,Yes,No,No,Yes,No,No,No,Online,B2B,Low,Medium,Yes,Low,Bachelors,18,Yes,Yes,Tier_2,500,High,0,0,1,Medium,18,,36.0,High,Medium,Yes,No,Low,Yes,Yes,Yes,No,No,No,71391,Yes,Yes,Yes,3.0,Medium,0,3.333333333,2.35,Not Applicable,2,7.344444444,9.401709402,0,57.47863248,0.0,0.0,3.846153846,17.09401709,9.401709402,0.0,2.777777778,0,0,0,0.0,9,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,2010,496.0,"[analytics, marketing, enterprise software]","[marketing, sales]","[dfj frontier, draper nexus ventures, gil elba...",39.0,No,United States,North America,7,0,1,1,8,40,0,high,No,Small,No,No,No,No,Product,Yes,Public,Yes,Both,No,Platform,Local,Non-Linear,Yes,Few,No,No,No,No,No,No,Yes,No,No,No,Online,B2B,Low,High,Yes,Medium,Bachelors,18,Yes,Yes,,500,High,0,0,0,Low,5,Few,23.0,Medium,Medium,Yes,No,Low,Yes,Yes,Yes,No,No,No,201814,Yes,Yes,No,3.0,Medium,0,10.0,5.5,Not Applicable,13,9.822222222,0.0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0,0.0,6,1.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,2010,106.0,"[food & beverages, hospitality]",[analytics],"[pritzker group venture capital, excelerate la...",14.0,No,United States,North America,2,0,4,0,4,14,2,medium,No,Medium,No,No,No,No,Service,No,Public,Yes,Both,No,Platform,Local,Non-Linear,No,Few,No,No,No,No,No,No,Yes,No,Yes,No,Online,B2B,Low,High,Yes,Medium,Masters,21,No,No,,500,Low,0,0,0,Low,5,,25.0,Medium,Medium,Yes,No,Low,No,Yes,No,Yes,No,No,591816,Yes,Yes,Yes,4.0,High,0,3.5,1.0,Not Applicable,12,9.322222222,6.25,0,3.125,15.625,9.375,3.125,6.25,3.125,3.125,0.0,0,0,0,0.0,6,1.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,2010,139.0,"[cloud computing, network / hosting / infrastr...",[computing],"[norwest venture partners, bessemer venture pa...",29.0,No,United States,North America,0,0,3,4,4,40,0,medium,No,Large,Yes,Yes,No,No,Both,Yes,Private,No,Both,No,Platform,Local,Non-Linear,No,Few,No,No,No,Yes,No,Yes,Yes,No,No,No,Online,B2B,Medium,High,Yes,Medium,Masters,21,Yes,Yes,Tier_1,500,Medium,0,0,0,High,5,Few,4.5,Medium,Medium,Yes,No,Low,No,No,Yes,No,No,No,1015027,Yes,Yes,No,3.0,Medium,0,10.0,6.7,Not Applicable,20,6.4,0.0,0,100.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0,0.0,2,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,2011,306.0,"[analytics, mobile, marketing]",[marketing],"[promus ventures, softtech vc, costanoa ventur...",16.0,No,United States,North America,13,0,2,0,2,50,0,medium,Yes,Medium,Yes,Yes,No,No,Product,Yes,Public,Yes,Both,No,Platform,Local,Non-Linear,No,Few,Yes,No,No,No,No,No,Yes,No,Yes,No,Online,B2B,Medium,High,Yes,High,Masters,21,Yes,Yes,Tier_1,500,High,0,0,0,Medium,13,,48.0,Low,Medium,No,No,Low,No,Yes,No,No,No,No,256921,Yes,Yes,No,3.0,Medium,0,16.66666667,11.0,Not Applicable,18,12.0,8.333333333,0,46.73202614,5.718954248,8.333333333,0.0,19.77124183,2.777777778,2.777777778,0.0,0,0,0,5.555555556,5,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


And drop the original columns with lists.

In [636]:
features = features.drop('Industry of company', 1).drop('Investors', 1).drop('Focus functions of company', 1)
features.head()

Unnamed: 0,year of founding,Internet Activity Score,Employee Count,Has the team size grown,Country of company,Continent of company,Number of Investors in Seed,Number of Investors in Angel and or VC,Number of Co-founders,Number of of advisors,Team size Senior leadership,Team size all employees,Number of of repeat investors,Number of Sales Support material,Worked in top companies,Average size of companies worked for in the past,Have been part of startups in the past?,Have been part of successful startups in the past?,Was he or she partner in Big 5 consulting?,Consulting experience?,Product or service company?,Catering to product/service across verticals,Focus on private or public data?,Focus on consumer data?,Focus on structured or unstructured data,Subscription based business,Cloud or platform based serive/product?,Local or global player,Linear or Non-linear business model,"Capital intensive business e.g. e-commerce, Engineering products and operations can also cause a business to be capital intensive",Number of of Partners of company,Crowdsourcing based business,Crowdfunding based business,Machine Learning based business,Predictive Analytics business,Speech analytics business,Prescriptive analytics business,Big Data Business,Cross-Channel Analytics/ marketing channels,Owns data or not? (monetization of data) e.g. Factual,Is the company an aggregator/market place? e.g. Bluekai,Online or offline venture - physical location based business or online venture?,B2C or B2B venture?,Top forums like 'Tech crunch' or 'Venture beat' talking about the company/model - How much is it being talked about?,Average Years of experience for founder and co founder,Exposure across the globe,Breadth of experience across verticals,Highest education,Years of education,Relevance of education to venture,Relevance of experience to venture,Degree from a Tier 1 or Tier 2 university?,Renowned in professional circle,Experience in selling and building products,Experience in Fortune 100 organizations,Experience in Fortune 500 organizations,Experience in Fortune 1000 organizations,Top management similarity,Number of Recognitions for Founders and Co-founders,Number of of Research publications,Skills score,Team Composition score,Dificulty of Obtaining Work force,Pricing Strategy,Hyper localisation,Time to market service or product,Long term relationship with other founders,Proprietary or patent position (competitive position),Barriers of entry for the competitors,Company awards,Controversial history of founder or co founder,Legal risk and intellectual property,google page rank of company website,Technical proficiencies to analyse and interpret unstructured data,Solutions offered,Invested through global incubation competitions?,Industry trend in investing,Disruptiveness of technology,Number of Direct competitors,Employees per year of company existence,Last round of funding received (in milionUSD),"Survival through recession, based on existence of the company through recession times",Time to 1st investment (in months),"Avg time to investment - average across all rounds, measured from previous investment",Percent_skill_Entrepreneurship,Percent_skill_Operations,Percent_skill_Engineering,Percent_skill_Marketing,Percent_skill_Leadership,Percent_skill_Data Science,Percent_skill_Business Strategy,Percent_skill_Product Management,Percent_skill_Sales,Percent_skill_Domain,Percent_skill_Law,Percent_skill_Consulting,Percent_skill_Finance,Percent_skill_Investment,Renown score,analytics,cloud computing,software development,marketing,enterprise software,food & beverages,hospitality,network / hosting / infrastructure,mobile,healthcare,pharmaceuticals,media,finance,e-commerce,gaming,email,security,publishing,education,advertising,entertainment,transportation,retail,social networking,search,insurance,cleantech,energy,market research,deals,telecommunications,music,techstars,streamlined ventures,amplify partners,rincon venture partners,pelion venture partners,500 startups,loren siebert,jason seats,xg ventures,george karidis,sam choi,morris wheeler,data collective,pejman nozad,ullas naik,dirk elmendorf,galvanize,pat matthews,paul kedrosky,matt ocko,cloud power capital,jared kopf,anne johnson,issac roth,george karutz,jim deters,zachary aarons,zack bogue,dfj frontier,draper nexus ventures,gil elbaz,auren hoffman,walter kortschak,mi ventures,brand ventures,daher capital,double m partners,gold hill capital,clark landry,draper associates,mi ventures llc,signia venture partners,pritzker group venture capital,excelerate labs,hyde park venture partners,chicago ventures,amicus capital,ideo,olive ventures,kd capital l.l.c,norwest venture partners,bessemer venture partners,atlas venture,promus ventures,softtech vc,costanoa venture capital,lee linden,chamath palihapitiya,raj de datta,tim kendall,omar siddiqui,david vivero,sequoia capital,khosla ventures,redpoint ventures,first round capital,pivotnorth capital,battery ventures,in-q-tel,jump capital,sap ventures,northwestern university,harvard business school angels,tech coast angels,hummer winblad venture partners,us venture partners,presidio ventures,mohr davidow ventures,grotech ventures,access venture partners,ffp holdings,matrix partners,allan zeise,thomas shannon,allen zeise,wisconsin investment partners,silicon pastures,thomas vincent,ross bjella,david lisle,ed karrels,jeff harris,tom demell,thomas demell,jeffery harris,michael kluiber,peter skanaivs,andy wojack,crista wojack,eric kessenich,john schmidt,angels on the water,ea media syndicate i llc,glen surnamer,jeff schweiger,defense advanced research projects agency,sv angel,harrison metal capital,baseline ventures,greylock partners,dick costolo,reid hoffman,jeff jordan,harrison metal,bain capital ventures,square 1 bank,high line venture partners,allen debevoise,matt coffin,paul bricault,jonah goodhart,dean gilbert,tony nethercutt,firstmark capital,ben ling,lerer ventures,cit gap funds,jaffray woodriff,hyde park angels,i2a fund,social leverage,morningstar,reed elsevier ventures,active up,guess,audilion,jean-luc halleux,dany donnen,startupinvest,launchub,launch hub,Unnamed: 273,boulas ventures,seed4soft,seventure partners,elaia partners,isai,capital spreads,rockaway capital,stephen bullock,chris underhill,scottish equity partners,seedcamp,innovation warehouse,stefan glaenzer,sherry coutu,ab banerjee,shamil chandaria,anish chandaria,bill emmott,ditlev schwanenflugel,ken olisa,azeem azhar,anthemis group,meridian venture partners,tom glocer,sanford dickert,alastair mitchell,andy mcloughlin,tim jackson,phil wilkinson,guy westlake,sean cornwell,ned cranborne,shan drummond,innova kapital,cantabria capital,voyager capital,vegastechfund,lars-henrik friis molin,amicus group,kolind a/s,appian ventures,core capital partners,qed investors,tech wildcatters,start-up chile,aleph,capital innovators,cultivation capital,eric kwan,google ventures,kenny van zant,franklyn chien,ryan merket,bobby goodlatte,yishan wong,brian mcclendon,charles river ventures,new enterprise associates,y combinator,gabor cselle,george zachary,jim young,alison rosenthal,jerry yang,paul buchheit,tie angels,hub angels investment group,jit saxena,rob soni,john simon,gaugarin oliver,prakash khot,launchcapital,chris devore,tom peterson,aaron bird,founder's co-op,howard lindzon,friends and family,jeremie berrebi,sutter hill ventures,crunchfund,rtp ventures,almaz capital,ru-net holdings,nick rau,bob jacobson,jimmy barge,jor law,lux capital,fidelity biosciences,venrock,highland capital partners,gerson lehrman group,jonathan bush,ed park,john goldsmith,james golden,ia ventures,digital sky technologies,raymond tonsing,accel partners,start fund,sam altman,ashton kutcher,guy oseary,ron garret,karl jacob,marco bergmann,mark larosa,tom mcinerney,janis krums,cross atlantic capital partners,edison ventures,rikki tahta,simon murdoch,roland beaulieu,gary mueller,spring lake equity partners,epic ventures,kevin o'connor,omidyar network,marc andreessen,ron conway,al avery,floodgate,svb financial group,kevin rose,floodgate fund,resolute.vc,founder collective,thrive capital,alexis ohanian,bubba murarka,reinmkr capital,shasta ventures,felicis ventures,jeffrey p. parker,tom falus,matt dwyer,sam wohlstadter,high peaks venture partners,kec ventures,softbank capital,jeff parker,fintech collective,intel capital,north bridge venture partners,vision ridge partners,green tree equity,jove equity partners,david cohen,brett jackson,bart lorang,walt winshall,valero capital,five mill ventures,westly group,the menlo park,javelin venture partners,alireza masrour,plug & play ventures,united parcel service,osage university partners,scott mcnealy,andreessen horowitz,point nine capital,sand hill angels,dingman center angels,interwest partners,tenoneten ventures,foundry group,upfront ventures,kbs+ ventures,neu venture capital,apricot capital,globespan capital partners,menlo ventures,flybridge capital partners,commonwealth capital ventures,national science foundation,michael chaney,frederick farrar,t. mills kelly,wayne buder,gregory t stern,kenneth gilbert,gregory t.stern,peter j. toren,marco giberti,paolo rubatto,greg stern,peter toren,david neithercutt,investment group of santa barbara,madrona venture group,mhs capital,rob glaser,kleiner perkins caufield & byers,dag ventures,austin ventures,contour venture partners,allegro venture partners,mack capital,brett hurt,sam decker,adam ross,tom meredith,dean drako,vulcan capital,geoff entress,bloomberg beta,ignition partners,next world capital,software ag,workday,citi ventures,august capital,morado venture partners,ame cloud ventures,sopris partners,voodoo ventures,safeguard scientifics,charles f. dolan,golden seeds,springboard enterprises,miramar venture partners,ata ventures,foundation capital,stanford university,e.on,zig capital,jcb investments,moore venture partners,la jolla holding,tomorrowventures,venture capitals,mehdi daoudi,boldstart ventures,oak investment partners,diamondhead ventures,ampersand capital partners,qualcomm ventures,american express ventures,maine technology institute,libra future fund,i2e,seedstep angels group,divergent ventures,john ives,lew moorman,john engates,revel partners,commonangels,boston seed capital,dharmesh shah,waikit lau,nick ducoff,jacob perkins,dan casey,crosslink capital,kapor capital,giza venture capital,endeavor partners,naval ravikant,mark goines,josh james,ecosystem ventures,paige craig,correlation ventures,quest venture partners,sigma prime ventures,kepha partners,sigma partners,deutsche telekom,index ventures,t-venture,gerard govaerts,fortino,vendep oy,eden ventures,pentech ventures,oxford technology management,imperial innovations,bootstrap incubation,startup bootcamp,startupbootcamp,dan somers,technology strategy board,christoph janz,alexander bruehl,dn capital,slow ventures,draper fisher jurvetson (dfj),mayfield fund,draper richards,mangrove capital partners,lehman brothers,sierra ventures,team europe,fabrice grinda,jose marin,james gutierrez,embarcadero ventures,inovia capital,greycroft partners,brian s. cohen,phil grieshaber,john taysom,justin siegel,jeffrey silverman,jerry newman,bruno bowden,jeff hammerbacher,michael abbott,andrew mccollum,ed roberts,jean hammond,quotidian ventures,general catalyst partners,lowercase capital,lightbank,babak nivi,steamboat ventures,the mail room fund,launchpad la,at&t,stage one capital,mercury fund,ff venture capital,dundee venture capital,mfi capital,jim pallotta,josh mailman,radcliff group,hadi partovi,ali partovi,e.ventures,satya patel,redpoint eventures,split rock partners,ggv capital,canopy ventures,allegis capital,trident capital,canopy group,dry canyon holdings,cross creek capital,indous venture partners,triplepoint capital,jim clark,mike ramsay,gabriell weinberg,david cancel,joshua schachter,roy rodenstein,project 11 ventures,meakem becker venture capital,sunstone capital,operations,sales,computing,research,database management,technology,analytic,software,social media management,customer servce,customer service,app revenue,data visualization,service,operation,accounting,training,risk,data collection,social news,consumer web,data management,strategy,tool,inventory management,energy saving,optimization,crm,pricing,customer targeting,search enginenoptimization,customer engagement,social media analytics,content marketing,presentations,social media,dashboards,localized behaviour,wireless,sale,social network,music intelligece,network optimization
0,2011,455.0,14.0,No,United States,North America,15,0,3,0,7,10,0,low,Yes,Medium,No,No,No,No,Both,Yes,Private,Yes,Both,Yes,cloud,Local,Non-Linear,No,Few,No,No,No,Yes,No,No,Yes,No,No,No,Online,B2B,Low,Medium,Yes,Low,Bachelors,18,Yes,Yes,Tier_2,500,High,0,0,1,Medium,18,,36.0,High,Medium,Yes,No,Low,Yes,Yes,Yes,No,No,No,71391,Yes,Yes,Yes,3.0,Medium,0,3.333333333,2.35,Not Applicable,2,7.344444444,9.401709402,0,57.47863248,0.0,0.0,3.846153846,17.09401709,9.401709402,0.0,2.777777778,0,0,0,0.0,9,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,2010,496.0,39.0,No,United States,North America,7,0,1,1,8,40,0,high,No,Small,No,No,No,No,Product,Yes,Public,Yes,Both,No,Platform,Local,Non-Linear,Yes,Few,No,No,No,No,No,No,Yes,No,No,No,Online,B2B,Low,High,Yes,Medium,Bachelors,18,Yes,Yes,,500,High,0,0,0,Low,5,Few,23.0,Medium,Medium,Yes,No,Low,Yes,Yes,Yes,No,No,No,201814,Yes,Yes,No,3.0,Medium,0,10.0,5.5,Not Applicable,13,9.822222222,0.0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0,0.0,6,1.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,2010,106.0,14.0,No,United States,North America,2,0,4,0,4,14,2,medium,No,Medium,No,No,No,No,Service,No,Public,Yes,Both,No,Platform,Local,Non-Linear,No,Few,No,No,No,No,No,No,Yes,No,Yes,No,Online,B2B,Low,High,Yes,Medium,Masters,21,No,No,,500,Low,0,0,0,Low,5,,25.0,Medium,Medium,Yes,No,Low,No,Yes,No,Yes,No,No,591816,Yes,Yes,Yes,4.0,High,0,3.5,1.0,Not Applicable,12,9.322222222,6.25,0,3.125,15.625,9.375,3.125,6.25,3.125,3.125,0.0,0,0,0,0.0,6,1.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,2010,139.0,29.0,No,United States,North America,0,0,3,4,4,40,0,medium,No,Large,Yes,Yes,No,No,Both,Yes,Private,No,Both,No,Platform,Local,Non-Linear,No,Few,No,No,No,Yes,No,Yes,Yes,No,No,No,Online,B2B,Medium,High,Yes,Medium,Masters,21,Yes,Yes,Tier_1,500,Medium,0,0,0,High,5,Few,4.5,Medium,Medium,Yes,No,Low,No,No,Yes,No,No,No,1015027,Yes,Yes,No,3.0,Medium,0,10.0,6.7,Not Applicable,20,6.4,0.0,0,100.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0,0.0,2,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,2011,306.0,16.0,No,United States,North America,13,0,2,0,2,50,0,medium,Yes,Medium,Yes,Yes,No,No,Product,Yes,Public,Yes,Both,No,Platform,Local,Non-Linear,No,Few,Yes,No,No,No,No,No,Yes,No,Yes,No,Online,B2B,Medium,High,Yes,High,Masters,21,Yes,Yes,Tier_1,500,High,0,0,0,Medium,13,,48.0,Low,Medium,No,No,Low,No,Yes,No,No,No,No,256921,Yes,Yes,No,3.0,Medium,0,16.66666667,11.0,Not Applicable,18,12.0,8.333333333,0,46.73202614,5.718954248,8.333333333,0.0,19.77124183,2.777777778,2.777777778,0.0,0,0,0,5.555555556,5,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


### Encoding categorical variables as numericals...

First, let's encode the statuses.

In [637]:
statuses.unique()

array(['Success', 'Failed'], dtype=object)

In [638]:
statuses.replace(to_replace=['Success', 'Failed'], value=[1,0], inplace=True)
statuses.head()

0    1
1    1
2    1
3    1
4    1
Name: Dependent-Company Status, dtype: int64

Let's consider answers having less than 20 unique values as categorical in our feature list.

In [639]:
cat_columns = []
cat_vals = []
for col in features.columns:
    if len(features[col].unique()) < 21:
        if isinstance(features[col][0], str):
            col_set = sorted(features[col].apply(lambda x: x.lower()).unique())
        else:
            col_set = sorted(features[col].unique())
        if not((len(col_set) == 2) and (col_set[0] == 0) and (col_set[1] == 1)):
            cat_columns.append(col)
            cat_vals.append(col_set)
        
categoricals = pd.DataFrame()
categoricals['Feature'] = cat_columns
categoricals['Categories'] = cat_vals

categoricals

Unnamed: 0,Feature,Categories
0,year of founding,"[1999, 2000, 2002, 2004, 2005, 2006, 2007, 200..."
1,Has the team size grown,"[no, yes]"
2,Country of company,"[belgium, bulgaria, czech republic, denmark, f..."
3,Continent of company,"[asia, europe, north america]"
4,Number of Investors in Seed,"[0, 1, 10, 13, 15, 17, 2, 22, 24, 3, 4, 5, 6, ..."
5,Number of Investors in Angel and or VC,"[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]"
6,Number of Co-founders,"[0, 1, 2, 3, 4, 5, 7]"
7,Number of of advisors,"[0, 1, 2, 3, 4, 5, 6, 7, 8, 11]"
8,Team size Senior leadership,"[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]"
9,Number of of repeat investors,"[0, 1, 10, 2, 3, 4]"


Let's take out all the features that have two options and replace them with 0/1s. This includes yes/no, linear/non-linear, local/global, etc..

In [640]:
for row, col in zip(range(len(categoricals)), categoricals.Feature):
    if isinstance(features[col][0], str):
        col_set = categoricals['Categories'][row]
        if (len(col_set) == 2):
            features[col] = features[col].apply(lambda x: x.lower())
            features[col].replace(to_replace=col_set, value=[0,1], inplace=True)
            categoricals = categoricals[categoricals['Feature'] != col]
features.head()

Unnamed: 0,year of founding,Internet Activity Score,Employee Count,Has the team size grown,Country of company,Continent of company,Number of Investors in Seed,Number of Investors in Angel and or VC,Number of Co-founders,Number of of advisors,Team size Senior leadership,Team size all employees,Number of of repeat investors,Number of Sales Support material,Worked in top companies,Average size of companies worked for in the past,Have been part of startups in the past?,Have been part of successful startups in the past?,Was he or she partner in Big 5 consulting?,Consulting experience?,Product or service company?,Catering to product/service across verticals,Focus on private or public data?,Focus on consumer data?,Focus on structured or unstructured data,Subscription based business,Cloud or platform based serive/product?,Local or global player,Linear or Non-linear business model,"Capital intensive business e.g. e-commerce, Engineering products and operations can also cause a business to be capital intensive",Number of of Partners of company,Crowdsourcing based business,Crowdfunding based business,Machine Learning based business,Predictive Analytics business,Speech analytics business,Prescriptive analytics business,Big Data Business,Cross-Channel Analytics/ marketing channels,Owns data or not? (monetization of data) e.g. Factual,Is the company an aggregator/market place? e.g. Bluekai,Online or offline venture - physical location based business or online venture?,B2C or B2B venture?,Top forums like 'Tech crunch' or 'Venture beat' talking about the company/model - How much is it being talked about?,Average Years of experience for founder and co founder,Exposure across the globe,Breadth of experience across verticals,Highest education,Years of education,Relevance of education to venture,Relevance of experience to venture,Degree from a Tier 1 or Tier 2 university?,Renowned in professional circle,Experience in selling and building products,Experience in Fortune 100 organizations,Experience in Fortune 500 organizations,Experience in Fortune 1000 organizations,Top management similarity,Number of Recognitions for Founders and Co-founders,Number of of Research publications,Skills score,Team Composition score,Dificulty of Obtaining Work force,Pricing Strategy,Hyper localisation,Time to market service or product,Long term relationship with other founders,Proprietary or patent position (competitive position),Barriers of entry for the competitors,Company awards,Controversial history of founder or co founder,Legal risk and intellectual property,google page rank of company website,Technical proficiencies to analyse and interpret unstructured data,Solutions offered,Invested through global incubation competitions?,Industry trend in investing,Disruptiveness of technology,Number of Direct competitors,Employees per year of company existence,Last round of funding received (in milionUSD),"Survival through recession, based on existence of the company through recession times",Time to 1st investment (in months),"Avg time to investment - average across all rounds, measured from previous investment",Percent_skill_Entrepreneurship,Percent_skill_Operations,Percent_skill_Engineering,Percent_skill_Marketing,Percent_skill_Leadership,Percent_skill_Data Science,Percent_skill_Business Strategy,Percent_skill_Product Management,Percent_skill_Sales,Percent_skill_Domain,Percent_skill_Law,Percent_skill_Consulting,Percent_skill_Finance,Percent_skill_Investment,Renown score,analytics,cloud computing,software development,marketing,enterprise software,food & beverages,hospitality,network / hosting / infrastructure,mobile,healthcare,pharmaceuticals,media,finance,e-commerce,gaming,email,security,publishing,education,advertising,entertainment,transportation,retail,social networking,search,insurance,cleantech,energy,market research,deals,telecommunications,music,techstars,streamlined ventures,amplify partners,rincon venture partners,pelion venture partners,500 startups,loren siebert,jason seats,xg ventures,george karidis,sam choi,morris wheeler,data collective,pejman nozad,ullas naik,dirk elmendorf,galvanize,pat matthews,paul kedrosky,matt ocko,cloud power capital,jared kopf,anne johnson,issac roth,george karutz,jim deters,zachary aarons,zack bogue,dfj frontier,draper nexus ventures,gil elbaz,auren hoffman,walter kortschak,mi ventures,brand ventures,daher capital,double m partners,gold hill capital,clark landry,draper associates,mi ventures llc,signia venture partners,pritzker group venture capital,excelerate labs,hyde park venture partners,chicago ventures,amicus capital,ideo,olive ventures,kd capital l.l.c,norwest venture partners,bessemer venture partners,atlas venture,promus ventures,softtech vc,costanoa venture capital,lee linden,chamath palihapitiya,raj de datta,tim kendall,omar siddiqui,david vivero,sequoia capital,khosla ventures,redpoint ventures,first round capital,pivotnorth capital,battery ventures,in-q-tel,jump capital,sap ventures,northwestern university,harvard business school angels,tech coast angels,hummer winblad venture partners,us venture partners,presidio ventures,mohr davidow ventures,grotech ventures,access venture partners,ffp holdings,matrix partners,allan zeise,thomas shannon,allen zeise,wisconsin investment partners,silicon pastures,thomas vincent,ross bjella,david lisle,ed karrels,jeff harris,tom demell,thomas demell,jeffery harris,michael kluiber,peter skanaivs,andy wojack,crista wojack,eric kessenich,john schmidt,angels on the water,ea media syndicate i llc,glen surnamer,jeff schweiger,defense advanced research projects agency,sv angel,harrison metal capital,baseline ventures,greylock partners,dick costolo,reid hoffman,jeff jordan,harrison metal,bain capital ventures,square 1 bank,high line venture partners,allen debevoise,matt coffin,paul bricault,jonah goodhart,dean gilbert,tony nethercutt,firstmark capital,ben ling,lerer ventures,cit gap funds,jaffray woodriff,hyde park angels,i2a fund,social leverage,morningstar,reed elsevier ventures,active up,guess,audilion,jean-luc halleux,dany donnen,startupinvest,launchub,launch hub,Unnamed: 273,boulas ventures,seed4soft,seventure partners,elaia partners,isai,capital spreads,rockaway capital,stephen bullock,chris underhill,scottish equity partners,seedcamp,innovation warehouse,stefan glaenzer,sherry coutu,ab banerjee,shamil chandaria,anish chandaria,bill emmott,ditlev schwanenflugel,ken olisa,azeem azhar,anthemis group,meridian venture partners,tom glocer,sanford dickert,alastair mitchell,andy mcloughlin,tim jackson,phil wilkinson,guy westlake,sean cornwell,ned cranborne,shan drummond,innova kapital,cantabria capital,voyager capital,vegastechfund,lars-henrik friis molin,amicus group,kolind a/s,appian ventures,core capital partners,qed investors,tech wildcatters,start-up chile,aleph,capital innovators,cultivation capital,eric kwan,google ventures,kenny van zant,franklyn chien,ryan merket,bobby goodlatte,yishan wong,brian mcclendon,charles river ventures,new enterprise associates,y combinator,gabor cselle,george zachary,jim young,alison rosenthal,jerry yang,paul buchheit,tie angels,hub angels investment group,jit saxena,rob soni,john simon,gaugarin oliver,prakash khot,launchcapital,chris devore,tom peterson,aaron bird,founder's co-op,howard lindzon,friends and family,jeremie berrebi,sutter hill ventures,crunchfund,rtp ventures,almaz capital,ru-net holdings,nick rau,bob jacobson,jimmy barge,jor law,lux capital,fidelity biosciences,venrock,highland capital partners,gerson lehrman group,jonathan bush,ed park,john goldsmith,james golden,ia ventures,digital sky technologies,raymond tonsing,accel partners,start fund,sam altman,ashton kutcher,guy oseary,ron garret,karl jacob,marco bergmann,mark larosa,tom mcinerney,janis krums,cross atlantic capital partners,edison ventures,rikki tahta,simon murdoch,roland beaulieu,gary mueller,spring lake equity partners,epic ventures,kevin o'connor,omidyar network,marc andreessen,ron conway,al avery,floodgate,svb financial group,kevin rose,floodgate fund,resolute.vc,founder collective,thrive capital,alexis ohanian,bubba murarka,reinmkr capital,shasta ventures,felicis ventures,jeffrey p. parker,tom falus,matt dwyer,sam wohlstadter,high peaks venture partners,kec ventures,softbank capital,jeff parker,fintech collective,intel capital,north bridge venture partners,vision ridge partners,green tree equity,jove equity partners,david cohen,brett jackson,bart lorang,walt winshall,valero capital,five mill ventures,westly group,the menlo park,javelin venture partners,alireza masrour,plug & play ventures,united parcel service,osage university partners,scott mcnealy,andreessen horowitz,point nine capital,sand hill angels,dingman center angels,interwest partners,tenoneten ventures,foundry group,upfront ventures,kbs+ ventures,neu venture capital,apricot capital,globespan capital partners,menlo ventures,flybridge capital partners,commonwealth capital ventures,national science foundation,michael chaney,frederick farrar,t. mills kelly,wayne buder,gregory t stern,kenneth gilbert,gregory t.stern,peter j. toren,marco giberti,paolo rubatto,greg stern,peter toren,david neithercutt,investment group of santa barbara,madrona venture group,mhs capital,rob glaser,kleiner perkins caufield & byers,dag ventures,austin ventures,contour venture partners,allegro venture partners,mack capital,brett hurt,sam decker,adam ross,tom meredith,dean drako,vulcan capital,geoff entress,bloomberg beta,ignition partners,next world capital,software ag,workday,citi ventures,august capital,morado venture partners,ame cloud ventures,sopris partners,voodoo ventures,safeguard scientifics,charles f. dolan,golden seeds,springboard enterprises,miramar venture partners,ata ventures,foundation capital,stanford university,e.on,zig capital,jcb investments,moore venture partners,la jolla holding,tomorrowventures,venture capitals,mehdi daoudi,boldstart ventures,oak investment partners,diamondhead ventures,ampersand capital partners,qualcomm ventures,american express ventures,maine technology institute,libra future fund,i2e,seedstep angels group,divergent ventures,john ives,lew moorman,john engates,revel partners,commonangels,boston seed capital,dharmesh shah,waikit lau,nick ducoff,jacob perkins,dan casey,crosslink capital,kapor capital,giza venture capital,endeavor partners,naval ravikant,mark goines,josh james,ecosystem ventures,paige craig,correlation ventures,quest venture partners,sigma prime ventures,kepha partners,sigma partners,deutsche telekom,index ventures,t-venture,gerard govaerts,fortino,vendep oy,eden ventures,pentech ventures,oxford technology management,imperial innovations,bootstrap incubation,startup bootcamp,startupbootcamp,dan somers,technology strategy board,christoph janz,alexander bruehl,dn capital,slow ventures,draper fisher jurvetson (dfj),mayfield fund,draper richards,mangrove capital partners,lehman brothers,sierra ventures,team europe,fabrice grinda,jose marin,james gutierrez,embarcadero ventures,inovia capital,greycroft partners,brian s. cohen,phil grieshaber,john taysom,justin siegel,jeffrey silverman,jerry newman,bruno bowden,jeff hammerbacher,michael abbott,andrew mccollum,ed roberts,jean hammond,quotidian ventures,general catalyst partners,lowercase capital,lightbank,babak nivi,steamboat ventures,the mail room fund,launchpad la,at&t,stage one capital,mercury fund,ff venture capital,dundee venture capital,mfi capital,jim pallotta,josh mailman,radcliff group,hadi partovi,ali partovi,e.ventures,satya patel,redpoint eventures,split rock partners,ggv capital,canopy ventures,allegis capital,trident capital,canopy group,dry canyon holdings,cross creek capital,indous venture partners,triplepoint capital,jim clark,mike ramsay,gabriell weinberg,david cancel,joshua schachter,roy rodenstein,project 11 ventures,meakem becker venture capital,sunstone capital,operations,sales,computing,research,database management,technology,analytic,software,social media management,customer servce,customer service,app revenue,data visualization,service,operation,accounting,training,risk,data collection,social news,consumer web,data management,strategy,tool,inventory management,energy saving,optimization,crm,pricing,customer targeting,search enginenoptimization,customer engagement,social media analytics,content marketing,presentations,social media,dashboards,localized behaviour,wireless,sale,social network,music intelligece,network optimization
0,2011,455.0,14.0,0,United States,North America,15,0,3,0,7,10,0,low,1,Medium,0,0,0,0,Both,1,Private,1,Both,1,cloud,1,1,0,Few,0,0,0,1,0,0,1,0,0,0,Online,0,Low,Medium,1,Low,Bachelors,18,1,1,Tier_2,500,High,0,0,1,Medium,18,,36.0,High,Medium,1,0,Low,1,1,1,0,No,0,71391,1,1,1,3.0,Medium,0,3.333333333,2.35,Not Applicable,2,7.344444444,9.401709402,0,57.47863248,0.0,0.0,3.846153846,17.09401709,9.401709402,0.0,2.777777778,0,0,0,0.0,9,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,2010,496.0,39.0,0,United States,North America,7,0,1,1,8,40,0,high,0,Small,0,0,0,0,Product,1,Public,1,Both,0,Platform,1,1,1,Few,0,0,0,0,0,0,1,0,0,0,Online,0,Low,High,1,Medium,Bachelors,18,1,1,,500,High,0,0,0,Low,5,Few,23.0,Medium,Medium,1,0,Low,1,1,1,0,No,0,201814,1,1,0,3.0,Medium,0,10.0,5.5,Not Applicable,13,9.822222222,0.0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0,0.0,6,1.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,2010,106.0,14.0,0,United States,North America,2,0,4,0,4,14,2,medium,0,Medium,0,0,0,0,Service,0,Public,1,Both,0,Platform,1,1,0,Few,0,0,0,0,0,0,1,0,1,0,Online,0,Low,High,1,Medium,Masters,21,0,0,,500,Low,0,0,0,Low,5,,25.0,Medium,Medium,1,0,Low,0,1,0,1,No,0,591816,1,1,1,4.0,High,0,3.5,1.0,Not Applicable,12,9.322222222,6.25,0,3.125,15.625,9.375,3.125,6.25,3.125,3.125,0.0,0,0,0,0.0,6,1.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,2010,139.0,29.0,0,United States,North America,0,0,3,4,4,40,0,medium,0,Large,1,1,0,0,Both,1,Private,0,Both,0,Platform,1,1,0,Few,0,0,0,1,0,1,1,0,0,0,Online,0,Medium,High,1,Medium,Masters,21,1,1,Tier_1,500,Medium,0,0,0,High,5,Few,4.5,Medium,Medium,1,0,Low,0,0,1,0,No,0,1015027,1,1,0,3.0,Medium,0,10.0,6.7,Not Applicable,20,6.4,0.0,0,100.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0,0.0,2,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,2011,306.0,16.0,0,United States,North America,13,0,2,0,2,50,0,medium,1,Medium,1,1,0,0,Product,1,Public,1,Both,0,Platform,1,1,0,Few,1,0,0,0,0,0,1,0,1,0,Online,0,Medium,High,1,High,Masters,21,1,1,Tier_1,500,High,0,0,0,Medium,13,,48.0,Low,Medium,0,0,Low,0,1,0,0,No,0,256921,1,1,0,3.0,Medium,0,16.66666667,11.0,Not Applicable,18,12.0,8.333333333,0,46.73202614,5.718954248,8.333333333,0.0,19.77124183,2.777777778,2.777777778,0.0,0,0,0,5.555555556,5,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [641]:
categoricals = categoricals.reset_index(drop=True)
categoricals

Unnamed: 0,Feature,Categories
0,year of founding,"[1999, 2000, 2002, 2004, 2005, 2006, 2007, 200..."
1,Country of company,"[belgium, bulgaria, czech republic, denmark, f..."
2,Continent of company,"[asia, europe, north america]"
3,Number of Investors in Seed,"[0, 1, 10, 13, 15, 17, 2, 22, 24, 3, 4, 5, 6, ..."
4,Number of Investors in Angel and or VC,"[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]"
5,Number of Co-founders,"[0, 1, 2, 3, 4, 5, 7]"
6,Number of of advisors,"[0, 1, 2, 3, 4, 5, 6, 7, 8, 11]"
7,Team size Senior leadership,"[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]"
8,Number of of repeat investors,"[0, 1, 10, 2, 3, 4]"
9,Number of Sales Support material,"[high, low, medium, nothing]"


As we can see, the *'Controversial history of founder or co founder'* feature has a unique answer throughout the database, so let's just drop it, as it does not give us any information.

In [642]:
features = features.drop('Controversial history of founder or co founder', 1)
categoricals = categoricals[categoricals['Feature'] != 'Controversial history of founder or co founder']

categoricals

Unnamed: 0,Feature,Categories
0,year of founding,"[1999, 2000, 2002, 2004, 2005, 2006, 2007, 200..."
1,Country of company,"[belgium, bulgaria, czech republic, denmark, f..."
2,Continent of company,"[asia, europe, north america]"
3,Number of Investors in Seed,"[0, 1, 10, 13, 15, 17, 2, 22, 24, 3, 4, 5, 6, ..."
4,Number of Investors in Angel and or VC,"[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]"
5,Number of Co-founders,"[0, 1, 2, 3, 4, 5, 7]"
6,Number of of advisors,"[0, 1, 2, 3, 4, 5, 6, 7, 8, 11]"
7,Team size Senior leadership,"[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]"
8,Number of of repeat investors,"[0, 1, 10, 2, 3, 4]"
9,Number of Sales Support material,"[high, low, medium, nothing]"


Let's now rule out the Features that don't seem feasible to be viewed as categoricals. These features are:

- 'Percent_skill_Consulting'
- 'Percent_skill_Finance'
- 'Renowned in professional circle'

The rest are better off as categoricals, as the dependency should not neccessarily be linear, and we can better analyze having those features as categoricals.

In [643]:
categoricals = categoricals[categoricals['Feature'] != 'Percent_skill_Consulting']
categoricals = categoricals[categoricals['Feature'] != 'Percent_skill_Finance']
categoricals = categoricals[categoricals['Feature'] != 'Renowned in professional circle']
categoricals = categoricals.reset_index(drop=True)

categoricals

Unnamed: 0,Feature,Categories
0,year of founding,"[1999, 2000, 2002, 2004, 2005, 2006, 2007, 200..."
1,Country of company,"[belgium, bulgaria, czech republic, denmark, f..."
2,Continent of company,"[asia, europe, north america]"
3,Number of Investors in Seed,"[0, 1, 10, 13, 15, 17, 2, 22, 24, 3, 4, 5, 6, ..."
4,Number of Investors in Angel and or VC,"[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]"
5,Number of Co-founders,"[0, 1, 2, 3, 4, 5, 7]"
6,Number of of advisors,"[0, 1, 2, 3, 4, 5, 6, 7, 8, 11]"
7,Team size Senior leadership,"[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]"
8,Number of of repeat investors,"[0, 1, 10, 2, 3, 4]"
9,Number of Sales Support material,"[high, low, medium, nothing]"


Great! Now we are left only with the features that should be encoded as binary variables, like we did in case of lists. We just ahve to make sure that every column has the original feature name attached to the answer so that noting gets overwrote (e.g. *'both'*, *'none'*). 

In [644]:
for index, ignored in features.iterrows():
    for row, feature in zip(range(len(categoricals)), categoricals.Feature):
        for option in categoricals['Categories'][row]:
            col_name = feature + " " + str(option)
            features.at[index, col_name] = 1
        
features = features.fillna(0)

features.head()

Unnamed: 0,year of founding,Internet Activity Score,Employee Count,Has the team size grown,Country of company,Continent of company,Number of Investors in Seed,Number of Investors in Angel and or VC,Number of Co-founders,Number of of advisors,Team size Senior leadership,Team size all employees,Number of of repeat investors,Number of Sales Support material,Worked in top companies,Average size of companies worked for in the past,Have been part of startups in the past?,Have been part of successful startups in the past?,Was he or she partner in Big 5 consulting?,Consulting experience?,Product or service company?,Catering to product/service across verticals,Focus on private or public data?,Focus on consumer data?,Focus on structured or unstructured data,Subscription based business,Cloud or platform based serive/product?,Local or global player,Linear or Non-linear business model,"Capital intensive business e.g. e-commerce, Engineering products and operations can also cause a business to be capital intensive",Number of of Partners of company,Crowdsourcing based business,Crowdfunding based business,Machine Learning based business,Predictive Analytics business,Speech analytics business,Prescriptive analytics business,Big Data Business,Cross-Channel Analytics/ marketing channels,Owns data or not? (monetization of data) e.g. Factual,Is the company an aggregator/market place? e.g. Bluekai,Online or offline venture - physical location based business or online venture?,B2C or B2B venture?,Top forums like 'Tech crunch' or 'Venture beat' talking about the company/model - How much is it being talked about?,Average Years of experience for founder and co founder,Exposure across the globe,Breadth of experience across verticals,Highest education,Years of education,Relevance of education to venture,Relevance of experience to venture,Degree from a Tier 1 or Tier 2 university?,Renowned in professional circle,Experience in selling and building products,Experience in Fortune 100 organizations,Experience in Fortune 500 organizations,Experience in Fortune 1000 organizations,Top management similarity,Number of Recognitions for Founders and Co-founders,Number of of Research publications,Skills score,Team Composition score,Dificulty of Obtaining Work force,Pricing Strategy,Hyper localisation,Time to market service or product,Long term relationship with other founders,Proprietary or patent position (competitive position),Barriers of entry for the competitors,Company awards,Legal risk and intellectual property,google page rank of company website,Technical proficiencies to analyse and interpret unstructured data,Solutions offered,Invested through global incubation competitions?,Industry trend in investing,Disruptiveness of technology,Number of Direct competitors,Employees per year of company existence,Last round of funding received (in milionUSD),"Survival through recession, based on existence of the company through recession times",Time to 1st investment (in months),"Avg time to investment - average across all rounds, measured from previous investment",Percent_skill_Entrepreneurship,Percent_skill_Operations,Percent_skill_Engineering,Percent_skill_Marketing,Percent_skill_Leadership,Percent_skill_Data Science,Percent_skill_Business Strategy,Percent_skill_Product Management,Percent_skill_Sales,Percent_skill_Domain,Percent_skill_Law,Percent_skill_Consulting,Percent_skill_Finance,Percent_skill_Investment,Renown score,analytics,cloud computing,software development,marketing,enterprise software,food & beverages,hospitality,network / hosting / infrastructure,mobile,healthcare,pharmaceuticals,media,finance,e-commerce,gaming,email,security,publishing,education,advertising,entertainment,transportation,retail,social networking,search,insurance,cleantech,energy,market research,deals,telecommunications,music,techstars,streamlined ventures,amplify partners,rincon venture partners,pelion venture partners,500 startups,loren siebert,jason seats,xg ventures,george karidis,sam choi,morris wheeler,data collective,pejman nozad,ullas naik,dirk elmendorf,galvanize,pat matthews,paul kedrosky,matt ocko,cloud power capital,jared kopf,anne johnson,issac roth,george karutz,jim deters,zachary aarons,zack bogue,dfj frontier,draper nexus ventures,gil elbaz,auren hoffman,walter kortschak,mi ventures,brand ventures,daher capital,double m partners,gold hill capital,clark landry,draper associates,mi ventures llc,signia venture partners,pritzker group venture capital,excelerate labs,hyde park venture partners,chicago ventures,amicus capital,ideo,olive ventures,kd capital l.l.c,norwest venture partners,bessemer venture partners,atlas venture,promus ventures,softtech vc,costanoa venture capital,lee linden,chamath palihapitiya,raj de datta,tim kendall,omar siddiqui,david vivero,sequoia capital,khosla ventures,redpoint ventures,first round capital,pivotnorth capital,battery ventures,in-q-tel,jump capital,sap ventures,northwestern university,harvard business school angels,tech coast angels,hummer winblad venture partners,us venture partners,presidio ventures,mohr davidow ventures,grotech ventures,access venture partners,ffp holdings,matrix partners,allan zeise,thomas shannon,allen zeise,wisconsin investment partners,silicon pastures,thomas vincent,ross bjella,david lisle,ed karrels,jeff harris,tom demell,thomas demell,jeffery harris,michael kluiber,peter skanaivs,andy wojack,crista wojack,eric kessenich,john schmidt,angels on the water,ea media syndicate i llc,glen surnamer,jeff schweiger,defense advanced research projects agency,sv angel,harrison metal capital,baseline ventures,greylock partners,dick costolo,reid hoffman,jeff jordan,harrison metal,bain capital ventures,square 1 bank,high line venture partners,allen debevoise,matt coffin,paul bricault,jonah goodhart,dean gilbert,tony nethercutt,firstmark capital,ben ling,lerer ventures,cit gap funds,jaffray woodriff,hyde park angels,i2a fund,social leverage,morningstar,reed elsevier ventures,active up,guess,audilion,jean-luc halleux,dany donnen,startupinvest,launchub,launch hub,Unnamed: 272,boulas ventures,seed4soft,seventure partners,elaia partners,isai,capital spreads,rockaway capital,stephen bullock,chris underhill,scottish equity partners,seedcamp,innovation warehouse,stefan glaenzer,sherry coutu,ab banerjee,shamil chandaria,anish chandaria,bill emmott,ditlev schwanenflugel,ken olisa,azeem azhar,anthemis group,meridian venture partners,tom glocer,sanford dickert,alastair mitchell,andy mcloughlin,tim jackson,phil wilkinson,guy westlake,sean cornwell,ned cranborne,shan drummond,innova kapital,cantabria capital,voyager capital,vegastechfund,lars-henrik friis molin,amicus group,kolind a/s,appian ventures,core capital partners,qed investors,tech wildcatters,start-up chile,aleph,capital innovators,cultivation capital,eric kwan,google ventures,kenny van zant,franklyn chien,ryan merket,bobby goodlatte,yishan wong,brian mcclendon,charles river ventures,new enterprise associates,y combinator,gabor cselle,george zachary,jim young,alison rosenthal,jerry yang,paul buchheit,tie angels,hub angels investment group,jit saxena,rob soni,john simon,gaugarin oliver,prakash khot,launchcapital,chris devore,tom peterson,aaron bird,founder's co-op,howard lindzon,friends and family,jeremie berrebi,sutter hill ventures,crunchfund,rtp ventures,almaz capital,ru-net holdings,nick rau,bob jacobson,jimmy barge,jor law,lux capital,fidelity biosciences,venrock,highland capital partners,gerson lehrman group,jonathan bush,ed park,john goldsmith,james golden,ia ventures,digital sky technologies,raymond tonsing,accel partners,start fund,sam altman,ashton kutcher,guy oseary,ron garret,karl jacob,marco bergmann,mark larosa,tom mcinerney,janis krums,cross atlantic capital partners,edison ventures,rikki tahta,simon murdoch,roland beaulieu,gary mueller,spring lake equity partners,epic ventures,kevin o'connor,omidyar network,marc andreessen,ron conway,al avery,floodgate,svb financial group,kevin rose,floodgate fund,resolute.vc,founder collective,thrive capital,alexis ohanian,bubba murarka,reinmkr capital,shasta ventures,felicis ventures,jeffrey p. parker,tom falus,matt dwyer,sam wohlstadter,high peaks venture partners,kec ventures,softbank capital,jeff parker,fintech collective,intel capital,north bridge venture partners,vision ridge partners,green tree equity,jove equity partners,david cohen,brett jackson,bart lorang,walt winshall,valero capital,five mill ventures,westly group,the menlo park,javelin venture partners,alireza masrour,plug & play ventures,united parcel service,osage university partners,scott mcnealy,andreessen horowitz,point nine capital,sand hill angels,dingman center angels,interwest partners,tenoneten ventures,foundry group,upfront ventures,kbs+ ventures,neu venture capital,apricot capital,globespan capital partners,menlo ventures,flybridge capital partners,commonwealth capital ventures,national science foundation,michael chaney,frederick farrar,t. mills kelly,wayne buder,gregory t stern,kenneth gilbert,gregory t.stern,peter j. toren,marco giberti,paolo rubatto,greg stern,peter toren,david neithercutt,investment group of santa barbara,madrona venture group,mhs capital,rob glaser,kleiner perkins caufield & byers,dag ventures,austin ventures,contour venture partners,allegro venture partners,mack capital,brett hurt,sam decker,adam ross,tom meredith,dean drako,vulcan capital,geoff entress,bloomberg beta,ignition partners,next world capital,software ag,workday,citi ventures,august capital,morado venture partners,ame cloud ventures,sopris partners,voodoo ventures,safeguard scientifics,charles f. dolan,golden seeds,springboard enterprises,miramar venture partners,ata ventures,foundation capital,stanford university,e.on,zig capital,jcb investments,moore venture partners,la jolla holding,tomorrowventures,venture capitals,mehdi daoudi,boldstart ventures,oak investment partners,diamondhead ventures,ampersand capital partners,qualcomm ventures,american express ventures,maine technology institute,libra future fund,i2e,seedstep angels group,divergent ventures,john ives,lew moorman,john engates,revel partners,commonangels,boston seed capital,dharmesh shah,waikit lau,nick ducoff,jacob perkins,dan casey,crosslink capital,kapor capital,giza venture capital,endeavor partners,naval ravikant,mark goines,josh james,ecosystem ventures,paige craig,correlation ventures,quest venture partners,sigma prime ventures,kepha partners,sigma partners,deutsche telekom,index ventures,t-venture,gerard govaerts,fortino,vendep oy,eden ventures,pentech ventures,oxford technology management,imperial innovations,bootstrap incubation,startup bootcamp,startupbootcamp,dan somers,technology strategy board,christoph janz,alexander bruehl,dn capital,slow ventures,draper fisher jurvetson (dfj),mayfield fund,draper richards,mangrove capital partners,lehman brothers,sierra ventures,team europe,fabrice grinda,jose marin,james gutierrez,embarcadero ventures,inovia capital,greycroft partners,brian s. cohen,phil grieshaber,john taysom,justin siegel,jeffrey silverman,jerry newman,bruno bowden,jeff hammerbacher,michael abbott,andrew mccollum,ed roberts,jean hammond,quotidian ventures,general catalyst partners,lowercase capital,lightbank,babak nivi,steamboat ventures,the mail room fund,launchpad la,at&t,stage one capital,mercury fund,ff venture capital,dundee venture capital,mfi capital,jim pallotta,josh mailman,radcliff group,hadi partovi,ali partovi,e.ventures,satya patel,redpoint eventures,split rock partners,ggv capital,canopy ventures,allegis capital,trident capital,canopy group,dry canyon holdings,cross creek capital,indous venture partners,triplepoint capital,jim clark,mike ramsay,gabriell weinberg,david cancel,joshua schachter,roy rodenstein,project 11 ventures,meakem becker venture capital,sunstone capital,operations,sales,computing,research,database management,technology,analytic,software,social media management,customer servce,customer service,app revenue,data visualization,service,operation,accounting,training,risk,data collection,social news,consumer web,data management,strategy,tool,inventory management,energy saving,optimization,crm,pricing,customer targeting,search enginenoptimization,customer engagement,social media analytics,content marketing,presentations,social media,dashboards,localized behaviour,wireless,sale,social network,music intelligece,network optimization,year of founding 1999,year of founding 2000,year of founding 2002,year of founding 2004,year of founding 2005,year of founding 2006,year of founding 2007,year of founding 2008,year of founding 2009,year of founding 2010,year of founding 2011,year of founding 2012,year of founding 2013,Country of company belgium,Country of company bulgaria,Country of company czech republic,Country of company denmark,Country of company finland,Country of company france,Country of company india,Country of company russian federation,Country of company singapore,Country of company switzerland,Country of company united kingdom,Country of company united states,Continent of company asia,Continent of company europe,Continent of company north america,Number of Investors in Seed 0,Number of Investors in Seed 1,Number of Investors in Seed 10,Number of Investors in Seed 13,Number of Investors in Seed 15,Number of Investors in Seed 17,Number of Investors in Seed 2,Number of Investors in Seed 22,Number of Investors in Seed 24,Number of Investors in Seed 3,Number of Investors in Seed 4,Number of Investors in Seed 5,Number of Investors in Seed 6,Number of Investors in Seed 7,Number of Investors in Seed 8,Number of Investors in Seed 9,Number of Investors in Angel and or VC 0,Number of Investors in Angel and or VC 1,Number of Investors in Angel and or VC 2,Number of Investors in Angel and or VC 3,Number of Investors in Angel and or VC 4,Number of Investors in Angel and or VC 5,Number of Investors in Angel and or VC 6,Number of Investors in Angel and or VC 7,Number of Investors in Angel and or VC 8,Number of Investors in Angel and or VC 9,Number of Co-founders 0,Number of Co-founders 1,Number of Co-founders 2,Number of Co-founders 3,Number of Co-founders 4,Number of Co-founders 5,Number of Co-founders 7,Number of of advisors 0,Number of of advisors 1,Number of of advisors 2,Number of of advisors 3,Number of of advisors 4,Number of of advisors 5,Number of of advisors 6,Number of of advisors 7,Number of of advisors 8,Number of of advisors 11,Team size Senior leadership 1,Team size Senior leadership 2,Team size Senior leadership 3,Team size Senior leadership 4,Team size Senior leadership 5,Team size Senior leadership 6,Team size Senior leadership 7,Team size Senior leadership 8,Team size Senior leadership 9,Team size Senior leadership 10,Team size Senior leadership 11,Team size Senior leadership 12,Number of of repeat investors 0,Number of of repeat investors 1,Number of of repeat investors 10,Number of of repeat investors 2,Number of of repeat investors 3,Number of of repeat investors 4,Number of Sales Support material high,Number of Sales Support material low,Number of Sales Support material medium,Number of Sales Support material nothing,Average size of companies worked for in the past large,Average size of companies worked for in the past medium,Average size of companies worked for in the past small,Product or service company? both,Product or service company? product,Product or service company? service,Focus on private or public data? both,Focus on private or public data? no,Focus on private or public data? private,Focus on private or public data? public,Focus on structured or unstructured data both,Focus on structured or unstructured data no,Focus on structured or unstructured data not applicable,Focus on structured or unstructured data structured,Focus on structured or unstructured data unstructured,Cloud or platform based serive/product? both,Cloud or platform based serive/product? cloud,Cloud or platform based serive/product? none,Cloud or platform based serive/product? platform,Number of of Partners of company few,Number of of Partners of company many,Number of of Partners of company none,Online or offline venture - physical location based business or online venture? both,Online or offline venture - physical location based business or online venture? offline,Online or offline venture - physical location based business or online venture? online,Top forums like 'Tech crunch' or 'Venture beat' talking about the company/model - How much is it being talked about? high,Top forums like 'Tech crunch' or 'Venture beat' talking about the company/model - How much is it being talked about? low,Top forums like 'Tech crunch' or 'Venture beat' talking about the company/model - How much is it being talked about? medium,Top forums like 'Tech crunch' or 'Venture beat' talking about the company/model - How much is it being talked about? none,Average Years of experience for founder and co founder high,Average Years of experience for founder and co founder low,Average Years of experience for founder and co founder medium,Breadth of experience across verticals high,Breadth of experience across verticals low,Breadth of experience across verticals medium,Highest education bachelors,Highest education masters,Highest education phd,Years of education 18,Years of education 21,Years of education 25,Degree from a Tier 1 or Tier 2 university? both,Degree from a Tier 1 or Tier 2 university? none,Degree from a Tier 1 or Tier 2 university? tier_1,Degree from a Tier 1 or Tier 2 university? tier_2,Experience in selling and building products high,Experience in selling and building products low,Experience in selling and building products medium,Experience in selling and building products none,Top management similarity high,Top management similarity low,Top management similarity medium,Top management similarity none,Number of of Research publications few,Number of of Research publications many,Number of of Research publications none,Team Composition score high,Team Composition score low,Team Composition score medium,Dificulty of Obtaining Work force high,Dificulty of Obtaining Work force low,Dificulty of Obtaining Work force medium,Time to market service or product high,Time to market service or product low,Time to market service or product medium,Industry trend in investing 1.0,Industry trend in investing 2.0,Industry trend in investing 3.0,Industry trend in investing 4.0,Disruptiveness of technology high,Disruptiveness of technology low,Disruptiveness of technology medium,Number of Direct competitors 0,Number of Direct competitors 1,Number of Direct competitors 10,Number of Direct competitors 11,Number of Direct competitors 19,Number of Direct competitors 2,Number of Direct competitors 28,Number of Direct competitors 3,Number of Direct competitors 33,Number of Direct competitors 4,Number of Direct competitors 5,Number of Direct competitors 6,Number of Direct competitors 8,"Survival through recession, based on existence of the company through recession times no","Survival through recession, based on existence of the company through recession times not applicable","Survival through recession, based on existence of the company through recession times yes",Renown score 0,Renown score 1,Renown score 10,Renown score 2,Renown score 3,Renown score 4,Renown score 5,Renown score 6,Renown score 7,Renown score 8,Renown score 9
0,2011,455.0,14.0,0,United States,North America,15,0,3,0,7,10,0,low,1,Medium,0,0,0,0,Both,1,Private,1,Both,1,cloud,1,1,0,Few,0,0,0,1,0,0,1,0,0,0,Online,0,Low,Medium,1,Low,Bachelors,18,1,1,Tier_2,500,High,0,0,1,Medium,18,,36.0,High,Medium,1,0,Low,1,1,1,0,0,71391,1,1,1,3.0,Medium,0,3.333333333,2.35,Not Applicable,2,7.344444444,9.401709402,0,57.47863248,0.0,0.0,3.846153846,17.09401709,9.401709402,0.0,2.777777778,0,0,0,0.0,9,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0
1,2010,496.0,39.0,0,United States,North America,7,0,1,1,8,40,0,high,0,Small,0,0,0,0,Product,1,Public,1,Both,0,Platform,1,1,1,Few,0,0,0,0,0,0,1,0,0,0,Online,0,Low,High,1,Medium,Bachelors,18,1,1,,500,High,0,0,0,Low,5,Few,23.0,Medium,Medium,1,0,Low,1,1,1,0,0,201814,1,1,0,3.0,Medium,0,10.0,5.5,Not Applicable,13,9.822222222,0.0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0,0.0,6,1.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0
2,2010,106.0,14.0,0,United States,North America,2,0,4,0,4,14,2,medium,0,Medium,0,0,0,0,Service,0,Public,1,Both,0,Platform,1,1,0,Few,0,0,0,0,0,0,1,0,1,0,Online,0,Low,High,1,Medium,Masters,21,0,0,,500,Low,0,0,0,Low,5,,25.0,Medium,Medium,1,0,Low,0,1,0,1,0,591816,1,1,1,4.0,High,0,3.5,1.0,Not Applicable,12,9.322222222,6.25,0,3.125,15.625,9.375,3.125,6.25,3.125,3.125,0.0,0,0,0,0.0,6,1.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0
3,2010,139.0,29.0,0,United States,North America,0,0,3,4,4,40,0,medium,0,Large,1,1,0,0,Both,1,Private,0,Both,0,Platform,1,1,0,Few,0,0,0,1,0,1,1,0,0,0,Online,0,Medium,High,1,Medium,Masters,21,1,1,Tier_1,500,Medium,0,0,0,High,5,Few,4.5,Medium,Medium,1,0,Low,0,0,1,0,0,1015027,1,1,0,3.0,Medium,0,10.0,6.7,Not Applicable,20,6.4,0.0,0,100.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0,0.0,2,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0
4,2011,306.0,16.0,0,United States,North America,13,0,2,0,2,50,0,medium,1,Medium,1,1,0,0,Product,1,Public,1,Both,0,Platform,1,1,0,Few,1,0,0,0,0,0,1,0,1,0,Online,0,Medium,High,1,High,Masters,21,1,1,Tier_1,500,High,0,0,0,Medium,13,,48.0,Low,Medium,0,0,Low,0,1,0,0,0,256921,1,1,0,3.0,Medium,0,16.66666667,11.0,Not Applicable,18,12.0,8.333333333,0,46.73202614,5.718954248,8.333333333,0.0,19.77124183,2.777777778,2.777777778,0.0,0,0,0,5.555555556,5,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0


Let's drop the original columns.

In [645]:
features = features.reset_index(drop=True)

for feature in categoricals['Feature']:
    features = features.drop(feature, 1)
    
features.head()

Unnamed: 0,Internet Activity Score,Employee Count,Has the team size grown,Team size all employees,Worked in top companies,Have been part of startups in the past?,Have been part of successful startups in the past?,Was he or she partner in Big 5 consulting?,Consulting experience?,Catering to product/service across verticals,Focus on consumer data?,Subscription based business,Local or global player,Linear or Non-linear business model,"Capital intensive business e.g. e-commerce, Engineering products and operations can also cause a business to be capital intensive",Crowdsourcing based business,Crowdfunding based business,Machine Learning based business,Predictive Analytics business,Speech analytics business,Prescriptive analytics business,Big Data Business,Cross-Channel Analytics/ marketing channels,Owns data or not? (monetization of data) e.g. Factual,Is the company an aggregator/market place? e.g. Bluekai,B2C or B2B venture?,Exposure across the globe,Relevance of education to venture,Relevance of experience to venture,Renowned in professional circle,Experience in Fortune 100 organizations,Experience in Fortune 500 organizations,Experience in Fortune 1000 organizations,Number of Recognitions for Founders and Co-founders,Skills score,Pricing Strategy,Hyper localisation,Long term relationship with other founders,Proprietary or patent position (competitive position),Barriers of entry for the competitors,Company awards,Legal risk and intellectual property,google page rank of company website,Technical proficiencies to analyse and interpret unstructured data,Solutions offered,Invested through global incubation competitions?,Employees per year of company existence,Last round of funding received (in milionUSD),Time to 1st investment (in months),"Avg time to investment - average across all rounds, measured from previous investment",Percent_skill_Entrepreneurship,Percent_skill_Operations,Percent_skill_Engineering,Percent_skill_Marketing,Percent_skill_Leadership,Percent_skill_Data Science,Percent_skill_Business Strategy,Percent_skill_Product Management,Percent_skill_Sales,Percent_skill_Domain,Percent_skill_Law,Percent_skill_Consulting,Percent_skill_Finance,Percent_skill_Investment,analytics,cloud computing,software development,marketing,enterprise software,food & beverages,hospitality,network / hosting / infrastructure,mobile,healthcare,pharmaceuticals,media,finance,e-commerce,gaming,email,security,publishing,education,advertising,entertainment,transportation,retail,social networking,search,insurance,cleantech,energy,market research,deals,telecommunications,music,techstars,streamlined ventures,amplify partners,rincon venture partners,pelion venture partners,500 startups,loren siebert,jason seats,xg ventures,george karidis,sam choi,morris wheeler,data collective,pejman nozad,ullas naik,dirk elmendorf,galvanize,pat matthews,paul kedrosky,matt ocko,cloud power capital,jared kopf,anne johnson,issac roth,george karutz,jim deters,zachary aarons,zack bogue,dfj frontier,draper nexus ventures,gil elbaz,auren hoffman,walter kortschak,mi ventures,brand ventures,daher capital,double m partners,gold hill capital,clark landry,draper associates,mi ventures llc,signia venture partners,pritzker group venture capital,excelerate labs,hyde park venture partners,chicago ventures,amicus capital,ideo,olive ventures,kd capital l.l.c,norwest venture partners,bessemer venture partners,atlas venture,promus ventures,softtech vc,costanoa venture capital,lee linden,chamath palihapitiya,raj de datta,tim kendall,omar siddiqui,david vivero,sequoia capital,khosla ventures,redpoint ventures,first round capital,pivotnorth capital,battery ventures,in-q-tel,jump capital,sap ventures,northwestern university,harvard business school angels,tech coast angels,hummer winblad venture partners,us venture partners,presidio ventures,mohr davidow ventures,grotech ventures,access venture partners,ffp holdings,matrix partners,allan zeise,thomas shannon,allen zeise,wisconsin investment partners,silicon pastures,thomas vincent,ross bjella,david lisle,ed karrels,jeff harris,tom demell,thomas demell,jeffery harris,michael kluiber,peter skanaivs,andy wojack,crista wojack,eric kessenich,john schmidt,angels on the water,ea media syndicate i llc,glen surnamer,jeff schweiger,defense advanced research projects agency,sv angel,harrison metal capital,baseline ventures,greylock partners,dick costolo,reid hoffman,jeff jordan,harrison metal,bain capital ventures,square 1 bank,high line venture partners,allen debevoise,matt coffin,paul bricault,jonah goodhart,dean gilbert,tony nethercutt,firstmark capital,ben ling,lerer ventures,cit gap funds,jaffray woodriff,hyde park angels,i2a fund,social leverage,morningstar,reed elsevier ventures,active up,guess,audilion,jean-luc halleux,dany donnen,startupinvest,launchub,launch hub,Unnamed: 238,boulas ventures,seed4soft,seventure partners,elaia partners,isai,capital spreads,rockaway capital,stephen bullock,chris underhill,scottish equity partners,seedcamp,innovation warehouse,stefan glaenzer,sherry coutu,ab banerjee,shamil chandaria,anish chandaria,bill emmott,ditlev schwanenflugel,ken olisa,azeem azhar,anthemis group,meridian venture partners,tom glocer,sanford dickert,alastair mitchell,andy mcloughlin,tim jackson,phil wilkinson,guy westlake,sean cornwell,ned cranborne,shan drummond,innova kapital,cantabria capital,voyager capital,vegastechfund,lars-henrik friis molin,amicus group,kolind a/s,appian ventures,core capital partners,qed investors,tech wildcatters,start-up chile,aleph,capital innovators,cultivation capital,eric kwan,google ventures,kenny van zant,franklyn chien,ryan merket,bobby goodlatte,yishan wong,brian mcclendon,charles river ventures,new enterprise associates,y combinator,gabor cselle,george zachary,jim young,alison rosenthal,jerry yang,paul buchheit,tie angels,hub angels investment group,jit saxena,rob soni,john simon,gaugarin oliver,prakash khot,launchcapital,chris devore,tom peterson,aaron bird,founder's co-op,howard lindzon,friends and family,jeremie berrebi,sutter hill ventures,crunchfund,rtp ventures,almaz capital,ru-net holdings,nick rau,bob jacobson,jimmy barge,jor law,lux capital,fidelity biosciences,venrock,highland capital partners,gerson lehrman group,jonathan bush,ed park,john goldsmith,james golden,ia ventures,digital sky technologies,raymond tonsing,accel partners,start fund,sam altman,ashton kutcher,guy oseary,ron garret,karl jacob,marco bergmann,mark larosa,tom mcinerney,janis krums,cross atlantic capital partners,edison ventures,rikki tahta,simon murdoch,roland beaulieu,gary mueller,spring lake equity partners,epic ventures,kevin o'connor,omidyar network,marc andreessen,ron conway,al avery,floodgate,svb financial group,kevin rose,floodgate fund,resolute.vc,founder collective,thrive capital,alexis ohanian,bubba murarka,reinmkr capital,shasta ventures,felicis ventures,jeffrey p. parker,tom falus,matt dwyer,sam wohlstadter,high peaks venture partners,kec ventures,softbank capital,jeff parker,fintech collective,intel capital,north bridge venture partners,vision ridge partners,green tree equity,jove equity partners,david cohen,brett jackson,bart lorang,walt winshall,valero capital,five mill ventures,westly group,the menlo park,javelin venture partners,alireza masrour,plug & play ventures,united parcel service,osage university partners,scott mcnealy,andreessen horowitz,point nine capital,sand hill angels,dingman center angels,interwest partners,tenoneten ventures,foundry group,upfront ventures,kbs+ ventures,neu venture capital,apricot capital,globespan capital partners,menlo ventures,flybridge capital partners,commonwealth capital ventures,national science foundation,michael chaney,frederick farrar,t. mills kelly,wayne buder,gregory t stern,kenneth gilbert,gregory t.stern,peter j. toren,marco giberti,paolo rubatto,greg stern,peter toren,david neithercutt,investment group of santa barbara,madrona venture group,mhs capital,rob glaser,kleiner perkins caufield & byers,dag ventures,austin ventures,contour venture partners,allegro venture partners,mack capital,brett hurt,sam decker,adam ross,tom meredith,dean drako,vulcan capital,geoff entress,bloomberg beta,ignition partners,next world capital,software ag,workday,citi ventures,august capital,morado venture partners,ame cloud ventures,sopris partners,voodoo ventures,safeguard scientifics,charles f. dolan,golden seeds,springboard enterprises,miramar venture partners,ata ventures,foundation capital,stanford university,e.on,zig capital,jcb investments,moore venture partners,la jolla holding,tomorrowventures,venture capitals,mehdi daoudi,boldstart ventures,oak investment partners,diamondhead ventures,ampersand capital partners,qualcomm ventures,american express ventures,maine technology institute,libra future fund,i2e,seedstep angels group,divergent ventures,john ives,lew moorman,john engates,revel partners,commonangels,boston seed capital,dharmesh shah,waikit lau,nick ducoff,jacob perkins,dan casey,crosslink capital,kapor capital,giza venture capital,endeavor partners,naval ravikant,mark goines,josh james,ecosystem ventures,paige craig,correlation ventures,quest venture partners,sigma prime ventures,kepha partners,sigma partners,deutsche telekom,index ventures,t-venture,gerard govaerts,fortino,vendep oy,eden ventures,pentech ventures,oxford technology management,imperial innovations,bootstrap incubation,startup bootcamp,startupbootcamp,dan somers,technology strategy board,christoph janz,alexander bruehl,dn capital,slow ventures,draper fisher jurvetson (dfj),mayfield fund,draper richards,mangrove capital partners,lehman brothers,sierra ventures,team europe,fabrice grinda,jose marin,james gutierrez,embarcadero ventures,inovia capital,greycroft partners,brian s. cohen,phil grieshaber,john taysom,justin siegel,jeffrey silverman,jerry newman,bruno bowden,jeff hammerbacher,michael abbott,andrew mccollum,ed roberts,jean hammond,quotidian ventures,general catalyst partners,lowercase capital,lightbank,babak nivi,steamboat ventures,the mail room fund,launchpad la,at&t,stage one capital,mercury fund,ff venture capital,dundee venture capital,mfi capital,jim pallotta,josh mailman,radcliff group,hadi partovi,ali partovi,e.ventures,satya patel,redpoint eventures,split rock partners,ggv capital,canopy ventures,allegis capital,trident capital,canopy group,dry canyon holdings,cross creek capital,indous venture partners,triplepoint capital,jim clark,mike ramsay,gabriell weinberg,david cancel,joshua schachter,roy rodenstein,project 11 ventures,meakem becker venture capital,sunstone capital,operations,sales,computing,research,database management,technology,analytic,software,social media management,customer servce,customer service,app revenue,data visualization,service,operation,accounting,training,risk,data collection,social news,consumer web,data management,strategy,tool,inventory management,energy saving,optimization,crm,pricing,customer targeting,search enginenoptimization,customer engagement,social media analytics,content marketing,presentations,social media,dashboards,localized behaviour,wireless,sale,social network,music intelligece,network optimization,year of founding 1999,year of founding 2000,year of founding 2002,year of founding 2004,year of founding 2005,year of founding 2006,year of founding 2007,year of founding 2008,year of founding 2009,year of founding 2010,year of founding 2011,year of founding 2012,year of founding 2013,Country of company belgium,Country of company bulgaria,Country of company czech republic,Country of company denmark,Country of company finland,Country of company france,Country of company india,Country of company russian federation,Country of company singapore,Country of company switzerland,Country of company united kingdom,Country of company united states,Continent of company asia,Continent of company europe,Continent of company north america,Number of Investors in Seed 0,Number of Investors in Seed 1,Number of Investors in Seed 10,Number of Investors in Seed 13,Number of Investors in Seed 15,Number of Investors in Seed 17,Number of Investors in Seed 2,Number of Investors in Seed 22,Number of Investors in Seed 24,Number of Investors in Seed 3,Number of Investors in Seed 4,Number of Investors in Seed 5,Number of Investors in Seed 6,Number of Investors in Seed 7,Number of Investors in Seed 8,Number of Investors in Seed 9,Number of Investors in Angel and or VC 0,Number of Investors in Angel and or VC 1,Number of Investors in Angel and or VC 2,Number of Investors in Angel and or VC 3,Number of Investors in Angel and or VC 4,Number of Investors in Angel and or VC 5,Number of Investors in Angel and or VC 6,Number of Investors in Angel and or VC 7,Number of Investors in Angel and or VC 8,Number of Investors in Angel and or VC 9,Number of Co-founders 0,Number of Co-founders 1,Number of Co-founders 2,Number of Co-founders 3,Number of Co-founders 4,Number of Co-founders 5,Number of Co-founders 7,Number of of advisors 0,Number of of advisors 1,Number of of advisors 2,Number of of advisors 3,Number of of advisors 4,Number of of advisors 5,Number of of advisors 6,Number of of advisors 7,Number of of advisors 8,Number of of advisors 11,Team size Senior leadership 1,Team size Senior leadership 2,Team size Senior leadership 3,Team size Senior leadership 4,Team size Senior leadership 5,Team size Senior leadership 6,Team size Senior leadership 7,Team size Senior leadership 8,Team size Senior leadership 9,Team size Senior leadership 10,Team size Senior leadership 11,Team size Senior leadership 12,Number of of repeat investors 0,Number of of repeat investors 1,Number of of repeat investors 10,Number of of repeat investors 2,Number of of repeat investors 3,Number of of repeat investors 4,Number of Sales Support material high,Number of Sales Support material low,Number of Sales Support material medium,Number of Sales Support material nothing,Average size of companies worked for in the past large,Average size of companies worked for in the past medium,Average size of companies worked for in the past small,Product or service company? both,Product or service company? product,Product or service company? service,Focus on private or public data? both,Focus on private or public data? no,Focus on private or public data? private,Focus on private or public data? public,Focus on structured or unstructured data both,Focus on structured or unstructured data no,Focus on structured or unstructured data not applicable,Focus on structured or unstructured data structured,Focus on structured or unstructured data unstructured,Cloud or platform based serive/product? both,Cloud or platform based serive/product? cloud,Cloud or platform based serive/product? none,Cloud or platform based serive/product? platform,Number of of Partners of company few,Number of of Partners of company many,Number of of Partners of company none,Online or offline venture - physical location based business or online venture? both,Online or offline venture - physical location based business or online venture? offline,Online or offline venture - physical location based business or online venture? online,Top forums like 'Tech crunch' or 'Venture beat' talking about the company/model - How much is it being talked about? high,Top forums like 'Tech crunch' or 'Venture beat' talking about the company/model - How much is it being talked about? low,Top forums like 'Tech crunch' or 'Venture beat' talking about the company/model - How much is it being talked about? medium,Top forums like 'Tech crunch' or 'Venture beat' talking about the company/model - How much is it being talked about? none,Average Years of experience for founder and co founder high,Average Years of experience for founder and co founder low,Average Years of experience for founder and co founder medium,Breadth of experience across verticals high,Breadth of experience across verticals low,Breadth of experience across verticals medium,Highest education bachelors,Highest education masters,Highest education phd,Years of education 18,Years of education 21,Years of education 25,Degree from a Tier 1 or Tier 2 university? both,Degree from a Tier 1 or Tier 2 university? none,Degree from a Tier 1 or Tier 2 university? tier_1,Degree from a Tier 1 or Tier 2 university? tier_2,Experience in selling and building products high,Experience in selling and building products low,Experience in selling and building products medium,Experience in selling and building products none,Top management similarity high,Top management similarity low,Top management similarity medium,Top management similarity none,Number of of Research publications few,Number of of Research publications many,Number of of Research publications none,Team Composition score high,Team Composition score low,Team Composition score medium,Dificulty of Obtaining Work force high,Dificulty of Obtaining Work force low,Dificulty of Obtaining Work force medium,Time to market service or product high,Time to market service or product low,Time to market service or product medium,Industry trend in investing 1.0,Industry trend in investing 2.0,Industry trend in investing 3.0,Industry trend in investing 4.0,Disruptiveness of technology high,Disruptiveness of technology low,Disruptiveness of technology medium,Number of Direct competitors 0,Number of Direct competitors 1,Number of Direct competitors 10,Number of Direct competitors 11,Number of Direct competitors 19,Number of Direct competitors 2,Number of Direct competitors 28,Number of Direct competitors 3,Number of Direct competitors 33,Number of Direct competitors 4,Number of Direct competitors 5,Number of Direct competitors 6,Number of Direct competitors 8,"Survival through recession, based on existence of the company through recession times no","Survival through recession, based on existence of the company through recession times not applicable","Survival through recession, based on existence of the company through recession times yes",Renown score 0,Renown score 1,Renown score 10,Renown score 2,Renown score 3,Renown score 4,Renown score 5,Renown score 6,Renown score 7,Renown score 8,Renown score 9
0,455.0,14.0,0,10,1,0,0,0,0,1,1,1,1,1,0,0,0,0,1,0,0,1,0,0,0,0,1,1,1,500,0,0,1,18,36.0,1,0,1,1,1,0,0,71391,1,1,1,3.333333333,2.35,2,7.344444444,9.401709402,0,57.47863248,0.0,0.0,3.846153846,17.09401709,9.401709402,0.0,2.777777778,0,0,0,0.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0
1,496.0,39.0,0,40,0,0,0,0,0,1,1,0,1,1,1,0,0,0,0,0,0,1,0,0,0,0,1,1,1,500,0,0,0,5,23.0,1,0,1,1,1,0,0,201814,1,1,0,10.0,5.5,13,9.822222222,0.0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0,0.0,1.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0
2,106.0,14.0,0,14,0,0,0,0,0,0,1,0,1,1,0,0,0,0,0,0,0,1,0,1,0,0,1,0,0,500,0,0,0,5,25.0,1,0,0,1,0,1,0,591816,1,1,1,3.5,1.0,12,9.322222222,6.25,0,3.125,15.625,9.375,3.125,6.25,3.125,3.125,0.0,0,0,0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0
3,139.0,29.0,0,40,0,1,1,0,0,1,0,0,1,1,0,0,0,0,1,0,1,1,0,0,0,0,1,1,1,500,0,0,0,5,4.5,1,0,0,0,1,0,0,1015027,1,1,0,10.0,6.7,20,6.4,0.0,0,100.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0
4,306.0,16.0,0,50,1,1,1,0,0,1,1,0,1,1,0,1,0,0,0,0,0,1,0,1,0,0,1,1,1,500,0,0,0,13,48.0,0,0,0,1,0,0,0,256921,1,1,0,16.66666667,11.0,18,12.0,8.333333333,0,46.73202614,5.718954248,8.333333333,0.0,19.77124183,2.777777778,2.777777778,0.0,0,0,0,5.555555556,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0


Now let's make sure everything is a float type in our dataset.

In [646]:
for col in features.columns:
    try:
        features[col] = features[col].apply(lambda x: float(x))
    except:
        pass

features.head()

Unnamed: 0,Internet Activity Score,Employee Count,Has the team size grown,Team size all employees,Worked in top companies,Have been part of startups in the past?,Have been part of successful startups in the past?,Was he or she partner in Big 5 consulting?,Consulting experience?,Catering to product/service across verticals,Focus on consumer data?,Subscription based business,Local or global player,Linear or Non-linear business model,"Capital intensive business e.g. e-commerce, Engineering products and operations can also cause a business to be capital intensive",Crowdsourcing based business,Crowdfunding based business,Machine Learning based business,Predictive Analytics business,Speech analytics business,Prescriptive analytics business,Big Data Business,Cross-Channel Analytics/ marketing channels,Owns data or not? (monetization of data) e.g. Factual,Is the company an aggregator/market place? e.g. Bluekai,B2C or B2B venture?,Exposure across the globe,Relevance of education to venture,Relevance of experience to venture,Renowned in professional circle,Experience in Fortune 100 organizations,Experience in Fortune 500 organizations,Experience in Fortune 1000 organizations,Number of Recognitions for Founders and Co-founders,Skills score,Pricing Strategy,Hyper localisation,Long term relationship with other founders,Proprietary or patent position (competitive position),Barriers of entry for the competitors,Company awards,Legal risk and intellectual property,google page rank of company website,Technical proficiencies to analyse and interpret unstructured data,Solutions offered,Invested through global incubation competitions?,Employees per year of company existence,Last round of funding received (in milionUSD),Time to 1st investment (in months),"Avg time to investment - average across all rounds, measured from previous investment",Percent_skill_Entrepreneurship,Percent_skill_Operations,Percent_skill_Engineering,Percent_skill_Marketing,Percent_skill_Leadership,Percent_skill_Data Science,Percent_skill_Business Strategy,Percent_skill_Product Management,Percent_skill_Sales,Percent_skill_Domain,Percent_skill_Law,Percent_skill_Consulting,Percent_skill_Finance,Percent_skill_Investment,analytics,cloud computing,software development,marketing,enterprise software,food & beverages,hospitality,network / hosting / infrastructure,mobile,healthcare,pharmaceuticals,media,finance,e-commerce,gaming,email,security,publishing,education,advertising,entertainment,transportation,retail,social networking,search,insurance,cleantech,energy,market research,deals,telecommunications,music,techstars,streamlined ventures,amplify partners,rincon venture partners,pelion venture partners,500 startups,loren siebert,jason seats,xg ventures,george karidis,sam choi,morris wheeler,data collective,pejman nozad,ullas naik,dirk elmendorf,galvanize,pat matthews,paul kedrosky,matt ocko,cloud power capital,jared kopf,anne johnson,issac roth,george karutz,jim deters,zachary aarons,zack bogue,dfj frontier,draper nexus ventures,gil elbaz,auren hoffman,walter kortschak,mi ventures,brand ventures,daher capital,double m partners,gold hill capital,clark landry,draper associates,mi ventures llc,signia venture partners,pritzker group venture capital,excelerate labs,hyde park venture partners,chicago ventures,amicus capital,ideo,olive ventures,kd capital l.l.c,norwest venture partners,bessemer venture partners,atlas venture,promus ventures,softtech vc,costanoa venture capital,lee linden,chamath palihapitiya,raj de datta,tim kendall,omar siddiqui,david vivero,sequoia capital,khosla ventures,redpoint ventures,first round capital,pivotnorth capital,battery ventures,in-q-tel,jump capital,sap ventures,northwestern university,harvard business school angels,tech coast angels,hummer winblad venture partners,us venture partners,presidio ventures,mohr davidow ventures,grotech ventures,access venture partners,ffp holdings,matrix partners,allan zeise,thomas shannon,allen zeise,wisconsin investment partners,silicon pastures,thomas vincent,ross bjella,david lisle,ed karrels,jeff harris,tom demell,thomas demell,jeffery harris,michael kluiber,peter skanaivs,andy wojack,crista wojack,eric kessenich,john schmidt,angels on the water,ea media syndicate i llc,glen surnamer,jeff schweiger,defense advanced research projects agency,sv angel,harrison metal capital,baseline ventures,greylock partners,dick costolo,reid hoffman,jeff jordan,harrison metal,bain capital ventures,square 1 bank,high line venture partners,allen debevoise,matt coffin,paul bricault,jonah goodhart,dean gilbert,tony nethercutt,firstmark capital,ben ling,lerer ventures,cit gap funds,jaffray woodriff,hyde park angels,i2a fund,social leverage,morningstar,reed elsevier ventures,active up,guess,audilion,jean-luc halleux,dany donnen,startupinvest,launchub,launch hub,Unnamed: 238,boulas ventures,seed4soft,seventure partners,elaia partners,isai,capital spreads,rockaway capital,stephen bullock,chris underhill,scottish equity partners,seedcamp,innovation warehouse,stefan glaenzer,sherry coutu,ab banerjee,shamil chandaria,anish chandaria,bill emmott,ditlev schwanenflugel,ken olisa,azeem azhar,anthemis group,meridian venture partners,tom glocer,sanford dickert,alastair mitchell,andy mcloughlin,tim jackson,phil wilkinson,guy westlake,sean cornwell,ned cranborne,shan drummond,innova kapital,cantabria capital,voyager capital,vegastechfund,lars-henrik friis molin,amicus group,kolind a/s,appian ventures,core capital partners,qed investors,tech wildcatters,start-up chile,aleph,capital innovators,cultivation capital,eric kwan,google ventures,kenny van zant,franklyn chien,ryan merket,bobby goodlatte,yishan wong,brian mcclendon,charles river ventures,new enterprise associates,y combinator,gabor cselle,george zachary,jim young,alison rosenthal,jerry yang,paul buchheit,tie angels,hub angels investment group,jit saxena,rob soni,john simon,gaugarin oliver,prakash khot,launchcapital,chris devore,tom peterson,aaron bird,founder's co-op,howard lindzon,friends and family,jeremie berrebi,sutter hill ventures,crunchfund,rtp ventures,almaz capital,ru-net holdings,nick rau,bob jacobson,jimmy barge,jor law,lux capital,fidelity biosciences,venrock,highland capital partners,gerson lehrman group,jonathan bush,ed park,john goldsmith,james golden,ia ventures,digital sky technologies,raymond tonsing,accel partners,start fund,sam altman,ashton kutcher,guy oseary,ron garret,karl jacob,marco bergmann,mark larosa,tom mcinerney,janis krums,cross atlantic capital partners,edison ventures,rikki tahta,simon murdoch,roland beaulieu,gary mueller,spring lake equity partners,epic ventures,kevin o'connor,omidyar network,marc andreessen,ron conway,al avery,floodgate,svb financial group,kevin rose,floodgate fund,resolute.vc,founder collective,thrive capital,alexis ohanian,bubba murarka,reinmkr capital,shasta ventures,felicis ventures,jeffrey p. parker,tom falus,matt dwyer,sam wohlstadter,high peaks venture partners,kec ventures,softbank capital,jeff parker,fintech collective,intel capital,north bridge venture partners,vision ridge partners,green tree equity,jove equity partners,david cohen,brett jackson,bart lorang,walt winshall,valero capital,five mill ventures,westly group,the menlo park,javelin venture partners,alireza masrour,plug & play ventures,united parcel service,osage university partners,scott mcnealy,andreessen horowitz,point nine capital,sand hill angels,dingman center angels,interwest partners,tenoneten ventures,foundry group,upfront ventures,kbs+ ventures,neu venture capital,apricot capital,globespan capital partners,menlo ventures,flybridge capital partners,commonwealth capital ventures,national science foundation,michael chaney,frederick farrar,t. mills kelly,wayne buder,gregory t stern,kenneth gilbert,gregory t.stern,peter j. toren,marco giberti,paolo rubatto,greg stern,peter toren,david neithercutt,investment group of santa barbara,madrona venture group,mhs capital,rob glaser,kleiner perkins caufield & byers,dag ventures,austin ventures,contour venture partners,allegro venture partners,mack capital,brett hurt,sam decker,adam ross,tom meredith,dean drako,vulcan capital,geoff entress,bloomberg beta,ignition partners,next world capital,software ag,workday,citi ventures,august capital,morado venture partners,ame cloud ventures,sopris partners,voodoo ventures,safeguard scientifics,charles f. dolan,golden seeds,springboard enterprises,miramar venture partners,ata ventures,foundation capital,stanford university,e.on,zig capital,jcb investments,moore venture partners,la jolla holding,tomorrowventures,venture capitals,mehdi daoudi,boldstart ventures,oak investment partners,diamondhead ventures,ampersand capital partners,qualcomm ventures,american express ventures,maine technology institute,libra future fund,i2e,seedstep angels group,divergent ventures,john ives,lew moorman,john engates,revel partners,commonangels,boston seed capital,dharmesh shah,waikit lau,nick ducoff,jacob perkins,dan casey,crosslink capital,kapor capital,giza venture capital,endeavor partners,naval ravikant,mark goines,josh james,ecosystem ventures,paige craig,correlation ventures,quest venture partners,sigma prime ventures,kepha partners,sigma partners,deutsche telekom,index ventures,t-venture,gerard govaerts,fortino,vendep oy,eden ventures,pentech ventures,oxford technology management,imperial innovations,bootstrap incubation,startup bootcamp,startupbootcamp,dan somers,technology strategy board,christoph janz,alexander bruehl,dn capital,slow ventures,draper fisher jurvetson (dfj),mayfield fund,draper richards,mangrove capital partners,lehman brothers,sierra ventures,team europe,fabrice grinda,jose marin,james gutierrez,embarcadero ventures,inovia capital,greycroft partners,brian s. cohen,phil grieshaber,john taysom,justin siegel,jeffrey silverman,jerry newman,bruno bowden,jeff hammerbacher,michael abbott,andrew mccollum,ed roberts,jean hammond,quotidian ventures,general catalyst partners,lowercase capital,lightbank,babak nivi,steamboat ventures,the mail room fund,launchpad la,at&t,stage one capital,mercury fund,ff venture capital,dundee venture capital,mfi capital,jim pallotta,josh mailman,radcliff group,hadi partovi,ali partovi,e.ventures,satya patel,redpoint eventures,split rock partners,ggv capital,canopy ventures,allegis capital,trident capital,canopy group,dry canyon holdings,cross creek capital,indous venture partners,triplepoint capital,jim clark,mike ramsay,gabriell weinberg,david cancel,joshua schachter,roy rodenstein,project 11 ventures,meakem becker venture capital,sunstone capital,operations,sales,computing,research,database management,technology,analytic,software,social media management,customer servce,customer service,app revenue,data visualization,service,operation,accounting,training,risk,data collection,social news,consumer web,data management,strategy,tool,inventory management,energy saving,optimization,crm,pricing,customer targeting,search enginenoptimization,customer engagement,social media analytics,content marketing,presentations,social media,dashboards,localized behaviour,wireless,sale,social network,music intelligece,network optimization,year of founding 1999,year of founding 2000,year of founding 2002,year of founding 2004,year of founding 2005,year of founding 2006,year of founding 2007,year of founding 2008,year of founding 2009,year of founding 2010,year of founding 2011,year of founding 2012,year of founding 2013,Country of company belgium,Country of company bulgaria,Country of company czech republic,Country of company denmark,Country of company finland,Country of company france,Country of company india,Country of company russian federation,Country of company singapore,Country of company switzerland,Country of company united kingdom,Country of company united states,Continent of company asia,Continent of company europe,Continent of company north america,Number of Investors in Seed 0,Number of Investors in Seed 1,Number of Investors in Seed 10,Number of Investors in Seed 13,Number of Investors in Seed 15,Number of Investors in Seed 17,Number of Investors in Seed 2,Number of Investors in Seed 22,Number of Investors in Seed 24,Number of Investors in Seed 3,Number of Investors in Seed 4,Number of Investors in Seed 5,Number of Investors in Seed 6,Number of Investors in Seed 7,Number of Investors in Seed 8,Number of Investors in Seed 9,Number of Investors in Angel and or VC 0,Number of Investors in Angel and or VC 1,Number of Investors in Angel and or VC 2,Number of Investors in Angel and or VC 3,Number of Investors in Angel and or VC 4,Number of Investors in Angel and or VC 5,Number of Investors in Angel and or VC 6,Number of Investors in Angel and or VC 7,Number of Investors in Angel and or VC 8,Number of Investors in Angel and or VC 9,Number of Co-founders 0,Number of Co-founders 1,Number of Co-founders 2,Number of Co-founders 3,Number of Co-founders 4,Number of Co-founders 5,Number of Co-founders 7,Number of of advisors 0,Number of of advisors 1,Number of of advisors 2,Number of of advisors 3,Number of of advisors 4,Number of of advisors 5,Number of of advisors 6,Number of of advisors 7,Number of of advisors 8,Number of of advisors 11,Team size Senior leadership 1,Team size Senior leadership 2,Team size Senior leadership 3,Team size Senior leadership 4,Team size Senior leadership 5,Team size Senior leadership 6,Team size Senior leadership 7,Team size Senior leadership 8,Team size Senior leadership 9,Team size Senior leadership 10,Team size Senior leadership 11,Team size Senior leadership 12,Number of of repeat investors 0,Number of of repeat investors 1,Number of of repeat investors 10,Number of of repeat investors 2,Number of of repeat investors 3,Number of of repeat investors 4,Number of Sales Support material high,Number of Sales Support material low,Number of Sales Support material medium,Number of Sales Support material nothing,Average size of companies worked for in the past large,Average size of companies worked for in the past medium,Average size of companies worked for in the past small,Product or service company? both,Product or service company? product,Product or service company? service,Focus on private or public data? both,Focus on private or public data? no,Focus on private or public data? private,Focus on private or public data? public,Focus on structured or unstructured data both,Focus on structured or unstructured data no,Focus on structured or unstructured data not applicable,Focus on structured or unstructured data structured,Focus on structured or unstructured data unstructured,Cloud or platform based serive/product? both,Cloud or platform based serive/product? cloud,Cloud or platform based serive/product? none,Cloud or platform based serive/product? platform,Number of of Partners of company few,Number of of Partners of company many,Number of of Partners of company none,Online or offline venture - physical location based business or online venture? both,Online or offline venture - physical location based business or online venture? offline,Online or offline venture - physical location based business or online venture? online,Top forums like 'Tech crunch' or 'Venture beat' talking about the company/model - How much is it being talked about? high,Top forums like 'Tech crunch' or 'Venture beat' talking about the company/model - How much is it being talked about? low,Top forums like 'Tech crunch' or 'Venture beat' talking about the company/model - How much is it being talked about? medium,Top forums like 'Tech crunch' or 'Venture beat' talking about the company/model - How much is it being talked about? none,Average Years of experience for founder and co founder high,Average Years of experience for founder and co founder low,Average Years of experience for founder and co founder medium,Breadth of experience across verticals high,Breadth of experience across verticals low,Breadth of experience across verticals medium,Highest education bachelors,Highest education masters,Highest education phd,Years of education 18,Years of education 21,Years of education 25,Degree from a Tier 1 or Tier 2 university? both,Degree from a Tier 1 or Tier 2 university? none,Degree from a Tier 1 or Tier 2 university? tier_1,Degree from a Tier 1 or Tier 2 university? tier_2,Experience in selling and building products high,Experience in selling and building products low,Experience in selling and building products medium,Experience in selling and building products none,Top management similarity high,Top management similarity low,Top management similarity medium,Top management similarity none,Number of of Research publications few,Number of of Research publications many,Number of of Research publications none,Team Composition score high,Team Composition score low,Team Composition score medium,Dificulty of Obtaining Work force high,Dificulty of Obtaining Work force low,Dificulty of Obtaining Work force medium,Time to market service or product high,Time to market service or product low,Time to market service or product medium,Industry trend in investing 1.0,Industry trend in investing 2.0,Industry trend in investing 3.0,Industry trend in investing 4.0,Disruptiveness of technology high,Disruptiveness of technology low,Disruptiveness of technology medium,Number of Direct competitors 0,Number of Direct competitors 1,Number of Direct competitors 10,Number of Direct competitors 11,Number of Direct competitors 19,Number of Direct competitors 2,Number of Direct competitors 28,Number of Direct competitors 3,Number of Direct competitors 33,Number of Direct competitors 4,Number of Direct competitors 5,Number of Direct competitors 6,Number of Direct competitors 8,"Survival through recession, based on existence of the company through recession times no","Survival through recession, based on existence of the company through recession times not applicable","Survival through recession, based on existence of the company through recession times yes",Renown score 0,Renown score 1,Renown score 10,Renown score 2,Renown score 3,Renown score 4,Renown score 5,Renown score 6,Renown score 7,Renown score 8,Renown score 9
0,455.0,14.0,0.0,10.0,1.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,500.0,0.0,0.0,1.0,18.0,36.0,1.0,0.0,1.0,1.0,1.0,0.0,0.0,71391.0,1.0,1.0,1.0,3.333333,2.35,2.0,7.344444,9.401709,0.0,57.478632,0.0,0.0,3.846154,17.094017,9.401709,0.0,2.777778,0.0,0.0,0.0,0.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0
1,496.0,39.0,0.0,40.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,500.0,0.0,0.0,0.0,5.0,23.0,1.0,0.0,1.0,1.0,1.0,0.0,0.0,201814.0,1.0,1.0,0.0,10.0,5.5,13.0,9.822222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0
2,106.0,14.0,0.0,14.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,1.0,0.0,0.0,500.0,0.0,0.0,0.0,5.0,25.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,591816.0,1.0,1.0,1.0,3.5,1.0,12.0,9.322222,6.25,0.0,3.125,15.625,9.375,3.125,6.25,3.125,3.125,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0
3,139.0,29.0,0.0,40.0,0.0,1.0,1.0,0.0,0.0,1.0,0.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,500.0,0.0,0.0,0.0,5.0,4.5,1.0,0.0,0.0,0.0,1.0,0.0,0.0,1015027.0,1.0,1.0,0.0,10.0,6.7,20.0,6.4,0.0,0.0,100.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0
4,306.0,16.0,0.0,50.0,1.0,1.0,1.0,0.0,0.0,1.0,1.0,0.0,1.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,1.0,1.0,1.0,500.0,0.0,0.0,0.0,13.0,48.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,256921.0,1.0,1.0,0.0,16.666667,11.0,18.0,12.0,8.333333,0.0,46.732026,5.718954,8.333333,0.0,19.771242,2.777778,2.777778,0.0,0.0,0.0,0.0,5.555556,1.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0


To be even surer, lets print the data types:

In [647]:
features.dtypes


Internet Activity Score                                                                                                              float64
Employee Count                                                                                                                       float64
Has the team size grown                                                                                                              float64
Team size all employees                                                                                                              float64
Worked in top companies                                                                                                              float64
Have been part of startups in the past?                                                                                              float64
Have been part of successful startups in the past?                                                                                   float64
Was he or she

#### FABULOUS! 

### Normalizing the data...

Let's now normalize the data, so that our analysis are accurate.

In [648]:
X = preprocessing.StandardScaler().fit(features).transform(features)
X[0:7]

array([[ 1.29210852, -0.34979314, -0.9029865 , ...,  0.        ,
         0.        ,  0.        ],
       [ 1.47613575,  0.11649234, -0.9029865 , ...,  0.        ,
         0.        ,  0.        ],
       [-0.27436718, -0.34979314, -0.9029865 , ...,  0.        ,
         0.        ,  0.        ],
       ...,
       [ 0.62332663, -0.3124903 , -0.9029865 , ...,  0.        ,
         0.        ,  0.        ],
       [-0.51225604, -0.55495875, -0.9029865 , ...,  0.        ,
         0.        ,  0.        ],
       [ 2.67006852,  0.02323524, -0.9029865 , ...,  0.        ,
         0.        ,  0.        ]])

In [649]:
y = statuses.values
y[0:7]

array([1, 1, 1, 1, 1, 1, 1])

***
## Classification

### Train-test split

### Building the model

### Training

***
## Testing accuracy

***
## Analyzing results

***
## Report

***
## Thank you!

**Author:** [Aneta Baloyan](https://www.linkedin.com/in/aneta-baloyan/)

Email: *aneta.baloyan@gmail.com*

***

May 2020

