Objectives
 

Project Brief
You work for Spark Funds, an asset management company. Spark Funds wants to invest in a few companies. The CEO of Spark Funds wants to understand the global trends in investments so that she can make investment decisions effectively.

 

Business and Data Understanding
Spark Funds has two minor constraints for investments:

It wants to invest between 5 and 15 million USD per round of investment

It wants to invest only in English-speaking countries because of the ease of communication with the companies it would invest in

For your analysis, consider a country to be English speaking only if English is one of the official languages in that country

You may use a list of countries where English is an official language.

 

These conditions will give you sufficient information for your initial analysis. Before getting to specific questions, let’s understand the problem and the data first.

 

1. What is the strategy?

Spark Funds wants to invest where most other investors are investing. This pattern is often observed among early stage startup investors.

 

2. Where did we get the data from? 

We have taken real investment data from crunchbase.com, so the insights you get may be incredibly useful. For this group project, we have divided the data into the following files:

 

You have to use three main data tables for the entire analysis (available for download on the next page):

 

3. What is Spark Funds’ business objective?

The business objectives and goals of data analysis are pretty straightforward.

Business objective: The objective is to identify the best sectors, countries, and a suitable investment type for making investments. The overall strategy is to invest where others are investing, implying that the 'best' sectors and countries are the ones 'where most investors are investing'.
Goals of data analysis: Your goals are divided into three sub-goals:
Investment type analysis: Comparing the typical investment amounts in the venture, seed, angel, private equity etc. so that Spark Funds can choose the type that is best suited for their strategy.
Country analysis: Identifying the countries which have been the most heavily invested in the past. These will be Spark Funds’ favourites as well.
Sector analysis: Understanding the distribution of investments across the eight main sectors. (Note that we are interested in the eight 'main sectors' provided in the mapping file. The two files — companies and rounds2 — have numerous sub-sector names; hence, you will need to map each sub-sector to its main sector.)
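
The investment type comparison can be sketched roughly as follows. The amounts are made-up toy values, and `MIN_USD`, `MAX_USD` and `typical_amount` are hypothetical names; only the 5 to 15 million USD band comes from the brief. Using the median as the "typical" amount is an assumption here, since a few very large rounds can pull the mean upward.

```python
import pandas as pd

# Toy rounds (illustrative values only, not taken from the real data).
rounds = pd.DataFrame({
    "funding_round_type": ["seed", "seed", "angel", "venture", "venture", "private_equity"],
    "raised_amount_usd": [300_000, 700_000, 400_000, 6_000_000, 12_000_000, 40_000_000],
})

# Spark Funds' per-round investment band from the brief.
MIN_USD, MAX_USD = 5_000_000, 15_000_000

# Typical (median) amount per funding type, then keep the types inside the band.
typical_amount = rounds.groupby("funding_round_type")["raised_amount_usd"].median()
suitable = typical_amount[typical_amount.between(MIN_USD, MAX_USD)]
print(suitable)
```

On this toy data only `venture` falls inside the band; the real answer depends on the actual rounds data.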
 

4. How do you approach the case study? What are the deliverables?

The entire case study is divided into checkpoints to help you navigate. For each checkpoint, you are advised to fill in the tables in the spreadsheet provided in the download segment. The tables are also listed under the 'Results Expected' section after each checkpoint. Since this is the first case study, you have been provided with some additional guidance. Going forward, you will be expected to structure and solve the problem by yourself, just as you would in real-life scenarios.

 

Important Note: All your code has to be submitted in one Jupyter notebook. For every checkpoint, keep writing code in one well-commented Jupyter notebook which you can submit at the end.

In [4]:
import pandas as pd
import numpy as np

In [6]:
# Load the companies dataset (tab-separated)
# Without an explicit encoding, read_csv raised a UnicodeDecodeError
companies_df = pd.read_csv('C:/Users/Sandy/Documents/R/Workspace/CaseStudy/DSGroupProject-master/DSGroupProject/Data/companies.txt',delimiter='\t',encoding='unicode_escape')
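
The decode error mentioned in the comment can be reproduced on a small in-memory file. This is only a sketch: the toy content and the Latin-1 encoding are assumptions, because the notebook does not state the real encoding of `companies.txt`. Note also that the `unicode_escape` codec interprets backslash sequences, so it may alter values that contain backslashes.

```python
import io
import pandas as pd

# Toy tab-separated file encoded as Latin-1 (hypothetical content).
raw = "permalink\tname\n/organization/cafe\tCaf\u00e9 Inc\n".encode("latin-1")

# Decoding the same bytes as UTF-8 fails, which is the kind of error reported above.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Passing an explicit encoding lets read_csv decode the bytes.
toy_df = pd.read_csv(io.BytesIO(raw), sep="\t", encoding="latin-1")
print(utf8_ok, toy_df.loc[0, "name"])
```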

In [7]:
companies_df.head()

Unnamed: 0,permalink,name,homepage_url,category_list,status,country_code,state_code,region,city,founded_at
0,/Organization/-Fame,#fame,http://livfame.com,Media,operating,IND,16,Mumbai,Mumbai,
1,/Organization/-Qounter,:Qounter,http://www.qounter.com,Application Platforms|Real Time|Social Network...,operating,USA,DE,DE - Other,Delaware City,04-09-2014
2,/Organization/-The-One-Of-Them-Inc-,"(THE) ONE of THEM,Inc.",http://oneofthem.jp,Apps|Games|Mobile,operating,,,,,
3,/Organization/0-6-Com,0-6.com,http://www.0-6.com,Curated Web,operating,CHN,22,Beijing,Beijing,01-01-2007
4,/Organization/004-Technologies,004 Technologies,http://004gmbh.de/en/004-interact,Software,operating,USA,IL,"Springfield, Illinois",Champaign,01-01-2010


In [8]:
# Load the funding rounds file; it raised the same UnicodeDecodeError, so the same encoding is used
rounds2_df = pd.read_csv('C:/Users/Sandy/Documents/R/Workspace/CaseStudy/DSGroupProject-master/DSGroupProject/Data/rounds2.csv',encoding='unicode_escape')

In [9]:
# Quickly inspect the data frame
rounds2_df.head()

Unnamed: 0,company_permalink,funding_round_permalink,funding_round_type,funding_round_code,funded_at,raised_amount_usd
0,/organization/-fame,/funding-round/9a01d05418af9f794eebff7ace91f638,venture,B,05-01-2015,10000000.0
1,/ORGANIZATION/-QOUNTER,/funding-round/22dacff496eb7acb2b901dec1dfe5633,venture,A,14-10-2014,
2,/organization/-qounter,/funding-round/b44fbb94153f6cdef13083530bb48030,seed,,01-03-2014,700000.0
3,/ORGANIZATION/-THE-ONE-OF-THEM-INC-,/funding-round/650b8f704416801069bb178a1418776b,venture,B,30-01-2014,3406878.0
4,/organization/0-6-com,/funding-round/5727accaeaa57461bd22a9bdd945382d,venture,A,19-03-2008,2000000.0


In [10]:
rounds2_df.count()

company_permalink          114949
funding_round_permalink    114949
funding_round_type         114949
funding_round_code          31140
funded_at                  114949
raised_amount_usd           94959
dtype: int64

In [11]:
# Load the mapping file for the sector classification
sector_mapping_df = pd.read_csv('C:/Users/Sandy/Documents/R/Workspace/CaseStudy/DSGroupProject-master/DSGroupProject/Data/mapping.csv')

In [12]:
sector_mapping_df.head()

Unnamed: 0,category_list,Automotive & Sports,Blanks,Cleantech / Semiconductors,Entertainment,Health,Manufacturing,"News, Search and Messaging",Others,"Social, Finance, Analytics, Advertising"
0,,0,1,0,0,0,0,0,0,0
1,3D,0,0,0,0,0,1,0,0,0
2,3D Printing,0,0,0,0,0,1,0,0,0
3,3D Technology,0,0,0,0,0,1,0,0,0
4,Accounting,0,0,0,0,0,0,0,0,1
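
The mapping file is in a wide, one-flag-per-sector layout. One way to use it, sketched here under assumptions, is to reshape it into a long category-to-sector lookup and join it to a primary category. The three mapping rows come from the output above; the `companies` rows, the column names `primary_sector` and `main_sector`, and the choice of the first `|`-separated category as the primary one are all illustrative assumptions.

```python
import pandas as pd

# Three rows from the mapping file, restricted to two sector columns.
wide = pd.DataFrame({
    "category_list": ["3D", "3D Printing", "Accounting"],
    "Manufacturing": [1, 1, 0],
    "Social, Finance, Analytics, Advertising": [0, 0, 1],
})

# Wide -> long: one row per (category, sector) pair, keeping only flagged pairs.
long_map = wide.melt(id_vars="category_list", var_name="main_sector", value_name="flag")
long_map = long_map[long_map["flag"] == 1].drop(columns="flag")
long_map = long_map.rename(columns={"category_list": "primary_sector"})

# Hypothetical companies; the first category is assumed to be the primary one.
companies = pd.DataFrame({"category_list": ["3D Printing|Software", "Accounting"]})
companies["primary_sector"] = companies["category_list"].str.split("|").str[0]
companies = companies.merge(long_map, on="primary_sector", how="left")
print(companies[["primary_sector", "main_sector"]])
```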


In [13]:
# Load the country codes file
# source: https://github.com/datasets/country-codes/blob/master/data/country-codes.csv
# columns of interest: country name: official_name_en, Country code: ISO3166-1-Alpha-3
country_df = pd.read_csv('C:/Users/Sandy/Documents/R/Workspace/CaseStudy/DSGroupProject-master/DSGroupProject/Data/country-codes.csv')

In [14]:
country_df.head()

Unnamed: 0,FIFA,Dial,ISO3166-1-Alpha-3,MARC,is_independent,ISO3166-1-numeric,GAUL,FIPS,WMO,ISO3166-1-Alpha-2,...,Sub-region Name,official_name_ru,Global Name,Capital,Continent,TLD,Languages,Geoname ID,CLDR display name,EDGAR
0,TPE,886,TWN,ch,Yes,158.0,925,TW,,TW,...,,,,Taipei,AS,.tw,"zh-TW,zh,nan,hak",1668284.0,Taiwan,
1,AFG,93,AFG,af,Yes,4.0,1,AF,AF,AF,...,Southern Asia,Афганистан,World,Kabul,AS,.af,"fa-AF,ps,uz-AF,tk",1149361.0,Afghanistan,B2
2,ALB,355,ALB,aa,Yes,8.0,3,AL,AB,AL,...,Southern Europe,Албания,World,Tirana,EU,.al,"sq,el",783754.0,Albania,B3
3,ALG,213,DZA,ae,Yes,12.0,4,AG,AL,DZ,...,Northern Africa,Алжир,World,Algiers,AF,.dz,ar-DZ,2589581.0,Algeria,B4
4,ASA,1-684,ASM,as,Territory of US,16.0,5,AQ,,AS,...,Polynesia,Американское Самоа,World,Pago Pago,OC,.as,"en-AS,sm,to",5880801.0,American Samoa,B5


In [16]:
country_df[['official_name_en','ISO3166-1-Alpha-3']]

Unnamed: 0,official_name_en,ISO3166-1-Alpha-3
0,,TWN
1,Afghanistan,AFG
2,Albania,ALB
3,Algeria,DZA
4,American Samoa,ASM
5,Andorra,AND
6,Angola,AGO
7,Anguilla,AIA
8,Antarctica,ATA
9,Antigua and Barbuda,ATG


In [17]:
# Rename the columns for easier analysis in later stages - only the country name and code are required for now
country_df.rename(columns={'official_name_en': 'Name', 'ISO3166-1-Alpha-3': 'Code'}, inplace=True)

In [18]:
country_df[['Code','Name']]

Unnamed: 0,Code,Name
0,TWN,
1,AFG,Afghanistan
2,ALB,Albania
3,DZA,Algeria
4,ASM,American Samoa
5,AND,Andorra
6,AGO,Angola
7,AIA,Anguilla
8,ATA,Antarctica
9,ATG,Antigua and Barbuda


In [19]:
# Create a new dataframe for the English-speaking countries
eng_countries_list = [{'continent':'Asia','country':'India'},
                      {'continent':'Asia','country':'Pakistan'},
                      {'continent':'Asia','country':'Philippines'},
                      {'continent':'Asia','country':'Singapore'},
                      {'continent':'Africa','country':'Botswana'},
                      {'continent':'Africa','country':'Cameroon'},
                      {'continent':'Africa','country':'Ethiopia'},
                      {'continent':'Africa','country':'Eritrea'},
                      {'continent':'Africa','country':'The Gambia'},
                      {'continent':'Africa','country':'Ghana'},
                      {'continent':'Africa','country':'Kenya'},
                      {'continent':'Africa','country':'Lesotho'},
                      {'continent':'Africa','country':'Liberia'},
                      {'continent':'Africa','country':'Malawi'},
                      {'continent':'Africa','country':'Mauritius'},
                      {'continent':'Africa','country':'Namibia'},
                      {'continent':'Africa','country':'Nigeria'},
                      {'continent':'Africa','country':'Rwanda'},
                      {'continent':'Africa','country':'Seychelles'},
                      {'continent':'Africa','country':'Sierra Leone'},
                      {'continent':'Africa','country':'South Africa'},
                      {'continent':'Africa','country':'South Sudan'},
                      {'continent':'Africa','country':'Sudan'},
                      {'continent':'Africa','country':'Swaziland'},
                      {'continent':'Africa','country':'Tanzania'},
                      {'continent':'Africa','country':'Uganda'},
                      {'continent':'Africa','country':'Zambia'},
                      {'continent':'Africa','country':'Zimbabwe'},
                      {'continent':'Americas','country':'Antigua and Barbuda'},
                      {'continent':'Americas','country':'Bahamas'},
                      {'continent':'Americas','country':'Barbados'},
                      {'continent':'Americas','country':'Belize'},
                      {'continent':'Americas','country':'Canada'},
                      {'continent':'Americas','country':'Dominica'},
                      {'continent':'Americas','country':'Grenada'},
                      {'continent':'Americas','country':'Guyana'},
                      {'continent':'Americas','country':'Jamaica'},
                      {'continent':'Americas','country':'Saint Kitts and Nevis'},
                      {'continent':'Americas','country':'Saint Lucia'},
                      {'continent':'Americas','country':'Saint Vincent and the Grenadines'},
                      {'continent':'Americas','country':'Trinidad and Tobago'},
                      {'continent':'Americas','country':'United States of America'},
                      {'continent':'Australia/Oceania','country':'Australia'},
                      {'continent':'Australia/Oceania','country':'Fiji'},
                      {'continent':'Australia/Oceania','country':'Kiribati'},
                      {'continent':'Australia/Oceania','country':'Marshall Islands'},
                      {'continent':'Australia/Oceania','country':'Federated States of Micronesia'},
                      {'continent':'Australia/Oceania','country':'Nauru'},
                      {'continent':'Australia/Oceania','country':'New Zealand'},
                      {'continent':'Australia/Oceania','country':'Palau'},
                      {'continent':'Australia/Oceania','country':'Papua New Guinea'},
                      {'continent':'Australia/Oceania','country':'Samoa'},
                      {'continent':'Australia/Oceania','country':'Solomon Islands'},
                      {'continent':'Australia/Oceania','country':'Tonga'},
                      {'continent':'Australia/Oceania','country':'Tuvalu'},
                      {'continent':'Australia/Oceania','country':'Vanuatu'},
                      {'continent':'Europe','country':'Ireland'},
                      {'continent':'Europe','country':'Malta'},
                      {'continent':'Europe','country':'United Kingdom'}]
eng_countries_df = pd.DataFrame(eng_countries_list)

In [20]:
eng_countries_df

Unnamed: 0,continent,country
0,Asia,India
1,Asia,Pakistan
2,Asia,Philippines
3,Asia,Singapore
4,Africa,Botswana
5,Africa,Cameroon
6,Africa,Ethiopia
7,Africa,Eritrea
8,Africa,The Gambia
9,Africa,Ghana


In [21]:
# Make sure the rows/columns count matches with the .csv files
companies_df.shape

(66368, 10)

In [22]:
rounds2_df.shape

(114949, 6)

In [23]:
companies_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 66368 entries, 0 to 66367
Data columns (total 10 columns):
permalink        66368 non-null object
name             66367 non-null object
homepage_url     61310 non-null object
category_list    63220 non-null object
status           66368 non-null object
country_code     59410 non-null object
state_code       57821 non-null object
region           58338 non-null object
city             58340 non-null object
founded_at       51147 non-null object
dtypes: object(10)
memory usage: 5.1+ MB


In [24]:
rounds2_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 114949 entries, 0 to 114948
Data columns (total 6 columns):
company_permalink          114949 non-null object
funding_round_permalink    114949 non-null object
funding_round_type         114949 non-null object
funding_round_code         31140 non-null object
funded_at                  114949 non-null object
raised_amount_usd          94959 non-null float64
dtypes: float64(1), object(5)
memory usage: 5.3+ MB


In [25]:
sector_mapping_df.shape

(688, 10)

In [26]:
sector_mapping_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 688 entries, 0 to 687
Data columns (total 10 columns):
category_list                              687 non-null object
Automotive & Sports                        688 non-null int64
Blanks                                     688 non-null int64
Cleantech / Semiconductors                 688 non-null int64
Entertainment                              688 non-null int64
Health                                     688 non-null int64
Manufacturing                              688 non-null int64
News, Search and Messaging                 688 non-null int64
Others                                     688 non-null int64
Social, Finance, Analytics, Advertising    688 non-null int64
dtypes: int64(9), object(1)
memory usage: 53.8+ KB


In [27]:
# Merge the English-speaking countries with the country codes dataset.
# A left merge keeps every English-speaking country, even when its name has no exact match in country_df.
eng_country_df = pd.merge(eng_countries_df, country_df, left_on='country', right_on='Name', how='left')[['Code', 'country', 'continent']]

In [28]:
eng_country_df

Unnamed: 0,Code,country,continent
0,IND,India,Asia
1,PAK,Pakistan,Asia
2,PHL,Philippines,Asia
3,SGP,Singapore,Asia
4,BWA,Botswana,Africa
5,CMR,Cameroon,Africa
6,ETH,Ethiopia,Africa
7,ERI,Eritrea,Africa
8,,The Gambia,Africa
9,GHA,Ghana,Africa
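
Rows such as "The Gambia" end up without a code because the names differ between the two tables. A minimal sketch for listing such rows, assuming toy inputs: the official names in `codes` are hypothetical stand-ins, and `indicator=True` is a standard `merge` option that adds a `_merge` column.

```python
import pandas as pd

names = pd.DataFrame({"country": ["India", "The Gambia", "United Kingdom"]})

# Hypothetical official names; the real file may spell them differently.
codes = pd.DataFrame({
    "Name": ["India", "Gambia", "United Kingdom of Great Britain and Northern Ireland"],
    "Code": ["IND", "GMB", "GBR"],
})

# indicator=True marks each row as 'both' or 'left_only'.
merged = names.merge(codes, left_on="country", right_on="Name", how="left", indicator=True)
unmatched = merged.loc[merged["_merge"] == "left_only", "country"].tolist()
print(unmatched)
```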


In [29]:
# update missing country codes
eng_country_df.loc[(eng_country_df.country=='United Kingdom'),'Code']='GBR'
eng_country_df.loc[(eng_country_df.country=='The Gambia'),'Code']='GMB'
eng_country_df.loc[(eng_country_df.country=='Federated States of Micronesia'),'Code']='FSM'
eng_country_df.loc[(eng_country_df.country=='Tanzania'),'Code']='TZA'
eng_country_df.loc[(eng_country_df.country=='Swaziland'),'Code']='SWZ'
# FSM and SWZ are the ISO 3166-1 alpha-3 codes for the Federated States of Micronesia and Swaziland (Eswatini)
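
The manual fixes can also be kept in one dictionary and applied in a single step. This is an alternative sketch, not the notebook's method; `manual_codes` and the toy `eng` frame are hypothetical names.

```python
import pandas as pd

# Toy frame with two missing codes, as after the left merge.
eng = pd.DataFrame({
    "country": ["India", "The Gambia", "Swaziland"],
    "Code": ["IND", None, None],
})

# Manual name -> code fixes kept in one place.
manual_codes = {"The Gambia": "GMB", "Swaziland": "SWZ"}

# Fill only the missing codes; existing codes are left untouched.
eng["Code"] = eng["Code"].fillna(eng["country"].map(manual_codes))
print(eng)
```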

In [30]:
eng_country_df

Unnamed: 0,Code,country,continent
0,IND,India,Asia
1,PAK,Pakistan,Asia
2,PHL,Philippines,Asia
3,SGP,Singapore,Asia
4,BWA,Botswana,Africa
5,CMR,Cameroon,Africa
6,ETH,Ethiopia,Africa
7,ERI,Eritrea,Africa
8,GMB,The Gambia,Africa
9,GHA,Ghana,Africa


In [31]:
# Left-join companies with the English-speaking countries; companies from other countries get NaN in Code, country and continent
pd.merge(companies_df,eng_country_df, left_on='country_code',right_on='Code', how='left')

Unnamed: 0,permalink,name,homepage_url,category_list,status,country_code,state_code,region,city,founded_at,Code,country,continent
0,/Organization/-Fame,#fame,http://livfame.com,Media,operating,IND,16,Mumbai,Mumbai,,IND,India,Asia
1,/Organization/-Qounter,:Qounter,http://www.qounter.com,Application Platforms|Real Time|Social Network...,operating,USA,DE,DE - Other,Delaware City,04-09-2014,USA,United States of America,Americas
2,/Organization/-The-One-Of-Them-Inc-,"(THE) ONE of THEM,Inc.",http://oneofthem.jp,Apps|Games|Mobile,operating,,,,,,,,
3,/Organization/0-6-Com,0-6.com,http://www.0-6.com,Curated Web,operating,CHN,22,Beijing,Beijing,01-01-2007,,,
4,/Organization/004-Technologies,004 Technologies,http://004gmbh.de/en/004-interact,Software,operating,USA,IL,"Springfield, Illinois",Champaign,01-01-2010,USA,United States of America,Americas
5,/Organization/01Games-Technology,01Games Technology,http://www.01games.hk/,Games,operating,HKG,,Hong Kong,Hong Kong,,,,
6,/Organization/0Ndine-Biomedical-Inc,Ondine Biomedical Inc.,http://ondinebio.com,Biotechnology,operating,CAN,BC,Vancouver,Vancouver,01-01-1997,CAN,Canada,Americas
7,/Organization/0Xdata,H2O.ai,http://h2o.ai/,Analytics,operating,USA,CA,SF Bay Area,Mountain View,01-01-2011,USA,United States of America,Americas
8,/Organization/1,One Inc.,http://whatis1.com,Mobile,operating,USA,CA,SF Bay Area,San Francisco,01-08-2011,USA,United States of America,Americas
9,/Organization/1-2-3-Listo,"1,2,3 Listo",http://www.123listo.com,E-Commerce,operating,CHL,12,Santiago,Las Condes,01-01-2012,,,
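
To keep only the companies in English-speaking countries, rather than all companies with blanks for the others, a boolean filter is one option. A small sketch on toy rows; `english_codes` is a hypothetical short list and "Unknown Co" is an invented row with a missing country.

```python
import pandas as pd

companies = pd.DataFrame({
    "name": ["#fame", "0-6.com", "H2O.ai", "Unknown Co"],
    "country_code": ["IND", "CHN", "USA", None],
})

# Hypothetical subset of the English-speaking country codes.
english_codes = ["IND", "USA", "GBR", "CAN"]

# isin() is False for missing codes, so those rows are dropped as well.
english_companies = companies[companies["country_code"].isin(english_codes)]
print(english_companies)
```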


In [32]:
# Column-wise percentage of null values in the rounds2 dataframe
round(100*(rounds2_df.isnull().sum()/len(rounds2_df.index)),2)

company_permalink           0.00
funding_round_permalink     0.00
funding_round_type          0.00
funding_round_code         72.91
funded_at                   0.00
raised_amount_usd          17.39
dtype: float64
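
The same percentage can be written more compactly, because the mean of a boolean mask is the fraction of `True` values. A sketch on a toy frame with hypothetical values:

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "funding_round_code": ["A", None, None, None],
    "raised_amount_usd": [1.0, 2.0, np.nan, 4.0],
})

# Fraction of nulls per column, expressed as a percentage.
pct_null = (toy.isnull().mean() * 100).round(2)
print(pct_null)
```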

In [33]:
# nunique() works on a single column; to count distinct rows across several columns, use drop_duplicates().
# Lowercase the permalinks before counting. The count below (66370) is higher than the 66368 companies in the companies file, so it needs to be verified.
rounds2_df['company_permalink']=rounds2_df['company_permalink'].str.lower()
rounds2_df['company_permalink'].nunique()

66370

In [34]:
# Unique count of companies in Company table.
companies_df['permalink']=companies_df['permalink'].str.lower()
companies_df['permalink'].nunique()

66368

In [35]:
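# Are there any companies in the rounds2 file which are not present in companies? Answer Y/N.
# Caveat: company_permalink is the left-hand key, so it is never null after a left merge;
# nulls in the right-hand key 'permalink' are what would indicate unmatched companies.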
pd.merge(rounds2_df,companies_df, left_on='company_permalink'
         ,right_on='permalink', how='left')['company_permalink'].isnull().sum(axis=0)

0
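
A toy example shows why the column being checked matters. After a left merge the left key is always populated, so counting its nulls gives 0 even when a company is missing from the right table. The permalinks below are invented for illustration.

```python
import pandas as pd

rounds = pd.DataFrame({"company_permalink": ["/organization/a", "/organization/b", "/organization/c"]})
companies = pd.DataFrame({"permalink": ["/organization/a", "/organization/b"]})

merged = rounds.merge(companies, left_on="company_permalink", right_on="permalink", how="left")

# The left key is never null; the right key is null for unmatched rows.
left_key_nulls = int(merged["company_permalink"].isnull().sum())
right_key_nulls = int(merged["permalink"].isnull().sum())

# A set difference lists the unmatched permalinks directly.
missing = sorted(set(rounds["company_permalink"]) - set(companies["permalink"]))
print(left_key_nulls, right_key_nulls, missing)
```

The same set difference, applied to the lowercased permalink columns, would also show which values account for the gap between the two unique counts.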

In [36]:
rounds2_df

Unnamed: 0,company_permalink,funding_round_permalink,funding_round_type,funding_round_code,funded_at,raised_amount_usd
0,/organization/-fame,/funding-round/9a01d05418af9f794eebff7ace91f638,venture,B,05-01-2015,10000000.0
1,/organization/-qounter,/funding-round/22dacff496eb7acb2b901dec1dfe5633,venture,A,14-10-2014,
2,/organization/-qounter,/funding-round/b44fbb94153f6cdef13083530bb48030,seed,,01-03-2014,700000.0
3,/organization/-the-one-of-them-inc-,/funding-round/650b8f704416801069bb178a1418776b,venture,B,30-01-2014,3406878.0
4,/organization/0-6-com,/funding-round/5727accaeaa57461bd22a9bdd945382d,venture,A,19-03-2008,2000000.0
5,/organization/004-technologies,/funding-round/1278dd4e6a37fa4b7d7e06c21b3c1830,venture,,24-07-2014,
6,/organization/01games-technology,/funding-round/7d53696f2b4f607a2f2a8cbb83d01839,undisclosed,,01-07-2014,41250.0
7,/organization/0ndine-biomedical-inc,/funding-round/2b9d3ac293d5cdccbecff5c8cb0f327d,seed,,11-09-2009,43360.0
8,/organization/0ndine-biomedical-inc,/funding-round/954b9499724b946ad8c396a57a5f3b72,venture,,21-12-2009,719491.0
9,/organization/0xdata,/funding-round/383a9bd2c04f7038bb543ccef5ba3eae,seed,,22-05-2013,3000000.0


In [38]:
master_frame = pd.merge(rounds2_df,companies_df, left_on='company_permalink'
         ,right_on='permalink', how='left')
master_frame

Unnamed: 0,company_permalink,funding_round_permalink,funding_round_type,funding_round_code,funded_at,raised_amount_usd,permalink,name,homepage_url,category_list,status,country_code,state_code,region,city,founded_at
0,/organization/-fame,/funding-round/9a01d05418af9f794eebff7ace91f638,venture,B,05-01-2015,10000000.0,/organization/-fame,#fame,http://livfame.com,Media,operating,IND,16,Mumbai,Mumbai,
1,/organization/-qounter,/funding-round/22dacff496eb7acb2b901dec1dfe5633,venture,A,14-10-2014,,/organization/-qounter,:Qounter,http://www.qounter.com,Application Platforms|Real Time|Social Network...,operating,USA,DE,DE - Other,Delaware City,04-09-2014
2,/organization/-qounter,/funding-round/b44fbb94153f6cdef13083530bb48030,seed,,01-03-2014,700000.0,/organization/-qounter,:Qounter,http://www.qounter.com,Application Platforms|Real Time|Social Network...,operating,USA,DE,DE - Other,Delaware City,04-09-2014
3,/organization/-the-one-of-them-inc-,/funding-round/650b8f704416801069bb178a1418776b,venture,B,30-01-2014,3406878.0,/organization/-the-one-of-them-inc-,"(THE) ONE of THEM,Inc.",http://oneofthem.jp,Apps|Games|Mobile,operating,,,,,
4,/organization/0-6-com,/funding-round/5727accaeaa57461bd22a9bdd945382d,venture,A,19-03-2008,2000000.0,/organization/0-6-com,0-6.com,http://www.0-6.com,Curated Web,operating,CHN,22,Beijing,Beijing,01-01-2007
5,/organization/004-technologies,/funding-round/1278dd4e6a37fa4b7d7e06c21b3c1830,venture,,24-07-2014,,/organization/004-technologies,004 Technologies,http://004gmbh.de/en/004-interact,Software,operating,USA,IL,"Springfield, Illinois",Champaign,01-01-2010
6,/organization/01games-technology,/funding-round/7d53696f2b4f607a2f2a8cbb83d01839,undisclosed,,01-07-2014,41250.0,/organization/01games-technology,01Games Technology,http://www.01games.hk/,Games,operating,HKG,,Hong Kong,Hong Kong,
7,/organization/0ndine-biomedical-inc,/funding-round/2b9d3ac293d5cdccbecff5c8cb0f327d,seed,,11-09-2009,43360.0,/organization/0ndine-biomedical-inc,Ondine Biomedical Inc.,http://ondinebio.com,Biotechnology,operating,CAN,BC,Vancouver,Vancouver,01-01-1997
8,/organization/0ndine-biomedical-inc,/funding-round/954b9499724b946ad8c396a57a5f3b72,venture,,21-12-2009,719491.0,/organization/0ndine-biomedical-inc,Ondine Biomedical Inc.,http://ondinebio.com,Biotechnology,operating,CAN,BC,Vancouver,Vancouver,01-01-1997
9,/organization/0xdata,/funding-round/383a9bd2c04f7038bb543ccef5ba3eae,seed,,22-05-2013,3000000.0,/organization/0xdata,H2O.ai,http://h2o.ai/,Analytics,operating,USA,CA,SF Bay Area,Mountain View,01-01-2011


In [39]:
# Drop all rows that have a null raised_amount_usd
master_frame = master_frame.dropna(subset=['raised_amount_usd'])
master_frame

Unnamed: 0,company_permalink,funding_round_permalink,funding_round_type,funding_round_code,funded_at,raised_amount_usd,permalink,name,homepage_url,category_list,status,country_code,state_code,region,city,founded_at
0,/organization/-fame,/funding-round/9a01d05418af9f794eebff7ace91f638,venture,B,05-01-2015,10000000.0,/organization/-fame,#fame,http://livfame.com,Media,operating,IND,16,Mumbai,Mumbai,
2,/organization/-qounter,/funding-round/b44fbb94153f6cdef13083530bb48030,seed,,01-03-2014,700000.0,/organization/-qounter,:Qounter,http://www.qounter.com,Application Platforms|Real Time|Social Network...,operating,USA,DE,DE - Other,Delaware City,04-09-2014
3,/organization/-the-one-of-them-inc-,/funding-round/650b8f704416801069bb178a1418776b,venture,B,30-01-2014,3406878.0,/organization/-the-one-of-them-inc-,"(THE) ONE of THEM,Inc.",http://oneofthem.jp,Apps|Games|Mobile,operating,,,,,
4,/organization/0-6-com,/funding-round/5727accaeaa57461bd22a9bdd945382d,venture,A,19-03-2008,2000000.0,/organization/0-6-com,0-6.com,http://www.0-6.com,Curated Web,operating,CHN,22,Beijing,Beijing,01-01-2007
6,/organization/01games-technology,/funding-round/7d53696f2b4f607a2f2a8cbb83d01839,undisclosed,,01-07-2014,41250.0,/organization/01games-technology,01Games Technology,http://www.01games.hk/,Games,operating,HKG,,Hong Kong,Hong Kong,
7,/organization/0ndine-biomedical-inc,/funding-round/2b9d3ac293d5cdccbecff5c8cb0f327d,seed,,11-09-2009,43360.0,/organization/0ndine-biomedical-inc,Ondine Biomedical Inc.,http://ondinebio.com,Biotechnology,operating,CAN,BC,Vancouver,Vancouver,01-01-1997
8,/organization/0ndine-biomedical-inc,/funding-round/954b9499724b946ad8c396a57a5f3b72,venture,,21-12-2009,719491.0,/organization/0ndine-biomedical-inc,Ondine Biomedical Inc.,http://ondinebio.com,Biotechnology,operating,CAN,BC,Vancouver,Vancouver,01-01-1997
9,/organization/0xdata,/funding-round/383a9bd2c04f7038bb543ccef5ba3eae,seed,,22-05-2013,3000000.0,/organization/0xdata,H2O.ai,http://h2o.ai/,Analytics,operating,USA,CA,SF Bay Area,Mountain View,01-01-2011
10,/organization/0xdata,/funding-round/3bb2ee4a2d89251a10aaa735b1180e44,venture,B,09-11-2015,20000000.0,/organization/0xdata,H2O.ai,http://h2o.ai/,Analytics,operating,USA,CA,SF Bay Area,Mountain View,01-01-2011
11,/organization/0xdata,/funding-round/ae2a174c06517c2394aed45006322a7e,venture,,03-01-2013,1700000.0,/organization/0xdata,H2O.ai,http://h2o.ai/,Analytics,operating,USA,CA,SF Bay Area,Mountain View,01-01-2011


In [42]:
# Top three countries by total investment. Note: this is not yet filtered to English-speaking countries, so CHN appears in the result.
master_frame.groupby('country_code').raised_amount_usd.sum().sort_values(ascending = False).head(3).apply(lambda x: '%.2f' % x)

country_code
USA    669482123821.00
CHN     75703565796.00
GBR     32767048060.00
Name: raised_amount_usd, dtype: object
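
Restricting the ranking to English-speaking countries needs a filter before the aggregation. A sketch on toy amounts (the values are invented, and `english_codes` is a hypothetical short list):

```python
import pandas as pd

frame = pd.DataFrame({
    "country_code": ["USA", "USA", "CHN", "GBR", "IND"],
    "raised_amount_usd": [10.0, 5.0, 12.0, 7.0, 3.0],
})

# Hypothetical subset of English-speaking country codes.
english_codes = ["USA", "GBR", "IND"]

# Filter first, then total per country and keep the three largest.
top3 = (
    frame[frame["country_code"].isin(english_codes)]
    .groupby("country_code")["raised_amount_usd"]
    .sum()
    .nlargest(3)
)
print(top3)
```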

In [48]:
# Create a dataframe for the top country by total investment (USA), which is English-speaking.
C1 = master_frame.loc[master_frame.country_code=='USA',:]
C1

Unnamed: 0,company_permalink,funding_round_permalink,funding_round_type,funding_round_code,funded_at,raised_amount_usd,permalink,name,homepage_url,category_list,status,country_code,state_code,region,city,founded_at
2,/organization/-qounter,/funding-round/b44fbb94153f6cdef13083530bb48030,seed,,01-03-2014,700000.0,/organization/-qounter,:Qounter,http://www.qounter.com,Application Platforms|Real Time|Social Network...,operating,USA,DE,DE - Other,Delaware City,04-09-2014
9,/organization/0xdata,/funding-round/383a9bd2c04f7038bb543ccef5ba3eae,seed,,22-05-2013,3000000.0,/organization/0xdata,H2O.ai,http://h2o.ai/,Analytics,operating,USA,CA,SF Bay Area,Mountain View,01-01-2011
10,/organization/0xdata,/funding-round/3bb2ee4a2d89251a10aaa735b1180e44,venture,B,09-11-2015,20000000.0,/organization/0xdata,H2O.ai,http://h2o.ai/,Analytics,operating,USA,CA,SF Bay Area,Mountain View,01-01-2011
11,/organization/0xdata,/funding-round/ae2a174c06517c2394aed45006322a7e,venture,,03-01-2013,1700000.0,/organization/0xdata,H2O.ai,http://h2o.ai/,Analytics,operating,USA,CA,SF Bay Area,Mountain View,01-01-2011
12,/organization/0xdata,/funding-round/e1cfcbe1bdf4c70277c5f29a3482f24e,venture,A,19-07-2014,8900000.0,/organization/0xdata,H2O.ai,http://h2o.ai/,Analytics,operating,USA,CA,SF Bay Area,Mountain View,01-01-2011
13,/organization/1,/funding-round/03b975068632eba5bfdb937ec8c07a68,seed,,05-02-2014,150000.0,/organization/1,One Inc.,http://whatis1.com,Mobile,operating,USA,CA,SF Bay Area,San Francisco,01-08-2011
15,/organization/1,/funding-round/e82464f22241715dd1a6c77241055ed1,seed,,20-07-2011,1000050.0,/organization/1,One Inc.,http://whatis1.com,Mobile,operating,USA,CA,SF Bay Area,San Francisco,01-08-2011
20,/organization/1-800-doctors,/funding-round/9eb8c7790a0c200d79e75785d1c4aa12,convertible_note,,02-03-2011,1750000.0,/organization/1-800-doctors,1-800-DOCTORS,http://1800doctors.com,Health and Wellness,operating,USA,NJ,Newark,Iselin,01-01-1984
21,/organization/1-800-publicrelations-inc-,/funding-round/11c228f58831bc7ed337ef69ecc560c2,private_equity,,01-02-2015,6000000.0,/organization/1-800-publicrelations-inc-,"1-800-PublicRelations, Inc.",http://www.1800publicrelations.com,Internet Marketing|Media|Public Relations,operating,USA,NY,New York City,New York,24-10-2013
22,/organization/1-mainstream,/funding-round/b952cbaf401f310927430c97b68162ea,venture,,17-03-2015,5000000.0,/organization/1-mainstream,1 Mainstream,http://www.1mainstream.com,Apps|Cable|Distribution|Software,acquired,USA,CA,SF Bay Area,Cupertino,01-03-2012


In [49]:
# Create a dataframe for the second-ranked country by total investment (CHN). China is not in the English-speaking list built earlier.
C2 = master_frame.loc[master_frame.country_code=='CHN',:]
C2

Unnamed: 0,company_permalink,funding_round_permalink,funding_round_type,funding_round_code,funded_at,raised_amount_usd,permalink,name,homepage_url,category_list,status,country_code,state_code,region,city,founded_at
4,/organization/0-6-com,/funding-round/5727accaeaa57461bd22a9bdd945382d,venture,A,19-03-2008,2000000.0,/organization/0-6-com,0-6.com,http://www.0-6.com,Curated Web,operating,CHN,22,Beijing,Beijing,01-01-2007
52,/organization/1006-tv,/funding-round/b6aeb7401ec6993f92a16cbca153b600,venture,B,31-07-2014,10000000.0,/organization/1006-tv,1006.tv,http://www.1006.tv/,Games|Media,operating,CHN,22,Beijing,Beijing,01-01-2009
55,/organization/100du-tv,/funding-round/8797d60368bb0227f0d0ab4c72aef886,venture,A,07-01-2008,3000000.0,/organization/100du-tv,100du.tv,http://www.100du.com,Hospitality,operating,CHN,23,Shanghai,Shanghai,
56,/organization/100e-com,/funding-round/22a86670d1055d7bafce665b27e91871,venture,,01-01-2006,3000000.0,/organization/100e-com,100e.com,http://www.100e.com,Education,operating,CHN,22,Beijing,Beijing,
57,/organization/100e-com,/funding-round/a136f7eb873dfb13cec839fef7d7f51e,venture,,01-09-2006,1500000.0,/organization/100e-com,100e.com,http://www.100e.com,Education,operating,CHN,22,Beijing,Beijing,
80,/organization/117go,/funding-round/bbbda407fa8638d944ecfdd042230c4b,venture,B,28-04-2014,20000000.0,/organization/117go,117go,http://117go.com,Social Travel,operating,CHN,23,Shanghai,Shanghai,01-10-2011
99,/organization/123feng-com,/funding-round/9d83084d87dc63a309c9a6fe9bf59d1b,venture,A,27-10-2015,13000000.0,/organization/123feng-com,123Feng.Com,http://123feng.com/,,operating,CHN,2,Hangzhou,Hangzhou,01-01-2014
132,/organization/16wifi,/funding-round/0496c258019a924b3a2da1ebaddb1f9d,venture,A,05-06-2015,16000000.0,/organization/16wifi,16WiFi,,Public Transportation,operating,CHN,22,Beijing,Beijing,01-01-2011
137,/organization/17u-cn,/funding-round/1f3e644c0446dca6939bb0f679bfa5ea,venture,C,01-05-2012,1588983.0,/organization/17u-cn,17u.cn,http://www.17u.cn,Travel,operating,CHN,4,Shanghai,Suzhou,01-01-2004
138,/organization/17u-cn,/funding-round/275f67ca70ae24053ee00310a1472019,venture,A,01-01-2008,479014.0,/organization/17u-cn,17u.cn,http://www.17u.cn,Travel,operating,CHN,4,Shanghai,Suzhou,01-01-2004


In [50]:
# Create a dataframe for the third-ranked country by total investment (GBR, United Kingdom), which is English-speaking.
C3 = master_frame.loc[master_frame.country_code=='GBR',:]
C3

Unnamed: 0,company_permalink,funding_round_permalink,funding_round_type,funding_round_code,funded_at,raised_amount_usd,permalink,name,homepage_url,category_list,status,country_code,state_code,region,city,founded_at
28,/organization/10-minutes-with,/funding-round/0faccbbcc5818dc5326469f13f5a8ac8,venture,A,09-10-2014,4000000.0,/organization/10-minutes-with,10 Minutes With,http://10minuteswith.com,Education,operating,GBR,H9,London,London,01-01-2013
29,/organization/10-minutes-with,/funding-round/f245a74b4c54610ae843e17bdf4d1113,seed,,01-01-2013,400000.0,/organization/10-minutes-with,10 Minutes With,http://10minuteswith.com,Education,operating,GBR,H9,London,London,01-01-2013
78,/organization/11-health,/funding-round/064e6e706d1b2928a064c1dde49f05d7,seed,,20-08-2015,1958909.0,/organization/11-health,11 Health,http://www.11health.com,Health and Wellness,operating,GBR,H9,London,London,
102,/organization/1248,/funding-round/ce0e1829f5fe37bb20fc1542340f1766,seed,,18-03-2014,378812.0,/organization/1248,1248,http://1248.io/index.php/?page=index,Software,operating,GBR,C3,London,Cambridge,01-01-2013
183,/organization/1rebel,/funding-round/a5c7a437af6e065280be325ae194f8d6,equity_crowdfunding,,01-09-2014,2572969.0,/organization/1rebel,1Rebel,http://1rebelco.uk,Fitness,operating,GBR,H9,London,London,
310,/organization/2dheat,/funding-round/555c226a44a28ca514634450e74e3924,angel,,24-10-2014,157000.0,/organization/2dheat,2DHeat,http://www.2dheat.com/,Clean Technology,operating,GBR,P2,Warrington,Warrington,
311,/organization/2dheat,/funding-round/a7c2474ff586c636c857b2da8ac964d9,grant,,24-10-2014,64226.0,/organization/2dheat,2DHeat,http://www.2dheat.com/,Clean Technology,operating,GBR,P2,Warrington,Warrington,
312,/organization/2dheat,/funding-round/b1308b31d7a416eb1200b9bb42c9a08f,private_equity,,24-10-2014,200708.0,/organization/2dheat,2DHeat,http://www.2dheat.com/,Clean Technology,operating,GBR,P2,Warrington,Warrington,
313,/organization/2dheat,/funding-round/c62ddc12e22867d4302c2b3c1a2eac14,angel,,24-10-2014,52183.0,/organization/2dheat,2DHeat,http://www.2dheat.com/,Clean Technology,operating,GBR,P2,Warrington,Warrington,
368,/organization/31dover,/funding-round/b95cb5a74632e596e19a845e405ef14b,venture,B,01-03-2014,2274716.0,/organization/31dover,31Dover,http://www.31dover.com,E-Commerce|Wine And Spirits,operating,GBR,H9,London,London,01-07-2012
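
The country filter above can be combined with the 5 to 15 million USD per-round constraint from the brief. The following is only a minimal sketch on made-up data: `toy`, `country_rounds`, `low`, and `high` are hypothetical names that do not appear in the notebook.

```python
import pandas as pd

# Hypothetical miniature of master_frame; the values are made up for illustration.
toy = pd.DataFrame({
    "country_code": ["GBR", "GBR", "USA", "GBR"],
    "funding_round_type": ["venture", "seed", "venture", "venture"],
    "raised_amount_usd": [6_000_000.0, 400_000.0, 10_000_000.0, 20_000_000.0],
})

def country_rounds(df, code, low=5_000_000, high=15_000_000):
    """Return the rounds for one country whose amount lies in [low, high]."""
    mask = (df["country_code"] == code) & df["raised_amount_usd"].between(low, high)
    # .copy() gives an independent frame, so later column assignments are safe.
    return df.loc[mask].copy()

gbr = country_rounds(toy, "GBR")
print(gbr)
```

Returning a copy rather than a slice avoids chained-assignment warnings if new columns are added to the country frame later.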


In [43]:
# Find the average amount raised for each funding type, largest first
#investment_type =  master_frame.groupby('funding_round_type').raised_amount_usd.mean().sort_values(ascending = True)
master_frame.groupby('funding_round_type').raised_amount_usd.mean().sort_values(ascending = False).apply(lambda x: '%.2f' % x)

funding_round_type
post_ipo_debt            168704571.82
post_ipo_equity           82182493.87
secondary_market          79649630.10
private_equity            73308593.03
undisclosed               19242370.23
debt_financing            17043526.02
venture                   11748949.13
grant                      4300576.34
convertible_note           1453438.54
product_crowdfunding       1363131.07
angel                       958694.47
seed                        719818.00
equity_crowdfunding         538368.21
non_equity_assistance       411203.05
Name: raised_amount_usd, dtype: object
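
In the output above, venture (about 11.75 million USD) is the only funding type whose average falls inside Spark Funds' 5 to 15 million USD band. That check can be made explicit, as in the sketch below. It runs on made-up data, and `toy`, `avg_by_type`, and `in_range` are hypothetical names.

```python
import pandas as pd

# Hypothetical rounds; the amounts are made up for illustration.
toy = pd.DataFrame({
    "funding_round_type": ["venture", "venture", "seed", "private_equity"],
    "raised_amount_usd": [8_000_000.0, 12_000_000.0, 500_000.0, 70_000_000.0],
})

# Average amount per funding type, kept numeric (no string formatting yet).
avg_by_type = toy.groupby("funding_round_type")["raised_amount_usd"].mean()

# Keep only the types whose average lies between 5 and 15 million USD.
in_range = avg_by_type[avg_by_type.between(5_000_000, 15_000_000)]
print(in_range)
```

Keeping the averages numeric until the final display step matters here: once the values are formatted with `'%.2f' % x` they become strings (note the `dtype: object` above) and can no longer be compared against the band.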

In [123]:
# Sector Analysis
# Reshape the sector mapping dataframe from wide to long with pandas melt:
# the sector columns become rows, and only the rows flagged 1 are kept.
sector_mapping_df1 = pd.melt(sector_mapping_df,id_vars=["category_list"],var_name='category',value_name='flag')
sector_mapping_df1 = sector_mapping_df1[sector_mapping_df1['flag']==1]
sector_mapping_df1

Unnamed: 0,category_list,category,flag
8,Adventure Travel,Automotive & Sports,1
14,Aerospace,Automotive & Sports,1
45,Auto,Automotive & Sports,1
46,Automated Kiosk,Automotive & Sports,1
47,Automotive,Automotive & Sports,1
57,Bicycles,Automotive & Sports,1
69,Boating Industry,Automotive & Sports,1
87,CAD,Automotive & Sports,1
93,Cars,Automotive & Sports,1
188,Design,Automotive & Sports,1
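
To make the wide-to-long reshape concrete, here is a minimal sketch on a two-row mapping. The frame `wide` is a made-up miniature of the mapping file, not the real data, and `long` is a hypothetical name.

```python
import pandas as pd

# Hypothetical wide mapping: one column per sector, 1 marks membership.
wide = pd.DataFrame({
    "category_list": ["Auto", "Media"],
    "Automotive & Sports": [1, 0],
    "Entertainment": [0, 1],
})

# melt turns every sector column into (category, flag) rows per category_list.
long = pd.melt(wide, id_vars=["category_list"], var_name="category", value_name="flag")

# Keep only the rows where the category actually belongs to the sector.
long = long[long["flag"] == 1].reset_index(drop=True)
print(long)
```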


In [124]:
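# The category_list values contain '0' where 'na' should appear; replace it to restore the names.
# regex=False treats '0' as a literal string. Note that this also rewrites any legitimate '0' in a category name.
sector_mapping_df1['category_list'] = sector_mapping_df1['category_list'].str.replace('0','na', regex=False)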
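
A blanket replacement of '0' also changes names that legitimately contain a zero. The sketch below compares it with a more selective pattern. The three sample values are hypothetical, and the lookbehind rule (skip a '0' preceded by a digit or a dot) is an assumption about which zeros are genuine, so it should be checked against the real mapping file before use.

```python
import pandas as pd

# Hypothetical values: two corrupted names and one that legitimately contains "0".
s = pd.Series(["A0lytics", "Fi0nce", "Enterprise 2.0"])

# Literal replacement of every "0".
naive = s.str.replace("0", "na", regex=False)

# Replace "0" only when it is NOT preceded by a digit or a dot.
careful = s.str.replace(r"(?<![\d.])0", "na", regex=True)

print(naive.tolist())
print(careful.tolist())
```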

In [130]:
# Drop rows with a missing category_list from the mapping dataframe.
sector_mapping_df1 = sector_mapping_df1[sector_mapping_df1.category_list.notnull()]

In [131]:
sector_mapping_df1

Unnamed: 0,category_list,category,flag
8,Adventure Travel,Automotive & Sports,1
14,Aerospace,Automotive & Sports,1
45,Auto,Automotive & Sports,1
46,Automated Kiosk,Automotive & Sports,1
47,Automotive,Automotive & Sports,1
57,Bicycles,Automotive & Sports,1
69,Boating Industry,Automotive & Sports,1
87,CAD,Automotive & Sports,1
93,Cars,Automotive & Sports,1
188,Design,Automotive & Sports,1


In [151]:
#Exclude company rounds data where there is no category_list for a company
master_frame = master_frame[master_frame['category_list'].notnull()]
master_frame

Unnamed: 0,company_permalink,funding_round_permalink,funding_round_type,funding_round_code,funded_at,raised_amount_usd,permalink,name,homepage_url,category_list,status,country_code,state_code,region,city,founded_at
0,/organization/-fame,/funding-round/9a01d05418af9f794eebff7ace91f638,venture,B,05-01-2015,10000000.0,/organization/-fame,#fame,http://livfame.com,Media,operating,IND,16,Mumbai,Mumbai,
2,/organization/-qounter,/funding-round/b44fbb94153f6cdef13083530bb48030,seed,,01-03-2014,700000.0,/organization/-qounter,:Qounter,http://www.qounter.com,Application Platforms|Real Time|Social Network...,operating,USA,DE,DE - Other,Delaware City,04-09-2014
3,/organization/-the-one-of-them-inc-,/funding-round/650b8f704416801069bb178a1418776b,venture,B,30-01-2014,3406878.0,/organization/-the-one-of-them-inc-,"(THE) ONE of THEM,Inc.",http://oneofthem.jp,Apps|Games|Mobile,operating,,,,,
4,/organization/0-6-com,/funding-round/5727accaeaa57461bd22a9bdd945382d,venture,A,19-03-2008,2000000.0,/organization/0-6-com,0-6.com,http://www.0-6.com,Curated Web,operating,CHN,22,Beijing,Beijing,01-01-2007
6,/organization/01games-technology,/funding-round/7d53696f2b4f607a2f2a8cbb83d01839,undisclosed,,01-07-2014,41250.0,/organization/01games-technology,01Games Technology,http://www.01games.hk/,Games,operating,HKG,,Hong Kong,Hong Kong,
7,/organization/0ndine-biomedical-inc,/funding-round/2b9d3ac293d5cdccbecff5c8cb0f327d,seed,,11-09-2009,43360.0,/organization/0ndine-biomedical-inc,Ondine Biomedical Inc.,http://ondinebio.com,Biotechnology,operating,CAN,BC,Vancouver,Vancouver,01-01-1997
8,/organization/0ndine-biomedical-inc,/funding-round/954b9499724b946ad8c396a57a5f3b72,venture,,21-12-2009,719491.0,/organization/0ndine-biomedical-inc,Ondine Biomedical Inc.,http://ondinebio.com,Biotechnology,operating,CAN,BC,Vancouver,Vancouver,01-01-1997
9,/organization/0xdata,/funding-round/383a9bd2c04f7038bb543ccef5ba3eae,seed,,22-05-2013,3000000.0,/organization/0xdata,H2O.ai,http://h2o.ai/,Analytics,operating,USA,CA,SF Bay Area,Mountain View,01-01-2011
10,/organization/0xdata,/funding-round/3bb2ee4a2d89251a10aaa735b1180e44,venture,B,09-11-2015,20000000.0,/organization/0xdata,H2O.ai,http://h2o.ai/,Analytics,operating,USA,CA,SF Bay Area,Mountain View,01-01-2011
11,/organization/0xdata,/funding-round/ae2a174c06517c2394aed45006322a7e,venture,,03-01-2013,1700000.0,/organization/0xdata,H2O.ai,http://h2o.ai/,Analytics,operating,USA,CA,SF Bay Area,Mountain View,01-01-2011


In [152]:
# Extract the first category_list value (left of the first |) into a new column, primary_sector.
# Work on an explicit copy so the assignment does not trigger SettingWithCopyWarning.
master_frame = master_frame.copy()
master_frame['primary_sector'] = master_frame['category_list'].str.split('|').str[0]


In [153]:
master_frame

Unnamed: 0,company_permalink,funding_round_permalink,funding_round_type,funding_round_code,funded_at,raised_amount_usd,permalink,name,homepage_url,category_list,status,country_code,state_code,region,city,founded_at,primary_sector
0,/organization/-fame,/funding-round/9a01d05418af9f794eebff7ace91f638,venture,B,05-01-2015,10000000.0,/organization/-fame,#fame,http://livfame.com,Media,operating,IND,16,Mumbai,Mumbai,,Media
2,/organization/-qounter,/funding-round/b44fbb94153f6cdef13083530bb48030,seed,,01-03-2014,700000.0,/organization/-qounter,:Qounter,http://www.qounter.com,Application Platforms|Real Time|Social Network...,operating,USA,DE,DE - Other,Delaware City,04-09-2014,Application Platforms
3,/organization/-the-one-of-them-inc-,/funding-round/650b8f704416801069bb178a1418776b,venture,B,30-01-2014,3406878.0,/organization/-the-one-of-them-inc-,"(THE) ONE of THEM,Inc.",http://oneofthem.jp,Apps|Games|Mobile,operating,,,,,,Apps
4,/organization/0-6-com,/funding-round/5727accaeaa57461bd22a9bdd945382d,venture,A,19-03-2008,2000000.0,/organization/0-6-com,0-6.com,http://www.0-6.com,Curated Web,operating,CHN,22,Beijing,Beijing,01-01-2007,Curated Web
6,/organization/01games-technology,/funding-round/7d53696f2b4f607a2f2a8cbb83d01839,undisclosed,,01-07-2014,41250.0,/organization/01games-technology,01Games Technology,http://www.01games.hk/,Games,operating,HKG,,Hong Kong,Hong Kong,,Games
7,/organization/0ndine-biomedical-inc,/funding-round/2b9d3ac293d5cdccbecff5c8cb0f327d,seed,,11-09-2009,43360.0,/organization/0ndine-biomedical-inc,Ondine Biomedical Inc.,http://ondinebio.com,Biotechnology,operating,CAN,BC,Vancouver,Vancouver,01-01-1997,Biotechnology
8,/organization/0ndine-biomedical-inc,/funding-round/954b9499724b946ad8c396a57a5f3b72,venture,,21-12-2009,719491.0,/organization/0ndine-biomedical-inc,Ondine Biomedical Inc.,http://ondinebio.com,Biotechnology,operating,CAN,BC,Vancouver,Vancouver,01-01-1997,Biotechnology
9,/organization/0xdata,/funding-round/383a9bd2c04f7038bb543ccef5ba3eae,seed,,22-05-2013,3000000.0,/organization/0xdata,H2O.ai,http://h2o.ai/,Analytics,operating,USA,CA,SF Bay Area,Mountain View,01-01-2011,Analytics
10,/organization/0xdata,/funding-round/3bb2ee4a2d89251a10aaa735b1180e44,venture,B,09-11-2015,20000000.0,/organization/0xdata,H2O.ai,http://h2o.ai/,Analytics,operating,USA,CA,SF Bay Area,Mountain View,01-01-2011,Analytics
11,/organization/0xdata,/funding-round/ae2a174c06517c2394aed45006322a7e,venture,,03-01-2013,1700000.0,/organization/0xdata,H2O.ai,http://h2o.ai/,Analytics,operating,USA,CA,SF Bay Area,Mountain View,01-01-2011,Analytics


In [154]:
# Left-join master_frame to the sector mapping on primary_sector so that every funding round gets its main sector.
master_frame2 = pd.merge(master_frame, sector_mapping_df1, left_on='primary_sector',
                         right_on='category_list', how='left')
master_frame2

Unnamed: 0,company_permalink,funding_round_permalink,funding_round_type,funding_round_code,funded_at,raised_amount_usd,permalink,name,homepage_url,category_list_x,status,country_code,state_code,region,city,founded_at,primary_sector,category_list_y,category,flag
0,/organization/-fame,/funding-round/9a01d05418af9f794eebff7ace91f638,venture,B,05-01-2015,10000000.0,/organization/-fame,#fame,http://livfame.com,Media,operating,IND,16,Mumbai,Mumbai,,Media,Media,Entertainment,1.0
1,/organization/-qounter,/funding-round/b44fbb94153f6cdef13083530bb48030,seed,,01-03-2014,700000.0,/organization/-qounter,:Qounter,http://www.qounter.com,Application Platforms|Real Time|Social Network...,operating,USA,DE,DE - Other,Delaware City,04-09-2014,Application Platforms,Application Platforms,"News, Search and Messaging",1.0
2,/organization/-the-one-of-them-inc-,/funding-round/650b8f704416801069bb178a1418776b,venture,B,30-01-2014,3406878.0,/organization/-the-one-of-them-inc-,"(THE) ONE of THEM,Inc.",http://oneofthem.jp,Apps|Games|Mobile,operating,,,,,,Apps,Apps,"News, Search and Messaging",1.0
3,/organization/0-6-com,/funding-round/5727accaeaa57461bd22a9bdd945382d,venture,A,19-03-2008,2000000.0,/organization/0-6-com,0-6.com,http://www.0-6.com,Curated Web,operating,CHN,22,Beijing,Beijing,01-01-2007,Curated Web,Curated Web,"News, Search and Messaging",1.0
4,/organization/01games-technology,/funding-round/7d53696f2b4f607a2f2a8cbb83d01839,undisclosed,,01-07-2014,41250.0,/organization/01games-technology,01Games Technology,http://www.01games.hk/,Games,operating,HKG,,Hong Kong,Hong Kong,,Games,Games,Entertainment,1.0
5,/organization/0ndine-biomedical-inc,/funding-round/2b9d3ac293d5cdccbecff5c8cb0f327d,seed,,11-09-2009,43360.0,/organization/0ndine-biomedical-inc,Ondine Biomedical Inc.,http://ondinebio.com,Biotechnology,operating,CAN,BC,Vancouver,Vancouver,01-01-1997,Biotechnology,Biotechnology,Cleantech / Semiconductors,1.0
6,/organization/0ndine-biomedical-inc,/funding-round/954b9499724b946ad8c396a57a5f3b72,venture,,21-12-2009,719491.0,/organization/0ndine-biomedical-inc,Ondine Biomedical Inc.,http://ondinebio.com,Biotechnology,operating,CAN,BC,Vancouver,Vancouver,01-01-1997,Biotechnology,Biotechnology,Cleantech / Semiconductors,1.0
7,/organization/0xdata,/funding-round/383a9bd2c04f7038bb543ccef5ba3eae,seed,,22-05-2013,3000000.0,/organization/0xdata,H2O.ai,http://h2o.ai/,Analytics,operating,USA,CA,SF Bay Area,Mountain View,01-01-2011,Analytics,Analytics,"Social, Finance, Analytics, Advertising",1.0
8,/organization/0xdata,/funding-round/3bb2ee4a2d89251a10aaa735b1180e44,venture,B,09-11-2015,20000000.0,/organization/0xdata,H2O.ai,http://h2o.ai/,Analytics,operating,USA,CA,SF Bay Area,Mountain View,01-01-2011,Analytics,Analytics,"Social, Finance, Analytics, Advertising",1.0
9,/organization/0xdata,/funding-round/ae2a174c06517c2394aed45006322a7e,venture,,03-01-2013,1700000.0,/organization/0xdata,H2O.ai,http://h2o.ai/,Analytics,operating,USA,CA,SF Bay Area,Mountain View,01-01-2011,Analytics,Analytics,"Social, Finance, Analytics, Advertising",1.0
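
With a left join, rounds whose primary sector has no entry in the mapping are kept, but their `category` is NaN. One way to list them is the `indicator=True` option of `merge`, sketched below on made-up frames. `rounds`, `mapping`, and `unmatched` are hypothetical names.

```python
import pandas as pd

# Hypothetical miniature frames; "Unknown Sector" has no entry in the mapping.
rounds = pd.DataFrame({"primary_sector": ["Media", "Games", "Unknown Sector"]})
mapping = pd.DataFrame({
    "category_list": ["Media", "Games"],
    "category": ["Entertainment", "Entertainment"],
})

# indicator=True adds a _merge column: "both" or "left_only" for a left join.
merged = rounds.merge(mapping, left_on="primary_sector", right_on="category_list",
                      how="left", indicator=True)

# Primary sectors that found no match in the mapping.
unmatched = merged.loc[merged["_merge"] == "left_only", "primary_sector"]
print(unmatched.tolist())
```

Inspecting the unmatched values is a quick way to spot category names that are still misspelled after the '0' to 'na' repair.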


In [14]:
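# Convert the rounds2 dataframe to lower case so that permalinks compare consistently.
# Note: this lowercases every column (not only company_permalink) and returns a new dataframe;
# rounds2_df itself is unchanged unless the result is assigned back.
rounds2_df.apply(lambda x:x.astype(str).str.lower())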

Unnamed: 0,company_permalink,funding_round_permalink,funding_round_type,funding_round_code,funded_at,raised_amount_usd
0,/organization/-fame,/funding-round/9a01d05418af9f794eebff7ace91f638,venture,b,05-01-2015,10000000.0
2,/organization/-qounter,/funding-round/b44fbb94153f6cdef13083530bb48030,seed,,01-03-2014,700000.0
3,/organization/-the-one-of-them-inc-,/funding-round/650b8f704416801069bb178a1418776b,venture,b,30-01-2014,3406878.0
4,/organization/0-6-com,/funding-round/5727accaeaa57461bd22a9bdd945382d,venture,a,19-03-2008,2000000.0
6,/organization/01games-technology,/funding-round/7d53696f2b4f607a2f2a8cbb83d01839,undisclosed,,01-07-2014,41250.0
7,/organization/0ndine-biomedical-inc,/funding-round/2b9d3ac293d5cdccbecff5c8cb0f327d,seed,,11-09-2009,43360.0
8,/organization/0ndine-biomedical-inc,/funding-round/954b9499724b946ad8c396a57a5f3b72,venture,,21-12-2009,719491.0
9,/organization/0xdata,/funding-round/383a9bd2c04f7038bb543ccef5ba3eae,seed,,22-05-2013,3000000.0
10,/organization/0xdata,/funding-round/3bb2ee4a2d89251a10aaa735b1180e44,venture,b,09-11-2015,20000000.0
11,/organization/0xdata,/funding-round/ae2a174c06517c2394aed45006322a7e,venture,,03-01-2013,1700000.0
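
If only the permalink needs normalising, a narrower alternative is to lowercase that single column and assign it back, which leaves columns such as `funding_round_code` untouched. This is a minimal sketch on made-up rows. `rounds`, `before`, and `after` are hypothetical names.

```python
import pandas as pd

# Hypothetical rounds with inconsistent permalink casing.
rounds = pd.DataFrame({
    "company_permalink": ["/ORGANIZATION/-FAME", "/organization/-fame", "/Organization/0xdata"],
    "funding_round_code": ["B", None, "B"],
})

before = rounds["company_permalink"].nunique()

# Lowercase only the key column and assign the result back to the frame.
rounds["company_permalink"] = rounds["company_permalink"].str.lower()

after = rounds["company_permalink"].nunique()
print(before, after)
```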
