# Data Wrangling - Tools - Notebook

## Outline


* Basic Data Exploration
* Missing Values
* Duplicates
* Selecting / Dropping Columns
* String operations on whole columns
* Column splits/transformations
* MultiIndex
* Querying a Dataframe
* `DataFrame.at()` vs. `DataFrame.loc`
* Joins
* Sorting
* Grouping - Aggregation & Value Counts
* Resetting Index
* Pivot & Melt
* Column / Row Concatenation
* Looping through a DataFrame

In [1]:
import pandas as pd
import numpy as np

## Making a Copy of the Data

It is important to work on a copy of the data. If we corrupt the df by accident later on, the untouched original is still available and we do not have to read the file again.

In [2]:
data = pd.read_csv('startup_funding.csv',thousands=',')
df = data.copy()
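A minimal sketch of why the copy matters, using a small made-up frame in place of `startup_funding.csv` (the values and the `alias` name are illustrative assumptions, not part of the dataset):

```python
import pandas as pd

# Hypothetical stand-in for the CSV loaded above
data = pd.DataFrame({"StartupName": ["TouchKin", "Ethinos"],
                     "AmountInUSD": [1300000.0, None]})

df = data.copy()   # copy() is deep by default, so df owns its own data
alias = data       # plain assignment only creates a second name for the same object

# Changing the copy leaves the original untouched
df["AmountInUSD"] = df["AmountInUSD"].fillna(0)

print(data["AmountInUSD"].isnull().sum())  # missing values still present in data
print(df["AmountInUSD"].isnull().sum())    # missing values remaining in df
print(alias is data)                       # alias and data are the same object
```

Plain assignment would not have protected the original, which is why `.copy()` is used.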

## Basic Data Exploration

**Getting Shape of the Data**

In [3]:
df.shape

(2422, 11)

There are 2422 rows and 11 columns in the data

**Getting a list of all columns in the dataframe**

In [4]:
df.columns

Index(['index', 'SNo', 'Date', 'StartupName', 'IndustryVertical',
       'SubVertical', 'CityLocation', 'InvestorsName', 'InvestmentType',
       'AmountInUSD', 'Remarks'],
      dtype='object')

**Checking data types of all columns**

In [5]:
df.dtypes

index                 int64
SNo                   int64
Date                 object
StartupName          object
IndustryVertical     object
SubVertical          object
CityLocation         object
InvestorsName        object
InvestmentType       object
AmountInUSD         float64
Remarks              object
dtype: object

**Getting top/bottom 5 values**

By default, df.head() shows the first five rows of the data frame. Passing a number, as in df.head(3) below, shows that many rows instead.

In [6]:
df.head(3)

Unnamed: 0,index,SNo,Date,StartupName,IndustryVertical,SubVertical,CityLocation,InvestorsName,InvestmentType,AmountInUSD,Remarks
0,0,0,01/08/2017,TouchKin,Technology,Predictive Care Platform,Bangalore,Kae Capital,Private Equity,1300000.0,
1,1,1,02/08/2017,Ethinos,Technology,Digital Marketing Agency,Mumbai,Triton Investment Advisors,Private Equity,,
2,2,2,02/08/2017,Leverage Edu,Consumer Internet,Online platform for Higher Education Services,New Delhi,"Kashyap Deorah, Anand Sankeshwar, Deepak Jain,...",Seed Funding,,


By default, df.tail() shows the last five rows of the data frame. Passing a number, as in df.tail(3) below, shows that many rows instead.

In [7]:
df.tail(3)

Unnamed: 0,index,SNo,Date,StartupName,IndustryVertical,SubVertical,CityLocation,InvestorsName,InvestmentType,AmountInUSD,Remarks
2419,47,47,28/07/2017,Gympik.com,Consumer Internet,online marketplace for discovering fitness cen...,bangalore,RoundGlass Partners,Seed Funding,,
2420,48,48,01/06/2017,Tripeur,Technology,Mobile based travel ERP platform,Bangalore,"Grace Grace Techno Ventures LLP, Rajul Garg & ...",Seed Funding,,
2421,49,49,02/06/2017,RentOnGo,eCommerce,"Online Marketplace for Renting Bikes, Electron...",Bangalore,TVS Motor Company,Private Equity,,


**Getting top/bottom n values**

In [8]:
df.head(3)

Unnamed: 0,index,SNo,Date,StartupName,IndustryVertical,SubVertical,CityLocation,InvestorsName,InvestmentType,AmountInUSD,Remarks
0,0,0,01/08/2017,TouchKin,Technology,Predictive Care Platform,Bangalore,Kae Capital,Private Equity,1300000.0,
1,1,1,02/08/2017,Ethinos,Technology,Digital Marketing Agency,Mumbai,Triton Investment Advisors,Private Equity,,
2,2,2,02/08/2017,Leverage Edu,Consumer Internet,Online platform for Higher Education Services,New Delhi,"Kashyap Deorah, Anand Sankeshwar, Deepak Jain,...",Seed Funding,,


In [9]:
df.tail(7)

Unnamed: 0,index,SNo,Date,StartupName,IndustryVertical,SubVertical,CityLocation,InvestorsName,InvestmentType,AmountInUSD,Remarks
2415,43,43,26/07/2017,ThinkerBell,Consumer Internet,Assisted Learning Startup,Bangalore,"Indian Angel Network, Anand Mahindra",Seed Funding,200000.0,
2416,44,44,27/07/2017,1mg,eCommerce,Online Pharmacy,Gurgaon,"HBM Healthcare Investments, Maverick Capital V...",Private Equity,15000000.0,
2417,45,45,28/07/2017,Jhakaas,Consumer Internet,App-based Aggregator of Offline Businesses,Mumbai,Amen Dhyllon,Seed Funding,,
2418,46,46,28/07/2017,BigStylist,Consumer Internet,Beauty Services Marketplace,Mumbai,Info Edge (India) Ltd,Private Equity,1250000.0,
2419,47,47,28/07/2017,Gympik.com,Consumer Internet,online marketplace for discovering fitness cen...,bangalore,RoundGlass Partners,Seed Funding,,
2420,48,48,01/06/2017,Tripeur,Technology,Mobile based travel ERP platform,Bangalore,"Grace Grace Techno Ventures LLP, Rajul Garg & ...",Seed Funding,,
2421,49,49,02/06/2017,RentOnGo,eCommerce,"Online Marketplace for Renting Bikes, Electron...",Bangalore,TVS Motor Company,Private Equity,,


**Getting a summary of all columns**

This will display the basic stats of the **numerical columns** in the data frame

In [11]:
df.describe()

Unnamed: 0,index,SNo,AmountInUSD
count,2422.0,2422.0,1553.0
mean,1161.532205,1161.532205,11923850.0
std,697.598259,697.598259,63466800.0
min,0.0,0.0,16000.0
25%,555.25,555.25,375000.0
50%,1160.5,1160.5,1100000.0
75%,1765.75,1765.75,6000000.0
max,2371.0,2371.0,1400000000.0


This will display the basic stats of the **object type** columns in the data frame. The 'O' means object in the following code. 

In [12]:
df.describe(include=['O'])

Unnamed: 0,Date,StartupName,IndustryVertical,SubVertical,CityLocation,InvestorsName,InvestmentType,Remarks
count,2422,2422,2251,1486,2243,2413,2421,419
unique,701,2001,743,1364,71,1885,7,69
top,08/07/2015,Swiggy,Consumer Internet,Online Pharmacy,Bangalore,Undisclosed Investors,Seed Funding,Series A
freq,11,7,795,10,649,33,1300,177


To display all the columns at once use the following command.

In [13]:
df.describe(include='all')

Unnamed: 0,index,SNo,Date,StartupName,IndustryVertical,SubVertical,CityLocation,InvestorsName,InvestmentType,AmountInUSD,Remarks
count,2422.0,2422.0,2422,2422,2251,1486,2243,2413,2421,1553.0,419
unique,,,701,2001,743,1364,71,1885,7,,69
top,,,08/07/2015,Swiggy,Consumer Internet,Online Pharmacy,Bangalore,Undisclosed Investors,Seed Funding,,Series A
freq,,,11,7,795,10,649,33,1300,,177
mean,1161.532205,1161.532205,,,,,,,,11923850.0,
std,697.598259,697.598259,,,,,,,,63466800.0,
min,0.0,0.0,,,,,,,,16000.0,
25%,555.25,555.25,,,,,,,,375000.0,
50%,1160.5,1160.5,,,,,,,,1100000.0,
75%,1765.75,1765.75,,,,,,,,6000000.0,


**Getting unique values of a single column**

After running this command you will see that it returns all unique values that are present in the column **InvestmentType**.

Note that **PrivateEquity** and **Private Equity** are treated as two different values, even though they refer to the same investment type.

In [15]:
df['InvestmentType'].unique()

array(['Private Equity', 'Seed Funding', 'Debt Funding', nan,
       'SeedFunding', 'PrivateEquity', 'Crowd funding', 'Crowd Funding'],
      dtype=object)

## Missing Values

**Checking which columns have missing values and how many**

In [14]:
df.isnull().sum()

index                  0
SNo                    0
Date                   0
StartupName            0
IndustryVertical     171
SubVertical          936
CityLocation         179
InvestorsName          9
InvestmentType         1
AmountInUSD          869
Remarks             2003
dtype: int64

**Treating Missing Values of a single column (Series)**

In [15]:
df['AmountInUSD'] = df['AmountInUSD'].fillna(0)

After running the following command you'll see that AmountInUSD is now filled with 0 and it doesn't have any missing/null values.

In [16]:
df.isnull().sum()

index                  0
SNo                    0
Date                   0
StartupName            0
IndustryVertical     171
SubVertical          936
CityLocation         179
InvestorsName          9
InvestmentType         1
AmountInUSD            0
Remarks             2003
dtype: int64

**Treating all Missing Values at once**

In [19]:
df.fillna('0', inplace=True)

The **inplace=True** parameter in the above code modifies the df directly instead of returning a new data frame, so no reassignment is needed. Note that the fill value here is the string '0', not the number 0.

In [20]:
df.isnull().sum()

index               0
SNo                 0
Date                0
StartupName         0
IndustryVertical    0
SubVertical         0
CityLocation        0
InvestorsName       0
InvestmentType      0
AmountInUSD         0
Remarks             0
dtype: int64

Now you can see that every column has **0** null values.
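Filling every column with one value is quick, but it can mix text into numeric columns. As a hedged alternative, `fillna` also accepts a dict that maps column names to fill values. The sketch below uses a made-up frame, and the `'Unknown'` placeholder is an assumption chosen for illustration:

```python
import pandas as pd
import numpy as np

# Hypothetical miniature of the funding data
toy = pd.DataFrame({
    "CityLocation": ["Bangalore", None, "Mumbai"],
    "AmountInUSD": [1300000.0, np.nan, 500000.0],
})

# One fill value per column, so each column keeps a sensible type
filled = toy.fillna({"CityLocation": "Unknown", "AmountInUSD": 0})

print(filled)
print(filled.dtypes)
```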

## Duplicates

**Checking for duplicates in a specific column**

In [21]:
df.duplicated(['StartupName']).sum()

421

This means that there are **421** rows whose StartupName already appeared in an earlier row. By default, the first occurrence of each name is not counted as a duplicate.





**Checking for whole row duplicates**

In [22]:
df.duplicated().sum()

50

This returns the number of rows that are exact copies of an earlier row.

In [23]:
df[df.duplicated()]

Unnamed: 0,index,SNo,Date,StartupName,IndustryVertical,SubVertical,CityLocation,InvestorsName,InvestmentType,AmountInUSD,Remarks
2372,0,0,01/08/2017,TouchKin,Technology,Predictive Care Platform,Bangalore,Kae Capital,Private Equity,1300000.0,0
2373,1,1,02/08/2017,Ethinos,Technology,Digital Marketing Agency,Mumbai,Triton Investment Advisors,Private Equity,0.0,0
2374,2,2,02/08/2017,Leverage Edu,Consumer Internet,Online platform for Higher Education Services,New Delhi,"Kashyap Deorah, Anand Sankeshwar, Deepak Jain,...",Seed Funding,0.0,0
2375,3,3,02/08/2017,Zepo,Consumer Internet,DIY Ecommerce platform,Mumbai,"Kunal Shah, LetsVenture, Anupam Mittal, Hetal ...",Seed Funding,500000.0,0
2376,4,4,02/08/2017,Click2Clinic,Consumer Internet,healthcare service aggregator,Hyderabad,"Narottam Thudi, Shireesh Palle",Seed Funding,850000.0,0
2377,5,5,01/07/2017,Billion Loans,Consumer Internet,Peer to Peer Lending platform,Bangalore,Reliance Corporate Advisory Services Ltd,Seed Funding,1000000.0,0
2378,6,6,03/07/2017,Ecolibriumenergy,Technology,Energy management solutions provider,Ahmedabad,"Infuse Ventures, JLL",Private Equity,2600000.0,0
2379,7,7,04/07/2017,Droom,eCommerce,Online marketplace for automobiles,Gurgaon,"Asset Management (Asia) Ltd, Digital Garage Inc",Private Equity,20000000.0,0
2380,8,8,05/07/2017,Jumbotail,eCommerce,online marketplace for food and grocery,Bangalore,"Kalaari Capital, Nexus India Capital Advisors",Private Equity,8500000.0,0
2381,9,9,05/07/2017,Moglix,eCommerce,B2B marketplace for Industrial products,Noida,"International Finance Corporation, Rocketship,...",Private Equity,12000000.0,0


These are the rows flagged as duplicates, i.e. the second and later copies of a row that already appeared earlier. A row may occur more than twice.

**Deleting duplicates from a specific column**

In [24]:
df.shape

(2422, 11)

The following code deletes the rows whose **StartupName** is duplicated. The **keep='first'** parameter keeps the first occurrence of each duplicated name and deletes the rest, so only one row per name is left.

In [25]:
df.drop_duplicates(['StartupName'], keep='first').shape # Default is keep=first

(2001, 11)

In [26]:
df.drop_duplicates(['StartupName'], keep='last').shape #keep=last keeps the last instance and deletes the rest

(2001, 11)

In [27]:
df.drop_duplicates(['StartupName'], keep=False).shape # this deletes all the occurrences of the duplicates so none is left.

(1679, 11)

**NOTE:** In none of the calls above did we use the inplace=True parameter or assign the result back, so the df itself is unchanged.
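The three `keep` options are easier to compare on a tiny frame. This is only a sketch with invented rows; the `Round` column is a hypothetical helper used to tell the copies apart:

```python
import pandas as pd

# 'Swiggy' appears three times, 'Ola' once
toy = pd.DataFrame({"StartupName": ["Swiggy", "Ola", "Swiggy", "Swiggy"],
                    "Round": [1, 1, 2, 3]})

first = toy.drop_duplicates(["StartupName"], keep="first")  # keeps Swiggy round 1
last = toy.drop_duplicates(["StartupName"], keep="last")    # keeps Swiggy round 3
none = toy.drop_duplicates(["StartupName"], keep=False)     # drops every Swiggy row

print(first)
print(last)
print(none)
print(toy.shape)  # toy is unchanged because nothing was assigned back
```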

**Deleting whole row duplicates**

In [28]:
df.drop_duplicates().shape

(2372, 11)

In [29]:
df.shape

(2422, 11)

Now we will use the inplace=True parameter to save the data frame

In [30]:
df.drop_duplicates(inplace=True)
df.shape

(2372, 11)

## Selecting / Dropping Columns

**Selecting Columns**

To select multiple columns, pass their names as a list inside the indexing brackets of the data frame.

In [31]:
df[['Date', 'CityLocation', 'AmountInUSD']]

Unnamed: 0,Date,CityLocation,AmountInUSD
0,01/08/2017,Bangalore,1300000.0
1,02/08/2017,Mumbai,0.0
2,02/08/2017,New Delhi,0.0
3,02/08/2017,Mumbai,500000.0
4,02/08/2017,Hyderabad,850000.0
5,01/07/2017,Bangalore,1000000.0
6,03/07/2017,Ahmedabad,2600000.0
7,04/07/2017,Gurgaon,20000000.0
8,05/07/2017,Bangalore,8500000.0
9,05/07/2017,Noida,12000000.0


**Dropping Columns**

In [32]:
df.head()

Unnamed: 0,index,SNo,Date,StartupName,IndustryVertical,SubVertical,CityLocation,InvestorsName,InvestmentType,AmountInUSD,Remarks
0,0,0,01/08/2017,TouchKin,Technology,Predictive Care Platform,Bangalore,Kae Capital,Private Equity,1300000.0,0
1,1,1,02/08/2017,Ethinos,Technology,Digital Marketing Agency,Mumbai,Triton Investment Advisors,Private Equity,0.0,0
2,2,2,02/08/2017,Leverage Edu,Consumer Internet,Online platform for Higher Education Services,New Delhi,"Kashyap Deorah, Anand Sankeshwar, Deepak Jain,...",Seed Funding,0.0,0
3,3,3,02/08/2017,Zepo,Consumer Internet,DIY Ecommerce platform,Mumbai,"Kunal Shah, LetsVenture, Anupam Mittal, Hetal ...",Seed Funding,500000.0,0
4,4,4,02/08/2017,Click2Clinic,Consumer Internet,healthcare service aggregator,Hyderabad,"Narottam Thudi, Shireesh Palle",Seed Funding,850000.0,0


In [33]:
df.drop('index', axis=1).shape

(2372, 10)

In [34]:
df.shape

(2372, 11)

Use the inplace=True parameter to apply the change to the data frame itself.

In [35]:
df.drop('index', axis=1,inplace=True)

In [36]:
df.shape

(2372, 10)

In [37]:
df.head()

Unnamed: 0,SNo,Date,StartupName,IndustryVertical,SubVertical,CityLocation,InvestorsName,InvestmentType,AmountInUSD,Remarks
0,0,01/08/2017,TouchKin,Technology,Predictive Care Platform,Bangalore,Kae Capital,Private Equity,1300000.0,0
1,1,02/08/2017,Ethinos,Technology,Digital Marketing Agency,Mumbai,Triton Investment Advisors,Private Equity,0.0,0
2,2,02/08/2017,Leverage Edu,Consumer Internet,Online platform for Higher Education Services,New Delhi,"Kashyap Deorah, Anand Sankeshwar, Deepak Jain,...",Seed Funding,0.0,0
3,3,02/08/2017,Zepo,Consumer Internet,DIY Ecommerce platform,Mumbai,"Kunal Shah, LetsVenture, Anupam Mittal, Hetal ...",Seed Funding,500000.0,0
4,4,02/08/2017,Click2Clinic,Consumer Internet,healthcare service aggregator,Hyderabad,"Narottam Thudi, Shireesh Palle",Seed Funding,850000.0,0


## String Operations on whole columns

**String Replacement**

In [38]:
df['InvestmentType'].unique()

array(['Private Equity', 'Seed Funding', 'Debt Funding', '0',
       'SeedFunding', 'PrivateEquity', 'Crowd funding', 'Crowd Funding'],
      dtype=object)

We want to replace *PrivateEquity*, which is written as one word, with *Private Equity*.

In [39]:
# first parameter in str.replace is the one which we want to replace and the second one is with which we are replacing it
df['InvestmentType'].str.replace('PrivateEquity', 'Private Equity').unique() 

array(['Private Equity', 'Seed Funding', 'Debt Funding', '0',
       'SeedFunding', 'Crowd funding', 'Crowd Funding'], dtype=object)

In [40]:
df['InvestmentType'].unique()

array(['Private Equity', 'Seed Funding', 'Debt Funding', '0',
       'SeedFunding', 'PrivateEquity', 'Crowd funding', 'Crowd Funding'],
      dtype=object)

As you can see, the old value is still in the df because we did not reassign the result. `str.replace` has **no inplace parameter**, so we have to reassign the column manually.

In [41]:
df['InvestmentType'] = df['InvestmentType'].str.replace('PrivateEquity', 'Private Equity')
df['InvestmentType'] = df['InvestmentType'].str.replace('SeedFunding', 'Seed Funding')
df['InvestmentType'] = df['InvestmentType'].str.replace('Crowd funding', 'Crowd Funding')
df['InvestmentType'] = df['InvestmentType'].str.replace('0', 'Other')

In [42]:
df['InvestmentType'].unique()

array(['Private Equity', 'Seed Funding', 'Debt Funding', 'Other',
       'Crowd Funding'], dtype=object)
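As a possible alternative to chaining several `str.replace` calls, `Series.replace` accepts a dict and swaps whole values rather than substrings. The sketch below runs on a made-up Series; the `fixes` mapping simply mirrors the replacements above:

```python
import pandas as pd

# Hypothetical sample of the inconsistent labels seen above
investment = pd.Series(["Private Equity", "PrivateEquity", "SeedFunding",
                        "Crowd funding", "0", "Seed Funding"])

# Each key is matched against the whole cell value, not a substring
fixes = {
    "PrivateEquity": "Private Equity",
    "SeedFunding": "Seed Funding",
    "Crowd funding": "Crowd Funding",
    "0": "Other",
}

cleaned = investment.replace(fixes)
print(cleaned.unique())
```

Because whole values are matched, a label that merely contains the character `0` would be left alone, which is not the case with `str.replace('0', 'Other')`.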

**Capitalization**

To convert every value in a specific column to lowercase, use the following code. `str.upper()` and `str.title()` work the same way for other capitalizations.

In [43]:
df['CityLocation'].str.lower()

0       bangalore
1          mumbai
2       new delhi
3          mumbai
4       hyderabad
5       bangalore
6       ahmedabad
7         gurgaon
8       bangalore
9           noida
10         mumbai
11      bangalore
12        gurgaon
13      bangalore
14      bangalore
15      hyderabad
16           pune
17         mumbai
18      bangalore
19      bangalore
20      hyderabad
21      bangalore
22           pune
23      bangalore
24      bangalore
25      bangalore
26           pune
27          noida
28      hyderabad
29        kolkata
          ...    
2342            0
2343            0
2344            0
2345            0
2346            0
2347            0
2348            0
2349            0
2350            0
2351            0
2352            0
2353            0
2354            0
2355            0
2356            0
2357            0
2358            0
2359            0
2360            0
2361            0
2362            0
2363            0
2364            0
2365            0
2366      

**Checking if there is a substring in each column value**

In [44]:
df[df['InvestorsName'].str.contains('Khan')]

Unnamed: 0,SNo,Date,StartupName,IndustryVertical,SubVertical,CityLocation,InvestorsName,InvestmentType,AmountInUSD,Remarks
81,81,16/06/2017,Fincash,Consumer Internet,Personal Finance platform,Mumbai,"Mohammed Khan, Sameer Narayan & Others",Seed Funding,100000.0,0
780,780,17/08/2016,MaalGaadi,Logistics,Online Logistics Marketplace,Indore,"Swan Angel Network,Sachin Khandelwal and others",Seed Funding,375000.0,0
871,871,15/07/2016,BaggOut,eCommerce,Women’s Fashion etailer,New Delhi,"Sumit Jain, Sumit Jain, Anurag Gupta, Varun Kh...",Seed Funding,0.0,0
920,920,10/06/2016,Kickstart Jobs,Technology,Entry level hiring platform,Gurgaon,"Vivek Joshi, Mohit Satyanand, Amit Banati, Aru...",Seed Funding,0.0,0
947,947,17/06/2016,BYG,Consumer Internet,Fitness centre Discovery & Booking Mobile app,Bangalore,"Sanjay Verma, Amit Khanna (LetsVenture)",Seed Funding,0.0,0
1094,1094,13/04/2016,Legalraasta,Consumer Internet,Online legal Services for Startups,New Delhi,"Pravin Khandelwal, Yatin Kumar Jain",Seed Funding,1000000.0,0
1155,1155,03/3/2016,Imarticus Learning,Education,Financial Services & Analytics Education Insti...,Mumbai,"Blinc Advisors, Amit Nanavati, Tashwinder Sing...",Private Equity,1000000.0,0
1581,1581,19/11/2015,PlaceofOrigin,Online Gourmet Food Marketplace,0,Bangalore,"S.D. Shibulal, Kris Gopalakrishnan, Srinath Ba...",Seed Funding,0.0,0
1598,1598,25/11/2015,Tooler,On Demand Laundry Services App,0,New Delhi,"Raghu Khanna, Sameer Gupta",Seed Funding,110000.0,0
1793,1793,29/09/2015,LoanCircle,Consumer lending marketplace,0,Bangalore,"Zishaan Hayath, Rahul Khanna & Others",Seed Funding,0.0,0


In [46]:
df['InvestorsName'].str.contains('Khan').sum()

11

In [47]:
df['InvestorsName'].str.contains('khan').sum()

3

**What do you observe?**
<br>The str.contains() function is case sensitive. If you pass Khan with a capital K, it only matches a capital K; if you pass a lowercase k, it only matches lowercase.
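If both spellings should match, `str.contains` takes a `case` parameter. A small sketch on invented names (the `na=False` argument is an extra assumption that treats missing values as non-matches):

```python
import pandas as pd

# Hypothetical investor names, including one missing value
investors = pd.Series(["Mohammed Khan", "sachin khandelwal", None, "Kae Capital"])

exact = investors.str.contains("Khan", na=False)                 # case sensitive
any_case = investors.str.contains("khan", case=False, na=False)  # ignores case

print(exact.sum())
print(any_case.sum())
```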

## Column Transformations / Splits

**Changing Data types**

___1. Normal conversion___

In [48]:
df.dtypes

SNo                   int64
Date                 object
StartupName          object
IndustryVertical     object
SubVertical          object
CityLocation         object
InvestorsName        object
InvestmentType       object
AmountInUSD         float64
Remarks              object
dtype: object

In [49]:
df['AmountInUSD']=df['AmountInUSD'].astype(np.int32)

In [50]:
df.dtypes

SNo                  int64
Date                object
StartupName         object
IndustryVertical    object
SubVertical         object
CityLocation        object
InvestorsName       object
InvestmentType      object
AmountInUSD          int32
Remarks             object
dtype: object

___2. Dates Conversion___

This raises an error because some values are not in a recognizable date format. The function below only works when every value looks like a date, for example 1/1/2001.

In [51]:
pd.to_datetime(df['Date'])

ValueError: ('Unknown string format:', '12/05.2015')

Inspecting the data revealed some anomalies. In some dates a double slash was used instead of a single one, so we replace it before parsing.
<br> The errors='coerce' parameter turns any value that still cannot be parsed into NaT instead of raising an error, and converts the rest.

In [None]:
df['Date'] = pd.to_datetime(df['Date'].str.replace('//', '/'), dayfirst=True, errors='coerce')
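To see what `errors='coerce'` does to bad values, here is a sketch on three made-up date strings. Passing an explicit `format` instead of `dayfirst=True` is an assumption made here to keep the behaviour predictable:

```python
import pandas as pd

# Hypothetical raw dates: one clean, one with a double slash, one with a stray dot
raw = pd.Series(["01/08/2017", "13//04/2015", "12/05.2015"])

# Repair the double slash, then parse as day/month/year
cleaned = raw.str.replace("//", "/", regex=False)
parsed = pd.to_datetime(cleaned, format="%d/%m/%Y", errors="coerce")

# Values that could not be parsed became NaT; list the originals for review
bad_rows = raw[parsed.isna()]

print(parsed)
print(bad_rows)
```

Checking `bad_rows` after a coerced conversion is a cheap way to find out how many dates were silently lost.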

In [None]:
df.dtypes

**Getting months/year/days as separate columns from a datetime column**

In [None]:
df['month'] = pd.DatetimeIndex(df['Date']).month

**Exercise:** Extract day and year

**Extracting new columns from existing string columns**

In [None]:
df['InvestmentType'].str.split(' ')

In [None]:
df['InvestmentType'].str.split(' ').str[0]
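When the pieces should become separate columns, `str.split` can return a data frame via `expand=True`. A minimal sketch on made-up values; the column names `first_word` and `second_word` are hypothetical:

```python
import pandas as pd

investment = pd.Series(["Private Equity", "Seed Funding", "Other"])

# expand=True returns one column per piece instead of a list in each cell
parts = investment.str.split(" ", expand=True)
parts.columns = ["first_word", "second_word"]

print(parts)  # values with fewer pieces get a missing value in the extra column
```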

## MultiIndexing

In [None]:
grouped = df.groupby(['IndustryVertical', 'SubVertical'])

In [None]:
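# Nested dicts for renaming are not accepted by recent pandas versions,
# so aggregate with a list of function names and rename the result columns
new = grouped.agg({'AmountInUSD': ['mean', 'sum']}).rename(columns={'mean': 'Mean', 'sum': 'Sum'})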

In [None]:
new

In [None]:
new['AmountInUSD']

#### You can slice a MultiIndex by providing multiple indexers.

You can use `pandas.IndexSlice` to get a more natural syntax using `:`.

In [None]:
idx = pd.IndexSlice
new.loc[idx[:, ['AI Based Personal Assistant', 'App based cab aggregator']], idx[:, 'Sum']]


In [None]:
new['AmountInUSD']['Sum']
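The same slicing pattern can be tried on a frame small enough to check by hand. This is a sketch with invented verticals and amounts, not the real data:

```python
import pandas as pd

# Hypothetical miniature of the funding data
toy = pd.DataFrame({
    "IndustryVertical": ["Tech", "Tech", "Tech", "eCommerce"],
    "SubVertical": ["AI", "AI", "Cabs", "Pharmacy"],
    "AmountInUSD": [100, 300, 50, 20],
})

# Rows get a two-level index, columns get (column name, statistic) pairs
summary = toy.groupby(["IndustryVertical", "SubVertical"]).agg(
    {"AmountInUSD": ["mean", "sum"]}
)

idx = pd.IndexSlice
# Rows: any IndustryVertical, selected SubVerticals. Columns: only the 'sum' statistic.
picked = summary.loc[idx[:, ["AI", "Pharmacy"]], idx[:, "sum"]]

print(summary)
print(picked)
```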

## Querying a Dataframe

**Querying based on a single value**

In [None]:
df['AmountInUSD'] < 100000

In [None]:
df[df['AmountInUSD'] < 100000]

**Querying based on multiple columns**

In [None]:
df[(df['AmountInUSD'] < 100000) & (df['CityLocation'] == 'New Delhi')]

**Querying based on a list of values**

In [None]:
df['CityLocation'].isin(['New Delhi', 'Bangalore', 'Mumbai'])

In [None]:
#df[(df['CityLocation'].isin(['New Delhi', 'Bangalore', 'Mumbai'])) & (df['AmountInUSD'] < 100000)]

df[
    (df['AmountInUSD'] > 5000) 
    &
    (df['CityLocation'].isin(['New Delhi']))
]
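Boolean conditions can also be stored in named masks and combined with `&` (and), `|` (or) and `~` (not). A sketch on a made-up frame; the mask names `small` and `in_cities` are illustrative:

```python
import pandas as pd

toy = pd.DataFrame({
    "CityLocation": ["New Delhi", "Mumbai", "New Delhi", "Pune"],
    "AmountInUSD": [50000, 80000, 2000000, 10000],
})

# Each mask is a boolean Series with one entry per row
small = toy["AmountInUSD"] < 100000
in_cities = toy["CityLocation"].isin(["New Delhi", "Mumbai"])

both = toy[small & in_cities]         # rows satisfying both conditions
either = toy[small | in_cities]       # rows satisfying at least one
neither = toy[~(small | in_cities)]   # rows satisfying none

print(both)
```

When conditions are written inline, each one needs its own parentheses because `&` and `|` bind more tightly than comparisons such as `<`.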



## `DataFrame.at` vs. `DataFrame.loc`

In [74]:
df.at[133,['StartupName', 'AmountInUSD']]

TypeError: unhashable type: 'list'

In [75]:
df.at[133,'StartupName']

'Curie Labs'

In [76]:
df.loc[133,['StartupName', 'AmountInUSD']]

StartupName    Curie Labs
AmountInUSD         50000
Name: 133, dtype: object

In [77]:
%timeit df.at[133,'StartupName']

9 µs ± 2.09 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)


In [78]:
%timeit df.loc[133,'StartupName']

12.7 µs ± 2.05 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
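In short, `at` reads or writes exactly one cell, while `loc` also accepts lists and slices. A small sketch on an invented two-row frame with the same labels as above:

```python
import pandas as pd

toy = pd.DataFrame(
    {"StartupName": ["Curie Labs", "Droom"], "AmountInUSD": [50000, 20000000]},
    index=[133, 7],
)

name = toy.at[133, "StartupName"]                        # single scalar value
row_part = toy.loc[133, ["StartupName", "AmountInUSD"]]  # several columns at once

toy.at[7, "AmountInUSD"] = 0                             # at can also set one cell

print(name)
print(row_part)
```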


## Joins

In [79]:
rows1 = df[0:10]
rows2 = df[5:12]
print(rows1.shape)
rows1
table1 = rows1[['SNo', 'Date', 'CityLocation']]
table2 = rows2[['SNo', 'StartupName', 'IndustryVertical', 'InvestmentType']]
print(table1.shape)
table1

(10, 11)
(10, 3)


Unnamed: 0,SNo,Date,CityLocation
0,0,2017-08-01,Bangalore
1,1,2017-08-02,Mumbai
2,2,2017-08-02,New Delhi
3,3,2017-08-02,Mumbai
4,4,2017-08-02,Hyderabad
5,5,2017-07-01,Bangalore
6,6,2017-07-03,Ahmedabad
7,7,2017-07-04,Gurgaon
8,8,2017-07-05,Bangalore
9,9,2017-07-05,Noida


In [80]:
print(table2.shape)
table2

(7, 4)


Unnamed: 0,SNo,StartupName,IndustryVertical,InvestmentType
5,5,Billion Loans,Consumer Internet,Seed Funding
6,6,Ecolibriumenergy,Technology,Private Equity
7,7,Droom,eCommerce,Private Equity
8,8,Jumbotail,eCommerce,Private Equity
9,9,Moglix,eCommerce,Private Equity
10,10,Timesaverz,Consumer Internet,Private Equity
11,11,Minjar,Technology,Seed Funding


In [81]:
table1.merge(table2, how='inner', on='SNo')

Unnamed: 0,SNo,Date,CityLocation,StartupName,IndustryVertical,InvestmentType
0,5,2017-07-01,Bangalore,Billion Loans,Consumer Internet,Seed Funding
1,6,2017-07-03,Ahmedabad,Ecolibriumenergy,Technology,Private Equity
2,7,2017-07-04,Gurgaon,Droom,eCommerce,Private Equity
3,8,2017-07-05,Bangalore,Jumbotail,eCommerce,Private Equity
4,9,2017-07-05,Noida,Moglix,eCommerce,Private Equity


## Sorting

In [82]:
df.sort_values(by=['CityLocation', 'IndustryVertical'])

Unnamed: 0,SNo,Date,StartupName,IndustryVertical,SubVertical,CityLocation,InvestorsName,InvestmentType,AmountInUSD,Remarks,month
2201,2201,2015-03-02,TrulyMadly.com,0,0,0,"Helion Venture Partners, Kae Capital",Private Equity,5500000,Series A,3.0
2202,2202,2015-03-02,InstaLively,0,0,0,Group of Angel Investors,Seed Funding,120000,0,3.0
2203,2203,2015-03-03,Vserv,0,0,0,"Maverick Capital, IDG Ventures India",Private Equity,15000000,0,3.0
2204,2204,2015-03-03,Intruo.com,0,0,0,"Ashutosh Lawania, Alok Goel & others",Seed Funding,0,0,3.0
2205,2205,2015-03-05,Niffler,0,0,0,SAIF Partners,Seed Funding,1000000,0,3.0
2206,2206,2015-03-05,CustomFurnish.com,0,0,0,"Madhukar Gangadi, Satish Reddy, Srini Raju, Sr...",Private Equity,2500000,0,3.0
2207,2207,2015-03-06,MapMyGenome,0,0,0,Rajan Anandan & other angel investors,Private Equity,1200000,0,3.0
2208,2208,2015-03-09,Crowdfire (formerly Justunfollow),0,0,0,Kalaari Capital,Private Equity,2500000,0,3.0
2209,2209,2015-03-09,Bite Club,0,0,0,"Powai Lake Ventures, Aneesh Reddy, Ashish Kash...",Seed Funding,500000,0,3.0
2210,2210,2015-03-10,Localbanya,0,0,0,Shrem Strategies,Private Equity,0,Series B,3.0


In [83]:
df.sort_values(by=['CityLocation', 'IndustryVertical'], ascending=False)

Unnamed: 0,SNo,Date,StartupName,IndustryVertical,SubVertical,CityLocation,InvestorsName,InvestmentType,AmountInUSD,Remarks,month
47,47,2017-07-28,Gympik.com,Consumer Internet,online marketplace for discovering fitness cen...,bangalore,RoundGlass Partners,Seed Funding,0,0,7.0
1251,1251,2016-02-05,DawaiLelo,Consumer Internet,Healthcare Services & Online Pharmacy Mobile App,Varanasi,Undisclosed investors,Seed Funding,52000,0,2.0
259,259,2017-03-02,Kreate Konnect,Technology,End-to-End Seller e-commerce solutions Provider,Vadodara,Langoor,Seed Funding,0,0,3.0
1555,1555,2015-11-11,Gingercrush,Product Customization Platform,0,Vadodara,TV Mohandas Pai’s family office,Seed Funding,0,0,11.0
1530,1530,2015-11-03,boibanit,Online Food Ordering Marketplace,0,Vadodara,"Varun Ahuja, Anubhav Verma",Seed Funding,150000,0,11.0
2171,2171,2015-04-20,Pastiwala,Material Collection & Recycling,0,Vadodara,Agnus Capital,Private Equity,4000000,Series A,4.0
1249,1249,2016-02-05,Gingercrush,Ecommerce,Personalized Products & Merchandize eTailer,Vadodara,"Saha Fund, TV Mohandas Pai, Zia Mody, Mumbai A...",Private Equity,1000000,0,2.0
583,583,2016-11-30,Math Buddy,Consumer Internet,Online Math Learn,Vadodara,Menterra Social Impact Fund,Seed Funding,440000,0,11.0
713,713,2016-09-21,PurpleDocs,Consumer Internet,Electronic Health care records platform,Vadodara,"KellyGamma, Lead Angels & Others",Seed Funding,0,0,9.0
935,935,2016-06-15,Oneway.cab,Consumer Internet,Taxi Rental Platform,Vadodara,Indian Angel Network,Seed Funding,450000,0,6.0


## Grouping - Aggregation & Value Counts

**Without grouping**

In [84]:
df['AmountInUSD'].sum()

18347386476

In [85]:
df['IndustryVertical'].unique()

array(['Technology', 'Consumer Internet', 'eCommerce', 'Logistics',
       'Others', 'Healthcare', 'Food & Beverage', 'ECommerce', 'Finance',
       'Education', 'Food & Beverages', 'ecommerce', 'healthcare',
       'Real Estate', 'FMCG', 'Reality', 'Auto', 'Ecommerce', 'BFSI',
       'Consumer Interne', 'Online Education Information platform',
       'Brand Licensing Startup',
       'Gourmet Food Discovery & Delivery platform',
       'Transportation & Logistics Platform',
       'Enterprise Marketing Automation platform',
       'Health, Wellness & Beauty Services App', 'Digital Healthcare',
       'Last Minute Hotel Booking App', 'Womens Fashion Wear Portal',
       'Product Learning platform',
       'Online Food ordering & Delivery platform',
       'App based Bus Pooling Services', 'Social Learning Platform',
       'Social Fitness platform', 'On Demand Mobile app developer',
       'Car Maintenance & Management mobile app',
       'Online Wedding Marketplace', 'Splitting Bills 

In [86]:
df['IndustryVertical'].value_counts()

Consumer Internet                                          772
Technology                                                 313
eCommerce                                                  171
0                                                          171
ECommerce                                                   53
Healthcare                                                  30
Logistics                                                   24
Education                                                   20
Food & Beverage                                             19
Finance                                                      9
Others                                                       6
Online Food Delivery                                         5
Online Education Platform                                    5
Real Estate                                                  4
FMCG                                                         3
ecommerce                                              

**Grouping based on a single column**

In [87]:
df.groupby('InvestmentType')


<pandas.core.groupby.DataFrameGroupBy object at 0x0000015892A24EF0>

In [88]:
df.groupby('InvestmentType')['AmountInUSD'].sum()

InvestmentType
Crowd Funding     1.557680e+05
Debt Funding      7.800000e+06
Other             0.000000e+00
Private Equity    1.800708e+10
Seed Funding      3.323457e+08
Name: AmountInUSD, dtype: float64

In [89]:
df.groupby('CityLocation')['IndustryVertical'].value_counts()

CityLocation         IndustryVertical                                  
0                    0                                                     171
                     Consumer Internet                                       5
                     Mobile Point of Sale payment solution                   1
                     Online Travel Planning                                  1
                     SaaS product intelligence platform                      1
Agra                 eCommerce                                               2
Ahmedabad            Consumer Internet                                      12
                     Technology                                              6
                     eCommerce                                               6
                     Cloud Based Collaboration platform                      1
                     Custom Merchandize platform                             1
                     Enterprise Communication Platform     

**Grouping w.r.t. multiple columns**

In [90]:
df.groupby(['CityLocation', 'InvestmentType'])['AmountInUSD'].sum()

CityLocation           InvestmentType
0                      Crowd Funding     1.557680e+05
                       Private Equity    1.237675e+09
                       Seed Funding      3.403310e+07
Agra                   Seed Funding      0.000000e+00
Ahmedabad              Debt Funding      7.800000e+06
                       Private Equity    8.565000e+07
                       Seed Funding      4.736000e+06
Bangalore              Private Equity    8.289120e+09
                       Seed Funding      9.465411e+07
Bangalore / Palo Alto  Seed Funding      1.000000e+06
Bangalore / SFO        Private Equity    1.350000e+07
                       Seed Funding      1.800000e+06
Bangalore / San Mateo  Private Equity    8.000000e+06
Bangalore / USA        Private Equity    5.000000e+06
Bangalore/ Bangkok     Private Equity    9.900000e+06
Belgaum                Seed Funding      5.000000e+05
Bhopal                 Private Equity    1.800000e+06
                       Seed Funding      1.0

## Resetting Index

In [91]:
type(df.groupby(['CityLocation', 'InvestmentType'])['AmountInUSD'].sum())

pandas.core.series.Series

In [92]:
df.groupby(['CityLocation', 'InvestmentType'])['AmountInUSD'].sum().reset_index()

Unnamed: 0,CityLocation,InvestmentType,AmountInUSD
0,0,Crowd Funding,1.557680e+05
1,0,Private Equity,1.237675e+09
2,0,Seed Funding,3.403310e+07
3,Agra,Seed Funding,0.000000e+00
4,Ahmedabad,Debt Funding,7.800000e+06
5,Ahmedabad,Private Equity,8.565000e+07
6,Ahmedabad,Seed Funding,4.736000e+06
7,Bangalore,Private Equity,8.289120e+09
8,Bangalore,Seed Funding,9.465411e+07
9,Bangalore / Palo Alto,Seed Funding,1.000000e+06


In [93]:
type(df.groupby(['CityLocation', 'InvestmentType'])['AmountInUSD'].sum().reset_index())

pandas.core.frame.DataFrame
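
An alternative to calling `reset_index()` after the aggregation is to pass `as_index=False` to `groupby`, which keeps the grouping keys as ordinary columns. A small sketch, assuming a hypothetical `toy` frame with the same column names as the notebook's data:

```python
import pandas as pd

# Hypothetical toy data; column names mirror the notebook's DataFrame.
toy = pd.DataFrame({
    'CityLocation': ['Agra', 'Agra', 'Bhopal'],
    'InvestmentType': ['Seed Funding', 'Seed Funding', 'Private Equity'],
    'AmountInUSD': [10, 20, 30],
})

keys = ['CityLocation', 'InvestmentType']

# Series with a MultiIndex, then flattened into a DataFrame.
via_reset = toy.groupby(keys)['AmountInUSD'].sum().reset_index()

# One step: the grouping keys stay as regular columns.
via_flag = toy.groupby(keys, as_index=False)['AmountInUSD'].sum()

print(via_reset)
print(via_flag)
```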

In [94]:
grouped = df.groupby(['CityLocation', 'InvestmentType'])
# A list of function names computes several aggregates for one column.
# The nested-dict renaming form ({'col': {'name': func}}) is deprecated
# and raises SpecificationError in pandas 1.0 and later.
grouped.agg({'AmountInUSD': ['min', 'mean', 'max']})


Unnamed: 0_level_0,Unnamed: 1_level_0,AmountInUSD,AmountInUSD,AmountInUSD
Unnamed: 0_level_1,Unnamed: 1_level_1,min,mean,max
CityLocation,InvestmentType,Unnamed: 2_level_2,Unnamed: 3_level_2,Unnamed: 4_level_2
0,Crowd Funding,30768,7.788400e+04,125000
0,Private Equity,0,1.422615e+07,110000000
0,Seed Funding,0,3.781456e+05,3250000
Agra,Seed Funding,0,0.000000e+00,0
Ahmedabad,Debt Funding,7800000,7.800000e+06,7800000
Ahmedabad,Private Equity,1000000,7.786364e+06,32000000
Ahmedabad,Seed Funding,0,2.059130e+05,1000000
Bangalore,Private Equity,0,2.726684e+07,1400000000
Bangalore,Seed Funding,0,2.930468e+05,7500000
Bangalore / Palo Alto,Seed Funding,1000000,1.000000e+06,1000000
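
If flat, custom column names are preferred over the two-level header shown above, pandas (0.25 and later) supports named aggregation. A minimal sketch on hypothetical toy data; the output names `min_amount`, `mean_amount` and `max_amount` are chosen here purely for illustration:

```python
import pandas as pd

# Hypothetical toy data.
toy = pd.DataFrame({
    'CityLocation': ['Agra', 'Agra', 'Bhopal'],
    'AmountInUSD': [10, 30, 50],
})

# Named aggregation: output_column=(input_column, function).
summary = toy.groupby('CityLocation').agg(
    min_amount=('AmountInUSD', 'min'),
    mean_amount=('AmountInUSD', 'mean'),
    max_amount=('AmountInUSD', 'max'),
)

print(summary)
```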


In [95]:
df.dtypes

SNo                          int64
Date                datetime64[ns]
StartupName                 object
IndustryVertical            object
SubVertical                 object
CityLocation                object
InvestorsName               object
InvestmentType              object
AmountInUSD                  int32
Remarks                     object
month                      float64
dtype: object

In [96]:
new = df.groupby(['CityLocation', 'InvestmentType'])['AmountInUSD'].sum().reset_index()
# Whole-value match: only entries equal to '0' become 'Other'.
new['CityLocation'] = new['CityLocation'].replace('0', 'Other')
new

Unnamed: 0,CityLocation,InvestmentType,AmountInUSD
0,Other,Crowd Funding,1.557680e+05
1,Other,Private Equity,1.237675e+09
2,Other,Seed Funding,3.403310e+07
3,Agra,Seed Funding,0.000000e+00
4,Ahmedabad,Debt Funding,7.800000e+06
5,Ahmedabad,Private Equity,8.565000e+07
6,Ahmedabad,Seed Funding,4.736000e+06
7,Bangalore,Private Equity,8.289120e+09
8,Bangalore,Seed Funding,9.465411e+07
9,Bangalore / Palo Alto,Seed Funding,1.000000e+06
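
When relabelling placeholder values like the `'0'` city above, it helps to keep in mind that `Series.str.replace` works on substrings while `Series.replace` matches whole values. A short sketch of the difference, using a made-up series (the label `'Sector 10'` is hypothetical):

```python
import pandas as pd

# Hypothetical city labels; '0' marks a missing city.
cities = pd.Series(['0', 'Agra', 'Sector 10'])

# Substring replacement: every '0' character is rewritten.
substring = cities.str.replace('0', 'Other', regex=False)

# Whole-value replacement: only entries equal to '0' change.
whole = cities.replace('0', 'Other')

print(substring.tolist())
print(whole.tolist())
```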


## Pivoting & Melting

**Pivoting**

In [97]:
new.fillna('0',inplace=True)
new

Unnamed: 0,CityLocation,InvestmentType,AmountInUSD
0,Other,Crowd Funding,1.557680e+05
1,Other,Private Equity,1.237675e+09
2,Other,Seed Funding,3.403310e+07
3,Agra,Seed Funding,0.000000e+00
4,Ahmedabad,Debt Funding,7.800000e+06
5,Ahmedabad,Private Equity,8.565000e+07
6,Ahmedabad,Seed Funding,4.736000e+06
7,Bangalore,Private Equity,8.289120e+09
8,Bangalore,Seed Funding,9.465411e+07
9,Bangalore / Palo Alto,Seed Funding,1.000000e+06


In [107]:
pivoted = pd.pivot_table(new, index='InvestmentType', columns='CityLocation',
                         values='AmountInUSD', aggfunc='sum', fill_value=0)

In [108]:
pivoted

CityLocation,Agra,Ahmedabad,Bangalore,Bangalore / Palo Alto,Bangalore / SFO,Bangalore / San Mateo,Bangalore / USA,Bangalore/ Bangkok,Belgaum,Bhopal,...,Trivandrum,US,US/India,USA,USA/India,Udaipur,Udupi,Vadodara,Varanasi,bangalore
InvestmentType,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1
Crowd Funding,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
Debt Funding,0,7800000,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
Other,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
Private Equity,0,85650000,8289120000,0,13500000,8000000,5000000,9900000,0,1800000,...,0,0,3000000,0,16600000,0,12000000,5000000,0,0
Seed Funding,0,4736000,94654108,1000000,1800000,0,0,0,500000,100000,...,100000,0,0,0,0,0,0,1040000,52000,0


In [111]:
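# Move InvestmentType from the index back into a regular column
# so that it can be used as an id column when melting.
a = pivoted.reset_index()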

**Melting**

In [112]:
pd.melt(a, id_vars=['InvestmentType'], value_vars=['Agra','Ahmedabad'], value_name='Amount')

Unnamed: 0,InvestmentType,CityLocation,Amount
0,Crowd Funding,Agra,0
1,Debt Funding,Agra,0
2,Other,Agra,0
3,Private Equity,Agra,0
4,Seed Funding,Agra,0
5,Crowd Funding,Ahmedabad,0
6,Debt Funding,Ahmedabad,7800000
7,Other,Ahmedabad,0
8,Private Equity,Ahmedabad,85650000
9,Seed Funding,Ahmedabad,4736000
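
Pivoting and melting are roughly inverse operations: `pivot_table` turns long data into a wide table, and `melt` turns the wide table back into long form. A minimal round-trip sketch on hypothetical toy data (`long_df` and its values are made up for illustration):

```python
import pandas as pd

# Hypothetical long-format data.
long_df = pd.DataFrame({
    'InvestmentType': ['Seed Funding', 'Seed Funding', 'Private Equity'],
    'CityLocation': ['Agra', 'Bhopal', 'Bhopal'],
    'AmountInUSD': [10, 20, 30],
})

# Long -> wide: one row per investment type, one column per city.
wide = pd.pivot_table(long_df, index='InvestmentType', columns='CityLocation',
                      values='AmountInUSD', aggfunc='sum', fill_value=0)

# Wide -> long again; var_name and value_name label the two new columns.
back = pd.melt(wide.reset_index(), id_vars=['InvestmentType'],
               var_name='CityLocation', value_name='Amount')

print(wide)
print(back)
```

Combinations that were absent in the long data come back as explicit rows with the fill value `0`.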


## Column / Row Concatenation

In [113]:
df.columns

Index(['SNo', 'Date', 'StartupName', 'IndustryVertical', 'SubVertical',
       'CityLocation', 'InvestorsName', 'InvestmentType', 'AmountInUSD',
       'Remarks', 'month'],
      dtype='object')

In [114]:
part1 = df[['SNo', 'Date', 'StartupName']]
part2 = df[['InvestorsName', 'InvestmentType']]
print(part1.shape)
part1.head()

(2372, 3)


Unnamed: 0,SNo,Date,StartupName
0,0,2017-08-01,TouchKin
1,1,2017-08-02,Ethinos
2,2,2017-08-02,Leverage Edu
3,3,2017-08-02,Zepo
4,4,2017-08-02,Click2Clinic


In [115]:
print(part2.shape)
part2.head()

(2372, 2)


Unnamed: 0,InvestorsName,InvestmentType
0,Kae Capital,Private Equity
1,Triton Investment Advisors,Private Equity
2,"Kashyap Deorah, Anand Sankeshwar, Deepak Jain,...",Seed Funding
3,"Kunal Shah, LetsVenture, Anupam Mittal, Hetal ...",Seed Funding
4,"Narottam Thudi, Shireesh Palle",Seed Funding


In [116]:
pd.concat([part1, part2], axis=1)

Unnamed: 0,SNo,Date,StartupName,InvestorsName,InvestmentType
0,0,2017-08-01,TouchKin,Kae Capital,Private Equity
1,1,2017-08-02,Ethinos,Triton Investment Advisors,Private Equity
2,2,2017-08-02,Leverage Edu,"Kashyap Deorah, Anand Sankeshwar, Deepak Jain,...",Seed Funding
3,3,2017-08-02,Zepo,"Kunal Shah, LetsVenture, Anupam Mittal, Hetal ...",Seed Funding
4,4,2017-08-02,Click2Clinic,"Narottam Thudi, Shireesh Palle",Seed Funding
5,5,2017-07-01,Billion Loans,Reliance Corporate Advisory Services Ltd,Seed Funding
6,6,2017-07-03,Ecolibriumenergy,"Infuse Ventures, JLL",Private Equity
7,7,2017-07-04,Droom,"Asset Management (Asia) Ltd, Digital Garage Inc",Private Equity
8,8,2017-07-05,Jumbotail,"Kalaari Capital, Nexus India Capital Advisors",Private Equity
9,9,2017-07-05,Moglix,"International Finance Corporation, Rocketship,...",Private Equity
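
Column-wise concatenation lines rows up by index label, not by position. Here both parts come from the same `df`, so the rows match exactly. The sketch below, on made-up frames `left` and `right`, shows what happens when the indexes differ:

```python
import pandas as pd

# Hypothetical frames with only partially overlapping indexes.
left = pd.DataFrame({'StartupName': ['A', 'B', 'C']}, index=[0, 1, 2])
right = pd.DataFrame({'InvestmentType': ['Seed', 'Debt']}, index=[1, 2])

# axis=1 aligns on index labels; row 0 has no match in `right`,
# so its InvestmentType is filled with NaN.
combined = pd.concat([left, right], axis=1)

print(combined)
```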


**Rows Concatenation**

In [117]:
rows1 = df[:3]
rows2 = df[5:7]
print(rows1.shape)
rows1

(3, 11)


Unnamed: 0,SNo,Date,StartupName,IndustryVertical,SubVertical,CityLocation,InvestorsName,InvestmentType,AmountInUSD,Remarks,month
0,0,2017-08-01,TouchKin,Technology,Predictive Care Platform,Bangalore,Kae Capital,Private Equity,1300000,0,8.0
1,1,2017-08-02,Ethinos,Technology,Digital Marketing Agency,Mumbai,Triton Investment Advisors,Private Equity,0,0,8.0
2,2,2017-08-02,Leverage Edu,Consumer Internet,Online platform for Higher Education Services,New Delhi,"Kashyap Deorah, Anand Sankeshwar, Deepak Jain,...",Seed Funding,0,0,8.0


In [118]:
print(rows2.shape)
rows2

(2, 11)


Unnamed: 0,SNo,Date,StartupName,IndustryVertical,SubVertical,CityLocation,InvestorsName,InvestmentType,AmountInUSD,Remarks,month
5,5,2017-07-01,Billion Loans,Consumer Internet,Peer to Peer Lending platform,Bangalore,Reliance Corporate Advisory Services Ltd,Seed Funding,1000000,0,7.0
6,6,2017-07-03,Ecolibriumenergy,Technology,Energy management solutions provider,Ahmedabad,"Infuse Ventures, JLL",Private Equity,2600000,0,7.0


In [119]:
pd.concat([rows1, rows2], axis=0)

Unnamed: 0,SNo,Date,StartupName,IndustryVertical,SubVertical,CityLocation,InvestorsName,InvestmentType,AmountInUSD,Remarks,month
0,0,2017-08-01,TouchKin,Technology,Predictive Care Platform,Bangalore,Kae Capital,Private Equity,1300000,0,8.0
1,1,2017-08-02,Ethinos,Technology,Digital Marketing Agency,Mumbai,Triton Investment Advisors,Private Equity,0,0,8.0
2,2,2017-08-02,Leverage Edu,Consumer Internet,Online platform for Higher Education Services,New Delhi,"Kashyap Deorah, Anand Sankeshwar, Deepak Jain,...",Seed Funding,0,0,8.0
5,5,2017-07-01,Billion Loans,Consumer Internet,Peer to Peer Lending platform,Bangalore,Reliance Corporate Advisory Services Ltd,Seed Funding,1000000,0,7.0
6,6,2017-07-03,Ecolibriumenergy,Technology,Energy management solutions provider,Ahmedabad,"Infuse Ventures, JLL",Private Equity,2600000,0,7.0
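
Note that the result above keeps the original row labels (0, 1, 2, 5, 6). If a fresh 0..n-1 index is wanted instead, `pd.concat` accepts `ignore_index=True`. A small sketch on a hypothetical `toy` frame:

```python
import pandas as pd

# Hypothetical toy data with 8 rows.
toy = pd.DataFrame({'SNo': range(8), 'StartupName': list('ABCDEFGH')})

top = toy[:3]
middle = toy[5:7]

# Default: the original labels 0, 1, 2, 5, 6 are preserved.
kept = pd.concat([top, middle], axis=0)

# ignore_index=True renumbers the rows of the result from 0.
renumbered = pd.concat([top, middle], axis=0, ignore_index=True)

print(kept.index.tolist())
print(renumbered.index.tolist())
```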


## Looping through DataFrame

#### Using a simple for loop

In [120]:
import time
start = time.time()
for x in df['StartupName']:
    print(x)
end = time.time()
print(end - start)

TouchKin
Ethinos
Leverage Edu
Zepo
Click2Clinic
Billion Loans
Ecolibriumenergy
Droom
Jumbotail
Moglix
Timesaverz
Minjar
MyCity4kids
Clip App
Upwardly.in
Autorox.co
Fabogo
Flickstree
Design Cafe
Innoviti
VDeliver
Bottr.me
Arcatron
QwikSpec
Chumbak
Increff
Vayana
MObiquest
Ambee
Ideal Insurance
Hypernova Interactive
Rentomojo
AirCTO
Playablo
Trupay
Brick2Wall
FableStreet
Monsoon Fintech
MonkeyBox
Noticeboard
Byju’s
Creator’s Gurukul
Fab Hotels
ThinkerBell
1mg
Jhakaas
BigStylist
Gympik.com
Tripeur
RentOnGo
Goomo
MaxMyWealth
Spinny
Healthbuds
Ftcash
BHIVE Workspace
wayForward
GyanDhan
Hungry Foal
ZipLoan
GrowFitter
Stratfit
Multiplier Solutions
ABI Health
Stockal
Guidoo
eSahai.in
Loanmeet
LendingKart
Boxx.ai
PeeSafe.in
Vista Rooms
CoverFox
HyperTrack
Goodera
Digit
Ola
Vanitycask
MrNeeds
MoneyTap
Goodbox
Fincash
PortDesk
EdgeFx
EazyDiner
HealthIntel
Unbxd
DarwinBox
ForeverShop
Insider.in
Fynd
Grow Fit
Fabulyst
mSwipe
OneAssist
Julia Computing
Kissht
Explore Life traveling
Loantap
Voxweb
HUG

Ixigo
SkillAngels
Vidgyor
Banihal
Goodservice
Treebo
EduKart
Healthians.com
LabInApp
Syona Cosmetics
Sigrid Education
Indix
The Porter
Retention.ai
Fusion Microfinance
Zinka
CreditMantri
GetMyPeon
UrbanClap
StayGlad
AdWyze
Zimmber
Buttercups
RailYatri
aagaar.com
Flatchat
MyCFO
Foodpanda
Termsheet
Applicate
World Art Community
SpoonJoy
Seed Schools
Ignis Careers
Square Yards
Fintellix
Customer360
Delhivery
Swiggy
Uniphore
Box8
Toppr
Vedantu
BuyHatke
KleverKid
Uniken
GrandOpinion
HealthifyMe
ZapStitch
Wassup
Mobiefit
Plancess
WorkHorse
Innovaccer
MyCuteOffice
IndianRoots.com
RedPolka
Venturesity
TheBetterIndia.com
Awaaz De
Chaayos
Akosha
Cooey
MeetUniv
Inspirock
Crown-it
Trucksfirst
Lookup
SunTerrace
Nudgespot
CarDekho
EveningFlavors
Razorpay
ZenParent
Newgen Payments
UrbanPro
Goodbox
Renderlogy
Postman
Casa2inns
FleetRover
TheKarrier
Zoomcar
Truweight
Ather Energy
Swiggy
Bluegape
KeepTrax
InstaLively
Pricejugaad
Quikr
PressPlay
LogiNext
FirstCry.com
MobiKwik
Olacabs
Gadgetwood
Bonhomia
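
Because the loop above prints every value, the measured time is largely the cost of writing output rather than of iterating. The sketch below times the iteration alone by collecting results into a list, and also shows `itertuples()` as another way to walk over rows. The `toy` frame is hypothetical, and the timings will vary from machine to machine.

```python
import time
import pandas as pd

# Hypothetical toy data: 3000 rows of repeated names.
toy = pd.DataFrame({'StartupName': ['TouchKin', 'Ethinos', 'Zepo'] * 1000})

# Loop over a single column, collecting values instead of printing them.
start = time.perf_counter()
names = [name for name in toy['StartupName']]
column_seconds = time.perf_counter() - start

# Loop over whole rows as namedtuples; row.Index holds the index label.
start = time.perf_counter()
rows = [(row.Index, row.StartupName) for row in toy.itertuples()]
tuples_seconds = time.perf_counter() - start

print(column_seconds, tuples_seconds)
```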


#### Using the `iterrows()` function

In [121]:

start = time.time()
for index, row in df.iterrows():
    # Each row is a Series; columns are accessed by name.
    print(index, row['StartupName'], row['CityLocation'])
end = time.time()
print(end - start)

0 TouchKin Bangalore
1 Ethinos Mumbai
2 Leverage Edu New Delhi
3 Zepo Mumbai
4 Click2Clinic Hyderabad
5 Billion Loans Bangalore
6 Ecolibriumenergy Ahmedabad
7 Droom Gurgaon
8 Jumbotail Bangalore
9 Moglix Noida
10 Timesaverz Mumbai
11 Minjar Bangalore
12 MyCity4kids Gurgaon
13 Clip App Bangalore
14 Upwardly.in Bangalore
15 Autorox.co Hyderabad
16 Fabogo Pune
17 Flickstree Mumbai
18 Design Cafe Bangalore
19 Innoviti Bangalore
20 VDeliver Hyderabad
21 Bottr.me Bangalore
22 Arcatron Pune
23 QwikSpec Bangalore
24 Chumbak Bangalore
25 Increff Bangalore
26 Vayana Pune
27 MObiquest Noida
28 Ambee Hyderabad
29 Ideal Insurance Kolkata
30 Hypernova Interactive Bangalore
31 Rentomojo Bangalore
32 AirCTO Bangalore
33 Playablo Bangalore
34 Trupay Gurgaon
35 Brick2Wall Gurgaon
36 FableStreet New Delhi
37 Monsoon Fintech New Delhi
38 MonkeyBox Bangalore
39 Noticeboard Bangalore
40 Byju’s Bangalore
41 Creator’s Gurukul New Delhi
42 Fab Hotels New Delhi
43 ThinkerBell Bangalore
44 1mg Gurgaon
45 Jhakaas

491 Zoomcar Bangalore
492 BYJU’s Bangalore
493 Supr Daily Mumbai
494 The BlueBook Bangalore
495 MamaEarth New Delhi
496 Vegetall Chennai
497 Innoplexus Pune
498 Mindler New Delhi
499 MCaffeine Bangalore
500 enKast Bangalore
501 ShopKirana Indore
502 MyAdvo New Delhi
503 Asocon Noida
504 LexComply New Delhi
505 The Postbox Chennai
506 Pandorum Technologies Bangalore
507 Dekkho Mumbai
508 Shadowfox Gurgaon
509 ShopX Bangalore
510 BookEventz Mumbai
511 Healthifyme Bangalore
512 LetsMD New Delhi
513 Justbooks Bangalore
514 InstantPay New Delhi
515 PickMyLaundry New Delhi
516 FreshDesk Chennai
517 LaundryAnna Bangalore
518 LetsMD New Delhi
519 Sattviko New Delhi
520 Zarget Chennai
521 DoSelect Bangalore
522 EasyPolicy Noida
523 Browntape Goa
524 Connect India Bangalore
525 Corseco New Delhi
526 vImmune Bangalore
527 The Gourmet Jar Noida
528 Medzin New Delhi
529 Bizongo Mumbai
530 Rivigo Gurgaon
531 JustRide Mumbai
532 IIM Jobs New Delhi
533 Square Yards Gurgaon
534 GolfLAN New Delhi
535 Yo

891 Chikoop Indore
892 BRIDGEi2I Bangalore
893 Droom New Delhi
894 Visit New Delhi
895 Atomberg Mumbai
896 Coutloot Mumbai
897 Limo Mumbai
898 iCliniq Bangalore
899 GoGo Truck Chennai
900 Vyome Biosciences New Delhi
901 SillyMonks Hyderabad
902 CueMath Bangalore
903 KrazyBee Bangalore
904 Intgrea Bangalore
905 HelpShift Pune
906 CreditVidya Mumbai
907 Grey Campus Hyderabad
908 ListUp Mumbai
909 6Degree Mumbai
910 Shopwati Gurgaon
911 Cookifi Bangalore
912 StoreKing Bangalore
913 Wired Hub Jaipur
914 MySeniorDoctor Gurgaon
915 Teabox Siliguri
916 ExtraaEdge Pune
917 EPayLater Mumbai
918 Kyazoonga Mumbai
919 Paytunes New Delhi
920 Kickstart Jobs Gurgaon
921 Freshee Mumbai
922 MadRat Games Bangalore
923 Jazzmyride New Delhi
924 AutoSense New Delhi
925 Redsun Telematics Chennai
926 KhanaGadi Jaipur
927 Bombay Shirt Company Mumbai
928 Matrubharti 0
929 EdTechReview New Delhi
930 Bonhomia New Delhi
931 Gobolt New Delhi
932 Jivox Bangalore
933 Sminq Pune
934 CureJoy Bangalore
935 Oneway.cab V

1331 PinClick Bangalore
1332 Smartcooky New Delhi
1333 Helpi Mumbai
1334 DogSpot Gurgaon
1335 Career360 New Delhi
1336 B9 Beverages New Delhi
1337 WIMWI Foods Ahmedabad
1338 MyChild App Bhopal
1339 Instaproc Noida
1340 360Ride Bangalore
1341 AlefMobitech Mumbai
1342 Tracxn Bangalore
1343 Qdesq Gurgaon
1344 Sensara Bangalore
1345 Zebpay Ahmedabad
1346 PlanMyMedicalTrip Pune
1347 MyCity4Kids Gurgaon
1348 Justdakhila.com New Delhi
1349 PitStop Bangalore
1350 Bikxie Gurgaon
1351 Infurnia Bangalore
1352 Myly Jaipur
1353 MakeMyTrip Gurgaon
1354 Melorra Bangalore
1355 ConfirmTKT Bangalore
1356 Gyaanzone Mumbai
1357 FlatFurnish Gurgaon
1358 Lumiere Bangalore
1359 Koovs Gurgaon
1360 Wishup New Delhi
1361 FreshMenu Bangalore
1362 SavvyMob Bangalore
1363 Fashionablyin Mumbai
1364 Shopclues Gurgaon
1365 Hike Messenger New Delhi
1366 iBus Networks Bangalore
1367 Happy2Refer Mumbai
1368 CarTrade Mumbai
1369 Rentomo Bangalore
1370 Roder New Delhi
1371 Bucker Hyderabad
1372 Care24 Mumbai
1373 Obino Mu

1686 Parcelled Bangalore
1687 AlmaMapper Ahmedabad
1688 Flexing It New Delhi
1689 Netmeds.com Chennai
1690 Razorpay Jaipur
1691 Jombay Pune
1692 Happitoo Mumbai
1693 Care24 Mumbai
1694 Roadrunnr Bangalore
1695 Simpli5d Gurgaon
1696 ORIGA Leasing Mumbai
1697 Smartivity.in New Delhi
1698 UrDoorStep Bangalore
1699 Tavaga Mumbai
1700 Moglix Noida
1701 ZopHop Mumbai
1702 Niki.ai Bangalore
1703 Instavans Bangalore
1704 LiftO Mumbai
1705 TinyOwl Mumbai
1706 Toppr.com Mumbai
1707 Kleeto Gurgaon
1708 BeaconsTalk Mumbai
1709 Jugnoo Chandigarh
1710 TOFlo Mumbai
1711 FXMartIndia Chandigarh
1712 Stylecracker Mumbai
1713 Luxuryhues Gurgaon
1714 HolaChef Mumbai
1715 Zivame Bangalore
1716 Capillary Tech New Delhi
1717 Jobspire Bangalore
1718 MeraDoctor Mumbai
1719 Vistaar Finance Bangalore
1720 HiJinny Mumbai
1721 Clapsnslaps Mumbai
1722 Blubox Mumbai
1723 RoomCentral Bangalore
1724 YouthKiAwaaz New Delhi
1725 Prozo New Delhi
1726 Zomato Gurgaon
1727 Sahayog Dairy Bhopal
1728 MockBank Bangalore
1729 R

2057 Goodservice New Delhi
2058 Treebo Bangalore
2059 EduKart New Delhi
2060 Healthians.com Hyderabad
2061 LabInApp Bangalore
2062 Syona Cosmetics Chennai
2063 Sigrid Education Noida
2064 Indix 0
2065 The Porter Mumbai
2066 Retention.ai Bangalore
2067 Fusion Microfinance New Delhi
2068 Zinka Bangalore
2069 CreditMantri Chennai
2070 GetMyPeon Mumbai
2071 UrbanClap New Delhi
2072 StayGlad Bangalore
2073 AdWyze Bangalore
2074 Zimmber Mumbai
2075 Buttercups Bangalore
2076 RailYatri Bangalore
2077 aagaar.com Gurgaon
2078 Flatchat Bangalore
2079 MyCFO Mumbai
2080 Foodpanda Gurgaon
2081 Termsheet Chennai
2082 Applicate Bangalore
2083 World Art Community Gurgaon
2084 SpoonJoy Bangalore
2085 Seed Schools Hyderabad
2086 Ignis Careers Hyderabad
2087 Square Yards Gurgaon
2088 Fintellix Bangalore
2089 Customer360 Mumbai
2090 Delhivery Gurgaon
2091 Swiggy Bangalore
2092 Uniphore Chennai
2093 Box8 Mumbai
2094 Toppr Mumbai
2095 Vedantu Bangalore
2096 BuyHatke Bangalore
2097 KleverKid New Delhi
2098 Un