## Funding Amount

#### Problem Statement:
Find out if cities play any role in receiving funding.<br>
<br>
Find the top 10 Indian cities by total funding received. Then find the percentage of funding each city received (among the top 10 Indian cities only).<br>
<br>
Print each city and its percentage, rounded to 2 decimal places.<br>
<br>
__Note:__<br>
Treat the city name "Delhi" as "New Delhi".<br>
<br>
City names are not consistently cased: in some places "bangalore" is given instead of "Bangalore". Treat both as "Bangalore".<br>
<br>
For a few startups, multiple locations are given, one Indian and one foreign. Count those startups as Indian startups too. The Indian city name comes first.<br>
<br>
Print the cities in descending order with respect to the number of startups.<br>
<br>
#### Output Format:
city1 percent1<br>
city2 percent2<br>
...<br>
...<br>
...<br>

In [1]:
# Importing the libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [2]:
# Importing the dataset
startup = pd.read_csv('startup_funding.csv')
# making the copy of the dataframe
df = startup.copy()
df

Unnamed: 0,SNo,Date,StartupName,IndustryVertical,SubVertical,CityLocation,InvestorsName,InvestmentType,AmountInUSD,Remarks
0,0,01/08/2017,TouchKin,Technology,Predictive Care Platform,Bangalore,Kae Capital,Private Equity,1300000,
1,1,02/08/2017,Ethinos,Technology,Digital Marketing Agency,Mumbai,Triton Investment Advisors,Private Equity,,
2,2,02/08/2017,Leverage Edu,Consumer Internet,Online platform for Higher Education Services,New Delhi,"Kashyap Deorah, Anand Sankeshwar, Deepak Jain,...",Seed Funding,,
3,3,02/08/2017,Zepo,Consumer Internet,DIY Ecommerce platform,Mumbai,"Kunal Shah, LetsVenture, Anupam Mittal, Hetal ...",Seed Funding,500000,
4,4,02/08/2017,Click2Clinic,Consumer Internet,healthcare service aggregator,Hyderabad,"Narottam Thudi, Shireesh Palle",Seed Funding,850000,
...,...,...,...,...,...,...,...,...,...,...
2367,2367,29/01/2015,Printvenue,,,,Asia Pacific Internet Group,Private Equity,4500000,
2368,2368,29/01/2015,Graphene,,,,KARSEMVEN Fund,Private Equity,825000,Govt backed VC Fund
2369,2369,30/01/2015,Mad Street Den,,,,"Exfinity Fund, GrowX Ventures.",Private Equity,1500000,
2370,2370,30/01/2015,Simplotel,,,,MakeMyTrip,Private Equity,,"Strategic Funding, Minority stake"


In [3]:
# Extracting the CityLocation and AmountInUSD columns
mycolumns = ['CityLocation', 'AmountInUSD']
df2 = df.loc[:, mycolumns].copy()  # explicit copy so later edits do not touch df
df2

Unnamed: 0,CityLocation,AmountInUSD
0,Bangalore,1300000
1,Mumbai,
2,New Delhi,
3,Mumbai,500000
4,Hyderabad,850000
...,...,...
2367,,4500000
2368,,825000
2369,,1500000
2370,,


In [4]:
# converting an AmountInUSD string to an integer by stripping the commas
def modified(amount):
    return int(amount.replace(',',''))

# function to keep the first city of a '/'-separated location string
def ind_city(city):
    return city.split('/')[0].strip()
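
The two helpers can be checked on a few inputs before they are applied to whole columns. This is a minimal sketch: the sample strings are made up for illustration and are not taken from the dataset. It also shows why a location with the foreign city first needs explicit handling, since `ind_city` always keeps the first part.

```python
def modified(amount):
    # strip the commas, then convert to int
    return int(amount.replace(',', ''))

def ind_city(city):
    # keep only the part before the first '/'
    return city.split('/')[0].strip()

# hypothetical sample values, not rows from startup_funding.csv
amounts = [modified(a) for a in ['1,300,000', '0']]
cities = [ind_city(c) for c in ['Bangalore / SFO', 'Pune/Seattle', 'Mumbai', '']]
foreign_first = ind_city('SFO / Bangalore')  # the foreign city wins here

print(amounts)
print(cities)
print(foreign_first)
```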

In [5]:
# filling the NaN values of both columns (amounts stay strings until converted)
df2['CityLocation'] = df2['CityLocation'].fillna('')
df2['AmountInUSD'] = df2['AmountInUSD'].fillna('0')

In [6]:
# normalizing misspelled and multi-location city names
df2['CityLocation'] = df2['CityLocation'].replace({
    "bangalore": "Bangalore",
    "Delhi": "New Delhi",
    "SFO / Bangalore": "Bangalore",
    "Seattle / Bangalore": "Bangalore",
    "Goa/Hyderabad": "Hyderabad",
    "Dallas/Hyderabad": "Hyderabad",
})
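
Listing every multi-location string by hand is easy to get wrong, for instance when the spacing around `/` varies. One possible alternative, sketched here under the assumption that the Indian cities of interest are known in advance, is to look up each part of the location case-insensitively and keep the first part that matches. `CANONICAL` and `normalize_city` are hypothetical names, not part of the notebook.

```python
import pandas as pd

# assumption: the ten cities this notebook filters on, keyed by lowercase name
CANONICAL = {name.lower(): name for name in [
    'Bangalore', 'Mumbai', 'Gurgaon', 'Noida', 'New Delhi',
    'Pune', 'Chennai', 'Ahmedabad', 'Jaipur', 'Hyderabad',
]}
CANONICAL['delhi'] = 'New Delhi'  # alias required by the problem statement

def normalize_city(raw):
    # split on '/', then return the first part that is a known Indian city
    parts = [p.strip() for p in raw.split('/')]
    for part in parts:
        canonical = CANONICAL.get(part.lower())
        if canonical is not None:
            return canonical
    return parts[0]  # no known city found: fall back to the first part

# made-up inputs covering the cases discussed in the note
raw = pd.Series(['SFO / Bangalore', 'Dallas / Hyderabad', 'bangalore',
                 'Delhi', 'Seattle', ''])
normalized = raw.apply(normalize_city).tolist()
print(normalized)
```

With this approach the order of the cities and the spacing around the separator no longer matter, at the cost of maintaining the city list.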

In [7]:
# converting the amounts to integers and keeping the first city of each location
df2['AmountInUSD'] = df2['AmountInUSD'].apply(modified)
df2['CityLocation'] = df2['CityLocation'].apply(ind_city)

In [8]:
# checking whether a foreign city (here Dallas) is still present after splitting
df2['CityLocation'].isin(['Dallas']).any()

True

In [9]:
df2.head()

Unnamed: 0,CityLocation,AmountInUSD
0,Bangalore,1300000
1,Mumbai,0
2,New Delhi,0
3,Mumbai,500000
4,Hyderabad,850000


In [10]:
# keeping only the ten Indian cities considered in this analysis
indian_cities = ["Bangalore", "Mumbai", "Gurgaon", "Noida", "New Delhi", "Pune",
                 "Chennai", "Ahmedabad", "Jaipur", "Hyderabad"]
df2 = df2[df2['CityLocation'].isin(indian_cities)]
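
The ten cities in this filter are hard-coded. As a hedged alternative, the top N cities could be derived from the data itself with `groupby` and `nlargest`. The sketch below uses a made-up toy frame and N = 3. On the real data, foreign and empty locations would have to be excluded first, since they would otherwise compete for a place in the top 10.

```python
import pandas as pd

# toy data for illustration only; values are not from the dataset
toy = pd.DataFrame({
    'CityLocation': ['Bangalore', 'Mumbai', 'Bangalore', 'Pune', 'Mumbai', 'Kochi'],
    'AmountInUSD': [500, 300, 200, 100, 50, 10],
})

# total funding per city, then the N cities with the largest totals
totals = toy.groupby('CityLocation')['AmountInUSD'].sum()
top_cities = totals.nlargest(3).index.tolist()

# keep only the rows that belong to those cities
filtered = toy[toy['CityLocation'].isin(top_cities)]
print(top_cities)
```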

In [11]:
# checking whether any foreign city remains
df2['CityLocation'].isin(['Seattle']).any()

False

In [12]:
df2

Unnamed: 0,CityLocation,AmountInUSD
0,Bangalore,1300000
1,Mumbai,0
2,New Delhi,0
3,Mumbai,500000
4,Hyderabad,850000
...,...,...
2196,Bangalore,3500000
2197,Bangalore,0
2198,Bangalore,400000
2199,Chennai,500000


In [19]:
# total funding per city
df3 = df2.groupby(['CityLocation']).sum()
df3

CityLocation,AmountInUSD
Ahmedabad,98186000
Bangalore,8425674108
Chennai,411105000
Gurgaon,2069021500
Hyderabad,195362000
Jaipur,35560000
Mumbai,2354934500
New Delhi,2818247500
Noida,170638000
Pune,366653000


In [20]:
# calculating the total funding received by these top 10 cities
s = df3['AmountInUSD'].sum()
s

16945381608

In [31]:
# calculating the percentage and sorting the values in descending order
df4 = pd.DataFrame((df3['AmountInUSD'] / s)*100)
df4.sort_values(by='AmountInUSD', ascending=False, inplace=True)
df4 = pd.DataFrame(df4['AmountInUSD'].round(2))
df4

CityLocation,AmountInUSD
Bangalore,49.72
New Delhi,16.63
Mumbai,13.9
Gurgaon,12.21
Chennai,2.43
Pune,2.16
Hyderabad,1.15
Noida,1.01
Ahmedabad,0.58
Jaipur,0.21
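
The problem statement asks for `city percent` lines rather than a DataFrame. A minimal sketch of that last step follows. It re-enters the per-city totals shown earlier so that the block runs on its own; in the notebook, `df3['AmountInUSD']` could be used in place of `totals`. The names `totals`, `percent` and `lines` are illustrative.

```python
import pandas as pd

# per-city totals copied from the groupby output above
totals = pd.Series({
    'Ahmedabad': 98186000, 'Bangalore': 8425674108, 'Chennai': 411105000,
    'Gurgaon': 2069021500, 'Hyderabad': 195362000, 'Jaipur': 35560000,
    'Mumbai': 2354934500, 'New Delhi': 2818247500, 'Noida': 170638000,
    'Pune': 366653000,
}, name='AmountInUSD')

# share of each city in the combined total, largest first
percent = (totals / totals.sum() * 100).sort_values(ascending=False)

# one "city percent" line per city, always with two decimal places
lines = [f"{city} {pct:.2f}" for city, pct in percent.items()]
print("\n".join(lines))
```

Formatting with `:.2f` keeps trailing zeros, so Mumbai prints as 13.90 rather than the 13.9 shown in the DataFrame display.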
