Question 3)
    Given File 'startup_funding.csv'
    Problem Statement :
    Find out if cities play any role in receiving funding.
    Find the top 10 Indian cities by total funding received. Find the percentage of funding each city received (among the top 10 Indian cities only).
    Print each city and its percentage, rounded to 2 decimal places.
    Note:
    Take city name "Delhi" as "New Delhi".
    Check the case sensitivity of city names as well: in some places "bangalore" is given instead of "Bangalore". Take the city name as "Bangalore".
    For a few startups, multiple locations are given, one Indian and one foreign. Count those startups as Indian startups too. The Indian city name comes first.
    Print the city in descending order with respect to the percentage of funding.
    Output Format :

    city1 percent1
    city2 percent2
    city3 percent3
    . . . 
    . . .
    . . .


### Importing the Libraries

In [2]:
import numpy as np  # For n-dimensional arrays and numerical operations
import pandas as pd # For reading CSV files and creating DataFrames
import matplotlib.pyplot as plt # For plotting graphs

### Reading the CSV File and Checking Its Contents

In [7]:
# Reading the CSV file into a DataFrame
dataset = pd.read_csv('startup_funding.csv', skipinitialspace = True, encoding = 'utf-8')
df = dataset.copy() # Working on a copy so the original DataFrame stays intact if we need to start over
df.head(10) # Showing the first 10 rows of df

Unnamed: 0,SNo,Date,StartupName,IndustryVertical,SubVertical,CityLocation,InvestorsName,InvestmentType,AmountInUSD,Remarks
0,0,01/08/2017,TouchKin,Technology,Predictive Care Platform,Bangalore,Kae Capital,Private Equity,1300000.0,
1,1,02/08/2017,Ethinos,Technology,Digital Marketing Agency,Mumbai,Triton Investment Advisors,Private Equity,,
2,2,02/08/2017,Leverage Edu,Consumer Internet,Online platform for Higher Education Services,New Delhi,"Kashyap Deorah, Anand Sankeshwar, Deepak Jain,...",Seed Funding,,
3,3,02/08/2017,Zepo,Consumer Internet,DIY Ecommerce platform,Mumbai,"Kunal Shah, LetsVenture, Anupam Mittal, Hetal ...",Seed Funding,500000.0,
4,4,02/08/2017,Click2Clinic,Consumer Internet,healthcare service aggregator,Hyderabad,"Narottam Thudi, Shireesh Palle",Seed Funding,850000.0,
5,5,01/07/2017,Billion Loans,Consumer Internet,Peer to Peer Lending platform,Bangalore,Reliance Corporate Advisory Services Ltd,Seed Funding,1000000.0,
6,6,03/07/2017,Ecolibriumenergy,Technology,Energy management solutions provider,Ahmedabad,"Infuse Ventures, JLL",Private Equity,2600000.0,
7,7,04/07/2017,Droom,eCommerce,Online marketplace for automobiles,Gurgaon,"Asset Management (Asia) Ltd, Digital Garage Inc",Private Equity,20000000.0,
8,8,05/07/2017,Jumbotail,eCommerce,online marketplace for food and grocery,Bangalore,"Kalaari Capital, Nexus India Capital Advisors",Private Equity,8500000.0,
9,9,05/07/2017,Moglix,eCommerce,B2B marketplace for Industrial products,Noida,"International Finance Corporation, Rocketship,...",Private Equity,12000000.0,


### Solution for given problem
To find the top 10 cities by funding, we use the CityLocation and AmountInUSD columns of the dataset.
For each city, we sum the funding amounts of all rows that list that city.
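
The city-name rules from the problem statement (first city before `/`, "Delhi" as "New Delhi", case-insensitive "Bangalore") can be sketched as one small helper. This is a minimal sketch, not the notebook's own method: `normalize_city` is a hypothetical name, and trimming surrounding whitespace is an added assumption that the problem statement does not mention.

```python
def normalize_city(raw):
    """Return a canonical city name for one CityLocation cell (hypothetical helper)."""
    # Missing values arrive as NaN (a float), not a string; treat them as "no city".
    if not isinstance(raw, str):
        return ''
    # Keep only the first (Indian) city and drop surrounding whitespace (assumption).
    city = raw.split('/')[0].strip()
    # Compare case-insensitively so 'bangalore' and 'Bangalore' are the same city.
    lowered = city.lower()
    if lowered == 'delhi':
        return 'New Delhi'
    if lowered == 'bangalore':
        return 'Bangalore'
    return city


print(normalize_city('bangalore / SFO'))  # Bangalore
print(normalize_city('Delhi'))            # New Delhi
print(normalize_city('Pune '))            # Pune
```

In a notebook, such a helper could be applied to the whole column with `df['CityLocation'].map(normalize_city)`.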

In [46]:
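# Replacing missing amounts with the string '0' so that every entry can be parsed below
df['AmountInUSD'] = df['AmountInUSD'].replace(np.nan, '0')
# Replacing missing city names with an empty string so that those rows can be skipped below
df['CityLocation'] = df['CityLocation'].replace(np.nan, '')
# Normalising the two spellings named in the problem statement (whole-cell matches only)
df['CityLocation'] = df['CityLocation'].replace('bangalore', 'Bangalore')
df['CityLocation'] = df['CityLocation'].replace('Delhi', 'New Delhi')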

cityLocation = df['CityLocation'] # The CityLocation column of df
amountInUSD = df['AmountInUSD'] # The AmountInUSD column of df

cityWithAmount = {} # Dictionary with city as key and total funding amount as value
for index in range(len(cityLocation)):
    # A row may list several cities separated by '/'; we keep only the first (Indian) one
    city = cityLocation[index].split('/')[0]
    # Removing thousands separators such as '1,300,000' before parsing
    amount = str(amountInUSD[index]).replace(',','')
    if city != '':
        # Adding the amount to the corresponding city; float() also accepts values like '1300000.0'
        cityWithAmount[city] = cityWithAmount.get(city,0) + int(float(amount))

allCities = list(cityWithAmount.keys()) # Getting each city from cityWithAmount
amount = list(cityWithAmount.values()) # Getting the amount for each city from cityWithAmount

# The zip function pairs up the elements of two iterables position by position.
# For example, if a = [1,2,3] and b = [4,5,6],
# then list(zip(a,b)) = [(1,4),(2,5),(3,6)]
cityWithAmount = list(zip(allCities,amount))

# Sorting the list as per amount in descending order
cityWithAmount.sort(reverse = True, key = lambda a : a[1])
cityWithAmount = np.array(cityWithAmount) # Converting into a 2D NumPy array (all entries become strings)

city = cityWithAmount[:,0] # Separating out the city names from the sorted array
amount = np.array(cityWithAmount[:,1], dtype = int) # Separating out the amounts and converting them back to integers
totalamount = np.sum(amount) # Total funding across all cities

# Printing the top 10 cities by funding; each percentage is the city's share of the total across all cities
for index in range(10):
    perc = amount[index]/totalamount*100
    print(city[index],'{:.2f}'.format(perc))

Bangalore 49.16
New Delhi 16.11
Mumbai 13.73
Gurgaon 12.11
Chennai 2.41
Pune 1.74
Hyderabad 1.14
Noida 1.00
Ahmedabad 0.58
Pune  0.41
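
Two details of the output above stand out. "Pune" appears twice, the second time followed by an extra space, which suggests that a whitespace variant of the name was counted as a separate city. The ten percentages also sum to about 98.39, so they are shares of the funding across all cities rather than of the top 10 only. The following is a minimal sketch of one way to handle both points, starting from a dictionary of per-city totals. The function name `top_n_shares` is hypothetical, and the sample amounts are made up for illustration; they are not taken from `startup_funding.csv`.

```python
def top_n_shares(city_totals, n=10):
    """Return (city, percentage) pairs for the n best-funded cities.

    Percentages are shares of the combined funding of those n cities only.
    """
    # Merge keys that differ only by surrounding whitespace, e.g. 'Pune' and 'Pune '.
    merged = {}
    for city, amount in city_totals.items():
        key = city.strip()
        merged[key] = merged.get(key, 0) + amount

    # Sort by amount in descending order and keep the first n entries.
    top = sorted(merged.items(), key=lambda item: item[1], reverse=True)[:n]

    # The denominator is the sum over the selected cities, not over all cities.
    total = sum(amount for _, amount in top)
    if total == 0:
        return [(city, 0.0) for city, _ in top]
    return [(city, round(amount / total * 100, 2)) for city, amount in top]


# Made-up totals for illustration only.
sample_totals = {'Bangalore': 600, 'Mumbai': 300, 'Mumbai ': 100, 'Pune': 50}
for city, share in top_n_shares(sample_totals, n=2):
    print(city, '{:.2f}'.format(share))
# Bangalore 60.00
# Mumbai 40.00
```

In the notebook, the `cityWithAmount` dictionary built in the loop could be passed to such a function before it is converted to a list.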
