# Indian Startup Analysis

India's startup ecosystem is thriving, with successful companies across many sectors. Prominent examples include Flipkart, Ola, Swiggy, Zomato, and Nykaa, all of which have reached significant scale. The country also has a growing number of unicorns (startups valued at over $1 billion), such as Zepto, Ather Energy, and Groww. Initiatives like Startup India actively support the growth of startups across the nation.

Problem Statement:
- What sectors are booming?

- Which cities produce the most startups?

- What's the funding trend over time?

- Who are the top investors in the Indian market?

Libraries

In [86]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

In [118]:
# Load the raw funding data (read_csv already returns a DataFrame)
df = pd.read_csv('startup_funding.csv')


Data Cleaning


In [119]:
# Rename the date column to 'Date'
df.rename(columns={'Date dd/mm/yyyy': 'Date'}, inplace=True)
# Parse dates as day-first; values that cannot be parsed become NaT
df['Date'] = pd.to_datetime(df['Date'], dayfirst=True, errors='coerce')
df

Unnamed: 0,Sr No,Date,Startup Name,Industry Vertical,SubVertical,City Location,Investors Name,InvestmentnType,Amount in USD,Remarks
0,1,2020-01-09,BYJU’S,E-Tech,E-learning,Bengaluru,Tiger Global Management,Private Equity Round,200000000,
1,2,2020-01-13,Shuttl,Transportation,App based shuttle service,Gurgaon,Susquehanna Growth Equity,Series C,8048394,
2,3,2020-01-09,Mamaearth,E-commerce,Retailer of baby and toddler products,Bengaluru,Sequoia Capital India,Series B,18358860,
3,4,2020-01-02,https://www.wealthbucket.in/,FinTech,Online Investment,New Delhi,Vinod Khatumal,Pre-series A,3000000,
4,5,2020-01-02,Fashor,Fashion and Apparel,Embroiled Clothes For Women,Mumbai,Sprout Venture Partners,Seed Round,1800000,
...,...,...,...,...,...,...,...,...,...,...
3039,3040,2015-01-29,Printvenue,,,,Asia Pacific Internet Group,Private Equity,4500000,
3040,3041,2015-01-29,Graphene,,,,KARSEMVEN Fund,Private Equity,825000,Govt backed VC Fund
3041,3042,2015-01-30,Mad Street Den,,,,"Exfinity Fund, GrowX Ventures.",Private Equity,1500000,
3042,3043,2015-01-30,Simplotel,,,,MakeMyTrip,Private Equity,,"Strategic Funding, Minority stake"
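The effect of `dayfirst=True` and `errors='coerce'` in the date conversion above can be checked on a few values. This is a minimal sketch: the strings in `raw_dates` are made up for illustration and are not taken from `startup_funding.csv`.

```python
import pandas as pd

# Toy values standing in for the raw 'Date dd/mm/yyyy' column (assumed, not from the dataset)
raw_dates = pd.Series(["09/01/2020", "13/01/2020", "not a date"])

# Same call as in the cleaning cell
parsed = pd.to_datetime(raw_dates, dayfirst=True, errors="coerce")

# dayfirst=True reads 09/01/2020 as 9 January rather than 1 September
print(parsed[0].day, parsed[0].month)

# Entries that could not be parsed become NaT; counting them shows
# how many rows a later dropna(subset=['Date']) would remove
print(int(parsed.isna().sum()))
```

Counting the `NaT` values before dropping rows makes it visible how much data the date cleaning discards.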


In [126]:
# Remove thousands separators, then convert Amount to a numeric dtype;
# values that are still not numeric become NaN
df['Amount in USD'] = df['Amount in USD'].astype(str).str.replace(',', '', regex=False)
df['Amount in USD'] = pd.to_numeric(df['Amount in USD'], errors='coerce')
df

Unnamed: 0,Sr No,Date,Startup Name,Industry Vertical,SubVertical,City Location,Investors Name,InvestmentnType,Amount in USD
0,1,2020-01-09,BYJU’S,E-Tech,E-learning,Bengaluru,Tiger Global Management,Private Equity Round,200000000.0
1,2,2020-01-13,Shuttl,Transportation,App based shuttle service,Gurgaon,Susquehanna Growth Equity,Series C,8048394.0
2,3,2020-01-09,Mamaearth,E-commerce,Retailer of baby and toddler products,Bengaluru,Sequoia Capital India,Series B,18358860.0
3,4,2020-01-02,https://www.wealthbucket.in/,FinTech,Online Investment,New Delhi,Vinod Khatumal,Pre-series A,3000000.0
4,5,2020-01-02,Fashor,Fashion and Apparel,Embroiled Clothes For Women,Mumbai,Sprout Venture Partners,Seed Round,1800000.0
...,...,...,...,...,...,...,...,...,...
3039,3040,2015-01-29,Printvenue,,,,Asia Pacific Internet Group,Private Equity,4500000.0
3040,3041,2015-01-29,Graphene,,,,KARSEMVEN Fund,Private Equity,825000.0
3041,3042,2015-01-30,Mad Street Den,,,,"Exfinity Fund, GrowX Ventures.",Private Equity,1500000.0
3042,3043,2015-01-30,Simplotel,,,,MakeMyTrip,Private Equity,
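The two-step amount conversion above can be illustrated on a small series. The values in `raw_amounts` are assumptions chosen to show the typical cases (comma separators, a plain number, free text, a missing entry), not rows from the dataset.

```python
import pandas as pd

# Toy values standing in for the raw 'Amount in USD' column (assumed)
raw_amounts = pd.Series(["20,00,00,000", "8048394", "undisclosed", None])

# Step 1: cast to string and strip the comma separators
cleaned = raw_amounts.astype(str).str.replace(",", "", regex=False)

# Step 2: convert to numbers; anything non-numeric is coerced to NaN
amounts = pd.to_numeric(cleaned, errors="coerce")

print(amounts)
print("missing after conversion:", int(amounts.isna().sum()))
```

Both the free-text entry and the missing entry end up as `NaN`, so they are handled together by whatever imputation step follows.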


In [120]:
# Removing Remarks Column
df.drop(columns='Remarks',inplace=True)

In [158]:
# Drop rows whose date could not be parsed, then count the remaining missing values
df.dropna(subset=['Date'], inplace=True)
df.isnull().sum()


Sr No              0
Date               0
Startup Name       0
City  Location     0
Investors Name     0
InvestmentnType    0
Amount in USD      0
Industry           0
dtype: int64
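The summary above lists an `Industry` column in place of `Industry Vertical` and `SubVertical`, but the cell that builds it is not shown. Judging by values such as `E-learning E-Tech` in the later output, one plausible reconstruction joins the sub-vertical and the vertical with a space. The sketch below is an assumption about that step, run on a toy frame named `sample`, and is not the notebook's original code.

```python
import pandas as pd

# Toy rows with the two original columns (values chosen for illustration)
sample = pd.DataFrame({
    "Industry Vertical": ["E-Tech", "Transportation", None],
    "SubVertical": ["E-learning", "App based shuttle service",
                    "Startup Analytics platform"],
})

# Join sub-vertical and vertical; empty strings stand in for missing parts,
# and strip() removes the leftover space when one side is missing
sample["Industry"] = (
    sample["SubVertical"].fillna("") + " " + sample["Industry Vertical"].fillna("")
).str.strip()

# The two source columns are no longer needed
sample = sample.drop(columns=["Industry Vertical", "SubVertical"])
print(sample)
```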

In [157]:
# Fill missing amounts with the column mean
mean_val = df['Amount in USD'].mean()
df['Amount in USD'] = df['Amount in USD'].fillna(mean_val)
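Funding amounts tend to be heavily skewed by a few very large rounds, so the mean can sit far above a typical deal. A possible alternative, shown here only as a sketch on made-up numbers, is to fill with the median instead. The notebook itself uses the mean.

```python
import pandas as pd

# Toy amounts with one very large round and one missing value (assumed)
amounts = pd.Series([400_000, 500_000, 1_800_000, 3_000_000, 200_000_000, None])

mean_val = amounts.mean()      # pulled upward by the 200M round
median_val = amounts.median()  # the middle value of the known amounts

filled_mean = amounts.fillna(mean_val)
filled_median = amounts.fillna(median_val)

print("mean:", mean_val, "median:", median_val)
```

Whichever statistic is used, the filled rows are estimates, so totals and rankings that depend on them should be read with that in mind.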


In [156]:
print(df['Amount in USD'].dtype)
df

float64


Unnamed: 0,Sr No,Date,Startup Name,City Location,Investors Name,InvestmentnType,Amount in USD,Industry
0,1,2020-01-09,BYJU’S,Bengaluru,Tiger Global Management,Private Equity Round,2.000000e+08,E-learning E-Tech
1,2,2020-01-13,Shuttl,Gurgaon,Susquehanna Growth Equity,Series C,8.048394e+06,App based shuttle service Transportation
2,3,2020-01-09,Mamaearth,Bengaluru,Sequoia Capital India,Series B,1.835886e+07,Retailer of baby and toddler products E-commerce
3,4,2020-01-02,https://www.wealthbucket.in/,New Delhi,Vinod Khatumal,Pre-series A,3.000000e+06,Online Investment FinTech
4,5,2020-01-02,Fashor,Mumbai,Sprout Venture Partners,Seed Round,1.800000e+06,Embroiled Clothes For Women Fashion and Apparel
...,...,...,...,...,...,...,...,...
2868,2869,2015-04-29,Tracxn,Bangalore,SAIF Partners,Private Equity,3.500000e+06,Startup Analytics platform
2869,2870,2015-04-29,Dazo,Bangalore,"Sumit Jain, Aprameya Radhakrishna, Alok Goel, ...",Seed Funding,1.835602e+07,Mobile Food Ordering app
2870,2871,2015-04-29,Tradelab,Bangalore,Rainmatter,Seed Funding,4.000000e+05,Financial Markets Software
2871,2872,2015-04-29,PiQube,Chennai,The HR Fund,Seed Funding,5.000000e+05,Hiring Analytics platform



# Cleaned Data

In [160]:
df.to_excel('Cleaned_DATA.xlsx',index=False)

In [161]:
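import re
import pandas as pd

# Load the cleaned Excel file
df = pd.read_excel("Cleaned_DATA.xlsx")

# Common sectors, checked in order; the first match wins
common_sectors = [
    "E-commerce", "Fintech", "Edtech", "Healthtech", "Logistics", "Transportation",
    "Hospitality", "Fashion", "Foodtech", "SaaS", "AI", "E-learning", "Agritech",
    "Technology", "E-Tech", "Aerospace", "Retail", "Automobile", "Communication"
]

def make_pattern(sector):
    # Match at the start of a word so "Retail" still matches "Retailer".
    # Short names such as "AI" must match a whole word; a plain substring
    # test would find "ai" inside "Retail" or "Training".
    tail = r"(?!\w)" if len(sector) <= 4 else ""
    return re.compile(rf"(?<!\w){re.escape(sector)}{tail}", re.IGNORECASE)

sector_patterns = [(sector, make_pattern(sector)) for sector in common_sectors]

# Return the first sector found in the industry text, or "Other"
def extract_sector(industry_text):
    if pd.isna(industry_text):
        return "Other"
    for sector, pattern in sector_patterns:
        if pattern.search(str(industry_text)):
            return sector
    return "Other"

# Apply the function to every row
df['Sector'] = df['Industry'].apply(extract_sector)

# Save the updated file
df.to_excel("Cleaned_DATA_with_Sector.xlsx", index=False)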
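With a `Sector` column in place, the four questions in the problem statement map onto simple aggregations. The sketch below runs on a toy frame named `sample`, so the rows, the investor names, and the `city_aliases` map are all hypothetical. The column names follow the table header above and may need adjusting, since the null-count summary prints the city column as `City  Location` with two spaces.

```python
import pandas as pd

# Toy rows shaped like the cleaned data (all values are made up)
sample = pd.DataFrame({
    "Date": pd.to_datetime(["2019-03-01", "2019-07-15", "2020-01-09", "2020-01-13"]),
    "Startup Name": ["Alpha", "Beta", "Gamma", "Alpha"],
    "City Location": ["Bangalore", "Mumbai", "Bengaluru", "Bangalore"],
    "Investors Name": ["A Capital, B Ventures", "A Capital", "C Fund", "B Ventures"],
    "Amount in USD": [1_000_000, 2_000_000, 5_000_000, 3_000_000],
    "Sector": ["Fintech", "E-commerce", "Fintech", "Fintech"],
})

# Merge spelling variants of the same city before counting (hypothetical alias map)
city_aliases = {"Bangalore": "Bengaluru"}
sample["City"] = sample["City Location"].replace(city_aliases)

# Which sectors attract the most funding?
funding_by_sector = (
    sample.groupby("Sector")["Amount in USD"].sum().sort_values(ascending=False)
)

# Which cities have the most distinct funded startups?
startups_by_city = (
    sample.groupby("City")["Startup Name"].nunique().sort_values(ascending=False)
)

# How does total funding change from year to year?
funding_by_year = sample.groupby(sample["Date"].dt.year)["Amount in USD"].sum()

# Who takes part in the most deals? Investors are comma-separated,
# so split and explode into one row per investor before counting
investors = sample["Investors Name"].str.split(",").explode().str.strip()
deals_by_investor = investors.value_counts()

print(funding_by_sector, startups_by_city, funding_by_year, deals_by_investor, sep="\n\n")
```

Counting distinct startup names rather than rows avoids counting a company once per funding round, and the alias map keeps `Bangalore` and `Bengaluru`, both of which appear in the output above, from being treated as separate cities.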
