# **Analyzing Indian Startup Funding Trends (2018-2021) with CRISP-DM**

# **Description:**

This project analyzes funding trends in the Indian startup ecosystem from 2018 to 2021, following the CRISP-DM methodology. Using one funding dataset per year, we look for patterns and fluctuations in the funding landscape over this period. By examining funding amounts, types, sectors, and geographical distributions, we aim to provide actionable recommendations for stakeholders navigating the funding environment.

# **Hypothesis**

# **Null Hypothesis (H₀)**

The funding landscape of Indian startups from 2018 to 2021 is not significantly influenced by economic conditions, industry sector performance, and investor sentiments.

# **Alternative Hypothesis (H₁)**

The funding landscape of Indian startups from 2018 to 2021 is significantly influenced by economic conditions, industry sector performance, and investor sentiments.

**Business Questions**
1. What are the overall trends in funding amounts received by startups in India from 2018 to 2021, and how do they vary across different years and sectors?
2. Which funding stages (e.g., seed, Series A, Series B) have been most prevalent among Indian startups during this period, and have there been any notable shifts in stage preferences over time?
3. Are there specific industry sectors that have consistently attracted higher levels of funding, and what are the driving factors behind their attractiveness to investors?
4. How does the geographical distribution of funding vary across different regions of India, and are there any emerging startup hubs or regions experiencing significant growth in funding activity?
5. What role do external factors such as global economic trends, government policies, and technological advancements play in shaping funding trends within the Indian startup ecosystem during this timeframe?

In [None]:
import pyodbc
print("pyodbc is installed and imported successfully")

from dotenv import dotenv_values  # reads key/value pairs from a .env file
import pandas as pd
import warnings

warnings.filterwarnings('ignore')

# Load environment variables from .env file into a dictionary
environment_variables = dotenv_values('.env')

# Get the values for the credentials you set in the '.env' file
server = environment_variables.get("SERVER_NAME")
database = environment_variables.get("DATABASE_NAME")
login = environment_variables.get("LOGIN")
password = environment_variables.get("PASSWORD")




pyodbc is installed and imported successfully
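
Because `.get()` returns `None` for any key that is absent from the `.env` file, a missing entry would silently end up as the text `None` in the connection string. A minimal sketch of a pre-flight check follows; the helper name `missing_credentials` and the example values are hypothetical, and only the four key names come from the cell above.

```python
from typing import Mapping, Optional

# Key names taken from the cell above
REQUIRED_KEYS = ("SERVER_NAME", "DATABASE_NAME", "LOGIN", "PASSWORD")


def missing_credentials(env: Mapping[str, Optional[str]]) -> list:
    """Return the required keys that are absent or empty in `env`."""
    return [key for key in REQUIRED_KEYS if not env.get(key)]


# Hypothetical values standing in for the result of dotenv_values('.env')
example_env = {
    "SERVER_NAME": "example-server",
    "DATABASE_NAME": "example-db",
    "LOGIN": "",  # present but empty
}

missing = missing_credentials(example_env)
if missing:
    print("Missing .env entries:", ", ".join(missing))
```

In the notebook, the same check could be run on `environment_variables` before the connection string is built.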


In [None]:
# Create a connection string
connection_string = f"DRIVER={{SQL Server}};SERVER={server};DATABASE={database};UID={login};PWD={password};MARS_Connection=yes;MinProtocolVersion=TLSv1.2;"
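
The same string can also be assembled by a small function, which makes it easier to reuse and to check without opening a connection. This is only a sketch: the function name `build_connection_string` and the placeholder arguments are assumptions, while the keywords and their order mirror the f-string above.

```python
def build_connection_string(server, database, login, password, driver="SQL Server"):
    """Assemble an ODBC connection string with the same keywords as the cell above."""
    parts = {
        "DRIVER": "{" + driver + "}",  # the driver name is wrapped in braces
        "SERVER": server,
        "DATABASE": database,
        "UID": login,
        "PWD": password,
        "MARS_Connection": "yes",
        "MinProtocolVersion": "TLSv1.2",
    }
    # Each keyword=value pair is terminated by a semicolon
    return "".join(f"{key}={value};" for key, value in parts.items())


# Placeholder credentials, for illustration only
example = build_connection_string("my-server", "my-db", "my-user", "my-password")
print(example)
```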

In [None]:
connection = pyodbc.connect(connection_string)

In [None]:
query = "SELECT * FROM dbo.LP1_startup_funding2020"

data_1 = pd.read_sql(query, connection)

In [None]:
data_1.head()

,Company_Brand,Founded,HeadQuarter,Sector,What_it_does,Founders,Investor,Amount,Stage,column10
0,Aqgromalin,2019.0,Chennai,AgriTech,Cultivating Ideas for Profit,"Prasanna Manogaran, Bharani C L",Angel investors,200000.0,,
1,Krayonnz,2019.0,Bangalore,EdTech,An academy-guardian-scholar centric ecosystem ...,"Saurabh Dixit, Gurudutt Upadhyay",GSF Accelerator,100000.0,Pre-seed,
2,PadCare Labs,2018.0,Pune,Hygiene management,Converting bio-hazardous waste to harmless waste,Ajinkya Dhariya,Venture Center,,Pre-seed,
3,NCOME,2020.0,New Delhi,Escrow,Escrow-as-a-service platform,Ritesh Tiwari,"Venture Catalysts, PointOne Capital",400000.0,,
4,Gramophone,2016.0,Indore,AgriTech,Gramophone is an AgTech platform enabling acce...,"Ashish Rajan Singh, Harshit Gupta, Nishant Mah...","Siana Capital Management, Info Edge",340000.0,,


In [None]:
data_1.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1055 entries, 0 to 1054
Data columns (total 10 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   Company_Brand  1055 non-null   object 
 1   Founded        842 non-null    float64
 2   HeadQuarter    961 non-null    object 
 3   Sector         1042 non-null   object 
 4   What_it_does   1055 non-null   object 
 5   Founders       1043 non-null   object 
 6   Investor       1017 non-null   object 
 7   Amount         801 non-null    float64
 8   Stage          591 non-null    object 
 9   column10       2 non-null      object 
dtypes: float64(2), object(8)
memory usage: 82.5+ KB
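
The non-null counts above imply sizeable gaps: `Amount` is missing for 254 of the 1,055 rows (about 24%), `Stage` for 464 (about 44%), and `column10` is almost entirely empty. A small helper can tabulate this for any of the yearly tables. The sketch below is illustrative; the function name `missing_summary` and the toy frame are assumptions, not part of the original notebook.

```python
import pandas as pd


def missing_summary(frame: pd.DataFrame) -> pd.DataFrame:
    """Return the missing-value count and percentage per column, worst first."""
    missing = frame.isna().sum()
    summary = pd.DataFrame({
        "missing": missing,
        "missing_pct": (missing / len(frame) * 100).round(1),
    })
    return summary.sort_values("missing", ascending=False)


# Toy frame standing in for data_1
toy = pd.DataFrame({
    "Company_Brand": ["A", "B", "C", "D"],
    "Amount": [200000.0, None, 400000.0, None],
    "Stage": [None, "Pre-seed", None, None],
})

summary = missing_summary(toy)
print(summary)
```

Calling `missing_summary(data_1)` in the notebook would give the same view for the 2020 table.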


In [None]:
data_1.shape

(1055, 10)

In [None]:
query = "SELECT * FROM dbo.LP1_startup_funding2021"

data_2 = pd.read_sql(query, connection)

In [None]:
data_2.head()

,Company_Brand,Founded,HeadQuarter,Sector,What_it_does,Founders,Investor,Amount,Stage
0,Unbox Robotics,2019.0,Bangalore,AI startup,Unbox Robotics builds on-demand AI-driven ware...,"Pramod Ghadge, Shahid Memon","BEENEXT, Entrepreneur First","$1,200,000",Pre-series A
1,upGrad,2015.0,Mumbai,EdTech,UpGrad is an online higher education platform.,"Mayank Kumar, Phalgun Kompalli, Ravijot Chugh,...","Unilazer Ventures, IIFL Asset Management","$120,000,000",
2,Lead School,2012.0,Mumbai,EdTech,LEAD School offers technology based school tra...,"Smita Deorah, Sumeet Mehta","GSV Ventures, Westbridge Capital","$30,000,000",Series D
3,Bizongo,2015.0,Mumbai,B2B E-commerce,Bizongo is a business-to-business online marke...,"Aniket Deb, Ankit Tomar, Sachin Agrawal","CDC Group, IDG Capital","$51,000,000",Series C
4,FypMoney,2021.0,Gurugram,FinTech,"FypMoney is Digital NEO Bank for Teenagers, em...",Kapil Banwari,"Liberatha Kallat, Mukesh Yadav, Dinesh Nagpal","$2,000,000",Seed


In [None]:
data_2.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1209 entries, 0 to 1208
Data columns (total 9 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   Company_Brand  1209 non-null   object 
 1   Founded        1208 non-null   float64
 2   HeadQuarter    1208 non-null   object 
 3   Sector         1209 non-null   object 
 4   What_it_does   1209 non-null   object 
 5   Founders       1205 non-null   object 
 6   Investor       1147 non-null   object 
 7   Amount         1206 non-null   object 
 8   Stage          781 non-null    object 
dtypes: float64(1), object(8)
memory usage: 85.1+ KB
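
Note that `Amount` is stored as `float64` in the 2020 table but as `object` in the 2021 table, where values appear as strings such as `$1,200,000`. Before the two years can be compared, the strings need to be converted to numbers. The following is one possible sketch, not the notebook's own cleaning step: the helper name `parse_usd_amount` is hypothetical, and treating anything that is not a plain dollar figure (for example a rupee amount or free text) as missing is an assumption.

```python
import math
import re
from typing import Optional

# Optional "$", then digits with optional thousands separators and decimals
_USD_PATTERN = re.compile(r"\$?\s*(\d[\d,]*(?:\.\d+)?)")


def parse_usd_amount(raw) -> Optional[float]:
    """Convert values like '$1,200,000' to 1200000.0; return None if not parseable."""
    if raw is None:
        return None
    if isinstance(raw, (int, float)):
        return None if math.isnan(raw) else float(raw)
    match = _USD_PATTERN.fullmatch(str(raw).strip())
    if match is None:
        return None  # assumption: non-dollar or free-text values are treated as missing
    return float(match.group(1).replace(",", ""))


print(parse_usd_amount("$1,200,000"))
print(parse_usd_amount("Undisclosed"))  # hypothetical free-text value
```

In the notebook this could be applied with `data_2["Amount"].map(parse_usd_amount)`; amounts in other currencies would need an explicit conversion rule, which is not attempted here.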


In [None]:
data_2.shape

(1209, 9)

In [None]:
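# ignore_index=True gives the combined frame a fresh 0..n-1 index instead of repeating each table's own
df = pd.concat([data_1, data_2], ignore_index=True)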

df.shape

(2264, 10)
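
The combined frame has 2,264 rows (1,055 + 1,209) and 10 columns, the tenth being `column10`, which exists only in the 2020 table. Once the tables are stacked, nothing records which year a row came from, so it may help to tag each table first. A minimal sketch, assuming a new column named `Funding_Year` (not in the original data) and small toy frames in place of `data_1` and `data_2`:

```python
import pandas as pd


def combine_years(frames_by_year: dict) -> pd.DataFrame:
    """Tag each yearly frame with its year, then stack them with a fresh index."""
    tagged = [
        frame.assign(Funding_Year=year)  # Funding_Year is a hypothetical column name
        for year, frame in frames_by_year.items()
    ]
    return pd.concat(tagged, ignore_index=True, sort=False)


# Toy frames; the column10 value is a placeholder, not real data
toy_2020 = pd.DataFrame({
    "Company_Brand": ["Aqgromalin", "Krayonnz"],
    "Amount": [200000.0, 100000.0],
    "column10": ["placeholder", None],
})
toy_2021 = pd.DataFrame({
    "Company_Brand": ["Unbox Robotics"],
    "Amount": ["$1,200,000"],
})

combined = combine_years({2020: toy_2020, 2021: toy_2021})
print(combined)
```

Columns missing from one year, such as `column10` here, are filled with missing values for that year's rows.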

In [None]:
# Load the 2018 and 2019 CSV files into DataFrames (raw strings avoid escaping each backslash)
df_data_3 = pd.read_csv(r'C:\Users\hp\Desktop\LP1\Datasets\Indian-Startup-Funding-Analysis\startup_funding2018.csv')
df_data_4 = pd.read_csv(r'C:\Users\hp\Desktop\LP1\Datasets\Indian-Startup-Funding-Analysis\startup_funding2019.csv')

# Display the first 10 rows of the 2018 data
print("Data 3:")
print(df_data_3.head(10))



Data 3:
         Company Name                                           Industry  \
0     TheCollegeFever  Brand Marketing, Event Promotion, Marketing, S...   
1     Happy Cow Dairy                               Agriculture, Farming   
2          MyLoanCare   Credit, Financial Services, Lending, Marketplace   
3         PayMe India                        Financial Services, FinTech   
4            Eunimart                 E-Commerce Platforms, Retail, SaaS   
5              Hasura                   Cloud Infrastructure, PaaS, SaaS   
6           Tripshelf                     Internet, Leisure, Marketplace   
7        Hyperdata.IO                                    Market Research   
8        Freightwalla       Information Services, Information Technology   
9  Microchip Payments                                    Mobile Payments   

  Round/Series       Amount                          Location  \
0         Seed       250000       Bangalore, Karnataka, India   
1         Seed  ₹40,000,0

In [None]:
print("\nData 4:")
print(df_data_4.head(10))


Data 4:
    Company/Brand  Founded HeadQuarter           Sector  \
0  Bombay Shaving      NaN         NaN        Ecommerce   
1       Ruangguru   2014.0      Mumbai           Edtech   
2        Eduisfun      NaN      Mumbai           Edtech   
3        HomeLane   2014.0     Chennai  Interior design   
4        Nu Genes   2004.0   Telangana         AgriTech   
5        FlytBase      NaN        Pune       Technology   
6           Finly      NaN   Bangalore             SaaS   
7        Kratikal   2013.0       Noida       Technology   
8       Quantiphi      NaN         NaN        AI & Tech   
9        Lenskart   2010.0       Delhi       E-commerce   

                                        What it does  \
0         Provides a range of male grooming products   
1  A learning platform that provides topic-based ...   
2            It aims to make learning fun via games.   
3              Provides interior designing solutions   
4  It is a seed company engaged in production, pr...   
5    
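
The two CSV files use different column names from the SQL tables: the 2018 file has `Company Name`, `Industry`, `Round/Series` and `Location`, and the 2019 file has `Company/Brand` and `What it does`. One way to align them with the 2020 and 2021 schema is a rename map per file. The sketch below covers only the columns visible in the output above; the mappings, the helper name `harmonize_columns`, and the choice to keep only the city from `Location` are assumptions rather than the notebook's own cleaning steps.

```python
import pandas as pd

# Proposed mappings onto the 2020/2021 column names (assumptions)
COLUMN_MAP_2018 = {
    "Company Name": "Company_Brand",
    "Industry": "Sector",
    "Round/Series": "Stage",
    "Location": "HeadQuarter",
}
COLUMN_MAP_2019 = {
    "Company/Brand": "Company_Brand",
    "What it does": "What_it_does",
}


def harmonize_columns(frame: pd.DataFrame, column_map: dict) -> pd.DataFrame:
    """Rename columns to the shared schema; names not in the map are left as-is."""
    return frame.rename(columns=column_map)


# One-row samples built from values shown in the outputs above
sample_2018 = pd.DataFrame({
    "Company Name": ["TheCollegeFever"],
    "Round/Series": ["Seed"],
    "Amount": ["250000"],
    "Location": ["Bangalore, Karnataka, India"],
})
sample_2019 = pd.DataFrame({
    "Company/Brand": ["Ruangguru"],
    "Founded": [2014.0],
    "HeadQuarter": ["Mumbai"],
    "Sector": ["Edtech"],
})

clean_2018 = harmonize_columns(sample_2018, COLUMN_MAP_2018)
# Assumption: keep only the city, the first comma-separated part of Location
clean_2018["HeadQuarter"] = clean_2018["HeadQuarter"].str.split(",").str[0].str.strip()
clean_2019 = harmonize_columns(sample_2019, COLUMN_MAP_2019)

print(list(clean_2018.columns))
print(list(clean_2019.columns))
```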