# INDIAN START-UP ECOSYSTEM PROJECT

### Importing the required libraries

First, I import all the libraries the project needs.

In [1]:
import pyodbc  # installed beforehand with pip
from dotenv import dotenv_values  # reads the key-value pairs of a .env file into a dictionary
import pandas as pd
import warnings
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

warnings.filterwarnings('ignore')

### Initializing environment variables

One of the data sources I'll be working with is a database, and connecting to it requires credentials (username, password, etc.) that must not be shared publicly. I therefore keep them in a .env file as environment variables. These variables are used to build the connection string that connects to the database.

In [2]:
# Load environment variables from .env file into a dictionary
environment_variables = dotenv_values('.env')

# Get the values for the credentials set in the '.env' file
database = environment_variables.get("DATABASE")
server = environment_variables.get("SERVER")
login = environment_variables.get("LOGIN")
password = environment_variables.get("PASSWORD")

# Build the connection string for the database
connection_string = f"DRIVER={{SQL Server}};SERVER={server};DATABASE={database};UID={login};PWD={password}"
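
If a key is missing from the .env file, `.get()` returns `None` and the string "None" silently ends up in the connection string. As a minimal sketch, the credentials could be checked before the string is built. The helper name `build_connection_string`, the `REQUIRED_KEYS` tuple and the placeholder values below are assumptions for illustration, not part of the original notebook.

```python
# Keys expected in the .env file (same names as used in the notebook)
REQUIRED_KEYS = ("SERVER", "DATABASE", "LOGIN", "PASSWORD")


def build_connection_string(env):
    """Build the ODBC connection string, failing early if a credential is missing."""
    # Collect every key that is absent or empty
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise KeyError(f"Missing values in .env: {', '.join(missing)}")
    return (
        "DRIVER={SQL Server};"
        f"SERVER={env['SERVER']};"
        f"DATABASE={env['DATABASE']};"
        f"UID={env['LOGIN']};"
        f"PWD={env['PASSWORD']}"
    )


# Placeholder credentials for illustration only
example_env = {
    "SERVER": "example-server",
    "DATABASE": "example_db",
    "LOGIN": "user",
    "PASSWORD": "secret",
}
print(build_connection_string(example_env))
```

In the notebook, the dictionary returned by `dotenv_values('.env')` would be passed in place of `example_env`.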

In [3]:
# Pass the connection string to the connect method of the pyodbc library.
# This opens a connection to the server.

connection = pyodbc.connect(connection_string)

Next, I query the database to find the tables that hold the data for 2020 and 2021.

In [4]:
# Let's have a look at the tables in the database
query = ''' SELECT *
            FROM INFORMATION_SCHEMA.TABLES
            WHERE TABLE_TYPE = 'BASE TABLE' '''

tables=pd.read_sql(query,connection)
print(tables)


  TABLE_CATALOG TABLE_SCHEMA               TABLE_NAME  TABLE_TYPE
0         dapDB          dbo  LP1_startup_funding2021  BASE TABLE
1         dapDB          dbo  LP1_startup_funding2020  BASE TABLE






As seen above, the database contains two tables: LP1_startup_funding2020, which holds the data for 2020, and LP1_startup_funding2021, which holds the data for 2021.

Next, I read each table into its own pandas dataframe and assign it to a self-descriptive variable name.

## Reading data for all years into pandas dataframe separately

In [5]:

# First, assign the data for 2020 to the variable 'data_2020'

query = "SELECT * FROM dbo.LP1_startup_funding2020"
data_2020 = pd.read_sql(query, connection)

data_2020.head()

Unnamed: 0,Company_Brand,Founded,HeadQuarter,Sector,What_it_does,Founders,Investor,Amount,Stage,column10
0,Aqgromalin,2019.0,Chennai,AgriTech,Cultivating Ideas for Profit,"Prasanna Manogaran, Bharani C L",Angel investors,200000.0,,
1,Krayonnz,2019.0,Bangalore,EdTech,An academy-guardian-scholar centric ecosystem ...,"Saurabh Dixit, Gurudutt Upadhyay",GSF Accelerator,100000.0,Pre-seed,
2,PadCare Labs,2018.0,Pune,Hygiene management,Converting bio-hazardous waste to harmless waste,Ajinkya Dhariya,Venture Center,,Pre-seed,
3,NCOME,2020.0,New Delhi,Escrow,Escrow-as-a-service platform,Ritesh Tiwari,"Venture Catalysts, PointOne Capital",400000.0,,
4,Gramophone,2016.0,Indore,AgriTech,Gramophone is an AgTech platform enabling acce...,"Ashish Rajan Singh, Harshit Gupta, Nishant Mah...","Siana Capital Management, Info Edge",340000.0,,


In [6]:
# Then assign the data for 2021 to the variable 'data_2021'

query = "SELECT * FROM dbo.LP1_startup_funding2021"
data_2021 = pd.read_sql(query, connection)

data_2021.head()

Unnamed: 0,Company_Brand,Founded,HeadQuarter,Sector,What_it_does,Founders,Investor,Amount,Stage
0,Unbox Robotics,2019.0,Bangalore,AI startup,Unbox Robotics builds on-demand AI-driven ware...,"Pramod Ghadge, Shahid Memon","BEENEXT, Entrepreneur First","$1,200,000",Pre-series A
1,upGrad,2015.0,Mumbai,EdTech,UpGrad is an online higher education platform.,"Mayank Kumar, Phalgun Kompalli, Ravijot Chugh,...","Unilazer Ventures, IIFL Asset Management","$120,000,000",
2,Lead School,2012.0,Mumbai,EdTech,LEAD School offers technology based school tra...,"Smita Deorah, Sumeet Mehta","GSV Ventures, Westbridge Capital","$30,000,000",Series D
3,Bizongo,2015.0,Mumbai,B2B E-commerce,Bizongo is a business-to-business online marke...,"Aniket Deb, Ankit Tomar, Sachin Agrawal","CDC Group, IDG Capital","$51,000,000",Series C
4,FypMoney,2021.0,Gurugram,FinTech,"FypMoney is Digital NEO Bank for Teenagers, em...",Kapil Banwari,"Liberatha Kallat, Mukesh Yadav, Dinesh Nagpal","$2,000,000",Seed
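
The two reads above follow the same pattern, so they could also be written as a loop over the table names. The sketch below is only an illustration: it uses an in-memory SQLite database as a stand-in for the SQL Server connection, with one sample row per table copied from the outputs above, and the helper name `read_tables` is hypothetical. Against the real database, the table names would carry the `dbo.` schema prefix.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database standing in for the SQL Server connection (assumption)
demo_connection = sqlite3.connect(":memory:")
demo_connection.execute("CREATE TABLE LP1_startup_funding2020 (Company_Brand TEXT, Amount REAL)")
demo_connection.execute("CREATE TABLE LP1_startup_funding2021 (Company_Brand TEXT, Amount TEXT)")
demo_connection.execute("INSERT INTO LP1_startup_funding2020 VALUES ('Aqgromalin', 200000.0)")
demo_connection.execute("INSERT INTO LP1_startup_funding2021 VALUES ('Unbox Robotics', '$1,200,000')")


def read_tables(conn, tables_by_year):
    """Read each table into its own dataframe, keyed by year."""
    frames = {}
    for year, table in tables_by_year.items():
        # Table names come from a fixed mapping, never from user input
        frames[year] = pd.read_sql(f"SELECT * FROM {table}", conn)
    return frames


frames = read_tables(
    demo_connection,
    {2020: "LP1_startup_funding2020", 2021: "LP1_startup_funding2021"},
)
demo_connection.close()

# Show the shape of each dataframe
print({year: df.shape for year, df in frames.items()})
```

Keeping the dataframes in a dictionary keyed by year is one design choice; separate variables such as `data_2020` and `data_2021`, as used in this notebook, work equally well for four years of data.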


With the data for 2020 and 2021 retrieved, the next step is the data for 2018, a file called startup_funding2018.csv that is stored in a GitHub repository. I downloaded the file; alternatively, it can be read directly from its URL.

The data for 2019 was downloaded from OneDrive to my local storage and read into a pandas dataframe from there.

In [7]:
# Read in the data for 2018
data_2018 = pd.read_csv('startup_funding2018.csv')
data_2018.head()

Unnamed: 0,Company Name,Industry,Round/Series,Amount,Location,About Company
0,TheCollegeFever,"Brand Marketing, Event Promotion, Marketing, S...",Seed,250000,"Bangalore, Karnataka, India","TheCollegeFever is a hub for fun, fiesta and f..."
1,Happy Cow Dairy,"Agriculture, Farming",Seed,"₹40,000,000","Mumbai, Maharashtra, India",A startup which aggregates milk from dairy far...
2,MyLoanCare,"Credit, Financial Services, Lending, Marketplace",Series A,"₹65,000,000","Gurgaon, Haryana, India",Leading Online Loans Marketplace in India
3,PayMe India,"Financial Services, FinTech",Angel,2000000,"Noida, Uttar Pradesh, India",PayMe India is an innovative FinTech organizat...
4,Eunimart,"E-Commerce Platforms, Retail, SaaS",Seed,—,"Hyderabad, Andhra Pradesh, India",Eunimart is a one stop solution for merchants ...


In [8]:
# Read in the data for 2019
data_2019 = pd.read_excel('startup_funding2019.xlsx')
data_2019.head()

Unnamed: 0,Company/Brand,Founded,HeadQuarter,Sector,What it does,Founders,Investor,Amount($),Stage
0,Bombay Shaving,,,Ecommerce,Provides a range of male grooming products,Shantanu Deshpande,Sixth Sense Ventures,6300000,
1,Ruangguru,2014.0,Mumbai,Edtech,A learning platform that provides topic-based ...,"Adamas Belva Syah Devara, Iman Usman.",General Atlantic,150000000,Series C
2,Eduisfun,,Mumbai,Edtech,It aims to make learning fun via games.,Jatin Solanki,"Deepak Parekh, Amitabh Bachchan, Piyush Pandey",28000000,Fresh funding
3,HomeLane,2014.0,Chennai,Interior design,Provides interior designing solutions,"Srikanth Iyer, Rama Harinath","Evolvence India Fund (EIF), Pidilite Group, FJ...",30000000,Series D
4,Nu Genes,2004.0,Telangana,AgriTech,"It is a seed company engaged in production, pr...",Narayana Reddy Punyala,Innovation in Food and Agriculture (IFA),6000000,


## Business Understanding

In this project, I focus on the funding received by start-ups in India. I dive deep into a dataset containing the details of funding received by these start-ups from 2018 to 2021, and communicate my findings and recommendations to the team as clearly as possible.

Some hypotheses drawn are:

1. The total funding amount for Indian startups has increased significantly from 2018 to 2021.

2. Certain sectors consistently receive more funding.

3. Startup funding in India is concentrated in specific cities or regions.


I will be focusing on answering the following business questions:

1. What is the overall trend in funding amounts over the specified time period? (Average and total funding amount per year)

2. What are the top 3 start-ups that received the highest funding, and which sectors are they from?

3. Which sectors have the highest total funding during this timeframe?

4. Which sectors have the lowest total funding during this timeframe?

5. Which locations have the highest total funding during this timeframe?

6. Which locations have the lowest total funding during this timeframe?
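
As a rough sketch of how the first question might be approached, the total and average funding per year can be computed with a group-by. This assumes the four yearly dataframes have already been cleaned and combined into a single dataframe with a `Year` column and a numeric `Amount` column; those column names, the variable names and the rows below are made up for illustration and are not taken from the project data.

```python
import pandas as pd

# Made-up rows for illustration only; not taken from the project data
funding = pd.DataFrame(
    {
        "Year": [2018, 2018, 2019, 2019],
        "Amount": [100.0, 300.0, 500.0, 700.0],
    }
)

# Total and average funding amount per year
yearly_trend = funding.groupby("Year")["Amount"].agg(total="sum", average="mean")
print(yearly_trend)
```

The same pattern, grouping by a sector or location column instead of `Year`, would apply to questions 3 to 6, provided those columns are first standardised across the four years.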


