# Introduction

In this project, we analyse the funding received by startups in India. The aim is to recommend the best course of action for a startup looking to enter the Indian business ecosystem. Our first step is to build a business understanding of the problem.

# Business Understanding

India has become an attractive location for investors and has seen a number of successful startups achieve the coveted "unicorn" status. To guide our search for the best course of action for an upcoming startup, we pose a few questions that we will attempt to answer using the data at hand.

### Questions

- Does the age of the startup affect the funding received?
- Which sectors received the most funding?
- Does the number of founders affect the funding received?
- At what stage do startups receive the most funding?
- Does the location affect the funding received?
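
Several of these questions reduce to aggregating the funding amount by a single attribute. The block below is a minimal sketch of that idea, not the notebook's actual analysis. It assumes a cleaned DataFrame with a numeric `Amount` column plus `Sector` and `Stage` columns; the helper name `total_funding_by` and the toy rows are hypothetical and exist only for illustration.

```python
import pandas as pd


def total_funding_by(df, column, amount_col="Amount"):
    """Sum the funding amount per category, largest total first."""
    return df.groupby(column)[amount_col].sum().sort_values(ascending=False)


# Made-up rows for illustration only; these are NOT values from the real dataset.
toy = pd.DataFrame({
    "Sector": ["FinTech", "EdTech", "FinTech", "AgriTech"],
    "Stage": ["Seed", "Series A", "Series A", "Seed"],
    "Amount": [100.0, 250.0, 400.0, 50.0],
})

by_sector = total_funding_by(toy, "Sector")
by_stage = total_funding_by(toy, "Stage")
print(by_sector)
print(by_stage)
```

The same helper can be pointed at any categorical column, so one function would cover the sector, stage, and location questions once the amounts are numeric.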

### Hypothesis 

##### NULL: Startups in technology sectors are not more likely to be funded than startups in other sectors

##### ALTERNATE: Startups in technology sectors are more likely to be funded than startups in other sectors

## Setup

## Importation
This section imports all the packages and libraries used throughout this notebook.

In [2]:
# Data handling
import numpy as np
import pandas as pd

# Visualization (Matplotlib, Seaborn, Plotly)
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
%matplotlib inline
import seaborn as sns
import plotly.express as px

sns.set_style('whitegrid')
plt.style.use("fivethirtyeight")

# Statistical tests used during EDA
from scipy import stats
from scipy.stats import pearsonr, chi2_contingency

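Note: the import cell above fails with `ModuleNotFoundError: No module named 'plotly'` when Plotly is not installed in the active environment. Install it (for example with `pip install plotly`) and re-run the cell.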
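
To catch a missing dependency such as Plotly before the import cell runs, one option is to check which packages can be found in the current environment. This is a small sketch using only the standard library; the helper name `missing_packages` is hypothetical, and the package list simply mirrors the imports used in this notebook.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Top-level packages imported in this notebook.
required = ["numpy", "pandas", "matplotlib", "seaborn", "plotly", "scipy"]

missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are available.")
```

Only top-level names are passed to `find_spec` here, since looking up a submodule such as `plotly.express` would try to import the parent package first.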

# Data Loading
This section loads the datasets and any additional files.

#### Load 2018 Data  

In [4]:
# Import the 2018 dataset, keeping only the columns needed for the analysis
startup_funding_2018 = pd.read_csv(
    'startup_funding2018.csv',
    usecols=['Company Name', 'Industry', 'Round/Series', 'Amount', 'Location'],
)

# Rename the columns for consistency:
#   Industry     --> Sector
#   Round/Series --> Stage
startup_funding_2018.rename(
    columns={'Industry': 'Sector', 'Round/Series': 'Stage'},
    inplace=True,
)

# Add the funding year as an integer column
startup_funding_2018['Funding Year'] = 2018
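
If the same select, rename, and tag-with-year steps are repeated for other files, they could be wrapped in a small helper. The sketch below is one way to do that, not the notebook's own code. The function name `load_funding_csv` and the in-memory sample row are hypothetical; only the column names come from the 2018 loading step above.

```python
import io

import pandas as pd


def load_funding_csv(source, year, usecols=None, rename_map=None):
    """Read a funding CSV, optionally select and rename columns, and add the funding year."""
    df = pd.read_csv(source, usecols=usecols)
    if rename_map:
        df = df.rename(columns=rename_map)
    df["Funding Year"] = int(year)
    return df


# In-memory stand-in for a CSV file; the row is made up for illustration only.
sample_csv = io.StringIO(
    "Company Name,Industry,Round/Series,Amount,Location,Extra\n"
    "ExampleCo,FinTech,Seed,1000,Bangalore,ignored\n"
)

sample = load_funding_csv(
    sample_csv,
    year=2018,
    usecols=["Company Name", "Industry", "Round/Series", "Amount", "Location"],
    rename_map={"Industry": "Sector", "Round/Series": "Stage"},
)
print(sample.dtypes)
```

Passing a file path such as `'startup_funding2018.csv'` instead of the `StringIO` object would work the same way, since `pd.read_csv` accepts both.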

<bound method NDFrame.head of        Company/Brand  Founded HeadQuarter                 Sector   
0     Unbox Robotics   2019.0   Bangalore             AI startup  \
1             upGrad   2015.0      Mumbai                 EdTech   
2        Lead School   2012.0      Mumbai                 EdTech   
3            Bizongo   2015.0      Mumbai         B2B E-commerce   
4           FypMoney   2021.0    Gurugram                FinTech   
...              ...      ...         ...                    ...   
1204        Gigforce   2019.0    Gurugram  Staffing & Recruiting   
1205          Vahdam   2015.0   New Delhi       Food & Beverages   
1206    Leap Finance   2019.0   Bangalore     Financial Services   
1207    CollegeDekho   2015.0    Gurugram                 EdTech   
1208          WeRize   2019.0   Bangalore     Financial Services   

                                           What it does   
0     Unbox Robotics builds on-demand AI-driven ware...  \
1        UpGrad is an online higher