This project used the CRISP-DM (Cross-Industry Standard Process for Data Mining) framework to analyze data on startup growth, funding patterns, success rates, regional concentration, and market penetration in the Indian startup ecosystem between 2018 and 2021.
The objective, as a data analyst, was to draw insights from the available data in four tables and give actionable recommendations to our team, which, hypothetically speaking, wants to venture into the Indian startup market. The project used statistical techniques and visualization tools to draw meaningful conclusions and present our findings effectively.
| Jupyter Notebook | Published Article | Power BI Dashboard |
|---|---|---|
| Notebook | Medium Article | Power BI Dashboard |
- Project Links
- Dataset
- Some Tools Used For The Project
- Process
- Project Structure
- Key Insights
- Dashboard
- How To Use This Repository
- Author
- Company/Brand: Name of the company/start-up
- Sector: Sector of service
- What it does: Description of the company
- Investor: Investors
- Amount($): Funds raised
- Stage: Round of funding reached
- Year: The year of funding
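A column such as Amount($) usually has to be turned into numbers before it can be aggregated. The snippet below is a minimal sketch of that step, not the notebook's actual cleaning code. It assumes raw values are strings that may carry a dollar sign, thousands separators, or a non-numeric placeholder such as "Undisclosed"; the function name `parse_amount` and those sample formats are hypothetical.

```python
import re


def parse_amount(raw):
    """Return the amount as a float in dollars, or None when it cannot be parsed."""
    if raw is None:
        return None
    # Drop surrounding whitespace and thousands separators (assumed format).
    text = str(raw).strip().replace(",", "")
    # Accept an optional leading dollar sign followed by a plain number.
    match = re.fullmatch(r"\$?\s*(\d+(?:\.\d+)?)", text)
    return float(match.group(1)) if match else None


# Hypothetical sample values, not taken from the dataset.
for sample in ["$1,200,000", "500000", "Undisclosed"]:
    print(sample, "->", parse_amount(sample))
```

Returning `None` for anything unparseable keeps missing amounts distinct from genuine zero-dollar rounds, which matters when computing totals or averages per sector.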
- Pulled data from a remote database with pyodbc and saved it as CSV files
- Developed questions and a hypothesis to base the analysis on
- Explored the data and decided how to process it
- Preprocessed, cleaned and merged the data. The data was very messy, and 90% of the project involved cleaning it and getting it ready for analysis and visualization
- Visualized the data with seaborn and matplotlib.pyplot
- Created a Power BI dashboard with the visualizations
- Wrote a Medium article briefly describing the process, findings and recommendations
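The first step above, pulling a table from a database and saving it as CSV, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the project's actual code: the helper `export_table_to_csv` and the table name `startup_funding` are hypothetical, and an in-memory `sqlite3` database stands in for the remote server so the example runs without credentials. Because pyodbc connections follow the Python DB-API, the same helper should accept a connection returned by `pyodbc.connect(...)` with your own connection string.

```python
import csv
import io
import sqlite3


def export_table_to_csv(connection, table_name, handle):
    """Write every row of table_name to an open text handle; return the row count."""
    cursor = connection.cursor()
    # Table names cannot be bound as query parameters, so pass trusted names only.
    cursor.execute(f"SELECT * FROM {table_name}")
    # In the DB-API, cursor.description holds one tuple per column, name first.
    header = [column[0] for column in cursor.description]
    rows = cursor.fetchall()
    writer = csv.writer(handle)
    writer.writerow(header)
    writer.writerows(rows)
    return len(rows)


# Stand-in demo: sqlite3 replaces the remote database (hypothetical table and row).
demo_connection = sqlite3.connect(":memory:")
demo_connection.execute("CREATE TABLE startup_funding (company TEXT, sector TEXT)")
demo_connection.execute("INSERT INTO startup_funding VALUES ('ExampleCo', 'Fintech')")

buffer = io.StringIO()
exported = export_table_to_csv(demo_connection, "startup_funding", buffer)
print(f"exported {exported} row(s)")
print(buffer.getvalue())
```

When writing to a real file instead of a buffer, open it with `open(path, "w", newline="", encoding="utf-8")` so the `csv` module controls the line endings.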
- `Dataset/`: Contains the dataset used for analysis.
- `.gitignore`: Holds files to be ignored by Git.
- `LICENSE`: Project license.
- `Project_notebook.ipynb`: The Jupyter notebook with data cleaning, EDA and visualizations.
- `README.md`: Project overview, links, highlights, and information.
- `requirements.txt`: Required libraries & packages.
- The top sectors in the Indian startup ecosystem are Fintech, Retail, Edtech, Tech and E-commerce.
- Bangalore has the most startups. It appears to be the emerging city, with its top sectors being Retail, Food Delivery, Innovation Management and FinTech.
- Mumbai is the big city with the big money investments, with leading sectors being Fintech, Retail and a Multinational conglomerate
- There seems to be high demand for finance solutions and shopping experiences. Retail was popular during the pandemic as more people were probably shopping from home.
The Indian startup ecosystem is a vibrant and influential force in the global market. This project provides valuable insights into funding patterns and industry preferences. By leveraging this information, stakeholders can make informed decisions and contribute to the growth and success of startups in India.
You need to have Python 3 installed on your system. Then clone this repo and run the commands below from the repo's root (`repository_name> ...`):
- Clone this repository: `git clone https://github.com/Azie88/LP1-Data-Analysis.git`
- In your IDE, create a virtual environment and install the required packages for the project:
  - Windows: `python -m venv venv; venv\Scripts\activate; python -m pip install -q --upgrade pip; python -m pip install -qr requirements.txt`
  - Linux & macOS: `python3 -m venv venv; source venv/bin/activate; python -m pip install -q --upgrade pip; python -m pip install -qr requirements.txt`
The two long command lines have the same structure. They chain multiple commands with the `;` separator, which runs them in sequence (it does not pipe output between them), but you can also execute them manually one after the other:
- Create the Python virtual environment, which isolates the project's required libraries to avoid conflicts;
- Activate the virtual environment so that the Python kernel and libraries are those of the isolated environment;
- Upgrade pip, the package manager, so that you have an up-to-date version that works correctly;
- Install the required libraries/packages listed in the `requirements.txt` file so that they can be imported into the Python script and notebook without any issue.
NB: For macOS users, please install Xcode if you have an issue.
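To confirm that the activation step worked before opening the notebook, a quick check like the one below can help. It is a small sketch, not part of the repository: the function name `in_virtual_environment` is hypothetical, and it relies on the standard-library fact that `sys.prefix` differs from `sys.base_prefix` inside an environment created with `venv`.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtual_environment())
print("interpreter path:", sys.executable)
```

Run it with the same `python` command you use for the project; if it reports that no virtual environment is active, repeat the activation command for your platform.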
- Explore the Jupyter notebook for detailed steps and code execution.
- Check out the Power BI dashboard for interactive visualizations.
- Read the published article for a comprehensive understanding of the project.
Andrew Obando
Feel free to star ⭐ this repository if you find it helpful!