Beginner Data Analysis and Power BI Dashboarding with Indian Startup Dataset (2018-2021)

Azie88/Beginner-Data-Analysis


Beginner Data Analytics -- Indian Start-up Funding Analysis 🇮🇳 💸

Punjabi Businessmen

This project involved using the CRISP-DM (Cross-Industry Standard Process for Data Mining) framework to analyze data on startup growth, funding patterns, success rates, regional concentration, and market penetration of the Indian Startup Ecosystem between 2018 and 2021.

The objective, as a data analyst, was to draw insights from the available data in 4 tables and give actionable recommendations to our team, which (hypothetically speaking) wants to venture into the Indian startup market. The project used statistical techniques and visualization tools to draw meaningful conclusions and present our findings effectively.

Project Links 🔗

  • Jupyter Notebook: Notebook
  • Published Article: Medium Article
  • Power BI Dashboard: PowerBI Dashboard

Dataset ℹ️

  • Company/Brand: Name of the company/start-up

  • Sector: Sector of service

  • What it does: Description of the company

  • Investor: Investors in the funding round

  • Amount($): Funds raised, in US dollars

  • Stage: Round of funding reached

  • Year: The year of funding
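
As a rough illustration of how one row with these columns might be represented, and how the Amount($) field could be normalised, here is a minimal sketch. The class name `FundingRecord`, the helper `parse_amount`, and the sample raw values are assumptions made for illustration; they are not taken from the project notebook, which works with pandas.

```python
from dataclasses import dataclass
from typing import Optional


def parse_amount(raw: str) -> Optional[float]:
    """Turn a raw amount string into a float, or None if it is not numeric."""
    # Assumed raw formats: "$1,200,000", "1200000", or text such as "Undisclosed"
    cleaned = raw.strip().replace("$", "").replace(",", "")
    try:
        return float(cleaned)
    except ValueError:
        return None


@dataclass
class FundingRecord:
    """One funding row, mirroring the dataset columns listed above."""
    company: str
    sector: str
    what_it_does: str
    investor: str
    amount_usd: Optional[float]  # None when the amount is missing or undisclosed
    stage: str
    year: int


# Hypothetical example row, not real data from the dataset
record = FundingRecord(
    company="ExampleCo",
    sector="FinTech",
    what_it_does="Payments platform",
    investor="Example Ventures",
    amount_usd=parse_amount("$1,200,000"),
    stage="Seed",
    year=2020,
)
print(record)
```

Keeping undisclosed amounts as `None` rather than zero avoids skewing totals and averages later in the analysis.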

Some Tools Used For The Project 🧰

VS Code, pandas, NumPy, Python, Jupyter

Process

  • Pull data from a remote database with pyodbc and save it as CSV files

  • Develop questions and a hypothesis to base the analysis on

  • Understand the data and decide how to process it

  • Preprocess, clean and merge the data. The data was very messy, and 90% of the project involved cleaning it and getting it ready for analysis and visualization

  • Visualise the data with seaborn and matplotlib.pyplot

  • Create a PowerBI dashboard with the visualizations

  • Write a Medium article briefly describing the process, findings and recommendations
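
The first step, pulling a table from a database and saving it as a CSV file, could look roughly like the sketch below. It is written against the generic Python DB-API, so it should also work with a connection returned by `pyodbc.connect(...)`. An in-memory SQLite database stands in for the remote server so the example runs anywhere. The function name `export_query_to_csv`, the table name `funding_2020`, and the sample row are hypothetical.

```python
import csv
import os
import sqlite3
import tempfile


def export_query_to_csv(connection, query, csv_path):
    """Run `query` on a DB-API connection and write the result to a CSV file."""
    cursor = connection.cursor()
    cursor.execute(query)
    # cursor.description holds one tuple per column; the first item is the name
    headers = [column[0] for column in cursor.description]
    rows = cursor.fetchall()
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(headers)
        writer.writerows(rows)
    return len(rows)


# Stand-in database (assumption): the project itself connects to a remote server
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE funding_2020 (company TEXT, amount TEXT)")
demo.execute("INSERT INTO funding_2020 VALUES ('ExampleCo', '$1,200,000')")

csv_path = os.path.join(tempfile.gettempdir(), "funding_2020_demo.csv")
rows_written = export_query_to_csv(demo, "SELECT * FROM funding_2020", csv_path)
print(f"Wrote {rows_written} row(s) to {csv_path}")
```

Saving each table to CSV once means the later cleaning steps can be rerun without a database connection.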

Project Structure 📂

  • Dataset/: Contains the dataset used for analysis.
  • .gitignore: Holds files to be ignored by Git.
  • LICENSE: Project license.
  • Project_notebook.ipynb: The Jupyter notebook with data cleaning, EDA and visualizations.
  • README.md: Project overview, links, highlights, and information.
  • requirements.txt: Required libraries and packages.

Key Insights 📈

  1. The top sectors in the Indian startup ecosystem are Fintech, Retail, Edtech, Tech and E-commerce.
  2. Bangalore has the most startups. It seems to be the emerging city, with the top sectors being Retail, Food Delivery, Innovation Management and FinTech.
  3. Mumbai is the big city with the big-money investments, with leading sectors being Fintech, Retail and a Multinational conglomerate.
  4. There seems to be high demand for finance solutions and shopping experiences. Retail was popular during the pandemic, probably because more people were shopping from home.
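
Rankings like the ones above come down to counting deals per group and summing amounts per group. The sketch below shows that idea with the standard library only. The rows are made-up sample values chosen for illustration, not figures from the dataset, and the notebook itself may compute these with pandas instead.

```python
from collections import Counter, defaultdict

# Hypothetical sample rows: (sector, city, amount in USD or None if undisclosed)
rows = [
    ("FinTech", "Mumbai", 5_000_000.0),
    ("FinTech", "Bangalore", 1_000_000.0),
    ("Retail", "Bangalore", 2_000_000.0),
    ("EdTech", "Bangalore", None),
]

# Number of funding rows per sector and per city
deals_per_sector = Counter(sector for sector, _, _ in rows)
startups_per_city = Counter(city for _, city, _ in rows)

# Total disclosed funding per city, skipping undisclosed amounts
funding_per_city = defaultdict(float)
for _, city, amount in rows:
    if amount is not None:
        funding_per_city[city] += amount

top_sector = deals_per_sector.most_common(1)[0]
city_with_most_startups = startups_per_city.most_common(1)[0][0]
city_with_most_funding = max(funding_per_city, key=funding_per_city.get)

print("Top sector by deal count:", top_sector)
print("City with most startups:", city_with_most_startups)
print("City with most funding:", city_with_most_funding)
```

Counting rows and summing amounts can rank cities differently, which is why one city can lead on number of startups while another leads on money invested.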

The Indian startup ecosystem is a vibrant and influential force in the global market. This project provides valuable insights into funding patterns and industry preferences. By leveraging this information, stakeholders can make informed decisions and contribute to the growth and success of startups in India.

Power BI Dashboard 📺

PowerBI Dashboard

How to Use The Repository

You need Python 3 installed on your system. Clone this repo, then run the commands below from the repo's root :: repository_name> ...

  1. Clone this repository: git clone https://github.com/Azie88/LP1-Data-Analysis.git
  2. In your IDE, create a virtual environment and install the required packages for the project:
  • Windows:

      python -m venv venv; 
      venv\Scripts\activate; 
      python -m pip install -q --upgrade pip; 
      python -m pip install -qr requirements.txt  
    
  • Linux & MacOs:

      python3 -m venv venv; 
      source venv/bin/activate; 
      python -m pip install -q --upgrade pip; 
      python -m pip install -qr requirements.txt  
    

The two long command lines have the same structure. They chain multiple commands with the ; separator, but you can also run them manually one after the other. In order, they:

  • Create a Python virtual environment that isolates the project's required libraries to avoid conflicts;
  • Activate the virtual environment so that the Python kernel and libraries are those of the isolated environment;
  • Upgrade pip, the package manager, to an up-to-date version;
  • Install the libraries/packages listed in requirements.txt so that they can be imported into the Python script and notebook without any issue.
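
To confirm that the setup worked, a small check like the following can be run inside the activated environment. This is a sketch using only the standard library. The helper names `in_virtualenv` and `missing_packages` are hypothetical, and the package names checked are taken from the tools listed in this README rather than from requirements.txt, whose exact contents are not shown here.

```python
import importlib.util
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment, base_prefix at the system Python
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("Virtual environment active:", in_virtualenv())
print("Missing packages:", missing_packages(["pandas", "numpy"]))
```

If the first line prints False, the environment was not activated in the current shell. If the second line lists any names, rerun the install step.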

NB: macOS users, please install Xcode if you run into an issue.

  3. Explore the Jupyter notebook for detailed steps and code execution.
  4. Check out the Power BI dashboard for interactive visualizations.
  5. Read the published article for a comprehensive understanding of the project.

Author ✍️

Andrew Obando

Andrew Obando | LinkedIn Medium


Feel free to star ⭐ this repository if you find it helpful!
