This Python script analyzes COVID-19 data using the Pandas, NumPy, Matplotlib, Seaborn, and scikit-learn libraries. It loads COVID-19 data from a CSV file, cleans and preprocesses the data, performs exploratory data analysis (EDA), visualizes the data using various plots, and demonstrates a linear regression model to predict COVID-19 recovery rates.
- Introduction
- Installation
- Usage
- Data Source
- Analysis and Visualization
- Linear Regression Model
- Screenshots
The COVID-19 pandemic has generated a vast amount of data, and this script allows you to explore and analyze COVID-19 data for different states/union territories in India. It covers the following tasks:
● Loading data from a CSV file.
● Data cleaning and preprocessing.
● Calculating active cases, death rates, and cured rates.
● Creating various plots using Matplotlib and Seaborn for data visualization.
● Implementing a linear regression model to predict COVID-19 recovery rates.
- Clone this repository: git clone https://github.com/Keshajani12/Covid-Data-Analysis-Using-Python.git
- Navigate to the project directory (git names it after the repository by default): cd Covid-Data-Analysis-Using-Python
- Install the required Python packages using pip: pip install pandas numpy matplotlib seaborn scikit-learn
Alternatively, download the repository as a ZIP archive, extract it, and install the dependencies listed in requirements.txt: pip install -r requirements.txt
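To confirm that the installation worked, a short check like the one below can list which of the required packages are visible to the current Python interpreter. This is only a sketch: the helper name `installed_versions` is hypothetical and not part of the repository, and it relies solely on the standard library's `importlib.metadata` (Python 3.8+).

```python
from importlib.metadata import PackageNotFoundError, version

# Package names as given in the pip install command above.
REQUIRED = ["pandas", "numpy", "matplotlib", "seaborn", "scikit-learn"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing.

    Hypothetical helper for illustration; not part of the repository.
    """
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


versions = installed_versions(REQUIRED)
for name, ver in versions.items():
    print(f"{name}: {ver or 'NOT INSTALLED'}")
```

Any package reported as NOT INSTALLED can be installed with the pip command given above.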
- Run the Python script: python covid_analysis.py
- The script will load the COVID-19 data, perform analysis, generate plots, and display them.
The COVID-19 data is loaded from the 'covid_19_india.csv' file. Ensure that the data file is present in the same directory as the script.
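The loading step can be sketched roughly as follows. The column names (`Date`, `State/UnionTerritory`, `Cured`, `Deaths`, `Confirmed`) and the sample rows are assumptions made for illustration and may not match the actual file or the script's own code. An in-memory string stands in for the CSV so the example runs without the data file.

```python
import io

import pandas as pd

# Made-up sample rows; column names are assumed, not taken from the script.
SAMPLE_CSV = """Date,State/UnionTerritory,Cured,Deaths,Confirmed
2021-05-01,Kerala,1000,50,1500
2021-05-01,Goa,200,10,300
2021-05-02,Kerala,1100,55,1600
"""


def load_covid_data(source):
    """Read the CSV from a path or file-like object and parse the Date column."""
    df = pd.read_csv(source, parse_dates=["Date"])
    # Basic cleaning: drop rows where any of the count columns is missing.
    return df.dropna(subset=["Cured", "Deaths", "Confirmed"])


df = load_covid_data(io.StringIO(SAMPLE_CSV))
print(df.dtypes)
```

With the real data, the same function could be called as `load_covid_data("covid_19_india.csv")`, provided the file sits next to the script and uses these column names.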
● The script calculates active cases, death rates, and cured rates for each state/union territory.
● It creates line plots, bar plots, and scatter plots to visualize the data using Seaborn and Matplotlib.
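One plausible way to derive these metrics is shown below. The formulas are assumptions, since the README does not spell them out: active cases are taken as confirmed minus cured minus deaths, and both rates are expressed as a percentage of confirmed cases. The column names, the `add_rates` helper, and the sample values are likewise hypothetical.

```python
import pandas as pd

# Made-up per-state totals for illustration only.
data = pd.DataFrame(
    {
        "State/UnionTerritory": ["Kerala", "Goa"],
        "Confirmed": [1600, 300],
        "Cured": [1100, 200],
        "Deaths": [55, 10],
    }
)


def add_rates(df):
    """Return a copy of df with Active, DeathRate and CuredRate columns added."""
    out = df.copy()
    # Assumed definition: cases that are neither cured nor fatal.
    out["Active"] = out["Confirmed"] - out["Cured"] - out["Deaths"]
    # Assumed definition: rates as percentages of confirmed cases.
    out["DeathRate"] = out["Deaths"] / out["Confirmed"] * 100
    out["CuredRate"] = out["Cured"] / out["Confirmed"] * 100
    return out


rates = add_rates(data)
print(rates)
```

A frame shaped like `rates` can then be handed to a plotting call such as `seaborn.barplot(data=rates, x="State/UnionTerritory", y="Active")`.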
● The script demonstrates a linear regression model using scikit-learn.
● It predicts COVID-19 recovery rates based on the number of deaths.
● Mean squared error (MSE) is calculated to evaluate the model's performance.
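The modelling step described above might look something like this minimal sketch. The numbers are invented for illustration, and the 75/25 train/test split and `random_state=0` are assumptions; the actual script may prepare its features and split its data differently.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Invented values: number of deaths (feature) and recovery rate in percent (target).
deaths = np.array([10, 20, 30, 40, 50, 60, 70, 80], dtype=float).reshape(-1, 1)
cured_rate = np.array([95.0, 94.1, 93.2, 92.0, 91.1, 90.3, 89.0, 88.2])

# Hold out a quarter of the rows for evaluation (assumed split, fixed seed).
X_train, X_test, y_train, y_test = train_test_split(
    deaths, cured_rate, test_size=0.25, random_state=0
)

# Fit an ordinary least-squares line to the training rows.
model = LinearRegression().fit(X_train, y_train)

# Predict on the held-out rows and score with mean squared error.
predictions = model.predict(X_test)
mse = mean_squared_error(y_test, predictions)

print(f"slope={model.coef_[0]:.4f}, intercept={model.intercept_:.2f}, MSE={mse:.4f}")
```

A lower MSE means the predicted recovery rates sit closer to the held-out values. With a single feature, the fitted slope indicates how the predicted rate changes per additional death.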