
A Jupyter notebook based on Covid-19 Dataset to understand, analyze and visualize the changes in the number of cases through different visualization techniques and plots.


parthd06/Covid-19DataAnalysis


Covid-19 analysis to visualize and understand pandemic cases using different techniques.

The Coronavirus disease 2019 (COVID-19) time series lists confirmed cases, reported deaths and active cases, and includes a comparison with other epidemics. Data are disaggregated by country (and sometimes subregion). COVID-19 is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and has had a worldwide effect.

This notebook uses data from various sources to understand, analyze and visualize the changes in the number of cases through different visualization techniques and plots.

This dataset includes time-series data tracking the number of people affected by COVID-19 worldwide, including:

  • confirmed tested cases of Coronavirus infection
  • number of people who have reportedly died due to Coronavirus
  • number of people who have reportedly recovered from it
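As a rough illustration of how time-series data of this kind can be handled, the sketch below reshapes a small wide table (one column per date) into a long table and derives daily new cases. The table, its column names and the `to_long` helper are hypothetical and only stand in for the real sources. The sketch also assumes the counts are cumulative, which the description above does not state explicitly.

```python
import pandas as pd

# Hypothetical wide-format table: one row per country, one column per date.
# The values are made up for illustration and assumed to be cumulative counts.
wide = pd.DataFrame({
    "Country": ["A", "B"],
    "2020-01-22": [1, 0],
    "2020-01-23": [3, 2],
    "2020-01-24": [6, 5],
})


def to_long(wide_df, value_name):
    """Turn one-column-per-date data into one row per (country, date)."""
    long_df = wide_df.melt(id_vars="Country", var_name="Date", value_name=value_name)
    long_df["Date"] = pd.to_datetime(long_df["Date"])
    return long_df.sort_values(["Country", "Date"]).reset_index(drop=True)


confirmed = to_long(wide, "Confirmed")

# Daily new cases are the day-to-day difference within each country.
# The first day has no previous value, so it is set to 0 here.
confirmed["New cases"] = (
    confirmed.groupby("Country")["Confirmed"].diff().fillna(0).astype(int)
)

# Select a date window, similar to picking a range in the graphs.
start, end = pd.Timestamp("2020-01-23"), pd.Timestamp("2020-01-24")
window = confirmed[(confirmed["Date"] >= start) & (confirmed["Date"] <= end)]

print(confirmed)
```

The long format is convenient because most plotting libraries can group or color by the `Country` column directly.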

Notebook Demo

NotebookDemo.mp4

Inspiration

  • To test my understanding and apply the concepts learned from an IBM Data Science Specialization on Coursera.
  • To learn more about visualization and try out different plotting techniques.

Executed Cells from the Notebook

Check out the notebook with all cells executed to see the result of each plot in the HTML version of the notebook @Notebook.

Please note that this file takes a long time to load and view, so it is better to watch the video above or run the notebook on your local machine.

Note:

Data collection for recovered cases is not very accurate: many countries have stopped reporting it, and the figures show discrepancies when taken from multiple sources. Many recoveries also go unreported, so they cannot be analyzed accurately. The data reported up to July 2020 is still accurate enough to be useful. You can select the range from Jan 2020 to Jan 2021 to clearly see the transition in all the graphs.

The following two points should therefore be kept in mind:

  • Active cases are calculated from recoveries, so they will not be accurate either.
  • Anything that requires recoveries in its calculation is not accurate, such as deaths per 100 recoveries.
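To make the two points above concrete, here is a minimal sketch of how under-reported recoveries distort the derived figures. It assumes the common convention that active cases equal confirmed cases minus deaths minus recoveries. The notebook's exact formula is not shown here, and the function names and numbers are purely illustrative.

```python
def active_cases(confirmed, deaths, recovered):
    """Active cases under the assumed convention: confirmed - deaths - recovered."""
    return confirmed - deaths - recovered


def per_100(numerator, denominator):
    """Rate per 100, or None when the denominator is zero."""
    if denominator == 0:
        return None
    return round(100 * numerator / denominator, 2)


# Made-up numbers: suppose 600 people actually recovered, but only 400 were reported.
confirmed, deaths = 1000, 50
true_recovered, reported_recovered = 600, 400

true_active = active_cases(confirmed, deaths, true_recovered)
reported_active = active_cases(confirmed, deaths, reported_recovered)

# Deaths per 100 cases does not depend on recoveries, so it is unaffected.
deaths_per_100_cases = per_100(deaths, confirmed)

# Deaths per 100 recoveries changes with the reported recovery count.
true_ratio = per_100(deaths, true_recovered)
reported_ratio = per_100(deaths, reported_recovered)

print(true_active, reported_active)
print(deaths_per_100_cases, true_ratio, reported_ratio)
```

In this example the missing recoveries inflate both the active-case count and the deaths-per-100-recoveries ratio, while deaths per 100 confirmed cases stays the same.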

Additional References for the above point:

Most of these reports are based on the US, but the same is mostly true for the rest of the world.

Sources for DataSet

The following data sources are used:

Screenshots

Dataset Preparation

WorldWide Confirmed, Recovered Cases & Deaths

Scatterplot

Case Density

Case over a period of Time

Case Visualization using Folium Maps

Choropleth Maps

Confirmed & Death Cases using Bar plot

Static Color Maps

Death & Recoveries per 100 Cases

No. of new cases per Day & Countries

Scatterplot for Deaths Vs Confirmed Cases

Bar plot

Line plot

Growth rate after 100 cases

Growth rate after 1000 cases

Growth rate after 100K cases

Growth rate after 1 Million cases

Growth rate after 10 Million cases

Tree Map Analysis

First & Last case report time

Confirmed cases Country & Day wise

Covid-19 Vs other Epidemics
