Project : Analyzing the trends of COVID-19 with Python

Problem Statement:

Given data about COVID-19 patients, write code to visualize the impact, analyze
the trends in the rates of infection and recovery, and predict the number of
cases expected a week into the future based on the current trends.

Dataset:

CSV and Excel files containing data about the number of COVID-19 confirmed
cases, deaths, and recovered patients, both around the world and in India.

Guidelines:

● Use pandas to accumulate data from multiple data files.

● Use plotly (visualization library) to create interactive visualizations.

● Use the Facebook Prophet library to build time series models.

● Visualize the prediction by combining these technologies.
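As a rough illustration of the first guideline, the sketch below combines two data files with pandas. The two in-memory CSV snippets are hypothetical stand-ins for real files (their layout is an assumption), and the numbers are the 2020-07-27 confirmed counts that appear later in this notebook.

```python
import io

import pandas as pd

# Two tiny in-memory "files" standing in for separate CSV files on disk.
# Their layout is an assumption made for this sketch only.
world_csv = io.StringIO(
    "Country,Date,Confirmed\n"
    "US,2020-07-27,4290259\n"
    "Brazil,2020-07-27,2442375\n"
)
india_csv = io.StringIO(
    "Country,Date,Confirmed\n"
    "India,2020-07-27,1480073\n"
)

# Read each file, then stack the rows into a single DataFrame.
frames = [pd.read_csv(f, parse_dates=["Date"]) for f in (world_csv, india_csv)]
combined = pd.concat(frames, ignore_index=True)
print(combined)
```

With real files, the `io.StringIO` objects would simply be replaced by file paths.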

In [1]:
# First of all, let's import some libraries
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.graph_objects as go
from sklearn.metrics import accuracy_score,confusion_matrix,classification_report,r2_score

In [2]:
# Let's load our COVID-19 dataset
cvd=pd.read_csv('covid_19.csv',skipinitialspace=True)


In [3]:
# Let's read the first five and last five records of our dataset
print(cvd.head(5))
print(cvd.tail(5))

  Province/State Country/Region       Lat       Long        Date  Confirmed  \
0            NaN    Afghanistan  33.93911  67.709953  2020-01-22          0   
1            NaN        Albania  41.15330  20.168300  2020-01-22          0   
2            NaN        Algeria  28.03390   1.659600  2020-01-22          0   
3            NaN        Andorra  42.50630   1.521800  2020-01-22          0   
4            NaN         Angola -11.20270  17.873900  2020-01-22          0   

   Deaths  Recovered  Active             WHO Region  
0       0          0       0  Eastern Mediterranean  
1       0          0       0                 Europe  
2       0          0       0                 Africa  
3       0          0       0                 Europe  
4       0          0       0                 Africa  
      Province/State         Country/Region        Lat       Long        Date  \
49063            NaN  Sao Tome and Principe   0.186400   6.613100  2020-07-27   
49064            NaN                  Y

In [4]:
# Two columns have unnecessarily long names, so let's rename them
cvd.rename(columns={'Province/State':'State','Country/Region':'Country'},inplace=True)
cvd.head()

Unnamed: 0,State,Country,Lat,Long,Date,Confirmed,Deaths,Recovered,Active,WHO Region
0,,Afghanistan,33.93911,67.709953,2020-01-22,0,0,0,0,Eastern Mediterranean
1,,Albania,41.1533,20.1683,2020-01-22,0,0,0,0,Europe
2,,Algeria,28.0339,1.6596,2020-01-22,0,0,0,0,Africa
3,,Andorra,42.5063,1.5218,2020-01-22,0,0,0,0,Europe
4,,Angola,-11.2027,17.8739,2020-01-22,0,0,0,0,Africa


"Lat" is the latitude of a location and "Long" is its longitude. Together, these coordinates give the geographical location where the COVID-19 cases or related data were recorded.

These features are essential for mapping and visualizing the spread of COVID-19 across different regions and countries. Researchers and data analysts use these coordinates to create visual representations such as maps, allowing a better understanding of the geographic distribution of cases.

# **EDA**

In [5]:
# Now let's look at the columns and general information of the dataset
cvd.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 49068 entries, 0 to 49067
Data columns (total 10 columns):
 #   Column      Non-Null Count  Dtype  
---  ------      --------------  -----  
 0   State       14664 non-null  object 
 1   Country     49068 non-null  object 
 2   Lat         49068 non-null  float64
 3   Long        49068 non-null  float64
 4   Date        49068 non-null  object 
 5   Confirmed   49068 non-null  int64  
 6   Deaths      49068 non-null  int64  
 7   Recovered   49068 non-null  int64  
 8   Active      49068 non-null  int64  
 9   WHO Region  49068 non-null  object 
dtypes: float64(2), int64(4), object(4)
memory usage: 3.7+ MB


We can see that there are 2 float, 4 int, and 4 object columns.

The shape of the dataset is 49068 x 10, that is, 49068 rows and 10 columns.

In addition, the non-null counts show that the State column contains null values.

In [6]:
# now check for null values
pd.isnull(cvd).sum()

State         34404
Country           0
Lat               0
Long              0
Date              0
Confirmed         0
Deaths            0
Recovered         0
Active            0
WHO Region        0
dtype: int64

So, in this dataset only the State column (originally Province/State) has null values: 34404 of its 49068 entries are missing.
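One possible way to handle the missing State values is sketched below on a three-row sample taken from the 2020-01-22 rows shown in this notebook. Filling with an empty string is an assumption made for illustration; the notebook itself leaves the nulls in place.

```python
import pandas as pd

# Small sample mirroring three 2020-01-22 rows from the dataset.
sample = pd.DataFrame({
    "State": [None, "Anhui", "Beijing"],
    "Country": ["Afghanistan", "China", "China"],
    "Confirmed": [0, 1, 14],
})

# Share of rows without a State value.
null_share = sample["State"].isna().mean()
print(f"{null_share:.0%} of rows have no State")

# Replace missing states with an empty string so later grouping or
# string operations do not have to special-case NaN.
sample["State"] = sample["State"].fillna("")
print(sample)
```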

## **Analysis of Confirmed, Deaths, Recovered, and Active cases by date and by country**

In [7]:
# for example try with latest date
cvd['Date'].max()

'2020-07-27'
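The Date column is stored as strings (dtype object). Taking the maximum still gives the latest date because the values use the ISO `YYYY-MM-DD` format, which sorts the same way alphabetically and chronologically. A minimal sketch of converting to real datetimes, which also allows date arithmetic, could look like this (the three sample dates are taken from this notebook):

```python
import pandas as pd

dates = pd.Series(["2020-01-22", "2020-07-27", "2020-02-17"], name="Date")

# String comparison happens to work for ISO-formatted dates.
latest_as_text = dates.max()

# Parsing to datetime makes the ordering explicit and enables arithmetic.
parsed = pd.to_datetime(dates, format="%Y-%m-%d")
latest = parsed.max()
span_days = (parsed.max() - parsed.min()).days

print(latest_as_text, latest.date(), span_days)
```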

In [8]:
# Let's use the data for '2020-07-27' to find some insights
date_1=cvd[cvd['Date']=='2020-07-27'] #filtering for specific date
date_1.head()

Unnamed: 0,State,Country,Lat,Long,Date,Confirmed,Deaths,Recovered,Active,WHO Region
48807,,Afghanistan,33.93911,67.709953,2020-07-27,36263,1269,25198,9796,Eastern Mediterranean
48808,,Albania,41.1533,20.1683,2020-07-27,4880,144,2745,1991,Europe
48809,,Algeria,28.0339,1.6596,2020-07-27,27973,1163,18837,7973,Africa
48810,,Andorra,42.5063,1.5218,2020-07-27,907,52,803,52,Europe
48811,,Angola,-11.2027,17.8739,2020-07-27,950,41,242,667,Africa


We can see that on 27 July 2020 Afghanistan had 36263 confirmed cases, 1269 deaths, 25198 recoveries, and 9796 active cases.

In [9]:
# Now let's find all the cases for the given date, country-wise
# (a list of column names is used, since selecting with a bare tuple is deprecated in pandas)
country_wise=date_1.groupby('Country')[['Confirmed','Deaths','Recovered','Active']].sum().reset_index()
country_wise


Unnamed: 0,Country,Confirmed,Deaths,Recovered,Active
0,Afghanistan,36263,1269,25198,9796
1,Albania,4880,144,2745,1991
2,Algeria,27973,1163,18837,7973
3,Andorra,907,52,803,52
4,Angola,950,41,242,667
...,...,...,...,...,...
182,West Bank and Gaza,10621,78,3752,6791
183,Western Sahara,10,1,8,1
184,Yemen,1691,483,833,375
185,Zambia,4552,140,2815,1597


In [10]:
# Let's find the top 10 countries with the most confirmed cases on that date
country_wise.sort_values(by='Confirmed',ascending=False).reset_index().iloc[0:10]

Unnamed: 0,index,Country,Confirmed,Deaths,Recovered,Active
0,173,US,4290259,148011,1325804,2816444
1,23,Brazil,2442375,87618,1846641,508116
2,79,India,1480073,33408,951166,495499
3,138,Russia,816680,13334,602249,201097
4,154,South Africa,452529,7067,274925,170537
5,111,Mexico,395489,44022,303810,47657
6,132,Peru,389717,18418,272547,98752
7,35,Chile,347923,9187,319954,18782
8,177,United Kingdom,301708,45844,1437,254427
9,81,Iran,293606,15912,255144,22550


From the above, we can see the top 10 countries with the most confirmed cases; the US has the most.
Similarly, we will obtain this for Deaths, Recovered, and Active cases.

In [11]:
country_wise.sort_values(by='Deaths',ascending=False).reset_index().iloc[0:10]

Unnamed: 0,index,Country,Confirmed,Deaths,Recovered,Active
0,173,US,4290259,148011,1325804,2816444
1,23,Brazil,2442375,87618,1846641,508116
2,177,United Kingdom,301708,45844,1437,254427
3,111,Mexico,395489,44022,303810,47657
4,85,Italy,246286,35112,198593,12581
5,79,India,1480073,33408,951166,495499
6,61,France,220352,30212,81212,108928
7,157,Spain,272421,28432,150376,93613
8,132,Peru,389717,18418,272547,98752
9,81,Iran,293606,15912,255144,22550


Again, we can see above that the US has the most deaths.

In [12]:
country_wise.sort_values(by='Recovered',ascending=False).reset_index().iloc[0:10]

Unnamed: 0,index,Country,Confirmed,Deaths,Recovered,Active
0,23,Brazil,2442375,87618,1846641,508116
1,173,US,4290259,148011,1325804,2816444
2,79,India,1480073,33408,951166,495499
3,138,Russia,816680,13334,602249,201097
4,35,Chile,347923,9187,319954,18782
5,111,Mexico,395489,44022,303810,47657
6,154,South Africa,452529,7067,274925,170537
7,132,Peru,389717,18418,272547,98752
8,81,Iran,293606,15912,255144,22550
9,128,Pakistan,274289,5842,241026,27421


For recoveries, Brazil has the highest number.

In [13]:
country_wise.sort_values(by='Active',ascending=False).reset_index().iloc[0:10]

Unnamed: 0,index,Country,Confirmed,Deaths,Recovered,Active
0,173,US,4290259,148011,1325804,2816444
1,23,Brazil,2442375,87618,1846641,508116
2,79,India,1480073,33408,951166,495499
3,177,United Kingdom,301708,45844,1437,254427
4,138,Russia,816680,13334,602249,201097
5,154,South Africa,452529,7067,274925,170537
6,37,Colombia,257101,8777,131161,117163
7,61,France,220352,30212,81212,108928
8,32,Canada,116458,8944,0,107514
9,132,Peru,389717,18418,272547,98752


The US has the most active cases on 2020-07-27.

We can run this analysis for any date and country in the dataset by simply creating a function.

In [14]:
def get_data_dctoc(date,country,*arg) :
  # list(arg): select the columns with a list, not a tuple
  df=cvd[cvd['Date']==date].groupby('Country')[list(arg)].sum().reset_index()
  return df[df['Country']==country]

Using the above function, you can get the data for a specific date, country, and type of case (Confirmed, Deaths, Active, or Recovered).
Below is an example:

In [15]:
get_data_dctoc('2020-07-27','India','Confirmed','Active')



Unnamed: 0,Country,Confirmed,Active
79,India,1480073,495499


Here the date is 2020-07-27, the country is India, and the case types are Confirmed and Active.
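A self-contained variant of this lookup is sketched below. It takes the DataFrame as an argument and converts the variable-length column arguments to a list before selecting them after `groupby`; to my knowledge, recent pandas versions reject a tuple of several column names there, while older ones only warned. The name `cases_on` and the four-row sample (values from the 2020-07-27 rows in this notebook) are illustrative assumptions.

```python
import pandas as pd

# Sample rows for 2020-07-27; China appears once per province.
sample = pd.DataFrame({
    "State": [None, None, "Tianjin", "Zhejiang"],
    "Country": ["India", "US", "China", "China"],
    "Date": ["2020-07-27"] * 4,
    "Confirmed": [1480073, 4290259, 204, 1270],
    "Active": [495499, 2816444, 6, 1],
})


def cases_on(df, date, country, *columns):
    """Return the summed case columns for one country on one date."""
    on_date = df[df["Date"] == date]
    # list(columns) turns the *args tuple into a list of column names.
    totals = on_date.groupby("Country")[list(columns)].sum().reset_index()
    return totals[totals["Country"] == country]


india = cases_on(sample, "2020-07-27", "India", "Confirmed", "Active")
china = cases_on(sample, "2020-07-27", "China", "Confirmed", "Active")
print(india)
print(china)
```

For China, the province rows are added up, so the result covers only the two provinces included in the sample.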

In [16]:
# Another function that returns all the data for a specific date, so we can analyze it
# by simply passing a value
def get_data_date(date) :
  df=cvd[cvd['Date']==date]
  return df

In [17]:
get_data_date('2020-07-27')

Unnamed: 0,State,Country,Lat,Long,Date,Confirmed,Deaths,Recovered,Active,WHO Region
48807,,Afghanistan,33.939110,67.709953,2020-07-27,36263,1269,25198,9796,Eastern Mediterranean
48808,,Albania,41.153300,20.168300,2020-07-27,4880,144,2745,1991,Europe
48809,,Algeria,28.033900,1.659600,2020-07-27,27973,1163,18837,7973,Africa
48810,,Andorra,42.506300,1.521800,2020-07-27,907,52,803,52,Europe
48811,,Angola,-11.202700,17.873900,2020-07-27,950,41,242,667,Africa
...,...,...,...,...,...,...,...,...,...,...
49063,,Sao Tome and Principe,0.186400,6.613100,2020-07-27,865,14,734,117,Africa
49064,,Yemen,15.552727,48.516388,2020-07-27,1691,483,833,375,Eastern Mediterranean
49065,,Comoros,-11.645500,43.333300,2020-07-27,354,7,328,19,Africa
49066,,Tajikistan,38.861000,71.276100,2020-07-27,7235,60,6028,1147,Europe


This is just an example; we can try any other date as well.


In [18]:
# creating function for specific country data
def get_data_country(country) :
  return cvd[cvd['Country']==country]

In [19]:
get_data_country('US')

Unnamed: 0,State,Country,Lat,Long,Date,Confirmed,Deaths,Recovered,Active,WHO Region
223,,US,40.0,-100.0,2020-01-22,1,0,0,1,Americas
484,,US,40.0,-100.0,2020-01-23,1,0,0,1,Americas
745,,US,40.0,-100.0,2020-01-24,2,0,0,2,Americas
1006,,US,40.0,-100.0,2020-01-25,2,0,0,2,Americas
1267,,US,40.0,-100.0,2020-01-26,5,0,0,5,Americas
...,...,...,...,...,...,...,...,...,...,...
47986,,US,40.0,-100.0,2020-07-23,4038816,144430,1233269,2661117,Americas
48247,,US,40.0,-100.0,2020-07-24,4112531,145560,1261624,2705347,Americas
48508,,US,40.0,-100.0,2020-07-25,4178970,146465,1279414,2753091,Americas
48769,,US,40.0,-100.0,2020-07-26,4233923,146935,1297863,2789125,Americas


Using these functions, we can directly retrieve the data we need, and by applying sorting or filtering to the result we can find, for example, the top 10 or the bottom 5 countries.
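A minimal sketch of such a ranking step is shown below, using pandas' `nlargest` and `nsmallest`. The helper name `rank_countries` is hypothetical, and the sample holds a few of the 2020-07-27 confirmed counts from this notebook.

```python
import pandas as pd

snapshot = pd.DataFrame({
    "Country": ["US", "Brazil", "India", "Russia", "Western Sahara"],
    "Confirmed": [4290259, 2442375, 1480073, 816680, 10],
})


def rank_countries(df, column, n=3, largest=True):
    """Return the n rows with the largest (or smallest) value in `column`."""
    picker = df.nlargest if largest else df.nsmallest
    return picker(n, column).reset_index(drop=True)


top3 = rank_countries(snapshot, "Confirmed", n=3)
bottom2 = rank_countries(snapshot, "Confirmed", n=2, largest=False)
print(top3)
print(bottom2)
```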

In [20]:
# One more function that totals the given case types by date
def get_data_t_asD(*arg) :
  # list(arg): select the columns with a list, not a tuple
  return cvd.groupby('Date')[list(arg)].sum().reset_index()

In [21]:
# lets get date wise confirmed cases
get_data_t_asD('Confirmed')

Unnamed: 0,Date,Confirmed
0,2020-01-22,555
1,2020-01-23,654
2,2020-01-24,941
3,2020-01-25,1434
4,2020-01-26,2118
...,...,...
183,2020-07-23,15510481
184,2020-07-24,15791645
185,2020-07-25,16047190
186,2020-07-26,16251796


From the above, we can retrieve the number of confirmed cases on a specific day.
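The sketch below shows one way to look up a single day and to derive daily new cases from the running totals with `diff()`. It assumes the Confirmed values are cumulative, which is how the totals above behave; the five sample values are the first five rows of that output.

```python
import pandas as pd

totals = pd.DataFrame({
    "Date": pd.to_datetime([
        "2020-01-22", "2020-01-23", "2020-01-24", "2020-01-25", "2020-01-26",
    ]),
    "Confirmed": [555, 654, 941, 1434, 2118],
})

# Index by date so a single day can be looked up directly.
by_date = totals.set_index("Date")["Confirmed"]
on_jan_24 = by_date.loc[pd.Timestamp("2020-01-24")]

# Day-over-day difference of a cumulative series gives the new cases per day.
# The first entry is NaN because there is no previous day to compare with.
daily_new = by_date.diff()

print(on_jan_24)
print(daily_new)
```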


In [22]:
get_data_t_asD('Active','Deaths')



Unnamed: 0,Date,Active,Deaths
0,2020-01-22,510,17
1,2020-01-23,606,18
2,2020-01-24,879,26
3,2020-01-25,1353,42
4,2020-01-26,2010,56
...,...,...,...
183,2020-07-23,6166006,633506
184,2020-07-24,6212290,639650
185,2020-07-25,6243930,644517
186,2020-07-26,6309711,648621


In [23]:
# Print the dates with the highest and lowest totals for each case type
for i in ['Confirmed','Deaths','Recovered','Active'] :
  print(f'Most no. of {i} on {get_data_t_asD(i).sort_values(by=i,ascending=False).iloc[0,0]} is {get_data_t_asD(i).sort_values(by=i,ascending=False).reset_index().loc[0,i]}')
  print(f'Least no. of {i} on {get_data_t_asD(i).sort_values(by=i,ascending=True).iloc[0,0]} is {get_data_t_asD(i).sort_values(by=i,ascending=True).reset_index().loc[0,i]}')

Most no. of Confirmed on 2020-07-27 is 16480485
Least no. of Confirmed on 2020-01-22 is 555
Most no. of Deaths on 2020-07-27 is 654036
Least no. of Deaths on 2020-01-22 is 17
Most no. of Recovered on 2020-07-27 is 9468087
Least no. of Recovered on 2020-01-22 is 28
Most no. of Active on 2020-07-27 is 6358362
Least no. of Active on 2020-01-22 is 510
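The loop above sorts the whole table four times per case type. An alternative sketch that finds the same extremes with `idxmax` and `idxmin` is shown below; the helper name `extremes` is hypothetical and the sample rows are taken from the date-wise totals in this notebook. Note that when several rows tie, `idxmax`/`idxmin` return the first one, which may differ from what a sort picks.

```python
import pandas as pd

totals = pd.DataFrame({
    "Date": ["2020-01-22", "2020-01-23", "2020-07-26", "2020-07-27"],
    "Confirmed": [555, 654, 16251796, 16480485],
    "Deaths": [17, 18, 648621, 654036],
})


def extremes(df, label_col, value_col):
    """Return ((label, value) of the maximum, (label, value) of the minimum)."""
    hi = df.loc[df[value_col].idxmax()]
    lo = df.loc[df[value_col].idxmin()]
    return (hi[label_col], hi[value_col]), (lo[label_col], lo[value_col])


for col in ["Confirmed", "Deaths"]:
    (hi_date, hi_val), (lo_date, lo_val) = extremes(totals, "Date", col)
    print(f"Most {col}: {hi_val} on {hi_date}; least {col}: {lo_val} on {lo_date}")
```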


In [24]:
# Another function that totals the given case types by country
def get_data_t_asC(*arg) :
    # list(arg): select the columns with a list, not a tuple
    return cvd.groupby('Country')[list(arg)].sum().reset_index()

In [25]:
get_data_t_asC('Confirmed','Deaths','Active','Recovered')



Unnamed: 0,Country,Confirmed,Deaths,Active,Recovered
0,Afghanistan,1936390,49098,1089052,798240
1,Albania,196702,5708,72117,118877
2,Algeria,1179755,77972,345886,755897
3,Andorra,94404,5423,19907,69074
4,Angola,22662,1078,15011,6573
...,...,...,...,...,...
182,West Bank and Gaza,233461,1370,170967,61124
183,Western Sahara,901,63,190,648
184,Yemen,67180,17707,25694,23779
185,Zambia,129421,2643,43167,83611


In [26]:
# Print the countries with the highest and lowest totals for each case type
for i in ['Confirmed','Deaths','Recovered','Active'] :
  print(f'Most no. of {i} in {get_data_t_asC(i).sort_values(by=i,ascending=False).iloc[0,0]} is {get_data_t_asC(i).sort_values(by=i,ascending=False).reset_index().loc[0,i]}')
  print(f'Least no. of {i} in {get_data_t_asC(i).sort_values(by=i,ascending=True).iloc[0,0]} is {get_data_t_asC(i).sort_values(by=i,ascending=True).reset_index().loc[0,i]}')

Most no. of Confirmed in US is 224345948
Least no. of Confirmed in Western Sahara is 901
Most no. of Deaths in US is 11011411
Least no. of Deaths in Fiji is 0
Most no. of Recovered in US is 56353416
Least no. of Recovered in Mozambique is 0
Most no. of Active in US is 156981121
Least no. of Active in Greenland is 135
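The per-date rows shown earlier grow steadily (for the US: 1, 1, 2, 2, 5, ... 4290259), which suggests that the counts are running totals. If that assumption holds, summing a country's rows across all dates counts the same cases many times, and the country total to date is better read from the latest date alone. A small sketch of the difference, using the last three days of US and India data from this notebook:

```python
import pandas as pd

rows = pd.DataFrame({
    "Country": ["US"] * 3 + ["India"] * 3,
    "Date": ["2020-07-25", "2020-07-26", "2020-07-27"] * 2,
    "Confirmed": [4178970, 4233923, 4290259, 1385635, 1435616, 1480073],
})

# Summing over every date adds each day's running total together.
summed = rows.groupby("Country")["Confirmed"].sum()

# Restricting to the latest date gives the cumulative count to date.
latest_rows = rows[rows["Date"] == rows["Date"].max()]
latest = latest_rows.groupby("Country")["Confirmed"].sum()

print(summed)
print(latest)
```

The grouping on the latest date still matters for countries that are split into several State rows, such as China.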


### **Now we will analyze the US, India, and China**

***First take US***

In [27]:
us_data=get_data_country('US')
us_data.head()

Unnamed: 0,State,Country,Lat,Long,Date,Confirmed,Deaths,Recovered,Active,WHO Region
223,,US,40.0,-100.0,2020-01-22,1,0,0,1,Americas
484,,US,40.0,-100.0,2020-01-23,1,0,0,1,Americas
745,,US,40.0,-100.0,2020-01-24,2,0,0,2,Americas
1006,,US,40.0,-100.0,2020-01-25,2,0,0,2,Americas
1267,,US,40.0,-100.0,2020-01-26,5,0,0,5,Americas


In [28]:
us_data.tail()

Unnamed: 0,State,Country,Lat,Long,Date,Confirmed,Deaths,Recovered,Active,WHO Region
47986,,US,40.0,-100.0,2020-07-23,4038816,144430,1233269,2661117,Americas
48247,,US,40.0,-100.0,2020-07-24,4112531,145560,1261624,2705347,Americas
48508,,US,40.0,-100.0,2020-07-25,4178970,146465,1279414,2753091,Americas
48769,,US,40.0,-100.0,2020-07-26,4233923,146935,1297863,2789125,Americas
49030,,US,40.0,-100.0,2020-07-27,4290259,148011,1325804,2816444,Americas


In [29]:
us_data.shape

(188, 10)

In [30]:
us_data.isnull().sum()

State         188
Country         0
Lat             0
Long            0
Date            0
Confirmed       0
Deaths          0
Recovered       0
Active          0
WHO Region      0
dtype: int64

In [31]:
# Totals of the given case types for the US, by date
def us_data_t_asD(*arg) :
  return us_data.groupby('Date')[list(arg)].sum().reset_index()

# Totals of the given case types for the US, by state
def us_data_t_asS(*arg) :
    return us_data.groupby('State')[list(arg)].sum().reset_index()

In [32]:
us_data_t_asD('Deaths','Active','Recovered')



Unnamed: 0,Date,Deaths,Active,Recovered
0,2020-01-22,0,1,0
1,2020-01-23,0,1,0
2,2020-01-24,0,2,0
3,2020-01-25,0,2,0
4,2020-01-26,0,5,0
...,...,...,...,...
183,2020-07-23,144430,2661117,1233269
184,2020-07-24,145560,2705347,1261624
185,2020-07-25,146465,2753091,1279414
186,2020-07-26,146935,2789125,1297863


In [33]:
# Print the dates with the highest and lowest US totals for each case type
for i in ['Confirmed','Deaths','Recovered','Active'] :
  print(f'Most no. of {i} in the US on {us_data_t_asD(i).sort_values(by=i,ascending=False).iloc[0,0]} is {us_data_t_asD(i).sort_values(by=i,ascending=False).reset_index().loc[0,i]}')
  print(f'Least no. of {i} in the US on {us_data_t_asD(i).sort_values(by=i,ascending=True).iloc[0,0]} is {us_data_t_asD(i).sort_values(by=i,ascending=True).reset_index().loc[0,i]}')

Most no. of Confirmed in the US on 2020-07-27 is 4290259
Least no. of Confirmed in the US on 2020-01-22 is 1
Most no. of Deaths in the US on 2020-07-27 is 148011
Least no. of Deaths in the US on 2020-01-22 is 0
Most no. of Recovered in the US on 2020-07-27 is 1325804
Least no. of Recovered in the US on 2020-01-22 is 0
Most no. of Active in the US on 2020-07-27 is 2816444
Least no. of Active in the US on 2020-01-22 is 1


## ***Now let's look at India***

In [34]:
ind_data=get_data_country('India')
ind_data.head()

Unnamed: 0,State,Country,Lat,Long,Date,Confirmed,Deaths,Recovered,Active,WHO Region
129,,India,20.593684,78.96288,2020-01-22,0,0,0,0,South-East Asia
390,,India,20.593684,78.96288,2020-01-23,0,0,0,0,South-East Asia
651,,India,20.593684,78.96288,2020-01-24,0,0,0,0,South-East Asia
912,,India,20.593684,78.96288,2020-01-25,0,0,0,0,South-East Asia
1173,,India,20.593684,78.96288,2020-01-26,0,0,0,0,South-East Asia


In [35]:
ind_data.tail()

Unnamed: 0,State,Country,Lat,Long,Date,Confirmed,Deaths,Recovered,Active,WHO Region
47892,,India,20.593684,78.96288,2020-07-23,1288108,30601,817209,440298,South-East Asia
48153,,India,20.593684,78.96288,2020-07-24,1337024,31358,849432,456234,South-East Asia
48414,,India,20.593684,78.96288,2020-07-25,1385635,32060,885573,468002,South-East Asia
48675,,India,20.593684,78.96288,2020-07-26,1435616,32771,917568,485277,South-East Asia
48936,,India,20.593684,78.96288,2020-07-27,1480073,33408,951166,495499,South-East Asia


In [36]:
ind_data.shape

(188, 10)

In [37]:
ind_data.isnull().sum()

State         188
Country         0
Lat             0
Long            0
Date            0
Confirmed       0
Deaths          0
Recovered       0
Active          0
WHO Region      0
dtype: int64

In [38]:
# Totals of the given case types for India, by date
def ind_data_t_asD(*arg) :
  return ind_data.groupby('Date')[list(arg)].sum().reset_index()

# Totals of the given case types for India, by state
def ind_data_t_asS(*arg) :
    return ind_data.groupby('State')[list(arg)].sum().reset_index()

In [39]:
ind_data_t_asD('Confirmed','Deaths','Active','Recovered')



Unnamed: 0,Date,Confirmed,Deaths,Active,Recovered
0,2020-01-22,0,0,0,0
1,2020-01-23,0,0,0,0
2,2020-01-24,0,0,0,0
3,2020-01-25,0,0,0,0
4,2020-01-26,0,0,0,0
...,...,...,...,...,...
183,2020-07-23,1288108,30601,440298,817209
184,2020-07-24,1337024,31358,456234,849432
185,2020-07-25,1385635,32060,468002,885573
186,2020-07-26,1435616,32771,485277,917568


In [40]:
# Print the dates with the highest and lowest totals in India for each case type
for i in ['Confirmed','Deaths','Recovered','Active'] :
  print(f'Most no. of {i} in India on {ind_data_t_asD(i).sort_values(by=i,ascending=False).iloc[0,0]} is {ind_data_t_asD(i).sort_values(by=i,ascending=False).reset_index().loc[0,i]}')
  print(f'Least no. of {i} in India on {ind_data_t_asD(i).sort_values(by=i,ascending=True).iloc[0,0]} is {ind_data_t_asD(i).sort_values(by=i,ascending=True).reset_index().loc[0,i]}')

Most no. of Confirmed in India on 2020-07-27 is 1480073
Least no. of Confirmed in India on 2020-01-22 is 0
Most no. of Deaths in India on 2020-07-27 is 33408
Least no. of Deaths in India on 2020-01-22 is 0
Most no. of Recovered in India on 2020-07-27 is 951166
Least no. of Recovered in India on 2020-01-22 is 0
Most no. of Active in India on 2020-07-27 is 495499
Least no. of Active in India on 2020-01-22 is 0


## ***Now China***

In [41]:
chin_data=get_data_country('China')
chin_data.head()

Unnamed: 0,State,Country,Lat,Long,Date,Confirmed,Deaths,Recovered,Active,WHO Region
48,Anhui,China,31.8257,117.2264,2020-01-22,1,0,0,1,Western Pacific
49,Beijing,China,40.1824,116.4142,2020-01-22,14,0,0,14,Western Pacific
50,Chongqing,China,30.0572,107.874,2020-01-22,6,0,0,6,Western Pacific
51,Fujian,China,26.0789,117.9874,2020-01-22,1,0,0,1,Western Pacific
52,Gansu,China,35.7518,104.2861,2020-01-22,0,0,0,0,Western Pacific


In [42]:
chin_data.tail()

Unnamed: 0,State,Country,Lat,Long,Date,Confirmed,Deaths,Recovered,Active,WHO Region
48883,Tianjin,China,39.3054,117.323,2020-07-27,204,3,195,6,Western Pacific
48884,Tibet,China,31.6927,88.0924,2020-07-27,1,0,1,0,Western Pacific
48885,Xinjiang,China,41.1129,85.2401,2020-07-27,311,3,73,235,Western Pacific
48886,Yunnan,China,24.974,101.487,2020-07-27,190,2,186,2,Western Pacific
48887,Zhejiang,China,29.1832,120.0934,2020-07-27,1270,1,1268,1,Western Pacific


In [43]:
chin_data.shape

(6204, 10)

In [44]:
chin_data.isnull().sum()

State         0
Country       0
Lat           0
Long          0
Date          0
Confirmed     0
Deaths        0
Recovered     0
Active        0
WHO Region    0
dtype: int64

In [45]:
# Totals of the given case types for China, by date
def chin_data_t_asD(*arg) :
  return chin_data.groupby('Date')[list(arg)].sum().reset_index()

# Totals of the given case types for China, by state
def chin_data_t_asS(*arg) :
    return chin_data.groupby('State')[list(arg)].sum().reset_index()

In [46]:
chin_data_t_asD('Confirmed','Deaths','Active','Recovered')



Unnamed: 0,Date,Confirmed,Deaths,Active,Recovered
0,2020-01-22,548,17,503,28
1,2020-01-23,643,18,595,30
2,2020-01-24,920,26,858,36
3,2020-01-25,1406,42,1325,39
4,2020-01-26,2075,56,1970,49
...,...,...,...,...,...
183,2020-07-23,86045,4649,2695,78701
184,2020-07-24,86202,4650,2807,78745
185,2020-07-25,86381,4652,2916,78813
186,2020-07-26,86570,4652,3056,78862


In [47]:
chin_data_t_asS('Confirmed','Deaths','Active','Recovered').head(10)



Unnamed: 0,State,Confirmed,Deaths,Active,Recovered
0,Anhui,172497,1007,15722,155768
1,Beijing,108512,1383,23282,83847
2,Chongqing,101756,1013,10608,90135
3,Fujian,59855,159,6760,52936
4,Gansu,23786,341,2282,21163
5,Guangdong,268051,1273,29668,237110
6,Guangxi,44368,336,5105,38927
7,Guizhou,25341,335,2334,22672
8,Hainan,29584,993,2661,25930
9,Hebei,56848,1000,55848,0


In [48]:
# Print the dates with the highest and lowest totals in China for each case type
for i in ['Confirmed','Deaths','Recovered','Active'] :
  print(f'Most no. of {i} in China on {chin_data_t_asD(i).sort_values(by=i,ascending=False).iloc[0,0]} is {chin_data_t_asD(i).sort_values(by=i,ascending=False).reset_index().loc[0,i]}')
  print(f'Least no. of {i} in China on {chin_data_t_asD(i).sort_values(by=i,ascending=True).iloc[0,0]} is {chin_data_t_asD(i).sort_values(by=i,ascending=True).reset_index().loc[0,i]}')

Most no. of Confirmed in China on 2020-07-27 is 86783
Least no. of Confirmed in China on 2020-01-22 is 548
Most no. of Deaths in China on 2020-07-27 is 4656
Least no. of Deaths in China on 2020-01-22 is 17
Most no. of Recovered in China on 2020-07-27 is 78869
Least no. of Recovered in China on 2020-01-22 is 28
Most no. of Active in China on 2020-02-17 is 58739
Least no. of Active in China on 2020-01-22 is 503
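To compare the three countries on a common scale, one option is to turn the 2020-07-27 totals into rates. The sketch below also checks that the rows shown above are consistent with Active = Confirmed - Deaths - Recovered, which holds for the US and India figures in this notebook. The column names `RecoveryRate` and `FatalityRate` are introduced here for illustration only.

```python
import pandas as pd

# Totals on 2020-07-27 as reported in the outputs above.
latest = pd.DataFrame({
    "Country": ["US", "India", "China"],
    "Confirmed": [4290259, 1480073, 86783],
    "Deaths": [148011, 33408, 4656],
    "Recovered": [1325804, 951166, 78869],
}).set_index("Country")

# Active cases derived from the other three columns.
latest["Active"] = latest["Confirmed"] - latest["Deaths"] - latest["Recovered"]

# Percentages of confirmed cases that recovered or died so far.
latest["RecoveryRate"] = (latest["Recovered"] / latest["Confirmed"] * 100).round(1)
latest["FatalityRate"] = (latest["Deaths"] / latest["Confirmed"] * 100).round(1)

print(latest)
```

These rates are simple ratios of reported totals on one date, so they depend on how each country reports recoveries.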


## **Now let's analyze by WHO region**

In [49]:
def w_data_R(region) :
 return cvd[cvd['WHO Region']==region]

In [50]:
cvd['WHO Region'].unique()

array(['Eastern Mediterranean', 'Europe', 'Africa', 'Americas',
       'Western Pacific', 'South-East Asia'], dtype=object)

In [51]:
# Another function that totals the given case types by WHO region
def get_data_t_asW(*arg) :
    # list(arg): select the columns with a list, not a tuple
    return cvd.groupby('WHO Region')[list(arg)].sum().reset_index()

In [52]:
get_data_t_asW('Deaths','Confirmed','Recovered','Active')



Unnamed: 0,WHO Region,Deaths,Confirmed,Recovered,Active
0,Africa,439978,21791827,11193730,10158119
1,Americas,19359292,402261194,157069444,225832458
2,Eastern Mediterranean,1924029,74082892,48050703,24108160
3,Europe,19271040,248879793,123202075,106406678
4,South-East Asia,1458134,55118365,30030327,23629904
5,Western Pacific,932430,26374411,18861950,6580031


In [53]:
# Print the WHO regions with the highest and lowest totals for each case type
for i in ['Confirmed','Deaths','Recovered','Active'] :
  print(f'Most no. of {i} in {get_data_t_asW(i).sort_values(by=i,ascending=False).iloc[0,0]} is {get_data_t_asW(i).sort_values(by=i,ascending=False).reset_index().loc[0,i]}')
  print(f'Least no. of {i} in {get_data_t_asW(i).sort_values(by=i,ascending=True).iloc[0,0]} is {get_data_t_asW(i).sort_values(by=i,ascending=True).reset_index().loc[0,i]}')

Most no. of Confirmed in Americas is 402261194
Least no. of Confirmed in Africa is 21791827
Most no. of Deaths in Americas is 19359292
Least no. of Deaths in Africa is 439978
Most no. of Recovered in Americas is 157069444
Least no. of Recovered in Africa is 11193730
Most no. of Active in Americas is 225832458
Least no. of Active in Western Pacific is 6580031


**Let's start with the first region, 'Eastern Mediterranean'**

In [54]:
emd=w_data_R('Eastern Mediterranean')
emd.head()

Unnamed: 0,State,Country,Lat,Long,Date,Confirmed,Deaths,Recovered,Active,WHO Region
0,,Afghanistan,33.93911,67.709953,2020-01-22,0,0,0,0,Eastern Mediterranean
19,,Bahrain,26.0275,50.55,2020-01-22,0,0,0,0,Eastern Mediterranean
93,,Djibouti,11.8251,42.5903,2020-01-22,0,0,0,0,Eastern Mediterranean
96,,Egypt,26.820553,30.802498,2020-01-22,0,0,0,0,Eastern Mediterranean
131,,Iran,32.427908,53.688046,2020-01-22,0,0,0,0,Eastern Mediterranean


In [55]:
emd.tail()

Unnamed: 0,State,Country,Lat,Long,Date,Confirmed,Deaths,Recovered,Active,WHO Region
49021,,United Arab Emirates,23.424076,53.847818,2020-07-27,59177,345,52510,6322,Eastern Mediterranean
49039,,Syria,34.802075,38.996815,2020-07-27,674,40,0,634,Eastern Mediterranean
49043,,Libya,26.3351,17.228331,2020-07-27,2827,64,577,2186,Eastern Mediterranean
49044,,West Bank and Gaza,31.9522,35.2332,2020-07-27,10621,78,3752,6791,Eastern Mediterranean
49064,,Yemen,15.552727,48.516388,2020-07-27,1691,483,833,375,Eastern Mediterranean


In [56]:
emd.shape

(4136, 10)

In [57]:
emd.isnull().sum()

State         4136
Country          0
Lat              0
Long             0
Date             0
Confirmed        0
Deaths           0
Recovered        0
Active           0
WHO Region       0
dtype: int64

In [58]:
# Function that returns all the Eastern Mediterranean data for a specific date,
# so we can analyze it by simply passing a value
def emd_date(date) :
  df=emd[emd['Date']==date]
  return df

In [59]:
emd_date('2020-05-01')

Unnamed: 0,State,Country,Lat,Long,Date,Confirmed,Deaths,Recovered,Active,WHO Region
26100,,Afghanistan,33.93911,67.709953,2020-05-01,2335,68,310,1957,Eastern Mediterranean
26119,,Bahrain,26.0275,50.55,2020-05-01,3170,8,1555,1607,Eastern Mediterranean
26193,,Djibouti,11.8251,42.5903,2020-05-01,1097,2,672,423,Eastern Mediterranean
26196,,Egypt,26.820553,30.802498,2020-05-01,5895,406,1460,4029,Eastern Mediterranean
26231,,Iran,32.427908,53.688046,2020-05-01,95646,6091,76318,13237,Eastern Mediterranean
26232,,Iraq,33.223191,43.679291,2020-05-01,2153,94,1414,645,Eastern Mediterranean
26238,,Jordan,31.24,36.51,2020-05-01,459,8,364,87,Eastern Mediterranean
26242,,Kuwait,29.31166,47.481766,2020-05-01,4377,30,1602,2745,Eastern Mediterranean
26245,,Lebanon,33.8547,35.8623,2020-05-01,729,24,192,513,Eastern Mediterranean
26261,,Morocco,31.7917,-7.0926,2020-05-01,4569,171,1083,3315,Eastern Mediterranean


In [60]:
# creating function for specific country data
def emd_country(country) :
  # `in` on a Series tests the index labels, so check against the values instead
  if country not in emd['Country'].values :
    print('Enter a country from the Eastern Mediterranean region')

  return emd[emd['Country']==country]


In [61]:
emd_country('Qatar')


Unnamed: 0,State,Country,Lat,Long,Date,Confirmed,Deaths,Recovered,Active,WHO Region
183,,Qatar,25.3548,51.1839,2020-01-22,0,0,0,0,Eastern Mediterranean
444,,Qatar,25.3548,51.1839,2020-01-23,0,0,0,0,Eastern Mediterranean
705,,Qatar,25.3548,51.1839,2020-01-24,0,0,0,0,Eastern Mediterranean
966,,Qatar,25.3548,51.1839,2020-01-25,0,0,0,0,Eastern Mediterranean
1227,,Qatar,25.3548,51.1839,2020-01-26,0,0,0,0,Eastern Mediterranean
...,...,...,...,...,...,...,...,...,...,...
47946,,Qatar,25.3548,51.1839,2020-07-23,108244,164,105018,3062,Eastern Mediterranean
48207,,Qatar,25.3548,51.1839,2020-07-24,108638,164,105420,3054,Eastern Mediterranean
48468,,Qatar,25.3548,51.1839,2020-07-25,109036,164,105750,3122,Eastern Mediterranean
48729,,Qatar,25.3548,51.1839,2020-07-26,109305,165,106024,3116,Eastern Mediterranean
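Membership tests on a pandas Series are easy to get wrong: `value in series` looks at the index labels, not at the stored values. The short sketch below shows the difference on a three-country sample (the variable names are illustrative):

```python
import pandas as pd

countries = pd.Series(["Afghanistan", "Qatar", "Yemen"], name="Country")

# `in` on a Series checks the index labels (0, 1, 2 here), so this is False.
label_hit = "Qatar" in countries

# Checking against the underlying values gives the intended answer.
value_hit = "Qatar" in countries.values

# An equivalent, vectorised check.
eq_hit = bool(countries.eq("Qatar").any())

print(label_hit, value_hit, eq_hit)
```

This is why a check like `country not in emd['Country']` reports a valid country as missing.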


In [62]:
# One more function that totals the given case types by date
def emd_t_asD(*arg) :
  return emd.groupby('Date')[list(arg)].sum().reset_index()

In [63]:
emd_t_asD('Deaths')

Unnamed: 0,Date,Deaths
0,2020-01-22,0
1,2020-01-23,0
2,2020-01-24,0
3,2020-01-25,0
4,2020-01-26,0
...,...,...
183,2020-07-23,36575
184,2020-07-24,37033
185,2020-07-25,37467
186,2020-07-26,37894


In [64]:
# Print the dates with the highest and lowest regional totals for each case type
for i in ['Confirmed','Deaths','Recovered','Active'] :
  print(f'Most no. of {i} on {emd_t_asD(i).sort_values(by=i,ascending=False).iloc[0,0]} is {emd_t_asD(i).sort_values(by=i,ascending=False).reset_index().loc[0,i]}')
  print(f'Least no. of {i} on {emd_t_asD(i).sort_values(by=i,ascending=True).iloc[0,0]} is {emd_t_asD(i).sort_values(by=i,ascending=True).reset_index().loc[0,i]}')

Most no. of Confirmed on 2020-07-27 is 1490744
Least no. of Confirmed on 2020-01-22 is 0
Most no. of Deaths on 2020-07-27 is 38339
Least no. of Deaths on 2020-01-22 is 0
Most no. of Recovered on 2020-07-27 is 1201400
Least no. of Recovered on 2020-01-22 is 0
Most no. of Active on 2020-07-01 is 351468
Least no. of Active on 2020-01-22 is 0


In [65]:
# Another function that totals the given case types by country
def emd_t_asC(*arg) :
  return emd.groupby('Country')[list(arg)].sum().reset_index()

In [66]:
emd_t_asC('Active')

Unnamed: 0,Country,Active
0,Afghanistan,1089052
1,Bahrain,415304
2,Djibouti,78062
3,Egypt,2758600
4,Iran,3114236
5,Iraq,1181275
6,Jordan,24058
7,Kuwait,891729
8,Lebanon,80579
9,Libya,46510


In [67]:
# Print the countries with the highest and lowest regional totals for each case type
for i in ['Confirmed','Deaths','Recovered','Active'] :
  print(f'Most no. of {i} in {emd_t_asC(i).sort_values(by=i,ascending=False).iloc[0,0]} is {emd_t_asC(i).sort_values(by=i,ascending=False).reset_index().loc[0,i]}')
  print(f'Least no. of {i} in {emd_t_asC(i).sort_values(by=i,ascending=True).iloc[0,0]} is {emd_t_asC(i).sort_values(by=i,ascending=True).reset_index().loc[0,i]}')

Most no. of Confirmed in Iran is 19339267
Least no. of Confirmed in Syria is 20946
Most no. of Deaths in Iran is 1024136
Least no. of Deaths in Syria is 973
Most no. of Recovered in Iran is 15200895
Least no. of Recovered in Syria is 0
Most no. of Active in Pakistan is 5633262
Least no. of Active in Syria is 19973


**Similarly, we can analyze the other WHO regions. Let's do it for one more region, Europe.**
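Since the per-region helpers differ only in the region they filter on, a single function that takes the region as a parameter could replace the separate copies. The sketch below is one way to do that; the name `region_totals` is hypothetical, and the sample rows are 2020-05-01 values from this notebook.

```python
import pandas as pd

sample = pd.DataFrame({
    "Country": ["Afghanistan", "Bahrain", "Albania", "Andorra"],
    "Date": ["2020-05-01"] * 4,
    "Confirmed": [2335, 3170, 782, 745],
    "Deaths": [68, 8, 31, 43],
    "WHO Region": ["Eastern Mediterranean"] * 2 + ["Europe"] * 2,
})


def region_totals(df, region, by, *columns):
    """Sum the given case columns for one WHO region, grouped by `by`."""
    subset = df[df["WHO Region"] == region]
    return subset.groupby(by)[list(columns)].sum().reset_index()


europe_by_date = region_totals(sample, "Europe", "Date", "Confirmed", "Deaths")
emed_by_country = region_totals(sample, "Eastern Mediterranean", "Country", "Confirmed")
print(europe_by_date)
print(emed_by_country)
```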

In [68]:
euro=w_data_R('Europe')
euro.head()

Unnamed: 0,State,Country,Lat,Long,Date,Confirmed,Deaths,Recovered,Active,WHO Region
1,,Albania,41.1533,20.1683,2020-01-22,0,0,0,0,Europe
3,,Andorra,42.5063,1.5218,2020-01-22,0,0,0,0,Europe
7,,Armenia,40.0691,45.0382,2020-01-22,0,0,0,0,Europe
16,,Austria,47.5162,14.5501,2020-01-22,0,0,0,0,Europe
17,,Azerbaijan,40.1431,47.5769,2020-01-22,0,0,0,0,Europe


In [69]:
euro.tail()

Unnamed: 0,State,Country,Lat,Long,Date,Confirmed,Deaths,Recovered,Active,WHO Region
49053,British Virgin Islands,United Kingdom,18.4207,-64.64,2020-07-27,8,1,7,0,Europe
49054,Turks and Caicos Islands,United Kingdom,21.694,-71.7979,2020-07-27,99,2,36,61,Europe
49059,Falkland Islands (Malvinas),United Kingdom,-51.7963,-59.5236,2020-07-27,13,0,13,0,Europe
49060,Saint Pierre and Miquelon,France,46.8852,-56.3159,2020-07-27,4,0,1,3,Europe
49066,,Tajikistan,38.861,71.2761,2020-07-27,7235,60,6028,1147,Europe


In [70]:
euro.shape

(15040, 10)

In [71]:
euro.isnull().sum()

State         10340
Country           0
Lat               0
Long              0
Date              0
Confirmed         0
Deaths            0
Recovered         0
Active            0
WHO Region        0
dtype: int64

In [72]:
#creating another function that returns all the Europe data for a specific date, so we can analyze it
# by simply passing a value
def euro_date(date) :
  df=euro[euro['Date']==date]
  return df

In [73]:
euro_date('2020-05-01')

Unnamed: 0,State,Country,Lat,Long,Date,Confirmed,Deaths,Recovered,Active,WHO Region
26101,,Albania,41.1533,20.1683,2020-05-01,782,31,488,263,Europe
26103,,Andorra,42.5063,1.5218,2020-05-01,745,43,468,234,Europe
26107,,Armenia,40.0691,45.0382,2020-05-01,2148,33,977,1138,Europe
26116,,Austria,47.5162,14.5501,2020-05-01,15531,589,13110,1832,Europe
26117,,Azerbaijan,40.1431,47.5769,2020-05-01,1854,25,1365,464,Europe
...,...,...,...,...,...,...,...,...,...,...
26346,British Virgin Islands,United Kingdom,18.4207,-64.6400,2020-05-01,6,1,3,2,Europe
26347,Turks and Caicos Islands,United Kingdom,21.6940,-71.7979,2020-05-01,12,1,5,6,Europe
26352,Falkland Islands (Malvinas),United Kingdom,-51.7963,-59.5236,2020-05-01,13,0,13,0,Europe
26353,Saint Pierre and Miquelon,France,46.8852,-56.3159,2020-05-01,1,0,0,1,Europe


In [74]:
# creating function for specific country data
def euro_country(country) :
  if country not in euro['Country'].values :
    print('Enter a country from the Europe region')

  return euro[euro['Country']==country]


In [75]:
euro_country('United Kingdom')

Unnamed: 0,State,Country,Lat,Long,Date,Confirmed,Deaths,Recovered,Active,WHO Region
215,Bermuda,United Kingdom,32.3078,-64.7505,2020-01-22,0,0,0,0,Europe
216,Cayman Islands,United Kingdom,19.3133,-81.2546,2020-01-22,0,0,0,0,Europe
217,Channel Islands,United Kingdom,49.3723,-2.3644,2020-01-22,0,0,0,0,Europe
218,Gibraltar,United Kingdom,36.1408,-5.3536,2020-01-22,0,0,0,0,Europe
219,Isle of Man,United Kingdom,54.2361,-4.5481,2020-01-22,0,0,0,0,Europe
...,...,...,...,...,...,...,...,...,...,...
49028,,United Kingdom,55.3781,-3.4360,2020-07-27,300111,45759,0,254352,Europe
49052,Anguilla,United Kingdom,18.2206,-63.0686,2020-07-27,3,0,3,0,Europe
49053,British Virgin Islands,United Kingdom,18.4207,-64.6400,2020-07-27,8,1,7,0,Europe
49054,Turks and Caicos Islands,United Kingdom,21.6940,-71.7979,2020-07-27,99,2,36,61,Europe
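
A note on country checks like the one in `euro_country`: the `in` operator on a pandas Series tests the index labels, not the values. A small sketch with a made-up Series (`countries` is illustrative only):

```python
import pandas as pd

# Made-up Series: the index labels are 0, 1, 2 and the values are country names
countries = pd.Series(['France', 'Italy', 'United Kingdom'])

label_hit = 'United Kingdom' in countries            # looks at the index labels
value_hit = 'United Kingdom' in countries.values     # looks at the values
isin_hit = countries.isin(['United Kingdom']).any()  # vectorised value test
index_hit = 1 in countries                           # 1 is an index label

print(label_hit, value_hit, isin_hit, index_hit)
```

So a value check should go through `.values` or `.isin(...)`.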


In [76]:
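# creating one more function that takes the type of cases, grouped by date
# note: this groups euro; the outputs recorded below came from a run that grouped emd,
# which is why they match the Eastern Mediterranean figures
def euro_t_asD(*arg) :
  return euro.groupby('Date')[list(arg)].sum().reset_index()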

In [77]:
euro_t_asD('Deaths')

Unnamed: 0,Date,Deaths
0,2020-01-22,0
1,2020-01-23,0
2,2020-01-24,0
3,2020-01-25,0
4,2020-01-26,0
...,...,...
183,2020-07-23,36575
184,2020-07-24,37033
185,2020-07-25,37467
186,2020-07-26,37894


In [78]:
#printing some data date-wise
for i in ['Confirmed','Deaths','Recovered','Active'] :
  print(f'Most no.of {i} on {euro_t_asD(i).sort_values(by=i,ascending=False).iloc[0,0]} is {euro_t_asD(i).sort_values(by=i,ascending=False).reset_index().loc[0,i]}')
  print(f'Least no.of {i} on {euro_t_asD(i).sort_values(by=i,ascending=True).iloc[0,0]} is {euro_t_asD(i).sort_values(by=i,ascending=True).reset_index().loc[0,i]}')

Most no.of Confirmed on 2020-07-27 is 1490744
Least no.of Confirmed on 2020-01-22 is 0
Most no.of Deaths on 2020-07-27 is 38339
Least no.of Deaths on 2020-01-22 is 0
Most no.of Recovered on 2020-07-27 is 1201400
Least no.of Recovered on 2020-01-22 is 0
Most no.of Active on 2020-07-01 is 351468
Least no.of Active on 2020-01-22 is 0


In [79]:
## creating one more function that takes the type of cases, grouped by country
def euro_t_asC(*arg) :
  return euro.groupby('Country')[list(arg)].sum().reset_index()

In [80]:
euro_t_asC('Active').head()

Unnamed: 0,Country,Active
0,Albania,72117
1,Andorra,19907
2,Armenia,702602
3,Austria,325216
4,Azerbaijan,417033


In [81]:
#printing some data country-wise
for i in ['Confirmed','Deaths','Recovered','Active'] :
  print(f'Most no.of {i} in {euro_t_asC(i).sort_values(by=i,ascending=False).iloc[0,0]} is {euro_t_asC(i).sort_values(by=i,ascending=False).reset_index().loc[0,i]}')
  print(f'Least no.of {i} in {euro_t_asC(i).sort_values(by=i,ascending=True).iloc[0,0]} is {euro_t_asC(i).sort_values(by=i,ascending=True).reset_index().loc[0,i]}')

Most no.of Confirmed in Russia is 45408411
Least no.of Confirmed in Holy See is 1356
Most no.of Deaths in United Kingdom is 3997775
Least no.of Deaths in Holy See is 0
Most no.of Recovered in Russia is 25120448
Least no.of Recovered in Sweden is 0
Most no.of Active in United Kingdom is 22624595
Least no.of Active in Greenland is 135
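
The date-wise tables in this notebook rise steadily, which suggests the counts are running totals. If that holds, summing over every date (as the `*_t_asC` helpers do) adds each day's running total again, and a snapshot on the latest date is closer to an actual per-country total. A minimal sketch under that assumption; `toy` and `latest_totals` are hypothetical names.

```python
import pandas as pd

# Toy running totals for two countries over three days
toy = pd.DataFrame({
    'Country': ['A', 'A', 'A', 'B', 'B', 'B'],
    'Date': ['2020-01-01', '2020-01-02', '2020-01-03'] * 2,
    'Confirmed': [1, 4, 10, 0, 2, 3],
})

def latest_totals(df, cols):
    """Sum the given columns per country using only the most recent date."""
    last = df[df['Date'] == df['Date'].max()]
    return last.groupby('Country')[list(cols)].sum().reset_index()

summed = toy.groupby('Country')[['Confirmed']].sum().reset_index()  # adds every day's running total
snapshot = latest_totals(toy, ['Confirmed'])                        # latest running total only

print(summed)
print(snapshot)
```

The `groupby(...).sum()` inside `latest_totals` still matters, because some countries are split into several State rows on the same date.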


## **Now, let's visualize the data in several ways.**

In [82]:
#retrieve some data and store it in a new variable
data1=get_data_t_asD('Confirmed','Deaths','Recovered','Active')
data1.head()


Unnamed: 0,Date,Confirmed,Deaths,Recovered,Active
0,2020-01-22,555,17,28,510
1,2020-01-23,654,18,30,606
2,2020-01-24,941,26,36,879
3,2020-01-25,1434,42,39,1353
4,2020-01-26,2118,56,52,2010


In [83]:
# Visualization of the global trend of confirmed, death, active and recovered cases.
trace1=go.Scatter(x=data1['Date'],y=data1['Confirmed'],mode='lines',line=dict(color='blue',width=2,dash='solid'),name='Confirmed')
trace2=go.Scatter(x=data1['Date'],y=data1['Deaths'],mode='lines',line=dict(color='red',width=2,dash='dash'),name='Deaths')
trace3=go.Scatter(x=data1['Date'],y=data1['Active'],mode='lines',line=dict(color='green',width=2,dash='dashdot'),name='Active')
trace4=go.Scatter(x=data1['Date'],y=data1['Recovered'],mode='lines',line=dict(color='purple',width=2,dash='dot'),name='Recovered')
layouts=go.Layout(xaxis=dict(title='Dates'),
                  yaxis=dict(title='No. of cases'),
                  title='Global trend of the cases')
fig=go.Figure(data=[trace1,trace2,trace3,trace4],layout=layouts)
fig.show()

### So, all types of cases are increasing day by day. Confirmed cases rise the fastest, while deaths stay very low in comparison. By the end of the period, recovered cases outnumber both active cases and deaths.

In [84]:
#now let's visualize this trend WHO-region-wise
emd=w_data_R('Eastern Mediterranean')
data2=emd.groupby('Date')[['Confirmed','Deaths','Recovered','Active']].sum().reset_index()
data2.head()



Unnamed: 0,Date,Confirmed,Deaths,Recovered,Active
0,2020-01-22,0,0,0,0
1,2020-01-23,0,0,0,0
2,2020-01-24,0,0,0,0
3,2020-01-25,0,0,0,0
4,2020-01-26,0,0,0,0


In [85]:
cvd['WHO Region'].unique()

array(['Eastern Mediterranean', 'Europe', 'Africa', 'Americas',
       'Western Pacific', 'South-East Asia'], dtype=object)
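
Since every region gets the same date-wise aggregation, a single `groupby` on both `WHO Region` and `Date` can produce all of the regional frames at once. A minimal sketch on made-up rows (`toy`, `by_region` and `region_frames` are illustrative names):

```python
import pandas as pd

# Made-up rows: two European entries share a date to show the summation
toy = pd.DataFrame({
    'WHO Region': ['Europe', 'Europe', 'Europe', 'Africa', 'Africa'],
    'Date': ['2020-01-22', '2020-01-22', '2020-01-23', '2020-01-22', '2020-01-23'],
    'Confirmed': [2, 1, 3, 0, 1],
    'Deaths': [0, 0, 0, 0, 0],
})

# One pass over the data: totals per region and date
by_region = toy.groupby(['WHO Region', 'Date'])[['Confirmed', 'Deaths']].sum().reset_index()

# Split the result into one date-wise frame per region
region_frames = {
    name: grp.drop(columns='WHO Region').reset_index(drop=True)
    for name, grp in by_region.groupby('WHO Region')
}

print(region_frames['Europe'])
```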

In [86]:
trace1=go.Scatter(x=data2['Date'],y=data2['Confirmed'],mode='lines',line=dict(color='blue',width=2,dash='solid'),name='Confirmed')
trace2=go.Scatter(x=data2['Date'],y=data2['Deaths'],mode='lines',line=dict(color='red',width=2,dash='dash'),name='Deaths')
trace3=go.Scatter(x=data2['Date'],y=data2['Active'],mode='lines',line=dict(color='green',width=2,dash='dashdot'),name='Active')
trace4=go.Scatter(x=data2['Date'],y=data2['Recovered'],mode='lines',line=dict(color='purple',width=2,dash='dot'),name='Recovered')
layouts=go.Layout(xaxis=dict(title='Dates'),
                  yaxis=dict(title='No. of cases'),
                  title="Eastern Mediterranean's trend of the cases")
fig=go.Figure(data=[trace1,trace2,trace3,trace4],layout=layouts)
fig.show()




Clearly, cases in this region start to rise slowly from March.

Recovered cases also exceed both active cases and deaths, and active cases decline during July.

In [87]:
region=w_data_R('Europe')
data3=region.groupby('Date')[['Confirmed','Deaths','Recovered','Active']].sum().reset_index()
data3.head()



Unnamed: 0,Date,Confirmed,Deaths,Recovered,Active
0,2020-01-22,0,0,0,0
1,2020-01-23,0,0,0,0
2,2020-01-24,2,0,0,2
3,2020-01-25,3,0,0,3
4,2020-01-26,3,0,0,3


In [88]:
trace1=go.Scatter(x=data3['Date'],y=data3['Confirmed'],mode='lines',line=dict(color='blue',width=2,dash='solid'),name='Confirmed')
trace2=go.Scatter(x=data3['Date'],y=data3['Deaths'],mode='lines',line=dict(color='red',width=2,dash='dash'),name='Deaths')
trace3=go.Scatter(x=data3['Date'],y=data3['Active'],mode='lines',line=dict(color='green',width=2,dash='dashdot'),name='Active')
trace4=go.Scatter(x=data3['Date'],y=data3['Recovered'],mode='lines',line=dict(color='purple',width=2,dash='dot'),name='Recovered')
layouts=go.Layout(xaxis=dict(title='Dates'),
                  yaxis=dict(title='No. of cases'),
                  title="Europe's trend of the cases")
fig=go.Figure(data=[trace1,trace2,trace3,trace4],layout=layouts)
fig.show()




We can see that in Europe cases grew exponentially between March and April, and every category keeps increasing gradually after that.
Also, around 20 May the recovered and active counts are about equal.
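
A crossover date like this can be read off the data rather than the chart. A minimal sketch on made-up numbers (`toy` and `first_crossover` are hypothetical; the real input would be a date-wise frame such as `data3`):

```python
import pandas as pd

# Made-up date-wise counts in which Recovered overtakes Active
toy = pd.DataFrame({
    'Date': pd.date_range('2020-05-18', periods=5),
    'Active': [900, 880, 860, 840, 820],
    'Recovered': [800, 850, 870, 900, 930],
})

def first_crossover(df, rising='Recovered', falling='Active'):
    """Return the first Date on which `rising` is at least `falling`, or None."""
    mask = df[rising] >= df[falling]
    return df.loc[mask, 'Date'].iloc[0] if mask.any() else None

cross = first_crossover(toy)
print(cross)
```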

In [89]:
region=w_data_R('Africa')
data4=region.groupby('Date')[['Confirmed','Deaths','Recovered','Active']].sum().reset_index()
data4.head()



Unnamed: 0,Date,Confirmed,Deaths,Recovered,Active
0,2020-01-22,0,0,0,0
1,2020-01-23,0,0,0,0
2,2020-01-24,0,0,0,0
3,2020-01-25,0,0,0,0
4,2020-01-26,0,0,0,0


In [90]:
trace1=go.Scatter(x=data4['Date'],y=data4['Confirmed'],mode='lines',line=dict(color='blue',width=2,dash='solid'),name='Confirmed')
trace2=go.Scatter(x=data4['Date'],y=data4['Deaths'],mode='lines',line=dict(color='red',width=2,dash='dash'),name='Deaths')
trace3=go.Scatter(x=data4['Date'],y=data4['Active'],mode='lines',line=dict(color='green',width=2,dash='dashdot'),name='Active')
trace4=go.Scatter(x=data4['Date'],y=data4['Recovered'],mode='lines',line=dict(color='purple',width=2,dash='dot'),name='Recovered')
layouts=go.Layout(xaxis=dict(title='Dates'),
                  yaxis=dict(title='No. of cases'),
                  title="Africa's trend of the cases")
fig=go.Figure(data=[trace1,trace2,trace3,trace4],layout=layouts)
fig.show()




In the African region, cases jump after May and then grow exponentially.

Around 15 June recovered and active cases are almost equal, and they track each other until 14 July; after that, active cases fluctuate up and down.

Deaths are very low compared with the other categories.
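
Whether growth is roughly exponential can be checked numerically: if it is, the logarithm of the counts is close to a straight line in time, and the slope gives a doubling time. A minimal sketch on a synthetic series; `doubling_time` is a hypothetical helper and it assumes all counts are positive.

```python
import numpy as np

def doubling_time(counts):
    """Estimate the doubling time (in days) from a least-squares line through log(counts)."""
    counts = np.asarray(counts, dtype=float)
    days = np.arange(len(counts))
    slope, _intercept = np.polyfit(days, np.log(counts), 1)
    return np.log(2) / slope

# Synthetic series that doubles exactly every 5 days
synthetic = 100 * 2 ** (np.arange(30) / 5)
estimate = doubling_time(synthetic)
print(round(estimate, 3))
```

On real data the fit would be applied to a window of dates (for example the weeks after May), since the growth rate changes over time.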


In [91]:
region=w_data_R('Americas')
data5=region.groupby('Date')[['Confirmed','Deaths','Recovered','Active']].sum().reset_index()
data5.head()



Unnamed: 0,Date,Confirmed,Deaths,Recovered,Active
0,2020-01-22,1,0,0,1
1,2020-01-23,1,0,0,1
2,2020-01-24,2,0,0,2
3,2020-01-25,2,0,0,2
4,2020-01-26,6,0,0,6


In [92]:
trace1=go.Scatter(x=data5['Date'],y=data5['Confirmed'],mode='lines',line=dict(color='blue',width=2,dash='solid'),name='Confirmed')
trace2=go.Scatter(x=data5['Date'],y=data5['Deaths'],mode='lines',line=dict(color='red',width=2,dash='dash'),name='Deaths')
trace3=go.Scatter(x=data5['Date'],y=data5['Active'],mode='lines',line=dict(color='green',width=2,dash='dashdot'),name='Active')
trace4=go.Scatter(x=data5['Date'],y=data5['Recovered'],mode='lines',line=dict(color='purple',width=2,dash='dot'),name='Recovered')
layouts=go.Layout(xaxis=dict(title='Dates'),
                  yaxis=dict(title='No. of cases'),
                  title="Americas' trend of the cases")
fig=go.Figure(data=[trace1,trace2,trace3,trace4],layout=layouts)
fig.show()




In the Americas region, cases increase after late March.
Active cases are high initially and start to fall in July;
similarly, recovered cases increase after June. Deaths also rise slowly.

**This region has close to 9 million confirmed cases, the largest share of the global total, and it has the highest numbers in every category.**

In [93]:
region=w_data_R('Western Pacific')
data6=region.groupby('Date')[['Confirmed','Deaths','Recovered','Active']].sum().reset_index()
data6.head()



Unnamed: 0,Date,Confirmed,Deaths,Recovered,Active
0,2020-01-22,552,17,28,507
1,2020-01-23,650,18,30,602
2,2020-01-24,932,26,36,870
3,2020-01-25,1421,42,39,1340
4,2020-01-26,2100,56,50,1994


In [94]:
trace1=go.Scatter(x=data6['Date'],y=data6['Confirmed'],mode='lines',line=dict(color='blue',width=2,dash='solid'),name='Confirmed')
trace2=go.Scatter(x=data6['Date'],y=data6['Deaths'],mode='lines',line=dict(color='red',width=2,dash='dash'),name='Deaths')
trace3=go.Scatter(x=data6['Date'],y=data6['Active'],mode='lines',line=dict(color='green',width=2,dash='dashdot'),name='Active')
trace4=go.Scatter(x=data6['Date'],y=data6['Recovered'],mode='lines',line=dict(color='purple',width=2,dash='dot'),name='Recovered')
layouts=go.Layout(xaxis=dict(title='Dates'),
                  yaxis=dict(title='No. of cases'),
                  title="Western Pacific's trend of the cases")
fig=go.Figure(data=[trace1,trace2,trace3,trace4],layout=layouts)
fig.show()




This region has a high number of confirmed and active cases from the beginning.
After 17 February active cases start to decrease, and from the beginning of March recovered cases increase.

In [95]:
region=w_data_R('South-East Asia')
data6=region.groupby('Date')[['Confirmed','Deaths','Recovered','Active']].sum().reset_index()
data6.head()



Unnamed: 0,Date,Confirmed,Deaths,Recovered,Active
0,2020-01-22,2,0,0,2
1,2020-01-23,3,0,0,3
2,2020-01-24,5,0,0,5
3,2020-01-25,8,0,0,8
4,2020-01-26,9,0,2,7


In [96]:
trace1=go.Scatter(x=data6['Date'],y=data6['Confirmed'],mode='lines',line=dict(color='blue',width=2,dash='solid'),name='Confirmed')
trace2=go.Scatter(x=data6['Date'],y=data6['Deaths'],mode='lines',line=dict(color='red',width=2,dash='dash'),name='Deaths')
trace3=go.Scatter(x=data6['Date'],y=data6['Active'],mode='lines',line=dict(color='green',width=2,dash='dashdot'),name='Active')
trace4=go.Scatter(x=data6['Date'],y=data6['Recovered'],mode='lines',line=dict(color='purple',width=2,dash='dot'),name='Recovered')
layouts=go.Layout(xaxis=dict(title='Dates'),
                  yaxis=dict(title='No. of cases'),
                  title="South-East Asia's trend of the cases")
fig=go.Figure(data=[trace1,trace2,trace3,trace4],layout=layouts)
fig.show()




In this region we can see that cases are increasing exponentially in every category except deaths.
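
Growth is easier to judge from daily new cases than from the running totals plotted above. A minimal sketch, assuming the counts are cumulative; it uses the first five global Confirmed values from the date-wise table earlier in this section, and the column names `New` and `New_3d_avg` are made up.

```python
import pandas as pd

# First five global Confirmed values from the date-wise table in this section
cum = pd.DataFrame({
    'Date': pd.date_range('2020-01-22', periods=5),
    'Confirmed': [555, 654, 941, 1434, 2118],
})

# Day-over-day difference of the running total; the first day has no previous value
cum['New'] = cum['Confirmed'].diff()

# Short rolling mean to smooth reporting noise (a 7-day window is common on real data)
cum['New_3d_avg'] = cum['New'].rolling(window=3, min_periods=1).mean()

print(cum)
```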

Now, let's try to visualize the data for some countries, such as the US and India.

In [97]:
df1=us_data_t_asD('Confirmed','Deaths','Recovered','Active')
df1.head()


Indexing with multiple keys (implicitly converted to a tuple of keys) will be deprecated, use a list instead.



Unnamed: 0,Date,Confirmed,Deaths,Recovered,Active
0,2020-01-22,1,0,0,1
1,2020-01-23,1,0,0,1
2,2020-01-24,2,0,0,2
3,2020-01-25,2,0,0,2
4,2020-01-26,5,0,0,5


In [98]:
trace1=go.Scatter(x=df1['Date'],y=df1['Confirmed'],mode='lines',line=dict(color='blue',width=2,dash='solid'),name='Confirmed')
trace2=go.Scatter(x=df1['Date'],y=df1['Deaths'],mode='lines',line=dict(color='red',width=2,dash='dash'),name='Deaths')
trace3=go.Scatter(x=df1['Date'],y=df1['Active'],mode='lines',line=dict(color='green',width=2,dash='dashdot'),name='Active')
trace4=go.Scatter(x=df1['Date'],y=df1['Recovered'],mode='lines',line=dict(color='purple',width=2,dash='dot'),name='Recovered')
layouts=go.Layout(xaxis=dict(title='Dates'),
                  yaxis=dict(title='No. of cases'),
                  title="US's trend of the cases")
fig=go.Figure(data=[trace1,trace2,trace3,trace4],layout=layouts)
fig.show()




Clearly, the US has the highest number of cases in every category.

In [99]:
df2=ind_data_t_asD('Confirmed','Deaths','Recovered','Active')
df2.head()


Indexing with multiple keys (implicitly converted to a tuple of keys) will be deprecated, use a list instead.



Unnamed: 0,Date,Confirmed,Deaths,Recovered,Active
0,2020-01-22,0,0,0,0
1,2020-01-23,0,0,0,0
2,2020-01-24,0,0,0,0
3,2020-01-25,0,0,0,0
4,2020-01-26,0,0,0,0


In [100]:
trace1=go.Scatter(x=df2['Date'],y=df2['Confirmed'],mode='lines',line=dict(color='blue',width=2,dash='solid'),name='Confirmed')
trace2=go.Scatter(x=df2['Date'],y=df2['Deaths'],mode='lines',line=dict(color='red',width=2,dash='dash'),name='Deaths')
trace3=go.Scatter(x=df2['Date'],y=df2['Active'],mode='lines',line=dict(color='green',width=2,dash='dashdot'),name='Active')
trace4=go.Scatter(x=df2['Date'],y=df2['Recovered'],mode='lines',line=dict(color='purple',width=2,dash='dot'),name='Recovered')
layouts=go.Layout(xaxis=dict(title='Dates'),
                  yaxis=dict(title='No. of cases'),
                  title="India's trend of the cases")
fig=go.Figure(data=[trace1,trace2,trace3,trace4],layout=layouts)
fig.show()




So, in India most of the cases in every category arrive after late May and from the beginning of June.

In [101]:
# Plot one pie chart per case type, WHO-region-wise
d1=get_data_t_asW('Confirmed','Recovered','Deaths','Active')
labels =d1['WHO Region']
col=list(d1.columns[1:])
for i in col :
  values = d1[i]

  # Create a Pie chart trace
  pie_trace = go.Pie(
    labels=labels,        # List of string labels for each pie slice
    values=values,        # List of numerical values representing the size of each slice
    hoverinfo='label+percent',  # Information displayed when hovering over the slices
    hole=0,             # Fraction of the radius cut out of the centre (0 for a full pie, larger values for a donut)
    pull=[0, 0.1, 0,0,0,0],     # Per-slice pull values; here only the second slice is pulled out
    marker=dict(colors=['red', 'green', 'blue','purple','grey','pink']),  # List of colors for each slice
    textinfo='label+percent',     # Information displayed on each slice (options: 'label', 'percent', 'value', 'text', 'label+percent+value', etc.)
    #textposition='outside', # Position of text ('inside', 'outside', 'auto', 'none')
    #title='My Pie Chart',  # Title of the pie chart
    showlegend=True       # Whether to display the legend
  )

  # Create a layout for the pie chart
  layout = go.Layout(
    title=f'Pie chart of {i} cases WHO region wise',
    legend=dict(orientation='h', x=0, y=1.1),  # Legend position and orientation
  )

  # Create a figure with the Pie trace and layout
  fig = go.Figure(data=[pie_trace], layout=layout)

  # Display the figure
  fig.show()



Indexing with multiple keys (implicitly converted to a tuple of keys) will be deprecated, use a list instead.



- In every category the Americas region is at the top, with 48.6%, 40.4%, 44.6% and 57% of confirmed, recovered, death and active cases respectively.

- Europe is second in every category, with 30%, 31.7%, 44.4% and 26.8% respectively.

- African nations have the lowest shares of confirmed, recovered and death cases, with 2.63%, 2.88% and 2.15% respectively.

- Western Pacific nations have the lowest share of active cases, with 1.66%.
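
The percentages quoted here are read from the pie charts. Computing them as a table makes them easier to check. A minimal sketch on made-up totals; `region_shares` is a hypothetical helper, and the real input would be a frame shaped like `d1` (one row per WHO region).

```python
import pandas as pd

# Made-up regional totals with the same layout as d1
toy = pd.DataFrame({
    'WHO Region': ['Americas', 'Europe', 'Africa'],
    'Confirmed': [50, 30, 20],
    'Active': [60, 30, 10],
})

def region_shares(df, cols):
    """Return each region's percentage share of every column total."""
    totals = df.set_index('WHO Region')[list(cols)]
    return (100 * totals / totals.sum()).round(2)

shares = region_shares(toy, ['Confirmed', 'Active'])
print(shares)
```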

In [102]:
d2=get_data_t_asC('Confirmed','Deaths','Recovered','Active')


Indexing with multiple keys (implicitly converted to a tuple of keys) will be deprecated, use a list instead.



In [103]:
def plot_treemap(cases) :
  if cases not in ['Confirmed','Active'] :
    print("Enter either 'Confirmed' or 'Active' only")
  else :
    import plotly.express as px
    # Create a color scale for the treemap
    color_scale = px.colors.qualitative.Plotly

    # Create the treemap
    fig = px.treemap(
      d2,
      values=cases,
      path=['Country'],
      color=cases,
      color_continuous_scale=color_scale
    )

    # Customize the layout
    fig.update_layout(
      title=f'Treemap of {cases} by Country',
      margin=dict(l=0, r=0, b=0, t=40),
    )

    # Customize the tile borders, hover text and labels
    fig.update_traces(
      marker_line_width=1.5,
      hovertemplate='<b>%{label}</b><br>%{value}<extra></extra>',
      textinfo='label+value+percent parent'
    )

    # Show the plot
    return fig.show()

In [104]:
plot_treemap('Recovered')

Enter either 'Confirmed' or 'Active' only


In [105]:
plot_treemap('Confirmed')

In [106]:
plot_treemap('Active')

In [107]:
get_data_t_asC('Confirmed','Deaths','Recovered','Active').nlargest(10,'Deaths')


Indexing with multiple keys (implicitly converted to a tuple of keys) will be deprecated, use a list instead.



Unnamed: 0,Country,Confirmed,Deaths,Recovered,Active
173,US,224345948,11011411,56353416,156981121
177,United Kingdom,26748587,3997775,126217,22624595
23,Brazil,89524967,3938034,54492873,31094060
85,Italy,26745145,3707717,15673910,7363518
61,France,21210926,3048524,7182115,10980287
157,Spain,27404045,3033030,15093583,9277432
111,Mexico,14946202,1728277,11141225,2076700
79,India,40883464,1111831,23783720,15987913
81,Iran,19339267,1024136,15200895,3114236
16,Belgium,6281116,963679,1627492,3689945


In [108]:
def plot_barr(cases) :
  d3=get_data_t_asC('Confirmed','Deaths','Recovered','Active').nlargest(10,cases) # top 10 countries for the given case type
  x_data=d3[cases]
  y_data=d3['Country']
  colors=['royalblue', 'cornflowerblue', 'deepskyblue','limegreen', 'forestgreen', 'seagreen','tomato', 'firebrick', 'darkred','sienna']
  # Create a Bar chart trace
  bar_trace = go.Bar(
    x=x_data,            # Bar lengths (case counts), since the bars are horizontal
    y=y_data,            # Category labels (countries)
    name='My Bar Chart', # Name of the bar chart (used in legends)
    marker=dict(color=colors, line=dict(color='black', width=2)),  # Marker style (color, border color, and width)
    text= d3['Country'],  # Text labels for each bar
    hoverinfo='x+y+text',  # Information displayed when hovering over the bars
    orientation='h',     # Bar orientation ('v' for vertical, 'h' for horizontal)
    width=0.8,            # Width of the bars (0 to 1, where 1 is the full category width)
    opacity=0.7,          # Opacity of the bars (0 to 1)
    base=0,               # Where the bars start on the value axis
    offset=0,             # Shift of the bars along the category axis
  )

  # Create a layout
  layout = go.Layout(title=f'Top 10 countries for {cases} cases', xaxis=dict(title='No. of cases'), yaxis=dict(title='Countries'))

  # Create a figure with the Bar trace and layout
  fig = go.Figure(data=[bar_trace], layout=layout)

  # Display the figure
  return fig.show()

In [109]:
for i in ['Confirmed','Deaths','Recovered','Active'] :
  plot_barr(i)


Indexing with multiple keys (implicitly converted to a tuple of keys) will be deprecated, use a list instead.




Indexing with multiple keys (implicitly converted to a tuple of keys) will be deprecated, use a list instead.




Indexing with multiple keys (implicitly converted to a tuple of keys) will be deprecated, use a list instead.




Indexing with multiple keys (implicitly converted to a tuple of keys) will be deprecated, use a list instead.



# **Model Building**

In [110]:
cvd.head()

Unnamed: 0,State,Country,Lat,Long,Date,Confirmed,Deaths,Recovered,Active,WHO Region
0,,Afghanistan,33.93911,67.709953,2020-01-22,0,0,0,0,Eastern Mediterranean
1,,Albania,41.1533,20.1683,2020-01-22,0,0,0,0,Europe
2,,Algeria,28.0339,1.6596,2020-01-22,0,0,0,0,Africa
3,,Andorra,42.5063,1.5218,2020-01-22,0,0,0,0,Europe
4,,Angola,-11.2027,17.8739,2020-01-22,0,0,0,0,Africa


In [111]:
#first convert the Date column to datetime type
cvd['Date']=pd.to_datetime(cvd['Date'])

In [112]:
cvd.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 49068 entries, 0 to 49067
Data columns (total 10 columns):
 #   Column      Non-Null Count  Dtype         
---  ------      --------------  -----         
 0   State       14664 non-null  object        
 1   Country     49068 non-null  object        
 2   Lat         49068 non-null  float64       
 3   Long        49068 non-null  float64       
 4   Date        49068 non-null  datetime64[ns]
 5   Confirmed   49068 non-null  int64         
 6   Deaths      49068 non-null  int64         
 7   Recovered   49068 non-null  int64         
 8   Active      49068 non-null  int64         
 9   WHO Region  49068 non-null  object        
dtypes: datetime64[ns](1), float64(2), int64(4), object(3)
memory usage: 3.7+ MB


In [113]:
confirmed1=get_data_t_asD('Confirmed')
confirmed1

Unnamed: 0,Date,Confirmed
0,2020-01-22,555
1,2020-01-23,654
2,2020-01-24,941
3,2020-01-25,1434
4,2020-01-26,2118
...,...,...
183,2020-07-23,15510481
184,2020-07-24,15791645
185,2020-07-25,16047190
186,2020-07-26,16251796


In [114]:
deaths1=get_data_t_asD('Deaths')
deaths1

Unnamed: 0,Date,Deaths
0,2020-01-22,17
1,2020-01-23,18
2,2020-01-24,26
3,2020-01-25,42
4,2020-01-26,56
...,...,...
183,2020-07-23,633506
184,2020-07-24,639650
185,2020-07-25,644517
186,2020-07-26,648621


In [115]:
recovered1=get_data_t_asD('Recovered')
recovered1

Unnamed: 0,Date,Recovered
0,2020-01-22,28
1,2020-01-23,30
2,2020-01-24,36
3,2020-01-25,39
4,2020-01-26,52
...,...,...
183,2020-07-23,8710969
184,2020-07-24,8939705
185,2020-07-25,9158743
186,2020-07-26,9293464


In [116]:
active1=get_data_t_asD('Active')
active1

Unnamed: 0,Date,Active
0,2020-01-22,510
1,2020-01-23,606
2,2020-01-24,879
3,2020-01-25,1353
4,2020-01-26,2010
...,...,...
183,2020-07-23,6166006
184,2020-07-24,6212290
185,2020-07-25,6243930
186,2020-07-26,6309711


## First, forecast confirmed cases for the global data using the Prophet time series model

In [117]:
from prophet import Prophet # importing a model
m=Prophet() # assigning a model

In [118]:
#before fitting our data we need to change the column names
confirmed1.columns=['ds','y']
confirmed1

Unnamed: 0,ds,y
0,2020-01-22,555
1,2020-01-23,654
2,2020-01-24,941
3,2020-01-25,1434
4,2020-01-26,2118
...,...,...
183,2020-07-23,15510481
184,2020-07-24,15791645
185,2020-07-25,16047190
186,2020-07-26,16251796
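
Prophet expects its input frame to carry the columns `ds` (dates) and `y` (values). Assigning `columns=[...]` renames in place and by position, so a rename by name on a copy is a slightly safer variant. A minimal sketch; `to_prophet_frame` is a hypothetical helper and the three rows are the first global Confirmed values shown above.

```python
import pandas as pd

confirmed = pd.DataFrame({
    'Date': ['2020-01-22', '2020-01-23', '2020-01-24'],
    'Confirmed': [555, 654, 941],
})

def to_prophet_frame(df, date_col, value_col):
    """Return a new frame with the ds/y column names, leaving the input untouched."""
    out = df[[date_col, value_col]].rename(columns={date_col: 'ds', value_col: 'y'})
    out['ds'] = pd.to_datetime(out['ds'])  # make sure ds holds real datetimes
    return out

prophet_df = to_prophet_frame(confirmed, 'Date', 'Confirmed')
print(prophet_df.dtypes)
```

The same helper would serve the deaths, recovered and active series without touching the original frames.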


In [119]:
#fitting the data to the model
m.fit(confirmed1)

INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
INFO:prophet:Disabling daily seasonality. Run prophet with daily_seasonality=True to override this.
18:07:53 - cmdstanpy - INFO - Chain [1] start processing
INFO:cmdstanpy:Chain [1] start processing
18:07:54 - cmdstanpy - INFO - Chain [1] done processing
INFO:cmdstanpy:Chain [1] done processing


<prophet.forecaster.Prophet at 0x7a4a6d2c5d20>
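
The fit prints a fair amount of log output. One way that may quiet it is to raise the level of the two loggers named in the output above, using only the standard `logging` module before calling `m.fit(...)`. This is a sketch; how much it silences depends on how logging is configured in the environment.

```python
import logging

# Raise the threshold for the two loggers that appear in the fit output
for name in ('prophet', 'cmdstanpy'):
    logging.getLogger(name).setLevel(logging.WARNING)
```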

In [120]:
# creating future dataframe for 10 days
future_confirmed1=m.make_future_dataframe(periods=10)
future_confirmed1

Unnamed: 0,ds
0,2020-01-22
1,2020-01-23
2,2020-01-24
3,2020-01-25
4,2020-01-26
...,...
193,2020-08-02
194,2020-08-03
195,2020-08-04
196,2020-08-05
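
With its default settings, `make_future_dataframe(periods=10)` returns the historical dates followed by ten further daily dates. A rough pandas-only equivalent is sketched below for illustration; `extend_dates` is a hypothetical helper, not Prophet's implementation, and the history range is taken from the dates shown in this notebook.

```python
import pandas as pd

# Daily history covering the dates in this dataset
history = pd.DataFrame({'ds': pd.date_range('2020-01-22', '2020-07-27', freq='D')})

def extend_dates(df, periods, freq='D'):
    """Append `periods` future dates after the last ds value."""
    last = df['ds'].max()
    # date_range includes `last` itself, so ask for one extra and drop it
    future = pd.date_range(start=last, periods=periods + 1, freq=freq)[1:]
    return pd.concat([df[['ds']], pd.DataFrame({'ds': future})], ignore_index=True)

future_df = extend_dates(history, periods=10)
print(len(history), len(future_df), future_df['ds'].iloc[-1].date())
```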


In [121]:
# now forecasting for these 10 days
forecast_confirmed1=m.predict(future_confirmed1)

In [122]:
forecast_confirmed1 # get a whole dataframe

Unnamed: 0,ds,trend,yhat_lower,yhat_upper,trend_lower,trend_upper,additive_terms,additive_terms_lower,additive_terms_upper,weekly,weekly_lower,weekly_upper,multiplicative_terms,multiplicative_terms_lower,multiplicative_terms_upper,yhat
0,2020-01-22,-9.613281e+03,-1.217809e+05,8.084907e+04,-9.613281e+03,-9.613281e+03,-11063.561776,-11063.561776,-11063.561776,-11063.561776,-11063.561776,-11063.561776,0.0,0.0,0.0,-2.067684e+04
1,2020-01-23,-6.933404e+03,-1.087865e+05,9.749554e+04,-6.933404e+03,-6.933404e+03,-1117.543336,-1117.543336,-1117.543336,-1117.543336,-1117.543336,-1117.543336,0.0,0.0,0.0,-8.050948e+03
2,2020-01-24,-4.253528e+03,-8.345971e+04,1.196942e+05,-4.253528e+03,-4.253528e+03,10080.982351,10080.982351,10080.982351,10080.982351,10080.982351,10080.982351,0.0,0.0,0.0,5.827455e+03
3,2020-01-25,-1.573651e+03,-9.327229e+04,1.191487e+05,-1.573651e+03,-1.573651e+03,13750.330594,13750.330594,13750.330594,13750.330594,13750.330594,13750.330594,0.0,0.0,0.0,1.217668e+04
4,2020-01-26,1.106226e+03,-9.185528e+04,1.199533e+05,1.106226e+03,1.106226e+03,7298.794381,7298.794381,7298.794381,7298.794381,7298.794381,7298.794381,0.0,0.0,0.0,8.405020e+03
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
193,2020-08-02,1.735700e+07,1.725547e+07,1.748066e+07,1.732480e+07,1.738420e+07,7298.794381,7298.794381,7298.794381,7298.794381,7298.794381,7298.794381,0.0,0.0,0.0,1.736430e+07
194,2020-08-03,1.756099e+07,1.743890e+07,1.767974e+07,1.751862e+07,1.759737e+07,-2102.756726,-2102.756726,-2102.756726,-2102.756726,-2102.756726,-2102.756726,0.0,0.0,0.0,1.755889e+07
195,2020-08-04,1.776498e+07,1.762330e+07,1.787336e+07,1.771103e+07,1.781243e+07,-16846.245488,-16846.245488,-16846.245488,-16846.245488,-16846.245488,-16846.245488,0.0,0.0,0.0,1.774813e+07
196,2020-08-05,1.796897e+07,1.783267e+07,1.808223e+07,1.790198e+07,1.802818e+07,-11063.561776,-11063.561776,-11063.561776,-11063.561776,-11063.561776,-11063.561776,0.0,0.0,0.0,1.795791e+07


In [123]:
# the columns that matter most: the date, the prediction and its lower and upper bounds
forecast_confirmed1[['ds','yhat','yhat_lower','yhat_upper']]

Unnamed: 0,ds,yhat,yhat_lower,yhat_upper
0,2020-01-22,-2.067684e+04,-1.217809e+05,8.084907e+04
1,2020-01-23,-8.050948e+03,-1.087865e+05,9.749554e+04
2,2020-01-24,5.827455e+03,-8.345971e+04,1.196942e+05
3,2020-01-25,1.217668e+04,-9.327229e+04,1.191487e+05
4,2020-01-26,8.405020e+03,-9.185528e+04,1.199533e+05
...,...,...,...,...
193,2020-08-02,1.736430e+07,1.725547e+07,1.748066e+07
194,2020-08-03,1.755889e+07,1.743890e+07,1.767974e+07
195,2020-08-04,1.774813e+07,1.762330e+07,1.787336e+07
196,2020-08-05,1.795791e+07,1.783267e+07,1.808223e+07
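
The forecast frame mixes in-sample fitted values with the ten genuinely predicted days. A minimal sketch of isolating the prediction horizon with pandas follows. The two small frames are toy stand-ins for `confirmed1` and `forecast_confirmed1`, and their `yhat` values are placeholders, not real results.

```python
import pandas as pd

# Toy stand-ins for the history and forecast frames (placeholder values only).
history = pd.DataFrame({"ds": pd.date_range("2020-07-20", "2020-07-26", freq="D")})
forecast = pd.DataFrame({"ds": pd.date_range("2020-07-20", "2020-08-05", freq="D")})
forecast["yhat"] = range(len(forecast))

# Keep only the rows dated after the last observed day.
horizon = forecast[forecast["ds"] > history["ds"].max()]
print(horizon[["ds", "yhat"]])
```

Applied to the notebook's frames, the same filter would presumably keep the rows from 2020-07-27 to 2020-08-05.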


In [124]:
# Plot the original and forecasted confirmed time series using Plotly
fig = go.Figure()

# Plot the original time series
fig.add_trace(go.Scatter(x=confirmed1['ds'], y=confirmed1['y'], mode='lines', name='Original', line=dict(color='blue')))

# Plot the forecasted values
fig.add_trace(go.Scatter(x=forecast_confirmed1['ds'], y=forecast_confirmed1['yhat'], mode='lines', name='Forecast', line=dict(color='red')))

# Add the uncertainty band: draw the upper bound as an invisible line, then fill from the lower bound up to it
fig.add_trace(go.Scatter(x=forecast_confirmed1['ds'], y=forecast_confirmed1['yhat_upper'], mode='lines', line=dict(width=0), name='Upper Bound', showlegend=False))
fig.add_trace(go.Scatter(x=forecast_confirmed1['ds'], y=forecast_confirmed1['yhat_lower'], mode='lines', line=dict(width=0), fill='tonexty', fillcolor='rgba(255,0,0,0.2)', name='Uncertainty interval'))

# Customize layout (the y values are raw counts, not millions)
fig.update_layout(title='Global 10-day forecast of confirmed cases',
                  xaxis_title='Date',
                  yaxis_title='Confirmed cases')

# Display the figure
fig.show()
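
Because `fill='tonexty'` fills toward the trace added immediately before it, trace order decides which region is shaded. The sketch below builds the four traces as plain dictionaries, in an order that shades only the band between the lower and upper bounds. `forecast_traces` is a hypothetical helper and the sample values are made up.

```python
def forecast_traces(history_x, history_y, ds, yhat, yhat_lower, yhat_upper):
    """Build Plotly-style scatter traces (as plain dicts) for a forecast plot."""
    return [
        {"type": "scatter", "x": history_x, "y": history_y, "mode": "lines",
         "name": "Original", "line": {"color": "blue"}},
        {"type": "scatter", "x": ds, "y": yhat, "mode": "lines",
         "name": "Forecast", "line": {"color": "red"}},
        # Upper bound: invisible line, no fill; it is only the fill target.
        {"type": "scatter", "x": ds, "y": yhat_upper, "mode": "lines",
         "line": {"width": 0}, "name": "Upper Bound", "showlegend": False},
        # Lower bound: 'tonexty' fills up to the trace just before it (the upper bound).
        {"type": "scatter", "x": ds, "y": yhat_lower, "mode": "lines",
         "line": {"width": 0}, "fill": "tonexty",
         "fillcolor": "rgba(255,0,0,0.2)", "name": "Uncertainty interval"},
    ]


# Made-up sample values, only to show the call shape.
traces = forecast_traces(
    history_x=["2020-07-25", "2020-07-26"], history_y=[10, 12],
    ds=["2020-07-25", "2020-07-26", "2020-07-27"],
    yhat=[10.5, 11.8, 13.0], yhat_lower=[9.0, 10.0, 11.0], yhat_upper=[12.0, 13.5, 15.0],
)
print([t["name"] for t in traces])
```

Plotly accepts such dictionaries directly, for example `go.Figure(data=traces)`, so one helper like this could serve all four series.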


### Now, similarly, we will forecast Deaths, Recovered, and Active cases

In [126]:
# before fitting, rename the columns to 'ds' and 'y', the names Prophet expects
deaths1.columns=['ds','y']
deaths1

Unnamed: 0,ds,y
0,2020-01-22,17
1,2020-01-23,18
2,2020-01-24,26
3,2020-01-25,42
4,2020-01-26,56
...,...,...
183,2020-07-23,633506
184,2020-07-24,639650
185,2020-07-25,644517
186,2020-07-26,648621


In [129]:
# fit the deaths data to a new Prophet model
m1=Prophet()
m1.fit(deaths1)

INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
INFO:prophet:Disabling daily seasonality. Run prophet with daily_seasonality=True to override this.
DEBUG:cmdstanpy:input tempfile: /tmp/tmp6tw36ddm/hxjjjbd0.json
DEBUG:cmdstanpy:input tempfile: /tmp/tmp6tw36ddm/lg7u2pds.json
DEBUG:cmdstanpy:idx 0
DEBUG:cmdstanpy:running CmdStan, num_threads: None
DEBUG:cmdstanpy:CmdStan args: ['/usr/local/lib/python3.10/dist-packages/prophet/stan_model/prophet_model.bin', 'random', 'seed=25555', 'data', 'file=/tmp/tmp6tw36ddm/hxjjjbd0.json', 'init=/tmp/tmp6tw36ddm/lg7u2pds.json', 'output', 'file=/tmp/tmp6tw36ddm/prophet_model5ydihuhm/prophet_model-20240101181358.csv', 'method=optimize', 'algorithm=lbfgs', 'iter=10000']
18:13:58 - cmdstanpy - INFO - Chain [1] start processing
INFO:cmdstanpy:Chain [1] start processing
18:13:58 - cmdstanpy - INFO - Chain [1] done processing
INFO:cmdstanpy:Chain [1] done processing


<prophet.forecaster.Prophet at 0x7a4a6d171cc0>
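
Most of the output above is DEBUG and INFO logging rather than results. A minimal sketch of quieting it with the standard `logging` module is shown below. The logger names are taken from the prefixes in the output (`prophet`, `cmdstanpy`). Running this after importing Prophet is an assumption about ordering, since a library may set its own log level at import time, and whether it removes every line depends on how each library configures its handlers.

```python
import logging

# Logger names come from the "INFO:prophet:" and "DEBUG:cmdstanpy:" prefixes above.
for name in ("prophet", "cmdstanpy"):
    logging.getLogger(name).setLevel(logging.WARNING)

# Messages below WARNING are now dropped by these two loggers.
print(logging.getLogger("cmdstanpy").isEnabledFor(logging.INFO))
```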

In [131]:
# create a future dataframe: the historical dates plus the next 10 days
future_deaths1=m1.make_future_dataframe(periods=10)
future_deaths1

Unnamed: 0,ds
0,2020-01-22
1,2020-01-23
2,2020-01-24
3,2020-01-25
4,2020-01-26
...,...
193,2020-08-02
194,2020-08-03
195,2020-08-04
196,2020-08-05


In [138]:
# now forecast over the historical dates and the next 10 days
forecast_deaths1=m1.predict(future_deaths1)

In [139]:
forecast_deaths1  # display the full forecast dataframe

Unnamed: 0,ds,trend,yhat_lower,yhat_upper,trend_lower,trend_upper,additive_terms,additive_terms_lower,additive_terms_upper,weekly,weekly_lower,weekly_upper,multiplicative_terms,multiplicative_terms_lower,multiplicative_terms_upper,yhat
0,2020-01-22,-522.113015,-2730.250056,1205.615816,-522.113015,-522.113015,-191.187036,-191.187036,-191.187036,-191.187036,-191.187036,-191.187036,0.0,0.0,0.0,-713.300051
1,2020-01-23,-432.419147,-2009.073420,1840.699713,-432.419147,-432.419147,388.672090,388.672090,388.672090,388.672090,388.672090,388.672090,0.0,0.0,0.0,-43.747057
2,2020-01-24,-342.725280,-1345.965927,2536.171135,-342.725280,-342.725280,874.263328,874.263328,874.263328,874.263328,874.263328,874.263328,0.0,0.0,0.0,531.538048
3,2020-01-25,-253.031412,-1509.578809,2546.090676,-253.031412,-253.031412,726.888550,726.888550,726.888550,726.888550,726.888550,726.888550,0.0,0.0,0.0,473.857138
4,2020-01-26,-163.337545,-2155.750525,1619.921721,-163.337545,-163.337545,-165.600810,-165.600810,-165.600810,-165.600810,-165.600810,-165.600810,0.0,0.0,0.0,-328.938355
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
193,2020-08-02,678190.945830,675389.674868,680755.533902,676751.608410,679650.238695,-165.600810,-165.600810,-165.600810,-165.600810,-165.600810,-165.600810,0.0,0.0,0.0,678025.345020
194,2020-08-03,683220.353253,679542.909883,685299.954190,681391.718031,685333.268949,-904.693854,-904.693854,-904.693854,-904.693854,-904.693854,-904.693854,0.0,0.0,0.0,682315.659400
195,2020-08-04,688249.760677,684354.058081,690941.337281,686011.116276,691034.083115,-728.342268,-728.342268,-728.342268,-728.342268,-728.342268,-728.342268,0.0,0.0,0.0,687521.418409
196,2020-08-05,693279.168100,689573.979763,696833.223797,690450.396630,696691.279540,-191.187036,-191.187036,-191.187036,-191.187036,-191.187036,-191.187036,0.0,0.0,0.0,693087.981064
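
The columns of this table are related: Prophet combines its components as `trend * (1 + multiplicative_terms) + additive_terms`. Here the only additive term is the weekly seasonality and the multiplicative terms are zero. The check below uses the last row of the table above (2020-08-05).

```python
# Last row (2020-08-05) of the deaths forecast, copied from the table above.
trend = 693279.168100
additive_terms = -191.187036  # equal to the weekly component in this model
multiplicative_terms = 0.0
reported_yhat = 693087.981064

# Recombine the components the way Prophet does.
yhat = trend * (1 + multiplicative_terms) + additive_terms
print(round(yhat, 6), abs(yhat - reported_yhat) < 1e-6)
```

The recombined value agrees with the `yhat` column, which confirms that the forecast here is simply the trend shifted by the day-of-week effect.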


In [140]:
# the columns that matter most: the date, the prediction and its lower and upper bounds
forecast_deaths1[['ds','yhat','yhat_lower','yhat_upper']]

Unnamed: 0,ds,yhat,yhat_lower,yhat_upper
0,2020-01-22,-713.300051,-2730.250056,1205.615816
1,2020-01-23,-43.747057,-2009.073420,1840.699713
2,2020-01-24,531.538048,-1345.965927,2536.171135
3,2020-01-25,473.857138,-1509.578809,2546.090676
4,2020-01-26,-328.938355,-2155.750525,1619.921721
...,...,...,...,...
193,2020-08-02,678025.345020,675389.674868,680755.533902
194,2020-08-03,682315.659400,679542.909883,685299.954190
195,2020-08-04,687521.418409,684354.058081,690941.337281
196,2020-08-05,693087.981064,689573.979763,696833.223797
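
The earliest rows of this forecast are negative (for example, about -713 on 2020-01-22), which is impossible for a cumulative count. One common post-processing step, which this notebook does not apply, is to floor the prediction and its bounds at zero. The sketch below does so on three rows copied from the table. `floor_at_zero` is a hypothetical helper, and with pandas the equivalent would be `DataFrame.clip(lower=0)` on the three columns.

```python
# First three rows of the deaths forecast, copied from the table above.
rows = [
    {"ds": "2020-01-22", "yhat": -713.300051, "yhat_lower": -2730.250056, "yhat_upper": 1205.615816},
    {"ds": "2020-01-23", "yhat": -43.747057, "yhat_lower": -2009.073420, "yhat_upper": 1840.699713},
    {"ds": "2020-01-24", "yhat": 531.538048, "yhat_lower": -1345.965927, "yhat_upper": 2536.171135},
]


def floor_at_zero(row, columns=("yhat", "yhat_lower", "yhat_upper")):
    """Return a copy of the row with the given columns floored at zero."""
    clipped = dict(row)
    for col in columns:
        clipped[col] = max(0.0, row[col])
    return clipped


clipped_rows = [floor_at_zero(r) for r in rows]
for r in clipped_rows:
    print(r)
```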


In [141]:
# Plot the original and forecasted deaths time series using Plotly
fig = go.Figure()

# Plot the original time series
fig.add_trace(go.Scatter(x=deaths1['ds'], y=deaths1['y'], mode='lines', name='Original', line=dict(color='blue')))

# Plot the forecasted values
fig.add_trace(go.Scatter(x=forecast_deaths1['ds'], y=forecast_deaths1['yhat'], mode='lines', name='Forecast', line=dict(color='red')))

# Add the uncertainty band: draw the upper bound as an invisible line, then fill from the lower bound up to it
fig.add_trace(go.Scatter(x=forecast_deaths1['ds'], y=forecast_deaths1['yhat_upper'], mode='lines', line=dict(width=0), name='Upper Bound', showlegend=False))
fig.add_trace(go.Scatter(x=forecast_deaths1['ds'], y=forecast_deaths1['yhat_lower'], mode='lines', line=dict(width=0), fill='tonexty', fillcolor='rgba(255,0,0,0.2)', name='Uncertainty interval'))

# Customize layout (the y values are raw counts, not millions)
fig.update_layout(title='Global 10-day forecast of deaths',
                  xaxis_title='Date',
                  yaxis_title='Deaths')

# Display the figure
fig.show()


In [142]:
# before fitting, rename the columns to 'ds' and 'y', the names Prophet expects
recovered1.columns=['ds','y']
recovered1

Unnamed: 0,ds,y
0,2020-01-22,28
1,2020-01-23,30
2,2020-01-24,36
3,2020-01-25,39
4,2020-01-26,52
...,...,...
183,2020-07-23,8710969
184,2020-07-24,8939705
185,2020-07-25,9158743
186,2020-07-26,9293464


In [143]:
# fit the recovered data to a new Prophet model
m2=Prophet()
m2.fit(recovered1)

INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
INFO:prophet:Disabling daily seasonality. Run prophet with daily_seasonality=True to override this.
DEBUG:cmdstanpy:input tempfile: /tmp/tmp6tw36ddm/_x_dx30v.json
DEBUG:cmdstanpy:input tempfile: /tmp/tmp6tw36ddm/gbpf6jp9.json
DEBUG:cmdstanpy:idx 0
DEBUG:cmdstanpy:running CmdStan, num_threads: None
DEBUG:cmdstanpy:CmdStan args: ['/usr/local/lib/python3.10/dist-packages/prophet/stan_model/prophet_model.bin', 'random', 'seed=4555', 'data', 'file=/tmp/tmp6tw36ddm/_x_dx30v.json', 'init=/tmp/tmp6tw36ddm/gbpf6jp9.json', 'output', 'file=/tmp/tmp6tw36ddm/prophet_model3k0wtu_i/prophet_model-20240101182418.csv', 'method=optimize', 'algorithm=lbfgs', 'iter=10000']
18:24:18 - cmdstanpy - INFO - Chain [1] start processing
INFO:cmdstanpy:Chain [1] start processing
18:24:18 - cmdstanpy - INFO - Chain [1] done processing
INFO:cmdstanpy:Chain [1] done processing


<prophet.forecaster.Prophet at 0x7a4a70348670>

In [144]:
# create a future dataframe: the historical dates plus the next 10 days
future_recovered1=m2.make_future_dataframe(periods=10)
future_recovered1

Unnamed: 0,ds
0,2020-01-22
1,2020-01-23
2,2020-01-24
3,2020-01-25
4,2020-01-26
...,...
193,2020-08-02
194,2020-08-03
195,2020-08-04
196,2020-08-05


In [145]:
# now forecast over the historical dates and the next 10 days
forecast_recovered1=m2.predict(future_recovered1)

In [146]:
forecast_recovered1  # display the full forecast dataframe

Unnamed: 0,ds,trend,yhat_lower,yhat_upper,trend_lower,trend_upper,additive_terms,additive_terms_lower,additive_terms_upper,weekly,weekly_lower,weekly_upper,multiplicative_terms,multiplicative_terms_lower,multiplicative_terms_upper,yhat
0,2020-01-22,-1.360148e+04,-9.714853e+04,5.695098e+04,-1.360148e+04,-1.360148e+04,-4840.630407,-4840.630407,-4840.630407,-4840.630407,-4840.630407,-4840.630407,0.0,0.0,0.0,-1.844211e+04
1,2020-01-23,-1.243665e+04,-8.308243e+04,6.797224e+04,-1.243665e+04,-1.243665e+04,1254.800836,1254.800836,1254.800836,1254.800836,1254.800836,1254.800836,0.0,0.0,0.0,-1.118185e+04
2,2020-01-24,-1.127183e+04,-7.798842e+04,7.396565e+04,-1.127183e+04,-1.127183e+04,6096.150842,6096.150842,6096.150842,6096.150842,6096.150842,6096.150842,0.0,0.0,0.0,-5.175679e+03
3,2020-01-25,-1.010701e+04,-8.616918e+04,8.106928e+04,-1.010701e+04,-1.010701e+04,9699.439346,9699.439346,9699.439346,9699.439346,9699.439346,9699.439346,0.0,0.0,0.0,-4.075673e+02
4,2020-01-26,-8.942184e+03,-8.321980e+04,6.589943e+04,-8.942184e+03,-8.942184e+03,-883.483111,-883.483111,-883.483111,-883.483111,-883.483111,-883.483111,0.0,0.0,0.0,-9.825667e+03
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
193,2020-08-02,1.000420e+07,9.922295e+06,1.008963e+07,9.983488e+06,1.002418e+07,-883.483111,-883.483111,-883.483111,-883.483111,-883.483111,-883.483111,0.0,0.0,0.0,1.000332e+07
194,2020-08-03,1.014076e+07,1.005442e+07,1.022670e+07,1.011274e+07,1.016667e+07,-941.314732,-941.314732,-941.314732,-941.314732,-941.314732,-941.314732,0.0,0.0,0.0,1.013982e+07
195,2020-08-04,1.027732e+07,1.018040e+07,1.036071e+07,1.024164e+07,1.031013e+07,-10384.962773,-10384.962773,-10384.962773,-10384.962773,-10384.962773,-10384.962773,0.0,0.0,0.0,1.026693e+07
196,2020-08-05,1.041388e+07,1.030827e+07,1.050259e+07,1.036931e+07,1.045383e+07,-4840.630407,-4840.630407,-4840.630407,-4840.630407,-4840.630407,-4840.630407,0.0,0.0,0.0,1.040903e+07


In [147]:
# the columns that matter most: the date, the prediction and its lower and upper bounds
forecast_recovered1[['ds','yhat','yhat_lower','yhat_upper']]

Unnamed: 0,ds,yhat,yhat_lower,yhat_upper
0,2020-01-22,-1.844211e+04,-9.714853e+04,5.695098e+04
1,2020-01-23,-1.118185e+04,-8.308243e+04,6.797224e+04
2,2020-01-24,-5.175679e+03,-7.798842e+04,7.396565e+04
3,2020-01-25,-4.075673e+02,-8.616918e+04,8.106928e+04
4,2020-01-26,-9.825667e+03,-8.321980e+04,6.589943e+04
...,...,...,...,...
193,2020-08-02,1.000332e+07,9.922295e+06,1.008963e+07
194,2020-08-03,1.013982e+07,1.005442e+07,1.022670e+07
195,2020-08-04,1.026693e+07,1.018040e+07,1.036071e+07
196,2020-08-05,1.040903e+07,1.030827e+07,1.050259e+07
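
The `yhat_lower` and `yhat_upper` columns bound Prophet's uncertainty interval. Since `Prophet()` was created with no arguments, this is the library's default 80% interval. One way to read the interval is its width relative to the prediction, sketched below on two rows copied from the table. `relative_width` is a hypothetical helper.

```python
# Rows 193 and 196 of the recovered forecast, copied from the table above:
# (ds, yhat, yhat_lower, yhat_upper)
rows = [
    ("2020-08-02", 1.000332e7, 9.922295e6, 1.008963e7),
    ("2020-08-05", 1.040903e7, 1.030827e7, 1.050259e7),
]


def relative_width(yhat, lower, upper):
    """Width of the uncertainty interval as a fraction of the prediction."""
    return (upper - lower) / yhat


relative_widths = [relative_width(yhat, lo, hi) for _, yhat, lo, hi in rows]
for (ds, _, _, _), width in zip(rows, relative_widths):
    print(ds, f"{width:.2%}")
```

On these two rows the interval is a little under 2% of the predicted value and is slightly wider on the later date.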


In [148]:
# Plot the original and forecasted recovered time series using Plotly
fig = go.Figure()

# Plot the original time series
fig.add_trace(go.Scatter(x=recovered1['ds'], y=recovered1['y'], mode='lines', name='Original', line=dict(color='blue')))

# Plot the forecasted values
fig.add_trace(go.Scatter(x=forecast_recovered1['ds'], y=forecast_recovered1['yhat'], mode='lines', name='Forecast', line=dict(color='red')))

# Add the uncertainty band: draw the upper bound as an invisible line, then fill from the lower bound up to it
fig.add_trace(go.Scatter(x=forecast_recovered1['ds'], y=forecast_recovered1['yhat_upper'], mode='lines', line=dict(width=0), name='Upper Bound', showlegend=False))
fig.add_trace(go.Scatter(x=forecast_recovered1['ds'], y=forecast_recovered1['yhat_lower'], mode='lines', line=dict(width=0), fill='tonexty', fillcolor='rgba(255,0,0,0.2)', name='Uncertainty interval'))

# Customize layout (the y values are raw counts, not millions)
fig.update_layout(title='Global 10-day forecast of recovered cases',
                  xaxis_title='Date',
                  yaxis_title='Recovered cases')

# Display the figure
fig.show()

In [149]:
# before fitting, rename the columns to 'ds' and 'y', the names Prophet expects
active1.columns=['ds','y']
active1

Unnamed: 0,ds,y
0,2020-01-22,510
1,2020-01-23,606
2,2020-01-24,879
3,2020-01-25,1353
4,2020-01-26,2010
...,...,...
183,2020-07-23,6166006
184,2020-07-24,6212290
185,2020-07-25,6243930
186,2020-07-26,6309711


In [150]:
# fit the active-cases data to a new Prophet model
m3=Prophet()
m3.fit(active1)

INFO:prophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.
INFO:prophet:Disabling daily seasonality. Run prophet with daily_seasonality=True to override this.
DEBUG:cmdstanpy:input tempfile: /tmp/tmp6tw36ddm/exj5uja2.json
DEBUG:cmdstanpy:input tempfile: /tmp/tmp6tw36ddm/eeg8k4is.json
DEBUG:cmdstanpy:idx 0
DEBUG:cmdstanpy:running CmdStan, num_threads: None
DEBUG:cmdstanpy:CmdStan args: ['/usr/local/lib/python3.10/dist-packages/prophet/stan_model/prophet_model.bin', 'random', 'seed=40434', 'data', 'file=/tmp/tmp6tw36ddm/exj5uja2.json', 'init=/tmp/tmp6tw36ddm/eeg8k4is.json', 'output', 'file=/tmp/tmp6tw36ddm/prophet_modelaw8zf_9g/prophet_model-20240101183004.csv', 'method=optimize', 'algorithm=lbfgs', 'iter=10000']
18:30:04 - cmdstanpy - INFO - Chain [1] start processing
INFO:cmdstanpy:Chain [1] start processing
18:30:04 - cmdstanpy - INFO - Chain [1] done processing
INFO:cmdstanpy:Chain [1] done processing


<prophet.forecaster.Prophet at 0x7a4a6c12ae30>

In [151]:
# create a future dataframe: the historical dates plus the next 10 days
future_active1=m3.make_future_dataframe(periods=10)
future_active1

Unnamed: 0,ds
0,2020-01-22
1,2020-01-23
2,2020-01-24
3,2020-01-25
4,2020-01-26
...,...
193,2020-08-02
194,2020-08-03
195,2020-08-04
196,2020-08-05


In [152]:
# now forecast over the historical dates and the next 10 days
forecast_active1=m3.predict(future_active1)

In [153]:
forecast_active1  # display the full forecast dataframe

Unnamed: 0,ds,trend,yhat_lower,yhat_upper,trend_lower,trend_upper,additive_terms,additive_terms_lower,additive_terms_upper,weekly,weekly_lower,weekly_upper,multiplicative_terms,multiplicative_terms_lower,multiplicative_terms_upper,yhat
0,2020-01-22,-5.051890e+02,-5.263583e+04,4.148217e+04,-5.051890e+02,-5.051890e+02,-5734.813501,-5734.813501,-5734.813501,-5734.813501,-5734.813501,-5734.813501,0.0,0.0,0.0,-6.240003e+03
1,2020-01-23,1.357199e+03,-4.448311e+04,4.422305e+04,1.357199e+03,1.357199e+03,-2543.414742,-2543.414742,-2543.414742,-2543.414742,-2543.414742,-2543.414742,0.0,0.0,0.0,-1.186216e+03
2,2020-01-24,3.219587e+03,-3.952030e+04,4.907899e+04,3.219587e+03,3.219587e+03,3230.786676,3230.786676,3230.786676,3230.786676,3230.786676,3230.786676,0.0,0.0,0.0,6.450374e+03
3,2020-01-25,5.081975e+03,-3.914585e+04,5.510779e+04,5.081975e+03,5.081975e+03,3128.003846,3128.003846,3128.003846,3128.003846,3128.003846,3128.003846,0.0,0.0,0.0,8.209979e+03
4,2020-01-26,6.944363e+03,-3.066490e+04,5.996503e+04,6.944363e+03,6.944363e+03,8151.914786,8151.914786,8151.914786,8151.914786,8151.914786,8151.914786,0.0,0.0,0.0,1.509628e+04
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
193,2020-08-02,6.670619e+06,6.629992e+06,6.727811e+06,6.653892e+06,6.682738e+06,8151.914786,8151.914786,8151.914786,8151.914786,8151.914786,8151.914786,0.0,0.0,0.0,6.678771e+06
194,2020-08-03,6.732887e+06,6.679981e+06,6.785966e+06,6.711248e+06,6.749131e+06,-365.530371,-365.530371,-365.530371,-365.530371,-365.530371,-365.530371,0.0,0.0,0.0,6.732521e+06
195,2020-08-04,6.795154e+06,6.736238e+06,6.840085e+06,6.768266e+06,6.815211e+06,-5866.946694,-5866.946694,-5866.946694,-5866.946694,-5866.946694,-5866.946694,0.0,0.0,0.0,6.789287e+06
196,2020-08-05,6.857422e+06,6.791232e+06,6.906379e+06,6.824886e+06,6.882778e+06,-5734.813501,-5734.813501,-5734.813501,-5734.813501,-5734.813501,-5734.813501,0.0,0.0,0.0,6.851687e+06
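
The `weekly` column depends only on the day of the week: rows 0 and 196 are both Wednesdays and carry the same value. The seven rows shown in the table happen to cover every weekday, so the full weekly pattern can be read off directly. The sketch below collects it and checks that the seven effects sum to roughly zero, meaning the weekly term shifts individual days without changing the overall level.

```python
from datetime import date

# Weekly component per date, copied from the active-cases forecast table above.
weekly_by_date = {
    date(2020, 1, 22): -5734.813501,
    date(2020, 1, 23): -2543.414742,
    date(2020, 1, 24): 3230.786676,
    date(2020, 1, 25): 3128.003846,
    date(2020, 1, 26): 8151.914786,
    date(2020, 8, 3): -365.530371,
    date(2020, 8, 4): -5866.946694,
}

# Map each effect to its weekday number (0 = Monday ... 6 = Sunday).
weekly_by_weekday = {d.weekday(): value for d, value in weekly_by_date.items()}
total = sum(weekly_by_weekday.values())

print(sorted(weekly_by_weekday.items()))
print("distinct weekdays:", len(weekly_by_weekday), "sum:", round(total, 3))
```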


In [154]:
# the columns that matter most: the date, the prediction and its lower and upper bounds
forecast_active1[['ds','yhat','yhat_lower','yhat_upper']]

Unnamed: 0,ds,yhat,yhat_lower,yhat_upper
0,2020-01-22,-6.240003e+03,-5.263583e+04,4.148217e+04
1,2020-01-23,-1.186216e+03,-4.448311e+04,4.422305e+04
2,2020-01-24,6.450374e+03,-3.952030e+04,4.907899e+04
3,2020-01-25,8.209979e+03,-3.914585e+04,5.510779e+04
4,2020-01-26,1.509628e+04,-3.066490e+04,5.996503e+04
...,...,...,...,...
193,2020-08-02,6.678771e+06,6.629992e+06,6.727811e+06
194,2020-08-03,6.732521e+06,6.679981e+06,6.785966e+06
195,2020-08-04,6.789287e+06,6.736238e+06,6.840085e+06
196,2020-08-05,6.851687e+06,6.791232e+06,6.906379e+06
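
The four series were modelled independently, so their forecasts need not be mutually consistent. A rough cross-check is sketched below, assuming that Active is defined as Confirmed minus Deaths minus Recovered. The notebook does not state that definition here, so it is an assumption. The inputs are the 2020-08-05 `yhat` values from the four forecast tables.

```python
# Forecasts for 2020-08-05, copied from the four `yhat` columns in this notebook.
confirmed = 1.795791e7
deaths = 693087.981064
recovered = 1.040903e7
active = 6.851687e6

# ASSUMPTION: Active = Confirmed - Deaths - Recovered.
implied_active = confirmed - deaths - recovered
gap = implied_active - active
relative_gap = gap / active

print(f"implied active: {implied_active:,.0f}")
print(f"gap vs. active forecast: {gap:,.0f} ({relative_gap:.3%})")
```

Under that assumption the two figures differ by roughly 4,100 cases, about 0.06% of the active forecast, so the separate models broadly agree on this date.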


In [155]:
# Plot the original and forecasted active time series using Plotly
fig = go.Figure()

# Plot the original time series
fig.add_trace(go.Scatter(x=active1['ds'], y=active1['y'], mode='lines', name='Original', line=dict(color='blue')))

# Plot the forecasted values
fig.add_trace(go.Scatter(x=forecast_active1['ds'], y=forecast_active1['yhat'], mode='lines', name='Forecast', line=dict(color='red')))

# Add the uncertainty band: draw the upper bound as an invisible line, then fill from the lower bound up to it
fig.add_trace(go.Scatter(x=forecast_active1['ds'], y=forecast_active1['yhat_upper'], mode='lines', line=dict(width=0), name='Upper Bound', showlegend=False))
fig.add_trace(go.Scatter(x=forecast_active1['ds'], y=forecast_active1['yhat_lower'], mode='lines', line=dict(width=0), fill='tonexty', fillcolor='rgba(255,0,0,0.2)', name='Uncertainty interval'))

# Customize layout (the y values are raw counts, not millions)
fig.update_layout(title='Global 10-day forecast of active cases',
                  xaxis_title='Date',
                  yaxis_title='Active cases')

# Display the figure
fig.show()