<img src='https://cutewallpaper.org/21/nepali-flag-photo/Nepali-Stand-Flag.gif'>

# Analyzing the status of the coronavirus in Nepal

## Web Scraping

In this notebook I use data from `The Kathmandu Post` website, https://kathmandupost.com/covid19, which publishes daily updates. I used the district-level data to study coronavirus cases in Nepal.


This notebook analyzes data from the very beginning of the COVID-19 outbreak to 23 June 2020. The data is refreshed every time the notebook is run.

In [9]:
import csv
import requests
from bs4 import BeautifulSoup


def scrape_data(url):

    response = requests.get(url, timeout=10)
    soup = BeautifulSoup(response.content, 'html.parser')

    table = soup.find_all('table')[1]

    rows = table.select('tbody>tr')

    # The column names could also be read from the header cells of the first row:
    # header = [th.text.strip() for th in rows[0].find_all('th')]
    header = ['Districts', 'Confirmed', 'Deaths', 'Recovered', 'New']

    with open('records.csv', 'w', newline='') as csv_file:
        writer = csv.writer(csv_file)
        writer.writerow(header)
        for row in rows:
            data = [td.text.strip() for td in row.find_all('td')]
            writer.writerow(data)
            

            
if __name__=="__main__":
    url = "https://kathmandupost.com/covid19"
    scrape_data(url)


Let us break this apart and see how it works line by line.

* Lines 1 - 3: We import all the packages needed to run the application.

* Line 6: We define the function `scrape_data`, which takes a `url` parameter.

* Line 8: We make a GET request to the url using the `get` method of the `requests` library.

* Line 9: We create a Beautiful Soup tree structure from the content of the server's response. This object is easy to navigate and search through.

* Line 11: We search through the `soup` object for the second table in the document, which contains the data we want, using its `find_all` method. The `find_all` method searches the tree structure for all HTML tags that match the filter.

* Line 13: This line selects all the `tr` elements in the table whose parent is a `tbody` element. `tr` elements represent the table rows.

* Lines 15 - 17: The first row usually contains the header cells, so the column names could be read from the text of the `th` elements in that row, as the commented-out code does. Here the header is hard-coded as a list instead.

* Lines 19 - 24: This opens a file and creates a new file object. The `w` mode opens the file for writing. First we write the header row, then loop through the rows and write the text of the `td` cells of each row to the file. A row that has no `td` cells, such as a header row, produces an empty line, which `pd.read_csv` skips by default when the file is loaded.

* Lines 28 - 30: We check that the module is run as the main program and call `scrape_data` with the url to scrape.
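
The writing step can be tried offline. Below is a minimal sketch that writes already-parsed rows to CSV and skips rows with no cells. The helper name `write_rows` is hypothetical, the two district rows reuse figures from the table shown later in this notebook, and the `New` values are placeholders. It writes to an in-memory buffer instead of `records.csv`.

```python
import csv
import io


def write_rows(header, rows, out):
    """Write the header, then every non-empty row, to the file-like object `out`."""
    writer = csv.writer(out)
    writer.writerow(header)
    written = 0
    for cells in rows:
        # A header row made of <th> cells has no <td> text, so it arrives as an empty list
        if not cells:
            continue
        writer.writerow(cells)
        written += 1
    return written


header = ['Districts', 'Confirmed', 'Deaths', 'Recovered', 'New']
# Hypothetical parsed rows: the first one is empty, as a header row would be
rows = [
    [],
    ['Rautahat', '1159', '0', '243', '0'],
    ['Dailekh', '748', '2', '300', '0'],
]

buffer = io.StringIO()
count = write_rows(header, rows, buffer)
print(count)
print(buffer.getvalue())
```

Filtering in the loop keeps the file free of blank lines, so it does not rely on the reader skipping them.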

### Standard Imports

In [2]:
import datetime

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots

# Show every row when a DataFrame is displayed
pd.set_option('display.max_rows', None)

In [3]:
df = pd.read_csv('records.csv')
df = df.drop('New',axis=1)
df.head()

Unnamed: 0,Districts,Confirmed,Deaths,Recovered
0,Rautahat,1159,0,243
1,Dailekh,748,2,300
2,Kapilvastu,705,0,40
3,Sarlahi,616,0,129
4,Mahottari,539,0,26


The source gives only Confirmed, Deaths and Recovered counts, so we add an Active column, computed as Confirmed minus the sum of Recovered and Deaths.

In [4]:
df['Active'] = df['Confirmed']-(df['Recovered']+df['Deaths'])
df.head()

Unnamed: 0,Districts,Confirmed,Deaths,Recovered,Active
0,Rautahat,1159,0,243,916
1,Dailekh,748,2,300,446
2,Kapilvastu,705,0,40,665
3,Sarlahi,616,0,129,487
4,Mahottari,539,0,26,513
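
As a quick check of the formula, here is a minimal sketch that recomputes the Active column for the first three districts shown above and verifies that the four columns are consistent. The DataFrame name `sample` is chosen for illustration only.

```python
import pandas as pd

# First three rows of the district table shown above
sample = pd.DataFrame({
    'Districts': ['Rautahat', 'Dailekh', 'Kapilvastu'],
    'Confirmed': [1159, 748, 705],
    'Deaths': [0, 2, 0],
    'Recovered': [243, 300, 40],
})

sample['Active'] = sample['Confirmed'] - (sample['Recovered'] + sample['Deaths'])

# Every confirmed case is counted as exactly one of active, recovered or dead
parts = sample['Active'] + sample['Recovered'] + sample['Deaths']
consistent = bool((parts == sample['Confirmed']).all())

# A negative value would point to inconsistent source data
non_negative = bool((sample['Active'] >= 0).all())

print(sample['Active'].tolist())  # [916, 446, 665]
print(consistent, non_negative)
```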


In [5]:
current_stats = {'Total Confirmed': df['Confirmed'].sum(),
                 'Total Deaths': df['Deaths'].sum(),
                 'Total Recovered': df['Recovered'].sum(),
                 'Active Cases': df['Active'].sum()}

In [6]:
corona_cases = pd.DataFrame(current_stats,index=['Corona Cases'])
corona_cases

Unnamed: 0,Total Confirmed,Total Deaths,Total Recovered,Active Cases
Corona Cases,10099,24,2220,7855


In [7]:
Data_Nepal = df[["Confirmed","Deaths","Recovered","Active"]].sum().reset_index()
Data_Nepal

Unnamed: 0,index,0
0,Confirmed,10099
1,Deaths,24
2,Recovered,2220
3,Active,7855


## Total stats in a pie chart


In [8]:
labels = ["Deaths","Recovered","Active Cases"]
values = df[["Deaths","Recovered","Active"]].sum()
fig = px.pie(df, 
             values=values, 
             names=labels,
             color_discrete_sequence=['rgb(56,255,26)','rgb(33,14,185)','rgb(45,77,77)'],
             hole=0.5)
fig.update_layout(
    title='Total cases : '+str(df["Confirmed"].sum()),
    template='plotly_dark'
)
fig.show()
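
The chart labels each slice with its share of the total. A minimal sketch of the same arithmetic in plain Python, using the national totals from the summary table above; the variable names are chosen for illustration only.

```python
# National totals taken from the summary table above
totals = {'Deaths': 24, 'Recovered': 2220, 'Active Cases': 7855}
confirmed = 10099

# The three slices should add up to the confirmed total
slices_match_total = sum(totals.values()) == confirmed

# Percentage share of each slice
shares = {name: 100 * count / confirmed for name, count in totals.items()}
for name, share in shares.items():
    print(f"{name}: {share:.2f}%")
```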

## Confirmed

> <img src = 'https://cbswire.dk/wp-content/uploads/2020/05/giphy-kopi-9.gif' height=300 width=800/>

### Ten most infected Districts

### **Bar chart**

In [10]:
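# Select the ten districts with the most confirmed cases explicitly,
# rather than relying on the row order of the scraped table.
top10 = df.nlargest(10, 'Confirmed')

fig = go.Figure(data=[go.Bar(
    x=top10['Districts'],
    y=top10['Confirmed'],
    text=top10['Confirmed'],
    textposition='auto',
    marker_color='blue',
)])
fig.update_layout(
    title='Ten most infected Districts of Nepal',
    xaxis_title="Districts",
    yaxis_title="Confirmed Cases",
    template='plotly_dark'
)
fig.show()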
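
Taking the first ten rows of a table gives the ten largest values only if the table is already sorted by that column. A minimal sketch of the difference between positional slicing and `DataFrame.nlargest`, using five rows from the district table in a deliberately shuffled order; the name `shuffled` is chosen for illustration only.

```python
import pandas as pd

# Rows from the district table above, deliberately shuffled
shuffled = pd.DataFrame({
    'Districts': ['Sarlahi', 'Rautahat', 'Mahottari', 'Dailekh', 'Kapilvastu'],
    'Confirmed': [616, 1159, 539, 748, 705],
})

# Slicing depends on row order, so it returns whatever happens to come first
by_position = shuffled['Districts'][0:3].tolist()

# nlargest orders by the column, so the result does not depend on row order
by_value = shuffled.nlargest(3, 'Confirmed')['Districts'].tolist()

print(by_position)  # ['Sarlahi', 'Rautahat', 'Mahottari']
print(by_value)     # ['Rautahat', 'Dailekh', 'Kapilvastu']
```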


### **Scatter Plot**

In [11]:
top10 = df.nlargest(10, 'Confirmed')

fig = go.Figure(data=[go.Scatter(
    x=top10['Districts'],
    y=top10['Confirmed'],
    mode='markers',
    marker=dict(
        # Colour each marker by its confirmed count so the colour scale is meaningful
        color=top10['Confirmed'],
        size=top10['Confirmed'] / 10,
        showscale=True
    )
)])

fig.update_layout(
    title='Ten most infected Districts of Nepal',
    xaxis_title="Districts",
    yaxis_title="Confirmed Cases",
    template='plotly_dark'
)
fig.show()

### **Pie Chart**

In [12]:
fig = px.pie(df, values='Confirmed',
             names='Districts',
             title='Confirmed cases',
            )
fig.update_traces(textposition='inside', textinfo='percent+label')
fig.update_layout(
    template='plotly_dark'
)
fig.show()

## Recovered Cases

In [13]:
Recovered_per_districts = (
    df.groupby("Districts")["Recovered"].sum()
      .reset_index()
      .sort_values("Recovered", ascending=False)
      .reset_index(drop=True)
)

### **Pie Chart**

In [14]:
fig = px.pie(Recovered_per_districts, values='Recovered',
             names='Districts',
             title='Recovered cases',
            )
fig.update_traces(textposition='inside', textinfo='percent+label')
fig.update_layout(
    template='plotly_dark'
)
fig.show()

### **Bar Chart**

In [15]:
fig = go.Figure(data=[go.Bar(
            x=Recovered_per_districts['Districts'][0:10], y=Recovered_per_districts['Recovered'][0:10],
            text=Recovered_per_districts['Recovered'][0:10],
            textposition='auto',
            marker_color='red',

        )])
fig.update_layout(
    title='Ten Districts With the Most Recoveries',
    xaxis_title="Districts",
    yaxis_title="Recovered Cases",
    template='plotly_dark'
)
fig.show()

### **Scatter Plot**

In [16]:
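fig = go.Figure(data=[go.Scatter(
    x=Recovered_per_districts['Districts'][0:10],
    y=Recovered_per_districts['Recovered'][0:10],
    mode='markers',
    marker=dict(
        # Colour and size follow the recovered counts instead of random or hard-coded values
        color=Recovered_per_districts['Recovered'][0:10],
        # Scale so that the largest marker is 100 px across
        size=100 * Recovered_per_districts['Recovered'][0:10] / Recovered_per_districts['Recovered'].max(),
        showscale=True
        )
)])
fig.update_layout(
    title='Ten Districts With the Most Recoveries',
    xaxis_title="Districts",
    yaxis_title="Recovered Cases",
    template='plotly_dark'
)
fig.show()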
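
Marker sizes are given in pixels, so raw counts usually need scaling before they are used as sizes. One approach described in the Plotly bubble chart documentation is to pass the raw values as `size` together with `sizemode='area'` and a `sizeref` computed from the largest value and the desired maximum marker size. Below is a minimal sketch of that calculation. The helper `bubble_marker` and its parameter `max_marker_px` are hypothetical names, and the counts are the Recovered values of the first five districts in the table above.

```python
def bubble_marker(values, max_marker_px=60):
    """Build a marker dict whose marker areas are proportional to `values`."""
    values = list(values)
    # sizeref formula as given in the Plotly bubble chart documentation
    sizeref = 2.0 * max(values) / (max_marker_px ** 2)
    return dict(size=values, sizemode='area', sizeref=sizeref, sizemin=4)


recovered = [243, 300, 40, 129, 26]
marker = bubble_marker(recovered)
print(marker['sizeref'])
```

The returned dict can be passed as `marker=bubble_marker(values)` to `go.Scatter`, with `mode='markers'`.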

## Active cases

In [17]:
Active_per_district = (
    df.groupby("Districts")["Active"].sum()
      .reset_index()
      .sort_values("Active", ascending=False)
      .reset_index(drop=True)
)

Active cases in each district of Nepal, in tabular form.

In [18]:
headerColor = 'grey'
rowEvenColor = 'lightgrey'
rowOddColor = 'white'

fig = go.Figure(data=[go.Table(
  header=dict(
    values=['<b>Districts</b>','<b>Active Cases</b>'],
    line_color='darkslategray',
    fill_color=headerColor,
    align=['left','center'],
    font=dict(color='white', size=12)
  ),
  cells=dict(
    values=[
      Active_per_district['Districts'],
      Active_per_district['Active'],
      ],
    line_color='darkslategray',
    # 2-D list of colors for alternating rows; a two-color pattern repeats without a break
    fill_color = [[rowOddColor, rowEvenColor]*len(df)],
    align = ['left', 'center'],
    font = dict(color = 'darkslategray', size = 11)
    ))
])
fig.update_layout(
    title='Active Cases in Each District',
    template='plotly_dark'
)
fig.show()
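
Alternating row colors need one color per row, with neighbouring rows differing. A minimal sketch of building such a list for any number of rows; the helper `alternating_colors` is hypothetical and not part of Plotly. It also shows why repeating a five-color pattern that starts and ends with the same color gives two equal neighbours at every repeat boundary.

```python
def alternating_colors(n_rows, odd='white', even='lightgrey'):
    """Return one fill color per row, starting with `odd` for the first row."""
    return [odd if i % 2 == 0 else even for i in range(n_rows)]


# A five-color pattern that starts and ends with 'white', repeated twice
pattern = ['white', 'lightgrey', 'white', 'lightgrey', 'white'] * 2
print(pattern[4], pattern[5])  # white white: two adjacent rows share a color

colors = alternating_colors(10)
print(colors[4], colors[5])    # white lightgrey
```

The result can be wrapped in an outer list, as in `fill_color=[alternating_colors(len(df))]`, to match the 2-D form used in the table code.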

### **Bar Chart**

In [19]:
fig = go.Figure(data=[go.Bar(
            x=Active_per_district['Districts'][0:10], y=Active_per_district['Active'][0:10],
            text=Active_per_district['Active'][0:10],
            marker_color='yellow',
           
        )])
fig.update_layout(
    title='Ten Districts With the Most Active Cases',
    xaxis_title="Districts",
    yaxis_title="Active Cases",
    template='plotly_dark'
)
fig.show()

### **Scatter Plot**

In [20]:
fig = go.Figure(data=[go.Scatter(
    x=Active_per_district['Districts'][0:10],
    y=Active_per_district['Active'][0:10],
    mode='markers',
    marker=dict(
        # Colour each marker by its active-case count rather than by random noise
        color=Active_per_district['Active'][0:10],
        size=Active_per_district['Active'][0:10]/10,
        showscale=True
        )
)])
fig.update_layout(
    title='Ten Districts With the Most Active Cases',
    xaxis_title="Districts",
    yaxis_title="Active Cases",
    template='plotly_dark'
)
fig.show()

## Deaths in Each District

In [21]:
Deaths_per_district = (
    df.groupby("Districts")["Deaths"].sum()
      .reset_index()
      .sort_values("Deaths", ascending=False)
      .reset_index(drop=True)
)

Deaths in each district in tabular form, ordered from the most deaths to the fewest.

In [22]:
headerColor = 'grey'
rowEvenColor = 'lightgrey'
rowOddColor = 'white'

fig = go.Figure(data=[go.Table(
  header=dict(
    values=['<b>Districts</b>','<b>Deaths</b>'],
    line_color='darkslategray',
    fill_color=headerColor,
    align=['left','center'],
    font=dict(color='white', size=12)
  ),
  cells=dict(
    values=[
      Deaths_per_district['Districts'],
      Deaths_per_district['Deaths'],
      ],
    line_color='darkslategray',
    # 2-D list of colors for alternating rows; a two-color pattern repeats without a break
    fill_color = [[rowOddColor, rowEvenColor]*len(df)],
    align = ['left', 'center'],
    font = dict(color = 'darkslategray', size = 11)
    ))
])
fig.update_layout(
    title='Deaths in Each District',
    template = 'plotly_dark'
)
fig.show()

### **Bar Graph**

In [23]:
fig = go.Figure(data=[go.Bar(
            x=Deaths_per_district['Districts'][0:10], y=Deaths_per_district['Deaths'][0:10],
            text=Deaths_per_district['Deaths'][0:10],
            textposition='auto',
            marker_color='darkviolet'

        )])
fig.update_layout(
    title='Ten Districts With the Most Deaths',
    xaxis_title="Districts",
    yaxis_title="Deaths",
        template='plotly_dark'

)
fig.show()

### **Scatter Plot**

In [24]:
fig = go.Figure(data=[go.Scatter(
    x=Deaths_per_district['Districts'][0:10],
    y=Deaths_per_district['Deaths'][0:10],
    mode='markers',
    marker=dict(
        # Colour each marker by its death count instead of a hard-coded list
        color=Deaths_per_district['Deaths'][0:10],
        # Death counts are small, so scale them up to get visible markers
        size=Deaths_per_district['Deaths'][0:10]*20,
        showscale=True
        )
)])
fig.update_layout(
    title='Ten Districts With the Most Deaths',
    xaxis_title="Districts",
    yaxis_title="Deaths",
    template='plotly_dark'
)
fig.show()

# Time Series Analysis

The source for the time series data is https://github.com/datasets/covid-19/tree/master/data



In [25]:
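import io
import requests

url = "https://raw.githubusercontent.com/datasets/covid-19/master/data/time-series-19-covid-combined.csv"
# Fail early on network problems instead of parsing an error page
response = requests.get(url, timeout=10)
response.raise_for_status()
time_series_data = pd.read_csv(io.StringIO(response.content.decode('utf-8')))
time_series_data.head()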

Unnamed: 0,Date,Country/Region,Province/State,Lat,Long,Confirmed,Recovered,Deaths
0,2020-01-22,Afghanistan,,33.0,65.0,0.0,0.0,0.0
1,2020-01-23,Afghanistan,,33.0,65.0,0.0,0.0,0.0
2,2020-01-24,Afghanistan,,33.0,65.0,0.0,0.0,0.0
3,2020-01-25,Afghanistan,,33.0,65.0,0.0,0.0,0.0
4,2020-01-26,Afghanistan,,33.0,65.0,0.0,0.0,0.0


In [26]:
Data_Nepal = time_series_data[(time_series_data['Country/Region'] == 'Nepal') ].reset_index(drop=True)
Data_Nepal.head()

Unnamed: 0,Date,Country/Region,Province/State,Lat,Long,Confirmed,Recovered,Deaths
0,2020-01-22,Nepal,,28.1667,84.25,0.0,0.0,0.0
1,2020-01-23,Nepal,,28.1667,84.25,0.0,0.0,0.0
2,2020-01-24,Nepal,,28.1667,84.25,0.0,0.0,0.0
3,2020-01-25,Nepal,,28.1667,84.25,1.0,0.0,0.0
4,2020-01-26,Nepal,,28.1667,84.25,1.0,0.0,0.0


Adding the active cases column

In [27]:
Data_Nepal['Active'] = Data_Nepal['Confirmed']-(Data_Nepal['Recovered'] + Data_Nepal['Deaths'])

In [28]:
Data_Nepal.tail()

Unnamed: 0,Date,Country/Region,Province/State,Lat,Long,Confirmed,Recovered,Deaths,Active
149,2020-06-19,Nepal,,28.1667,84.25,8274.0,1402.0,22.0,6850.0
150,2020-06-20,Nepal,,28.1667,84.25,8605.0,1578.0,22.0,7005.0
151,2020-06-21,Nepal,,28.1667,84.25,9026.0,1772.0,23.0,7231.0
152,2020-06-22,Nepal,,28.1667,84.25,9561.0,2148.0,23.0,7390.0
153,2020-06-23,Nepal,,28.1667,84.25,10099.0,2224.0,24.0,7851.0
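
By default `pd.read_csv` reads the `Date` column as plain strings. A minimal sketch, using three rows copied from the output above as inline CSV, of parsing the dates while reading and recomputing the Active column to check it against the table; the names `csv_text`, `sample` and `latest` are chosen for illustration only.

```python
import io

import pandas as pd

# The last three rows of the Nepal series shown above, as inline CSV
csv_text = """Date,Country/Region,Confirmed,Recovered,Deaths
2020-06-21,Nepal,9026,1772,23
2020-06-22,Nepal,9561,2148,23
2020-06-23,Nepal,10099,2224,24
"""

# parse_dates converts the column to datetime64 instead of leaving it as text
sample = pd.read_csv(io.StringIO(csv_text), parse_dates=['Date'])
sample['Active'] = sample['Confirmed'] - (sample['Recovered'] + sample['Deaths'])

# With a datetime column, the most recent row can be selected by date
latest = sample.loc[sample['Date'] == sample['Date'].max()].iloc[0]
print(sample['Active'].tolist())  # [7231, 7390, 7851]
print(latest['Date'].date(), latest['Active'])
```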


## Evolution of corona cases over time

In [29]:
fig = go.Figure()
fig.add_trace(go.Scatter(x=Data_Nepal['Date'], y=Data_Nepal['Confirmed'],
                         mode='lines',
                         name='Confirmed cases'))
fig.add_trace(go.Scatter(x=Data_Nepal['Date'], y=Data_Nepal['Active'],
                         mode='lines',
                         name='Active cases', line=dict(dash='dot')))
# Red rather than black, so the line stays visible on the dark template
fig.add_trace(go.Scatter(x=Data_Nepal['Date'], y=Data_Nepal['Deaths'],
                         mode='lines',
                         name='Deaths', marker_color='red', line=dict(dash='dot')))
fig.add_trace(go.Scatter(x=Data_Nepal['Date'], y=Data_Nepal['Recovered'],
                         mode='lines',
                         name='Recovered cases', marker_color='green'))
fig.update_layout(
    title='Evolution of cases over time in Nepal',
    template='plotly_dark'
)

fig.show()

## Evolution of Confirmed cases over time

In [30]:
fig = go.Figure()
fig.add_trace(go.Scatter(x=Data_Nepal.index, y=Data_Nepal['Confirmed'],
                    mode='markers',
                    name='Confirmed cases'))


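fig.update_layout(
    title='Evolution of Confirmed cases over time in Nepal',
    # The x-axis is the row index, which counts days from the first row (2020-01-22)
    xaxis_title='Days since 2020-01-22',
    yaxis_title='Confirmed cases',
    template='plotly_dark'
)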

fig.show()

## Evolution of Active cases over time

In [31]:
fig = go.Figure()


fig.add_trace(go.Scatter(x=Data_Nepal.index, y=Data_Nepal['Active'],
                    mode='lines',marker_color='yellow',
                    name='Active cases',line=dict( dash='dot')))

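fig.update_layout(
    title='Evolution of Active cases over time in Nepal',
    # The x-axis is the row index, which counts days from the first row (2020-01-22)
    xaxis_title='Days since 2020-01-22',
    yaxis_title='Active cases',
    template='plotly_dark'
)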

fig.show()

## Evolution of Recovered cases over time in Nepal

In [32]:
fig = go.Figure()

fig.add_trace(go.Scatter(x=Data_Nepal.index, y=Data_Nepal['Recovered'],
                    mode='lines',
                    name='Recovered cases',marker_color='green'))

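fig.update_layout(
    title='Evolution of Recovered cases over time in Nepal',
    # The x-axis is the row index, which counts days from the first row (2020-01-22)
    xaxis_title='Days since 2020-01-22',
    yaxis_title='Recovered cases',
    template='plotly_dark'
)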

fig.show()

## Evolution of deaths over time in Nepal

In [33]:
fig = go.Figure()

fig.add_trace(go.Scatter(x=Data_Nepal.index, y=Data_Nepal['Deaths'],name='Deaths',
                                   marker_color='red',mode='lines',line=dict( dash='dot') ))

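fig.update_layout(
    title='Evolution of Deaths over time in Nepal',
    # The x-axis is the row index, which counts days from the first row (2020-01-22)
    xaxis_title='Days since 2020-01-22',
    yaxis_title='Deaths',
    template='plotly_dark'
)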

fig.show()

This is a short analysis based on simple data about the coronavirus in Nepal. It will be updated with more information in the future.
Please upvote if you like it, and share your feedback so that I can improve.