<p style = "font-family:palatino linotype,serif;font-size:25px;">
    It is widely believed that once a large enough share of a population is immunized against the coronavirus, the population acquires collective (herd) immunity, after which the spread of the virus should decline rapidly. Let's chart the countries with the highest share of immunized people.
    </p>

In [None]:
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))
import warnings
warnings.filterwarnings("ignore")

In [None]:
df=pd.read_csv('/kaggle/input/covid-world-vaccination-progress/country_vaccinations.csv')
#df.isna().sum()

<p style = "font-family:palatino linotype,serif;font-size:25px;">
    Let's get the maximum reported number of fully vaccinated people for each country.
    </p>

In [None]:
#drop rows where the number of fully vaccinated people is missing
people_fully_vaccinated=df.dropna(subset=['people_fully_vaccinated'])
#drop rows without total vaccinations; copy so that columns can be added later without a SettingWithCopyWarning
people_vaccinated=df.dropna(subset=['total_vaccinations']).copy()
#show the share of countries that remain after dropping NaN
print('% of countries retained:', 100*people_fully_vaccinated['country'].nunique()/df['country'].nunique())
#get the max number of fully vaccinated people per country (numeric columns only)
people_fully_vaccinated_by_countries=people_fully_vaccinated.groupby('country').max(numeric_only=True).sort_values(by='people_fully_vaccinated_per_hundred', ascending=False)
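
<p style = "font-family:palatino linotype,serif;font-size:25px;">
    To make the step above easier to follow, here is a minimal sketch on toy data. The country names and numbers are hypothetical and are not taken from the real dataset. It assumes the cumulative counts never decrease, which is the only case in which the per-country maximum equals the latest reported value.
    </p>

```python
import pandas as pd

# Toy data (hypothetical values, not taken from the real dataset)
toy = pd.DataFrame({
    "country": ["A", "A", "A", "B", "B"],
    "date": ["2021-01-01", "2021-01-02", "2021-01-03", "2021-01-01", "2021-01-02"],
    "people_fully_vaccinated": [10.0, None, 30.0, 5.0, None],
    "people_fully_vaccinated_per_hundred": [1.0, None, 3.0, 0.5, None],
})

# Keep only the rows where the cumulative count was actually reported
reported = toy.dropna(subset=["people_fully_vaccinated"])

# Column-wise maximum per country, then rank countries by vaccinated share
latest = (
    reported.groupby("country")
    .max(numeric_only=True)
    .sort_values(by="people_fully_vaccinated_per_hundred", ascending=False)
)
print(latest)
```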

<p style = "font-family:palatino linotype,serif;font-size:25px;">
    In this chunk, we estimate the total share of people who have acquired immunity against the coronavirus, including those who recovered from the disease.
    </p>
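
<p style = "font-family:palatino linotype,serif;font-size:25px;">
    The arithmetic behind this estimate can be sketched as follows. The function name and the input numbers are hypothetical. Note that simply adding the two shares counts people who both recovered and were vaccinated twice, so the result is better read as an upper bound.
    </p>

```python
def immunized_share(fully_vaccinated, fully_vaccinated_per_hundred, recovered):
    """Return (population, recovered_per_hundred, immunized_per_hundred).

    The population is not given directly, so it is derived from the
    absolute and per-hundred numbers of fully vaccinated people.
    """
    population = fully_vaccinated * 100 / fully_vaccinated_per_hundred
    recovered_per_hundred = 100 * recovered / population
    immunized = recovered_per_hundred + fully_vaccinated_per_hundred
    return population, recovered_per_hundred, immunized


# Hypothetical country: 2,000,000 fully vaccinated (20 per hundred), 500,000 recovered
population, recovered_share, immunized = immunized_share(2_000_000, 20.0, 500_000)
print(population, recovered_share, immunized)  # prints: 10000000.0 5.0 25.0
```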

In [None]:
#look at the ratio of fully vaccinated people
people_fully_vaccinated_by_countries['people_fully_vaccinated_per_hundred']
people_fully_vaccinated_by_countries.reset_index(inplace=True)

#get the list of the countries
#list_of_these_countries=list(people_fully_vaccinated_by_countries['country'])

#let's include the ratio of people who acquired immunity through infection in these countries
real_time_covid_data=pd.read_csv('/kaggle/input/covid-19/data/countries-aggregated.csv')
#get the most recent (maximum cumulative) number of recovered per country
real_time_covid_data_recovered=real_time_covid_data.groupby('Country').max(numeric_only=True)
real_time_covid_data_recovered.reset_index(inplace=True)

#merge these two tables
real_time_covid_data_recovered['country']=real_time_covid_data_recovered['Country']
result = pd.merge(people_fully_vaccinated_by_countries, real_time_covid_data_recovered[['country','Recovered']], how="left", on=["country"])

#manually add the number of recovered for the US (from Worldometer) and remove NaN
result.set_index('country', inplace=True)
result.loc[['United States'],['Recovered']]=15330949
result.reset_index(inplace=True)
result.dropna(subset=['Recovered'], inplace=True)
#get the ratio of recovered people to total population
result['population']=result['people_fully_vaccinated']*100/result['people_fully_vaccinated_per_hundred']
result['Recovered_per_hundred']=100*result['Recovered']/result['population']
#get total ratio of immunized people and remove zero values
result['immunized']=result['Recovered_per_hundred']+result['people_fully_vaccinated_per_hundred']
result.sort_values(by='immunized', ascending=False,inplace=True)
result= result[result['immunized'] != 0]
result[['country','immunized']]
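
<p style = "font-family:palatino linotype,serif;font-size:25px;">
    A left merge silently leaves NaN wherever the two sources spell a country name differently, which is one possible reason a manual fix is needed. The sketch below shows one way to list such countries with the <code>indicator</code> option of <code>pd.merge</code>. The tables and spellings here are hypothetical stand-ins, not the real data.
    </p>

```python
import pandas as pd

# Hypothetical stand-ins for the two tables; spellings and numbers are illustrative only
vaccinated = pd.DataFrame({
    "country": ["Israel", "United States", "Chile"],
    "people_fully_vaccinated_per_hundred": [50.0, 15.0, 10.0],
})
recovered = pd.DataFrame({
    "country": ["Israel", "US", "Chile"],
    "Recovered": [700_000, 0, 800_000],
})

# indicator=True adds a "_merge" column saying where each row was found
checked = pd.merge(vaccinated, recovered, how="left", on="country", indicator=True)

# Rows marked "left_only" had no partner in the recovered table
unmatched = checked.loc[checked["_merge"] == "left_only", "country"].tolist()
print(unmatched)  # prints: ['United States']
```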

In [None]:
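from geopy.geocoders import Nominatim
from geopy.extra.rate_limiter import RateLimiter
#Nominatim expects an identifying user agent and at most one request per second
geolocator = Nominatim(user_agent="wsdfb")
geocode = RateLimiter(geolocator.geocode, min_delay_seconds=1)
latitude=[]
longitude=[]
errors=[]
proved_regions=[]
regions=list(result.country)
for i in regions:
    #geocode returns None when the name cannot be resolved
    location = geocode(str(i))
    if location is None:
        errors.append(i)
        continue
    latitude.append(location.latitude)
    longitude.append(location.longitude)
    proved_regions.append(i)
#store the coordinates so they can be reused without geocoding again
df1=pd.DataFrame({'country':proved_regions,'latitude':latitude,'longitude':longitude})
df1.to_csv(r'df1.csv')
#attach the coordinates to the main table
result = pd.merge(result, df1, how="left", on=["country"])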

<p style = "font-family:palatino linotype,serif;font-size:25px;">
    As we can see, Israel is the most likely candidate to reach collective immunity among the countries that have started vaccination.
    </p>

In [None]:
#visualize the result
import matplotlib.pyplot as plt
import plotly.express as px
fig = px.scatter_mapbox(result,
                        lat="latitude",
                        lon="longitude",
                        color="Recovered_per_hundred",
                        color_continuous_scale='Bluered',
                        hover_name="country",
                        height=600,
                        zoom=1,
                        size="immunized",
                        size_max=40,
                        opacity=0.4,
                        width=1300)
fig.update_layout(mapbox_style='open-street-map')
fig.show()

<p style = "font-family:palatino linotype,serif;font-size:25px;">
    Let's explore how immunization progresses over time using time series data, counting all administered doses rather than only full vaccinations. The formula: total vaccinations per hundred people + number of recovered per hundred people.
    </p>

In [None]:
real_time_covid_data.reset_index(inplace=True)

#merge the time series data: daily recovered counts and vaccinations
people_vaccinated['Date']=people_vaccinated['date']
people_vaccinated['Country']=people_vaccinated['country']
result_daily_cases = pd.merge(real_time_covid_data, people_vaccinated, how="left", on=["Country",'Date'])
#get population of the countries and merge with our table
population=pd.read_csv('/kaggle/input/world-population/WorldPopulation.csv')
population['Country']=population['Country Name']
population=population[['Country','2019','Country Code']]
result_daily_cases = pd.merge(result_daily_cases, population, how="left", on=["Country"])
#get immunized ratio to total population
result_daily_cases['Recovered_per_hundred']=100*result_daily_cases['Recovered']/result_daily_cases['2019']
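#treat missing vaccination values as 0; assign the result back instead of a chained inplace call, which may not update the DataFrame in recent pandas
result_daily_cases['total_vaccinations_per_hundred']=result_daily_cases['total_vaccinations_per_hundred'].fillna(0)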
result_daily_cases['immunized']=result_daily_cases['Recovered_per_hundred']+result_daily_cases['total_vaccinations_per_hundred']
result_daily_cases.sort_values(by=['Country','Date'], ascending=True,inplace=True)
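
<p style = "font-family:palatino linotype,serif;font-size:25px;">
    Filling missing vaccination values with 0, as the cell above does, makes a cumulative series drop back to zero on every day without a report. A possible alternative, shown here on hypothetical toy data, is to carry the last known value forward within each country and use 0 only for days before the first report. This is a sketch of one option, not the method used for the charts in this notebook.
    </p>

```python
import pandas as pd

# Toy data (hypothetical): country A has a reporting gap, country B has no reports at all
toy = pd.DataFrame({
    "Country": ["A", "A", "A", "A", "B", "B"],
    "Date": ["2021-01-01", "2021-01-02", "2021-01-03", "2021-01-04",
             "2021-01-01", "2021-01-02"],
    "total_vaccinations_per_hundred": [None, 1.5, None, 2.0, None, None],
})

# Sort chronologically so that "forward" means "later in time"
toy = toy.sort_values(by=["Country", "Date"])

# Forward-fill within each country, then use 0 for days before the first report
toy["filled"] = (
    toy.groupby("Country")["total_vaccinations_per_hundred"].ffill().fillna(0)
)
print(toy["filled"].tolist())  # prints: [0.0, 1.5, 1.5, 2.0, 0.0, 0.0]
```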

In [None]:
result_daily_cases.dropna(subset=['immunized'],inplace=True)
result_daily_cases['Immunization, % of population']=result_daily_cases['immunized']
fig = px.choropleth(result_daily_cases,                            
                     locations="Country Code",           
                     color='Immunization, % of population',                    
                     hover_name="Country",             
                     animation_frame="Date",       
                     projection= 'natural earth',        
                     color_continuous_scale= 'Reds',  
                     range_color=[0,10])    
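#speed up the animation: 5 ms per frame and per transition
fig.layout.updatemenus[0].buttons[0].args[1]['frame']['duration'] = 5
fig.layout.updatemenus[0].buttons[0].args[1]['transition']['duration'] = 5
fig.show()
#save the interactive map as a standalone HTML file
fig.write_html("example_map.html")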