This Python program uses text-based input to let a user interact
with selected columns of the Our World in Data COVID-19 dataset.

The Pandas and Matplotlib libraries were the main tools used for the data analysis.
Pandas provides functions for cleaning the data and producing summary information.

The Matplotlib library is used for data visualization.


Project Workflow:
Data Loading and Preparation
User Interaction for Data Exploration
Statistical Summary for Selected Countries
Comparative Analysis between Africa and Europe


In [11]:
# importing necessary libraries
import pandas as pd
import matplotlib.pyplot as plt

# reading the data file
df_country = pd.read_csv("covid_data.csv")

# the .unique() method removes duplicate values from a column, returning each value only once
country_names = df_country.location.unique()

# displaying the columns
df_country.columns

# list() called with no arguments creates an empty list
columns = list()
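
To illustrate what .unique() returns, here is a minimal sketch on a small made-up table. The `toy` DataFrame and its values are hypothetical stand-ins for covid_data.csv, not data from the real file.

```python
import pandas as pd

# Hypothetical toy data standing in for covid_data.csv
toy = pd.DataFrame({
    "location": ["Angola", "Angola", "Egypt", "Ghana", "Egypt"],
    "total_cases": [10, 12, 5, 7, 9],
})

# unique() keeps one copy of each value, in order of first appearance
names = toy["location"].unique()
print(list(names))

# The result can be used for membership checks, as the notebook does with country_names
print("Egypt" in list(names))
```

Printing the unique values early is also a cheap way to spot entries that do not look like country names before they reach the user.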





This block of code uses a for loop to iterate over the column names. A conditional statement checks for specific names and appends each match to an initially empty list.
    

In [12]:
# Collect the columns of interest, in the order they appear in the dataset
for i in df_country.columns:
    if i in ("total_cases", "new_cases", "population", "aged_65_older", "median_age", "new_vaccinations", "total_vaccinations",
             "icu_patients", "hosp_patients", "total_deaths", "new_deaths"):
        columns.append(i)

# Create a dictionary to map column indices to column names
column_selector = {k: v for k, v in enumerate(columns)}  # a dictionary created with a dict comprehension

# Welcome message and instructions
print("Hello and welcome to the COVID-19 data exploration tools")
print(" ")
print("Select a valid attribute to check from the options below")
print(column_selector)
print(" ")

Hello and welcome to the COVID-19 data exploration tools
 
Select a valid attribute to check from the options below
{0: 'total_cases', 1: 'new_cases', 2: 'total_deaths', 3: 'new_deaths', 4: 'icu_patients', 5: 'hosp_patients', 6: 'total_vaccinations', 7: 'new_vaccinations', 8: 'median_age', 9: 'aged_65_older', 10: 'population'}
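
The filter-then-enumerate pattern used to build the menu can be sketched without the dataset. The names `all_columns` and `wanted` below are hypothetical; `all_columns` stands in for df_country.columns.

```python
# Hypothetical column list standing in for df_country.columns
all_columns = ["iso_code", "location", "total_cases", "new_cases", "population"]
wanted = {"total_cases", "new_cases", "population"}

# Keep only the wanted names, preserving the dataset's column order
columns = [name for name in all_columns if name in wanted]

# enumerate() pairs each name with a running index starting at 0;
# the dict comprehension turns those pairs into a number -> name menu
column_selector = {index: name for index, name in enumerate(columns)}
print(column_selector)
```

Because the loop walks the dataset's columns rather than the list of wanted names, the menu numbers follow the column order in the file, which is why the printed menu above differs from the order in which the names are written in the if statement.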
 


A while loop keeps a block of code running until a desired outcome is achieved. It is usually paired with the break statement, which ends the loop. In the code below, .isnumeric(), a built-in Python string method, checks that the value entered consists only of numeric characters. Once this condition is met, the code proceeds as normal. The .get() dictionary method is also used, so that when a number present in the column selector is entered, the corresponding column name is returned.

In [13]:
# Note: this stats() is replaced by the definition in the next cell,
# and the x_selected parameter is not used here.
def stats(x_selected):
    while True:
        number = input("Enter a number from the above \n")
        result = number.isnumeric()
        # "if result:" means the same thing as "if result == True:"
        if result:
            valid_attribute = int(number)
            if valid_attribute not in column_selector:
                # a number was entered, but it is not one of the menu keys
                print(column_selector)
                print("Please select a number from above")
            else:
                # look up the column by the number entered, not by the boolean result
                item_selected = column_selector.get(valid_attribute)
                print(item_selected)
                break
        else:
            print(column_selector)
            print("Invalid command. Please enter a number.")

print(" ")
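
The validation step can also be separated from input() so that it is easy to check on its own. The sketch below is one possible shape, not the notebook's method: `parse_choice` and `menu` are hypothetical names, and it uses str.isdecimal() instead of .isnumeric() because .isnumeric() also accepts characters such as "²" that int() cannot convert.

```python
def parse_choice(text, options):
    """Return the option selected by text, or None if the input is not valid."""
    text = text.strip()
    # isdecimal() is True only for strings made entirely of decimal digits
    if not text.isdecimal():
        return None
    # dict.get() returns None when the number is not one of the keys
    return options.get(int(text))


# Hypothetical shortened menu standing in for column_selector
menu = {0: "total_cases", 1: "new_cases", 2: "total_deaths"}

print(parse_choice("2", menu))    # a valid key
print(parse_choice("7", menu))    # numeric, but not in the menu
print(parse_choice("abc", menu))  # not a number

# Pitfall: True equals 1 in Python, so looking up a boolean silently returns entry 1
print(menu.get(True))
```

The last line shows why the lookup has to use the integer the user typed rather than the True/False result of the validity check.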

 


This function gives the summary information for the chosen column for the selected country.
:param x_selected: the name of the column to summarize, chosen by the user from the column selector dictionary.
:return: None; the summary is printed rather than returned.

In [14]:
# Function to display statistical summary for selected countries
def stats(x_selected):
    while True:
        print(country_names)
        print(" ")
        x = input("Type the country you want information about from the list above. \n").strip().title()
        # The .title() method capitalizes the first letter of every word and lowercases the rest, which matches
        # the format of the names in country_names (e.g. "South Africa"). Python string comparison is case
        # sensitive, and .capitalize() would turn "south africa" into "South africa", so the check below would fail.
        if x not in country_names:  # Checks if the user input is among the list of displayed options.
            print("info on selected country not available or kindly check your spelling.")
            print("enter a valid country")
            continue
        else:
            print(" ")
            print(f"Would you like to see summary information on the selected country '{x}'")  # f-string is used to display the selected country
            answer = input(" Type in yes to continue or any key to check another parameter \n").upper()
            if answer == "YES":
                print(df_country[df_country.location == x].describe()[x_selected])
                # The .describe() method is used to display the summary statistics for each country, this would give an overview of the data
                answer = input("Would you like to check for another country?\n enter yes to continue or any key to check another parameter \n").lower()
                if answer == "yes":
                    continue  # In the event that the user wants to continue the analysis with another country
                              # the keyword continue returns the program to the top of the loop
                break
            print(" ")
            break

print(" ")
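
The filter-and-describe step inside stats() can be tried on a small made-up table. In this sketch the `toy` DataFrame and the helper name `country_summary` are hypothetical, and the values are invented for illustration only.

```python
import pandas as pd

# Hypothetical toy data standing in for covid_data.csv
toy = pd.DataFrame({
    "location": ["Angola", "Angola", "Angola", "Egypt"],
    "total_deaths": [0.0, 10.0, 20.0, 5.0],
})


def country_summary(df, country, column):
    """Return describe() statistics of one column for one country."""
    # Boolean mask keeps only the rows for the requested country
    rows = df[df["location"] == country]
    # Describing just the selected column avoids summarizing every numeric column
    return rows[column].describe()


summary = country_summary(toy, "Angola", "total_deaths")
print(summary)
```

Selecting the column before calling .describe() should give the same numbers as describing the whole filtered frame and then picking the column, while doing less work on a wide dataset.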


 


In [16]:
# Function to compare selected parameter between Africa and Europe
def compare():
    print(column_selector)
    while True:
        try:
            number = int(input("Enter a number from the above \n"))
            break
        except ValueError:
            print("Select a number")

    item_selected = column_selector.get(number, "total_deaths")

    if item_selected in ("total_cases", "new_cases", "total_deaths", "new_deaths", "icu_patients", "hosp_patients"):
        # If the item selected is among the listed items, a bar chart is plotted
        print(f"The parameter selected was {item_selected}")
        continent_africa = df_country[df_country.continent == "Africa"].groupby("location")[item_selected].mean().reset_index()
        continent_europe = df_country[df_country.continent == "Europe"].groupby("location")[item_selected].mean().reset_index()

        plt.figure(figsize=(15, 8))
        plt.subplot(2, 1, 1)
        plt.bar(continent_africa['location'], continent_africa[item_selected])
        plt.title(f'{item_selected} in African Countries')
        plt.xticks(rotation=90)

        plt.subplot(2, 1, 2)
        plt.bar(continent_europe['location'], continent_europe[item_selected])
        plt.title(f'{item_selected} in European Countries')
        plt.xticks(rotation=90)

        plt.tight_layout()
        plt.show()

    elif item_selected in ("total_vaccinations", "new_vaccinations", "median_age", "aged_65_older", "population"):
        # If the item selected is among the listed items, a line chart is plotted
        print(f"The parameter selected was {item_selected}")
        continent_africa = df_country[df_country.continent == "Africa"].groupby("location")[item_selected].mean().reset_index()
        continent_europe = df_country[df_country.continent == "Europe"].groupby("location")[item_selected].mean().reset_index()

        plt.figure(figsize=(15, 8))
        plt.subplot(2, 1, 1)
        plt.plot(continent_africa['location'], continent_africa[item_selected])
        plt.title(f'{item_selected} in African Countries')
        plt.xticks(rotation=90)

        plt.subplot(2, 1, 2)
        plt.plot(continent_europe['location'], continent_europe[item_selected])
        plt.title(f'{item_selected} in European Countries')
        plt.xticks(rotation=90)

        plt.tight_layout()
        plt.show()

    print("Would you like to compare another parameter?")
    selection = input("Type 'yes' to continue or any key to quit: \n").lower()
    if selection == "yes":
        compare()
    else:
        # print the farewell only once, when the user actually quits
        print("Goodbye")

# Main execution
while True:
    number = input("Enter a number from the above \n")
    if number.isnumeric():
        valid_attribute = int(number)
        if valid_attribute in column_selector:
            item_selected = column_selector.get(valid_attribute)
            print(f"You selected: {item_selected}")
            stats(item_selected)
            break
        # a number was entered, but it is not one of the menu keys
        print("Please select a number from the options below")
        print(column_selector)
    else:
        print("Invalid command. Please enter a number.")
        print(column_selector)

print(" ")
print("Would you like to compare parameters between Africa and Europe using a graph?")
choice = input("Enter 'yes' or 'no': \n").lower()
if choice == "yes":
    compare()
else:
    print("Thank you for your time.")


Enter a number from the above 
2
You selected: total_deaths
['Angola' 'Egypt' 'Ethiopia' 'United Kingdom' '250' 'Ghana' 'Kenya'
 'Madagascar' 'Nigeria' 'Rwanda' 'South Africa' 'Zimbabwe' 'Belgium'
 'Finland' 'France' 'Germany' 'Greece' 'Italy' 'Netherlands' 'Russia'
 'Spain']
 
Type the country you want information about from the list above. 
Angola
 
Would you like to see summary information on the selected country 'Angola'
 Type in yes to continue or any key to check another parameter 
yes
count     929.000000
mean     1002.663079
std       757.807244
min         0.000000
25%       303.000000
50%       889.000000
75%      1898.000000
max      1917.000000
Name: total_deaths, dtype: float64
Would you like to check for another country?
 enter yes to continue or any key to check another parameter 
no
 
Would you like to compare parameters between Africa and Europe using a graph?
Enter 'yes' or 'no': 
no
Thank you for your time.
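
Both branches of compare() compute the same per-country averages before plotting. A minimal sketch of that step as a reusable helper follows; the name `continent_means` and the `toy` data are hypothetical, and plotting is left out so the example runs without a display.

```python
import pandas as pd

# Hypothetical toy data standing in for covid_data.csv
toy = pd.DataFrame({
    "continent": ["Africa", "Africa", "Africa", "Europe", "Europe"],
    "location": ["Ghana", "Ghana", "Kenya", "Spain", "Spain"],
    "new_cases": [2.0, 4.0, 10.0, 30.0, 50.0],
})


def continent_means(df, continent, column):
    """Return the mean of column per country for one continent."""
    subset = df[df["continent"] == continent]
    # groupby() collects the rows of each country; mean() averages them;
    # reset_index() turns the country names back into a regular column
    return subset.groupby("location")[column].mean().reset_index()


africa = continent_means(toy, "Africa", "new_cases")
europe = continent_means(toy, "Europe", "new_cases")
print(africa)
print(europe)
```

With a helper like this, the bar-chart and line-chart branches would differ only in the plotting call, which could remove the duplicated filtering code.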
