# Week 5 Homework (my solution)

### Assignment Specification

Write two Python programs, part one and part two, to extract information about COVID-19 vaccination statistics for various countries.

### Part 1

Parse the vaccinations.csv file according to these guidelines:

- The first row has a description of each column.
- Each row after that contains one day's cumulative vaccination data for one country.
- The last row of data for each country contains the most recent vaccination data for that country.

The program should output the most recent statistics for the percentage of each country's population that has been vaccinated.

Create a dictionary to store the percentage value for each country. For one country, this percentage value can be updated multiple times until you have the most recent value stored.

- You should ignore any rows of the file that have either a blank or a value of zero in the total_vaccinations_per_hundred column.
- Percentages should be output with exactly one digit after the decimal point.
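
The rules above (later rows overwrite earlier ones, blank or zero values are skipped, one decimal place on output) can be sketched on a few made-up rows. This is only an illustration under stated assumptions: the sample data, the three-column layout, and the helper name `latest_percentages` are hypothetical and are not taken from vaccinations.csv or from the assignment.

```python
import csv
import io

# Made-up sample rows for illustration only; the real file has more columns.
SAMPLE = """country,date,total_vaccinations_per_hundred
Aland,2021-01-01,
Aland,2021-01-02,1.25
Aland,2021-01-03,2.53
Borduria,2021-01-01,0.0
Borduria,2021-01-02,0.71
Borduria,2021-01-03,
"""


def latest_percentages(lines, value_column):
    """Return {COUNTRY: most recent non-blank, non-zero percentage}."""
    latest = {}
    reader = csv.reader(lines)
    next(reader, None)  # skip the header row
    for row in reader:
        if len(row) <= value_column:
            continue  # malformed or short row
        raw = row[value_column].strip()
        # Comparing as a number also catches "0.0", not just "0".
        if raw == "" or float(raw) == 0:
            continue
        # Rows are in date order, so a later row simply overwrites the value.
        latest[row[0].upper()] = float(raw)
    return latest


result = latest_percentages(io.StringIO(SAMPLE), value_column=2)
for name, pct in result.items():
    print("{}: {:.1f}".format(name, pct))
```

Note that the trailing blank row for Borduria does not erase the earlier value, because skipped rows never touch the dictionary.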

Sample output:

### Your Solution

In [None]:
def part1():
    # First, define a dictionary to store your data.
    
    countries = {}
    
    # Open the vaccinations.csv file, and iterate through each
    # line after the first one. Split each line by the csv
    # delimiter, i.e. a comma ",", and read the relevant column
    # from each line into the dictionary. The key should be the
    # country name, converted to all caps to match the expected
    # output. We're looking for the column called
    # total_vaccinations_per_hundred, which is the column at
    # index 8 (zero-indexed).
    
    with open("../vaccinations.csv", "r") as f:
        # discard the first line
        f.readline()
        
        for line in f:
            columns = line.split(",")
            total_vax = columns[8]
            
            # ignore rows with a blank or zero value for total_vax
            # (comparing as a number also catches values like "0.0")
            if total_vax.strip() == "" or float(total_vax) == 0:
                continue
            
            country_name = columns[0].upper()
            
            countries[country_name] = float(total_vax)
            
    
    # Then, iterate through the dictionary and print each entry
    # according to the output specification.
    
    for country in countries:
        print("{}: {:.1f}".format(country, countries[country]))
    
# --- test your solution ---

part1()

### Part 2

Again, parse the vaccinations.csv file, but this time, take input from the user. When the program runs, it should ask the user to input a country. Then, it should output the following:

1. The vaccination percentage for the country the user chose.
2. The average vaccination percentage for all countries in the CSV file.
3. The country with the lowest vaccination percentage out of all countries, as well as the actual lowest vaccination percentage value found.
4. The country with the highest vaccination percentage out of all countries, as well as the actual highest vaccination percentage value found.

Additional requirements:

5. The program should accept a country name without regard to capitalisation (e.g. Canada, canada, and cAnADA would all be read the same).
6. If the user gives a country name that does not exist in the file, you should output a message stating so.
7. You should ignore any rows of the file that have either a blank or a value of zero in the *total_vaccinations_per_hundred* column.
8. Percentages should be output with exactly one digit after the decimal point.
9. If more than one country has the highest or lowest vaccination percentage, you can choose any of those countries' names to output.
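
As a rough sketch of requirements 2 to 6, the statistics can also be computed with Python's built-in `min`, `max`, and `sum` once the dictionary of percentages exists. The helper names `summarize` and `lookup` and the sample values are hypothetical, chosen only for illustration; this is one possible approach, not the required one.

```python
def summarize(percentages):
    """Return (lowest_country, highest_country, average), or None if empty."""
    if not percentages:
        return None  # avoids dividing by zero when no rows survived filtering
    # key=percentages.get compares countries by their stored percentage.
    # On ties, min/max return the first matching key, which requirement 9 allows.
    lowest = min(percentages, key=percentages.get)
    highest = max(percentages, key=percentages.get)
    average = sum(percentages.values()) / len(percentages)
    return lowest, highest, average


def lookup(percentages, user_text):
    """Case-insensitive lookup; returns None when the country is missing."""
    return percentages.get(user_text.strip().upper())


# Made-up values for illustration only.
sample = {"ALAND": 40.0, "BORDURIA": 10.0, "SYLDAVIA": 25.0}

lowest, highest, average = summarize(sample)
print("Worldwide average: {:.1f}%".format(average))
print("Lowest: {} with {:.1f}%".format(lowest, sample[lowest]))
print("Highest: {} with {:.1f}%".format(highest, sample[highest]))

value = lookup(sample, "aLaNd")
if value is None:
    print("There is no data for that country")
else:
    print("{:.1f}%".format(value))
```

Using `dict.get` for the lookup means a missing country yields `None` instead of raising `KeyError`, so the "no data" message can be handled with a plain `if`.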

Sample output:

### Your Solution

In [None]:
def part2():
    # Before you start, take input from the user for which country
    # they're interested in. Convert this info to UPPER CASE so that
    # we can compare it to the data.
    
    search = input("Which country would you like data for? ").upper()
    
    # If you like, you can copy your solution to part 1 as a
    # starting point for the rest.
    
    countries = {}
    
    with open("../vaccinations.csv", "r") as f:
        # discard the first line
        f.readline()
        
        for line in f:
            columns = line.split(",")
            total_vax = columns[8]
            
            # ignore rows with a blank or zero value for total_vax
            # (comparing as a number also catches values like "0.0")
            if total_vax.strip() == "" or float(total_vax) == 0:
                continue
            
            country_name = columns[0].upper()
            
            countries[country_name] = float(total_vax)
    
    # Once you've built your dictionary of countries, iterate through
    # it to compute the average (2), lowest (3), and highest (4)
    # population vaccination percentages, and to check whether the
    # country the user asked for is present.
    
    lowest = ""
    highest = ""
    average = 0
    search_found = False
    
    for c in countries:
        total_vax = countries[c]
        
        if lowest == "":
            lowest = c
        elif total_vax < countries[lowest]:
            lowest = c
        
        if highest == "":
            highest = c
        elif total_vax > countries[highest]:
            highest = c
        
        average += total_vax
        
        if search == c:
            search_found = True
    
    average = average / len(countries)
    
    # Finally, look up the country the user provided. First check if
    # it's in the dictionary. If it is, look up its value and print
    # it. If it isn't, let the user know they screwed up. Then print
    # the other statistics collected.
    
    if search_found:
        print("{} has vaccinated {:.1f}% of their population.".format(search, countries[search]))
    else:
        print("There is no data for the country {}".format(search))
    print("Worldwide average: {:.1f}%".format(average))
    print("Country with lowest percentage is {} with {:.1f}%".format(lowest, countries[lowest]))
    print("Country with highest percentage is {} with {:.1f}%".format(highest, countries[highest]))
    
# --- test your solution ---
part2()