# Exercises: Functions

### In this assignment I created functions to solve some problems on a coronavirus data set which has been adapted into a dictionary.
The goal of this assignment was for me to make a set of functions that can be re-used on any CSV file which is in the same format as corona.csv, thus if corona.csv were updated, all of my functions could be re-used to gather the same metrics as before. 
This project encouraged me to learn how to write re-usable functions and to practise skills in:
- Using the `zip()` function
- Adding comments within functions to explain what is happening and document my own thought process
- Using for loops to iterate through elements
- Utilising conditional statements to get the required values

In this notebook you will create numerous functions, all of which have a single parameter `data`.

Running the code cell below will assign to `latest` an example of an argument suitable for passing to each function as the `data` parameter. You'll see after each incomplete function a code cell which will call that function using `latest`, so that you can check your function is working as expected. **Don't change the function names**.

The dataset, which relates to Coronavirus cases across the world, was taken from [worldometers.info](https://www.worldometers.info/coronavirus/#countries) on February 19th 2020.

We have used the `pandas` package for convenience to import and process the dataset from the `corona.csv` file, which you can examine via Jupyter or a spreadsheet application if you want to.

 **Complete the subsequent exercises using Python only**.

In [1]:
import pandas as pd

df = pd.read_csv('data/corona.csv').fillna(0).astype(dtype = int, errors='ignore')\
    .sort_values(by='Total Cases', ascending=False)

latest = df.to_dict('list')

`latest` is a dictionary, where each key is a column heading in the CSV, and each value is a list containing the values in the given column from each row of the CSV:

In [2]:
latest.keys()

dict_keys(['Country', 'Total Cases', 'New Cases', 'Total Deaths', 'New Deaths', 'Recovered', 'Serious'])

We can therefore access the column and cell values as follows:

In [53]:
print(latest['Country'])

['China', 'Diamond Princess', 'Singapore', 'Japan', 'Hong Kong', 'S. Korea', 'Thailand', 'USA', 'Taiwan', 'Malaysia', 'Vietnam', 'Germany', 'Australia', 'France', 'Macao', 'U.K.', 'U.A.E.', 'Canada', 'Philippines', 'Italy', 'India', 'Russia', 'Spain', 'Nepal', 'Belgium', 'Sri Lanka', 'Finland', 'Egypt', 'Cambodia', 'Sweden']


... and elements at a given position in all of the lists are from the same row of the CSV:

In [54]:
print(latest['Country'][0])
print(latest['Total Cases'][0])

China
74187
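Because the same position in every list belongs to the same CSV row, a whole row can be rebuilt from its index. The following is a minimal sketch of that idea; the `sample` dictionary and the helper name `row_at` are made up for illustration and are not part of the exercise, and the values in `sample` are invented rather than taken from `corona.csv`.

```python
# Toy data in the same column-oriented shape as `latest`; all values are invented.
sample = {
    'Country': ['A', 'B', 'C'],
    'Total Cases': [10, 3, 1],
    'Total Deaths': [1, 0, 0],
    'Recovered': [4, 3, 0],
}


def row_at(data, index):
    """Collect the value at `index` from every column into one row dictionary."""
    return {column: values[index] for column, values in data.items()}


print(row_at(sample, 0))
# {'Country': 'A', 'Total Cases': 10, 'Total Deaths': 1, 'Recovered': 4}
```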


**When writing your functions, you can assume that the dataset will be ordered by `Total Cases`**, with the data for the countries with the highest number of cases coming first in each list. The number of rows in the CSV file may change, but the resulting lists (i.e. columns) will always be the same length as one another.
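If you would rather check these two assumptions than take them on trust, a small helper along the following lines could do it. This is only a sketch: `check_format` is a hypothetical name, not one of the required functions, and the `sample` values are invented.

```python
# Toy data in the same shape as `latest`; all values are invented.
sample = {
    'Country': ['A', 'B', 'C'],
    'Total Cases': [10, 3, 1],
}


def check_format(data):
    """Return True if every column has the same length and
    'Total Cases' is sorted from highest to lowest."""
    # A set of lengths has exactly one element when all columns match.
    lengths = {len(values) for values in data.values()}
    cases = data['Total Cases']
    # Compare each value with its successor to confirm descending order.
    is_sorted = all(a >= b for a, b in zip(cases, cases[1:]))
    return len(lengths) == 1 and is_sorted


print(check_format(sample))  # True
```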

Your goal is to make a set of functions that can be re-used on any CSV file which is in the same format as `corona.csv` and as described above; thus if `corona.csv` were updated, all of your functions could be re-used to gather the same metrics as before.

We encourage you to re-use previous functions within other functions where possible.

Create a function which returns the worldwide number of reported cases, i.e. the sum of `Total Cases`:

In [3]:
def case_count(data):
    # Sum the 'Total Cases' column of whichever dataset is passed in
    return sum(data['Total Cases'])
   

You can test your function using the following cell:

In [56]:
case_count(latest)

75307

Create a function which returns the number of countries which have reported cases, i.e. the number of countries listed in the table (`Diamond Princess` can be treated as a country for all functions):

In [57]:
def country_count(data):
    # Each entry in the 'Country' column is one row, so its length is the country count
    return len(data['Country'])
    

You can test your function using the following cell:

In [58]:
country_count(latest)

30

Create a function which returns the average number of cases over all listed countries:

In [59]:
def average_cases(data):
    # Re-use the two functions above: worldwide total divided by number of countries
    return case_count(data) / country_count(data)

    

You can test your function using the following cell:

In [60]:
average_cases(latest)

2510.233333333333

Create a function which returns the number of countries where `Total Cases` equals `1`:

In [7]:
def single_case_country_count(data):
    # Create an empty list to collect matching values
    one_case = []
    # Loop over the 'Total Cases' column looking for values equal to 1
    for cases in data['Total Cases']:
        if cases == 1:
            # Append each match to the 'one_case' list
            one_case.append(cases)
    # Return the length of 'one_case', i.e. the number of single-case countries
    return len(one_case)

You can test your function using the following cell:

In [5]:
single_case_country_count(latest)

7

Create a function which returns a list of countries where the number of cases is equal to one:

Hint: you can use the `zip()` function in Python to iterate over two lists at the same time.
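As a quick illustration of the hint, the sketch below pairs up two short lists with `zip()`. The lists `names` and `counts` are invented for the example and are not taken from the dataset.

```python
# Toy lists; the names and numbers are invented for illustration.
names = ['A', 'B', 'C']
counts = [10, 3, 1]

# zip() yields one tuple per position, pairing the i-th items of each list.
pairs = list(zip(names, counts))
print(pairs)  # [('A', 10), ('B', 3), ('C', 1)]

# Unpacking each tuple in the loop header gives both values a readable name.
for name, count in zip(names, counts):
    print(name, count)
```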

In [15]:
def single_case_countries(data):
    # Create an empty list to collect country names
    countries = []
    # zip() pairs each country name with the case count from the same row
    for country, cases in zip(data['Country'], data['Total Cases']):
        # Keep only the rows where 'Total Cases' is exactly 1
        if cases == 1:
            # Append the country name
            countries.append(country)
    return countries
   
    
    

You can test your function using the following cell:

In [16]:
single_case_countries(latest)

['Nepal', 'Belgium', 'Sri Lanka', 'Finland', 'Egypt', 'Cambodia', 'Sweden']

Create a function which returns a list of countries in which there are still active cases, i.e. where `Total Cases` minus `Total Deaths` exceeds `Recovered`. You may find the `enumerate()` Python function helpful.    
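For reference, `enumerate()` yields each element together with its position, and that position can be used to look up the matching entry in another list. The sketch below uses two invented lists, `cases` and `deaths`, rather than the real dataset.

```python
# Toy lists; the numbers are invented for illustration.
cases = [10, 3, 1]
deaths = [1, 0, 0]

remaining = []
# enumerate() gives (index, value) pairs, so `index` can address other columns.
for index, total in enumerate(cases):
    remaining.append(total - deaths[index])

print(remaining)  # [9, 3, 1]
```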

In [21]:
def active_countries(data):
    # zip() groups the four values belonging to each row of the CSV
    rows = zip(data['Country'], data['Total Cases'], data['Total Deaths'], data['Recovered'])
    active_list = []
    for country, cases, deaths, recovered in rows:
        # Active: Total Cases minus Total Deaths exceeds Recovered
        if cases - deaths > recovered:
            active_list.append(country)
    return active_list
    

You can test your function using the following cell:

In [22]:
active_countries(latest)

['China',
 'Diamond Princess',
 'Singapore',
 'Japan',
 'Hong Kong',
 'S. Korea',
 'Thailand',
 'USA',
 'Taiwan',
 'Malaysia',
 'Vietnam',
 'Germany',
 'Australia',
 'France',
 'Macao',
 'U.K.',
 'U.A.E.',
 'Canada',
 'Italy',
 'Egypt',
 'Sweden']

Create a function which returns a list of countries where there are no longer any active cases:

In [25]:
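def cleared_countries(data):
    # zip() groups the four values belonging to each row of the CSV
    rows = zip(data['Country'], data['Total Cases'], data['Total Deaths'], data['Recovered'])
    no_case = []
    for country, cases, deaths, recovered in rows:
        # Cleared: every reported case has ended in either death or recovery
        if cases == deaths + recovered:
            no_case.append(country)
    return no_case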

You can test your function using the following cell:

In [26]:
cleared_countries(latest)

['Philippines',
 'India',
 'Russia',
 'Spain',
 'Nepal',
 'Belgium',
 'Sri Lanka',
 'Finland',
 'Cambodia']
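In the spirit of re-using previous functions, the cleared list could also be derived from the active list. The sketch below is one possible approach, not the required solution: the name `cleared_countries_via_reuse` is hypothetical, the `sample` values are invented, and it assumes every country is either active or cleared (that is, `Recovered` never exceeds `Total Cases` minus `Total Deaths`).

```python
# Toy data in the same shape as `latest`; all values are invented.
sample = {
    'Country': ['A', 'B', 'C'],
    'Total Cases': [10, 3, 1],
    'Total Deaths': [1, 0, 0],
    'Recovered': [4, 3, 0],
}


def active_countries(data):
    """Countries where Total Cases minus Total Deaths exceeds Recovered."""
    rows = zip(data['Country'], data['Total Cases'],
               data['Total Deaths'], data['Recovered'])
    return [country for country, cases, deaths, recovered in rows
            if cases - deaths > recovered]


def cleared_countries_via_reuse(data):
    """Hypothetical variant: every country that is not in the active list.

    Assumes no row has Recovered greater than Total Cases minus Total Deaths.
    """
    active = set(active_countries(data))
    # Iterate over the original column so the output keeps the dataset's order.
    return [country for country in data['Country'] if country not in active]


print(active_countries(sample))             # ['A', 'C']
print(cleared_countries_via_reuse(sample))  # ['B']
```

Filtering the `Country` column, rather than subtracting sets, keeps the result in the same order as the dataset.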