# Exercises: Functions

In this notebook, you will create numerous functions, all of which have a single parameter `data`.

Running the code cell below will assign to `latest` an example of an argument suitable for passing to each function as the `data` parameter. You'll see after each incomplete function a code cell which will call that function using `latest`, so that you can check your function is working as expected. **Don't change the function names**.

The dataset, which relates to coronavirus cases across the world, was taken from [worldometers.info](https://www.worldometers.info/coronavirus/#countries) on February 19th 2020.

We have used the `pandas` package for convenience to import and process the dataset from the `corona.csv` file, which you can examine via Jupyter or a spreadsheet application if you want to.

There's no need to understand the `pandas` code cell yet, although feel free to read it and have a think about what it's likely to be doing; you'll learn more about that later. **Complete the subsequent exercises using Python only**.

In [1]:
import pandas as pd

df = pd.read_csv('data/corona.csv').fillna(0).astype(dtype=int, errors='ignore').sort_values(by='Total Cases', ascending=False)

latest = df.to_dict('list')

In [2]:
latest

{'Country': ['China',
  'Diamond Princess',
  'Singapore',
  'Japan',
  'Hong Kong',
  'S. Korea',
  'Thailand',
  'USA',
  'Taiwan',
  'Malaysia',
  'Vietnam',
  'Germany',
  'Australia',
  'France',
  'Macao',
  'U.K.',
  'U.A.E.',
  'Canada',
  'Philippines',
  'Italy',
  'India',
  'Russia',
  'Spain',
  'Nepal',
  'Belgium',
  'Sri Lanka',
  'Finland',
  'Egypt',
  'Cambodia',
  'Sweden'],
 'Total Cases': [74187,
  621,
  81,
  80,
  63,
  51,
  35,
  29,
  23,
  22,
  16,
  16,
  15,
  12,
  10,
  9,
  9,
  8,
  3,
  3,
  3,
  2,
  2,
  1,
  1,
  1,
  1,
  1,
  1,
  1],
 'New Cases': [1751,
  79,
  0,
  6,
  1,
  20,
  0,
  0,
  1,
  0,
  0,
  0,
  0,
  0,
  0,
  0,
  0,
  0,
  0,
  0,
  0,
  0,
  0,
  0,
  0,
  0,
  0,
  0,
  0,
  0],
 'Total Deaths': [2006,
  0,
  0,
  1,
  2,
  0,
  0,
  0,
  1,
  0,
  0,
  0,
  0,
  1,
  0,
  0,
  0,
  0,
  1,
  0,
  0,
  0,
  0,
  0,
  0,
  0,
  0,
  0,
  0,
  0],
 'New Deaths': [138,
  0,
  0,
  0,
  1,
  0,
  0,
  0,
  0,
  0,
  0,
  0,
  

`latest` is a dictionary in which each key is a column heading in the CSV, and each value is a list containing the values in that column, one per row of the CSV.

Run the following code cells to further explore and understand the structure of the `latest` dictionary.

In [3]:
latest.keys()

dict_keys(['Country', 'Total Cases', 'New Cases', 'Total Deaths', 'New Deaths', 'Recovered', 'Serious'])

We can therefore access the column and cell values as follows:

In [4]:
print(latest['Country'])

['China', 'Diamond Princess', 'Singapore', 'Japan', 'Hong Kong', 'S. Korea', 'Thailand', 'USA', 'Taiwan', 'Malaysia', 'Vietnam', 'Germany', 'Australia', 'France', 'Macao', 'U.K.', 'U.A.E.', 'Canada', 'Philippines', 'Italy', 'India', 'Russia', 'Spain', 'Nepal', 'Belgium', 'Sri Lanka', 'Finland', 'Egypt', 'Cambodia', 'Sweden']


... and elements at a given position in all of the lists are from the same row of the CSV:

In [5]:
print(latest['Country'][0])
print(latest['Total Cases'][0])
print(latest['Total Deaths'][0])
print(latest['Recovered'][0])

China
74187
2006
14796


In [6]:
print(latest['Country'][3])
print(latest['Total Cases'][3])
print(latest['Total Deaths'][3])
print(latest['Recovered'][3])

Japan
80
1
20
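
The positional alignment shown above can be illustrated on a much smaller dictionary. This is only a sketch: `sample` and `row_at()` are hypothetical names introduced for illustration, and the values are invented rather than taken from `corona.csv`.

```python
# Hypothetical miniature of the dictionary-of-lists layout (invented values).
sample = {
    'Country': ['A', 'B', 'C'],
    'Total Cases': [10, 5, 1],
    'Total Deaths': [1, 0, 0],
}


def row_at(data, index):
    """Collect the value at position `index` from every column into one dictionary."""
    return {column: values[index] for column, values in data.items()}


# Position 1 in every list belongs to the same row.
print(row_at(sample, 1))
# {'Country': 'B', 'Total Cases': 5, 'Total Deaths': 0}
```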


**When writing your functions, you can assume that the dataset will be ordered by `Total Cases`**, with the data for the countries with the highest number of cases coming first in each list. The number of rows in the CSV file may change, but the resulting lists (i.e. columns) will always be the same length as one another.

Your goal is to make a set of functions that can be re-used on any CSV file which is in the same format as `corona.csv` and as described above; thus if `corona.csv` were updated, all of your functions could be re-used to gather the same metrics as before.

We encourage you to re-use previous functions within other functions where possible.

**Q1.** Create a function which returns the worldwide number of reported cases, i.e. the sum of `Total Cases` from the `latest` dictionary:

- Call the function `case_count()`; it takes one parameter called `data`
- The `data` parameter represents a dictionary similar to `latest`
- The expression `data['Total Cases']` can be used to access the 'Total Cases' values
- You may find the `sum()` function useful for summing the values in `data['Total Cases']`




In [7]:

def case_count(data):
    total = sum(data['Total Cases'])
    return total



You can test your function using the following cell:

In [8]:
case_count(latest)

75307

**Q2.** Create a function which returns the number of countries which have reported cases, i.e. the number of countries listed in `Country` from the `latest` dictionary:

- Call the function `country_count()`; it takes one parameter called `data`
- The `data` parameter represents a dictionary similar to `latest`
- The expression `data['Country']` can be used to access the 'Country' values
- You may find the `len()` function useful for counting the countries in `data['Country']`




In [9]:

def country_count(data):
    countries_count = len(data['Country'])
    return countries_count


You can test your function using the following cell:

In [10]:
country_count(latest)

30

**Q3.** Create a function which returns the average number of cases over all listed countries:

- Call the function `average_cases()` which takes one parameter called `data`
- The `data` parameter represents a dictionary similar to `latest`
- Use the `case_count()` and `country_count()` functions as part of your working to calculate the average number of cases, i.e.
```python
case_count(data)/country_count(data)
```



In [11]:

def average_cases(data):
    average = case_count(data)/country_count(data)
    return average


You can test your function using the following cell:

In [12]:
average_cases(latest)

2510.233333333333

**Q4.** Create a function which returns the number of countries where `Total Cases` equals `1`:

- Call the function `single_case_country_count()` which takes one parameter called `data`
- The `data` parameter represents a dictionary similar to `latest`
- The expression `data['Total Cases']` can be used to access the 'Total Cases' values
- Consider using a `for` loop to iterate through the values in `Total Cases`
- Use an `if` condition within the `for` loop to check whether a `Total Cases` value equals one: `== 1`
- Make sure to create a counter to track the number of countries matching the above criterion: `count = 0`



In [13]:

def single_case_country_count(data):
    count = 0
    
    for cases_count in data['Total Cases']:
        if cases_count == 1:
            count += 1
            
    return count



You can test your function using the following cell:

In [14]:
single_case_country_count(latest)

7

**Q5.** Create a function which returns a list of Country names where the number of cases is equal to one:

Hint: you can use the `zip()` function in Python to iterate over two lists at the same time.
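
As a quick illustration of the hint, here is a minimal sketch of how `zip()` pairs up elements from two lists of equal length. The lists `names` and `counts` are made-up examples, not part of the dataset.

```python
# Two invented lists whose positions line up, like columns of the dataset.
names = ['A', 'B', 'C']
counts = [10, 5, 1]

# zip() yields one tuple per position, taking an element from each list.
for name, count in zip(names, counts):
    print(name, count)

# Materialise the pairs to inspect them all at once.
pairs = list(zip(names, counts))
print(pairs)
# [('A', 10), ('B', 5), ('C', 1)]
```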



In [15]:


def single_case_countries(data):
    countries = []
    for country, cases in zip(data["Country"], data["Total Cases"]):
        if cases == 1:
            countries.append(country)
    return countries




You can test your function using the following cell:

In [16]:
single_case_countries(latest)

['Nepal', 'Belgium', 'Sri Lanka', 'Finland', 'Egypt', 'Cambodia', 'Sweden']
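
In the spirit of re-using earlier functions, the count from Q4 could also be derived from the list built in Q5. The following is one possible sketch rather than the expected answer: `single_case_country_count_v2()` is a hypothetical name, `sample` holds invented values, and `single_case_countries()` is restated as a list comprehension so the block runs on its own.

```python
def single_case_countries(data):
    """Return the names of countries whose 'Total Cases' value is exactly 1."""
    return [
        country
        for country, cases in zip(data['Country'], data['Total Cases'])
        if cases == 1
    ]


def single_case_country_count_v2(data):
    """Hypothetical alternative to Q4: count the names returned by Q5."""
    return len(single_case_countries(data))


# Invented sample data in the same dictionary-of-lists format.
sample = {'Country': ['A', 'B', 'C', 'D'], 'Total Cases': [10, 5, 1, 1]}

print(single_case_countries(sample))         # ['C', 'D']
print(single_case_country_count_v2(sample))  # 2
```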

**Q6.** Create a function which returns a list of countries in which there are still active cases, i.e. where `Total Cases` minus `Total Deaths` exceeds `Recovered`. You may find the `enumerate()` Python function helpful.
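
For readers unfamiliar with `enumerate()`, here is a minimal sketch of how it supplies an index alongside each element, which can then be used to look up the matching position in another list. The lists `names` and `recovered` are invented for illustration.

```python
# Invented lists whose positions line up, like columns of the dataset.
names = ['A', 'B', 'C']
recovered = [3, 5, 1]

# enumerate() yields (index, element) pairs, with the index starting at 0.
for i, name in enumerate(names):
    # The index i selects the value in the same row of another column.
    print(i, name, recovered[i])

indexed = list(enumerate(names))
print(indexed)
# [(0, 'A'), (1, 'B'), (2, 'C')]
```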




In [17]:


def active_countries(data):
    countries = []
    for i, country in enumerate(data['Country']):
        if data['Total Cases'][i] - data['Total Deaths'][i] > data['Recovered'][i]:
            countries.append(country)
    return countries




You can test your function using the following cell:

In [18]:
active_countries(latest)

['China',
 'Diamond Princess',
 'Singapore',
 'Japan',
 'Hong Kong',
 'S. Korea',
 'Thailand',
 'USA',
 'Taiwan',
 'Malaysia',
 'Vietnam',
 'Germany',
 'Australia',
 'France',
 'Macao',
 'U.K.',
 'U.A.E.',
 'Canada',
 'Italy',
 'Egypt',
 'Sweden']

**Q7.** Create a function which returns a list of countries where there are no longer any active cases: i.e. where `Total Cases` minus `Total Deaths` equals `Recovered`. You may find the `enumerate()` Python function helpful.

Look at the question above for inspiration, and follow similar logic when creating your solution.

In [19]:

def cleared_countries(data):
    countries = []
    for i, country in enumerate(data['Country']):
        if data['Total Cases'][i] - data['Total Deaths'][i] == data['Recovered'][i]:
            countries.append(country)
    return countries



You can test your function using the following cell:

In [20]:
cleared_countries(latest)

['Philippines',
 'India',
 'Russia',
 'Spain',
 'Nepal',
 'Belgium',
 'Sri Lanka',
 'Finland',
 'Cambodia']

This project analysed the coronavirus dataset and practised writing Python functions by answering a set of questions about the data. Overall, I was able to answer all of the questions correctly, and working through the project made me more comfortable writing Python functions. Creating these functions also taught me something about the dataset itself, through the answers the functions produced. I will continue to build on my knowledge of Python functions in future projects.