# Transforming and Combining Data

In the previous module you worked on a dataset that combined two different World Health
Organization datasets: population and the number of deaths due to tuberculosis.
They could be combined because they share a common attribute: the countries. This
week you will learn the techniques behind the creation of such a combined dataset.

In [1]:
import warnings
warnings.simplefilter('ignore', FutureWarning)

import pandas as pd

## What if...?

The third conversion, from abbreviated country names to full names, can’t be written as a
simple formula, because each abbreviation is expanded differently.
What I need is the Python code equivalent of:

- if the name is ‘UK’, return ‘United Kingdom’,
- otherwise if the name is ‘USA’, return ‘United States’,
- otherwise return the name.

The last part basically says that if the name is none of the known abbreviations, return it
unchanged. Translating the English sentence to Python is straightforward.

The next function uses the full form of the conditional statement to expand the abbreviated country names UK and USA and leave other names unchanged.

In [2]:
def expandCountry(name):
    if name == 'UK':
        return 'United Kingdom'
    elif name == 'USA':
        return 'United States'
    else:
        return name

expandCountry('India') == 'India'

True

Note that ‘otherwise if’ is written `elif` in Python, not `else if`. As you might expect,
`if`, `elif` and `else` are reserved words.
The computer will evaluate one condition at a time, from top to bottom, and execute only
the instructions of the first condition that is true. Note that there is no condition after
`else`: it is a ‘catch all’ in case all previous conditions fail.
Note again the colons at the end of lines and that code after the colon must be indented.
That is how Python distinguishes which lines of code belong to which condition.
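
The ‘first true condition wins’ rule is easiest to see when more than one condition could be true for the same input. Here is a minimal sketch; the function `describeSize` and its thresholds are made up for illustration and are not part of the course material.

```python
def describeSize(population):
    # Hypothetical thresholds, chosen only to make the conditions overlap.
    if population > 1000000000:
        return 'very large'
    elif population > 100000000:
        # Also true for 1.3 billion, but never reached for that input,
        # because the first condition already matched.
        return 'large'
    else:
        # Catch-all: runs only when every condition above is false.
        return 'other'

describeSize(1300000000) == 'very large'
```

Swapping the first two conditions would make every population above 100 million come out as 'large', so the order in which conditions are written matters.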

There are almost always many ways to write the same function. A conditional statement
does not need to have an `elif` or `else` part. In that case, if the condition is false,
nothing happens.

Here is the same function, written differently, using the simplest form of the conditional statement, without the `elif` and `else` parts.

In [3]:
def expandCountry(name):
    if name == 'UK':
        name = 'United Kingdom'
    if name == 'USA':
        name = 'United States'
    return name

### Tasks

- Write more tests.
- Explain why the second version of the function works. Note how the code is indented.
- Extend both versions to expand 'St. Lucia' to 'Saint Lucia'.
- Write a function to translate some country names from their original language to English, e.g. 'Brasil' to 'Brazil', 'España' to 'Spain' and 'Deutschland' to 'Germany'.
- Can you think of a different way of expanding abbreviated country names? You're not expected to write any code. Hint: this is a course about data tables.

'''
The computer evaluates each condition in turn, from top to bottom,
and changes the value assigned to the name variable whenever a condition is true.
If no condition is met, it returns the value originally passed in as the name parameter.
'''
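
The second version also relies on a detail that is easy to miss: none of the expanded names is itself an abbreviation tested further down. The sketch below uses two made-up functions, `renameWithIf` and `renameWithElif`, with placeholder names 'A', 'B' and 'C' (all assumptions for illustration) to show how the two forms can differ when that is not the case.

```python
def renameWithIf(name):
    # Separate if statements: every condition is checked in turn,
    # so a value changed by the first one can match the second one too.
    if name == 'A':
        name = 'B'
    if name == 'B':
        name = 'C'
    return name

def renameWithElif(name):
    # if/elif/else: only the first true condition is acted upon.
    if name == 'A':
        return 'B'
    elif name == 'B':
        return 'C'
    else:
        return name

renameWithIf('A') == 'C'      # 'A' became 'B', which then became 'C'
renameWithElif('A') == 'B'    # stops after the first match
```

For the country names used here the two forms give the same results, because 'United Kingdom' is never equal to 'USA' or 'St. Lucia'.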

In [5]:
def expandCountry(name):
    if name == 'UK':
        return 'United Kingdom'
    elif name == 'USA':
        return 'United States'
    elif name == 'St. Lucia':
        return 'Saint Lucia'
    else:
        return name

expandCountry('Nigeria') == 'Nigeria'

True

In [7]:
def expandCountry(name):
    if name == 'UK':
        name = 'United Kingdom'
    if name == 'USA':
        name = 'United States'
    if name == 'St. Lucia':
        name = 'Saint Lucia'
    return name

expandCountry('USA') == 'United States'

True

In [2]:
def changeCountryName(name):
    if name == 'Brasil':
        return 'Brazil'
    elif name == 'España':
        return 'Spain'
    elif name == 'Deutschland':
        return 'Germany'
    else:
        return name

changeCountryName('Nigeria') == 'Nigeria'

True
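
For the last task, one possible reading of the hint is to keep the abbreviations and full names in a small lookup table instead of a chain of conditions. The sketch below uses a plain Python dictionary as that table; the names `expansions` and `expandCountryFromTable` are assumptions for illustration, and the course may have a different approach in mind.

```python
# Lookup table from abbreviation to full name,
# using only the examples that appear in this notebook.
expansions = {
    'UK': 'United Kingdom',
    'USA': 'United States',
    'St. Lucia': 'Saint Lucia',
}

def expandCountryFromTable(name):
    # dict.get returns its second argument when the key is missing,
    # so names that are not in the table come back unchanged.
    return expansions.get(name, name)

expandCountryFromTable('USA') == 'United States'
```

With this form, supporting a new abbreviation means adding a row to the table rather than another condition to the function.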