# Exercise notebook

In [None]:
import warnings
warnings.simplefilter('ignore', FutureWarning)

import pandas as pd
from datetime import datetime

In [None]:
df = pd.read_csv('WHO POP TB all.csv')

## Exercise 1: Applying methods to a dataframe column
The `iloc` attribute and the <code>head()</code> and <code>tail()</code> methods discussed earlier can be used with single columns.

In [None]:
df['TB deaths'].iloc[2] # third value of deaths column

In [None]:
df['Population (1000s)'].tail() # last five values of population column      
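For readers who want to try these calls without the WHO file, here is a minimal sketch on a small made-up dataframe. The name `toy` and all of its values are hypothetical and are not taken from the course data.

```python
import pandas as pd

# Hypothetical toy data, made up for illustration only.
toy = pd.DataFrame({
    'Country': ['A', 'B', 'C', 'D', 'E', 'F'],
    'TB deaths': [10, 20, 30, 40, 50, 60],
})

deaths = toy['TB deaths']      # a single column
third = deaths.iloc[2]         # positions start at 0, so 2 is the third value
first_two = deaths.head(2)     # head() takes an optional number of rows
last_two = deaths.tail(2)      # tail() works the same way from the end

print(third)
print(first_two.tolist())
print(last_two.tolist())
```

Called without an argument, `head()` and `tail()` return five rows.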

### Tasks

In the code cell below, write the code to get and display the 55th row in the dataframe <code>df</code>.

In the code cell below, write the code to display the first 10 rows of the dataframe <code>df</code>.

In the code cell below, select and display the first eight rows from the <code>'Country'</code> and <code>'TB deaths'</code> columns.

**Now go back to the course.**

## Exercise 2: Comparison operators
In Expressions, you learned that Python has the arithmetic operators `+`, `-`, `*` and `/`, and that
expressions such as `5 + 2` evaluate to a value (in this case the number 7).

Python has the following comparison operators:

    == (equals)
    != (not equal)
    < (less than)
    > (greater than)
    <= (less than or equal to)
    >= (greater than or equal to)

Expressions involving these operators always evaluate to a Boolean value, that
is True or False. Here are some examples:
    
- 2 == 2 evaluates to True

- 2 + 2 == 5 evaluates to False

- 2 != 1 + 1 evaluates to False

- 45 < 50 evaluates to True

- 20 > 30 evaluates to False

- 100 <= 100 evaluates to True

- 101 >= 100 evaluates to True

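The comparison operators can be used with other types of data, not just numbers. Used
with strings, they compare character by character using Unicode code points, so uppercase letters sort before lowercase ones. For names capitalised in the same way, the result matches alphabetical order. For example:

`'aardvark' < 'zebra'` evaluates to `True`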
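The examples above can be checked directly in a code cell. The following is a small sketch in plain Python. The variable names are chosen here for illustration, and the last line adds a mixed-case comparison that is not in the list above.

```python
# Each comparison is an expression that evaluates to True or False.
equal = 2 == 2                      # True
not_equal = 2 != 1 + 1              # False, because 1 + 1 is evaluated first
at_most = 100 <= 100                # True, because the values are equal

# Strings are compared character by character.
words = 'aardvark' < 'zebra'        # True
mixed_case = 'Zebra' < 'aardvark'   # True: uppercase 'Z' sorts before lowercase 'a'

print(equal, not_equal, at_most, words, mixed_case)
```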

In Calculating over columns you saw that, when applied to whole columns, the arithmetic operators did the calculations row by row. Similarly, an expression like `df['Country'] >= 'K'` will compare the country names, row by row, against the string `'K'` and record whether each result is `True` or `False` in a series.

If such an expression is put within square brackets immediately after a dataframe’s name, a new dataframe is obtained with only those rows where the result is `True`. So `df[df['Country'] >= 'K']` returns a new dataframe with all the columns of `df` but with only the rows corresponding to countries starting with K or a letter later in the alphabet.

As another example, to see the data for countries with over 80 million inhabitants, the following code will return and display a new dataframe with all the columns of `df` but with only the rows where it is `True` that the value in the `'Population (1000s)'` column is greater than `80000`.

In [None]:
df[df['Population (1000s)'] > 80000]         
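To see the two steps separately (first the Boolean series, then the filtered dataframe), here is a minimal sketch on made-up data. The dataframe `toy`, its country names and its population figures are hypothetical and do not come from the WHO file.

```python
import pandas as pd

# Hypothetical toy data; the names and numbers are invented for illustration.
toy = pd.DataFrame({
    'Country': ['Alpha', 'Bravo', 'Kilo', 'Sierra'],
    'Population (1000s)': [500, 90000, 120000, 30000],
})

# Step 1: the comparison produces a Boolean series, one value per row.
is_large = toy['Population (1000s)'] > 80000
print(is_large.tolist())           # [False, True, True, False]

# Step 2: putting the series in square brackets keeps only the True rows.
large = toy[is_large]
print(large['Country'].tolist())   # ['Bravo', 'Kilo']
```

Writing `toy[toy['Population (1000s)'] > 80000]` does both steps in a single expression.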

### Task
In the code cell below, write code to find all the rows in <code>df</code> where TB deaths exceed 10000.

## Exercise 3: Bitwise operators
To build more complicated expressions involving column comparisons, pandas uses two bitwise operators. The `&` operator means 'and': use it to select rows where two conditions are both true. The `|` operator (vertical bar, not uppercase letter 'i') means 'or': use it to select rows where at least one condition is true. Don't forget to put parentheses around _each_ comparison.

So, for example the expression:
    
`(df['Country'] >= 'Latvia') & (df['Country'] <= 'Sweden')`

will evaluate to a series containing Boolean values where the values are `True` only if the
equivalent rows in the dataframe contain the countries 'Latvia' to 'Sweden', inclusive.
However, the following expression, which uses `|` (or) rather than `&` (and):
    
`(df['Country'] >= 'Latvia') | (df['Country'] <= 'Sweden')`

will evaluate to `True` for all countries, because every country name comes alphabetically after
'Latvia' (e.g. 'UK'), before 'Sweden' (e.g. 'Brazil'), or both.

Note the round brackets around each comparison. Without them you will get an error, because
Python applies `&` and `|` before the comparison operators. The whole expression with multiple
comparisons has to be put within `df[…]` to get a dataframe with only those rows that match the condition.
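The difference between `&` and `|` can be tried out on a few country names. This is a minimal sketch: the dataframe `toy` holds only a `'Country'` column with a handful of names chosen for illustration, not the course data.

```python
import pandas as pd

# Small illustrative dataframe with a 'Country' column only.
toy = pd.DataFrame({'Country': ['Brazil', 'Latvia', 'Norway', 'Sweden', 'UK']})
country = toy['Country']

# 'and': True only for names from 'Latvia' to 'Sweden', inclusive.
both = (country >= 'Latvia') & (country <= 'Sweden')

# 'or': True when at least one of the two comparisons is True.
either = (country >= 'Latvia') | (country <= 'Sweden')

print(toy[both]['Country'].tolist())   # ['Latvia', 'Norway', 'Sweden']
print(either.all())                    # True: every row satisfies one condition or the other
```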

As a further example, using different columns, it is relatively easy to find the rows
in `df` where `'Population (1000s)'` is greater than `80000` and where `'TB deaths'` are
greater than `10000`.

In [None]:
df[(df['Population (1000s)'] > 80000) & (df['TB deaths'] > 10000)]

If the same columns will be used repeatedly in the program, the code becomes more readable if written as follows:

In [None]:
population = df['Population (1000s)']
deaths = df['TB deaths']
df[(population > 80000) & (deaths > 10000)]  

### Task
In the code cell below, find all the countries where the `'Population (1000s)'` is **less than or equal to** 50000 **or** `'TB deaths'` are **greater than or equal to** 20000.