# Table of Contents
* [First Example: Population Growth](#First-Example:-Population-Growth)


# First Example: Population Growth

This example shows how a small amount of Python code can carry out a useful data analysis.

Using data from the [UN Department of Economic and Social Affairs - Population Division](http://esa.un.org/unpd/wpp/Download/Standard/Population/), we will take yearly population data for *more* and *less* developed regions and report the population growth over successive five-year intervals between 1950 and 2015.

The average annual percent population growth over an interval is computed with the following formula:
$$
\left( \sqrt[y_{f} - y_{i}]{\frac{p_{f}}{p_{i}}} -1 \right) \times 100
$$

$p_i$ and $p_f$ are the populations at the initial year $y_i$ and final year $y_f$ of the chosen interval. In this example the interval is five years.

The data set is stored in a CSV file in the `data` directory. The first few lines of the file are shown here. The populations of *more_developed* and *less_developed* regions are shown in thousands.

```
year,more_developed,less_developed
1950,812989,1712161
1951,822320,1749547
1952,832149,1785792
1953,842294,1821735
1954,852613,1858064
1955,863004,1895310
```
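
As a quick sanity check, the growth formula can be applied by hand to the 1950 and 1955 rows of the excerpt above. This is only a minimal sketch: the function name `annual_growth_percent` and the variable names are chosen here for illustration, and it assumes Python 3, where `/` is true division.

```python
# Populations (in thousands) copied from the 1950 and 1955 rows of the excerpt
year_i, year_f = 1950, 1955
more_i, more_f = 812989, 863004
less_i, less_f = 1712161, 1895310


def annual_growth_percent(p_i, y_i, p_f, y_f):
    """Average annual percent growth between years y_i and y_f.

    Illustrative helper, named here only for this check.
    """
    # The (y_f - y_i)-th root is written as a fractional power
    return ((p_f / p_i) ** (1 / (y_f - y_i)) - 1) * 100


growth_more = annual_growth_percent(more_i, year_i, more_f, year_f)
growth_less = annual_growth_percent(less_i, year_i, less_f, year_f)

print("more developed: %.2f%% per year" % growth_more)
print("less developed: %.2f%% per year" % growth_less)
```

For these two rows the result is roughly 1.2% per year for the more developed regions and roughly 2.1% per year for the less developed regions.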

We need to write Python code to do the following:

1. Read the CSV file into an appropriate data structure
2. Iterate over the data to compute population growth over a given interval
3. Print the results for *more_developed* and *less_developed* columns

In [None]:
# Some functions we'll need

import csv

def get_data(data_file):
    '''Read a CSV file and convert its data rows to integers.

    The first line is expected to be a header and is kept as strings.

    Returns processed_data, a list of lists: the header row
    followed by one list of integers per data row.
    '''
    
    with open(data_file) as f:
        raw_data = csv.reader(f)
    
        # The first line is expected to be a header, not integers
        processed_data = [next(raw_data)]
    
        # All other lines need to be converted
        # to integers to allow computation
        for line in raw_data:
            processed = [int(i) for i in line]
            processed_data.append(processed)
        
    return processed_data


def compute_growth(population_i, year_i, population_f, year_f):
    '''Compute the average annual percent population growth over an interval

    population_i : initial population
    year_i       : initial year
    population_f : final population
    year_f       : final year
    '''
    # The (year_f - year_i)-th root is taken as a fractional power
    return ((population_f / population_i) ** (1 / (year_f - year_i)) - 1) * 100

In [None]:
data = get_data('data/world_population_development.csv')

interval = 5 # years

print("Percent Population Growth in more and less developed regions\n")
print("%5s %5s %5s" % ('year','more','less'))

# Remember the previous sampled row so that growth
# can be computed once a second row is available
previous = None
# data[0] is the header; starting at index 1, take every interval-th row.
# Column order in the file is: year, more_developed, less_developed
for year, more, less in data[1::interval]:
    # The first iteration only stores its values
    # since there are no previous values yet
    if previous is None:
        previous = (year, more, less)
        continue

    year_i, more_i, less_i = previous

    growth_more = compute_growth(more_i, year_i, more, year)
    growth_less = compute_growth(less_i, year_i, less, year)

    previous = (year, more, less)

    print("%5s %5.2f %5.2f" % (year, growth_more, growth_less))

The implementation above uses several Python programming features, listed below. We'll cover these topics and many more during this course.

* **Tuple packing and unpacking**: the current row is stored in `previous` as a single tuple and later unpacked into three separate names
* **List slicing**: `data[1::interval]`
* **List comprehension**: `processed = [int(i) for i in line]`
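
The three features listed above can be tried in isolation. The snippet below is a small sketch, not part of the report code: the `rows` list is typed in by hand from the CSV excerpt, and the names `rows`, `every_fifth`, and `line` are chosen here for illustration.

```python
# Illustrative rows in the same layout as the CSV file:
# year, more_developed, less_developed
rows = [
    ['year', 'more_developed', 'less_developed'],
    [1950, 812989, 1712161],
    [1951, 822320, 1749547],
    [1952, 832149, 1785792],
    [1953, 842294, 1821735],
    [1954, 852613, 1858064],
    [1955, 863004, 1895310],
]

# Tuple packing: three values are bundled into one tuple
previous = (1950, 812989, 1712161)
# Tuple unpacking: the tuple is spread over three names
year_i, more_i, less_i = previous

# List slicing: skip the header (index 0), then take every 5th row
every_fifth = rows[1::5]

# List comprehension: build a list of integers from a list of strings
line = ['1950', '812989', '1712161']
processed = [int(i) for i in line]

print(year_i, more_i, less_i)
print(every_fifth)
print(processed)
```

With the seven rows above, the slice keeps only the 1950 and 1955 rows, which is exactly the pair needed for one five-year growth figure.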