# Defining functions and working with time in Python: Exercises

**Author**: Andrea Ballatore (Birkbeck, University of London)

**Abstract**: Learn how to use and define functions and how to manipulate temporal information in Python.

## Setup
Run this cell to check that your environment is set up correctly. It should print `env ok`; any warnings can be ignored.

In [1]:
# Test geospatial libraries
# check environment
import os
print("Conda env:", os.environ['CONDA_DEFAULT_ENV'])
assert os.environ['CONDA_DEFAULT_ENV'] == 'geoprogv1'
# spatial libraries 
import fiona as fi
import geopandas as gpd
import pandas as pd
import pysal as sal

# create output folders
folders = ['tmp']
for f in folders:
    if not os.path.exists(f):
        os.makedirs(f)

print('env ok')

Conda env: geoprogv1
env ok


-----
## Exercises

When you are in doubt about how a package or a function works, use the Python website (https://docs.python.org/3.9/) and **Google** to find relevant documentation.

a. Write and test a function that takes the coordinates of two points and returns the Euclidean distance. Test the function with at least 4 pairs of points.

In [2]:
def euclidean_distance(pt_a, pt_b):
    """ Calculate Euclidean distance """
    pt_a_x = pt_a[0]
    pt_a_y = pt_a[1]
    pt_b_x = pt_b[0]
    pt_b_y = pt_b[1]
    
    # TODO insert code here
    dist = None
    return dist

# test function
print(euclidean_distance([1,2],[2,3]))
print(euclidean_distance([0,0],[0,0]))

None
None
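
One way to test the function is to compare it against a trusted reference. The sketch below assumes Python 3.8 or later, where the standard library provides `math.dist`. The extra test pairs and the helper `matches_reference` are illustrative names that are not part of the exercise.

```python
import math

# Test pairs (two from the cell above, two added for illustration) with
# reference distances computed by the standard library's math.dist.
test_pairs = [([1, 2], [2, 3]), ([0, 0], [0, 0]), ([0, 0], [3, 4]), ([-1, -1], [2, 3])]
reference = [math.dist(pt_a, pt_b) for pt_a, pt_b in test_pairs]
print(reference)

def matches_reference(dist_func):
    """Hypothetical helper: True if dist_func agrees with math.dist on all pairs."""
    return all(
        math.isclose(dist_func(pt_a, pt_b), ref)
        for (pt_a, pt_b), ref in zip(test_pairs, reference)
    )
```

Once `euclidean_distance` is complete, `matches_reference(euclidean_distance)` should return `True`. With the unfinished stub it raises a `TypeError`, because the stub returns `None`.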


b. Given a list of values representing annual GDP growth, write a function that classifies each value into one of the following categories:

- `bust`: < -4
- `negative`: [-4, -.05)
- `zero`: [-.05, .05]
- `positive`: (.05, 4]
- `boom`: > 4

In [1]:
def classify_growth(growth_rate):
    # insert your code here
    return 'TODO'

growth_rates = [-4.103, 10.53, -.4, 0.1, .5, 4.56, -2.45]
# test on this list and save results in growth_rates_classified
growth_rates_classified = None
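
When categories are defined by thresholds, the order of the checks matters: a value below the most extreme threshold is also below the milder one, so the extreme cases need to be tested first. The sketch below shows the pattern with made-up thresholds and a hypothetical function `classify_change`; it is not the solution to the exercise, only an illustration of `if`/`elif` ordering and of applying a function to a list with a list comprehension.

```python
def classify_change(value):
    """Hypothetical classifier with made-up thresholds (not the exercise's)."""
    # Test the most extreme ranges first, otherwise the milder
    # conditions would catch these values before we get here.
    if value < -10:
        return 'large drop'
    elif value < -1:
        return 'drop'
    elif value > 10:
        return 'large rise'
    elif value > 1:
        return 'rise'
    else:
        return 'stable'

changes = [-12.0, -3.5, 0.2, 4.0, 25.0]
# Apply the function to every element of the list
labels = [classify_change(v) for v in changes]
print(labels)
```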

c. Write a utility function that, given a string, trims leading and trailing whitespace, replaces all remaining spaces with `_`, and converts it to lower case.

In [4]:
def clean_string(s):
    print(s)
    clean_s = None
    # TODO: insert your code here
    return clean_s

# test function
clean_strings = []
for input_s in [' City of London ', 'Hammersmith ', '  Lambeth', 'ENFIELD ', 'KensinGton and ChelsEa  ']:
    clean_strings.append(clean_string(input_s))

print(clean_strings)

 City of London 
Hammersmith 
  Lambeth
ENFIELD 
KensinGton and ChelsEa  
[None, None, None, None, None]
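
The built-in string methods `strip`, `replace`, and `lower` are likely to be useful here. The short sketch below shows what each one does on its own, using an example string that is not in the test list. Strings are immutable, so each method returns a new string and leaves the original unchanged.

```python
s = '  Tower Hamlets '

stripped = s.strip()            # removes leading and trailing whitespace
replaced = s.replace(' ', '_')  # replaces every space, including the outer ones
lowered = s.lower()             # converts all letters to lower case

# repr() makes the surrounding spaces visible
print(repr(stripped))
print(repr(replaced))
print(repr(lowered))
print(repr(s))                  # the original string is unchanged
```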


d. Given a data frame with data about cities, write a function to calculate the population growth rate between 2000 and 2020 as a percentage (e.g., 10% or -4.2%). Round values to two decimal places.

In [5]:
cities_df = pd.DataFrame({"city_id" : [1,2,3,4],
    "city_name" : ["London", "Lagos", "Hong Kong", "Lima"],
    "population_2000" : [7195000,  7281000, 6665000,  7294000],
    "population_2020" : [8982000, 14368000, 7451000, 10719000],
    "area_km2" : [1572, 1171, 1106, 2672]})
cities_df

| | city_id | city_name | population_2000 | population_2020 | area_km2 |
|---|---|---|---|---|---|
| 0 | 1 | London | 7195000 | 8982000 | 1572 |
| 1 | 2 | Lagos | 7281000 | 14368000 | 1171 |
| 2 | 3 | Hong Kong | 6665000 | 7451000 | 1106 |
| 3 | 4 | Lima | 7294000 | 10719000 | 2672 |


In [6]:
# insert your code here
def calculate_growth(row):
    # TODO: complete code
    growth_rate = None
    return growth_rate

# call calculate_growth on cities_df with `apply`
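
As a hint on how `apply` works row by row, the sketch below computes a different quantity (population density) on a small stand-in data frame. The function `calculate_density` and the column `density_2020` are hypothetical names chosen for this illustration.

```python
import pandas as pd

# Small stand-in for cities_df (two rows copied from the data frame above)
df = pd.DataFrame({
    "city_name": ["London", "Lagos"],
    "population_2020": [8982000, 14368000],
    "area_km2": [1572, 1171],
})

def calculate_density(row):
    """Hypothetical helper: inhabitants per square kilometre, rounded to 2 decimals."""
    return round(row["population_2020"] / row["area_km2"], 2)

# axis=1 passes each row (as a Series) to the function, one at a time
df["density_2020"] = df.apply(calculate_density, axis=1)
print(df)
```

The same call shape, `cities_df.apply(calculate_growth, axis=1)`, should work for the exercise once `calculate_growth` returns a value for each row.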

e. A function `convert_area` has to be able to convert areas between m$^2$, km$^2$, and mi$^2$. It should handle all combinations. If `in_unit == out_unit`, just return the same value.

In [7]:
def convert_area(area, in_unit, out_unit):
    # valid units: 'sqm','sqkm','sqmi'
    # TODO
    conv_area = None
    return conv_area
    
print(convert_area(10,'sqm','sqkm'))

None
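
A common way to handle all combinations without writing one branch per pair of units is to convert to a base unit first and then to the target unit. The sketch below shows that idea for lengths rather than areas; `METRES_PER_UNIT` and `convert_length` are hypothetical names, and the unit codes are assumptions made for this example.

```python
# Metres per unit; one international mile is exactly 1609.344 m
METRES_PER_UNIT = {'m': 1.0, 'km': 1000.0, 'mi': 1609.344}

def convert_length(length, in_unit, out_unit):
    """Hypothetical helper: convert a length by going through metres."""
    if in_unit == out_unit:
        return length
    metres = length * METRES_PER_UNIT[in_unit]   # input unit -> base unit
    return metres / METRES_PER_UNIT[out_unit]    # base unit -> output unit

print(convert_length(5, 'km', 'm'))
print(convert_length(1, 'mi', 'km'))
```

For areas, the factors would be the squares of the length factors (for example, 1 km$^2$ = 1,000,000 m$^2$).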


f. Modify and test `convert_area` so that it raises an error if it is supplied a negative value (areas cannot be smaller than 0). Then create a list with 6 areas in km$^2$. Use `convert_area` to convert them to m$^2$ and mi$^2$.

In [8]:
areas_km = []
# TODO: insert your code here
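
In Python, invalid input is usually signalled by raising an exception such as `ValueError`, which the caller can catch with `try`/`except`. The sketch below shows the mechanism with a hypothetical helper `require_non_negative`; how the check is built into `convert_area` is left to the exercise.

```python
def require_non_negative(value):
    """Hypothetical helper: return value unchanged, or raise if it is negative."""
    if value < 0:
        raise ValueError(f"expected a non-negative value, got {value}")
    return value

# A valid value passes through
print(require_non_negative(2.5))

# An invalid value raises an exception that the caller can handle
try:
    require_non_negative(-3)
except ValueError as err:
    message = str(err)
    print('Error:', message)
```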

g. Write a function that checks the validity of a lon/lat pair by checking that each value is within its valid range (e.g., -180,180). The function should return `False` if the pair is invalid, and `True` otherwise. The function should rely on two sub-functions `is_lat_valid` and `is_lon_valid`.

In [9]:
def is_lat_valid(lat):
    # TODO: complete the function
    return

def is_lon_valid(lon):
    # TODO: complete the function
    return

def is_latlon_valid(xlon, ylat):
    # TODO: complete the function
    return

    
# TODO: test the function here
print(is_latlon_valid(42.3, 5.53))
print(is_lat_valid(-493))
print(is_lon_valid(2.54))

None
None
None
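
Python allows chained comparisons such as `low <= value <= high`, which makes range checks compact, and `and` combines two checks into one result. The sketch below illustrates both on an unrelated example (percentages); `in_range` and `is_percentage_pair_valid` are hypothetical names.

```python
def in_range(value, low, high):
    """Hypothetical helper: True if low <= value <= high (bounds included)."""
    return low <= value <= high

def is_percentage_pair_valid(a, b):
    """Hypothetical example: both values must be valid percentages."""
    # The pair is valid only if both sub-checks succeed
    return in_range(a, 0, 100) and in_range(b, 0, 100)

print(is_percentage_pair_valid(12.5, 100))
print(is_percentage_pair_valid(12.5, 100.1))
```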


h. Given a list of timestamps (for example, representing GPS fixes), write a function that sorts them and then calculates the interval (_timedelta_) between consecutive timestamps. Observe the structure of the result of `datetime.now()` to understand the `datetime` type. Enter some dates of notable events as specified below:

In [2]:
from datetime import datetime
import time

# build some example timestamps with sleep between them
example_timestamps = []
# format: datetime(year, month, day, hour, minute, second, microsecond)

# Fall of the Berlin wall
example_timestamps.append(datetime(1989, 11, 9, 0,0,0))

# TODO: enter date of beginning of the Iraq War

# TODO: enter date of 9/11 attacks

# TODO: enter date of beginning of the Syrian Civil War

# add now
example_timestamps.append(datetime.now())
time.sleep(2)
example_timestamps.append(datetime.now())

example_timestamps

[datetime.datetime(1989, 11, 9, 0, 0),
 datetime.datetime(2021, 1, 18, 15, 6, 41, 229884),
 datetime.datetime(2021, 1, 18, 15, 6, 43, 231830)]

In [51]:
def time_intervals(timestamps):
    intervals = []
    # TODO: use a for loop to look 
    # at pairs of timestamps
    return intervals

print(time_intervals(example_timestamps))

[]
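
Subtracting one `datetime` from another yields a `timedelta`, and `datetime` objects can be ordered with `sorted`. The sketch below shows both on two arbitrary example dates that are not part of the exercise.

```python
from datetime import datetime

start = datetime(2000, 1, 1, 0, 0, 0)
end = datetime(2000, 1, 3, 12, 0, 0)

# The difference between two datetimes is a timedelta
delta = end - start
print(type(delta).__name__)
print(delta.days, delta.seconds, delta.total_seconds())

# datetimes compare chronologically, so sorted() puts them in time order
unsorted_timestamps = [end, start]
print(sorted(unsorted_timestamps) == [start, end])
```

A `timedelta` stores days and seconds separately, while `total_seconds()` gives the whole interval as a single number.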


i. Using `pytz`, write a function that, given a datetime in UTC, returns the time shifted to all common time zones (`pytz.common_timezones`). The method `astimezone()` supports the conversion between different time zones. Return the results in a pandas data frame with the following columns: `time_zone`,`time_iso`,`time_ctime`,`hours`,`minutes`. In the tests, save the results to a CSV file to inspect them more easily. Some ideas are discussed on [StackOverflow](https://stackoverflow.com/questions/25264811/pytz-converting-utc-and-timezone-to-local-time).

In [52]:
import pytz

def all_time_zones(a_utc_datetime):
    """ 
    Generates a data frame with a_utc_datetime in all common time zones.
    @returns a data frame 
    """
    # TODO: complete the code here
    times_df = pd.DataFrame()
    for time_zone in pytz.common_timezones:
        # calculate time zone and add result to times_df
        print(time_zone)
        
    return times_df

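# use a timezone-aware UTC datetime: astimezone() treats a naive
# datetime (such as the result of datetime.utcnow()) as local time
now_world_df = all_time_zones(datetime.now(pytz.utc))
now_world_df.to_csv('tmp/all_time_zones.csv', index=False)
print(now_world_df)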

Africa/Abidjan
Africa/Accra
Africa/Addis_Ababa
Africa/Algiers
Africa/Asmara
Africa/Bamako
Africa/Bangui
Africa/Banjul
Africa/Bissau
Africa/Blantyre
Africa/Brazzaville
Africa/Bujumbura
Africa/Cairo
Africa/Casablanca
Africa/Ceuta
Africa/Conakry
Africa/Dakar
Africa/Dar_es_Salaam
Africa/Djibouti
Africa/Douala
Africa/El_Aaiun
Africa/Freetown
Africa/Gaborone
Africa/Harare
Africa/Johannesburg
Africa/Juba
Africa/Kampala
Africa/Khartoum
Africa/Kigali
Africa/Kinshasa
Africa/Lagos
Africa/Libreville
Africa/Lome
Africa/Luanda
Africa/Lubumbashi
Africa/Lusaka
Africa/Malabo
Africa/Maputo
Africa/Maseru
Africa/Mbabane
Africa/Mogadishu
Africa/Monrovia
Africa/Nairobi
Africa/Ndjamena
Africa/Niamey
Africa/Nouakchott
Africa/Ouagadougou
Africa/Porto-Novo
Africa/Sao_Tome
Africa/Tripoli
Africa/Tunis
Africa/Windhoek
America/Adak
America/Anchorage
America/Anguilla
America/Antigua
America/Araguaina
America/Argentina/Buenos_Aires
America/Argentina/Catamarca
America/Argentina/Cordoba
America/Argentina/Jujuy
America/
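
To see what `astimezone()` does for a single zone before looping over all of them, the sketch below converts one fixed UTC datetime to `Asia/Tokyo`. The date and the zone are arbitrary choices made for this illustration, and the variable names are hypothetical.

```python
from datetime import datetime
import pytz

# Attach the UTC time zone to a naive datetime
utc_dt = pytz.utc.localize(datetime(2021, 1, 18, 15, 0, 0))

# Convert to another time zone; the instant is the same, the wall-clock time differs
tokyo_dt = utc_dt.astimezone(pytz.timezone('Asia/Tokyo'))

print(tokyo_dt.isoformat())   # ISO 8601 string including the UTC offset
print(tokyo_dt.ctime())       # human-readable string without the offset
print(tokyo_dt.hour, tokyo_dt.minute)
```

The values printed here correspond to the `time_iso`, `time_ctime`, `hours`, and `minutes` columns requested in the exercise, assuming those columns are meant to hold the local time in each zone.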

----
End of notebook.
