# Introduction to Python Geo-data processing - 1 

## Outline
* Package managers
  * conda ([Anaconda](https://anaconda.org/), [miniconda](https://www.anaconda.com/docs/getting-started/miniconda/main))
  * [pip](https://packaging.python.org/en/latest/guides/installing-using-pip-and-virtual-environments/) environments
* Explore the anatomy of an `.ipynb`
  * Running Jupyter Lab remotely (`jupyter lab --no-browser --port=8888` on server and `ssh -Y -N -L localhost:8889:localhost:8888 username@hostname` on local client)
* Code and data sharing via github ([https://github.com/UP-RS-ESP](https://github.com/UP-RS-ESP))
* Coding in python
  * General Introduction                          (*Python_introduction_1_general*)
  * numpy and arrays                              (*Python_introduction_2_numpy*)
  * pandas                                        (*Python_introduction_pandas_EQ_data*)
    * data input/output
    * some simple analysis steps
    * displaying geographic data
  * Example of 2D time series and a Gaussian Hill (*Python_introduction_GaussianHill_gradient*)
  * Linear Regression walk through                (*Python_introduction_5_linear_regression*)
    

## Online resources
* https://docs.python.org/3/tutorial/index.html
* [Python Tutorial for Beginners on YouTube](https://www.youtube.com/watch?v=cBQ4c1IQJSE)
* [Python for Data Analysis](https://wesmckinney.com/book/) and [Jupyter Labs](https://github.com/wesm/pydata-book)

 


## Table of Contents
* [Jupyter Anatomy](#Jupyter-Anatomy)
* [Python Code Structure](#Python-Code-Structure)
* [Data Types and Manipulation](#Data-Types-and-Manipulation)
  * [Data Structures](#Data-Structures)
* [for Loops](#For-Loops)
* [The `range` Function](#The-range-Function)
* [Conditional Statements](#Conditional-Statements)
* [List Comprehension](#List-Comprehension)
* [Packages and Functions](#Packages-and-Functions)
  * [Lambda Functions](#Lambda-Functions)

___

> **Note:** The best way to learn python is **hands on**: You have a data analysis task and python is the tool.
The first few notebooks are mostly about code structure, elements, and syntax. We'll use some synthetic (fake) data. Further notebooks will be more of a "learn by doing" approach using real environmental datasets.

___
# Jupyter Anatomy

Jupyter notebooks are made up of **cells** that can either contain **markdown** text like this one or `code` like the next cell. Markdown is easy to write in, and you can find lots of cheat sheets with a simple [Google Search ;)](https://www.google.com/search?q=markdown+cheat+sheet&rlz=1C1CHBF_enDE766DE766&oq=markdown&aqs=chrome.0.69i59j69i57j69i59l2j0j69i60l3.2005j0j7&sourceid=chrome&ie=UTF-8). You can also use the cells here for reference on how to _make_ **accents**, headings like the ones above (use `#`, `##`, `###`, etc.), 
1. lists
2. of 
3. things,

and

* bulleted,
* or even
  * nested lists
  * of things. Neat!
  
You can also put some `code in line` or:

```
write a block of
indented code
```

In [1]:
# here is a code cell, it's just like an open ipython line
5 ** 2

25

![](img/jupyter-top-menu.png)

In the top menu you can manually do things like insert new cells, change the format of a cell to code or markdown, and **run** cells. If you get tired of clicking the toolbar to do these and other repetitive tasks you can also learn some keyboard shortcuts. 

Press `Esc` if you are currently in an active cell (green bar on the left side), and then press `h` to bring up a help menu (`h` again to close it). The most important key strokes that you may want to memorize are:

* **`Shift + Enter`** to run the current cell and move to the next below
* **`Ctrl / Cmd + Enter`** does the same but without moving to the next cell
* **`Enter`** to re-enter editing mode on a cell (the left side will turn green)
* **`Esc`** to end editing (the left side will turn blue)
* **`Up/Down Arrow`** to move between cells
* **`m`** to change a cell to markdown format while out of editing mode (**`y`** changes it back to code format)
* **`b`** to insert a new cell under the current (it will default to code format)

If you are in a markdown cell when you **run** it (using e.g., `Shift + Enter`) then it will be typeset, whereas a code cell will send the lines of python code to the underlying IPython interpreter. Every open notebook has its own associated IPython kernel running in the background (check the "Running" tab on the landing page), so you can think of Jupyter Notebooks as just a fancy wrapper for an interactive IPython!

> **AND REMEMBER**: When you get frustrated the [internet](https://www.google.com/search?q=jupyter+how+to+run+a+cell&rlz=1C1CHBF_enDE766DE766&oq=jupyter+how+to+run+a+cell&aqs=chrome..69i57j0l5.4658j0j7&sourceid=chrome&ie=UTF-8) is [your](https://www.google.com/search?q=how+to+open+a+jupyter+notebook&rlz=1C1CHBF_enDE766DE766&oq=how+to+open+a+jupyt&aqs=chrome.1.69i57j0l7.4560j0j9&sourceid=chrome&ie=UTF-8) best [resource](https://www.google.com/search?rlz=1C1CHBF_enDE766DE766&sxsrf=ALeKk02B0Pj48PwBvuoOhmYiklyrsW2ZAQ%3A1599591459899&ei=I9RXX_6kNsrgkgWYuZII&q=how+to+change+the+format+of+a+cell+in+jupyter+notebook&oq=how+to+change+the+format+of+a+cell+in+jupyter&gs_lcp=CgZwc3ktYWIQAxgAMggIIRAWEB0QHjoECAAQRzoECCMQJzoFCAAQkQI6AggAOgIILjoHCAAQFBCHAjoFCAAQywE6BggAEBYQHjoECCEQClCQNljIYWCcZmgBcAF4AIABmAKIAeInkgEGOS4yMy42mAEAoAEBqgEHZ3dzLXdpesABAQ&sclient=psy-ab) :)


<div class="alert alert-block alert-warning">
<b>Task 1.2</b>

Create a markdown cell below this box. In it, summarize a few facts about python from the first paragraph of the Wikipedia page here: https://en.wikipedia.org/wiki/Python_(programming_language). The summary can be in bullet points. Include the Wiki link as a text hyperlink (so only the text shows and not the full https link). Look to the above cells for help!
    
Below this new markdown cell, put a `code` cell that prints the `Hello, world!` statement.
</div>

___
# Python Code Structure

In [63]:
# as we already saw, you can put comments, which python will ignore,
# using the hashtag symbol. it's best to keep them short and use
# new lines frequently when they get longggg like this one

# python codes begin with imports of any packages that you plan to use
import os
import numpy as np
import matplotlib.pyplot as plt

# and in jupyter we include the below "magic" command (indicated by the %)
# this tells matplotlib to put any new figures inside the document and not in a pop up
%matplotlib inline

When we ran the above cell nothing was output (but you should now see a line number next to `In` as you would in an IPython console).

We just imported a couple of packages that we'll use today: the built in `os`, or operating system, package that comes standard with your python install; the `conda`-installed `numpy`, or numerical python, package that we can use for math, manipulating arrays, and more; and the `conda`-installed `matplotlib` package that we use for plotting data.

We included the standard **aliasing** of the last two modules (using the python keyword `as`). The short names `np` and `plt` keep the code succinct and readable.

A **module** is a `.py` file with function definitions and other statements. Python **packages** are a way of organizing modules into larger entities (e.g., `matplotlib` is a package and `.pyplot` is one of its modules). More generally in programming, these are called **libraries**. For our purposes, all three terms can be used interchangeably!

This brings up an important point in python about **namespaces**. Any time we import a package or create a variable or function, we should pick a name that doesn't already exist in the current python namespace. Some names are always taken: **keywords** like `for`, `if`, and `as` are reserved and cannot be used as names at all, while **built-in** names like `print` and `int` can technically be reassigned, but doing so hides the original object and should be avoided (see [here](https://docs.python.org/3/library/functions.html#built-in-funcs) and [here](https://docs.python.org/3.8/reference/lexical_analysis.html#keywords)). If a word is highlighted in green in Jupyter, then you know it's a keyword or built-in python name that you should not reuse. Any new names that we introduce (like `version` below) only exist in the current kernel and will disappear when we shut down the notebook!
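
The difference between reserved keywords and built-in names can be checked directly. The following is a minimal sketch using only the standard-library modules `keyword` and `builtins`; the helper name `shadow_demo` and the variable names are made up for this illustration.

```python
import builtins
import keyword

# keywords are reserved by the language itself
is_kw_for = keyword.iskeyword('for')
is_kw_print = keyword.iskeyword('print')
print(is_kw_for, is_kw_print)          # True False

# print is not a keyword, it is a name defined in the builtins module
print(hasattr(builtins, 'print'))      # True

# assigning to a keyword is a SyntaxError, which we can detect with compile()
try:
    compile('for = 5', '<demo>', 'exec')
    keyword_assignable = True
except SyntaxError:
    keyword_assignable = False
print(keyword_assignable)              # False

def shadow_demo():
    # legal, but inside this function 'len' is now an integer
    # and the built-in len() function is hidden
    len = 5
    return len

print(shadow_demo())                   # 5
print(len([1, 2, 3]))                  # 3, the built-in is untouched out here
```

Reusing a built-in name does not raise an error, which is exactly why it is easy to do by accident.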

In [3]:
# we can use the print() function that we already saw to get some package info
version = np.__version__
print(version)

# and we can see the type of the new 'version' variable using another built in
type(version)

2.2.5


str

In [4]:
# if I try and write something
outside of a comment

SyntaxError: invalid syntax (4193456261.py, line 2)

Then we see our first **error message**. These are very important in python and quickly allow us to find **bugs** in the code. We see it threw a `SyntaxError` and it also shows which line caused the problem (`line 2`). This is an easy example, but error messages are important to read carefully as your code gets longer and more complex.

In [5]:
# let's see another type of error: `AttributeError`
print(np.__versin__)

AttributeError: module 'numpy' has no attribute '__versin__'

We'll talk more about code structure and learn a lot by hands on experience as we go through the course.

<div class="alert alert-block alert-warning">
<b>Task 1.3</b>

Make a code cell below this task and try to run the following to take the square root of 4: `numpy.sqr(4)`. What is(are) the error message(s)? Can you fix the code such that it outputs the square root of 4?
</div>

___
# Data Types and Manipulation

Let's briefly reiterate the data types that you will encounter when working with python and data analysis. We define some new **variables** below using the `=` notation.

In [6]:
# we define some new variables with different fundamental data types
integer = 5
decimal = 1.6
string = 'foo'
boolean = True

In [7]:
# now try using the 'type' function to print the data type of each variable
type(integer)

int

Now that we have some variables, how can we keep track of them? We can use the ipython magic command `%who`, or `%whos` to display a bit more information:

In [8]:
%who

boolean	 decimal	 integer	 np	 os	 plt	 string	 version	 


In [9]:
%whos

Variable   Type      Data/Info
------------------------------
boolean    bool      True
decimal    float     1.6
integer    int       5
np         module    <module 'numpy' from '/ho<...>kages/numpy/__init__.py'>
os         module    <module 'os' (frozen)>
plt        module    <module 'matplotlib.pyplo<...>es/matplotlib/pyplot.py'>
string     str       foo
version    str       2.2.5


In [11]:
# we can change between data types using the built-in type functions
new = str(integer)
print(new, type(new))

# NOTE: every time we re-assign 'new' using '=' we erase the previous assignment
new = int(decimal)
print(new, type(new))

5 <class 'str'>
1 <class 'int'>


What happened with that last example? When we converted the float to an integer, python simply dropped the decimal part (for a positive number, this is the same as rounding down). If we wanted to round to the nearest integer instead, we could invoke a `numpy` function first:

In [12]:
# use the ? to bring up the function's documentation
np.round?

Signature:       np.round(a, decimals=0, out=None)
Call signature:  np.round(*args, **kwargs)
Type:            _ArrayFunctionDispatcher
String form:     <function round at 0x7256600d2fc0>
File:            ~/miniconda3/envs/Py3_geodata/lib/python3.12/site-packages/numpy/_core/fromnumeric.py
Docstring:      
Evenly round to the given number of decimals.

Parameters
----------
a : array_like
    Input data.
decimals : int, optional
    Number of decimal places to round to (default: 0).  If
    decimals is negative, it specifies the number of positions to
    the left of the decimal point.
out : ndarray, optional
    Alternative output array in which to place the result. It must have
    the same shape as the expected output, but the type of the output
    values will be cast if necessary. See :ref:`ufuncs-output-type`
    for more details.

Returns
-------
rounded_array : ndarray
    An array of the

In [13]:
# convert decimal to integer with proper rounding
new = int(np.round(decimal, 0))
print(new)

2
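
As a short aside on the different ways to turn a float into an integer, here is a sketch that uses only built-ins and the standard-library `math` module (the list `values` and the name `results` are invented for the illustration). Note that the built-in `round` resolves exact halves to the nearest even number, and `np.round` follows the same rule.

```python
import math

values = [1.6, -1.6, 2.5]
results = []

for v in values:
    # int() drops the decimal part, floor goes down, ceil goes up,
    # round() goes to the nearest integer (halves go to the even neighbour)
    row = (v, int(v), math.floor(v), math.ceil(v), round(v))
    results.append(row)
    print(row)

# (1.6, 1, 1, 2, 2)
# (-1.6, -1, -2, -1, -2)
# (2.5, 2, 2, 3, 2)
```

The negative value shows why "dropping the decimal part" and "rounding down" are not the same thing: `int(-1.6)` is `-1`, while `math.floor(-1.6)` is `-2`.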


Above you see how we strung together multiple functions in one line (`int` and `np.round`). Let's look at some more basic math.

In [14]:
# if we add an integer and float we get a float
print(integer + decimal)
type(integer + decimal)

6.6


float

<div class="alert alert-block alert-warning">
<b>Task 1.4</b>

Create two variables: an integer and a float. Use the following mathematical operators on them: +, -, /, \*, **, %. 

Round the result to 3 decimal places and print that and its data type.
</div>

<div class="alert alert-block alert-warning">
<b>Task 1.5</b>

Create two string variables, each at least 4 characters long. Use the + operator on them and print the result. Also experiment with the string method [`split()`](https://docs.python.org/2/library/string.html#string.split), as in `my_string.split(<separator>)`.
</div>

## Data Structures

Now that we know about basic data types let's group them together in data **collections**, including: **lists, tuples**, and  **dictionaries**, among others. 

These group arbitrary variables together (e.g., strings and integers and floats) in containers. These containers can even contain their own type (e.g., a list of lists).

In [15]:
# define some new containers
my_list = [1, 2, 3, 4]
tup = (1, 2, 3, 4)
dictionary = {'a' : my_list, 'b' : tup, 'c' : decimal}

# look at their type
type(my_list)

list

Let's look at list indexing and dictionary calls now.

In [16]:
# let's look at list indexing
print('my list is: ', my_list)
print('the first value is:', my_list[0])
print('the first two values are:', my_list[:2])
print('the last value is:', my_list[-1])

my list is:  [1, 2, 3, 4]
the first value is: 1
the first two values are: [1, 2]
the last value is: 4


The `:` operator works to **slice** lists with the left side being the start index and the right side being the end index (not inclusive). Leaving out the value on either side defaults to the beginning or the end of the list, respectively. Slicing takes the full form `list[start:stop:step]`, but so far we've just used `start:stop` by omitting the second `:` and the step.

In [17]:
# take every second value starting with the first value
my_list[::2]

[1, 3]

> **Note:** Slicing using `:` also works on strings, like below

In [18]:
# slice a long string
foo = 'cutthisup'
print(foo[:3], foo[3:7], foo[7:])

cut this up
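
A few more combinations of `start`, `stop`, and `step` are sketched below. The list `letters` and the result names are made up for this example; the string is the same one used above.

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f']

middle = letters[2:5]               # start at index 2, stop before index 5
every_second = letters[1::2]        # start at index 1, then every second item
reversed_letters = letters[::-1]    # a negative step walks backwards
reversed_word = 'cutthisup'[::-1]   # works on strings too

print(middle)             # ['c', 'd', 'e']
print(every_second)       # ['b', 'd', 'f']
print(reversed_letters)   # ['f', 'e', 'd', 'c', 'b', 'a']
print(reversed_word)      # pusihttuc
```

A slice always returns a new object, so `letters` itself is unchanged after these lines.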


In [19]:
# what's wrong with this call?
print('the list variable is length: ', len(my_list))
my_list[4]

the list variable is length:  4


IndexError: list index out of range

**Remember that python uses _0 indexing_, meaning the first index in any sequence is 0 and not 1.** A list of length 4 therefore has the valid indices 0, 1, 2, and 3.

Another way to get the last value in the list is by counting backwards using negative index values, which also work in python.

In [20]:
# get the last value and second to last value in my_list
print(my_list)
print(my_list[-1])
print(my_list[-2])

[1, 2, 3, 4]
4
3


For a visual of negative vs. positive indexing:
![](img/indexing.jpg)

via: https://www.geeksforgeeks.org/string-slicing-in-python/

> **Note:** For negative indexing the first value is -1, so the reverse index goes to -7 (rather than 6)... There is one and only one zero!

<div class="alert alert-block alert-warning">
<b>Task 1.6</b>

Create a new list, beginning with the second value and taking every third value after it from this list: [3, 9, 7, 2, 3, 6, 9, 5]
</div>

Making **copies** of lists behaves in ways you might not expect if you are coming from other programming languages. You can assign a list to another variable name like this: 

In [22]:
my_list_c = my_list

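`my_list_c` looks like a **_copy_** of `my_list`, but it is really just a second name for the same list object, so if I change one, I change the other. This is often loosely called a **shallow copy**, although strictly speaking no new list is created at all.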

In [28]:
print(my_list)

# note here we are re-assigning the value at index 2 to be 1.5
my_list[2] = 1.5

# check the result
print(my_list)
print(my_list_c)

[1, 2, 3, 4]
[1, 2, 1.5, 4]
[1, 2, 1.5, 4]


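To create a copy that is independent of the original list we can use the slice notation (taking the entire list) or the list's `copy` method. This notebook calls that a **deep copy**; strictly speaking, both create a new list that still shares its elements with the original, which only matters for nested containers such as a list of lists (there, `copy.deepcopy` from the standard library is needed):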

In [29]:
# copy with a full slice
#my_list_c = my_list[:]
# or copy using the list's copy method
my_list_c = my_list.copy()

my_list[2] = 3
# check the result
print(my_list)
print(my_list_c)

[1, 2, 3, 4]
[1, 2, 1.5, 4]


If that's confusing, here's a visual example (using words) of shallow vs. deep copying:

![](img/shallow-vs-deep-copy.gif)

via: https://stackoverflow.com/questions/41125834/trying-to-do-a-shallow-copy-on-list-in-python
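
The three cases (a second name for the same list, a copy of the outer list only, and a fully independent copy) can be compared side by side on a nested list. This is a minimal sketch using `copy.deepcopy` from the standard library; the variable names are invented for the illustration.

```python
import copy

nested = [[1, 2], [3, 4]]

alias = nested                  # same object, just a second name
shallow = nested.copy()         # new outer list, but the inner lists are shared
deep = copy.deepcopy(nested)    # new outer list AND new inner lists

nested[0][0] = 99               # change an element of an inner list
nested.append([5, 6])           # change the outer list itself

print(alias)      # [[99, 2], [3, 4], [5, 6]]
print(shallow)    # [[99, 2], [3, 4]]
print(deep)       # [[1, 2], [3, 4]]

# 'is' tests whether two names point to the very same object
print(alias is nested, shallow is nested)   # True False
```

For flat lists of numbers or strings, like `my_list` above, `my_list.copy()` or `my_list[:]` is all you need.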

**Moving on to tuples.**

Tuples are like lists, but they are **immutable** (the elements cannot be changed in place), so they're useful if you don't want to change the data inside them:

In [1]:
# you can access a tuple index like a list
print(tup[0])
# including using slicing
print(tup[1:])

1
(2, 3, 4)

In [31]:
# but you can't change a tuple value by index
tup[0] = 1.1

TypeError: 'tuple' object does not support item assignment

Moving on to dictionary calls...

Remember dictionaries are formed with `{}` curly braces and have `key`:`value` pairs. We can call the `value` by its `key` name:

In [32]:
# let's call a few key:value pairs from the dictionary
print('the a key in my dictionary is: ', dictionary['a'])
print('the b key in my dictionary is: ', dictionary['b'])

the a key in my dictionary is:  [1, 2, 3, 4]
the b key in my dictionary is:  (1, 2, 3, 4)
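
A few other common dictionary operations are sketched here. The dictionary `station` and all of its keys and values are hypothetical and only serve as an illustration.

```python
# a hypothetical record describing a measurement station
station = {'name': 'demo_station', 'elevation_m': 120.0}

# asking for a missing key with [] raises a KeyError,
# .get() returns None (or a default of our choice) instead
print(station.get('latitude'))              # None
print(station.get('latitude', 'unknown'))   # unknown

# assigning to a new key adds it to the dictionary
station['n_sensors'] = 3

# 'in' checks whether a key exists
print('n_sensors' in station)               # True

# .items() lets us loop over the key:value pairs
for key, value in station.items():
    print(key, '->', value)
```

Dictionaries keep their keys in insertion order, so the loop prints `name`, `elevation_m`, and then `n_sensors`.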


<div class="alert alert-block alert-warning">
<b>Task 1.7</b>

Create three variables: a list of two integers, a tuple with two floats, and a dictionary with two items where each item is a list of two floats. Use the list methods `append` and `insert` (see [here](https://docs.python.org/3.8/tutorial/datastructures.html#more-on-lists)) to add the tuple once to end of the list, then to the front of the list. After this, create a new key in the dictionary and assign the modified list to this key.
</div>

From this we can get our first look at what makes python an [**object-oriented**](https://en.wikipedia.org/wiki/Object-oriented_programming) language, meaning that it treats all variables, packages, functions, etc. as **objects** with **attributes** (other Python objects, like functions or constants, stored "inside"). Some of these attributes are **methods** (functions associated with the object). So for instance, if we wanted to add the object `b` (non-list) to the list `a`, we could call the list method `append` using the `.` notation: `a.append(b)`.

The list itself is a variable, but any list object in python has many useful attribute functions associated with it.

In [33]:
a = [0, 2, 5, 10, 15]

In [None]:
# after the a. press Tab on your keyboard to see a long list of internal methods
a.

In [34]:
# if we wanted to add two LISTS we could also just use +
[1, 2, 3] + [4, 5, 6]

[1, 2, 3, 4, 5, 6]
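
The difference between the `+` operator and the `append` method is easy to miss, so here is a small sketch. The value of `b` and the names `combined`, `result`, and `public_names` are made up for this example; `dir` is a built-in that lists the attributes of an object.

```python
a = [0, 2, 5, 10, 15]
b = 20

combined = a + [b]      # + builds a NEW list, a itself is unchanged
print(combined)         # [0, 2, 5, 10, 15, 20]
print(a)                # [0, 2, 5, 10, 15]

result = a.append(b)    # append changes a in place ...
print(a)                # [0, 2, 5, 10, 15, 20]
print(result)           # None (... and returns nothing)

# collect the attribute names that do not start with an underscore
public_names = []
for name in dir(a):
    if not name.startswith('_'):
        public_names.append(name)
print(public_names)
```

The last line prints the same method names that the `Tab` completion shows after typing `a.` in a code cell.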

___
# for Loops

For loops are one of the most common elements of any computer code. Basically, they tell your code to step through every value in some collection of values and do something. For instance:

In [35]:
# here is a new list variable
berlin_food = ['doener', 'currywurst', 'falafel', 'pommes']
# for the German speakers, note I dropped the ö in döner, 
# it's best to stick with standard English characters in programming

Here we see for the first time the importance of **indentation** in python. For loops and other code elements (e.g., function definitions) require indents (typically made with either one `Tab` or four `Spacebar` strokes). Blocks of code with the same sized indent are executed together before moving on to the next level of indentation. 

In [38]:
# let's make a for loop to print
for food in berlin_food:
    print(food)
print('done!')

doener
currywurst
falafel
pommes
done!


Note the syntax `for` something `in` something`:`. The colon `:` comes up all over in python syntax whenever you are creating indented code blocks. Essentially it means "now run the below code block". Try removing it in the above instance and see what happens.

What is the `food` variable now? Based on this, how does `in` work?

<div class="alert alert-block alert-warning">
<b>Task 1.8</b>

Create a new list variable containing: one string, one float, one integer, one boolean, one tuple (of two integers), one list (of two integers), and one dictionary (of two keys, each referring to one integer). Iterate over the variable and print the item and the item's data type.
</div>

## The `range` Function

In [39]:
for i in range(5):
    print(i)

0
1
2
3
4


What does the built in function `range` do? Rather than explain, let's just use the help function `?`

In [None]:
range?

Often we use this function to loop through a data structure (e.g., list) by its index value. Note that we use a little trick below: python's `len` function gives the length of the list, which is one more than the maximum index value, and `range` stops just before it:

In [40]:
for i in range(len(berlin_food)):
    print(i)
    print(berlin_food[i])
    
    # we include a blank print statement to 
    # create a space between each loop step
    print()

0
doener

1
currywurst

2
falafel

3
pommes



Why would we do it this (slightly) more complex way? Well, what if you had two related lists of the same length and you wanted to pull values from the same index location out of both of them?

In [41]:
mountains = ['Mt. Everest', 'Aconcagua', 'Denali', 'Kilimanjaro']
continents = ['Asia', 'South America', 'North America', 'Africa']

for i in range(len(mountains)):
    print(mountains[i], 'is in', continents[i])

Mt. Everest is in Asia
Aconcagua is in South America
Denali is in North America
Kilimanjaro is in Africa


We could also use another function called `enumerate` here, which returns tuples of (index, value):

In [42]:
for i, mt in enumerate(mountains):
    print(i)
    print(mt)
    print(mt, 'is in', continents[i])
    print()

0
Mt. Everest
Mt. Everest is in Asia

1
Aconcagua
Aconcagua is in South America

2
Denali
Denali is in North America

3
Kilimanjaro
Kilimanjaro is in Africa
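
When the index itself is not needed, the built-in `zip` function is another option for stepping through two lists in parallel. A minimal sketch, reusing the two lists from above (the name `pairs` is made up here):

```python
mountains = ['Mt. Everest', 'Aconcagua', 'Denali', 'Kilimanjaro']
continents = ['Asia', 'South America', 'North America', 'Africa']

# zip pairs up the items at the same position in each list
for mt, continent in zip(mountains, continents):
    print(mt, 'is in', continent)

# zip produces tuples, which we can collect into a list
pairs = list(zip(mountains, continents))
print(pairs[0])    # ('Mt. Everest', 'Asia')
```

Keep in mind that `zip` silently stops at the end of the shorter list if the two lengths differ.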



<div class="alert alert-block alert-warning">
<b>Task 1.9</b>

Create a list of 4 integer values, use the range function to loop over each element in the list, _beginning with the second element_ and add it to the previous element.
</div>

In [48]:
a = range(4)
print(list(a))

for i in range(1, len(a)):
    print(a[i-1] + a[i])


[0, 1, 2, 3]
1
3
5


___
# Conditional Statements

Let's introduce the idea of **equality** using the `==` **comparison operator**. As opposed to `=` (**assignment**), `==` equality just means "are these two objects equal?" and returns a `bool` answer of `True` or `False`.

In [49]:
# check equality
print(dictionary)
print(my_list)
dictionary['a'] == my_list

{'a': [1, 2, 3, 4], 'b': (1, 2, 3, 4), 'c': 1.6}
[1, 2, 3, 4]


True

Alternatively we can get some other booleans using the comparison operators: not equal to (`!=`), greater than and less than (`>` and `<`), and greater than or equal to and less than or equal to (`>=` and `<=`).

In [50]:
# test some more operators
print(1.5 != 1)
print(6. > 6)
print(my_list, tup)
print(my_list[2] <= tup[2])

True
False
[1, 2, 3, 4] (1, 2, 3, 4)
True


The idea with **conditionals** in python is pretty simple: `if` a condition is met, then do this. They often rely on comparison operators.

In [52]:
# simple example
if np.sin(np.pi) == 0:
    print('nice!')
else:
    # note the \ backslash, telling python to ignore the 
    # apostrophe ' and treat it as a string element
    print('that\'s not what i expected')
    # an alternative is to use " and ': 
    print("that's not what i expected")


that's not what i expected
that's not what i expected


In [56]:
a = np.sin(np.pi)

In [58]:
a.dtype

dtype('float64')

In [65]:
np.log10(a)

np.float64(-15.91198914827845)

## Sidestep: Min and max numbers in variable types

Additional information is available [here](https://note.nkmk.me/en/python-sys-float-info-max-min/) and on [wikipedia](https://en.wikipedia.org/wiki/Double-precision_floating-point_format).

In [66]:
import sys

print(sys.float_info)

sys.float_info(max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308, min=2.2250738585072014e-308, min_exp=-1021, min_10_exp=-307, dig=15, mant_dig=53, epsilon=2.220446049250313e-16, radix=2, rounds=1)


In [68]:
print(1.7e308)

1.7e+308


In [70]:
print(1.8e308)

inf


In [77]:
#np.finfo?
np.finfo(np.float64).max

np.float64(1.7976931348623157e+308)

In [78]:
np.finfo(np.float32).max

np.float32(3.4028235e+38)

In [81]:
#np.iinfo?
np.iinfo(np.int16).max

32767

In [82]:
np.iinfo(np.int64).max

9223372036854775807

In [84]:
x = np.iinfo(np.int64).max
print( f'{x:,}' )

9,223,372,036,854,775,807


#### Why should we use different types of floating-point accuracy?

In [85]:
a64 = np.float64(1e3)
a32 = np.float32(1e3)
aint64 = np.int64(1e3)
aint32 = np.int32(1e3)
aint16 = np.int16(1e3)

In [89]:
if a64 == a32:
    print(True)
    
if a64 == aint64:
    print(True)
    

True
True


In [94]:
%whos

Variable      Type       Data/Info
----------------------------------
a             float64    1.2246467991473532e-16
a32           float32    1000.0
a64           float64    1000.0
aint16        int16      1000
aint32        int32      1000
aint64        int64      1000
b             list       n=0
berlin_food   list       n=4
boolean       bool       True
continents    list       n=4
decimal       float      1.6
dictionary    dict       n=3
foo           str        cutthisup
food          str        pommes
i             int        3
integer       int        5
mountains     list       n=4
mt            str        Kilimanjaro
my_list       list       n=4
my_list_c     list       n=4
new           int        2
np            module     <module 'numpy' from '/ho<...>kages/numpy/__init__.py'>
os            module     <module 'os' (frozen)>
plt           module     <module 'matplotlib.pyplo<...>es/matplotlib/pyplot.py'>
string        str        foo
sys           module     <module 'sys' (bu

In [105]:
print(a64.nbytes)
print(a32.nbytes)
print(aint64.nbytes)
print(aint32.nbytes)
print(aint16.nbytes)

8
4
8
4
2


<div class="alert alert-block alert-warning">
<b>Task 1.10</b>

Given an array with 1000 x 1000 elements: What is the memory usage of a float64, float32, and int16 array?
</div>

In [3]:
float64_memory_usage = a64.nbytes * 1000 * 1000
float32_memory_usage = a32.nbytes * 1000 * 1000
int16_memory_usage = aint16.nbytes * 1000 * 1000
print(float64_memory_usage/1e3)
print(float32_memory_usage/1e3)
print(int16_memory_usage/1e3)

8000.0
4000.0
2000.0
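
The same estimate can be made with plain arithmetic, without creating any arrays. This sketch assumes the bytes per element reported by `.nbytes` above (8, 4, and 2); the names `bytes_per_element` and `memory_mb` are made up for the illustration.

```python
n_elements = 1000 * 1000

# bytes per element, as reported by .nbytes for single values above
bytes_per_element = {'float64': 8, 'float32': 4, 'int16': 2}

memory_mb = {}
for dtype_name, itemsize in bytes_per_element.items():
    # total bytes = number of elements * bytes per element
    memory_mb[dtype_name] = n_elements * itemsize / 1e6
    print(f'{dtype_name}: {memory_mb[dtype_name]:.1f} MB')

# float64: 8.0 MB
# float32: 4.0 MB
# int16: 2.0 MB
```

Choosing `int16` over `float64` cuts the memory footprint by a factor of four, which starts to matter for large raster datasets.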

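## Back to if-else statements

In the conditional example before the sidestep, we used `if` to test a condition (using `==`), which returned `False` (because $\sin(\pi)$ evaluated with the floating-point `np.pi` constant is about $1.22 \times 10^{-16}$ and not exactly 0), so the code under `else` was run instead.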

We can also use the keyword `not` to reverse a boolean:

In [109]:
print(np.sin(np.pi) == 0)
print(not np.sin(np.pi) == 0)

False
True


We can also link several conditions using `elif`, or "else if".

In [110]:
if np.sin(np.pi) == 0:
    print('nice!')
elif np.round(np.sin(np.pi), 10) == 0:
    print('hmmm, close...')
else:
    print('that\'s not what i expected')

hmmm, close...


The above code goes line-by-line checking each conditional. It ran the indented code after `elif` because that returned `True`. Note our use of the `np.round` function to approximate the result.
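
Rounding before comparing works, but a tolerance-based comparison is a common alternative for floats. Below is a minimal sketch with `math.isclose` from the standard library; the tolerance `1e-9` is an arbitrary choice for this example. NumPy offers a similar function, `np.isclose`, for arrays.

```python
import math

value = math.sin(math.pi)    # a tiny number, but not exactly 0

exact_match = (value == 0)
close_match = math.isclose(value, 0, abs_tol=1e-9)
print(exact_match, close_match)     # False True

# the same problem shows up in everyday decimal arithmetic
sum_exact = (0.1 + 0.2 == 0.3)
sum_close = math.isclose(0.1 + 0.2, 0.3)
print(sum_exact, sum_close)         # False True
```

When comparing against zero, an absolute tolerance (`abs_tol`) is required, because the default relative tolerance scales with the size of the numbers being compared.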

We can even combine conditional statements (`==`, `!=`, `<`, `<=`, `>`, `>=`) using the keywords `and` and `or`. `and` only returns `True` when both conditions are `True`. `or` returns `True` if even one condition is `True`.

In [111]:
print(True and True)
print(True or False)

True
True


In [112]:
# and to get really confusing :)
not True or not False

True

In [113]:
# combine conditional statements
a = 4
b = 16

if (np.sqrt(b) > a) or (b < a):
    print('cool')
else:
    print('okay, sure')

okay, sure


You can also combine for loops with conditional statements!

In [114]:
temperatures = [-12, 0, 17]

for t in temperatures:
    if t > 0:
        print(t, 'is above freezing')
    elif t == 0:
        print(t, 'is freezing')
    else:
        print(t, 'is below freezing')

-12 is below freezing
0 is freezing
17 is above freezing


> **Note**: Pay careful attention to the indentation in the above code block (2 levels here). Think about the order in which things are occurring to produce each of the three printed lines.

<div class="alert alert-block alert-warning">
<b>Task 1.11</b>

Use this list: [1, (3, 4), 2.2, -1]. Loop through the list. For each item: `if` the data type is integer and greater than zero print it, `elif` the data type is tuple `pass` and do nothing (see [here](https://docs.python.org/3/tutorial/controlflow.html#pass-statements)), `else` print the data type.
</div>


One more conditional to touch upon is `while`. A `while` conditional loop executes a code block while some condition is `True`.

> **Note:** Be careful with `while`. If your condition never becomes `False` the code will keep running forever! If that happens you need to select "Kernel > Interrupt" in the top menu (or use the keyboard stroke `i, i`), which is the equivalent of killing a process with `Ctrl + c` on the command line!

Here's a small example:

In [None]:
# define some variable
n = 10

# define some threshold
threshold = 8

# reduce n until the threshold is reached
while (n != threshold):
    n -= 1
    print(n)

> **Note:** above we used a new arithmetic operator `-=`. This means take the variable and subtract whatever is to the right and assign this new value to the variable (the equivalent of n = n - 1, just more succinct). You can do this with `+=`, `*=`, and `/=`.

In [None]:
# and here's a bad example requiring a keyboard interrupt
n = 10
threshold = 8.5
while (n != threshold):
    n -= 1
    print(n)

> **Note:** This created a huge list of values very quickly in the output below the code block. Navigate to "Cell > Current Outputs > Clear" with the above cell selected to clear this output before we go on.
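
One way to make such a loop safe is to test with `>` instead of `!=`, so the loop stops even if `n` never hits the threshold exactly. The sketch below also adds a step counter as a second safety net; `max_steps` and `steps` are hypothetical names introduced only for this example.

```python
n = 10
threshold = 8.5
max_steps = 100    # hypothetical upper limit on the number of loop steps
steps = 0

# stop as soon as n drops below the threshold OR we run out of steps
while n > threshold and steps < max_steps:
    n -= 1
    steps += 1
    print(n)

# prints 9 and then 8
print('stopped after', steps, 'steps')    # stopped after 2 steps
```

With `!=` the loop jumps from 9 to 8 and never sees 8.5, whereas `>` notices that the threshold has been passed.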

___
# List Comprehension

This is an easy to use python feature that allows you to create new lists in a single succinct line of code. The general form is:

```
[f(val) for val in collection if condition]
```

which is much nicer than:

```
result = []
for val in collection:
    if condition:
        result.append(f(val))
```

In [None]:
# let's take a list of numbers
list_a = [0.1, 5, 16, 0.5, 1]

# and create a new list from that of only
# the numbers less than or equal to 1, and also convert
# the numbers to strings
list_b = [str(x) for x in list_a if x <= 1]

print(list_b)
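
To see that the two forms really are equivalent, here is a sketch that builds the same list both ways. It reuses the temperature values from the conditional example and converts the ones above freezing from Celsius to Fahrenheit; the variable names are made up for this illustration.

```python
temperatures = [-12, 0, 17]    # degrees Celsius

# the explicit for loop with a condition
fahrenheit_loop = []
for t in temperatures:
    if t > 0:
        fahrenheit_loop.append(t * 9 / 5 + 32)

# the same thing as a list comprehension
fahrenheit_comp = [t * 9 / 5 + 32 for t in temperatures if t > 0]

print(fahrenheit_loop)
print(fahrenheit_comp)
print(fahrenheit_loop == fahrenheit_comp)    # True
```

The condition at the end is optional: `[t * 9 / 5 + 32 for t in temperatures]` would convert every value.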

<div class="alert alert-block alert-warning">
<b>Task 1.12</b>

Use list comprehension on this list: [0, 2, 1.6, 10, 22.2] to create a new list of each value squared if the value is an integer. Use the built-in function [`isinstance`](https://docs.python.org/3/library/functions.html#isinstance) to check the data type.
</div>

___
# Packages and Functions

We've already seen how packages can be used to call functions (e.g., `np.round`) and we've also seen that python has a number of built in functions (e.g., `print`). Let's discuss functions a bit more here.

In [115]:
# call sine from np
np.sin(np.pi/2)

np.float64(1.0)

> **Note:** In the above example $\sin(\pi/2)=1$ as we expect, even though $\sin(\pi)$ did not come out as exactly 0 earlier. Both results are shaped by **floating point rounding** during arithmetic. You can read more about it in the official documentation [here](https://docs.python.org/3/tutorial/floatingpoint.html), or, if you really want to give yourself a headache: [here](https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html). We won't worry about these details now, but it's good to be aware of!

NumPy has a lot of useful math functions, here are a few common ones:

|function | purpose|
|------------ |--------------|
|abs(x)  | absolute value|
|cos(x)        |cosine |
|sin(x)       | sine |
|tan(x)      |  tangent |
|arccos(x)   |arccosine |
|arcsin(x)    | arcsine |
|arctan(x)    | arctangent |
|exp(x)      |  exponential |
|log(x)      |  natural logarithm ($ln$) |
|log10(x)    |  base 10 log |
|sqrt(x)    |   square root |
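
These functions are called with the `np.` prefix and work element by element on arrays. A minimal sketch, using a made-up array `x`:

```python
import numpy as np

# a small array of example values (made up for illustration)
x = np.array([1.0, 4.0, 100.0])

print(np.sqrt(x))    # square root of each element
print(np.log10(x))   # base 10 logarithm of each element
print(np.abs(-x))    # absolute value of each element
```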

**Functions** are organized, reusable blocks of code that take inputs and give outputs. Using them makes your code more readable and effective. We've used `numpy` and built-in functions, but let's define our own function now:

In [None]:
# define a simple function
def convert_meters_per_year_to_mm_per_month(annual_precip):
    """
    Takes an annual precipitation in m/yr and converts it
    to mm/month.
    """
    value = annual_precip / 12 * 1000
    return value

* `def` is the function definition keyword
* `convert_meters_per_year_to_mm_per_month` is the function name
* `annual_precip` is the input parameter. It is a **positional** argument, but some functions also have **keyword** arguments. We'll see this later.
* between the triple quotes is the **docstring**, which provides some information about the function
* `value` is the calculated value to return
* `return` is the return statement, or what the function outputs

Our function name here is way too long. In real code we would want to name it something simpler:

In [None]:
# same function, better name
def precip_convert(annual_precip):
    """
    Takes an annual precipitation in m/yr and converts it
    to mm/month.
    """
    value = annual_precip / 12 * 1000
    return value

Then, if we happen to forget what the function does, we can just look up its docstring:

In [None]:
precip_convert?

> **Note:** A **local namespace** is created when the function is called and destroyed after the function finishes. That means that the variable `value` inside our function `precip_convert` does not exist in the **global namespace** of our notebook.
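
A minimal sketch of what this means in practice, assuming no variable called `value` has been defined anywhere else in the notebook. The flag `found` is introduced only for this demonstration.

```python
def precip_convert(annual_precip):
    """Takes an annual precipitation in m/yr and converts it to mm/month."""
    value = annual_precip / 12 * 1000
    return value

# the returned number is available under the name we assign it to
result = precip_convert(3)
print(result)

# 'value' only existed while the function was running,
# so asking for it here raises a NameError
try:
    print(value)
    found = True
except NameError as err:
    found = False
    print("NameError:", err)
```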

Now instead of writing out this math equation again and again, we can just call it repeatedly in a for loop.

<div class="alert alert-block alert-warning">
<b>Task 1.13</b>

Use this list of annual precipitations in m/yr: [1, 0.5, 2]. Loop through the list, print the value (include the units in the print statement), convert the value to monthly precipitation, and print the converted value and units. In the second print statement, print the monthly precipitation rounded to 1 decimal place; to do this, you'll need to use [string formatting](https://www.geeksforgeeks.org/python-format-function/).
</div>

> **Note:** The python function `format()` is just one way to format strings; you may see a few other common ways, as shown [here](https://realpython.com/python-string-formatting/).
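
For orientation, here is a sketch of the most common formatting styles applied to one made-up number, `x`. Each line formats it with 2 decimal places and appends a unit.

```python
# a made-up value to format
x = 3.14159

s_method = "{:.2f} m".format(x)       # str.format() method
s_builtin = format(x, ".2f") + " m"   # built-in format() function
s_fstring = f"{x:.2f} m"              # f-string (python 3.6 and later)
s_percent = "%.2f m" % x              # older %-style formatting

print(s_method)
print(s_builtin)
print(s_fstring)
print(s_percent)
```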

## Lambda Functions

One last thing to briefly mention is `lambda`, or **anonymous**, functions. These are one-line functions that consist of a single expression and don't receive a name. They take the form:
```
lambda x: do something to x
```

This is similar to how we write a function mathematically, e.g., $f(x) = x^2$.
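
As a minimal sketch, $f(x) = x^2$ can be written as a lambda. Binding it to the name `f` is done here only for illustration; lambdas are normally passed straight to another function without being named.

```python
# f(x) = x^2 as a lambda; the name f is only for demonstration
f = lambda x: x**2
print(f(3))

# a lambda can also be called directly without ever getting a name
print((lambda x: x**2)(4))
```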

They're helpful when you need a small operation in just one place and don't want to define a whole named function for it. They are often used together with python's built-in `map()` and `filter()` functions and with `functools.reduce()`, but we'll ignore those for now.

In any case, it's good to know, so here's an example usage:

In [None]:
# define a function
def key_function(x):
    return x[1]

# take a list
a = [(1, 2), (3, 1), (5, 10), (11, -3)]

# sort the list passing the key function
# to the keyword argument 'key' in the list method 'sort()'
a.sort(key=key_function)
a

The above code means: "sort this list of objects from low to high and use the second value of each object (`x[1]`) to do the sorting". 

Compare this with the same sort using a lambda function:

In [None]:
a = [(1, 2), (3, 1), (5, 10), (11, -3)]

# do the same thing but replace the function with
# a temporary lambda call
a.sort(key=lambda x: x[1])
a

> **Note:** Above we see a **keyword argument** in a function call (`key=` inside `sort()`). Some functions (including ones you design with `def`) take optional **keyword arguments**, which fall back to a default value if they aren't supplied.
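
Here is a sketch of a function with a default keyword argument. The function `precip_convert_kw` and its parameter `months` are hypothetical and exist only to illustrate the idea. The last lines use `sorted()`, which accepts the keyword arguments `key` and `reverse`.

```python
def precip_convert_kw(annual_precip, months=12):
    """
    Converts a precipitation total in m to mm/month.
    'months' is a hypothetical keyword argument that defaults to 12.
    """
    return annual_precip / months * 1000

print(precip_convert_kw(1.5))            # months falls back to 12
print(precip_convert_kw(1.5, months=6))  # months overridden by keyword

# sorted() returns a new list; reverse=True sorts from high to low
a = [(1, 2), (3, 1), (5, 10), (11, -3)]
a_desc = sorted(a, key=lambda x: x[1], reverse=True)
print(a_desc)
```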

<div class="alert alert-block alert-warning">
<b>Task 1.14</b>

Create a new function with 2 input parameters. The function should take the square root of the first parameter, then divide that by the second parameter and return the result. Give the function a docstring. Call the function in a for loop over this list:

foo = [(1, 2), (1.6, 5), (10, 2)]

Print the output in the for loop. *IF* the final value is less than one, *THEN* format the printed string to have only 1 decimal place, *ELSE* print the value as is.
</div>