# Python Introduction <img align="right" src="resources/ama_logo.jpg" width=250 height=250>

This section will discuss the programming language used in Open Data Cube (ODC) code - Python.

## Background
Before we use the ODC in code, we first need to understand the programming language that the ODC runs in.

For a more detailed Python programming tutorial, see [this tutorial](https://www.w3schools.com/python/python_intro.asp).

## Description

Python topics covered include:

* [Formatting](#Formatting)
* [Syntax](#Syntax)
* [Variables](#Variables)
* [Data Types](#Data-Types)
* [Operators](#Operators)
* [Control Flow](#Control-Flow)
* [Commonly Used Functions](#Commonly-Used-Functions)
* [Importing Packages](#Importing-Packages)
***

## Formatting

In [1]:
%%html
<style>
table {align:left; display:block}
</style>
Markdown tables now align left.

## Syntax

### Indentation
Indentation refers to the spaces at the beginning of a code line.

In many other programming languages, indentation is for readability only. In Python, indentation is part of the syntax.

Python uses indentation to indicate a block of code.

The following **`if`** block will always run. We will discuss control flow constructs like **`if`** later in this section. Some other things that can constitute code blocks are function and class definitions.

In [2]:
if 5 > 2:
  print("Five is greater than two!")

Five is greater than two!
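As a minimal sketch of how indentation groups statements, the following function definition contains an `if` block nested inside the function block. The function name `describe_number` is made up for this illustration. Each level of nesting is indented further, and a block ends where the indentation returns to a previous level.

```python
def describe_number(n):
    # Everything indented under "def" belongs to the function block.
    if n > 2:
        # This line is indented again, so it belongs to the "if" block.
        return "greater than two"
    # Back at the function's indentation level: the "if" block has ended.
    return "two or less"

result_large = describe_number(5)
result_small = describe_number(1)
print(result_large)
print(result_small)
```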


### Comments

Code comments can be used to explain code or avoid running some code.

Single-line comments begin with the **`#`** character.

Multiline comments begin and end with **`'''`**.

In [3]:
# This is a comment.
'''
This is a multiline comment.
'''

'\nThis is a multiline comment.\n'

Note that multiline comments are actually strings that are not assigned to a variable. A notebook displays the value of the last expression in a cell, so the multiline string above appears in the cell output.

## Variables

A variable is a name that references an object. An object is a collection of attributes and actions. For example, a number has a value (an attribute) and can be used in arithmetic expressions with other numbers (actions).

Note that a variable does not have a fixed type, but objects do.
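The following sketch illustrates this point with an arbitrary variable name `x`: the same name can reference an integer first and a string later, while each object keeps its own type.

```python
x = 5                    # x references an integer object
first_type = type(x)
x = "five"               # the same name now references a string object
second_type = type(x)
print(first_type, second_type)
```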

### Creation

A variable is created the first time it has a value assigned to it.

For example, the following cell defines a variable **`a`** that references the object `5`. 

In [4]:
a = 5

### Naming rules

There are some naming rules for variables.

* Variable names should only contain alpha-numeric characters and underscores (a-z, A-Z, 0-9, and _). Python 3 also accepts many non-ASCII letters, but they are rarely used.
* Variable names cannot start with a number.
* Variable names are case-sensitive (age, Age and AGE are three different variables)

For example, `2a = 5` is invalid syntax, but `a2 = 5` is valid syntax.
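One way to check a candidate name without triggering a syntax error is the string method `isidentifier()`, sketched below. The candidate names are arbitrary examples. Note that this method does not check for reserved keywords such as `if`.

```python
candidates = ["a2", "2a", "my_var", "my-var"]

# Map each candidate name to whether Python accepts it as an identifier.
validity = {name: name.isidentifier() for name in candidates}

for name, is_valid in validity.items():
    print(f"{name!r} is a valid variable name: {is_valid}")
```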

### Multiple assignment

Multiple variables can be assigned values in 1 step (or "statement"), as shown in the following example.

In [5]:
a,b,c = 1,2,3
print(a,b,c)

1 2 3


## Data Types

There are many data types that objects can have in Python.

In the variable declaration example above, the variable `a` was assigned the value `5`, which is an integer.

There are built-in types and user-defined types in Python.

We will only discuss built-in types in this section.

### Built-in types

Built-in types are part of Python itself.

In this section, types are grouped into 2 categories: normal types and containers. Both categories include built-in and user-defined types, and instances of all types are Python objects.

#### Normal Types

Normal types are any types that do not primarily serve as containers of other objects. They can contain other objects and even containers, but containing objects is not their primary purpose.

The types listed here are some of the most commonly used normal types.

>##### Numbers

The 2 most commonly used types of numbers in Python are integers, such as `5`, and floats (floating-point numbers), such as `1.5`.

Integers have the type `int` and floats have the type `float`.
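A short sketch of how the two number types interact in arithmetic follows. The variable names are illustrative only.

```python
whole = 7                      # int
fraction = 2.0                 # float

true_division = 7 / 2          # "/" always returns a float
floor_division = 7 // 2        # "//" on two ints returns an int, rounded down
mixed_sum = whole + fraction   # combining an int and a float gives a float

print(true_division, floor_division, mixed_sum)
```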

>##### Strings

Strings can be defined with single quotes (`''`), double quotes (`""`), or as a multiline string with triple quotes (`''' '''` or `""" """`). For example:

In [6]:
mystr = "String"
print(f"mystr: {mystr}")

multiline_str = '''\
First line
Second line\
'''
# \n is the "new line" character
print(f"Multiline string: \n{multiline_str}")
# \ at the end of a line is a line continuation character in Python.
# Without it, every line break in the multiline string
# is part of the string itself. In the next example,
# the line continuation characters are omitted, so the
# string has a newline at the beginning and end.
multiline_str_no_line_continuation = '''
First line
Second line
'''
print("Multiline string without using "
      f"line continuation characters: \n\
        {multiline_str_no_line_continuation}")

mystr: String
Multiline string: 
First line
Second line
Multiline string without using line continuation characters: 
        
First line
Second line



The strings that begin with `f` and contain expressions in curly brackets (`{}`) are called "format strings", or "f-strings". You can read more about them [here](https://docs.python.org/3/tutorial/inputoutput.html#formatted-string-literals).

A character in a string can be accessed by its index, with the first character having the index `0`. Multiple characters can be selected with a slice, which has the format `[low_index:high_index:stride]`. Here, `low_index` is the starting index value (default: `0`), `high_index` is the index value after the highest index value that can be included (default: the number of elements, which is 1 more than the highest index - the elements being characters for strings), and `stride` is the number of positions to move for each element (default: `1`). For example:

In [7]:
print(f"First character in mystr: {mystr[0]}")
print(f"First 3 characters in mystr: {mystr[:3]}")
print(f"Characters in mystr with even indices ([0,2,4]): {mystr[::2]}")
print(f"Characters in mystr with odd indices ([1,3,5]): {mystr[1::2]}")

First character in mystr: S
First 3 characters in mystr: Str
Characters in mystr with even indices ([0,2,4]): Srn
Characters in mystr with odd indices ([1,3,5]): tig
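Slices also accept negative values. The following sketch reuses the string `"String"` from above to show a negative index, a slice with both bounds given, and a negative stride.

```python
mystr = "String"

last_character = mystr[-1]       # negative indices count from the end
middle_characters = mystr[2:5]   # indices 2, 3, and 4
reversed_str = mystr[::-1]       # a negative stride steps backwards

print(last_character, middle_characters, reversed_str)
```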


Strings can be combined, or "concatenated", by adding them with the `+` operator.

In [8]:
substring1 = "Hello, "
substring2 = "World"
substring1 + substring2

'Hello, World'

You can check if a string contains another string with the `in` operator:

In [9]:
"Hello" in "Hello, World"

True

>##### Booleans

Booleans represent one of two values: `True` or `False`.

Comparison operators return Boolean values. For example:

In [10]:
print(f'10 > 9: {10 > 9}')
print(f'10 == 9: {10 == 9}')
print(f'10 < 9: {10 < 9}')

10 > 9: True
10 == 9: False
10 < 9: False


#### Containers

Container types are any types that primarily serve as containers of other objects.

The objects that a container contains are called the container's elements.

The types listed here are some of the most commonly used container types.

>##### Lists

Lists are ordered (not necessarily sorted) collections of objects. 

The elements do not need to have the same type. 

Lists can be created like this `[value1, value2, ...]`. For example:

In [11]:
mylist = [1,2,3]

Elements can be added to the list with the `append()` method and removed with the `remove()` method:

In [12]:
mylist.append(4)
print("List after append:", mylist)
mylist.remove(4)
print("List after remove:", mylist)

List after append: [1, 2, 3, 4]
List after remove: [1, 2, 3]


Elements can be accessed and changed by their position in the list, with the first element having the index `0`.

In [13]:
print("First element:", mylist[0])
mylist2 = mylist.copy()
mylist2[0] = 5
print("First element after change:", mylist2[0])

First element: 1
First element after change: 5


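You can also merge lists with the `extend()` method or the `+` operator. The `extend()` method adds elements to the existing list, whereas `+` creates a new list.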

In [14]:
mylist3 = mylist.copy()
mylist3.extend(mylist2)
mylist3

[1, 2, 3, 5, 2, 3]

In [15]:
mylist + mylist2 

[1, 2, 3, 5, 2, 3]

>##### Tuples

Tuples are like lists, except that elements cannot be added, removed, or replaced after the tuple is created.

Tuples are often used where lists are used, but using a tuple clarifies that its elements are constant.

The elements do not need to have the same type.

Tuples have the type `tuple` and can be created like this `(value1, value2, ...)`. To create a tuple with a single item, put a comma after the single element. For example:

In [16]:
mytuple = (1,2,3)
single_element_tuple = (1,)
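The sketch below shows what happens when code tries to replace a tuple element, and why the trailing comma matters for single-element tuples. The variable names are illustrative.

```python
mytuple = (1, 2, 3)

try:
    mytuple[0] = 5               # tuples do not support item assignment
except TypeError as error:
    error_message = str(error)
    print("TypeError:", error_message)

# Elements can still be read by index, just like list elements.
first_element = mytuple[0]

# Without the trailing comma, the parentheses only group an expression.
not_a_tuple = (1)
single_element_tuple = (1,)
print(type(not_a_tuple), type(single_element_tuple))
```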

>##### Dictionaries

Dictionaries map values ("keys") to other values ("values"). A single pair of a key and its value is called an item. Dictionaries are also called maps or mappings, both in Python and in other programming languages.

Dictionaries have the type `dict` and can be created like this `{key1:value1, key2:value2, ...}`.

For example, the number of occurrences of values in a list could be represented as a map of the values to the number of occurrences of those values. For the list `[1,1,2,2,2,3,3,4]`, this dictionary would be `{1:2, 2:3, 3:2, 4:1}`.
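As a sketch of how such a dictionary could be built, the following code counts the occurrences in that list. It uses a `for` loop, which is covered later in this section, and the dictionary method `get()`, which returns a default value when a key is missing.

```python
values = [1, 1, 2, 2, 2, 3, 3, 4]

counts = {}
for value in values:
    # get() returns the current count, or 0 if the key is not present yet.
    counts[value] = counts.get(value, 0) + 1

print(counts)
```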

Using indexing syntax, a dictionary value can be accessed or changed by its key and a dictionary item can be added and removed by its key (using the `del` keyword). For example:

In [17]:
mydictionary = {1:2, 2:3, 3:2, 4:1}

# Accessing values
print(f'Value for key 1: {mydictionary[1]}')

# Changing values
mydictionary[1] = 5
print('mydictionary after changing the value of key 1 to 5:\n'
      f'{mydictionary}')

# Adding new item
mydictionary[5] = 0
print('mydictionary after adding item with key 5:\n'
      f'{mydictionary}')

# Removing item
del mydictionary[5]
print('mydictionary after removing item with key 5:\n'
      f'{mydictionary}')

Value for key 1: 2
mydictionary after changing the value of key 1 to 5:
{1: 5, 2: 3, 3: 2, 4: 1}
mydictionary after adding item with key 5:
{1: 5, 2: 3, 3: 2, 4: 1, 5: 0}
mydictionary after removing item with key 5:
{1: 5, 2: 3, 3: 2, 4: 1}


>##### Sets

Sets are unordered collections of unique values.

Sets have the type `set` and can be created like this `{value1, value2, ...}`.

Elements can be added to a set with the `add()` method and removed with the `remove()` method. A logical union or intersection of elements from other containers (not just sets) can be obtained with the `union()` and `intersection()` methods.

In [18]:
myset = {1}
otherset = {1,2,3}
print(f'myset: {myset}')
print(f'otherset: {otherset}\n')

# Adding new element
myset.add(2)
print(f'myset after adding 2: {myset}')

# Removing element
myset.remove(2)
print(f'myset after removing 2: {myset}')

# Union and Intersection
print(f'union of myset and otherset: {myset.union(otherset)}')
print(f'intersection of myset and otherset: {myset.intersection(otherset)}')

myset: {1}
otherset: {1, 2, 3}

myset after adding 2: {1, 2}
myset after removing 2: {1}
union of myset and otherset: {1, 2, 3}
intersection of myset and otherset: {1}
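Two details are sketched below. Empty curly brackets create a dictionary, so an empty set is created with `set()`. Converting a list to a set also removes duplicate values.

```python
empty_dict = {}        # {} is an empty dictionary, not an empty set
empty_set = set()      # use set() to create an empty set

# Converting a list to a set keeps each distinct value once.
unique_values = set([1, 1, 2, 2, 3])

print(type(empty_dict), type(empty_set), unique_values)
```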


### Type checking

The type of an object can be determined with the built-in `type` function.

In [19]:
print(f'type of 1: {type(1)}')
print(f'type of "str": {type("str")}')

type of 1: <class 'int'>
type of "str": <class 'str'>
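To test whether an object has a particular type, rather than just display its type, the built-in `isinstance()` function can be used. This is a brief sketch with arbitrary values.

```python
is_int = isinstance(1, int)

# A tuple of types checks whether the object has any of those types.
is_number = isinstance(1.5, (int, float))

is_str = isinstance(1, str)

print(is_int, is_number, is_str)
```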


### Type casting

An object `obj` can be type cast to another type `mytype` with syntax like `mytype(obj)`. 

For example:

In [20]:
print(f'int cast of 1.5 (float truncates): {int(1.5)}')
print(f'float cast of "1.5" multiplied by 2: {float("1.5") * 2}')

int cast of 1.5 (float truncates): 1
float cast of "1.5" multiplied by 2: 3.0
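Not every cast succeeds. In the sketch below, casting the string `"1.5"` directly to `int` raises a `ValueError`, while casting it to `float` first works. The variable names are illustrative.

```python
number_as_text = str(5)       # int to string
digits = list("123")          # string to list of characters

try:
    int("1.5")                # the string does not represent an integer
    cast_succeeded = True
except ValueError:
    cast_succeeded = False

via_float = int(float("1.5"))  # cast to float first, then truncate to int

print(number_as_text, digits, cast_succeeded, via_float)
```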


## Operators

Operators are special symbols or keywords (such as `+` or `and`) used to perform operations on variables and values.

### Types of Operators

Some operators only operate on 1 value. These are called unary operators.

Other operators operate on 2 values. These are called binary operators.

More generally, the number of values an operator or function needs is called its arity.

### Overview of Operators

There are several types of operators: arithmetic, assignment, comparison, logical, identity, membership, and bitwise. These types of operators are described below.

* **Arithmetic**: These operators perform arithmetic operations.
* **Assignment**: These operators assign values to variables. Most of them first apply another operator before assigning the result.
* **Comparison**: These operators compare 2 values - returning Boolean results.
* **Logical**: These operators combine Boolean values (such as outputs of comparisons).
* **Identity**: These operators determine if object references are the same (not just that objects have the same value).
* **Membership**: These operators determine if a value is present in a container or sequence.
* **Bitwise**: These operators operate on binary numbers.

The most commonly used operators are the arithmetic, assignment, comparison, and logical operators.

Arithmetic operators include +, -, \*, /, %, \*\*, and //.

Assignment operators include the most commonly used binary operator - the assignment operator: `=`. Other assignment operators, such as `+=`, first apply an arithmetic or bitwise operator to the variable's current value and the value on the right, and then assign the result to the variable.

Comparison operators include ==, !=, >, <, >=, <=.

Logical operators include **and**, **or**, and **not**.

Identity operators include **is** and **is not**.

Membership operators include **in** and **not in**.

Bitwise operators include &, |, ^, ~, <<, and >>.

You can find tables of operators, their descriptions, and examples on [this webpage](https://www.w3schools.com/python/python_operators.asp).
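The following sketch shows one or two operators from several of these groups. The values and variable names are arbitrary.

```python
# Arithmetic
remainder = 7 % 3          # remainder of the division
power = 2 ** 3             # exponentiation

# Assignment
counter = 10
counter += 5               # same as: counter = counter + 5

# Comparison and identity
list_a = [1, 2, 3]
list_b = [1, 2, 3]
list_c = list_a
equal_values = list_a == list_b   # same contents
same_object = list_a is list_b    # different list objects
same_alias = list_a is list_c     # both names reference one object

# Logical
in_range = counter > 0 and counter < 100

# Bitwise (6 is 110 in binary, 3 is 011)
bit_and = 6 & 3            # 010
bit_xor = 6 ^ 3            # 101

print(remainder, power, counter, equal_values, same_object,
      same_alias, in_range, bit_and, bit_xor)
```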

### Special Operators

Python also has unpacking operators, which are used to extract elements of containers. The dictionary unpacking operator is `**`. In ODC code, it is often used to pass the same common parameters to multiple invocations of a function without repeating them. For example:

In [21]:
def myfunc(a,b,c):
    return a,b,c

common_params = {'b':2, 'c':3}

print(f'myfunc(1, **common_params): {myfunc(1, **common_params)}')
print(f'myfunc(10, **common_params): {myfunc(10, **common_params)}')

myfunc(1, **common_params): (1, 2, 3)
myfunc(10, **common_params): (10, 2, 3)
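Sequences such as lists and tuples have a similar unpacking operator, `*`. The sketch below redefines the same `myfunc` so that it runs on its own. The name `positional_params` is made up for this illustration.

```python
def myfunc(a, b, c):
    return a, b, c

# "*" unpacks the list elements into positional arguments.
positional_params = [2, 3]
result = myfunc(1, *positional_params)

# "*" can also collect the remaining elements in an assignment.
first, *rest = [1, 2, 3, 4]

print(result, first, rest)
```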


## Control Flow

Control flow is the order in which statements such as `a = 5` are executed.

A control flow statement is a statement that determines which of two or more paths of execution (sequence of statements) to follow.

### Branching

The most basic control flow structure is the `if` statement.

The code in an `if` statement only executes if its condition is `True`. For example:

In [22]:
a = 5
b = 3
print(f'a, b: {a, b}')
if a < b:
    print('a is less than b')
if a == b:
    print('a equals b')
if a > b:
    print('a is greater than b')

a, b: (5, 3)
a is greater than b


The `if` statement also has optional `elif` and `else` blocks.

An `else` block runs only if none of the preceding `if` or `elif` blocks ran. 

`elif` stands for "else, if". An `elif` block runs only if none of the preceding `if` or `elif` blocks ran **and** its own condition is `True`.

Using `elif` and `else`, the preceding example can also be written as follows:

In [23]:
a = 5
b = 3
print(f'a, b: {a, b}')
if a < b:
    print('a is less than b')
elif a == b:
    print('a equals b')
else:
    print('a is greater than b')

a, b: (5, 3)
a is greater than b


### Looping

A loop is a block of code that repeats until a condition is met.

#### While loop

A `while` loop only repeats as long as its conditional expression is `True`. The following example adds `1` to the variable `a` until it has the value `5`:

In [24]:
a = 1
while a < 5:
    a += 1
print("value after while loop:", a)

value after while loop: 5


#### For loop

A `for` loop iterates over the elements of an iterable object, such as a list, tuple, dictionary, set, or string. The following example prints each element of a list on its own line:

In [25]:
mylist = [1,2,3]
for x in mylist:
    print(x)

1
2
3
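The same syntax works for other iterable types. As a brief sketch, the following loops iterate over the characters of a string and the elements of a tuple.

```python
letters = []
for character in "abc":       # a string yields one character at a time
    letters.append(character)

tuple_total = 0
for number in (1, 2, 3):      # a tuple yields its elements in order
    tuple_total += number

print(letters, tuple_total)
```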


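A `for` loop iterates over the elements of a list in order. Sets are unordered, so the order of iteration over a set is determined by how Python stores its values internally - not by the order in which they were added - and should not be relied on.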

For dictionaries, `for` loops iterate over keys by default. To iterate over the values instead, use `mydictionary.values()` as the object to iterate. To iterate over both keys and values, use `mydictionary.items()`. For example:

In [26]:
mydictionary = {1:2, 3:4}
print("Keys:")
for key in mydictionary:
    print(key)
print("Values:")
for value in mydictionary.values():
    print(value)
print("Key-Value Pairs:")
for key, value in mydictionary.items():
    print(key, value)

Keys:
1
3
Values:
2
4
Key-Value Pairs:
1 2
3 4


#### Comprehensions

Comprehensions in Python are a convenient way to construct new containers (such as lists, sets, and dictionaries) from other iterable objects.

Comprehensions use `for` loop syntax inside the data type's enclosing characters (`[]` for lists and `{}` for sets and dictionaries). The format of comprehension expressions is `output for input in object`. For example:

In [27]:
mylist = [1,2]
newlist = [x*2 for x in mylist]
print(f'newlist: {newlist}')

myset = {1,2}
newset = {x*2 for x in myset}
print(f'newset: {newset}')

mydictionary = {1:2}
newdictionary = {key*2:value*2 for key, value in mydictionary.items()}
print(f'newdictionary: {newdictionary}')

newlist: [2, 4]
newset: {2, 4}
newdictionary: {2: 4}
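A comprehension can also filter its input by adding an `if` condition at the end, giving the format `output for input in object if condition`. The following is a small sketch with arbitrary numbers.

```python
numbers = [1, 2, 3, 4, 5, 6]

# Keep only the even numbers, then square each of them.
even_squares = [x ** 2 for x in numbers if x % 2 == 0]

print(f'even_squares: {even_squares}')
```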


## Commonly Used Functions

This section describes some of the most commonly used functions.

### Sorting

To sort sequences, use the `sorted()` function, which returns a new sorted list and leaves the original object unchanged. For example:

In [28]:
mylist = [3,2,1]
sorted(mylist)

[1, 2, 3]
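The `sorted()` function also accepts the optional `reverse` and `key` parameters. The sketch below uses arbitrary example data to sort in descending order and to sort strings by their length.

```python
mylist = [3, 1, 2]

ascending = sorted(mylist)
descending = sorted(mylist, reverse=True)

# "key" is a function applied to each element to get its sort value.
words = ["banana", "fig", "apple"]
by_length = sorted(words, key=len)

print(ascending, descending, by_length)
print("Original list is unchanged:", mylist)
```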

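### Copy

We often want to copy objects to change only a few values in them without changing the original objects.

Many object types - such as lists, sets, and dictionaries - have `copy()` methods. For example:

In [29]:
{1}.copy()

{1}

The `copy()` method creates a shallow copy: the new container is a separate object, but its elements still reference the same objects as the original. For containers that hold other mutable objects (such as lists of lists), it is safer to use the `deepcopy()` function from the `copy` module in the Python standard library, which also copies the nested objects.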

In the following example, `list2 = list1` makes the `list2` variable reference the same list as the `list1` variable, so changing the value of element `0` to the value `0` applies to both variables.

In [30]:
# Without copying
list1 = [1,2,3]
list2 = list1
list2[0] = 0
print('Without copying (variables reference same list)')
print(f'list1: {list1}')
print(f'list2: {list2}')

Without copying (variables reference same list)
list1: [0, 2, 3]
list2: [0, 2, 3]


In contrast, using `copy.deepcopy()` results in only `list2` being changed.

In [31]:
import copy
list1 = [1,2,3]
list2 = copy.deepcopy(list1)
list2[0] = 0
print('With copying (variables reference different lists)')
print(f'list1: {list1}')
print(f'list2: {list2}')

With copying (variables reference different lists)
list1: [1, 2, 3]
list2: [0, 2, 3]
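The difference between `copy()` and `copy.deepcopy()` only shows up when a container holds other mutable objects. The following sketch uses a nested list with made-up variable names to illustrate it.

```python
import copy

nested = [[1, 2], [3, 4]]
shallow = nested.copy()          # new outer list, same inner lists
deep = copy.deepcopy(nested)     # new outer list and new inner lists

# Changing an inner list through the shallow copy also changes "nested",
# because both outer lists reference the same inner list objects.
shallow[0][0] = 0

print(f'nested:  {nested}')
print(f'shallow: {shallow}')
print(f'deep:    {deep}')
```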


### Length

The number of elements in a container or the size of an object like a string (the number of characters) can be determined with the `len()` function. For example:

In [32]:
mylist = [1,2,3]
print(f'Number of elements in {mylist}: {len(mylist)}')

mystr = "String"
print(f'Number of elements in "{mystr}": {len(mystr)}')

Number of elements in [1, 2, 3]: 3
Number of elements in "String": 6


### Range

Sometimes we want values within a range. The `range()` function generates the integer values within a specified range at some stride (as in slicing).

For example, the following cell lists all values, the even values, and the odd values between 0 (inclusive) and 10 (exclusive):

In [33]:
print(f'all values between 0 (inclusive) and 10 (exclusive): {list(range(0,10))}')
print(f'even values between 0 (inclusive) and 10 (exclusive): {list(range(0,10,2))}')
print(f'odd values between 0 (inclusive) and 10 (exclusive): {list(range(1,10,2))}')

all values between 0 (inclusive) and 10 (exclusive): [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
even values between 0 (inclusive) and 10 (exclusive): [0, 2, 4, 6, 8]
odd values between 0 (inclusive) and 10 (exclusive): [1, 3, 5, 7, 9]
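In practice, `range()` is most often used directly in a `for` loop rather than converted to a list. The sketch below also shows a negative stride, which counts downwards. The variable names are illustrative.

```python
squares = []
for i in range(5):               # a single argument is the exclusive upper bound
    squares.append(i * i)

# A negative stride counts down from the start value.
countdown = list(range(10, 0, -2))

print(squares, countdown)
```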


## Importing Packages

Any imported resources, such as functions, are usually imported at the beginning of a notebook. We import resources with the `import` keyword. Below is an example of such a code cell.

In [34]:
# Python built-in packages #
import sys
import os

# External packages #
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np # Numerical processing
import xarray as xr # Coordinate indexed arrays
import pandas as pd # Tabular data structures for data analysis
import datacube # Facilitates loading data from the Data Cube

# Utilities #
from utils.data_cube_utilities.dc_display_map import display_map

There are typically 3 kinds of package imports:
* built-in
* external
* local

**Built-in** packages come with Python. Examples are `sys` and `os`, which allow access to the environment outside the Python interpreter, such as environment variables (`os.environ`) and the system path (`sys.path`). In notebooks like this one, these packages are used to append the path of the root directory containing notebooks (defined by the `NOTEBOOK_ROOT` environment variable) to the system path. This directory also contains a directory called `utils`, which contains Python files from which we import functions, so adding it to the system path allows the Python utility files in `utils` to be found.
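A minimal sketch of appending that directory to the system path might look like the following. The fallback to the current working directory when `NOTEBOOK_ROOT` is not set is an assumption made for this sketch, not part of the original notebook.

```python
import os
import sys

# NOTEBOOK_ROOT is the environment variable named in the text above.
# Falling back to the current directory is an assumption for this sketch.
notebook_root = os.environ.get("NOTEBOOK_ROOT", os.getcwd())

# Append the directory only once so repeated runs do not add duplicates.
if notebook_root not in sys.path:
    sys.path.append(notebook_root)

print("Notebook root on the system path:", notebook_root in sys.path)
```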

**External** packages are obtained from package repositories. Some very common packages to import include `matplotlib`, `numpy`, `xarray`, and `pandas`, which we will discuss in future sections. For now, just know that they are common to import.

**Local** packages are stored on the filesystem as Python files (`.py`). These are often called "utilities". In this example, the `display_map()` function is imported from the `/notebooks/utils/data_cube_utilities/dc_display_map.py` file.