# Python Introduction <img align="right" src="../resources/ama_logo.jpg" width=250 height=250>

This section introduces Python, the programming language used in Open Data Cube (ODC) code.

## Background
Before we use the ODC in code, we first need to understand the programming language that the ODC runs in.

For a more detailed Python programming tutorial, see [this tutorial](https://www.w3schools.com/python/python_intro.asp).

## Description

Python topics covered include:

* syntax
* variables
* data types
* operators
* control flow
* commonly used functions
* importing packages
***

## Syntax

### Indentation
Indentation refers to the spaces at the beginning of a code line.

In many other programming languages, indentation is for readability only. In Python, indentation is part of the syntax.

Python uses indentation to indicate a block of code.

The following **`if`** block will always run. We will discuss control flow constructs like **`if`** later in this section. Some other things that can constitute code blocks are function and class definitions.

In [1]:
if 5 > 2:
  print("Five is greater than two!")

Five is greater than two!
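
As a small sketch of how indentation defines nested blocks, the example below puts an `if` block inside a function block. The function name `describe` is hypothetical and is used only for illustration.

```python
def describe(number):
    # Everything indented under "def" belongs to the function block.
    if number > 2:
        # Indented one level further: this line belongs to the "if" block.
        return "greater than two"
    # Back at the function's indentation level: reached when the "if" test is False.
    return "not greater than two"

print(describe(5))
print(describe(1))
```

Removing the indentation from either `return` line would change which block it belongs to, or raise an `IndentationError`.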


### Comments

Code comments can be used to explain code or avoid running some code.

Single-line comments begin with the **`#`** character.

Multiline comments begin and end with **`'''`**.

In [5]:
# This is a comment.
'''
This is a multiline comment.
'''

'\nThis is a multiline comment.\n'

Note that multiline comments are actually strings. Because the string is the last expression in the cell above, the notebook displays its value as the cell output.

## Variables

A variable is a name that references an object. An object is a collection of attributes and actions. For example, a number has a value (an attribute) and can be used in arithmetic expressions with other numbers (actions).

Note that a variable does not have a fixed type, but objects do.
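
A minimal sketch of this idea, using the built-in `type()` function to inspect the object a name currently references. The variable names are chosen for illustration only.

```python
value = 5
first_type = type(value)    # the object 5 is an int

value = "five"              # the same name now references a string object
second_type = type(value)

print(first_type, second_type)   # <class 'int'> <class 'str'>
```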

### Creation

A variable is created the first time it has a value assigned to it.

For example, the following cell defines a variable **`a`** that references the object `5`. 

In [6]:
a = 5

### Naming rules

There are some naming rules for variables.

* Variable names can only contain alphanumeric characters and underscores (a-z, A-Z, 0-9, and _).
* Variable names cannot start with a digit.
* Variable names are case-sensitive (age, Age and AGE are three different variables).

For example, `2a = 5` is invalid syntax, but `a2 = 5` is valid syntax.
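
One way to check a candidate name without triggering a syntax error is the string method `isidentifier()`, sketched below. Note two caveats: Python 3 also accepts many non-ASCII letters in names, and reserved words such as `for` pass this check even though they cannot be used as variable names, which is why the sketch also uses `keyword.iskeyword()`.

```python
import keyword

candidates = ["a2", "2a", "my_var", "my-var"]
results = {name: name.isidentifier() for name in candidates}
for name, is_valid in results.items():
    print(name, is_valid)

# "for" looks like a valid name but is reserved by the language.
print("for".isidentifier(), keyword.iskeyword("for"))   # True True
```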

### Multiple assignment

Multiple variables can be assigned values in 1 step (or "statement"), as shown in the following example.

In [11]:
a,b,c = 1,2,3
print(a,b,c)

1 2 3


## Data types

There are many data types that objects can have in Python.

In the variable declaration example above, the variable `a` was assigned the value `5`, which is an integer.

There are built-in types and user-defined types in Python.

We will only discuss built-in types in this section.

### Built-in types

Built-in types are part of Python itself.

Types, whether built-in or user-defined, can be grouped into 2 collections: normal types and containers. Instances of all types are Python objects.

#### Normal Types

Normal types are any types that do not primarily serve as containers of other objects. They can contain other objects and even containers, but containing objects is not their primary purpose.

The types listed here are some of the most commonly used normal types.

>##### Numbers

The 2 most commonly used types of numbers in Python are integers, such as `5`, and floats, such as `1.5`. Python also has a built-in `complex` type, which is not covered here.

Integers have the type `int` and floats have the type `float`.
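
The short sketch below shows both number types, and how mixing them or dividing them affects the type of the result. The variable names are illustrative.

```python
whole = 5
fraction = 1.5
print(type(whole).__name__)      # int
print(type(fraction).__name__)   # float

# Mixing an int and a float in arithmetic produces a float.
total = whole + fraction
print(total, type(total).__name__)   # 6.5 float

# "/" always returns a float; "//" is floor division.
print(7 / 2)    # 3.5
print(7 // 2)   # 3
```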

>##### Strings

Strings can be defined with single quotes (`''`), double quotes (`""`), or as a multiline string with triple quotes (`''' '''` or `""" """`). For example:

In [32]:
mystr = "String"
print(f"mystr: {mystr}")

multiline_str = '''\
First line
Second line\
'''
# \n is the "new line" character
print(f"Multiline string: \n{multiline_str}")
# \ at the end of a line is a line continuation character in Python.
# Without these characters, each line break in the multiline string
# becomes part of the string itself. In the next example,
# the line continuation characters are removed, so the
# string has a newline at the beginning and end.
multiline_str_no_line_continuation = '''
First line
Second line
'''
print("Multiline string without using "
      f"line continuation characters: \n\
        {multiline_str_no_line_continuation}")

mystr: String
Multiline string: 
First line
Second line
Multiline string without using line continuation characters: 
        
First line
Second line



The strings that begin with `f` and contain expressions in curly brackets (`{}`) are called "format strings", or "f-strings". You can read more about them [here](https://docs.python.org/3/tutorial/inputoutput.html#formatted-string-literals).

A character in a string can be accessed by its index. Multiple characters can be selected with a slice, which has the format `[low_index:high_index:stride]`. Here, `low_index` is the starting index (default: `0`), `high_index` is the index after the last index to include (default: the number of elements, which is 1 more than the highest index; the elements are characters for strings), and `stride` is the number of positions to move for each element (default: `1`). For example:

In [38]:
print(f"First character in mystr: {mystr[0]}")
print(f"First 3 characters in mystr: {mystr[:3]}")
print(f"Characters in mystr with even indices ([0,2,4]): {mystr[::2]}")
print(f"Characters in mystr with odd indices ([1,3,5]): {mystr[1::2]}")

First character in mystr: S
First 3 characters in mystr: Str
Characters in mystr with even indices ([0,2,4]): Srn
Characters in mystr with odd indices ([1,3,5]): tig


Strings can be combined, or "concatenated", by adding them with the `+` operator.

In [40]:
substring1 = "Hello, "
substring2 = "World"
substring1 + substring2

'Hello, World'

You can check if a string contains another string with the `in` operator:

In [41]:
"Hello" in "Hello, World"

True

>##### Booleans

Booleans represent one of two values: `True` or `False`.

Comparison operators return Boolean values. For example:

In [43]:
print(f'10 > 9: {10 > 9}')
print(f'10 == 9: {10 == 9}')
print(f'10 < 9: {10 < 9}')

10 > 9: True
10 == 9: False
10 < 9: False


#### Containers

Container types are any types that primarily serve as containers of other objects.

The objects that a container contains are called the container's elements.

The types listed here are some of the most commonly used container types.

>##### Lists

Lists are ordered (not necessarily sorted) collections of objects. 

The elements do not need to have the same type. 

Lists have the type `list` and can be created like this: `[value1, value2, ...]`. For example:

In [51]:
mylist = [1,2,3]

Elements can be added to the list with the `append()` method and removed with the `remove()` method:

In [53]:
mylist.append(4)
print("List after append:", mylist)
mylist.remove(4)
print("List after remove:", mylist)

List after append: [1, 2, 3, 4]
List after remove: [1, 2, 3]


Elements can be accessed and changed by their position in the list, with the first element having the index `0`.

In [45]:
print("First element:", mylist[0])
mylist2 = mylist.copy()
mylist2[0] = 5
print("First element after change:", mylist2[0])

First element: 1
First element after change: 5
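
The cell above calls `copy()` before changing an element so that the original list is left untouched. The sketch below, with illustrative variable names, contrasts a plain assignment (which only creates a second name for the same list) with `copy()`.

```python
original = [1, 2, 3]
alias = original              # a second name for the same list object
duplicate = original.copy()   # a new list with the same elements

alias[0] = 5        # also visible through "original"
duplicate[1] = 9    # does not affect "original"

print(original)    # [5, 2, 3]
print(duplicate)   # [1, 9, 3]
print(alias is original, duplicate is original)   # True False
```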


You can also merge lists with the `extend()` method, which modifies the list in place, or with the `+` operator, which creates a new list.

In [48]:
mylist3 = mylist.copy()
mylist3.extend(mylist2)
mylist3

[1, 2, 3, 5, 2, 3]

In [49]:
mylist + mylist2 

[1, 2, 3, 5, 2, 3]

>##### Tuples

Tuples are like lists, but they are immutable: elements cannot be added, removed, or replaced.

Tuples are often used where lists are used, but using a tuple clarifies that its elements are constant.

The elements do not need to have the same type.

Tuples have the type `tuple` and can be created like this `(value1, value2, ...)`. To create a tuple with a single item, put a comma after the single element. For example:

In [59]:
mytuple = (1,2,3)
single_element_tuple = (1,)
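
As a brief sketch of these two points, the example below shows that replacing a tuple element raises a `TypeError`, and that parentheses without a trailing comma do not create a tuple. The variable names are illustrative.

```python
point = (1, 2, 3)
try:
    point[0] = 5
except TypeError as error:
    message = str(error)
    print("Cannot change a tuple:", message)

not_a_tuple = (1)    # just the integer 1 in parentheses
single = (1,)        # a tuple with one element
print(type(not_a_tuple).__name__, type(single).__name__)   # int tuple
```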

>##### Dictionaries

Dictionaries map values ("keys") to other values ("values"). A single pair of a key and its value is called an item. Dictionaries are often called maps or mappings in other programming languages.

Dictionaries have the type `dict` and can be created like this `{key1:value1, key2:value2, ...}`.

For example, the number of occurrences of values in a list could be represented as a map of the values to the number of occurrences of those values. For the list `[1,1,2,2,2,3,3,4]`, this dictionary would be `{1:2, 2:3, 3:2, 4:1}`.
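
One possible way to build that dictionary of occurrences is sketched below. It uses a `for` loop, which is covered later in this section, and the `get()` method, which returns a default value (here `0`) when the key is not yet present.

```python
values = [1, 1, 2, 2, 2, 3, 3, 4]

counts = {}
for value in values:
    # Start from 0 the first time a value is seen, then add 1.
    counts[value] = counts.get(value, 0) + 1

print(counts)   # {1: 2, 2: 3, 3: 2, 4: 1}
```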

Using indexing syntax, a dictionary value can be accessed or changed by its key and a dictionary item can be added and removed by its key (using the `del` keyword). For example:

In [57]:
mydictionary = {1:2, 2:3, 3:2, 4:1}

# Accessing values
print(f'Value for key 1: {mydictionary[1]}')

# Changing values
mydictionary[1] = 5
print('mydictionary after changing the value of key 1 to 5:\n'
      f'{mydictionary}')

# Adding new item
mydictionary[5] = 0
print('mydictionary after adding item with key 5:\n'
      f'{mydictionary}')

# Removing item
del mydictionary[5]
print('mydictionary after removing item with key 5:\n'
      f'{mydictionary}')

Value for key 1: 2
mydictionary after changing the value of key 1 to 5:
{1: 5, 2: 3, 3: 2, 4: 1}
mydictionary after adding item with key 5:
{1: 5, 2: 3, 3: 2, 4: 1, 5: 0}
mydictionary after removing item with key 5:
{1: 5, 2: 3, 3: 2, 4: 1}


>##### Sets

Sets are unordered collections of unique values.

Sets have the type `set` and can be created like this `{value1, value2, ...}`.

Elements can be added to a set with the `add()` method and removed with the `remove()` method. A logical union or intersection of elements from other containers (not just sets) can be obtained with the `union()` and `intersection()` methods.

In [73]:
myset = {1}
otherset = {1,2,3}
print(f'myset: {myset}')
print(f'otherset: {otherset}\n')

# Adding new element
myset.add(2)
print(f'myset after adding 2: {myset}')

# Removing element
myset.remove(2)
print(f'myset after removing 2: {myset}')

# Union and Intersection
print(f'union of myset and otherset: {myset.union(otherset)}')
print(f'intersection of myset and otherset: {myset.intersection(otherset)}')

myset: {1}
otherset: {1, 2, 3}

myset after adding 2: {1, 2}
myset after removing 2: {1}
union of myset and otherset: {1, 2, 3}
intersection of myset and otherset: {1}
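
The sketch below illustrates two points from the text: sets keep only unique values, and `union()` and `intersection()` accept other containers such as lists and tuples. It uses the built-in `set()` function to build a set from a list. The variable names are illustrative.

```python
numbers = [1, 1, 2, 2, 2, 3, 3, 4]
unique_numbers = set(numbers)   # duplicates are dropped
print(unique_numbers)

# The arguments do not have to be sets.
combined = {1}.union([2, 3], (3, 4))
common = {1, 2, 3}.intersection([2, 3, 5])
print(combined)
print(common)
```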


### Type checking

TODO
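
As a minimal sketch of type checking, the built-in functions `type()` and `isinstance()` can be used to inspect the type of an object. The variable names are illustrative.

```python
value = 5

is_exactly_int = type(value) is int              # exact type comparison
is_int = isinstance(value, int)                  # also True for subclasses of int
is_number = isinstance(value, (int, float))      # check against several types
text_is_int = isinstance("5", int)

print(is_exactly_int, is_int, is_number, text_is_int)   # True True True False
```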

### Type casting

TODO
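
As a minimal sketch of type casting, calling a type such as `int`, `float`, `str`, or `list` with an object creates a new object of that type when the conversion is possible. The variable names are illustrative.

```python
number_from_text = int("5")
float_from_text = float("1.5")
truncated = int(1.9)                # drops the fractional part
text_from_number = str(5)
list_from_tuple = list((1, 2, 3))

print(number_from_text + 1)         # 6
print(float_from_text)              # 1.5
print(truncated)                    # 1
print(text_from_number + "5")       # 55
print(list_from_tuple)              # [1, 2, 3]

# Conversions that make no sense raise an error.
try:
    int("five")
    conversion_failed = False
except ValueError:
    conversion_failed = True
    print("'five' cannot be converted to an integer")
```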

## Operators

Operators are symbols or keywords used to perform operations on variables and values.

### Unary

Some operators only operate on 1 value. These are called unary operators.

TODO: Discuss (1) the `not` keyword, (2) dictionary unpacking - particularly its use in specifying reusable sets of arguments for functions - (3) the `len()` function.
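
A rough sketch of the three items listed above follows. The function `load` and its parameters are hypothetical and are not part of the ODC API; they only illustrate how dictionary unpacking (`**`) can pass a reusable set of arguments to a function.

```python
# (1) "not" inverts a Boolean value.
is_raining = False
print(not is_raining)   # True

# (2) Dictionary unpacking passes each item as a keyword argument.
def load(product, resolution, output="xarray"):
    # Hypothetical function used only for this illustration.
    return f"{product} at {resolution} m as {output}"

common_args = {"product": "example_product", "resolution": 30}
first_call = load(**common_args)
second_call = load(**common_args, output="numpy")
print(first_call)    # example_product at 30 m as xarray
print(second_call)   # example_product at 30 m as numpy

# (3) len() returns the number of elements in a string or container.
print(len("String"), len([1, 2, 3]), len({1: 2, 2: 3}))   # 6 3 2
```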

### Binary

TODO: Mention =, +, -, \*, /, %, \*\*, //, and how there are assignment convenience operators like +=
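
A minimal sketch of the operators named above, using illustrative variable names:

```python
a = 7    # "=" assigns a value to a name
b = 2

print(a + b)    # 9    addition
print(a - b)    # 5    subtraction
print(a * b)    # 14   multiplication
print(a / b)    # 3.5  division (always returns a float)
print(a // b)   # 3    floor division
print(a % b)    # 1    remainder (modulo)
print(a ** b)   # 49   exponentiation

# Assignment convenience operators combine an operation with assignment.
total = 10
total += 5      # same as: total = total + 5
total *= 2      # same as: total = total * 2
print(total)    # 30
```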


## Control flow

### Branching

TODO

### Looping

#### While loop

TODO

#### For loop

TODO

#### Foreach loop

>##### lists and sets (lists iterate in order, sets not)

TODO

>##### Dictionaries (default iterates over keys, use items() for both, values() for values)

TODO
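
A minimal sketch of the three ways of iterating over a dictionary named in the heading above. The dictionary reuses the occurrence counts from the Dictionaries section. Dictionaries iterate in insertion order in Python 3.7 and later.

```python
occurrences = {1: 2, 2: 3, 3: 2, 4: 1}

# Iterating over a dictionary directly yields its keys.
for key in occurrences:
    print("key:", key)

# values() yields only the values.
for value in occurrences.values():
    print("value:", value)

# items() yields (key, value) pairs, unpacked here into two names.
for key, value in occurrences.items():
    print(f"{key} occurs {value} times")
```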

#### Comprehensions

TODO: same mechanics as foreach loops, supports lists, dictionaries, sets
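
A minimal sketch of list, dictionary, and set comprehensions, using illustrative variable names. Each comprehension is a compact form of a loop that builds a new container.

```python
numbers = [1, 2, 3, 4]

# List comprehension: one output element per input element.
squares_list = [n ** 2 for n in numbers]
# An optional "if" keeps only the elements that pass the test.
even_squares = [n ** 2 for n in numbers if n % 2 == 0]
# Dictionary comprehension: "key: value" before the "for".
squares_dict = {n: n ** 2 for n in numbers}
# Set comprehension: duplicates collapse into unique values.
remainders = {n % 2 for n in numbers}

print(squares_list)   # [1, 4, 9, 16]
print(even_squares)   # [4, 16]
print(squares_dict)   # {1: 1, 2: 4, 3: 9, 4: 16}
print(remainders)
```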


## Commonly Used Functions

### Sorting

TODO: sorted(obj)
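
A minimal sketch of the built-in `sorted()` function, which returns a new sorted list and leaves its argument unchanged. The `reverse` and `key` parameters are standard parameters of `sorted()`. The variable names are illustrative.

```python
unsorted = [3, 1, 2]
ascending = sorted(unsorted)
descending = sorted(unsorted, reverse=True)
print(ascending)    # [1, 2, 3]
print(descending)   # [3, 2, 1]
print(unsorted)     # [3, 1, 2] (the original list is unchanged)

# "key" is a function applied to each element to get the value to sort by.
words = ["pear", "fig", "banana"]
by_length = sorted(words, key=len)
print(by_length)    # ['fig', 'pear', 'banana']
```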

### Copy

TODO: import copy; copy.deepcopy() - explain shallow vs deep copying
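
A minimal sketch of the difference between shallow and deep copying with the standard `copy` module. A shallow copy creates a new outer container that still references the same inner objects, while a deep copy also copies the inner objects. The variable names are illustrative.

```python
import copy

nested = [[1, 2], [3, 4]]
shallow = copy.copy(nested)       # new outer list, same inner lists
deep = copy.deepcopy(nested)      # new outer list and new inner lists

nested[0][0] = 99                 # change an inner list
print(shallow[0][0])              # 99 (the inner list is shared)
print(deep[0][0])                 # 1  (the inner list was copied)

nested.append([5, 6])             # change the outer list
print(len(shallow), len(deep))    # 2 2 (neither copy sees the new element)
```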

## Importing packages

Any imported resources, such as functions, are usually imported at the beginning of a notebook. We import resources with the `import` keyword. Below is an example of such a code cell.

TODO: Insert code cell of imports here.
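
A minimal sketch of what such an import cell might look like is given below. The `..` path, the module name `some_module`, and the function name `some_function` are assumptions made for illustration. The external and local imports are left as comments because they only work when those packages are installed or present on the filesystem.

```python
# Packages that come with Python.
import os
import sys

# Assumed location of the root directory: one level above the current directory.
notebooks_root = os.path.abspath("..")
if notebooks_root not in sys.path:
    sys.path.append(notebooks_root)

# Packages obtained from a package repository (must be installed first):
# import numpy as np
# import xarray as xr

# Python files on the filesystem (hypothetical module and function names):
# from utils.some_module import some_function
```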

There are typically 3 kinds of package imports:
* built-in
* external
* local

**Built-in** packages come with Python. Examples include `sys` and `os`, which allow access to the environment outside the Python interpreter, such as environment variables (`os.environ`), and to the module search path (`sys.path`). In this case, we are appending the path to the root directory containing notebooks to `sys.path` because it also contains a directory called `utils`, which contains Python files from which we import functions.

**External** packages are obtained from package repositories. Some very common packages to import include `matplotlib`, `numpy`, `xarray`, and `pandas`, which we will discuss in future sections. For now, just know that they are common to import.

**Local** packages are stored on the filesystem as Python files (`.py`). These are often called "utilities".