<a href="https://colab.research.google.com/github/andersknudby/Remote-Sensing/blob/master/Chapter_2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Chapter 2 - Variable manipulation and more on arrays and lists
## Integers
Integers may be the easiest kind of variable to deal with – you know exactly what you get when you write things like:

In [None]:
a = 2
b = 2
c = a + b
d = a - b

##### Note that in a notebook you can print a variable by just writing its name on a line.

In [None]:
c

##### But it only works for the last one you write:

In [None]:
c
d

##### To print the content of multiple variables you still use the print function, which you also need if you work outside notebooks:

In [None]:
print(c)
print(d)

##### Or even better (the "str" function is used to convert a number to a string, which is needed to print it together with another string):

In [None]:
print("c is: " + str(c))
print("d is: " + str(d))
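
As a possible alternative to combining strings with "+" and "str", Python 3.6 and later also support formatted string literals (f-strings). The sketch below repeats the variable definitions from above so that it runs on its own; the names message_c and message_d are only used for this illustration.

```python
a = 2
b = 2
c = a + b
d = a - b

# An f-string converts the values inside the curly brackets to text for you,
# so no call to str() is needed
message_c = f"c is: {c}"
message_d = f"d is: {d}"
print(message_c)
print(message_d)
```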

## Floats
##### Floats (floating point numbers) are what most people call numbers with decimals, like 0.4, -2.8, or 3.1415926. To make sure you define a number as a float rather than an integer, simply add a decimal value to it, as in the examples below. Floats also act pretty much like you expect they would, as you can verify with the following code:

In [None]:
a = 2.0
b = 0.3
c = a + b
print(c)

##### However, there is one thing to keep in mind when it comes to floats. Some decimal numbers have, in theory, an infinite number of decimals. For example, 10/3 is 3.333333333333 (and so on). But Python floats are stored using 64 bits of computer memory, and while that allows for a lot of decimals to be accurately stored, it is nevertheless a finite amount of memory and so it does not allow us to store the value of 10/3 with perfect precision. This can lead to rounding errors. In most cases such rounding errors are not going to be important, but occasionally they might be. For an illustration, try the following lines of code (this example has comments line by line, which is still easier than using text blocks in some cases):

In [None]:
import math  # ‘math’ is a standard Python library.
a = math.sqrt(2)  # calculate the square root of 2, which is around 1.41
b = pow(a, 2)  # raise a to the power of 2. The value of b should now be 2!
print(b)  # print b with the normal print function.
print("{:10.20f}".format(b))  # print b with 20 decimals.

When I run the code above, the value of b turns out not to be exactly 2: the more detailed print statement shows that the value actually held in computer memory is slightly more than 2. Depending on your Python version, the plain 'print(b)' may show either a rounded 2.0 or the slightly larger value. The difference is tiny, but keep it in mind for later.
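
One practical consequence is that comparing floats with '==' can give surprising answers. A minimal sketch of a more forgiving comparison, using the 'isclose' function from the standard 'math' library (the variable names exactly_two and close_to_two are just for illustration):

```python
import math

a = math.sqrt(2)
b = pow(a, 2)

exactly_two = (b == 2)             # exact comparison, sensitive to rounding errors
close_to_two = math.isclose(b, 2)  # comparison that allows a tiny relative difference
print(exactly_two)
print(close_to_two)
```

The exact comparison fails because of the rounding error, while 'math.isclose' treats the two values as equal.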

Note also that most functions automatically create variables of a certain type. For example, the function math.sqrt always produces a float, which makes sense because most square roots are decimal numbers. Operators (like '+' or '-') produce variables of a type that depends on the input. If you are working with integers, operators will typically give you an integer; if you are working with floats, operators will give you a float; and if you are working with a mix of integers and floats, operators will give you a float, to preserve as much precision as possible.

One exception to the rule above is that division with integers gives you a float, even when it didn't need to. But note that integer division doesn't work like this in Python 2, where dividing two integers gives you an integer, so you need to be careful when dividing integers if you ever run older code. I prefer explicitly using floats (e.g. '5.0' instead of '5') when in doubt.

In [None]:
print(type(4-2))
print(type(4/2))
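
If you actually want a whole-number result from a division, Python 3 has a separate floor division operator, '//'. The following sketch contrasts the two operators (the variable names are chosen for this example only):

```python
true_division = 7 / 2    # '/' always gives a float in Python 3
floor_division = 7 // 2  # '//' rounds down; with two integers the result is an integer
even_division = 4 / 2    # still a float, even though the result is a whole number

print(true_division, type(true_division))
print(floor_division, type(floor_division))
print(even_division, type(even_division))
```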

## Strings
Strings are designed to hold text, and as a result, they operate differently than integers and floats. To assign a piece of text to a variable as a string, simply put it in quotation marks:

In [None]:
myName = "Anders Knudby"
numberOfEyes = "2"
CAD_US_Rate = "0.75"
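
Note that numbers stored in quotation marks are text, not numbers, so operators treat them differently. A short sketch of what that means in practice, and of how the built-in 'int' and 'float' functions convert such text back to numbers (the names joined, eyes_as_int, rate_as_float and total are made up for this illustration):

```python
numberOfEyes = "2"
CAD_US_Rate = "0.75"

joined = numberOfEyes + numberOfEyes  # '+' glues strings together instead of adding them
eyes_as_int = int(numberOfEyes)       # convert the text "2" to the integer 2
rate_as_float = float(CAD_US_Rate)    # convert the text "0.75" to the float 0.75
total = eyes_as_int + eyes_as_int     # now '+' performs a normal addition

print(joined)
print(total)
print(rate_as_float)
```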

A common use of strings is to hold paths to directories and files, such as:

In [None]:
myHomeDir = "C:/Users/Anders/My Documents/"
dataDir = "D:/data/"
image = dataDir + "testimage.tif"
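
Joining paths with '+' works as long as you remember the slash at the end of the directory name. As a hedged alternative, the standard 'os' library has a function, 'os.path.join', that adds a separator only when one is missing. The sketch below reuses the example directory from above:

```python
import os

dataDir = "D:/data/"
# os.path.join adds a path separator only if dataDir does not already end in one
image = os.path.join(dataDir, "testimage.tif")
fileName = os.path.basename(image)  # the file name without the directory part

print(image)
print(fileName)
```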

## Arrays
Arrays are one of the most important variable types we will work with in remote sensing, because they contain structured sequences of (typically) numbers in one or more dimensions. As such, two-dimensional arrays correspond very well to the kind of data we find in a single band of a satellite image, and three-dimensional arrays correspond very well to the kind of data we find in all bands of a satellite image.
The core Python installation doesn’t use arrays, so usually when people refer to arrays in Python what they mean is NumPy arrays. NumPy (Numerical Python) is one of the most commonly used Python libraries because it has lots of good functions to deal with numbers. So, to use arrays, we need to:

In [None]:
import numpy as np

When you ***import*** a library ***as*** something, it changes how you will refer to it later on. In the section on floats, we used the statement *import math*, and then used the 'sqrt' function in the 'math' library by writing 'math.sqrt'. Because we have imported numpy ***as*** np here, we no longer need to write 'numpy.array' to use NumPy’s array function; instead we can simply write 'np.array'. Importing NumPy as np is standard practice, so you might as well get used to it.

To create a NumPy array, you can use different functions. Try the following by pressing Ctrl+Enter in each code block (we won’t keep on putting things into variables, so in this example you will get a more immediate idea of what each command does):

In [None]:
np.array([1, 2, 3])  # Note the curved brackets, followed by the square brackets.

In [None]:
np.array([1, 2.0, 3])  # If one or more numbers in a NumPy array is a float, all numbers become floats

In [None]:
np.array(([1, 2], [3, 4]))  # This structure creates a two-dimensional array, with two rows and two columns
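
To check what kind of array you have created, every NumPy array carries a few descriptive attributes. A small sketch, using the arrays from the code blocks above (the variable names band and mixed are only for this example):

```python
import numpy as np

band = np.array(([1, 2], [3, 4]))
print(band.shape)  # the size of each dimension: (rows, columns)
print(band.ndim)   # the number of dimensions
print(band.dtype)  # the type of data stored in the array

mixed = np.array([1, 2.0, 3])
print(mixed.dtype)  # a float type, because one of the inputs was a float
```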

Sometimes you might want to create an empty array that you can later put some results into. Here are some options:

In [None]:
np.zeros(4)

In [None]:
np.ones(5)

In [None]:
np.empty(7)

Note that the np.empty command creates an array with seemingly random numbers in it. Basically all it does is allocate some space in computer memory for a new array, and whatever values the bits in that memory space already have are not changed. Not having to change the values makes this command very fast, but it also means that if, for some reason, you don’t change the values later, you end up with very odd (and random) results. Use with care!
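
The same functions can also create arrays with more than one dimension if you give them a shape in curved brackets. When you want a known starting value other than 0 or 1, 'np.full' is one safer option than 'np.empty'. The sketch below assumes a value of -1.0 as a placeholder, which is an arbitrary choice for illustration:

```python
import numpy as np

results = np.zeros((2, 3))           # two rows and three columns, all set to 0.0
placeholder = np.full((2, 3), -1.0)  # same shape, but every value set to -1.0

print(results)
print(placeholder)
```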

## Array manipulation and indexing
One of the most useful things about arrays is how you can either access individual values in them, or access all values at the same time.
For example, simple arithmetic operations can be carried out very quickly and simply on an entire array at once. Try the following to see how it works:

In [None]:
a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
b = a + 2
print(b)

You can also run operations on two arrays, as long as they have the same shape (NumPy can also combine arrays of different shapes when they are compatible under its broadcasting rules, for example when one of them holds a single value):

In [None]:
a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
b = np.array([6, 2, 9, 10, 1, 3, 8, 7, 5, 4])
print(a + b)
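
As a rough sketch of how NumPy treats arrays of different shapes (its broadcasting rules): a one-dimensional array can be combined with each row of a two-dimensional array when the row length matches, but two one-dimensional arrays of different lengths cannot be combined. The names image, offsets, shifted and mismatch_failed are made up for this example:

```python
import numpy as np

image = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)
offsets = np.array([10, 20, 30])          # shape (3,)
shifted = image + offsets                 # offsets is added to every row of image
print(shifted)

mismatch_failed = False
try:
    np.array([1, 2, 3, 4]) + np.array([1, 2])  # lengths 4 and 2 are not compatible
except ValueError as error:
    mismatch_failed = True
    print("These arrays cannot be combined:", error)
```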

Now, if you want to access a single number in an array, you can do this easily if you know its position in the array because all positions are numbered. However, and this is counter-intuitive to most people at first, **position (index) numbers in Python start at 0**:

In [None]:
a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
b = np.array([6, 2, 9, 10, 1, 3, 8, 7, 5, 4])
print(a[0])  # This prints the number at position 0 (the first position) in array ‘a’
print(a[9])  # This prints the number at position 9 (the tenth position) in array ‘a’

In [None]:
print(a[10])  # Explain to yourself why this gives an error

Make sure you understand why you get the answer you do when you run the code block below.

In [None]:
print(b[5] - a[4])
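
Indexing works the same way in two-dimensional arrays, except that you need one position per dimension: first the row, then the column. A minimal sketch, using a small made-up array called band to stand in for a single band of an image:

```python
import numpy as np

band = np.array([[10, 20, 30], [40, 50, 60]])  # two rows, three columns

top_left = band[0, 0]      # row 0, column 0
bottom_right = band[1, 2]  # row 1, column 2
last_value = band[-1, -1]  # negative positions count backwards from the end

print(top_left, bottom_right, last_value)
```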

NumPy has some nifty functions for working with arrays to find specific values. Try the following:

In [None]:
a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print(np.max(a))  # This returns the maximum value in the array
print(np.argmax(a))  # This returns the position of the maximum value in the array, however…
a = np.array([10, 2, 3, 4, 5, 6, 7, 8, 9, 10])  # Note the changed first value
print(np.argmax(a))  # When there are two or more values that share the distinction of being the maximum, np.argmax only tells you where the first one is!
print(np.where(a==10)[0])  # np.where is a great alternative, which allows you to find all the positions in an array where a condition (in this case a being equal to 10) is fulfilled. More on that later.
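
The '[0]' after np.where may look odd. When called with only a condition, np.where returns a tuple that contains one array of positions for each dimension of the input, so for a one-dimensional array the positions are in the first (and only) element. The sketch below splits the steps up; the names mask, matches and positions are only for illustration:

```python
import numpy as np

a = np.array([10, 2, 3, 4, 5, 6, 7, 8, 9, 10])

mask = (a == 10)          # an array of True/False values, one per element of a
matches = np.where(mask)  # a tuple with one array of positions per dimension
positions = matches[0]    # the positions along the first (and only) dimension

print(mask)
print(matches)
print(positions)
```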

## Lists
Lists are kind of like arrays, but different in important ways:
* Every element in an array must contain the same type of data (which is why all numbers are converted to floats if a single float is present in the array when it is created), but ***lists can contain different types of variables***.
* Arrays have specific sizes that cannot be changed once they have been created (this is both an advantage and a disadvantage), but ***lists can be expanded or shrunk as necessary through the execution of a program***.

To create a list, use square brackets:

In [None]:
myShoppingList = ["Milk", "Bread", "Candy"]

To create an empty list:

In [None]:
myEmptyList = []

As mentioned above, lists can contain different kinds of variables. The following contains an integer, a float, a string, a NumPy array, and another list (note that when you have lists within lists you end up with multiple square brackets):

In [None]:
myCompletelyCrazyList = [1, 4.5346, "a random string", np.zeros(4), ["Milk", "Bread"]]

Of course you can also put existing variables in a list, like so:

In [None]:
myEvenCrazierList = [myShoppingList, myEmptyList, myCompletelyCrazyList]
myEvenCrazierList

You can access individual elements from a list the same way you did with arrays, like so:

In [None]:
myCompletelyCrazyList[2]  # Also for lists, positions start at 0

A common operation on lists is adding something to the end. Say you forgot that you also needed to buy medicine; you could add it to your shopping list with the ‘append’ function:

In [None]:
myShoppingList.append("Medicine")
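
To see the difference between lists and arrays when it comes to size, compare the list 'append' function with NumPy's 'np.append'. The list itself grows, whereas np.append leaves the original array untouched and returns a new, longer one. The array of prices below is invented for this example:

```python
import numpy as np

myShoppingList = ["Milk", "Bread", "Candy"]
myShoppingList.append("Medicine")  # changes the existing list
print(myShoppingList)
print(len(myShoppingList))         # the built-in len function counts the elements

prices = np.array([2.5, 3.0, 1.25])
longerPrices = np.append(prices, 9.99)  # creates a new array with one more value
print(len(prices), len(longerPrices))   # the original array keeps its size
```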

Note that arrays can be converted to lists with the .tolist() command, and lists can be converted to arrays with the np.asarray() command, like so:

In [None]:
import numpy as np
array = np.array([1, 2])
print("As an array: ", end="")  # By default, print statements end in a 'newline' character, but you can modify that using the 'end' argument 
print(array)
aslist = array.tolist()
print("Now as a list: ", end="")
print(aslist)
backtoarray = np.asarray(aslist)
print("And now as an array again: ", end="")
print(backtoarray)

## Dictionaries
A Python dictionary is not much different from a list, except that each element in it has a name, which can make retrieval of individual elements easier. For example, with the following structure, you (the programmer) don’t need to remember ***where*** in the dictionary the "Name" is found, you just need to remember that it’s called "Name", to find out what its value is:

In [None]:
my_dict = {"Instructor": "Anders Knudby", "Name": "Remote Sensing"}
print(my_dict["Name"])
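
Like lists, dictionaries can be changed after they have been created. The following sketch adds a new named element, lists all the names (called keys), and shows the 'get' function, which returns a fallback value instead of an error when a name is missing. The "Chapter" and "Room" names are hypothetical and only used for illustration:

```python
my_dict = {"Instructor": "Anders Knudby", "Name": "Remote Sensing"}

my_dict["Chapter"] = 2                  # add a new named element
keys = list(my_dict.keys())             # all the names in the dictionary
missing = my_dict.get("Room", "unknown")  # "Room" does not exist, so the fallback is used

print(keys)
print(my_dict["Chapter"])
print(missing)
```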

Learning about the different variable types, if you are doing this for the first time, can be a bit overwhelming. Hopefully this table can help underline the main points:

>Integer | Float | String | Array | List | Dictionary
>--- | --- | --- | --- | --- | ---
><div align="center">Whole number | <div align="center">Decimal number | <div align="center">Text | <div align="center">Structured sequence<br/>of numbers | <div align="center">Structured sequence of<br/>any type of variable | <div align="center">Unstructured list of<br/>named variables
><div align="center">Beware of divisions:<br/>'/' always gives a float | <div align="center">Beware of<br/>rounding errors | <div align="center">Take care when storing<br/>numbers as text | <div align="center">Remember that positions<br/>start at 0 | <div align="center">Remember that positions<br/>start at 0 | <div align="center">Positions are replaced by names,<br/>making retrieval easier


## Exercise
1\) Create an integer variable

2\) Create a float variable

3\) Add the two, and put the result into a new variable

4\) Repeat steps 1-3 with new values

5\) You should now have two variables created by addition of one integer and one float. Find out the 'type' of these two variables. Make sure you understand ***why*** they have the type they do.

6\) Create an array that contains the values of those two variables. Make sure to use the variable names when you do it, like myArray = np.array([firstVariable, secondVariable]), instead of myArray = np.array([2, 6]).

7\) Turn your array into a list

8\) Some NumPy functions also work on lists **as long as the lists contain numbers**. Try using the functions 'np.sum', and 'np.mean', on your list. If you get an error when you try the first time then try to figure out why. Search for answers on the Internet. Try different things. Ask a friend. Get it to work.