# Introduction to Python

This is a Jupyter notebook, a convenient way to run Python code interactively in the browser. *Interactively* means that we can stop the process at any time to inspect individual outputs, change variable expressions, or plot intermediate results. This notebook can also be run on ERDA and Google Drive, so you do not need Python installed on your own computer to run the files, although I recommend having a local installation up and running as your primary place of coding. Python and Jupyter notebooks are available for free on all platforms (Windows, Mac, Linux). I recommend installing the [Anaconda distribution](https://www.anaconda.com/products/individual) of Python, which comes with many useful packages pre-installed, as well as editors designed for Python (Spyder and JupyterLab).

This notebook is meant as a very basic introduction to Python, and specifically an introduction to the basic types of objects, how to declare them, and the different methods that can be applied to them. It is not meant as a comprehensive introduction to Python, but rather as a quick guide to get you started.
***


Valentina Espinoza F. (University of Copenhagen)  
10th January 2023 (latest update)

## Variable declaration

Python is an *object-oriented* programming language, which means that everything you create, every variable you declare, is an object. Objects have attributes and methods. Attributes are properties of the object, and methods are functions that can be applied to the object. For example, a *list* is an object: it has properties such as its length (obtained with `len()`), and methods such as `sort`, which sorts the list.
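As a small illustration of the difference between attributes and methods, the sketch below uses a complex number (which stores its real and imaginary parts as attributes) and a list (which offers a `sort` method). The variable names `z` and `numbers` are only chosen for this example.

```python
# A complex number is an object with attributes (data stored on the object)
z = 3 + 4j
print(z.real)   # attribute: the real part
print(z.imag)   # attribute: the imaginary part

# A list is an object with methods (functions attached to the object)
numbers = [3, 1, 2]
numbers.sort()        # method: sorts the list in place
print(numbers)        # [1, 2, 3]
print(len(numbers))   # the length is obtained with the built-in len()
```

Note that attributes are accessed without parentheses, while methods are called with parentheses.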

Python is also a *dynamically typed* programming language, which means that you do not have to declare the type of a variable when you create it. The type of the variable is inferred from the value you assign to it. This is different from *statically typed* languages such as C, C++ and Java, where you have to declare the type of the variable when you create it. 
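A minimal sketch of what dynamic typing looks like in practice: the same name can first refer to an integer and later to a string, with no type declaration in either case. The names `value`, `first_type` and `second_type` are only illustrative.

```python
value = 3                  # Python infers an integer from the assigned value
first_type = type(value)

value = "three"            # the same name can later refer to a string
second_type = type(value)

print(first_type)    # <class 'int'>
print(second_type)   # <class 'str'>
```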

### Numbers (integers and floats)

Below we show how to declare integers and floats:

In [2]:
# Declare an integer
my_int = 1

# Declare a float
my_float = 1.0

Notice how the decimal point tells Python which type of variable it is. Floats are numbers with decimals; integers have none. Having them as two different types is useful both numerically (integers are stored exactly, while floats have finite precision) and coding-wise (some operations only accept integers, and supplying a float will raise an error).
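One operation that only accepts integers is the built-in `range`, which generates a sequence of whole numbers. The sketch below (with the illustrative names `n_int`, `n_float` and `raised`) shows that passing a float raises a `TypeError`, and that converting explicitly with `int()` solves it.

```python
n_int = 3
n_float = 3.0

print(list(range(n_int)))    # [0, 1, 2]

# The same call with a float fails, even though 3.0 "looks like" 3
raised = False
try:
    list(range(n_float))
except TypeError as error:
    raised = True
    print("TypeError:", error)

# Converting the float to an integer explicitly makes the call valid again
print(list(range(int(n_float))))   # [0, 1, 2]
```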

If you want to know the type of a variable, you can check the *variables panel* in your editor, or use the function `type()`:

In [3]:
type(my_int)

int

### Text (strings)

Text, a combination of one or more characters, is called a *string*. Strings are declared using either single or double quotes. The quotes are not part of the string; they only tell Python that the text in between is a string.

In [2]:
# Declare a string
my_string = "Hello World!"
type(my_string)

str
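To illustrate that the quotes are only delimiters, the following sketch declares the same text with single and with double quotes and compares the results. The variable names are chosen only for this example.

```python
single = 'Hello World!'
double = "Hello World!"
print(single == double)   # True: the kind of quote does not change the string

# Using double quotes outside lets us place an apostrophe inside the string
quote = "It's a string"
print(quote)

# The quotes are not counted as characters of the string
print(len(double))   # 12
```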

Strings are useful objects, and have many methods specific to them. You can try the `split` method to split the string wherever a given character appears, or the `replace` method to replace one piece of text with another.

In [7]:
# Split string
my_string.split(" ")    # We split wherever a space appears

['Hello', 'World!']

In [13]:
# Replace character in string
my_string.replace("!", "?")    # We replace every exclamation sign with a question mark

'Hello World?'
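Note that string methods return a new object rather than changing the original string. The sketch below shows this, together with two other standard string methods, `join` and `upper`, which are not used elsewhere in this notebook.

```python
my_string = "Hello World!"

new_string = my_string.replace("!", "?")
print(my_string)    # Hello World!  (the original is unchanged)
print(new_string)   # Hello World?

# join is the counterpart of split: it glues a list of strings together
words = my_string.split(" ")
joined = "-".join(words)
print(joined)              # Hello-World!

print(my_string.upper())   # HELLO WORLD!
```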

If you are unsure about the syntax of a particular function or method, you can always use the `help` command, or the more colorful `?` command (available in Jupyter/IPython):

In [9]:
help(my_string.replace)

Help on built-in function replace:

replace(old, new, count=-1, /) method of builtins.str instance
    Return a copy with all occurrences of substring old replaced by new.
    
      count
        Maximum number of occurrences to replace.
        -1 (the default value) means replace all occurrences.
    
    If the optional argument count is given, only the first count occurrences are
    replaced.



In [3]:
?my_string.replace

Signature: my_string.replace(old, new, count=-1, /)
Docstring:
Return a copy with all occurrences of substring old replaced by new.

  count
    Maximum number of occurrences to replace.
    -1 (the default value) means replace all occurrences.

If the optional argument count is given, only the first count occurrences are
replaced.
Type:      builtin_function_or_method

### Booleans (True and False)

Booleans are variables that can only take two values: `True` or `False`. They are useful for logical operations, such as checking if a condition is met, or if two variables are equal.

In [5]:
# Input your age here
user_age = 20

is_adult = user_age >= 18   # This "condition statement" will return a boolean value (True or False)
is_adult

True
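Booleans can be combined with the logical operators `and`, `or` and `not`, and equality is checked with `==`. The following is a minimal sketch; the variable `has_ticket` is hypothetical and only serves the example.

```python
user_age = 20
has_ticket = False    # hypothetical second condition

is_adult = user_age >= 18
can_enter = is_adult and has_ticket   # True only if both conditions are True

print(can_enter)                # False
print(is_adult or has_ticket)   # True: at least one condition is True
print(not has_ticket)           # True: "not" flips the boolean
print(user_age == 20)           # True: "==" checks if two values are equal
```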

You can find more examples of logical operations in the Jupyter notebook on *control flow* (`PythonIntro_loops_and_logic.ipynb`). 

### Lists (lists, tuples and arrays)

Often you want to store a large number of variables together, like the output of the `split` method on a string. You do so by declaring a list. A list can hold any type of object, and the objects do not all need to be of the same type.

In [16]:
my_float_list = [0.0, 1.0, 4.0, 6.0, 0.0]   # List of 5 floats
my_mixed_list = [0, "zero", 0.0]          # List can hold any type of object 


We can extract elements from a list by using an integer as the index of the element we want. Notice that this *indexing* will not work with floats, only integers. Note also that in Python indices start from zero, meaning that e.g. the third element must be called with a `2`.

In [15]:
my_float_list[2]

4.0
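The indexing rules above can be sketched as follows, using the same list of floats. The example also shows what happens when a float is used as an index; the name `raised` is only illustrative.

```python
my_float_list = [0.0, 1.0, 4.0, 6.0, 0.0]

print(my_float_list[0])   # 0.0: the first element has index 0
print(my_float_list[2])   # 4.0: the third element has index 2

# The last element has index len(list) - 1
last_index = len(my_float_list) - 1
print(last_index)                  # 4
print(my_float_list[last_index])   # 0.0

# A float is not a valid index, even if it has no decimals
raised = False
try:
    my_float_list[2.0]
except TypeError as error:
    raised = True
    print("TypeError:", error)
```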

You can use indexing to reassign a particular element of a list to a new value of your choice.

In [17]:
my_float_list[2] = 13.0   # Set the element at index 2 (the third element) to 13.0
my_float_list

[0.0, 1.0, 13.0, 6.0, 0.0]

A `list` is *mutable*, meaning its values can be changed like we did above, and its length can be changed with methods like `append` (try it out!). A `tuple`, on the other hand, is *immutable*: once it is created, it cannot be changed (it can still be indexed, though). We declare a list with square brackets, and a tuple with round brackets.

In [18]:
my_tuple = (0, 1, 2, 4, 4)
my_tuple[3] = 3     # See how this line fails due to the immutability of tuples

TypeError: 'tuple' object does not support item assignment
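A short sketch of the contrast between the two: a list grows with `append`, while a tuple can be read but not modified. One possible workaround, shown here as an illustration rather than a recommendation, is to convert the tuple to a list, change it, and convert it back into a new tuple.

```python
# Lists are mutable: append adds an element at the end
my_list = [0, 1, 2]
my_list.append(3)
print(my_list)        # [0, 1, 2, 3]

# Tuples can be indexed like lists
my_tuple = (0, 1, 2, 4, 4)
print(my_tuple[3])    # 4

# To "change" a tuple we have to build a new one
as_list = list(my_tuple)     # convert to a (mutable) list
as_list[3] = 3               # modify the list
my_new_tuple = tuple(as_list)
print(my_new_tuple)   # (0, 1, 2, 3, 4)
print(my_tuple)       # (0, 1, 2, 4, 4): the original tuple is untouched
```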

Arrays are a special kind of collection of items, and behave quite differently from lists. Most notably, an array is a `numpy` object, so it must be created with `np.array` (if we imported `numpy` as `np`). To create an array we can start from a list, a tuple, or nothing at all!

In [2]:
import numpy as np

my_list = [7,8,9]
my_array = np.array(my_list)   # try np.array([])!
my_array

array([7, 8, 9])
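The different starting points mentioned above can be sketched as follows. The example also shows that wrapping a list in an extra pair of brackets produces a 2D array with a single row, which is easy to do by accident. The variable names are only illustrative.

```python
import numpy as np

from_list = np.array([7, 8, 9])     # array from a list
from_tuple = np.array((7, 8, 9))    # array from a tuple
empty = np.array([])                # array with no elements

print(np.array_equal(from_list, from_tuple))   # True: same content
print(from_list.shape)   # (3,): one dimension with 3 elements
print(empty.size)        # 0

# An extra pair of brackets gives a 2D array with 1 row and 3 columns
nested = np.array([[7, 8, 9]])
print(nested.shape)      # (1, 3)
```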

If you want an array of 10000 zeros, you do not need to type that many zeros. You can create one by calling `np.zeros(10000)`.

In [3]:
my_big_zero_array = np.zeros(10000)   # try np.ones(100)!

# You can use addition or multiplication to fill an array with any number.
my_big_pi_array1 = np.zeros(10000) + 3.14
my_big_pi_array2 = np.ones(10000) * 3.14 
my_big_pi_array2

array([3.14, 3.14, 3.14, ..., 3.14, 3.14, 3.14])

You saw from the examples above that you can apply mathematical operations to arrays. This is because arrays are *vectorized*, meaning that the operations are applied to each element of the array. This is very useful, and can save you a lot of time. You can also apply mathematical operations to lists, but the result will be different. Try it out!

In [30]:
my_list = [2,2,2]
my_list * 3     # Do you expect this to be [6,6,6]? Do you see the actual operation that is performed to the list?

[2, 2, 2, 2, 2, 2, 2, 2, 2]
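A side-by-side sketch of the same operation on a list and on an array. If you do need element-wise multiplication on a plain list, one option is a list comprehension, shown here as an illustration.

```python
import numpy as np

my_list = [2, 2, 2]
my_array = np.array(my_list)

repeated = my_list * 3    # for lists, * repeats the list
scaled = my_array * 3     # for arrays, * multiplies each element

print(repeated)   # [2, 2, 2, 2, 2, 2, 2, 2, 2]
print(scaled)     # [6 6 6]

# Element-wise multiplication of a list, without numpy
scaled_list = [x * 3 for x in my_list]
print(scaled_list)   # [6, 6, 6]
```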

Numpy also has a lot of convenient functions for creating other specific types of arrays. Two notable ones are `np.arange` and `np.linspace`. Both create arrays of evenly spaced values, but with the first you set the `step` between elements, and with the second you set the number of elements in the array.

In [11]:
my_numpy_arange = np.arange(0, 4, 0.5)      # The basic syntax is np.arange(start, stop, step)
print("my_numpy_arange: ", my_numpy_arange)   # Note that the upper limit is not included

my_numpy_linspace = np.linspace(0, 4, 5)        # The basic syntax is np.linspace(start, stop, number_of_elements)
print("my_numpy_linspace: ", my_numpy_linspace) # Note that the upper limit is included

my_numpy_arange:  [0.  0.5 1.  1.5 2.  2.5 3.  3.5]
my_numpy_linspace:  [0. 1. 2. 3. 4.]
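As a sketch of how the two functions relate, the same array (0 to 4 in steps of 0.5) can be built either way. With `np.arange` the stop value must lie beyond 4 so that 4 is included; with `np.linspace` we count the elements instead (9 of them). The names `with_step` and `with_count` are only illustrative.

```python
import numpy as np

with_step = np.arange(0, 4.5, 0.5)    # stop is excluded, so we go slightly past 4
with_count = np.linspace(0, 4, 9)     # stop is included, 9 elements in total

print(with_step)
print(with_count)

# Both arrays hold the same values
print(np.allclose(with_step, with_count))   # True
```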


In [29]:
# We can also create arrays of random numbers
my_random_array = np.random.random(10)     # 10 random numbers between 0 and 1
my_random_array

array([0.21074733, 0.37700788, 0.79544426, 0.98837509, 0.94762863,
       0.15617073, 0.48607378, 0.38265301, 0.44979017, 0.5376509 ])
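Random numbers change every time the cell is run. If you need the same numbers each time (for example, to compare results with a colleague), one option is to use a seeded generator created with `np.random.default_rng`. The sketch below is only an illustration, and the seed value 42 is arbitrary.

```python
import numpy as np

# Two generators created with the same seed produce the same numbers
rng_a = np.random.default_rng(seed=42)
rng_b = np.random.default_rng(seed=42)

sample_a = rng_a.random(10)   # 10 random numbers in the interval [0, 1)
sample_b = rng_b.random(10)

print(np.array_equal(sample_a, sample_b))   # True
```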

Most importantly, arrays can have more than one dimension. You can create a 2D array by supplying a list of lists, or a tuple of tuples. Arrays with more dimensions are also possible, but their display on screen is not optimal, and we will not really work with them.

In [32]:
# A 2D array based on a list of lists
my_2d_array = np.array([[1,2,3],[4,5,6],[7,8,9]])

# What about a 2D array of pi
my_pi_array = np.ones((5,5)) * 3.14

# or a 2D array of random numbers
my_random_array = np.random.random((5,5))

# Indexing a 2D array
my_2d_array[1,2]    # The entry in the second row and third column (indexing works as [row, column])
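A few more ways of indexing a 2D array, sketched with the same `my_2d_array`. The colon `:` means "all entries along this dimension", which allows us to extract a whole column.

```python
import numpy as np

my_2d_array = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print(my_2d_array.shape)    # (3, 3): 3 rows and 3 columns
print(my_2d_array[1, 2])    # 6: second row, third column
print(my_2d_array[0])       # [1 2 3]: the whole first row
print(my_2d_array[:, 0])    # [1 4 7]: the whole first column
```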

### Dictionaries (dicts)

You could have a 2D array of cartographic coordinates, but how would you know which column is the latitude and which is the longitude? In some cases, a better way to store this information is with a *dictionary*. A dictionary is a collection of *key-value* pairs. The key is typically a string, and the value can be any object: a float, a list, a tuple, or even another dictionary. A dictionary is declared either (a) with curly brackets, where each key is separated from its value by a colon and the pairs are separated by commas, or (b) with the `dict` function.

In [34]:
# A single cartographic coordinate
my_coord_dict = {"lon": 110.0, "lat": 2.0}   # A dictionary of cartographic coordinates


# A list of cartographic coordinates
my_coords_dict = {"lon": [110.0, 111.0, 112.0], "lat": [2.0, 3.0, 4.0]}
my_coords_dict["lat"]    # Access the latitude values

[2.0, 3.0, 4.0]

In [37]:
# The same dictionary, but created with the dict() function
my_coords_dict = dict(lon=[110.0, 111.0, 112.0], lat=[2.0, 3.0, 4.0])
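The sketch below checks that both ways of declaring a dictionary give the same result, and shows a few common operations: adding a new key, listing the keys, and testing whether a key exists. The `name` key and its values are hypothetical and only serve the example.

```python
with_braces = {"lon": [110.0, 111.0, 112.0], "lat": [2.0, 3.0, 4.0]}
with_dict = dict(lon=[110.0, 111.0, 112.0], lat=[2.0, 3.0, 4.0])
print(with_braces == with_dict)   # True: both declarations are equivalent

# Add a new key-value pair by assigning to a key that does not exist yet
with_braces["name"] = ["A", "B", "C"]   # hypothetical station names

print(list(with_braces.keys()))   # ['lon', 'lat', 'name']
print("lat" in with_braces)       # True: the key exists
print("depth" in with_braces)     # False: the key does not exist
```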

`DataFrames` are similar to dictionaries, and are among the main objects we will use in this course. They are created with the `pandas` package, and are very useful for storing *tabular data*. That means all columns have the same number of elements, and rows are indexed by an integer by default.

In [53]:
import pandas as pd
hotspots_df = pd.DataFrame(dict(name = ["Afar", "Hawaii", "Yellowstone"], lat = [12.82, 19.42, 44.42], lon = [41.75, -155.29, -110.67]))
hotspots_df

|   | name | lat | lon |
|---|------|-----|-----|
| 0 | Afar | 12.82 | 41.75 |
| 1 | Hawaii | 19.42 | -155.29 |
| 2 | Yellowstone | 44.42 | -110.67 |


In [54]:
# We can add more rows to the dataframe
hotspot_new_row = pd.DataFrame(dict(name = "Kerguelen", lat = -49.58, lon = 69.5), index=[len(hotspots_df)])
hotspots_df = pd.concat([hotspots_df, hotspot_new_row])
hotspots_df

|   | name | lat | lon |
|---|------|-----|-----|
| 0 | Afar | 12.82 | 41.75 |
| 1 | Hawaii | 19.42 | -155.29 |
| 2 | Yellowstone | 44.42 | -110.67 |
| 3 | Kerguelen | -49.58 | 69.5 |


In [56]:
# More columns can also be added
hotspots_df["age"] = [30, 0.7, 0.6, 35]
hotspots_df

|   | name | lat | lon | age |
|---|------|-----|-----|-----|
| 0 | Afar | 12.82 | 41.75 | 30.0 |
| 1 | Hawaii | 19.42 | -155.29 | 0.7 |
| 2 | Yellowstone | 44.42 | -110.67 | 0.6 |
| 3 | Kerguelen | -49.58 | 69.5 | 35.0 |


In [60]:
# Indexing by column is done with the column name
hotspots_df["name"]

# Indexing by row is done with the .loc attribute
hotspots_df.loc[0]

# Note that .loc[0] selects the row whose index label is 0 (leftmost column in the display),
# which is not necessarily the first row. To get the first row by position, we can use the .iloc attribute
hotspots_df.iloc[0]

name     Afar
lat     12.82
lon     41.75
age      30.0
Name: 0, dtype: object
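The difference between `.loc` and `.iloc` becomes visible once the rows are no longer in their original order. The sketch below rebuilds the same hotspots table and sorts it by the `age` column with `sort_values`, a standard pandas method not used elsewhere in this notebook; the name `sorted_df` is only illustrative.

```python
import pandas as pd

hotspots_df = pd.DataFrame(dict(
    name=["Afar", "Hawaii", "Yellowstone", "Kerguelen"],
    lat=[12.82, 19.42, 44.42, -49.58],
    lon=[41.75, -155.29, -110.67, 69.5],
    age=[30, 0.7, 0.6, 35],
))

# Sorting reorders the rows, but each row keeps its original index label
sorted_df = hotspots_df.sort_values("age")

by_label = sorted_df.loc[0, "name"]        # row with index label 0
by_position = sorted_df.iloc[0]["name"]    # row in the first position

print(by_label)      # Afar
print(by_position)   # Yellowstone
```

After sorting, the row labelled 0 (Afar) is no longer the first row, so `.loc[0]` and `.iloc[0]` return different rows.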