# 1. Python basics

This chapter gives only a short introduction to Python, just enough to make the explanations in the following chapters easier to follow. A detailed description would go beyond the scope of this tutorial. For more, take a look at https://docs.python.org/tutorial/.

Now let's take our first steps into the Python world.


## 1.1 Functions

You can think of a function as a small subprogram that does a specific job. Depending on the task, it performs an action and/or returns something to the calling part of the script. A set of built-in functions is available, but you can create your own functions, too.

When a function is called in Python 3.x, its name is followed by parentheses (), and any arguments are placed inside them. Functions can perform calculations or actions and return None, a single value, or multiple values.

For example:

    function_name()
    function_name(parameter1, parameter2,...)
    ret = function_name(variable, axis=0)
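
As a minimal sketch of how you might write your own functions, the following block defines one function that returns nothing, one that returns a single value, and one that returns two values. The function names greet, square, and min_max are made up for this illustration.

```python
# Hypothetical example functions; the names are chosen for illustration only.

def greet(name):
    """Print a greeting. Without a return statement the result is None."""
    print('Hello', name)

def square(value):
    """Return a single value."""
    return value * value

def min_max(values):
    """Return two values at once (they are packed into a tuple)."""
    return min(values), max(values)

result = greet('World')          # prints: Hello World
print(result)                    # None
print(square(4))                 # 16

lowest, highest = min_max([3, 1, 2])
print(lowest, highest)           # 1 3
```

Returning several values relies on tuples, which are described in section 1.4.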


### 1.1.1 Print

In our notebooks we use the function <font color='blue'><b>print</b></font> to print contents when running the cells. This gives us the possibility to export the notebooks as Python scripts, which can be run directly in a terminal.

Let's print the string <font color='red'>Hello World</font>. A string can be written in different ways, enclosed in single or double quotes. 


In [1]:
print('Hello World')

Hello World


This is the simplest way to use print. To produce nicer output of variable contents, format specifications can be used, but we will come to this later.

## 1.2 Data types

- Numeric
  - integer
  - float
  - complex

- Boolean
  - True or False
 
- Text
  - string
  
- and many others

We use the built-in function <font color='blue'><b>type</b></font> to retrieve the type of a variable.

Example: Define a variable x with value 5 and print the content of x.

In [2]:
x = 5
print(x)

5


Let's see what type the variable really has. You can pass the result of the function type directly as an argument to the function print.

In [3]:
print(type(x))

<class 'int'>


Change the value of variable x to a floating point value of 5.0.

In [4]:
x = 5.0
print(x)

5.0


Get the type of the changed variable x.

In [5]:
print(type(x))

<class 'float'>


Define variables of different types.

In [6]:
x = 1
y = 7.3
is_red = False
title = 'Just a string'
print(type(x), type(y), type(is_red), type(title))

<class 'int'> <class 'float'> <class 'bool'> <class 'str'>
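
The complex type from the list above has not been shown yet. The following sketch fills that gap and also shows, as an addition that is not part of the original notebook, how the built-in functions int, float, str, and isinstance can be used to convert and check types. All variable names are chosen for illustration.

```python
# A complex number is written with j as the imaginary unit.
z = 2 + 3j
print(type(z))                    # <class 'complex'>
print(z.real, z.imag)             # 2.0 3.0

# Built-in functions convert between types where that makes sense.
as_int = int(7.9)                 # cuts off the fractional part -> 7
as_float = float('7.3')           # string to float -> 7.3
as_str = str(42)                  # number to string -> '42'
print(as_int, as_float, as_str)   # 7 7.3 42

# isinstance checks whether a value has a given type.
print(isinstance(as_int, int))    # True
print(isinstance(as_str, int))    # False
```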


## 1.3 Lists

A list is a compound data type used to group values, which may have different data types. Lists are written as comma-separated values (items) between square brackets.


In [7]:
names = ['Hugo', 'Charles','Janine']
ages  = [72, 33, 16]
print(type(names), type(ages))
print(names)
print(ages)

<class 'list'> <class 'list'>
['Hugo', 'Charles', 'Janine']
[72, 33, 16]


To select a single element of a list you can use its index. The first element has index 0, and a negative index counts from the end of the list.

In [8]:
first_name = names[0]
last_name  = names[-1]
print('First name: %-10s' % first_name)
print('Last name:  %-10s' % last_name)
print(type(names[0]))

First name: Hugo      
Last name:  Janine    
<class 'str'>


To select a subset of a list you can use slicing, \[start_index<font color='red'><b>:</b></font>end_index\[<font color='red'><b>:</b></font>step\]\], where the selected part of the list starts with the element at start_index and includes all following elements up to the element **before** end_index.

The next example will return the first two elements (index 0 and 1) and **not** the last element.


In [9]:
print(names[0:2])

['Hugo', 'Charles']


What will be returned when doing the following?

In [10]:
print(names[0:3:2])
print(names[1:2])
print(names[1:3])
print(names[::-1])

['Hugo', 'Janine']
['Charles']
['Charles', 'Janine']
['Janine', 'Charles', 'Hugo']


The slicing with \[::-1\] reverses the order of the list.
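
Start and end index can also be omitted or negative. The following short sketch uses a made-up list called letters to show a few common slicing patterns.

```python
letters = ['a', 'b', 'c', 'd', 'e']   # example list for illustration

print(letters[:2])     # ['a', 'b']        start omitted: begin at index 0
print(letters[2:])     # ['c', 'd', 'e']   end omitted: run to the end
print(letters[-2:])    # ['d', 'e']        the last two elements
print(letters[::2])    # ['a', 'c', 'e']   every second element
print(letters[1:-1])   # ['b', 'c', 'd']   all but the first and last element
```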

Slicing with only the colon and no indices creates a shallow copy of the list. Changing the original list afterwards does not affect the copy, whereas a plain assignment such as names_ln = names only creates a second name for the same list.


In [11]:
names_ln = names
names_cp = names[:]
names[0] = 'Ben'
print(names_ln)
print(names_cp)

['Ben', 'Charles', 'Janine']
['Hugo', 'Charles', 'Janine']
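
To make the difference between a second name and a copy visible, the following sketch compares the objects with the is operator. It also shows what "shallow" means for nested lists. The variable names are chosen for illustration.

```python
original = ['Hugo', 'Charles', 'Janine']
alias = original          # second name for the same list object
copied = original[:]      # new list object with the same items

print(alias is original)     # True
print(copied is original)    # False
print(copied == original)    # True, the contents are equal

# "Shallow" means that nested objects are shared, not duplicated.
nested = [[1, 2], [3, 4]]
nested_copy = nested[:]
nested[0].append(99)         # changes the inner list both outer lists share
print(nested_copy)           # [[1, 2, 99], [3, 4]]
```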


To add a single element to the end of a list, use append or the += operator with another list.

In [12]:
names.append('Paul')
print(names)

names += ['Liz']
print(names)

['Ben', 'Charles', 'Janine', 'Paul']
['Ben', 'Charles', 'Janine', 'Paul', 'Liz']


Well, how do we insert an element right after the first element?

In [13]:
names.insert(1,'Merle')
print(names)

['Ben', 'Merle', 'Charles', 'Janine', 'Paul', 'Liz']


If you want to add more than one element to a list, use extend.

In [14]:
names.extend(['Sophie','Sebastian','James'])
print(names)

['Ben', 'Merle', 'Charles', 'Janine', 'Paul', 'Liz', 'Sophie', 'Sebastian', 'James']


If you decide to remove an element, use remove.

In [15]:
names.remove('Janine')
print(names)

['Ben', 'Merle', 'Charles', 'Paul', 'Liz', 'Sophie', 'Sebastian', 'James']


With pop you can remove an element, too. Remove the last element of the list.

In [16]:
names.pop()
print(names)

['Ben', 'Merle', 'Charles', 'Paul', 'Liz', 'Sophie', 'Sebastian']


Remove an element by its index.

In [17]:
names.pop(2)
print(names)

['Ben', 'Merle', 'Paul', 'Liz', 'Sophie', 'Sebastian']
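
Two details of remove and pop are easy to miss: pop hands back the element it removes, and remove deletes only the first match. A small sketch with a made-up list called colors:

```python
colors = ['red', 'green', 'blue', 'green']   # example list for illustration

last = colors.pop()          # removes and returns the last element
print(last)                  # green
print(colors)                # ['red', 'green', 'blue']

colors.append('green')
colors.remove('green')       # removes only the first matching element
print(colors)                # ['red', 'blue', 'green']

# remove raises a ValueError if the element is not in the list,
# so it can be useful to check with "in" first.
if 'yellow' in colors:
    colors.remove('yellow')
else:
    print('yellow is not in the list')
```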


Use reverse to, yup, reverse the list in place.

In [18]:
names.reverse()
print(names)

['Sebastian', 'Sophie', 'Liz', 'Paul', 'Merle', 'Ben']
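
Note that reverse changes the list itself and returns None, while slicing with \[::-1\] leaves the list untouched and gives you a new one. A short sketch with an example list:

```python
numbers = [3, 1, 2]           # example list for illustration

backwards = numbers[::-1]     # new list, numbers is unchanged
print(numbers, backwards)     # [3, 1, 2] [2, 1, 3]

result = numbers.reverse()    # reverses in place and returns None
print(numbers, result)        # [2, 1, 3] None
```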


## 1.4 Tuples

A tuple is like a list, but it is immutable (unchangeable). Once a tuple is created, you cannot change its values. To get a modified tuple you have to convert it to a list, change the content, and convert it back to a tuple.

Define the variable tup as a tuple.

In [19]:
tup = (0, 1, 1, 5, 3, 8, 5)
print(type(tup))

<class 'tuple'>
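
The following sketch shows what happens when you try to change a tuple, and the detour via a list described above. The variable names point and as_list are chosen for illustration.

```python
point = (0, 1, 1, 5)

try:
    point[0] = 99              # tuples do not support item assignment
except TypeError as err:
    print('TypeError:', err)

# Detour via a list to build a modified tuple.
as_list = list(point)
as_list[0] = 99
point = tuple(as_list)
print(point)                   # (99, 1, 1, 5)
```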


Assigning several variables one by one quickly becomes tedious. Tuple packing and unpacking make this much easier. Here are some examples of how to use tuples.

Standard definition of variables of type integer.

In [20]:
td = 15
tm = 12
ty = 2018

print(td,tm,ty)

15 12 2018


Tuple packing and unpacking

In [21]:
td,tm,ty = 15,12,2018

print(td,tm,ty)
print(type(td))

15 12 2018
<class 'int'>


You can use tuple packing to assign the values to a single variable, too.

In [22]:
date = 31,12,2018

print(date)
print(type(date))

(day, month, year) = date

print(year, month, day)

(31, 12, 2018)
<class 'tuple'>
2018 12 31


Tuple packing makes exchanging the contents of two variables much easier.

In [23]:
x,y = 47,11

x,y = y,x

print(x,y)

11 47


OK, now we've learned a lot about tuples, but not everything. There is a very helpful way to unpack a tuple: a name with a leading star collects all remaining items in a list.

Unpacking example with a tuple of integers.

In [24]:
tup = (123,34,79,133)

X,*Y = tup

print(X)
print(Y)

X,*Y,Z = tup

print(X)
print(Y)
print(Z)

X,Y,*Z = tup

print(X)
print(Y)
print(Z)

123
[34, 79, 133]
123
[34, 79]
133
123
34
[79, 133]


Unpacking works with any sequence. Here is an example with the characters of a string.

In [25]:
Name = 'Elizabeth'

A,*B,C = Name

print(A)
print(B)
print(C)

A,B,*C = Name

print(A)
print(B)
print(C)

E
['l', 'i', 'z', 'a', 'b', 'e', 't']
h
E
l
['i', 'z', 'a', 'b', 'e', 't', 'h']


## 1.5 Computations

To do computations you can use the arithmetic operators on numeric values. Some of them also work on lists, but with a different meaning, as we will see further below.


In [26]:
m = 12
d = 8.1
s = m + d
print(s)
print(type(s))

20.1
<class 'float'>


The built-in functions max(), min(), and sum() for instance can be used to do computations for us.

In [27]:
data = [12.2, 16.7, 22.0, 9.3, 13.1, 18.1, 15.0, 6.8]
data_min = min(data)
data_max = max(data)
data_sum = sum(data)

print('Minimum %6.1f' % data_min)
print('Maximum %6.1f' % data_max)
print('Sum     %6.1f' % data_sum)


Minimum    6.8
Maximum   22.0
Sum      113.2
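
The same built-in functions can be combined with len to derive further statistics. The following sketch reuses the data values from above to compute the number of values, the mean, and the range. The variable names are chosen for illustration.

```python
data = [12.2, 16.7, 22.0, 9.3, 13.1, 18.1, 15.0, 6.8]

count = len(data)                     # number of elements in the list
mean = sum(data) / count              # arithmetic mean
value_range = max(data) - min(data)   # difference of largest and smallest value

print('Count   %6d' % count)
print('Mean    %6.2f' % mean)
print('Range   %6.1f' % value_range)
```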


Computations with lists are not that simple.

Let's try to multiply the values of a list by 10.

In [28]:
values = [1,2,3,4,5,6,7,8,9,10]
values10 = values*10
print(values10)

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]


Well, that is not what you expected, is it?

Multiplying a list by 10 repeats the list 10 times in a new list. To multiply the values instead, we have to go through the list and multiply each single element by 10. There is a long way and a short way to do it.

The long way:

In [29]:
values10 = values[:]

for i in range(0,len(values)):
    values10[i] = values[i]*10
    
print(values10)

[10, 20, 30, 40, 50, 60, 70, 80, 90, 100]


The shorter and more idiomatic way is to use Python's list comprehension:

In [30]:
values10 =  [i * 10 for i in values]

print(values10)

[10, 20, 30, 40, 50, 60, 70, 80, 90, 100]


In [31]:
# just to be sure that the original values list is not overwritten.

print(values)

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
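
A list comprehension can also filter the elements with an if clause. A minimal sketch, using the same values list and illustrative variable names:

```python
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Keep only the even values.
even_values = [v for v in values if v % 2 == 0]
print(even_values)               # [2, 4, 6, 8, 10]

# Filter and compute in one step: squares of the odd values.
squares_of_odd = [v * v for v in values if v % 2 == 1]
print(squares_of_odd)            # [1, 9, 25, 49, 81]
```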


Also worth noting are the in-place operators **+=** and **\*=**.

In [32]:
ix  = 1
print(ix)
ix += 3      # same as ix = ix + 3
print(ix)
ix *= 2      # same as ix = ix * 2
print(ix)

1
4
8


## 1.6 Statements

Like other programming languages, Python provides flow control statements.


### 1.6.1 if statement

The most frequently used statement is the **if** statement, which lets us check whether a condition is **True** or **False**. It can contain optional parts such as **elif** and **else**.


In [33]:
x = 0

if x > 0:
    print('x is greater than 0')
elif x < 0:
    print('x is less than 0')
elif x == 0:
    print('x is equal to 0')

x is equal to 0


In [34]:
user = 'George'

if user:
    print('user is set')
    if user == 'Dennis':
        print('--> it is Dennis')
    else:
        print('--> but it is not Dennis')

user is set
--> but it is not Dennis
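
The test on user alone works because Python treats "empty" values as False in a condition. The following sketch uses the built-in function bool to show how some common values are evaluated. The values are picked for illustration.

```python
print(bool('George'))   # True:  non-empty string
print(bool(''))         # False: empty string
print(bool(None))       # False
print(bool(0))          # False: zero
print(bool([]))         # False: empty list
print(bool([0]))        # True:  non-empty list, whatever it contains

user = ''
if user:
    message = 'user is set'
else:
    message = 'user is not set'
print(message)          # user is not set
```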


### 1.6.2 while statement

The lines in a while loop are executed repeatedly as long as the condition is True.


In [35]:
a = 0
b = 10

while a < b:
    print('a =',a)
    a = a + 1

a = 0
a = 1
a = 2
a = 3
a = 4
a = 5
a = 6
a = 7
a = 8
a = 9
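
A while loop is handy when you do not know in advance how many passes are needed. The following sketch counts how often a value has to be halved until it drops to 1 or below, and then shows the break statement, which leaves a loop early. The numbers are arbitrary examples.

```python
value = 100.0
steps = 0

while value > 1.0:
    value = value / 2        # halve the value in each pass
    steps = steps + 1

print(steps)                 # 7

# break leaves the loop immediately, so "while True" does not run forever.
n = 0
while True:
    n = n + 1
    if n * n > 50:
        break

print(n)                     # 8
```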


### 1.6.3 for statement

The for statement differs from its counterpart in some other programming languages: it iterates over the items of any sequence, e.g. a list or a string, in the order in which they appear in the sequence.


In [36]:
s = 0
for x in [1,2,3,4]:
    s = s + x

print('sum = ', s)

sum =  10


In [37]:
# Now, let us find the shortest name of the list names.
# Oh, by the way this is a comment line :), which will not be executed.

index  = -99
length = 50
i = 0
for name in names:
    if len(name) < length:
        length = len(name)
        index  = i
    i+=1

print('--> shortest name in list names is', names[index])

--> shortest name in list names is Liz
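
The manual counter i can be avoided with the built-in function enumerate, and the whole search can be written with min and its key argument. This is an alternative sketch, not the notebook's original approach. The list is written out again so that the block runs on its own.

```python
names = ['Sebastian', 'Sophie', 'Liz', 'Paul', 'Merle', 'Ben']

# enumerate yields (index, item) pairs, so no manual counter is needed.
for i, name in enumerate(names):
    print(i, name)

# min with key=len compares the items by their length and returns
# the first item with the smallest length.
shortest = min(names, key=len)
print('--> shortest name in list names is', shortest)   # Liz
```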


## 1.7 Import Python modules

Usually you need to load some additional Python packages, so-called modules, into your program in order to use their functionality. This is done with the **import** statement, which can take several forms.

```python
import module_name
import module_name as short_name
from module_name import module_part
```
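
As a concrete sketch of these three forms, the following block uses the standard library module math. The alias m is an arbitrary choice.

```python
import math                    # access names as math.<name>
import math as m               # same module under a shorter alias
from math import sqrt, pi      # import selected names directly

print(math.sqrt(16))           # 4.0
print(m.floor(2.7))            # 2
print(sqrt(9))                 # 3.0
print(round(pi, 2))            # 3.14
```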

### 1.7.1 Module os

We start with a simple example. To get access to the operating system outside our program we have to import the module **os**.

In [38]:
import os

Take a look at the module.

In [39]:
help(os)

Help on module os:

NAME
    os - OS routines for NT or Posix depending on what system we're on.

MODULE REFERENCE
    https://docs.python.org/3.7/library/os
    
    The following documentation is automatically generated from the Python
    source files.  It may be incomplete, incorrect or include features that
    are considered implementation detail and may vary between Python
    implementations.  When in doubt, consult the module reference at the
    location listed above.

DESCRIPTION
    This exports:
      - all functions from posix or nt, e.g. unlink, stat, etc.
      - os.path is either posixpath or ntpath
      - os.name is either 'posix' or 'nt'
      - os.curdir is a string representing the current directory (always '.')
      - os.pardir is a string representing the parent directory (always '..')
      - os.sep is the (or a most common) pathname separator ('/' or '\\')
      - os.extsep is the extension separator (always '.')
      - os.altsep is the alternate pathname se

Ok, let's see in which directory we are.

In [40]:
pwd = os.getcwd()
print(pwd)

/Users/k204045/PyEarthScience-Project/GitHub/PyEarthScience/Tutorial


Go to the parent directory and let us see where we are then.

In [41]:
os.chdir('..')
print("Directory changed: ", os.getcwd())

Directory changed:  /Users/k204045/PyEarthScience-Project/GitHub/PyEarthScience


Go back to the directory where we started (that's why we wrote the name of the directory to the variable pwd ;)).

In [42]:
os.chdir(pwd)
print("Directory changed: ", os.getcwd())

Directory changed:  /Users/k204045/PyEarthScience-Project/GitHub/PyEarthScience/Tutorial


To retrieve the content of an environment variable, the module os provides the os.environ.get function.

In [43]:
HOME = os.environ.get('HOME')
print('My HOME environment variable is set to ', HOME)

My HOME environment variable is set to  /Users/k204045
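
If the environment variable is not set, os.environ.get returns None, or a default value if you pass one as the second argument. A minimal sketch, in which MY_DEMO_SETTING is a made-up variable name:

```python
import os

# MY_DEMO_SETTING is a made-up variable name. It is removed first so that
# the example behaves the same on every system.
os.environ.pop('MY_DEMO_SETTING', None)

missing = os.environ.get('MY_DEMO_SETTING')
print(missing)                    # None

with_default = os.environ.get('MY_DEMO_SETTING', 'not set')
print(with_default)               # not set

# os.environ['MY_DEMO_SETTING'] would raise a KeyError here instead.
```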


Concatenate path names with os.path.join.

In [44]:
datadir = 'data'
newpath = os.path.join(HOME,datadir)
print(newpath)

/Users/k204045/data


Now, we want to see if the directory really exists.

In [45]:
if os.path.isdir(newpath):
    print('--> directory %s exists' % newpath)
else:
    print('--> directory %s does not exist' % newpath)

--> directory /Users/k204045/data exists


Modify the datadir variable, run the cells and see what happens.

But how do we check whether a file exists? Well, there is a function os.path.isfile, who would have thought!

In [46]:
input_file = os.path.join('data','precip.nc')

if os.path.isfile(input_file):
    print('--> file %s exists' % input_file)
else:
    print('--> file %s does not exist' % input_file)

--> file data/precip.nc exists
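
If you do not have the data directory at hand, you can try these checks in a temporary directory. The following sketch creates a file with the made-up name demo.txt and compares os.path.isfile, os.path.isdir, and os.path.exists.

```python
import os
import tempfile

# Work in a temporary directory so that nothing else on your system is touched.
with tempfile.TemporaryDirectory() as tmpdir:
    file_path = os.path.join(tmpdir, 'demo.txt')   # hypothetical file name

    before = os.path.isfile(file_path)     # the file does not exist yet
    with open(file_path, 'w') as f:
        f.write('some text\n')
    after = os.path.isfile(file_path)      # now it does

    dir_is_file = os.path.isfile(tmpdir)   # a directory is not a file
    dir_is_dir = os.path.isdir(tmpdir)
    any_kind = os.path.exists(file_path)   # True for files and directories

print(before, after, dir_is_file, dir_is_dir, any_kind)
# False True False True True
```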


Add a cell and play with the os functions in your environment.


### 1.7.2 Module glob

In the last example we already knew the name of the file we were looking for, but in most cases we don't know what is in a directory.

To get the file names from a directory the **glob** module is very helpful.

For example, after importing the glob module, the glob function of glob (weird, isn't it?) returns a list of all netCDF files in the subdirectory data.

In [47]:
import glob

fnames = sorted(glob.glob('./data/*.nc'))

print(fnames)

['./data/precip.nc', './data/tsurf.nc']
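
The order of the list returned by glob.glob is not guaranteed, which is why sorted is used above. The following sketch creates a few empty files with made-up names in a temporary directory and tries different patterns.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    # Create three empty files with made-up names.
    for fname in ['b.nc', 'a.nc', 'notes.txt']:
        open(os.path.join(tmpdir, fname), 'w').close()

    nc_files = sorted(glob.glob(os.path.join(tmpdir, '*.nc')))
    all_files = sorted(glob.glob(os.path.join(tmpdir, '*')))
    no_match = glob.glob(os.path.join(tmpdir, '*.grb'))

print([os.path.basename(f) for f in nc_files])    # ['a.nc', 'b.nc']
print(len(all_files))                             # 3
print(no_match)                                   # [] if nothing matches
```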


Now, we can select a file, for instance the second one, of fnames.

In [48]:
print(fnames[1])

./data/tsurf.nc


But how can we get rid of the leading path? Once again the os module can help us, this time with its path.basename function.

In [49]:
print(os.path.basename(fnames[1]))

tsurf.nc
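
The os.path module has a few more functions for taking a path apart. A short sketch using the file name from above:

```python
import os

fname = './data/tsurf.nc'

print(os.path.dirname(fname))        # ./data
print(os.path.basename(fname))       # tsurf.nc

# splitext separates the extension from the rest of the name.
stem, ext = os.path.splitext(os.path.basename(fname))
print(stem, ext)                     # tsurf .nc
```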


### 1.7.3 Module sys

The module sys provides access to system-specific variables and functions. Among other things, it includes the objects sys.stdin, sys.stdout, and sys.stderr for reading from standard input and writing to standard output and standard error.

Here we will take a closer look at sys.path, which among other things allows us to extend the search path for modules.

The subdirectory **lib** contains a file called **dkrz_utils.py** with user-defined Python functions. To load the file like a module, we have to extend the system path before importing it.

In [50]:
import sys

sys.path.append('./lib/')
import dkrz_utils

tempK = 286.5   #-- units Kelvin

print('Convert:  %6.2f degK  == %6.2f degC' % (tempK, (dkrz_utils.conv_K2C(tempK))))

Convert:  286.50 degK  ==  13.35 degC
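
If you do not have the lib directory, you can try the same mechanism with a stand-in module. The following sketch writes a tiny module to a temporary directory, adds that directory to sys.path, and imports it. The module name demo_utils and its content are assumptions made for this sketch. The real dkrz_utils.py may look different, although a plain subtraction of 273.15 is consistent with the output above.

```python
import os
import sys
import tempfile

# Write a tiny stand-in module to a temporary directory. The module name
# demo_utils and the function body are assumptions for illustration only.
libdir = tempfile.mkdtemp()
with open(os.path.join(libdir, 'demo_utils.py'), 'w') as f:
    f.write('def conv_K2C(kelvin):\n')
    f.write('    """Convert a temperature from Kelvin to degrees Celsius."""\n')
    f.write('    return kelvin - 273.15\n')

sys.path.append(libdir)       # make the directory searchable for imports
import demo_utils

tempC = demo_utils.conv_K2C(286.5)
print('%6.2f degC' % tempC)   # 13.35 degC
```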
