<a href="https://colab.research.google.com/github/JavkhlanEnkhbold/python-cheatsheet/blob/main/data-science-for-esm/01-workshop-python.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Introduction to `Python`

:::{note}
This material is mostly adapted from the following resources:
- https://earth-env-data-science.github.io/lectures/core_python/python_fundamentals.html
- https://earth-env-data-science.github.io/lectures/core_python/functions_classes.html
- https://www.tomasbeuzen.com/python-programming-for-data-science/chapters/chapter1-basics.html

Further excellent introductions to Python can be found at
- https://www.learnpython.org
- https://www.pythontutorial.net
- https://python-course.eu
:::

## Running Python ##

There are three main ways to use Python.

1. By running a Python file, e.g. `python myscript.py`
1. Through an interactive console (Python interpreter or IPython shell)
1. In an interactive notebook (e.g. Jupyter)

In this course, we will mostly be interacting with Python via Jupyter notebooks.

:::{note}
If you have not yet set up Python on your computer, you can execute this tutorial in your browser via [Google Colab](https://colab.research.google.com/). Click on the rocket in the top right corner and launch "Colab". If that doesn't work, download the `.ipynb` file and import it in [Google Colab](https://colab.research.google.com/).
:::

## Basic Variables: Numbers and Strings ##

In [1]:
# comments are anything that comes after the "#" symbol
a = 1       # assign integer 1 to variable a
b = "hello" # assign string "hello" to variable b

The following words are reserved keywords and cannot be used as identifiers:

    False      class      finally    is         return
    None       continue   for        lambda     try
    True       def        from       nonlocal   while
    and        del        global     not        with
    as         elif       if         or         yield
    assert     else       import     pass
    break      except     in         raise
    
Additionally, the following built-in functions are always available:

    abs() dict() help() min() setattr() all() dir() hex() next() slice() any()
    divmod() id() object() sorted() ascii() enumerate() input() oct() staticmethod()
    bin() eval() int() open() str() bool() exec() isinstance() ord() sum() bytearray()
    filter() issubclass() pow() super() bytes() float() iter() print() tuple()
    callable() format() len() property() type() chr() frozenset() list() range()
    vars() classmethod() getattr() locals() repr() zip() compile() globals() map()
    reversed() __import__() complex() hasattr() max() round() delattr() hash()
    memoryview() set()
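
The exact set of keywords and built-ins depends on the Python version (for instance, `async` and `await` became keywords in Python 3.7). As a small sketch, the standard-library modules `keyword` and `builtins` let you check what your own interpreter reserves:

```python
import builtins
import keyword

# keyword.kwlist holds the reserved words of the running interpreter
print(len(keyword.kwlist))

print(keyword.iskeyword("lambda"))  # True: reserved word
print(keyword.iskeyword("print"))   # False: built-in function, not a keyword

# built-ins live in the `builtins` module; assigning to such a name
# (e.g. `sum = 3`) is allowed but hides the original function
print(hasattr(builtins, "sum"))     # True
```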

In [2]:
# how do we see our variables?
print(a)
print(b)
print(a,b)

1
hello
1 hello


All variables are objects. Every object has a type (class). To find out the type of a variable, use `type()`:

In [3]:
print(type(a))
print(type(b))

<class 'int'>
<class 'str'>


In [4]:
# as a shortcut, IPython and Jupyter notebooks automatically display the value of the last line
type(b)

str

In [5]:
# we can check for the type of an object
print(type(a) is int)
print(type(a) is str)

True
False


`NoneType` is its own type in Python. It has only one possible value, `None`, which represents the absence of a value.

In [6]:
n = None

In [7]:
print(n)

None


In [8]:
type(n)

NoneType
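
Because there is only one `None` object, the usual way to test for it is the identity check `is None`. The sketch below uses a hypothetical function `describe` purely for illustration:

```python
def describe(value=None):
    """Return a short description of `value` (hypothetical helper)."""
    # `is None` checks identity, so 0 or "" are not mistaken for "no value"
    if value is None:
        return "no value given"
    return f"value is {value}"

print(describe())   # no value given
print(describe(0))  # value is 0
```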

Objects can have **attributes** and **methods**, which can be accessed via ``variable.method``

IPython will autocomplete if you press ``<tab>`` to show you the methods available.

In [9]:
# this returns the method itself
b.capitalize

<function str.capitalize()>

In [10]:
# this calls the method
b.capitalize()

'Hello'

## String Operators ##

Basic operations to modify strings.

In [11]:
s = "HOW ARE YOU TODAY?"

In [12]:
split = s.split(" ")

In [13]:
split

['HOW', 'ARE', 'YOU', 'TODAY?']

In [14]:
"-".join(split)

'HOW-ARE-YOU-TODAY?'

Python has ways of creating strings by "filling in the blanks" and formatting them nicely.

This is helpful for when you want to print statements that include variables or statements.

In [15]:
name = "Newborn Baby"
age = 4 / 12
day = 10
month = 6
year = 2020
template_new = f"Hello, my name is {name}. I am {age:.2f} years old. I was born {day}/{month:02}/{year}."
template_new

'Hello, my name is Newborn Baby. I am 0.33 years old. I was born 10/06/2020.'

## Math ##

Basic arithmetic and boolean logic are part of the core Python language.

In [None]:
# addition / subtraction
1 + 1 - 5

-3

In [None]:
# multiplication
5 * 10

50

In [None]:
# division
1/2

0.5

In [None]:
# that was automatically converted to a float
type(1/2)

float

In [None]:
# exponentiation
2**4

16

In [None]:
# rounding
round(9/10)

1
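
One detail worth knowing: for values exactly halfway between two integers, Python's built-in `round()` rounds to the nearest even integer rather than always rounding up. A short illustration:

```python
# ties go to the nearest even integer ("round half to even")
print(round(0.5))  # 0
print(round(1.5))  # 2
print(round(2.5))  # 2

# an optional second argument sets the number of decimal places
print(round(3.14159, 2))  # 3.14
```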

In [None]:
# floor division
101 // 2

50

In [None]:
# modulo
101 % 2

1

## Comparison Operators

We can compare objects using comparison operators, and we'll get back a Boolean result:

| Operator  | Description                          |
| :-------- | :----------------------------------- |
| `x == y ` | is `x` equal to `y`?                 |
| `x != y`  | is `x` not equal to `y`?             |
| `x > y`   | is `x` greater than `y`?             |
| `x >= y`  | is `x` greater than or equal to `y`? |
| `x < y`   | is `x` less than `y`?                |
| `x <= y`  | is `x` less than or equal to `y`?    |
| `x is y`  | is `x` the same object as `y`?       |

In [None]:
2 < 3

True

In [None]:
"energy" == "power"

False

In [None]:
2 != "2"

True

In [None]:
2 == 2.0

True
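
The `is` operator from the table is easy to confuse with `==`. As a brief sketch, two separate lists with the same contents are equal but are not the same object:

```python
a = [1, 2, 3]
b = [1, 2, 3]  # a second list with equal contents
c = a          # a second name for the same list

print(a == b)  # True: same contents
print(a is b)  # False: two distinct objects
print(a is c)  # True: both names refer to one object
```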

## Boolean Operators

We also have so-called "boolean operators", which also evaluate to either `True` or `False`:

| Operator | Description |
| :---: | :--- |
|`x and y`| are `x` and `y` both True? |
|`x or y` | is at least one of `x` and `y` True? |
| `not x` | is `x` False? | 

In [None]:
# logic
True and True

True

In [None]:
True and False

False

In [None]:
True or True

True

In [None]:
(not True) or (not False)

True
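
With non-boolean operands, `and` and `or` return one of the operands and stop evaluating as soon as the result is known ("short-circuiting"). The variable names below are made up for this sketch:

```python
capacity = 0
label = ""

# `or` returns the first truthy operand; an empty string is falsy
print(label or "unnamed")  # unnamed

# `and` stops at the first falsy operand, so the division is never evaluated
print(capacity != 0 and 100 / capacity)  # False
```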

## Conditionals ##

Conditionals are the first step to programming, and a good introduction to Python syntax.

In [16]:
x = 100
if x > 0:
    print('Positive Number')
elif x < 0:
    print('Negative Number')
else:
    print('Zero!')

Positive Number


In [17]:
# indentation is MANDATORY
# blocks are closed by indentation level
if x > 0:
    print('Positive Number')
    if x >= 100:
        print('Huge number!')

Positive Number
Huge number!


There is also a way to write `if` statements "inline", i.e., in a single line, for simplicity.

In [18]:
words = ["the", "list", "of", "words"]

x = "long list" if len(words) > 10 else "short list"
x

'short list'

## More Flow Control ##

In [19]:
# make a loop
count = 0
while count < 10:
    # verbose way
    # count = count + 1
    # shorter, equivalent way
    count += 1
print(count)

10


In [20]:
# use range
for i in range(5):
    print(i)

0
1
2
3
4


__Important point__: in Python, we always count from 0!

In [21]:
# what is range?
type(range)

type

In [22]:
range?
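
Besides a single stop value, `range` also accepts a start and a step. A quick sketch:

```python
# range(start, stop, step): stop is exclusive
print(list(range(2, 10, 3)))  # [2, 5, 8]

# a negative step counts down
print(list(range(5, 0, -1)))  # [5, 4, 3, 2, 1]

# a range is lazy: it knows its length without building a list
print(len(range(1_000_000)))  # 1000000
```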

In [23]:
# iterate over a list we make up
for carrier in ['electricity', 'hydrogen', 'methane']:
    print(carrier, len(carrier))

electricity 11
hydrogen 8
methane 7


In [24]:
# iterate over a list and count indices
for i, carrier in enumerate(['electricity', 'hydrogen', 'methane']):
    print(i, carrier, len(carrier))

0 electricity 11
1 hydrogen 8
2 methane 7


What is the thing in brackets? __A list!__ Lists are one of the core Python data structures.

## Lists ##

In [25]:
l = ['electricity', 'hydrogen', 'methane']
type(l)

list

In [26]:
# lists have lots of methods
l.sort()
l

['electricity', 'hydrogen', 'methane']

In [27]:
# we can convert a range to a list
r = list(range(5))
r

[0, 1, 2, 3, 4]

There are many different ways to interact with lists. For instance:

__list.append(x)__ Add an item to the end of the list.

__list.extend(L)__ Extend the list by appending all the items in the given list.

__list.insert(i, x)__ Insert an item at a given position.

__list.remove(x)__ Remove the first item from the list whose value is x.

__list.pop([i])__ Remove the item at the given position in the list, and return it.

__list.index(x)__ Return the index in the list of the first item whose value is x.

__list.count(x)__ Return the number of times x appears in the list.

__list.sort()__ Sort the items of the list in place.

__list.reverse()__ Reverse the elements of the list in place.
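
The methods listed above can be tried out on a small example list (the list contents are made up for illustration):

```python
fuels = ["coal", "gas"]

fuels.append("wind")            # ['coal', 'gas', 'wind']
fuels.extend(["solar", "gas"])  # ['coal', 'gas', 'wind', 'solar', 'gas']
fuels.insert(0, "nuclear")      # ['nuclear', 'coal', 'gas', 'wind', 'solar', 'gas']
fuels.remove("gas")             # removes only the first 'gas'
last = fuels.pop()              # removes and returns the last item

print(last)                 # gas
print(fuels)                # ['nuclear', 'coal', 'wind', 'solar']
print(fuels.index("wind"))  # 2
print(fuels.count("coal"))  # 1

fuels.sort()                # sorts in place and returns None
print(fuels)                # ['coal', 'nuclear', 'solar', 'wind']
fuels.reverse()
print(fuels)                # ['wind', 'solar', 'nuclear', 'coal']
```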

In [28]:
# "add" two lists
x = list(range(5))
y = list(range(10,15))
z = x + y
z

[0, 1, 2, 3, 4, 10, 11, 12, 13, 14]

In [29]:
# access items from a list
print('first', z[0])
print('last', z[-1])
print('first 3', z[:3])
print('last 3', z[-3:])

first 0
last 14
first 3 [0, 1, 2]
last 3 [12, 13, 14]


In [30]:
# this index notation also applies to strings
name = 'Power Plant Reuter-West'
print(name[:5])

Power
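
Slices can also take a start, a stop and a step, written `[start:stop:step]`. A short sketch using the same values as above:

```python
z = [0, 1, 2, 3, 4, 10, 11, 12, 13, 14]

print(z[2:5])   # [2, 3, 4]  (stop index is exclusive)
print(z[::2])   # [0, 2, 4, 11, 13]  (every second item)
print(z[::-1])  # [14, 13, 12, 11, 10, 4, 3, 2, 1, 0]  (reversed copy)

name = "Power Plant Reuter-West"
print(name[-4:])  # West
```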


In [31]:
# you can also test for the presence of items in a list
5 in z

False

Python is full of tricks for iterating over and working with lists.

In [32]:
# a cool Python trick: list comprehension
squares = [n**2 for n in range(5)]
squares

[0, 1, 4, 9, 16]
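
A list comprehension can also filter items with a trailing `if`. The sketch below shows one next to the equivalent explicit loop:

```python
# comprehension with a condition: squares of even numbers only
even_squares = [n**2 for n in range(10) if n % 2 == 0]
print(even_squares)  # [0, 4, 16, 36, 64]

# the same result written as an ordinary loop
result = []
for n in range(10):
    if n % 2 == 0:
        result.append(n**2)

print(result == even_squares)  # True
```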

In [33]:
# iterate over two lists together using zip
for item1, item2 in zip(x,y):
    print('first:', item1, 'second:', item2)

first: 0 second: 10
first: 1 second: 11
first: 2 second: 12
first: 3 second: 13
first: 4 second: 14


We are almost there. We have the building blocks we need to do basic programming. But Python has some other data structures we need to learn about.

## Dictionaries ##

This is an extremely useful data structure. It maps __keys__ to __values__.

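Since Python 3.7, dictionaries preserve the order in which keys were inserted, but values are looked up by key, not by position.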

In [34]:
# different ways to create dictionaries
d = {
    'name': 'Reuter West',
    'capacity': 564,
    'fuel': 'hard coal',
}
e = dict(name='Reuter West', capacity=564, fuel="hard coal")
e

{'name': 'Reuter West', 'capacity': 564, 'fuel': 'hard coal'}

In [35]:
# access a value
d['capacity']

564

Square brackets ``[...]`` are Python for "get item" in many different contexts.
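
As a small illustration of that statement, the same bracket syntax works on dictionaries, lists, strings and tuples (the example values are chosen only for this sketch):

```python
plant = {"name": "Reuter West", "capacity": 564}
fuels = ["coal", "gas", "wind"]
word = "hydrogen"
coords = (52.53, 13.24)

print(plant["capacity"])  # 564    (dictionary: look up by key)
print(fuels[1])           # gas    (list: look up by position)
print(word[0])            # h      (string: look up by position)
print(coords[0])          # 52.53  (tuple: look up by position)
```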

In [36]:
# test for the presence of a key
print('fuel' in d)

True


In [37]:
# try to access a non-existent key
d['technology']

KeyError: 'technology'

In [38]:
# a way around missing keys -> defaults
d.get('technology', 'OCGT')

'OCGT'

In [39]:
# add a new key
d['technology'] = 'CHP'
d

{'name': 'Reuter West',
 'capacity': 564,
 'fuel': 'hard coal',
 'technology': 'CHP'}

In [40]:
# iterate over keys
for k in d:
    print(k, d[k])

name Reuter West
capacity 564
fuel hard coal
technology CHP


In [41]:
# better way
for k, v in d.items():
    print(k, v)

name Reuter West
capacity 564
fuel hard coal
technology CHP
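
A few more dictionary methods are handy in practice. In this sketch, the updated capacity of 600 is an invented value used only to show the mechanics:

```python
d = {"name": "Reuter West", "capacity": 564, "fuel": "hard coal"}

print(list(d.keys()))    # ['name', 'capacity', 'fuel']
print(list(d.values()))  # ['Reuter West', 564, 'hard coal']

# update() overwrites existing keys and adds new ones
d.update({"capacity": 600, "technology": "CHP"})

# pop() removes a key and returns its value
removed = d.pop("fuel")

print(removed)  # hard coal
print(d)        # {'name': 'Reuter West', 'capacity': 600, 'technology': 'CHP'}
```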


## Functions ##

For longer and more complex tasks, it is important to organize your code into reusable elements.

Cutting and pasting the same or similar lines of code is tedious and opens you up to errors.

**DRY** principle: "don't repeat yourself".

Functions are a central part of advanced Python programming.

Functions take some inputs ("arguments") and do something in response.

Usually functions return something, but not always.
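
A function without a `return` statement still returns something, namely `None`. The function `greet` below is a made-up example to show this:

```python
def greet(name):
    """Print a greeting; there is no return statement."""
    print("Hello", name)

result = greet("World")  # prints: Hello World
print(result)            # None
```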

In [42]:
# define a function
def say_hello():
    """Return the word hello."""
    return 'Hello'

In [43]:
# functions are also objects
type(say_hello)

function

In [44]:
# this doesn't call the function; in IPython, the "?" shows its documentation
say_hello?

In [45]:
# this does
say_hello()

'Hello'

In [None]:
# assign the result to something
res = say_hello()
res

In [46]:
# take some arguments
def say_hello_to(name):
    """Return a greeting to `name`"""
    return 'Hello ' + name

In [47]:
# intended usage
say_hello_to('World')

'Hello World'

In [48]:
# take an optional keyword argument
def say_hello(name, german=False):
    """Say hello in multiple languages."""
    if german:
        greeting = 'Guten Tag '
    else:
        greeting = 'Hello '
    return greeting + name

In [None]:
print(say_hello('Mary'))
print(say_hello('Max', german=True))

### Anonymous Functions

In [49]:
def square(n):
    return n**2

In [50]:
square(3)

9

In [51]:
square = lambda n: n**2

In [52]:
square(2)

4

The version defined with `lambda` is called an anonymous function. A lambda is limited to a single expression, so it isn't appropriate in most cases, but it can be useful for small tasks.
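
A typical small use is passing a lambda as the `key` argument of `sorted()` or `max()`. The plant names and capacities below are invented for this sketch:

```python
# (name, capacity) pairs; hypothetical values for illustration only
plants = [("Plant A", 564), ("Plant B", 120), ("Plant C", 900)]

# sort by the second element of each tuple
by_capacity = sorted(plants, key=lambda plant: plant[1])
print(by_capacity)  # [('Plant B', 120), ('Plant A', 564), ('Plant C', 900)]

# pick the entry with the largest capacity
largest = max(plants, key=lambda plant: plant[1])
print(largest)      # ('Plant C', 900)
```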

### Pure vs. Impure Functions

Functions that don't modify their arguments or produce any other side-effects are called [_pure_](https://en.wikipedia.org/wiki/Pure_function). 

Functions that modify their arguments or cause other actions to occur are called _impure_.

Below is an impure function.

In [None]:
def remove_last_from_list(input_list):
    input_list.pop()

In [None]:
names = ['Max', 'Martha', 'Marie']

In [None]:
remove_last_from_list(names)

In [None]:
names

In [None]:
remove_last_from_list(names)

In [None]:
names

We can do something similar with a pure function.

In general, pure functions are safer and more reliable.

In [None]:
def remove_last_from_list_pure(input_list):
    new_list = input_list.copy()
    new_list.pop()
    return new_list

In [None]:
names = ['Max', 'Martha', 'Marie']

In [None]:
new_names = remove_last_from_list_pure(names)

In [None]:
names

In [None]:
new_names
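
To make the difference visible in one place, here is a compact sketch with printed results. The function names `drop_last_in_place` and `drop_last_copy` are hypothetical stand-ins for an impure and a pure variant:

```python
def drop_last_in_place(items):
    """Impure: changes the caller's list and returns None."""
    items.pop()

def drop_last_copy(items):
    """Pure: leaves the input untouched and returns a new list."""
    return items[:-1]

names = ["Max", "Martha", "Marie"]

shorter = drop_last_copy(names)
print(shorter)  # ['Max', 'Martha']
print(names)    # ['Max', 'Martha', 'Marie']  (unchanged)

drop_last_in_place(names)
print(names)    # ['Max', 'Martha']  (the original list was modified)
```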

### Namespaces

In Python, a [namespace](https://docs.python.org/3/tutorial/classes.html#python-scopes-and-namespaces) is a mapping between variable names and Python objects. You can think of it like a dictionary.

The namespace can change depending on where you are in your program. Functions can "see" the variables in the parent namespace, but they can also redefine them in a private scope.

In [53]:
name = 'Max'

def print_name():
    print(name)

def print_name_v2():
    name = 'Martha'
    print(name)
    
print_name()
print_name_v2()
print(name)

Max
Martha
Max


## Exercises

**Task 1:** What is 5 to the power of 5?

In [54]:
5 ** 5

3125

**Task 2:** Split the following string into a list by splitting on the space character:

In [55]:
s = "Data Science for Energy System Modelling"

In [56]:
s.split(" ")

['Data', 'Science', 'for', 'Energy', 'System', 'Modelling']

**Task 3:** Create a list with the names of every planet in the solar system (in order)

In [57]:
planets = ["Mercury", "Venus", "Earth", "Mars", "Jupiter", "Saturn", "Uranus", "Neptune"]

**Task 4:** Have Python tell you how many planets there are by examining your list

In [58]:
len(planets)

8

**Task 5:** Use slicing to display the first four planets (the rocky planets)

In [59]:
planets[:4]

['Mercury', 'Venus', 'Earth', 'Mars']

**Task 6:** Iterate through your planets and print the planet name only if it has an "s" at the end

In [None]:
for p in planets:
    if p.endswith("s"):
        print(p)

Venus
Mars
Uranus


In [60]:
# variation: planets ending in "y"
for planet in planets:
    if planet.endswith("y"):
        print(planet)

Mercury


**Task 7:** Create a dictionary that contains the main facts about the Reuter West power plant.

> https://powerplants.vattenfall.com/reuter-west/

In [61]:
rw = {
    "Country": "Germany",
    "Electricity Capacity": 564,
    "Heat Capacity": 878,
    "Technology": "Combined heat and power (CHP)",
    "Main Fuel": "Hard coal",
    "Vattenfall ownership share": "100%",
    "Status": "In Operation",
}

**Task 8:** Use this dictionary to access the main fuel type.

In [62]:
rw["Main Fuel"]

'Hard coal'

**Task 9:** Add the power plant's approximate latitude and longitude to the dictionary.

In [63]:
rw["x"] = 13.24  # longitude in degrees east
rw["y"] = 52.53  # latitude in degrees north

**Task 10:** Write a function that converts units of energy from 'ktoe' to 'GWh'

> https://www.iea.org/data-and-statistics/data-tools/unit-converter

In [64]:
def ktoe_to_gwh(x):
    return 11.63 * x

**Task 11:** Write a more general unit conversion function that converts between all units of energy listed under the link below. The function should take arguments: for the original value, the original unit and the target unit. Implement the function in a way that the default target unit is "Wh".

> https://www.iea.org/data-and-statistics/data-tools/unit-converter

> You can also just pick three units to convert between if you don't feel like going through all combinations.

In [65]:
def to_joule(value, from_unit):
    # Units are matched by suffix, so SI prefixes (e.g. the "k" in "ktoe") are ignored.
    # Each factor is the number of units per joule,
    # e.g. 1 J = 2.388e-11 toe (1 toe = 41.868 GJ).
    if from_unit.endswith('cal'):
        return value / 0.2390
    elif from_unit.endswith('Btu'):
        return value / 0.0009478
    elif from_unit.endswith('Wh'):
        return value / 0.0002778
    elif from_unit.endswith('toe'):
        return value / 2.388e-11
    elif from_unit.endswith('tce'):
        return value / 3.412e-11
    else:
        raise NotImplementedError(f"unknown unit: {from_unit}")

In [66]:
def convert_unit(value, from_unit, to_unit="Wh"):
    x = to_joule(value, from_unit)
    if to_unit.endswith('cal'):
        x *= 0.2390
    elif to_unit.endswith('Btu'):
        x *= 0.0009478
    elif to_unit.endswith('Wh'):
        x *= 0.0002778
    elif to_unit.endswith('toe'):
        x *= 2.388e-11
    elif to_unit.endswith('tce'):
        x *= 3.412e-11
    else:
        raise NotImplementedError(f"unknown unit: {to_unit}")

    return x

In [67]:
# 200 toe expressed in Wh (roughly 2.33 GWh)
round(convert_unit(200, "toe"))

2326633166
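
An alternative design is to keep the conversion factors in a dictionary, which avoids the long `if`/`elif` chains. This is only a sketch: the name `convert_energy` and the table `JOULES_PER_UNIT` are made up here, the factors are approximate reference values (assuming 1 cal = 4.184 J, 1 toe = 41.868 GJ and 1 tce = 29.3076 GJ), and unit prefixes such as "k" or "M" are not handled.

```python
# Approximate size of one unit in joules (assumed reference values)
JOULES_PER_UNIT = {
    "J": 1.0,
    "cal": 4.184,
    "Btu": 1055.06,
    "Wh": 3600.0,
    "toe": 41.868e9,
    "tce": 29.3076e9,
}

def convert_energy(value, from_unit, to_unit="Wh"):
    """Convert `value` between the units listed in JOULES_PER_UNIT."""
    try:
        joules = value * JOULES_PER_UNIT[from_unit]
        return joules / JOULES_PER_UNIT[to_unit]
    except KeyError as err:
        # a unit that is not in the table ends up here
        raise ValueError(f"unknown unit: {err}") from None

# 1 toe is about 1.163e7 Wh, i.e. 11.63 MWh
print(convert_energy(1, "toe"))
```

Adding a unit then only requires one new dictionary entry, and the forward and backward conversions cannot get out of sync.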

**Task 12:** Verify the function above by looping through all combinations of unit conversions and assert that applying the function back and forth results in the same value.

In [68]:
from itertools import product

In [69]:
units = ["cal", "Btu", "Wh", "toe", "tce"]

In [70]:
import math

for i, j in product(units, units):
    x = convert_unit(convert_unit(100, i, j), j, i)
    # compare with a tolerance: floating-point round-off can give 100.00000000000001
    assert math.isclose(x, 100), (i, j, x)
