WISO100303 / Johannes Schmidt & Peter Regner

# **An introduction to scientific programming**


# Introduction


  
  

## An appetizer

How much gas did Austria save in 2022?

For the full analysis (in German, with code in R), see [energie.wifo.ac.at](https://energie.wifo.ac.at/analysis/gas-savings).


![Gas consumption in Austria](images/gas-consumption-savings.png)

## Survey

Results of mentimeter survey.

## Who we are

Johannes Schmidt <johannes.schmidt@boku.ac.at>
* Associate Prof. in Energy & Resource Economics at the Institute of Sustainable Economic Development
* Works on modeling of renewable energy systems in R, GAMS (and Python)
* Studied Computer Science at TU Wien

Peter Regner <peter.regner@boku.ac.at>
* PhD student at the Institute of Sustainable Economic Development
* Worked almost 7 years as Python developer in semiconductor industry
* Studied mathematics at TU Wien

## Lecture dates

- **Lecture and exercise presentations**: 14:00-17:15, 15min break. See BOKU online for all dates
- **Interim tests**:
    -  2025-11-25, 14:00-15:30. On laptop, but in presence! No registration necessary.
    -  2026-01-27, 14:00-15:30. On laptop, but in presence! No registration necessary.

## Grading

**Expectations:**

- **Assignments:** Complete 4 weekly homework exercises. You may work in groups, but each student must submit their own solution by sharing the notebook link on BOKU Learn.
- **Attendance:** Mandatory during homework discussions. Be prepared to present your solutions.

Homework solutions will be reviewed at the start of each session.

**Grade Calculation:**

- **Interim tests (70% of total points):**
- **Homework exercises (30% of total points):**
    - Mark completed exercises in BOKU Learn before the lecture.
    - Students will be randomly selected to present their solutions.
    - Failure to present or explain a marked exercise results in losing all points for that week and increases your chances of being selected again.
    - Incorrect or incomplete solutions lose points for the specific exercise.
    - Ensure your code passes all tests without modifying them.

**Passing Requirements:**

To pass, you must score more than 50% on both homework and each interim test:

```=if(and([[Checkmarks]] > 50, [[first_test]] > 50, [[second_test]] > 50), [[total_weighted_score]], 0)```

Grades are available on [_BOKU Learn_](https://learn.boku.ac.at/grade/report/user/index.php?id=75887).

![Mapping from points to grade letters](images/grading-point-letter-mapping.png)

## Help

If you need help, there are several things you can do.
- Google (programming is a mixture of knowing how to do things and knowing how to google the rest). Stack Overflow is a great resource, but it will show up in your Google results anyway.
- Use the BOKU Learn forum for your questions. Please do not post complete solutions there.
- Ask us in class. You *have* to prepare questions once, but everyone is welcome to ask questions about in-class or home exercises, or about anything else you didn't grasp. The more you contribute, the better for all of us!

## A word on the use of Artificial Intelligence

Modern AI tools for text (large language models, LLMs) such as ChatGPT have become extremely helpful in many areas, including programming: they can generate code snippets, give advice, and serve as a handy replacement for Google.

However, we discourage you from solving the homework exercises with ChatGPT or similar tools. To learn programming, it is important to try out things yourself (and eventually fail). Only if you are stuck, ask a learning buddy for help: human or non-human.

AI can help as an assistant, but it is (still?) crucial to understand how programming works, because AI tools do not guarantee correctness:

<img src="images/chat-gpt.png"  width="600">

([Find here another example.](https://chat.openai.com/share/f321bcf6-0c56-4683-8fa2-252218f29101))

We will take a quick look at some AI tools in lecture 04 and how they can help.

Please note that, even if we cannot check if you copy solutions from AI tools or fellow students, you need to be able to present the solutions in class. And during the two tests, collaboration/communication, AI tools and materials except the ones from the lecture are not allowed.

## Class content

Learn to program in a scientific context using Python.

- Lecture 1: Python introduction 1 (Google Colab, functions, variables) 
- Lecture 2: Python introduction 2 (recap, order of execution, numpy arrays, plotting)
- Lecture 3: Python introduction 3 (recap)
 
------ Test ------

- Lecture 4: Object types and more about functions
- Lecture 5: Numpy and the Python scientific ecosystem
- Lecture 6: Numpy (recap & fancy indexing)
- Lecture 7: Pandas 
- Lecture 8: Models & Scikit-Learn

------ Test ------ 

Practical applications in the class deal with renewable electricity, predicting electricity load, and how to find the closest toilet in Vienna.

# Google Colab

- Python is a programming language. Python can be executed with different tools.
- We will use the platform *Google Colab* https://colab.google/
- It works out of the box as a hosted platform to run Python code and do data analysis in your web browser.


There are many good and very similar alternatives, such as:

- [Jupyter](https://jupyter.org/) is free and can be installed on your own computer.
- [VSCode](https://code.visualstudio.com/docs/datascience/jupyter-notebooks) is a comprehensive code editor that supports Jupyter notebooks.
- [GitHub Codespaces](https://github.com/features/codespaces) is very similar to VSCode, but hosted in the cloud, i.e. you can use it directly from your browser.
- [databricks](https://www.databricks.com/) is an enterprise-grade platform for data science and engineering.
- [marimo](https://marimo.io/) is the new cool kid on the block.


## Quick demo of Google Colab

- create notebooks in your browser, allows you to execute code
- for each notebook there is an automatic version history (like Google Docs) - if it is stored on your Google Drive
- you can invite collaborators (like Google Docs again)
- there is a *Table of contents*
- keyboard shortcuts make things way faster (Tools -> Keyboard shortcuts)
- there is more fancy stuff, just click through the menu!

Notebooks can be shared via link:
- click on *Share* on the top right
- set access rights (e.g. "Anyone with the link can view")
- copy the link and submit it on BOKU Learn

## Lecture workflow

- one Colab notebook per lecture: lecture content & in-class exercises
    - link shared on BOKU learn before the lecture
    - save a copy in your Google Drive (File -> Save a copy in Drive)
    - solve exercises during class (will not be graded)

- one Colab notebook per homework assignment
    - link for homework shared via BOKU learn
    - **save a copy in your Google Drive (File -> Save a copy in Drive)**
    - solve exercises, share the notebook link and submit on BOKU Learn

Homework submissions themselves will not be graded; only your presentation and the number of checked exercises will be considered for the final grade.

# Python Basics

**This is text shown in the notebook with *markdown***

Formulas are also supported via $\LaTeX$:
 $$
    e^{i\pi} + 1 = 0
 $$

Cells of type _Code_ can be used to execute Python code.

Python as a calculator:

In [None]:
2 + 3 * 5

A common first example for programming languages is the so-called [hello world program](https://en.wikipedia.org/wiki/%22Hello,_World!%22_program): it simply displays the text "Hello world!".

In [None]:
print("Hello World!")

Note that `print()` does not refer to printing on paper, but displaying text on the screen.

In [None]:
# BTW, this is a comment, no markdown here!
# Comments are pieces of text inside code cells that are not executed.
# They start with a # symbol.

In [None]:
# Great, but can we do more?
x = 1

In [None]:
x

x is a variable. A variable is a name with an associated value. The value can be changed by assignment.

In [None]:
x = 2


**Important:** Note that `=` is not a comparison, but an assignment operation. For example, `x = 1` sets the variable `x` to the value `1`. In mathematics you would write this as $x:=1$, which might be slightly more intuitive, because the asymmetric symbol $:=$ makes it visible that the left and the right side play different roles. The programming language R uses the syntax `x <- 1`. In Python, the value on the right side of the `=` symbol is assigned to the variable on the left side.
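To make the difference between assignment and comparison concrete, here is a minimal sketch. The variable names `temperature` and `is_warm` are made up for illustration, and the comparison operator `==` is only previewed here.

```python
temperature = 20                 # assignment: the name temperature now refers to 20
temperature = temperature + 5    # the right side is evaluated first, then stored
print(temperature)               # 25

is_warm = temperature == 25      # == compares two values and gives True or False
print(is_warm)                   # True
```

Reading `temperature = temperature + 5` as a mathematical equation makes no sense; read as "compute the right side, then store the result under the name on the left", it does.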

In [None]:
#This code won't work.
#x := 2

**Important:** Note that we have to follow strict rules for how we write code, the *syntax*. In programming, syntax refers to the set of rules that dictate how programs written in a particular programming language should be structured. These rules define the correct combinations of symbols, keywords, and operators that are valid in that language.

The notebook always displays the value of the last expression in a cell:

In [None]:
x

One can also use `print(x)` to show something on the console. Observe the difference between using `x` and `print(x)`.

In [None]:
print(x)

In [None]:
x + 1
x

In [None]:
print(x + 1)
print(x)

You can have as many variables as you want. A variable name has to start with a letter or an underscore `_`, and may otherwise contain letters, digits, and underscores.
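As a quick way to try out the naming rules, the sketch below uses the built-in string method `isidentifier()`, which reports whether a piece of text would be a syntactically valid name. The example names are made up for illustration.

```python
# isidentifier() checks the naming rules without creating any variable
print("side_a_m".isidentifier())     # True
print("_temporary".isidentifier())   # True
print("rotor2".isidentifier())       # True
print("2nd_rotor".isidentifier())    # False: starts with a digit
print("wind-speed".isidentifier())   # False: contains a hyphen
```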

In [None]:
y = 7

In [None]:
z = x * y

In [None]:
a_good_variable_name_is_descriptive_but_not_too_long = 43

In [None]:
a_good_variable_name_is_descriptive_but_not_too_long * 25

## Exercise 1

Create a variable `side_a_m` and assign the value 2.12 to it. Create a second variable `side_b_m` and assign the value 4.12 to it. Create a third variable `size_of_rectangle_m2` and assign the value of `side_a_m` * `side_b_m` to it. Print the result `size_of_rectangle_m2`.

In [None]:
# # # # # YOUR SOLUTION GOES HERE # # # # #

In [None]:
# # # # # RUN THIS CELL TO CHECK YOUR RESULTS # # # # # 

from urllib.request import urlretrieve
import os.path
if not os.path.exists('check.py'):
    urlretrieve('https://bokubox.boku.ac.at/get/62a8d3192270764797138e9ef22b0a6b/check.py', filename='check.py')
from check import check_solution

check_solution([
    ("size_of_rectangle_m2", 8.7344)
], globals())

<div style="color:#555;border-top:1px solid #999;text-align:right;padding:4px;">End of exercise</div>

Please observe: code is executed from top to bottom, both across the cells of the notebook (when you run them in order) and within each cell. The order matters!

In [None]:
size_of_rectangle_m2 = side_a_m * side_b_m
side_a_m = 5
side_b_m = 2
print(size_of_rectangle_m2)

# Functions
There are a lot of digits after the decimal point in the last result... can we reduce them by rounding?

In [None]:
print(round(size_of_rectangle_m2, 2))

What happened?

We called two functions: `print()` and `round()`. Calling a function means that code which is hidden inside the function is executed. Inside the `()` we can pass one or several arguments to the function: these are values which are used inside the function for its calculation. A function returns a result (for example, `round()` returns the rounded value of the number passed to it) and it may produce a side effect (for example, `print()` displays its argument on the screen).

In [None]:
round(1.45)

In [None]:
round(1.45, 1)

The function signature is `round(number, ndigits)`.

It has 2 arguments:
 - the number to round
 - the number of digits after the decimal point to round to
  
 Python functions can have default arguments: if you do not supply an argument in the function call, a default value is used automatically. In the case of `round()`, if no second value is supplied, the number is rounded to the nearest integer. The value of `round(x)` equals that of `round(x, 0)`, but `round(x)` returns an integer (e.g. `3`), while `round(x, 0)` returns a number with a decimal point (e.g. `3.0`).
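A small sketch of how the optional second argument of `round()` behaves. The numbers are arbitrary examples; note that the first two results are equal in value but are displayed differently.

```python
rounded_default = round(7.8)        # no second argument: nearest integer
rounded_zero = round(7.8, 0)        # second argument given explicitly
rounded_two = round(3.14159, 2)     # keep two digits after the decimal point

print(rounded_default)                   # 8
print(rounded_zero)                      # 8.0
print(rounded_two)                       # 3.14
print(rounded_default == rounded_zero)   # True
```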

Google Colab automatically shows the possible arguments of a function once you type the opening parenthesis after the function name.

## Exercise 2
Austria had an energy use of 372 000 000 MWh in the year 2019. It also had a population of 8.88 million people. Calculate the per capita energy use in MWh / capita (i.e. by dividing the energy use by population size), round it to 2 digits and store it in the variable `energy_use_at_mwh_per_capita`.

Observe: you can use the underscore symbol `_` in a large number to separate groups of digits, i.e. `1000 = 1_000 = 1_0_00 = 1_0_0_0`. Furthermore, `.` is used as the decimal separator, i.e. `8.000 = 8`. Do not use `,` to separate digits in numbers: this will create a tuple, i.e. a sequence of multiple values (more about that later).
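The following sketch illustrates these number formats with arbitrary example values (deliberately not the ones from the exercise). The variable names are made up for illustration.

```python
one_million = 1_000_000
print(one_million == 1000000)   # True: underscores are ignored

print(8.000 == 8)               # True: the dot is the decimal separator

not_a_number = 12,500           # the comma creates two separate values
print(not_a_number)             # (12, 500)
```

The last line shows why a comma is dangerous: Python does not complain, but the variable no longer holds a single number.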

In [None]:
# # # # # YOUR SOLUTION GOES HERE # # # # #

In [None]:
# # # # # RUN THIS CELL TO CHECK YOUR RESULTS # # # # # 
from check import check_solution
check_solution([
    ("energy_use_at_mwh_per_capita", 41.89, "invalid result", lambda x, y: x == y, 'compare')
], globals())

<div style="color:#555;border-top:1px solid #999;text-align:right;padding:4px;">End of exercise</div>

There are of course other functions available too. You can also develop your own functions.

E.g. the power output of a wind turbine is given by the following *simplified* formula:

$$
    p_\textrm{wind}= C_p \cdot 0.5 \cdot \rho \cdot A \cdot v^3,
$$

where $C_p$ is the efficiency of the wind turbine, $\rho$ is the air density (assumed here to be around $1\,\mathrm{kg/m^3}$), $A$ is the area swept by the rotor, and $v$ is the wind speed.

Below, you see the code for a function which calculates the output of a wind turbine. Please observe the syntax: the function definition starts with `def`, followed by our choice of the name of the function. In parentheses you find the arguments of our function, also chosen by ourselves. After the `:`, you find the code which is executed upon calling the function. Please observe the indentation: in Python, it is very important. Code on the same level of indentation belongs to the same code block. Here, this means that the function code ends as soon as the indentation returns to 0.

In [None]:
# we will explain imports later. for the moment,
# it is sufficient to know that imports give access to functions and
# variables not available otherwise
import numpy as np

def windturbine_simulation_mw(wind_speed_ms, rotor_diameter_m):
    """Calculate output of a wind turbine in MW, given wind speed in m/s and rotor_diameter in m."""
    c_p = 0.4
    rho = 1
    area = rotor_diameter_m**2 * np.pi / 4
    p_out = c_p * 0.5 * rho * area * wind_speed_ms**3
    return p_out / 1_000_000

wind_speed_ms_stuhleck = 14
rotor_diameter_m_vestas = 100
windturbine_simulation_mw(wind_speed_ms_stuhleck, rotor_diameter_m_vestas)

The function call in the cell below won't work, because we do not provide any parameters to the function.

Check the function definition: `windturbine_simulation_mw(wind_speed_ms, rotor_diameter_m)`

The function expects 2 parameters, i.e. `wind_speed_ms` and `rotor_diameter_m`

In [None]:
# windturbine_simulation_mw()

Again, this won't work because we passed only one parameter to a function which expects two parameters:

In [None]:
# This won't work either. Now we provide one parameter, but 2 are required! 
# windturbine_simulation_mw(wind_speed_ms_stuhleck)
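To see what "won't work" means in practice, the sketch below calls a two-parameter function with only one argument and displays the resulting error message instead of stopping the notebook. The function `rectangle_area_m2` is a made-up stand-in, and `try` / `except` is used here only to catch the error; it has not been covered in the lecture yet.

```python
def rectangle_area_m2(side_a_m, side_b_m):
    """Hypothetical helper: area of a rectangle in m², given both sides in m."""
    return side_a_m * side_b_m

error_message = ""
try:
    rectangle_area_m2(2)            # only one of the two required arguments
except TypeError as error:
    error_message = str(error)      # Python names the missing parameter

print(error_message)
```

Python raises a `TypeError` in this situation, and the message names the parameter that was not supplied.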

## Exercise 3

The output of a solar panel is given by

$$
    p_\textrm{pv}=A \cdot e \cdot R,
$$

where $A$ is the area, $e$ is the efficiency of the panel, and $R$ is the radiation reaching the panel per unit area.

Write a function `pv_simulation_w(radiation_w, area_m2)` which calculates the output of a solar panel in W, taking the solar radiation $R$ (in W/m²) and the area $A$ (in m²) as input. Fix $e$ at 0.18. Test the function for a hypothetical location where the PV area is 20,000 m² and solar radiation is 1000 W/m².

In [None]:
# # # # # YOUR SOLUTION GOES HERE # # # # #

In [None]:
# # # # # RUN THIS CELL TO CHECK YOUR RESULTS # # # # # 
from check import check_solution
check_solution([
    ("pv_simulation_w(0, 42)", 0),
    ("pv_simulation_w(42, 0)", 0),
    ("pv_simulation_w(100, 20_000)", 360_000),
], globals())

<div style="color:#555;border-top:1px solid #999;text-align:right;padding:4px;">End of exercise</div>

# If statements

Is this result realistic? What is the rated capacity of turbines?

In [None]:
wind_speed_ms_high = 30
rotor_size_m_vestas = 100
windturbine_simulation_mw(wind_speed_ms_high, rotor_size_m_vestas)

This is too high. The largest turbine on the market currently has a rated capacity of around 15 MW. We somehow have to deal with the fact that the power output is capped at the rated capacity.

We will use if statements to improve the function. Here is a simple example of an if statement. Observe: indentation matters! Furthermore, anything enclosed in `"` is interpreted by Python as text. It is not executed as Python code and can be used to annotate output.

In [None]:
x = 1
if x < 42:
    print("x is smaller than 42")
else:
    print("x is not smaller than 42")

Let's now improve our function:

In [None]:
def windturbine_simulation_improved_mw(wind_speed_ms, rotor_diameter_m, rated_capacity_mw):
    """Calculate output of a wind turbine in MW, given wind speed in m/s and rotor_diameter in m, 
       accounting for rated capacity."""
    c_p = 0.4
    rho = 1
    area = rotor_diameter_m**2 * np.pi / 4
    p_out = c_p * 0.5 * rho * area * wind_speed_ms**3 / 1_000_000
    if p_out > rated_capacity_mw:
        p_out = rated_capacity_mw
    return p_out

wind_speed_ms_high = 25
rotor_diameter_m_vestas = 100
rated_capacity_mw_vestas = 3

print(windturbine_simulation_mw(wind_speed_ms_high, rotor_diameter_m_vestas))
print(windturbine_simulation_improved_mw(wind_speed_ms_high, rotor_diameter_m_vestas, rated_capacity_mw_vestas))

## Exercise 4

Load depends on the temperature in a V-shaped manner. This is the average temperature in Texas, plotted against the load on the electricity grid using real data:

![Load temperature curve in Texas](images/load.png)

We want to coarsely model this dependence by writing a function called `determine_load_gw`. It determines the `load` (in GW) on an electricity grid, depending on the outside temperature $t$ (in degrees Celsius). If the temperature is below 15 degrees, people start heating and therefore

$$
    \mathrm{load} = 20 + (15 - t) \cdot 1.4.
$$
    
If the temperature is above 15 degrees, people start cooling, so we assume

$$
    \mathrm{load} = 20 + (t - 15) \cdot 1.4.
$$

Test the function for temperatures -5, 0, 5, 10, 14, 15, 16, 20, 25.

In [None]:
# # # # # YOUR SOLUTION GOES HERE # # # # #

In [None]:
# # # # # RUN THIS CELL TO CHECK YOUR RESULTS # # # # # 
from check import check_solution
check_solution([
    ("determine_load_gw(-5)", 48),
    ("determine_load_gw(0)", 41),
    ("determine_load_gw(5)", 34),
    ("determine_load_gw(10)", 27),
    ("determine_load_gw(14)", 21.4),
    ("determine_load_gw(15)", 20),
    ("determine_load_gw(16)", 21.4),
    ("determine_load_gw(20)", 27),
    ("determine_load_gw(25)", 34)    
], globals())

<div style="color:#555;border-top:1px solid #999;text-align:right;padding:4px;">End of exercise</div>

## Difference between return and print in a function

The following two functions only differ in their last line.

In [None]:


def determine_load_gw_return(temperature_dc):
    """Determine load in GW given temperature in degree Celsius."""
    load = 20
    if temperature_dc < 15:
        load = load + (15 - temperature_dc) * 1.4
    if temperature_dc > 15:
        load = load + (temperature_dc - 15) * 1.4

    return load

def determine_load_gw_print(temperature_dc):
    """Determine load in GW given temperature in degree Celsius."""
    load = 20
    if temperature_dc < 15:
        load = load + (15 - temperature_dc) * 1.4
    if temperature_dc > 15:
        load = load + (temperature_dc - 15) * 1.4

    print(load)

In [None]:
determine_load_gw_return(20)

In [None]:
determine_load_gw_print(20)

They seem to give the same result... but wait!

In [None]:
load_return = determine_load_gw_return(20)
load_print = determine_load_gw_print(20)

In [None]:
load_return

In [None]:
load_return is None

In [None]:
load_print

In [None]:
load_print is None

Note the difference! The `determine_load_gw_return` function uses a `return` statement at the end. The result of this function is therefore the value of the `load` variable. The `determine_load_gw_print` function calls `print()` at the end instead. It displays the result on the screen, but returns `None`. This is a crucial difference! Note: `None` is Python's way of representing "no value".
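Why this matters becomes clear as soon as the result is needed for a further calculation. The following minimal sketch uses two made-up functions, `area_return_m2` and `area_print_m2`, which differ only in `return` versus `print()`.

```python
def area_return_m2(side_a_m, side_b_m):
    """Hypothetical example: return the area of a rectangle in m²."""
    return side_a_m * side_b_m

def area_print_m2(side_a_m, side_b_m):
    """Hypothetical example: only display the area of a rectangle in m²."""
    print(side_a_m * side_b_m)

# Returned values can be used in further calculations
total_area_m2 = area_return_m2(2, 3) + area_return_m2(4, 5)
print(total_area_m2)                  # 26

# A printed value is shown on the screen, but it is not handed back
printed_only = area_print_m2(2, 3)    # displays 6
print(printed_only)                   # None
# printed_only + 1 would fail, because None cannot be used in a calculation
```

As a rule of thumb under these assumptions: functions that calculate something should `return` the result, and the caller decides whether to print it.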

## Bonus exercise 5

Instead of using an if statement, you can also use a function that allows you to calculate the absolute value of a number, which can also be used to solve the example. Try to look it up on the internet and write the function `determine_load_alternative_gw`, using it.

In [None]:
# # # # # YOUR SOLUTION GOES HERE # # # # #

In [None]:
# # # # # RUN THIS CELL TO CHECK YOUR RESULTS # # # # # 
from check import check_solution
check_solution([
    ("determine_load_alternative_gw(-5)", 48),
    ("determine_load_alternative_gw(0)", 41),
    ("determine_load_alternative_gw(5)", 34),
    ("determine_load_alternative_gw(10)", 27),
    ("determine_load_alternative_gw(14)", 21.4),
    ("determine_load_alternative_gw(15)", 20),
    ("determine_load_alternative_gw(16)", 21.4),
    ("determine_load_alternative_gw(20)", 27),
    ("determine_load_alternative_gw(25)", 34)
], globals())

<div style="color:#555;border-top:1px solid #999;text-align:right;padding:4px;">End of exercise</div>