In [None]:
%pip install --no-cache-dir --force-reinstall https://dm.cs.tu-dortmund.de/nats/nats25_00_02_python_basics-0.1-py3-none-any.whl
import nats25_00_02_python_basics

# Introduction
## Python (Basics)

In this notebook, we will take a look at the most basic features of Python that you need to write your first pieces of code.
Understanding Python is essential for the exercises in this course.
To keep things simple for those who are unfamiliar with Python, most exercises contain code frames that you need to fill in.
As a bonus exercise in Python, you are strongly encouraged to try to understand everything in the code frames!

If you do not understand something, first try to look it up in the docs, on Stack Overflow, or wherever, and if everything else fails, ask me.  
Giving up is not an option. 😉

### Kernels and notebooks
Python is an interpreted language, which means that the source code is not compiled to machine code ahead of time but is parsed and executed by the interpreter at run time.
An instance of the interpreter is called a kernel.
Whereas interpreted languages classically execute a script and exit afterwards, a lot of modern Python programming is centered around interactive sessions on an interactive Python kernel (IPython), which accepts additional input at run time, similar to a shell, and adds benefits like built-in plot visualization.
The IPython kernel keeps all globally defined variables from previous code in scope for future reference.
Notebooks make use of that concept by keeping an IPython kernel alive in the background and feeding it the code of whatever cell you wish to execute.
The global variable scope then allows you to reference functions and variables from other cells.
But we need to keep in mind that all these variables are held in the scope of the kernel, and as soon as we restart the kernel (e.g. when reopening the notebook), the variable scope is gone.
It is therefore strongly recommended to keep your cells "in order", so that no cell references variables or functions that are only defined in a later cell.

### Variables

Python is dynamically typed: variables are not bound to a fixed type, which makes writing Python code pretty easy.
New variables do not need to be declared but can be used on the left-hand side of an assignment immediately:

In [None]:
# Basic data types
myString = "Hello World" # = 'Hello World' # Single and double quotes are interchangeable
myInt = 5
myFloat = 5.5
myBoolean = True # alternatively False

# Variables without a fixed type can take some getting used to
myVar = myString
myVar = myInt
myVar = myFloat
myVar = myBoolean
myVar = None # The equivalent of "null" in other languages

print("myString =", myString)
print("myInt =", myInt)
print("myFloat =", myFloat)
print("myBoolean =", myBoolean)
print("myVar =", myVar)

Now it is time for you to write some code.
Define two variables `firstName` and `lastName`, concatenate them with a space in between and store the result in a variable `fullName`.
String concatenation in Python works with the `+` sign (to append something that is not a string, simply convert it to a string using the function `str`).
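
A minimal sketch of the `str` conversion mentioned above. The names `label` and `count` are made up for illustration and are not part of the exercise.

```python
# Hypothetical example values, unrelated to the exercise variables
label = "Items: "
count = 3

# "+" only joins strings with strings, so the integer is converted first
message = label + str(count)
print(message)

# Without the conversion, Python raises a TypeError instead of guessing
try:
    label + count
except TypeError as error:
    print("TypeError:", error)
```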

In [None]:
# firstName = ...
# lastName = ...
# fullName = ...
pass # Your solution here
print(fullName)

This is a test cell. Do not worry if you don't understand this code yet.
At the end of this notebook, read these test cases again and check if you fully understand how they work.

In [None]:
assert firstName in fullName, "First name is not part of full name"
assert lastName in fullName, "Last name is not part of full name"
assert fullName[:len(firstName)] == firstName, "Full name does not start with first name"
assert fullName[-len(lastName):] == lastName, "Full name does not end with last name"
assert fullName[len(firstName)] == " ", "After first name must be a space"

In future notebooks, you will encounter hidden test cells that take your global variables and run some automatic tests to validate your code.
The cell below is an example of such a test cell; it runs the same code as the visible tests above.

In [None]:
nats25_00_02_python_basics.hidden_tests_9_0(fullName, lastName, firstName)

### Data structures

The most basic data structures in Python are lists, tuples and dictionaries.
- Lists (square brackets) are simply a linear collection of values, independent of their type.
Further, lists have a variable length, which makes them the go-to solution for many applications.
- Tuples (round brackets) are basically the same as lists but they are immutable: You cannot change values in a tuple or change its size.
- Dictionaries (curly brackets) are basically hashmaps and can contain any number of key-value pairs, again independent of their types.

In [None]:
myList = [1,2,3,4,5]
myTuple = (myString,myInt) # Only executes correctly after executing above code cell
myDict = {
    "Foo" : 100,
    2 : "bar",
}
print("myList =",myList)
print("myTuple =",myTuple)
print("myDict =",myDict)

To address values within lists and tuples, you can proceed as with arrays in other programming languages, by providing an index in square brackets like `myList[2]` (indices start at 0).
Addressing values in dictionaries is similar, but you need to provide a key rather than an index, as in `myDict["Foo"]`.
To get the size of any of these data structures, you can use the length function `len` as in `len(myList)`.
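
As a small sketch of indexing, `len` and the difference between mutable lists and immutable tuples (all names below are illustrative and not used elsewhere in this notebook):

```python
# Illustrative values only
exampleList = [10, 20, 30]
exampleTuple = ("a", "b")
exampleDict = {"Foo": 100, 2: "bar"}

print(exampleList[0])      # indices start at 0
print(exampleDict["Foo"])  # dictionaries are addressed by key
print(len(exampleList), len(exampleTuple), len(exampleDict))

exampleList[0] = 99        # lists are mutable
try:
    exampleTuple[0] = "z"  # tuples are immutable, so this fails
except TypeError as error:
    print("TypeError:", error)
```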

Some of the more important features of lists in Python are slicing and list comprehension.
Slicing a list/tuple means taking a sublist/subtuple defined by a starting index (inclusive) and an ending index (exclusive).
The starting and ending index are separated by a colon.
Omitting the starting index is equivalent to 0; omitting the ending index is equivalent to the length of the list/tuple.
Negative indices are valid and describe a position counted from the end of the list/tuple.

In [None]:
mySlice = myList[:] # Take everything
print("mySlice =",mySlice)
mySlice = myList[:3] # Take the first 3 elements
print("mySlice =",mySlice)
mySlice = myList[-3:] # Take the last 3 elements
print("mySlice =",mySlice)
mySlice = myList[1:4] # Take the elements at indices 1, 2 and 3
print("mySlice =",mySlice)

List comprehensions are an easy way to create a list with very little code.
The basic idea is to write an expression inside square brackets that describes all elements that should be contained in the resulting list.
The definition must start with the expression that defines the elements (below `i*2`) and end with a `for` clause that explains which values the variables in the first part should take.

In [None]:
mySecondList = [i*2 for i in myList]
print(mySecondList)

Similarly, one can write a dictionary comprehension by using curly brackets and replacing the leading expression with a mapping definition `key: value` like below.

In [None]:
mySecondDict = {str(i): 2**i for i in myList}
print(mySecondDict)

If you wish to limit the iterative part of the list comprehension to specific elements, you can append an `if` clause to the end.

In [None]:
myThirdList = [i//2 for i in myList if i % 2 == 0] # // is integer division; % is modulo
print(myThirdList)

Now it is once again time for you to write some code.
Compute the first 10 powers of 2 (1, 2, 4, $\ldots$) using a list comprehension and store the result as `firstPowers`.  
Hint: `a**b` is Python notation for $a^b$.

In [None]:
# firstPowers = 
pass # Your solution here

In [None]:
nats25_00_02_python_basics.hidden_tests_22_0(firstPowers)

### Flow control

Flow control in Python is based on the three constructs `if/elif/else`, `for/else` and `while`.
Similar to other languages, you write the keyword followed by a condition or, for `for` loops, an iteration clause (no brackets needed!), followed by a colon.
In contrast to many other languages, the body of these statements is not enclosed in brackets either, but is defined by indentation.
Experiment a little: change the variables and look at how the output changes.
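
The `else` branch of a loop is probably the least familiar of these constructs: it runs only if the loop finishes without hitting `break`. A minimal sketch with made-up values (`candidates` and `found` are illustrative names):

```python
candidates = [3, 8, 12, 5]  # illustrative values

# Search for the first value larger than 10
for value in candidates:
    if value > 10:
        print("Found", value)
        found = value
        break
else:
    # Only reached if the loop never hit "break"
    print("No value larger than 10")
    found = None
```

Changing `candidates` to a list without any value above 10 makes the `else` branch run instead.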

In [None]:
if myVar:
    print("myVar is True")
elif myInt > 3:
    print("myVar is False but at least myInt is bigger than 3")
else:
    print("neither is myVar True, nor is myInt bigger than 3")

In [None]:
for i in range(6): # range(x) creates a sequence of numbers from 0 to x-1; equivalent to [0,1,2,3,4,5] but not actually a list
    if i > 5:
        print("i is too large, breaking out of the loop!")
        break
    print(i)
else:
    print("The loop executed completely. No break happened.")

In [None]:
i = 1
while i < 13:
    print(i)
    i *= 2
else:
    print("The loop executed completely. No break happened.")

At times you wish to iterate over multiple lists at once.
In other languages, one would define an index variable and fetch the values of each iterable at the specific index in every loop.
That is quite cumbersome and requires a lot of code, which is not really "the Python way".
In Python, you can zip multiple lists using the `zip` function to obtain a joined collection (`zip([1,2,3],[4,5,6])` behaves like `[(1,4),(2,5),(3,6)]`) and "unpack" short collections in place by simply writing multiple variables on the left-hand side of an assignment.
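
To see what `zip` actually produces, it can be wrapped in `list`. The following is a small sketch with illustrative names (`numbers`, `letters`, `pairs`):

```python
numbers = [1, 2, 3]        # illustrative values
letters = ["a", "b", "c"]

# zip is lazy; wrapping it in list() shows the pairs it produces
pairs = list(zip(numbers, letters))
print(pairs)  # [(1, 'a'), (2, 'b'), (3, 'c')]

# Each pair can be unpacked into separate variables
firstNumber, firstLetter = pairs[0]
print(firstNumber, firstLetter)

# zip stops as soon as the shortest input is exhausted
shortPairs = list(zip(numbers, ["x", "y"]))
print(shortPairs)  # [(1, 'x'), (2, 'y')]
```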

In [None]:
myListA = [1,2,3]
myListB = [4,5,6]
a,b,c = myListA
print("a =",a,"b =",b,"c =",c)
for x,y in zip(myListA, myListB):
    # This unwrapping also works with multiple return values from functions!
    print("x =",x,"y =",y)

And it is time for you to code again.
Write code that concatenates the first 100 integers starting from 1 as a string (`123456789101112...`) until the string is longer than 40 characters.
Store the resulting string as `myLongString`.

In [None]:
# myLongString = ...
pass # Your solution here

In [None]:
nats25_00_02_python_basics.hidden_tests_31_0(myLongString)

### Functions

It is of course tedious and confusing to have the entire code in a large blob of text (even if separated into cells).
We would rather write compact functions to use again and again.
The easiest form to define a function in Python is by the use of the keyword `def`.
The syntax is straightforward and should feel somewhat familiar to you by now:

In [None]:
def myFunction(myArg1, myArg2):
    # This is some nicer printing method. Check out this description: https://pyformat.info/
    print("I got the arguments '{}' and '{}'.".format(myArg1,myArg2))
    return 1

myFunction("Hello", "World")
myFunction(myArg2="Hello", myArg1="World")

So basically you write `def`, the function name and a comma-separated list of arguments in round brackets, followed by a colon.
As in flow control, the body must be indented!
When calling functions, positional (unnamed) arguments are read in order, and keyword (named) arguments are assigned to the parameter with the matching name.

In case you wish to use default arguments, simply add an assignment to the function definition.
Arguments with default values must come after all arguments without default values.

In [None]:
def mySecondFunction(myArg1, myArg2="Hi"):
    print("I got the arguments '{}' and '{}'.".format(myArg1,myArg2))
    return 2

mySecondFunction("Test")
mySecondFunction("Test", "Override")

Now it is time again for you to code!
Write a function named `myFibonacci` that takes an argument `n` and returns a list of the first `n` Fibonacci numbers.
In case `n` is not given, assume a default value of 10.  
Hint: You can get a list of variable and method names of an object `o` by using the `dir(o)` function. Maybe lists have a function that you need.

In [None]:
# Define myFibonacci!
pass # Your solution here

In [None]:
nats25_00_02_python_basics.hidden_tests_38_0(myFibonacci)

### Getting `help`

You will often have to rely on predefined functions either from Python itself or imported packages.
To get information on what a function can do, you can call the `help` function with a function or type as argument, and it will print the function's documentation (if available!).

In [None]:
def some_function(x):
	"""This function is just a stub that consumes the parameter and does nothing.

	Parameters
	----------
	x : any
		This is completely ignored.
	"""
	pass

help(some_function)

### Looking at `dir`ectories (as in "table of contents")

Switching between different programming languages, one can easily forget how e.g. "that specific string operation" is named in Python.
Typically, one would need to open the documentation of `str` and look at what functions it provides.
In Python, you can get a list of *all available fields* of any variable/package/class/... by just calling the `dir` function.

Calling `dir` will also show you all the "hidden" fields of an object.
Python does not provide any functionality to hide "private" fields from other programmers. Instead, it is agreed upon that names starting with an underscore are not meant for use by other developers (the trailing underscore that `scikit-learn` uses, as in `coef_`, is a different convention that marks attributes estimated during fitting).
That does not mean that you cannot (or should not) access them, just that any errors resulting from you fiddling with these fields are *your problem*.

["After all, we're all consenting adults here."](https://mail.python.org/pipermail/tutor/2003-October/025932.html)
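
The underscore convention can be sketched with a small, purely hypothetical class (`Account` is not part of this course's code; classes themselves are a more advanced topic):

```python
class Account:
    """Hypothetical class, only for illustrating the naming convention."""

    def __init__(self, owner):
        self.owner = owner   # public field
        self._balance = 0    # "private" by convention only

account = Account("Ada")

# Filtering out names with a leading underscore leaves the public fields
publicFields = [name for name in dir(account) if not name.startswith("_")]
print(publicFields)

# Nothing stops you from reading the "private" field anyway
print(account._balance)
```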

In [None]:
# This print looks horrible
print("Plain dir print:")
print(dir(str))
# This print looks a bit nicer
print("\nString-joined dir print:")
print(", ".join(dir(str)))
# Make a nice print and filter out "hidden" fields
print("\nFiltered and string-joined dir print:")
print(", ".join([field for field in dir(str) if not field.startswith("_")]))

### Files

Reading and writing files is a core functionality of every language.
In Python, you can open read and write streams to files with the `open` function.
The first argument is the file name and the second argument is a mode string: `r` for reading or `w` for writing, optionally combined with `t` for text mode (the default) or `b` for binary mode.
As you should close streams once you are done, you can use the keyword `with`, which opens a stream and automatically closes it once the following code block is left.
File handles opened for reading can be used like an iterator over lines.
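
As a minimal sketch of both modes, the following writes two lines to a temporary file and reads them back. The file name `demo.txt` and the variable names are assumptions made for this illustration.

```python
import os
import tempfile

# Write to a fresh temporary directory so no existing file is touched
demoPath = os.path.join(tempfile.mkdtemp(), "demo.txt")

with open(demoPath, "w") as f:   # "w" creates or overwrites the file
    f.write("first line\n")
    f.write("second line\n")

with open(demoPath, "r") as f:   # "r" opens the file for reading
    linesRead = [line.strip() for line in f]

print(linesRead)
```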

In [None]:
import urllib.request # "import urllib" alone does not reliably load the request submodule
file_path, _ = urllib.request.urlretrieve("https://dm.cs.tu-dortmund.de/nats/data/loremipsum.txt")
with open(file_path,"rt") as f:
    for line in f:
        # str.strip() removes leading and trailing whitespaces such as line breaks.
        print(line.strip())

This is most of the Python basics.
If you find anything in the assignment code that you do not fully understand, try looking it up in the docs or with a search engine.
The Python community is quite large and you will very likely find your answer.

Next up are the most important packages and how to use them.
Afterwards, we will take a look at more advanced Python features that will be relevant later on in the course.