# Flow control
## Conditional statements
### Getting user input

Let's first learn how to ask the user for text with the `input` function:


In [2]:
txt = input()
print(type(txt))
print(txt)

<class 'str'>
input


In [4]:
txt = input()
n = int(txt)
print(type(n))
print(n)

ValueError: invalid literal for int() with base 10: 'my'

In [5]:
n = float(input())
if n > 0:
    print(f"number {n} is positive.")

number 5.0 is positive.


In [6]:
n = float(input())
if n > 0:
    print(f"Number {n} is positive")
else:
    print(f"Number {n} is nonpositive (either zero or negative).")

Number 6.0 is positive


The `elif` clause allows testing multiple conditions in a given order. The block corresponding to the first condition that evaluates to `True` is executed, and no further conditions are tested.

When none of the conditions evaluates to True, the block corresponding to else is executed.

In [7]:
n = float(input())
if n > 0:
    print(f"Number {n} is positive.")
elif n < 0:
    print(f"Number {n} is negative.")
else:
    print(f"Number {n} is zero.")

Number -7.0 is negative.
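
Because the conditions are tested in order, each `elif` is reached only when all conditions before it were `False`. The following is a minimal sketch of that behaviour; the helper `describe` and its thresholds are hypothetical and chosen only for illustration, and the chain is wrapped in a function so that no keyboard input is needed:

```python
def describe(n):
    """Return a short description of n using an if/elif/else chain."""
    if n > 100:
        return "large"
    elif n > 10:       # reached only when n <= 100
        return "medium"
    elif n > 0:        # reached only when n <= 10
        return "small"
    else:              # none of the conditions above was True
        return "nonpositive"

for value in (500, 50, 5, 0):
    print(value, describe(value))
```

Note that `describe(500)` returns `"large"` even though `500 > 10` and `500 > 0` are also true: only the first matching block is executed.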


# Conditional expressions
Note: Python offers a conditional expression (Python's ternary operator), which also uses the `if` and `else` keywords.

The following code assigns different values to the variable `t` depending on a condition (`n > 0`):

In [9]:
n = float(input())
if n > 0:
    t = "positive"
else:
    t = "nonpositive"
print(t)

positive


The above code can be written using a conditional expression:

In [10]:
n = float(input())
t = "positive" if n > 0 else "nonpositive"
print(t)

positive


Here is an example of a conditional expression used as the expression in a list comprehension:

In [11]:
ns = [-1, 0, 1, -1, -1, 1]
["positive" if n > 0 else "nonpositive" for n in ns]

['nonpositive', 'nonpositive', 'positive', 'nonpositive', 'nonpositive', 'positive']

# Repetition control statements: for loops

## Iterable objects
Iterable objects are capable of providing their elements one at a time. 

Here are some examples of iterable objects:

In [13]:
# Several characters of a family sitcom.
iterable1 = [ "Claire", "Phil", "Haley", "Cameron", "Luke", "Mitchell", "Jay", "Gloria" ]

# A color scale robust to colorblindness: https://cran.r-project.org/web/packages/viridis/vignettes/intro-to-viridis.html
# Hexadecimal numbers: https://wiki.osdev.org/Hexadecimal_Notation
iterable2 = ( 0xfde725, 0x5ec962, 0x21918c, 0x3b528b, 0x440154 )

# An unlikely combination of numbers in a lottery 6 out of 1,2,...,44,45.
iterable3 = { 10, 15, 23, 24, 40 }

# More on colors: https://www.rapidtables.com/web/color/RGB_Color.html
# For loop will iterate over the keys.
iterable4 = { "red":0xFF0000, "green":0x00FF00, "blue":0x0000FF, "black":0x000000, "white":0xFFFFFF }

# A for loop will iterate over the characters of the string.
iterable5 = "Statistics and Data Science"

# An arithmetic progression.
iterable6 = range( 10, 15 )
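
To see what "one at a time" means, the built-in functions `iter` and `next` can be used to request elements manually. This is only an illustrative sketch (a `for` loop does this automatically); the dictionary `colors` is a shortened copy of `iterable4`:

```python
colors = {"red": 0xFF0000, "green": 0x00FF00, "blue": 0x0000FF}

it = iter(colors)      # ask the dictionary for an iterator
first = next(it)       # first key (dictionaries iterate over their keys)
second = next(it)      # second key
rest = list(it)        # whatever is left in the iterator

print(first, second, rest)
print(list(range(10, 15)))   # the elements of an arithmetic progression
print(list("Data"))          # the characters of a string
```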

# A simple for loop

Here is an example of a `for` loop:

In [14]:
names = [ "Claire", "Phil", "Haley", "Cameron", "Luke", "Mitchell", "Jay", "Gloria" ]
for name in names:
    print(name)

Claire
Phil
Haley
Cameron
Luke
Mitchell
Jay
Gloria


In [15]:
for n in names:
    pass

# A for loop with continue
Let's introduce several `print` commands to observe the order of code execution (i.e., the flow of the program):

In [17]:
print("Before loop")
for i in range(3):
    print(f"Body start i = {i}")
    pass
    print(f"Body end i = {i}")
print("After loop")

Before loop
Body start i = 0
Body end i = 0
Body start i = 1
Body end i = 1
Body start i = 2
Body end i = 2
After loop


The keyword `continue` stops execution of the remaining part of the loop's body and transfers control to the beginning of the next iteration:

In [18]:
print("Before loop")
for i in range(3):
    print(f"Body start i = {i}")
    if i == 1:
        continue
    print(f"Body end i = {i}")
print("After loop")

Before loop
Body start i = 0
Body end i = 0
Body start i = 1
Body start i = 2
Body end i = 2
After loop
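
A typical use of `continue` is to skip items that should not be processed. A minimal sketch (the list `values` is made up for illustration) that sums only the non-negative numbers:

```python
values = [4, -1, 7, -3, 2]
total = 0
for v in values:
    if v < 0:
        continue       # skip the rest of the body for negative values
    total += v         # reached only when v >= 0
print(total)           # 4 + 7 + 2 = 13
```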


# A `for` loop with `else` or `break`
In `for` loops, the keyword `else` introduces a block of code that is executed after all iterations of the loop have finished without encountering a `break`. The `else` block is also executed when there were no iterations at all because the iterable was empty.

In [19]:
print("Before loop")
for i in range(3):
    print(f"Body start i = {i}")
    pass
    print(f"Body end i = {i}")
else:
    print("Else block")
print("After loop")

Before loop
Body start i = 0
Body end i = 0
Body start i = 1
Body end i = 1
Body start i = 2
Body end i = 2
Else block
After loop
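
The empty-iterable case can be checked directly. In this sketch the hypothetical helper `loopEvents` records which parts of a `for`/`else` statement were executed:

```python
def loopEvents(items):
    """Record which parts of a for/else statement are executed."""
    events = []
    for item in items:
        events.append(f"body {item}")
    else:
        events.append("else")   # runs because the loop contains no break
    return events

print(loopEvents([1, 2]))   # ['body 1', 'body 2', 'else']
print(loopEvents([]))       # ['else']
```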


The keyword `break` stops execution of the remaining part of the loop's body and transfers the execution point after the end of the loop. 

If an `else` part is present, `break` skips over it.

In [24]:
print("Before loop")
for i in range(3):
    print(f"Body start i = {i}")
    if i == 2:
        break
    print(f"Body end i = {i}")
else: 
    print("Else block")
print("After loop")


Before loop
Body start i = 0
Body end i = 0
Body start i = 1
Body end i = 1
Body start i = 2
After loop
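
A common use of `break` together with `else` is a search: `break` leaves the loop as soon as a match is found, and the `else` block handles the "nothing found" case. A minimal sketch, assuming a hypothetical helper `firstNegative`:

```python
def firstNegative(numbers):
    """Return the first negative number, or None when there is none."""
    for n in numbers:
        if n < 0:
            result = n
            break            # found one: the else block is skipped
    else:
        result = None        # loop finished without break: nothing found
    return result

print(firstNegative([3, -2, -5]))   # -2
print(firstNegative([1, 2]))        # None
```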


# Repetition control statements: `while` loops

The keyword `while` allows repeating a block of code as long as a provided condition evaluates to `True`.

The keywords `break`, `continue` and `else` provide the same functionality as in `for` loops.

In [18]:
from random import randint

tossedNum = randint(1, 6)
while True:
    guessedNum = int(input("I rolled a fair die. Guess the number: "))
    if tossedNum == guessedNum:
        print("Congratulations!")
        break
    print("You are wrong, try again")

You are wrong, try again
You are wrong, try again
You are wrong, try again
You are wrong, try again
You are wrong, try again
You are wrong, try again
You are wrong, try again
You are wrong, try again
You are wrong, try again
You are wrong, try again
You are wrong, try again
You are wrong, try again
You are wrong, try again
You are wrong, try again
You are wrong, try again
You are wrong, try again
You are wrong, try again
Congratulations!
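
The following sketch shows `continue`, `break` and `else` together in one `while` loop. It is a variant of the guessing game that reads guesses from a prepared list instead of the keyboard; the helper `play` and the rule of ignoring guesses outside 1 to 6 are assumptions made for this illustration:

```python
def play(tossedNum, guesses):
    """Replay the guessing game with a prepared list of guesses."""
    attempts = 0
    remaining = list(guesses)
    while remaining:                 # repeat while there are guesses left
        guess = remaining.pop(0)
        if not 1 <= guess <= 6:
            continue                 # ignore impossible guesses, do not count them
        attempts += 1
        if guess == tossedNum:
            outcome = "won"
            break                    # leave the loop, skipping the else block
    else:
        outcome = "lost"             # condition became False without a break
    return outcome, attempts

print(play(4, [1, 9, 4, 2]))   # ('won', 2)
print(play(4, [1, 2]))         # ('lost', 2)
```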


# User-defined functions

## A function, arguments and the return value

Here is an example of a user-defined function `myMean`.

The function takes one argument, `x`, and returns the arithmetic mean of the data provided in it.

In [3]:
def myMean(x):
    return sum(x)/len(x)

myMean([1, 2])

1.5

Here is another example, calling the new function from a dictionary comprehension:

In [5]:
course2grades = {
    "math":(7, 9, 6.5, 8, 8.5),
    "physics":(9, 8.5, 9.5, 8, 7.5, 9.5),
    "philosophy":(6, 6.5, 6, 7)
}

{c:myMean(gs) for c, gs in course2grades.items()}

{'math': 7.8, 'physics': 8.666666666666666, 'philosophy': 6.375}

# An argument with a default value

Here an additional argument `isGeom` is added to the function `myMean`.

The value of `isGeom` decides whether arithmetic or geometric mean is calculated.

In [9]:
from math import log, exp

def myMean(x, isGeom = False):
    if isGeom:
        logOfX = (log(xx) for xx in x)
        return exp(sum(logOfX)/len(x))
    else: 
        return sum(x)/len(x)

The new argument has a default value `False` which will be used when no value is provided when the function is called.

In [10]:
myMean([1, 2, 3])

2.0

Here are different variants of providing the value `True` for `isGeom`:

In [13]:
myMean([1, 2, 3], True)  # argument names may be omitted
myMean(x = [1, 2, 3], isGeom = True)  # or provided
myMean([1, 2, 3], isGeom = True)  # or partially provided

1.8171205928321397
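
As a quick check, the calling variants can be compared with each other and with `statistics.geometric_mean` from the standard library (available in Python 3.8 and later). This is a sketch that uses a slightly shortened copy of `myMean`, with `isclose` to allow for floating-point rounding:

```python
from math import exp, log, isclose
from statistics import geometric_mean  # standard library, Python 3.8+

def myMean(x, isGeom=False):
    if isGeom:
        return exp(sum(log(xx) for xx in x) / len(x))
    return sum(x) / len(x)

data = [1, 2, 3]
a = myMean(data, True)             # positional
b = myMean(x=data, isGeom=True)    # by name
c = myMean(isGeom=True, x=data)    # named arguments may come in any order

print(a == b == c)
print(isclose(a, geometric_mean(data)))
```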

# Raising exceptions (handling errors)

Sometimes a function might be called with an invalid argument. For example:

In [14]:
myMean([])

ZeroDivisionError: division by zero

It is often useful to provide a more informative, user-defined error message. The `raise` statement throws (raises) an exception with such a message:

In [19]:
from math import log, exp
def myMean(x, isGeom = False):
    if len(x) == 0:
        raise RuntimeError("There must be at least one element to calculate mean")
    if isGeom:
        logOfX = (log(xx) for xx in x)
        return exp(sum(logOfX)/len(x))
    else:
        return sum(x)/len(x)
    
myMean([])

RuntimeError: There must be at least one element to calculate mean
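
The caller can react to a raised exception with a `try`/`except` statement instead of letting the program stop. A minimal sketch, using a shortened copy of `myMean` (the fallback value `None` is an arbitrary choice for this example):

```python
def myMean(x):
    if len(x) == 0:
        raise RuntimeError("There must be at least one element to calculate mean")
    return sum(x) / len(x)

try:
    result = myMean([])
except RuntimeError as err:
    # the message passed to raise is available via str(err)
    result = None
    message = str(err)
    print("Could not compute the mean:", message)
```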

# Describing functions

Python provides a docstring convention for documenting functions (and other language elements).

A short description of a function, quoted with triple quotes, should be provided at the top of the function body:

In [20]:
from math import log, exp

def myMean(x, isGeom=False):
    """Calculates arithmetic or geometric mean of elements in x."""
    if len(x) == 0:
        raise RuntimeError("There must be at least one element to calculate mean.")
    if isGeom:
        logOfX = (log(xx) for xx in x)
        return exp(sum(logOfX)/len(x))
    else:
        return sum(x)/len(x)

This documentation is available as follows:

In [21]:
myMean.__doc__

'Calculates arithmetic or geometric mean of elements in x.'

# Lambda functions
Let's consider a simple, single-expression function:

In [22]:
def myMean(x):
    return sum(x)/len(x)

The same single-expression function can be written in shorter form:

In [23]:
myMean = lambda x:  sum(x)/len(x)

and be called with the usual notation:

In [24]:
myMean([1, 2, 3])

2.0

In some contexts, the `lambda` notation allows writing short, compact code.
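
One such context is the `key` argument of the built-in `sorted` function, where a small function is needed only once. The sketch below reuses the `course2grades` dictionary from above and sorts the course names by their mean grade (the variable name `byMean` is introduced only for this example):

```python
course2grades = {
    "math": (7, 9, 6.5, 8, 8.5),
    "physics": (9, 8.5, 9.5, 8, 7.5, 9.5),
    "philosophy": (6, 6.5, 6, 7),
}

# sort course names by the arithmetic mean of their grades
byMean = sorted(
    course2grades,
    key=lambda c: sum(course2grades[c]) / len(course2grades[c]),
)
print(byMean)   # ['philosophy', 'math', 'physics']
```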

# Self-study tasks

## A function to convert score to grade
An exam `score` is a number in a range from `0` to `maxScore`.
Write a function `score2grade(score, maxScore)` which implements a linear transformation of a score to a grade:

for `score = 0` the returned grade should be `1`;
for `score = maxScore` the returned grade should be `10`;
for `score` beyond the range the function should raise an exception.

Add a docstring with a short description of the function. Call the function for several combinations of the arguments to check whether it works correctly.

In [11]:
def score2grade(score, maxScore):
    """
    grading based on score
    """
    if score > maxScore or score < 0:
        raise ValueError("The score is out of range")
    elif score == 0:
        return 1
    elif score == maxScore:
        return 10
    else:
        return score/maxScore * 9 + 1

print(score2grade(0, 20))
print(score2grade(1, 20))
print(score2grade(19, 20))
print(score2grade(20, 20))
print(score2grade(21, 20))

1
1.45
9.549999999999999
10


ValueError: The score is out of range

In [30]:
score2grade.__doc__

'\n    grading based on score\n    '
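
The raw `__doc__` string of a multi-line docstring keeps its line breaks (and, depending on the Python version, its indentation). The standard-library function `inspect.getdoc` returns a cleaned-up version. A small sketch with a simplified copy of the function (range checks omitted):

```python
import inspect

def score2grade(score, maxScore):
    """
    grading based on score
    """
    return score / maxScore * 9 + 1

print(repr(score2grade.__doc__))           # raw docstring with surrounding whitespace
print(repr(inspect.getdoc(score2grade)))   # 'grading based on score'
```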

A `lambda` function to convert score to grade

In [14]:
s2g = lambda score, maxScore:  score/maxScore * 9 + 1
s2g(19, 20)

9.549999999999999

In [31]:
# Conditional expression

In [34]:
xs = [1, 2, 3, -2, -3, -4]
print([abs(x) for x in xs])
print([x if x > 0 else -x for x in xs])

[1, 2, 3, 2, 3, 4]
[1, 2, 3, 2, 3, 4]


# Arguments `sep` and `end` of `print(...)`

In [40]:
print( "A", "B", "C" )
print( "A", "B", "C", sep="-" )
print( "A", "B", "C", sep="<--->" )
print( "A", "B", sep="???" )  # the separator is printed only between arguments
print("A", sep="???")         # a single argument, so no separator is printed

A B C
A-B-C
A<--->B<--->C
A???B
A


In [41]:

print( "A" )
print( "B" )
print( "C" )

A
B
C


In [43]:
print( "A", end="" )
print( "B", end="" )
print( "C" )
print( "A", end="" )

ABC
A

## A condition in a loop in a loop: painting with dots

In [23]:
def printGrid(n):
    for i in range(1, n):
        for j in range(1, n):
            print(".", end="")
        print("\n")  # "\n" plus print's own newline leaves an empty line between rows

printGrid(10)  # prints a 9 x 9 grid of dots

.........

.........

.........

.........

.........

.........

.........

.........

.........



In [29]:

def printBackslash(n):
    for i in range(1, n):
        for j in range(1, n):
            if i == j:
                print("#", end="")
            else:
                print(".", end="")
        print("\n", end="")
printBackslash(6)


#....
.#...
..#..
...#.
....#


In [48]:
def printSlash(n):
    for i in range(1, n):
        for j in range(1, n):
            print(".", end = "")
            if j + i == n:
                print("#", end = "")
        print("\n", end="")
printSlash(8)


.......#
......#.
.....#..
....#...
...#....
..#.....
.#......


In [50]:
def printSquare(n):
    for i in range(0, n):
        for j in range(0, n):
            if i == 0 or i == n-1:
                print("#", end = "")
                continue
            elif j == 0 or j == n-1:
                print("#", end = "")
                continue
            else: 
                print(".", end = "")

        print("\n", end="")  # same as print(): ends the current row

printSquare(7)
print("A", "B", end="")

#######
#.....#
#.....#
#.....#
#.....#
#.....#
#######
A B

In [51]:
def printX(n):
    for i in range(1, n):
        for j in range(1, n):
            if i == j or i + j == n:
                print("#", end = "")
            else:
                print(".", end = "")
        print("\n", end = "")
printX(10)


#.......#
.#.....#.
..#...#..
...#.#...
....#....
...#.#...
..#...#..
.#.....#.
#.......#


In [60]:
def printLUTriangle(n):
    for i in range(0, n):
        for j in range(0, n):
            if i == 0 or j == 0:
                print("#", end = "")
            elif i + j == n:
                print("#", end = "")
            elif j >= n- i:
                print(" ", end = "")
            else:
                print(".", end = "")
        print("\n", end = "")
printLUTriangle(10)


##########
#........#
#.......# 
#......#  
#.....#   
#....#    
#...#     
#..#      
#.#       
##        


In [65]:
def printRBTriangle(n):
    for i in range(0, n):
        for j in range(0, n):
            if i == n-1 or j == n-1:
                print("#", end = "")
            elif i + j == n:
                print("#", end = "")
            elif j <= n-i:
                print(" ", end = "")
            else:
                print(".", end = "")
        print("\n", end = "")

printRBTriangle(10)

         #
         #
        ##
       #.#
      #..#
     #...#
    #....#
   #.....#
  #......#
##########


# Rewriting the function `myMean`

In [48]:
def myMean( x ):
    return sum(x)/len(x)

vs  = [ 1, 5, 10, 15, 30 ]
vs2 = (v**2 for v in vs)  # vs2 is a generator, not a tuple: values are produced on demand and not stored
myMean( vs )     # this works
myMean( vs2 )    # this raises an exception because a generator has no len()

TypeError: object of type 'generator' has no len()

In [57]:
vs2
vs2 = (v**2 for v in vs )
print("ONE")
for v in vs2:
    print(v)
print("TWO")
for v in vs2:
    print(v)

ONE
1
25
100
225
900
TWO
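
The second loop prints nothing because a generator can be consumed only once. One possible rewrite of `myMean` that also accepts generators therefore makes a single pass and counts the elements itself instead of calling `len`. This is a sketch of one option, not the only way to do it:

```python
def myMean(x):
    """Arithmetic mean of any iterable, including generators."""
    total = 0
    count = 0
    for value in x:      # a single pass, so a generator is consumed only once
        total += value
        count += 1
    if count == 0:
        raise RuntimeError("There must be at least one element to calculate mean")
    return total / count

vs = [1, 5, 10, 15, 30]
print(myMean(vs))                  # 12.2
print(myMean(v**2 for v in vs))    # 250.2
```

Another option would be to convert the argument to a list first (`x = list(x)`), at the cost of storing all values in memory.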


# Let's practice accessing (remote) files.

Goal: get the `README.md` file of this course and select from it only the text lines containing the word `course`.
Store the selected text lines as a list in `textsWithCourse`.

In [78]:
import urllib.request as rq

url = "https://raw.githubusercontent.com/LUMC/EfDS/main/README.md"

accessFile = rq.urlopen(url)
singleLines = accessFile.readlines()  # each line is a bytes object
textsWithCourse = []
for line in singleLines:
    line = line.decode("utf-8")  # convert bytes to str
    if "course" in line:
        textsWithCourse.append(line)

print(textsWithCourse)




['A course of [Statistics and Data Science master](https://www.universiteitleiden.nl/en/education/study-programmes/master/statistics--data-science), Leiden University.\n', 'The course offers a practical introduction to a few programming languages and tools currently used in data science:\n', 'During the course the students will write Python programs of growing complexity (from basic coding examples to fitting \n', 'a machine learning model). After this course you will be able to program simple reproducible data analyses \n', 'During the course you will practice writing [Python](https://www.python.org/) code. After the course you will be able to:\n', '  - The primary source for lecture, exam and retake dates/locations is **Essentials for Data Science** course `4433EDASCY` \n', '    - General course introduction\n', '    - To pass the course, the Assignments A, B, C rounded mean grade must be greater than 5.5.\n', '    - To pass the course, the group assignment rounded grade must be grea