*This notebook is part of the course materials for CS 345: Machine Learning Foundations and Practice at Colorado State University.
Original versions were created by Asa Ben-Hur.
The content is available [on GitHub](https://github.com/asabenhur/CS345).*

*The text is released under the [CC BY-SA license](https://creativecommons.org/licenses/by-sa/4.0/), and code is released under the [MIT license](https://opensource.org/licenses/MIT).*


<a href="https://colab.research.google.com/github//asabenhur/CS345/blob/master/fall24/notebooks/module00_02_python_intro.ipynb">
  <img align="left" src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

# A Python Primer

This notebook is part of the course materials for CS 345: Machine Learning Foundations and Practice at Colorado State University.
Original versions were created by Asa Ben-Hur with updates by Ross Beveridge.


### Why Python?

As it happens, Python is the most commonly used programming language in machine learning.  In our introduction to the course we argue that it is a good match for the needs of machine learning and data science, for beginners and advanced users alike.

### Why a primer?

Keep in mind that students entering CS 345 generally have some exposure to Python.  That said, people forget, and also the instructors have their own priorities. For both these reasons, let us take a bit of time and walk through some Python basics.

## Variables and basic Python types

Variables in Python can hold any type, and there is no need to declare them:

In [1]:
x = 5

To determine the type of the variable we created, use Python's built-in `type` function:

In [2]:
type(x)

int

In [3]:
x = 5.0

In [4]:
type(x)

float

You can coerce a value to a desired type, e.g., to an `int` or a `float`:

In [5]:
int(2.6)

2

In [6]:
float(2)

2.0
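
One detail worth checking for yourself is that `int` drops the fractional part rather than rounding, and that both `int` and `float` can also parse strings. The following is a minimal sketch; the variable names are only illustrative:

```python
truncated = int(2.6)         # 2: the fractional part is dropped
truncated_neg = int(-2.6)    # -2: truncation is toward zero
rounded = round(2.6)         # 3: use round() when you want rounding
from_text = int("42")        # 42: a string of digits can be converted
as_float = float("3.5")      # 3.5
print(truncated, truncated_neg, rounded, from_text, as_float)
```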

All the regular mathematical operators (+, -, \*, /, \*\*) work as expected.  The only distinction worth making is between the division operator (/) and the integer division operator (//):

In [7]:
2/5, 2//5

(0.4, 0)
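
To see the operators side by side, here is a small sketch. It also uses the remainder operator `%`, which is not in the list above, and shows how integer division treats a negative operand:

```python
power = 2 ** 10        # 1024 (exponentiation)
true_div = 7 / 2       # 3.5 (always a float, even for two ints)
floor_div = 7 // 2     # 3
neg_floor = -7 // 2    # -4: the result is rounded down, toward negative infinity
remainder = 7 % 2      # 1
print(power, true_div, floor_div, neg_floor, remainder)
```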

Boolean variables:

In [8]:
type(True), type(False)

(bool, bool)

Here is a little something to chew on:

In [9]:
type(True) == type(False)

True

## Caveats when using Jupyter notebooks

Anyone familiar with programming generally takes for granted that code is executed in sequence.  Indeed, within any single code cell in Jupyter this rule applies.  However, if we run the following cells in an order different from their order in the notebook, the end result changes:

In [10]:
# Run this cell second
foo = "Cat"

In [11]:
# Run this cell first
foo = "Dog"

In [12]:
# Run this cell third (last)
foo

'Dog'

Because you choose the order in which cells are run, your choices affect the end result.
So, when turning in an assignment, for example, we encourage you to execute "Run All" (available in the "Cell" pull-down menu) to make sure that everything runs as it should.

## Strings and Printed Output
Python strings are defined using single or double quotes:

In [13]:
'hello'

'hello'

In [14]:
type("hello")

str

In [15]:
"hello" == 'hello'

True

When not using the notebook (or the Python interpreter), you will need to use the `print` function to display the result of a computation:

In [16]:
print('Hello World!')

Hello World!


Notice that in this case there is no "`Out[ ]`" message, as this is not an output of the evaluation.

Also keep in mind that the Jupyter notebook displays the result of only the last expression in a code cell, which is another reason why you sometimes need to use `print()` in the notebook.

In [17]:
"hello world"
"hello Python"

'hello Python'

### Formatting Printed Output

Printing out useful information about the state of a computation is a common need, but generating a string that conveys the information in a readable form might require a bit of work.  That's where [formatted string literals](https://docs.python.org/3/tutorial/inputoutput.html), or f-strings for short, come in.  Here's an example:

In [18]:
import math
foo = "Cat"
print(f'The {foo} weighs {math.pi:5.2f} pounds.')

The Cat weighs  3.14 pounds.
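
The part after the colon inside the braces is a format specification. As a rough sketch of a few other specifications you may find handy (the variable names and values here are made up for illustration):

```python
import math

name = "Cat"
count = 1234567
ratio = 0.256

padded = f'[{name:>6}]'       # right-align in a field of width 6
pi_text = f'{math.pi:.3f}'    # three digits after the decimal point
grouped = f'{count:,}'        # thousands separators
percent = f'{ratio:.1%}'      # multiply by 100 and append a percent sign
print(padded, pi_text, grouped, percent)
```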


## Lists

A list is an ordered collection of values, where each value is identified by an index; Python lists are analogous to Java's `ArrayList` data structure, but have greater flexibility.

Let's create a few lists:

In [19]:
vocabulary = ["ameliorate", "castigate", "defenestrate"]
numbers = [17, 123]
mixed_list = [1, 'one', 2.21, [1,2,3]]
empty = []

That last list we created is the empty list.
You can ask a list for its length:

In [20]:
len([])

0

You can append an element to a list using its ```append``` method:

In [21]:
vocabulary = ["ameliorate", "castigate", "defenestrate"]
vocabulary.append('your favorite word')
vocabulary

['ameliorate', 'castigate', 'defenestrate', 'your favorite word']
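
Lists have several other methods besides `append`. Here is a brief sketch of three of them (`extend`, `insert`, and `pop`); the word "abate" is just an example value:

```python
vocabulary = ["ameliorate", "castigate"]
vocabulary.extend(["defenestrate", "cat"])   # append every item of another list
vocabulary.insert(0, "abate")                # insert an item at a given index
last = vocabulary.pop()                      # remove and return the last item
print(vocabulary, last)
```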

If you want to find out what a method does, or which other methods an object has, use Python's ```help``` function:

In [22]:
help(list.append)

Help on method_descriptor:

append(self, object, /)
    Append object to the end of the list.



### List indexing

Elements of a list are accessed using the bracket operator and, as in Java, indexing starts at 0.


In [23]:
numbers = [17, 123]
print(numbers[0])
print(numbers[1])

17
123


You can also try to see what happens when you use an index that is greater than or equal to the length of the list.
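
If you would rather not have the notebook stop with an error while experimenting, one option is to catch the exception. This is only a sketch of that idea:

```python
numbers = [17, 123]
try:
    numbers[2]                      # valid indices are 0 and 1 only
except IndexError as err:
    message = str(err)
    print(f'IndexError: {message}')
```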

In Python an index can take a **negative** value.  Can you figure out what that does?

In [24]:
# write a snippet of code that accesses a list at indices with negative values

### Traversing a list

It is common to iterate through the elements of a list using a `for` loop:

In [25]:
words = ["ameliorate", "castigate", "defenestrate", 'cat']
for word in words : 
    print(word)

ameliorate
castigate
defenestrate
cat


Note the use of indentation to define a block.  Python does not use braces {  } the way Java and C do.  When using indentation you have to be consistent, and this is one aspect of Python that may take getting used to.

Another way to iterate over the elements of a list is using the `range` function:

In [26]:
words = ["ameliorate", "castigate", "defenestrate"]
for i in range(len(words)) :
    words[i] = words[i].upper()

words

['AMELIORATE', 'CASTIGATE', 'DEFENESTRATE']

Although more cumbersome, it is useful when you need access to the index of each item, as we do here.
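
A common alternative, sketched here, is the built-in `enumerate` function, which yields each index together with the item stored at that index:

```python
words = ["ameliorate", "castigate", "defenestrate"]
for i, word in enumerate(words):
    words[i] = word.upper()   # i is the index, word is the item at that index
print(words)
```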

### More about range

The ability to generate sequences quickly and easily with a simple syntax is very useful.

Starting simple, the call to
```python
range(stop)
```

produces the integers from 0 up to, but not including, `stop`.  It actually has more flexibility: the more general call

```python
range(start, stop[, step])
```
produces the integers from `start` up to, but not including, `stop`.

The optional step parameter determines the increment, which is 1 by default.
For example:


In [27]:
for i in range(1,10,2) : print (i)

1
3
5
7
9
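
A `range` produces its values on demand, so wrapping it in `list` is a convenient way to look at them all at once. The sketch below also tries a negative step, which counts down:

```python
first_five = list(range(5))           # [0, 1, 2, 3, 4]
odds = list(range(1, 10, 2))          # [1, 3, 5, 7, 9]
countdown = list(range(10, 0, -3))    # [10, 7, 4, 1]
nothing = list(range(5, 5))           # []: start equals stop, so there is nothing to produce
print(first_five, odds, countdown, nothing)
```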


Another tip: if in doubt, ask.  In Jupyter, typing `?` before a name displays its documentation:

In [28]:
?range

### Creating lists

Here's Python code that creates a list that contains the first n squares:

In [29]:
n = 5
squares = []
for i in range(1, n+1):
    squares.append(i**2)
print(squares)

[1, 4, 9, 16, 25]


### List comprehensions

Python provides a more elegant way of creating lists using the so-called *list comprehensions*:

In [30]:
n=5
squares = [i * i for i in range(1, n+1)]
print(squares)

[1, 4, 9, 16, 25]


List comprehensions can contain an `if` clause that serves as a filter:

In [31]:
a_list = [1, '4', 9, 'a', 6, 4]
squares = [ e**2 for e in a_list if type(e) == int ]
print (squares)

[1, 81, 36, 16]
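
To make the role of the `if` clause concrete, here is the same computation written both as a comprehension and as an explicit loop (a sketch; `squares_loop` is just an illustrative name):

```python
a_list = [1, '4', 9, 'a', 6, 4]

# Comprehension with a filter
squares = [e**2 for e in a_list if type(e) == int]

# The same computation written as an explicit loop
squares_loop = []
for e in a_list:
    if type(e) == int:
        squares_loop.append(e**2)

print(squares == squares_loop)
```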


Note also that lists may contain elements of differing types, in this case integers and strings.

And what gets put into the new list may itself be a list:

In [32]:
n = 5
square_pairs = [ [i, i*i] for i in range(1, n+1)]  
square_pairs

[[1, 1], [2, 4], [3, 9], [4, 16], [5, 25]]

## Slices

Slices allow you to create sublists.  To familiarize yourselves with slices, create a list called `values` and try out the following commands:
```python
values[1:3]  
values[2:-1] 
values[:2]   
values[2:]   
values[::2] # this last value is the stride
```
Using slices you can also solve the second exercise using a single statement.  Hint:  negative strides.
Slices also apply to strings much the same way they apply to lists - try it out!
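
As a starting point, here is what some of these commands produce for one particular list. The contents of `values` are chosen only for illustration:

```python
values = [10, 20, 30, 40, 50, 60]   # example data

print(values[1:3])    # [20, 30]: indices 1 and 2 (the stop index is excluded)
print(values[2:-1])   # [30, 40, 50]: from index 2 up to, but not including, the last item
print(values[:2])     # [10, 20]
print(values[2:])     # [30, 40, 50, 60]
print(values[::2])    # [10, 30, 50]: every second item

word = "defenestrate"
print(word[:4])       # defe
```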

## Map

While explicit iteration is universally recognized as important, the equally important and sometimes more useful notion of mapping an operation over a list is comparatively neglected. Consider again the simple task of squaring values in a list:

In [33]:
foo = [1, 2, 3, 4, 5]

def sq(val) :
    return val * val

bar = map(sq, foo)
print(bar)
print(list(bar))
print(list(bar))

<map object at 0x7fe55eda4f90>
[1, 4, 9, 16, 25]
[]
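
The second `list(bar)` is empty because `map` returns a one-shot iterator: once its items have been consumed, nothing is left. Here is a minimal sketch of two ways around this, either storing the result in a list or using a list comprehension instead:

```python
foo = [1, 2, 3, 4, 5]

def sq(val):
    return val * val

squares = list(map(sq, foo))       # materialize once, then reuse freely
print(squares)
print(squares)                     # still there the second time

same = [sq(val) for val in foo]    # equivalent list comprehension
print(same == squares)
```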


In [34]:
print()




Consider the following bit of a puzzle.  What appears in the output?  What is the value of the variable ```bar```?

In [35]:
words = ["ameliorate", "castigate", "defenestrate"]
foomap = map(print,words)
bar = list(foomap)
bar

ameliorate
castigate
defenestrate


[None, None, None]

## Boolean Expressions

Here are some of the comparison operators in Python; each evaluates to a Boolean value:
```python
      x == y               # x is equal to y
      x != y               # x is not equal to y
      x > y                # x is greater than y
      x < y                # x is less than y
      x >= y               # x is greater than or equal to y
      x <= y               # x is less than or equal to y
      x is y               # x and y are the same object (identity, not equality)
```      
For example:

In [36]:
3 < 1


False

## Logical operators
In Python, logical operators are written in plain English: the familiar logical operators are expressed as
`and`, `or`, and `not`. Here are some examples:

In [37]:
print(3 < 1 and 3 > 1)
print(3 < 1 or 3 > 1)
print(not(3 < 1) and 3 > 1)

False
True
True


Whenever in doubt about precedence, use parentheses!
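
As a small illustration of why this matters: `not` binds more tightly than `and`, which in turn binds more tightly than `or`. The sketch below shows how parentheses change the result:

```python
implicit = True or False and False      # parsed as True or (False and False)
explicit = (True or False) and False    # parentheses change the grouping
print(implicit, explicit)
```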

The general syntax for `if` statements:

```python
if condition1 is true:
    block of code
elif condition2 is true:
    block of code
else:
    block of code
```
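
A concrete instance of this pattern might look as follows. The variable `temperature` and the thresholds are hypothetical, picked only to have something to test:

```python
temperature = 72   # hypothetical value for illustration

if temperature > 85:
    description = 'hot'
elif temperature > 60:
    description = 'pleasant'
else:
    description = 'cold'

print(description)
```

Only the first branch whose condition holds is executed; the `elif` and `else` parts are optional.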

Let's put together everything we've learned so far to write a snippet of code that prints the maximum value in a list:

In [38]:
a_list = [1, 5, 23, -3, 42]
m = a_list[0]
for element in a_list[1:] :
    if element > m :
        m = element
print(m)

42


## Python Functions

The above code snippet is more useful when packaged as a function that can be called from elsewhere:

In [39]:
def list_max(a_list) :
    m = a_list[0]
    for element in a_list[1:] :
        if element > m :
            m = element
    return m

print(f'The function returns: {list_max([1,5,23,-3])}')

The function returns: 23


Functions are defined using the `def` reserved word, and `return` is used to return a value.
Note that we did not call our function `max`, because that would have shadowed Python's built-in function:

In [40]:
max([1,5,23,-3])

23

## The Special Value `None`


Let's see what happens if we forget to return a value from our function:


In [41]:
def list_max(a_list) :
    m = a_list[0]
    for element in a_list[1:] :
        if element > m :
            m = element
            
print(list_max([1,5,23,-3]))

None


What happened is that the special value `None` got returned. Most programming languages have an agreed upon way of indicating what might in English translate as *I really have nothing I can say or return to you at this point*.
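
`None` can also be returned on purpose to signal "no answer". The sketch below uses a hypothetical variant of our function, `list_max_or_none`, which returns `None` for an empty list; the caller then tests for it with `is None`:

```python
def list_max_or_none(a_list):
    # hypothetical helper: returns None when there is nothing to compare
    if len(a_list) == 0:
        return None
    m = a_list[0]
    for element in a_list[1:]:
        if element > m:
            m = element
    return m

result = list_max_or_none([])
if result is None:
    print('The list was empty')
else:
    print(result)
```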

In [42]:
print(None)
print(type(None))

None
<class 'NoneType'>


## Function Documentation Headers

There is a special syntax for multiline string literals:

In [43]:
'''hello
multiline strings!'''

'hello\nmultiline strings!'

These are often used for function documentation:

In [44]:
def my_function():
    '''This function currently does nothing'''
    pass

In [45]:
my_function

<function __main__.my_function()>

In [46]:
help(my_function)

Help on function my_function in module __main__:

my_function()
    This function currently does nothing
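
As a sketch of how this might look for a function that does something, here is our earlier `list_max` with a documentation string added. The wording of the docstring is our own; the string is stored on the function object and can be read back through its `__doc__` attribute:

```python
def list_max(a_list):
    '''Return the largest element of a non-empty list.

    a_list: a list whose elements can be compared with >
    '''
    m = a_list[0]
    for element in a_list[1:]:
        if element > m:
            m = element
    return m

print(list_max.__doc__)   # the docstring is stored on the function object
```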



## Pass By Reference/Value?

Let's consider how lists are passed to a function, by reference or by value:

In [47]:
def double_values(a_list) :
    for index, value in enumerate(a_list) :  
        a_list[index] = 2 * value

things = [2, 5, 'Spam', 9.5]
double_values(things)
things

[4, 10, 'SpamSpam', 19.0]

What can you conclude about how lists are passed?
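
One way to probe this further is to compare changing a list in place with assigning a new list to the parameter name. In this sketch, `mutate` and `rebind` are hypothetical helper names:

```python
def mutate(a_list):
    a_list.append(99)      # modifies the caller's list object in place

def rebind(a_list):
    a_list = [0, 0, 0]     # rebinds the local name only; the caller's list is untouched

things = [1, 2, 3]
mutate(things)
after_mutate = list(things)
rebind(things)
after_rebind = list(things)
print(after_mutate, after_rebind)
```

The function receives a reference to the same list object, but assigning to the parameter name inside the function does not affect the caller's variable.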

## Equality and comparison

The double equal sign (`==`) in Python is like Java's `.equals`, and you use it for all the various types. There is also the `is` operator, which tests whether two names refer to the very same object (roughly, "lives at the same memory address"), like `==` for objects in Java. It is used far less often than `==`; its most common use is in tests such as `x is None`.

In [48]:
2 == 2.0

True

Collections are compared by checking that their contents are equal and in the same order:

In [49]:
L = list(range(3))
L == [0, 1, 2.0]

True

In [50]:
foo = [1, 2, [3.0, 4.0, 5.0]]
bar = [1, 2, [3, 4]]
foo == bar

False
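
To contrast `==` with `is`, here is a short sketch using three names, two of which refer to the same list object:

```python
a = [1, 2, 3]
b = [1, 2, 3]    # a separate list with equal contents
c = a            # a second name for the same list object

print(a == b)    # True: same contents
print(a is b)    # False: two distinct objects
print(a is c)    # True: one object, two names
```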

## Running Python non-interactively

To use code written in another file, import it by its file name (it should be in the same directory or in a standard location where Python searches for packages).
As an example, put the following function in a file called ```helper_functions.py```.  Make sure to use a code editor, e.g. kate, gedit, sublime, emacs, or vi/vim.

In [51]:
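def gcd(x, y):
    '''Return the greatest common divisor of two positive integers.'''
    x, y = sorted([x, y])  # ensure x <= y
    # Euclid's algorithm: if x divides y, x is the gcd; otherwise recurse on the remainder
    return x if y % x == 0 else gcd(x, y % x)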

We will use a shortcut and do so programmatically:

In [52]:
import inspect
with open('helper_functions.py', 'w') as outfile :
    outfile.write(inspect.getsource(gcd))

In [53]:
# let's see that the file exists:
!ls helper_functions.py

helper_functions.py


To use this function in the Python interpreter, open the interpreter and import the module you created:
```bash
$ python
```
```python
>>> import helper_functions
>>> helper_functions.gcd(25, 15)
```

In [54]:
gcd(9,3)

3

In [55]:
gcd(42,42*8)

42
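
As a sanity check, we can compare our function against `math.gcd` from the standard library. This is only a sketch; the list of test pairs is arbitrary and restricted to positive integers:

```python
import math

def gcd(x, y):
    x, y = sorted([x, y])
    return x if y % x == 0 else gcd(x, y % x)

pairs = [(25, 15), (9, 3), (42, 42 * 8), (17, 5)]
matches = [gcd(x, y) == math.gcd(x, y) for x, y in pairs]
print(matches)
```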