# Introduction to Python 3

Python is a modern programming language that 
* is open source
* is interpreted
 * interpreters exist for most platforms
* is multi-paradigm (incl. object-oriented)
* comes with [batteries included](https://www.python.org/dev/peps/pep-0206/#batteries-included-philosophy) whenever possible

### Versions

Version 2 of the Python language (2.7 was its last minor version) is what made Python popular. However, it was far from perfect, and version 3 fixes many of the most glaring design flaws in Python 2.

Because version 2 gained popularity rapidly, it has taken over 10 years for version 3 to gain a foothold. This is the first time CSC gives its introduction to Python course using Python 3.

### Three levels of Python

There are three levels of functionality you can use in Python:
* the built-in parts
 * the language itself that is used to write programs
* the [standard library](https://docs.python.org/3/library/)
 * these cover many common tasks in programming in general, e.g.
   * file system and operating system abstraction
   * reading standardized file formats (zip, xml, csv, etc.)
   * most common data communications protocols (HTTP and email protocols)
   * more data types, programming libraries
* the Python ecosystem, mostly available via the [Python Package Index (PyPI)](https://pypi.python.org/pypi)
   * hundreds of thousands of packages of varying quality
   * libraries for
    * numeric computation (e.g. NumPy)
    * machine learning (e.g. scikit-learn)
    * HTTP frameworks (e.g. Django)
    * natural language analysis (e.g. nltk)
    * data visualization
    
The core, or built-in, part of Python is relatively small and we will cover it first.

The typical way to write Python programs is to write them in script files that end in ``.py`` and can be run with the ``python`` command. We will get to that later, but first we use this Jupyter Notebook to go over the basics of the language.

### Syntax

First a few motivational words from [The Zen of Python](https://www.python.org/dev/peps/pep-0020/)

    Beautiful is better than ugly.
    Simple is better than complex.
    Readability counts.
    
The design of Python aims for simplicity.



## First program

The first exercise in most programming tutorials is a Hello World program.

You can edit the code in the cell below and run it by clicking the run button in the toolbar above or by pressing CTRL+Enter when the cell is in focus (surrounded by a green box).

The text between the quotation marks ``""`` is a string. ``print`` is a function, and its arguments are given inside parentheses ``()``, in a C-like style.


    




In [None]:
print("hello world!")

These exercises are run in this notebook environment, but you could just as easily copy the code above to a file called **hello.py** and run it with the command

    $ python hello.py
    hello world!

**Extra**: compare this with a hello world program in some other programming language that you know. Is it simpler or more complex? What kinds of design decisions have to have been made in order for the example to be this simple?

## Getting help

The built-in function ``help()`` will show you interactive documentation about most Python objects when you're inside an interpreter.

If you want to know all the members of an object (more about objects and classes later) you can call the ``dir()`` function.

In [120]:
help(print)

Help on built-in function print in module builtins:

print(...)
    print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)
    
    Prints the values to a stream, or to sys.stdout by default.
    Optional keyword arguments:
    file:  a file-like object (stream); defaults to the current sys.stdout.
    sep:   string inserted between values, default a space.
    end:   string appended after the last value, default a newline.
    flush: whether to forcibly flush the stream.
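
As a small sketch of ``dir()`` in practice, the cell below lists the members of a string object and filters out the names that start with an underscore. The variable names ``all_members`` and ``public_members`` are only illustrative, and the filtering uses a list comprehension, which is covered later in this notebook.

```python
# dir() returns a list of attribute names, here for a str object
all_members = dir("hello")

# keep only the names that do not start with an underscore
public_members = [name for name in all_members if not name.startswith("_")]

print(len(all_members), "members in total")
print(len(public_members), "of them are public")
print("upper" in public_members)  # str.upper is one of the public methods
```

Once you have found an interesting name this way, ``help(str.upper)`` shows its documentation.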



## Variables and data types

A variable is something that can change during the execution of a program. It is referenced by a name.

In Python variable names
* may contain letters, digits or underscores
* start with a letter or underscore (but not with a digit!)
* are case sensitive

Underscores in the beginning or end of a variable are part of idiomatic coding style that hints things to the reader of the code. We will get to that in the future.

Try them out below:

In [31]:
hello_example_1 = "hello world!" # comments are marked with the #-sign
hello_example_1 = 5
hello_example_1 # in Jupyter notebooks if the cell ends with a single variable, the system will print the value for you

5
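
The naming rules above can be checked without risking a syntax error by using the string method ``isidentifier()``. This is only a sketch for experimenting; the example names are made up for illustration.

```python
# str.isidentifier() reports whether a string follows the naming rules
print("hello_example_1".isidentifier())  # letters, digits and underscores are fine
print("_private".isidentifier())         # a leading underscore is allowed
print("1st_example".isidentifier())      # a leading digit is not allowed

# names are case sensitive, so these are two separate variables
value = 1
Value = 2
print(value, Value)
```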

Python is a [dynamically typed](https://en.wikipedia.org/wiki/Type_system#Dynamic_type_checking_and_runtime_type_information), [strongly typed](https://en.wikipedia.org/wiki/Strong_and_weak_typing) language. It's OK not to understand the terms completely. They are simply mentioned because they carry very specific meaning to experienced programmers.

In practice this means that:

* variables (and their types) don't need to be declared
* trying to use a variable of an incorrect type will result in errors


### Data types

Python has a small set of basic data types, which are introduced below in groups. All variables in Python have a type and you can use the built-in function ``type()`` to check the type of a variable.

* ``bool``: a boolean, a data type that can be either True or False (note capitalization of first letter)

* Numeric types, that represent numbers
 * ``int``: integers, not limited in length
 * ``float``: floating point numbers, like doubles in C, with similar caveats
 * ``complex``: complex numbers, where the imaginary part is written with a ``j`` suffix, e.g. ``2 + 3j`` (not covered in this tutorial)
 
* Sequences:
 * ``str``: String, a sequence of Unicode characters in the range U+0000 - U+10FFFF
 * ``bytes``: a sequence of integers in the range 0-255, i.e. raw data
 * ``bytearray``: like bytes, but mutable
 * ``list``: a mutable ordered sequence of objects
 * ``tuple``: an immutable ordered sequence of objects
 
* Sets
 * ``set``: an unordered collection of unique objects
 * ``frozenset``: like set, but immutable
 
* Mappings
 * ``dict``: a dictionary, also called a hashmap
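
To see the types listed above in action, the following sketch creates one literal value of each type and asks ``type()`` what it is. The list ``examples`` and the helper list ``type_names`` are illustrative names, and the ``for`` loop used here is explained later in this notebook.

```python
examples = [
    True, 42, 3.14, 2 + 3j,                                # bool and numeric types
    "text", b"raw", bytearray(b"raw"), [1, 2], (1, 2),     # sequences
    {1, 2}, frozenset({1, 2}),                             # sets
    {"key": "value"},                                      # mapping
]

type_names = []
for example in examples:
    # type(x).__name__ is the short name of the type of x
    type_names.append(type(example).__name__)
    print(type(example).__name__, "->", example)
```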
 
Python is **dynamically** typed, which means that data types do not need to be declared; they are determined at run time.

Python is **strongly** typed, which means that it typically does not attempt to coerce a data type to another. For instance it is not possible to concatenate a string and a number, which is valid in many other languages. The number needs to be converted into a string explicitly.

The typing in Python is called **duck typing**. An object only needs to provide the methods that an operation requires; it is not necessary to explicitly implement an interface as in e.g. Java or C#.

Each of the abovementioned types is also a built-in function that returns objects of said type.

Sequences, sets and mappings are often iterated over. More on this later.

In [32]:
value = 5
value2 = value + 1

my_string = "hello "
my_string = my_string + str(value2) # you can attempt the same without converting to string
print(my_string)

hello 6
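
The sketch below shows what strong typing looks like when the conversion is left out, and how the type names double as conversion functions. The ``try``/``except`` construct for catching errors has not been introduced yet and is used here only so that the cell keeps running after the error.

```python
value = 6

try:
    message = "hello " + value  # str + int is not allowed
except TypeError as error:
    print("TypeError:", error)
    message = "hello " + str(value)  # explicit conversion works

print(message)

# each type name is also a function that builds an object of that type
print(int("42") + 1)
print(float("2.5") * 2)
print(list("abc"))
```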


### Mutable and immutable data types

Some data types are mutable and some are immutable.

**Mutable** data types can be changed after they are created, for example:
* a list can be appended to
* a byte in a bytearray can be altered
* a set can be added to
* a dict can be added to

**Immutable** data types cannot be changed after they are created. Any operations on the data types will return a **new** instance of the same type, that is different. Typically this new value then needs to be assigned to a variable.

| Immutable                        | Mutable    |
|----------------------------------|------------|
| numeric types (int, float, etc.) |            |
| tuple                            | list       |
| str                              |            |
| bytes                            | bytearray  |
| frozenset                        | set        |
|                                  | dict       |

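Only hashable objects can be used as keys in a dict. Of the built-in types these are the immutable ones, e.g. numbers, strings and tuples that contain only hashable items.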

In [57]:
# mutable examples
dict_ = {"key": "value"}
dict_["key2"] = "value2"
print(dict_)

list_ = ["egg", "sausage", "bacon"]
list2 = list_
list_.append("spam")
print(list_)

# variables are just pointers to objects in memory
# for mutable types all references point to the same object that has changed
print(list2)

# immutable examples

str_ = "hello world!"
print(str_.replace("l", ""))
print(str_)

tuple_ = (4, 5, 6)
print(tuple_ + (6,7))
print(tuple_)

{'key': 'value', 'key2': 'value2'}
['egg', 'sausage', 'bacon', 'spam']
['egg', 'sausage', 'bacon', 'spam']
heo word!
hello world!
(4, 5, 6, 6, 7)
(4, 5, 6)
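
The rule about dict keys can be tried out with a tuple and a list. This is a minimal sketch; the dict ``grid`` and the flag ``list_key_accepted`` are made-up names, and ``try``/``except`` is used only to keep the cell running after the error.

```python
# a tuple is immutable, so it works as a key
grid = {(0, 0): "origin"}
grid[(1, 0)] = "one step right"
print(grid[(0, 0)])

# a list is mutable, so Python refuses to use it as a key
try:
    grid[[0, 1]] = "one step up"
    list_key_accepted = True
except TypeError as error:
    list_key_accepted = False
    print("TypeError:", error)
```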


### Lists

Lists are created using square brackets ``[]`` or the ``list()`` constructor. There is no requirement for all the objects in the list to be of the same type. This is a consequence of the dynamic typing mentioned earlier.

Lists support multiple types of indexing.

In [46]:
my_list = [1, 2, 3, 4]

print(my_list[0]) # indexing starts from 0
print(my_list[1:3]) # so-called slice syntax selects a part of a list
print(my_list[-1]) # negative indices are also permitted, -1 is the last index
print(my_list[-3:-1]) # also in slicing

1
[2, 3]
4
[2, 3]
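
Slices can also leave out the start or the end, and take a step as a third value. The following sketch shows a few variations on the same list.

```python
my_list = [1, 2, 3, 4]

first_two = my_list[:2]      # an omitted start means "from the beginning"
last_two = my_list[2:]       # an omitted end means "to the end"
every_other = my_list[::2]   # the third value is the step
backwards = my_list[::-1]    # a negative step walks the list in reverse

print(first_two, last_two, every_other, backwards)
print(my_list)  # slicing returns new lists, the original is unchanged
```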


Lists can be extended in several ways:

In [65]:
my_list = [1, 2]
my_list.append(3) # modifies in place, takes a single item
print(my_list)
my_list.extend([4, 5]) # takes another list
print(my_list)
another_list = my_list + [6, 7] # makes a copy
print(another_list)
print(my_list)

[1, 2, 3]
[1, 2, 3, 4, 5]
[1, 2, 3, 4, 5, 6, 7]
[1, 2, 3, 4, 5]


### Dictionaries

Dictionaries are also accessed using square brackets ``[]``, but a dict is indexed by key rather than by position.

A dict also has a ``get()`` method that takes a default value to return if the key is not present.

Values are assigned using the same bracket notation. If the key already exists, its value is overwritten.

In [67]:
my_dict = {1: 2, "key": "value"}
print(my_dict[1])
print(my_dict["key"])
print(my_dict.get("im_not_there", "default"))

my_dict["key2"] = "i was just inserted"
print(my_dict["key2"])

2
value
default
i was just inserted
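
A few more dict behaviours are worth trying: overwriting an existing key, testing for a key with ``in``, and what happens when a missing key is accessed with brackets instead of ``get()``. The sketch below uses ``try``/``except``, which has not been introduced yet, only to keep the cell running after the error.

```python
my_dict = {"key": "value"}

my_dict["key"] = "new value"   # the key exists, so the value is replaced
print(my_dict["key"])

print("key" in my_dict)        # membership tests look at the keys
print("missing" in my_dict)

print(my_dict.get("missing"))  # without a default, get() returns None

try:
    my_dict["missing"]         # bracket access to a missing key is an error
    missing_key_failed = False
except KeyError as error:
    missing_key_failed = True
    print("KeyError:", error)
```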


### Tuples

A comma defines a tuple. For example,
```
a,b
```
is a valid tuple. It is conventional to use parentheses to make the presence of a tuple more explicit,
```
(a, b)
```
but in most contexts the parentheses are not required.

Python does automatic packing and unpacking of tuples, as is illustrated by the following example.

In [1]:
a, b = 1, 2
a, b = b, a
# Check what the values of a and b are now
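
Packing and unpacking also show up in other places. The sketch below packs two values into a tuple, unpacks them again, and unpacks the tuple returned by the built-in function ``divmod()``. The variable names are illustrative.

```python
point = 3, 4            # packing: the comma creates the tuple
print(point)

x, y = point            # unpacking: one name per item
print(x, y)

single = (1,)           # a one-item tuple needs a trailing comma
print(len(single))

# divmod() returns a tuple (quotient, remainder) that can be unpacked directly
quotient, remainder = divmod(17, 5)
print(quotient, remainder)
```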

## Conditional statements

The most common conditional statement in Python is the if-elif-else statement:

```
if variable > 5:
  do_something()
elif variable > 0:
  do_something_else()
else:
  give_up()
```

Compared to languages like C, Java or Lisp, do you feel something is missing?

Python is whitespace-aware and it uses the so-called [off-side rule](https://en.wikipedia.org/wiki/Off-side_rule) to mark **code blocks**. This has several benefits:
* It's easy to read at a glance
* levels of indentation are processed [pre-attentively](https://en.wikipedia.org/wiki/Pre-attentive_processing), which conserves brain power for everything else
* It's easy to write without having to worry too much

One corollary of the indentation is that you need to be very aware of when you're using the tabulator character and when you're using spaces. Most Python programmers use only spaces and configure their editor to insert spaces (four per level is the usual convention) when tab is pressed.

**Note**: the line before a deeper level of indentation ends in a colon ":". This syntax is part of beginning a new code block and surprisingly easy to forget.

In [1]:
value = 4
value = value + 1
if value < 5:
    print("value is less than 5")
elif value > 5:
    print("value is more than 5")
else:
    print("value is precisely 5")
# go ahead and experiment by changing the value


value is precisely 5


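There is no C-style switch-case statement in Python. Python 3.10 and later do have a ``match`` statement for structural pattern matching, but it is not covered here.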

**Note**: When evaluating conditional statements, values such as 0, ``None``, an empty string and an empty list (or any other empty container) all evaluate to False. This can be confusing as it is one of the few places where Python doesn't enforce strong typing.
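
The built-in function ``bool()`` shows how a value is treated in a condition. The sketch below runs it over a handful of values; the list ``candidates`` is just an illustrative selection, and the ``for`` loop is covered later in this notebook.

```python
candidates = [0, "", [], None, 1, "0", [0]]

falsy = []
for candidate in candidates:
    if not candidate:
        falsy.append(candidate)
    # repr() shows the value the way it would be written in code
    print(repr(candidate), "->", bool(candidate))

print(falsy)
```

Note that the string ``"0"`` and the list ``[0]`` count as True because they are not empty.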



### While statement

Python supports the ``while`` statement familiar from many languages. It is used much less often than in those languages because of iterators (covered later).

```
value = 5
while value > 0:
  value = do_something(value)
```

The following example shows how a list is used as the condition.

In [2]:
list_ = [1, 2, 3, 4]
while list_: # remember, an empty list evaluates as False for conditional purposes
    print(list_.pop()) # pop() removes the last entry from the list

4
3
2
1


## Iterating

Python has a ``for`` loop statement that is similar to the foreach statement in a lot of other languages.

It is possible to loop over any iterable, e.g. lists, sets, tuples and even dicts.

In [3]:
synonyms = ["is dead", "has kicked the bucket", "is no more", "ceased to be"]
for phrase in synonyms:
    print("This parrot " + phrase + ".")

This parrot is dead.
This parrot has kicked the bucket.
This parrot is no more.
This parrot ceased to be.


If the items are themselves sequences, they can be unpacked directly in the ``for`` statement.

In [6]:
tuples = (
        (1, 2),
        (3, 4),
        (5, 6),
)

for x, y in tuples:
    print("A is " + str(x))
    print("B is " + str(y))

A is 1
B is 2
A is 3
B is 4
A is 5
B is 6


In dictionaries the **keys** are iterated over by default.

In [61]:
airspeed_swallows = {"African": 20, "European": 30}
for swallow in airspeed_swallows:
    print("The air speed of " + swallow + " swallows is "+ str(airspeed_swallows[swallow]))

The air speed of African swallows is 20
The air speed of European swallows is 30
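
When both the key and the value are needed, the dict method ``items()`` yields them as pairs that can be unpacked in the loop, and ``values()`` yields only the values. A small sketch using the same data (``total_speed`` is an illustrative name):

```python
airspeed_swallows = {"African": 20, "European": 30}

# items() yields (key, value) tuples, unpacked here into two names
for swallow, speed in airspeed_swallows.items():
    print("The air speed of " + swallow + " swallows is " + str(speed))

# values() yields only the values
total_speed = sum(airspeed_swallows.values())
print(total_speed)
```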


It is still possible to loop through numbers using the built-in ``range`` function, which returns an iterable of numbers in sequence. The function supports an arbitrary start, end and step length.

In [77]:
for i in range(99, 90, -2): # parameters are from, to and step length in that order
    print(str(i) +" boxes of bottles of beer on the wall")

99 boxes of bottles of beer on the wall
97 boxes of bottles of beer on the wall
95 boxes of bottles of beer on the wall
93 boxes of bottles of beer on the wall
91 boxes of bottles of beer on the wall
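
To see what a ``range`` contains without writing a loop, it can be converted to a list. The sketch below shows the one, two and three argument forms; note that the end value itself is never included.

```python
up_to_five = list(range(5))            # only the end given: starts from 0
from_two = list(range(2, 6))           # start and end
in_threes = list(range(0, 10, 3))      # start, end and step
countdown = list(range(99, 90, -2))    # a negative step counts down

print(up_to_five)
print(from_two)
print(in_threes)
print(countdown)
```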


### Breaking and continuing

Sometimes it is necessary to stop the execution of a loop before it has run to completion. For that there is the ``break`` keyword.

At other times it is desired to end only the current step in the loop and immediately move to the next one. For that there is the ``continue`` keyword.

Both of the keywords could be substituted with complex if-else statements, but a well-considered break or continue statement is more readable to the next programmer.

In [87]:
for i in range(20):
    if i % 7 == 6: # modulo operator
        break # leave the loop entirely, nothing after this is printed
    print(i)

0
1
2
3
4
5


In [85]:
for i in range(-5, 5, 1):
    if i == 0:
        print("not dividing by 0")
        continue
    print("5/" + str(i) + " equals " + str(5/i))

5/-5 equals -1.0
5/-4 equals -1.25
5/-3 equals -1.6666666666666667
5/-2 equals -2.5
5/-1 equals -5.0
not dividing by 0
5/1 equals 5.0
5/2 equals 2.5
5/3 equals 1.6666666666666667
5/4 equals 1.25


### Step aside: list comprehension

Transforming all the values in a list into a new list is so common in programming that Python has a special syntax for it, the list comprehension.


In [114]:
list_1 = [1, 2, 3, 4]
list_2 = [value*3-1 for value in list_1]
list_2

[2, 5, 8, 11]
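
For comparison, the same result can be built with an explicit ``for`` loop. The sketch below is equivalent to the comprehension above, only longer.

```python
list_1 = [1, 2, 3, 4]

list_2 = []
for value in list_1:
    # the expression from the front of the comprehension goes here
    list_2.append(value * 3 - 1)

print(list_2)
```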

It is not necessary to use list comprehensions but they are mentioned so they can be understood if discovered in other programs.

Part of the Zen of Python says

    There should be one-- and preferably only one --obvious way to do it.
    Although that way may not be obvious at first unless you're Dutch.
    
List comprehensions are the one obvious way to do these kinds of operations, which is why they are presented here.

It is also possible to add a filtering condition to the comprehension.

In [113]:
list_1 = [1, 2, 3, 4]
list_2 = [value*3-1 for value in list_1 if value % 2 == 0] #only take even numbers
list_2


[5, 11]

**Note:** List comprehensions always create the entire list in memory. When handling large amounts of data or in an environment with limited memory it's often a good idea to avoid creating large data structures in memory.

In Python a language feature called **generators** helps you do this. There is extra material about this in another notebook.
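
As a brief taste only, the simplest kind of generator is a generator expression, which looks like a list comprehension written with parentheses. The sketch below assumes nothing beyond the built-in functions ``next()`` and ``sum()``; the names ``lazy_values``, ``first`` and ``rest_total`` are illustrative.

```python
list_1 = [1, 2, 3, 4]

# parentheses instead of brackets: values are computed one at a time, on demand
lazy_values = (value * 3 - 1 for value in list_1)

first = next(lazy_values)       # computes only the first value
rest_total = sum(lazy_values)   # consumes the remaining values

print(first)
print(rest_total)
```

A generator can be consumed only once; after the ``sum()`` call above, ``lazy_values`` is exhausted.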

## Functions and function arguments

Functions are the building blocks of writing software. If a function is associated with an object and its data, it is called a method.

Functions are defined using the keyword ``def``.

There are two types of arguments
* regular arguments, which must always be given when calling the function
* keyword arguments, which have a default value that can be overridden if desired

Values are returned using the ``return`` keyword. If no ``return`` is defined, the default return value of all functions and methods is **None**, which is the null object in Python.

In [93]:
def my_function(arg_one, arg_two, optional_1=6, optional_2="seven"):
    return " ".join([str(arg_one), str(arg_two), str(optional_1), str(optional_2)])

print(my_function("a", "b"))
print(my_function("a", "b", optional_2="eight"))

#go ahead and try out different components

a b 6 seven
a b 6 eight
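
Two details from the text above are easy to verify: a function without ``return`` gives back ``None``, and arguments can be passed by name in any order. The function ``greet`` below is a made-up example.

```python
def greet(name, greeting="hello"):
    # this function prints but has no return statement
    print(greeting + " " + name)

result = greet("world")
print(result)  # the implicit return value

# when arguments are passed by name, their order does not matter
greet(greeting="goodbye", name="world")
```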


Python has special syntax for catching an arbitrary number of parameters. For regular parameters it is a parameter name prefixed with one asterisk \* and for keyword parameters a name prefixed with two asterisks \*\*. It is conventional to name these \*args and \*\*kwargs, but this is not required.

In [98]:
def count_args(*args, **kwargs):
    print("i was called with " + str(len(args)) + " arguments and " + str(len(kwargs)) + " keyword arguments")
    
count_args(1, 2, 3, 4, 5, foo=1, bar=2)

i was called with 5 arguments and 2 keyword arguments
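
Inside the function the collected arguments are ordinary objects: the regular ones arrive as a tuple and the keyword ones as a dict. The asterisks also work in the other direction, when calling a function. The sketch below illustrates both; ``describe_args``, ``numbers`` and ``options`` are made-up names.

```python
def describe_args(*args, **kwargs):
    # args is a tuple and kwargs is a dict
    return type(args).__name__, type(kwargs).__name__, args, kwargs

description = describe_args(1, 2, foo=3)
print(description)

# at the call site, * unpacks a sequence and ** unpacks a dict
numbers = [1, 2, 3]
options = {"sep": "-"}
print(*numbers, **options)  # same as print(1, 2, 3, sep="-")
```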


As the example shows, the length of sequences and other collections can be checked using the ``len()`` function.

## Modules and importing

Python projects are structured into modules. 

There is a plethora of modules available in the [Python standard library](https://docs.python.org/3/library/). Those are always available to you, but you must import them before use.

Of course, you must also be aware of the fact that such a module exists. It is usually beneficial to be a bit lazy and assume someone has already solved your problem. Most of the time someone already has!

In [100]:
import math

def circle_circumference(r):
    return 2*math.pi*r

circle_circumference(3)

18.84955592153876

At its simplest a module can just be a Python file.

### Task:
Create a file called mymodule.py using Jupyter (New -> Text File).

Edit the contents of the file to be

```
def fancy_function(x):
    return x + x
```

Save the file.

Now you can import the function:

In [101]:
from mymodule import fancy_function

print(fancy_function(1))
print(fancy_function("hi"))

2
hihi


Modules can also have more structure in them. To turn a directory into a module (called a package), the conventional way is to place a special file called ``__init__.py`` in the directory.

```
main.py
bigmodule/
  __init__.py
  module_a.py
  module_b.py
```
Now in main.py, you could ``import bigmodule.module_a``.
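
The layout above can be tried out without leaving the notebook. The following sketch builds a throwaway copy of the ``bigmodule`` directory in a temporary location, makes it importable and imports ``module_a`` from it. The function ``greet`` and its return value are hypothetical contents invented for this example, and adding a directory to ``sys.path`` by hand is done here only for the demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as project_dir:
    package_dir = Path(project_dir) / "bigmodule"
    package_dir.mkdir()

    # an empty __init__.py marks the directory as a package
    (package_dir / "__init__.py").write_text("")
    # hypothetical content for module_a.py
    (package_dir / "module_a.py").write_text(
        "def greet():\n    return 'hello from module_a'\n"
    )

    sys.path.insert(0, project_dir)   # make the temporary project importable
    importlib.invalidate_caches()     # the files were created while Python was running
    try:
        import bigmodule.module_a
        greeting = bigmodule.module_a.greet()
    finally:
        sys.path.remove(project_dir)  # leave the import path as it was

print(greeting)
```

In a real project ``main.py`` sits next to the ``bigmodule`` directory, so no changes to ``sys.path`` are needed.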

It is also possible to import only a single member from a module, like a variable or a function.

In [106]:
from math import exp

print(exp(2))

def circle_area(r):
    if r < 0:
        return 0
    else:
        #you can also import inside functions or other code blocks
        from math import pi
        return pi*r*r
print(circle_area(2))

7.38905609893065
12.566370614359172


Whether to import the entire module or only what you need depends on your circumstances and how the module has been designed to be used. It's usually good to pick a practice inside a project and stick to it.
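
The import styles discussed in this section can be compared side by side. The sketch below imports the same value from the standard library ``math`` module in three ways; the alias ``m`` is an arbitrary choice.

```python
import math            # the whole module, members accessed as math.pi
import math as m       # the whole module under a shorter alias
from math import pi    # a single member, accessed without a prefix

print(math.pi)
print(m.pi)
print(pi)

# all three names refer to the same value
same_value = math.pi == m.pi == pi
print(same_value)
```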