# Programming in Python - Introduction to Python

__September 26, 2022__

__Judit Ács__

acs.judit@sztaki.hu

## Goal

- intermediate level Python starting from a basic programming background
- will cover some advanced concepts
- ability to read third party source code
- lots of practice

## Prerequisites

- basic knowledge in at least one object oriented programming language
- must know: variables, types, functions, basic I/O handling
- good to know: objects, classes, attributes, function arguments

## Links to course material

Official GitHub repository:
https://github.com/tuw-python/tuw-python-2022WS

- Please consult the Syllabus before asking questions
- Public Slack channels should be used

## Format of the course

1. Jupyter
2. Standalone Python modules (in other words a single .py file)
3. Small packages

We provide skeletons that you have to modify.

You may use Google Colab in place of the Jupyter notebooks, but the assignments must be submitted via GitHub Classroom. You also need to develop the modules and packages on your local machine.

# Jupyter

- Jupyter, formerly known as IPython Notebook, is a web application that allows you to create and share documents with live code, equations, visualizations etc.
- Jupyter notebooks are JSON files with the extension `.ipynb`
- Can be converted to HTML, PDF, LaTeX etc.
- Can render images, tables, graphs, LaTeX equations
- Large number of extensions called 'nbextensions'
- Extensions used in this lecture:
  - Table of Contents (`main/toc2`)
  - `jupyter-vim-binding`, a keyboard binding emulating the VIM editor
- Content is organized into cells
- Can be turned into a live slideshow using the RISE extension

## Cell types

1. code cell: Python/R/Lua/etc. code
2. markdown cell: formatted text using Markdown
3. raw cell: raw text (not used here)

## Code cell

In [1]:
print("Hello world")

Hello world


Only the value of the last expression in a cell is displayed:

In [2]:
2 + 3
3 + 4

7

The last expression can be a tuple of multiple values:

In [3]:
2 + 3, "hello " + "world"

(5, 'hello world')

## Markdown cell

**This is in bold**

*This is in italics*

| This | is |
| --- | --- |
| a | table |

and this is a pretty LaTeX equation:

$$
\oint_{\partial\Omega} \mathbf{E}\cdot\mathrm{d}\mathbf{S} = \frac{1}{\varepsilon_0} \iiint_\Omega \rho \,\mathrm{d}V
$$

## Using Jupyter

### Command mode and edit mode

Jupyter has two modes: command mode and edit mode

1. Command mode: perform non-edit operations on selected cells (can select more than one cell)
  - Selected cells are marked blue
2. Edit mode: edit a single cell
  - The cell being edited is marked green

### Switching between modes

1. Esc: Edit mode -> Command mode
2. Enter or double click: Command mode -> Edit mode

### Running cells

1. Ctrl + Enter: run cell
2. Shift + Enter: run cell and select next cell
3. Alt + Enter: run cell and insert new cell below

Remembering these shortcuts can save you a lot of time.

## Handling user input

Jupyter has a widget for the built-in `input` function. This __halts__ the execution until some input is provided. Note the * in place of the execution counter:

In [4]:
input("Please input something: ")

Please input something: 42


'42'

## Cell magic

Special commands can modify a single cell's behavior, for example

In [5]:
%%time

for x in range(1000000):
    pass

CPU times: user 51.2 ms, sys: 0 ns, total: 51.2 ms
Wall time: 49.3 ms


In [6]:
%%timeit

x = 2

9.58 ns ± 1.93 ns per loop (mean ± std. dev. of 7 runs, 100,000,000 loops each)


In [7]:
%%timeit

x = 2

9.27 ns ± 0.635 ns per loop (mean ± std. dev. of 7 runs, 100,000,000 loops each)


Let's see a longer execution

In [8]:
%%timeit

x = 2
x = 4

12.2 ns ± 0.818 ns per loop (mean ± std. dev. of 7 runs, 100,000,000 loops each)
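
Outside a notebook, similar measurements can be taken with the standard library's `timeit` module. The following is a minimal sketch; the repetition count of 100,000 is an arbitrary choice for illustration, and the timings will differ from the `%%timeit` output above.

```python
import timeit

# Number of executions per measurement (arbitrary value chosen for this sketch).
N = 100_000

# Time a single assignment, mirroring the first %%timeit cell.
single = timeit.timeit("x = 2", number=N)

# Time two assignments, mirroring the longer cell.
double = timeit.timeit("x = 2\nx = 4", number=N)

# timeit returns the total time in seconds, so convert to ns per execution.
print(f"one assignment:  {single / N * 1e9:.1f} ns per loop")
print(f"two assignments: {double / N * 1e9:.1f} ns per loop")
```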


For a complete list of magic commands:

In [9]:
%lsmagic

Available line magics:
%alias  %alias_magic  %autoawait  %autocall  %automagic  %autosave  %bookmark  %cat  %cd  %clear  %colors  %conda  %config  %connect_info  %cp  %debug  %dhist  %dirs  %doctest_mode  %ed  %edit  %env  %gui  %hist  %history  %killbgscripts  %ldir  %less  %lf  %lk  %ll  %load  %load_ext  %loadpy  %logoff  %logon  %logstart  %logstate  %logstop  %ls  %lsmagic  %lx  %macro  %magic  %man  %matplotlib  %mkdir  %more  %mv  %notebook  %page  %pastebin  %pdb  %pdef  %pdoc  %pfile  %pinfo  %pinfo2  %pip  %popd  %pprint  %precision  %prun  %psearch  %psource  %pushd  %pwd  %pycat  %pylab  %qtconsole  %quickref  %recall  %rehashx  %reload_ext  %rep  %rerun  %reset  %reset_selective  %rm  %rmdir  %run  %save  %sc  %set_env  %store  %sx  %system  %tb  %time  %timeit  %unalias  %unload_ext  %who  %who_ls  %whos  %xdel  %xmode

Available cell magics:
%%!  %%HTML  %%SVG  %%bash  %%capture  %%debug  %%file  %%html  %%javascript  %%js  %%latex  %%markdown  %%perl  %%prun  %%pypy  %%

### Run shell commands

In [10]:
! dir

1_Introduction.ipynb
2_Builtin_types.ipynb
3_Object_Oriented_Programming.ipynb
4_Decorators_list_comprehension_iteration_context_managers_functional.ipynb


Shell commands may fail and their failure does not affect the notebook execution:

In [11]:
! command-that-does-not-exist

zsh:1: command not found: command-that-does-not-exist
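
In a standalone Python module the `!` syntax is not available. A rough equivalent, sketched here with the standard library's `subprocess` module, shows the same behavior: the command fails, but the Python code keeps running. The exact return code and error message depend on the operating system and shell.

```python
import subprocess

# Run a command that does not exist through the system shell.
result = subprocess.run(
    "command-that-does-not-exist",
    shell=True,
    capture_output=True,
    text=True,
)

# A non-zero return code signals failure; no exception is raised here.
print("return code:", result.returncode)
print("stderr:", result.stderr.strip())
print("the program keeps running")
```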


## Under the hood

- Each notebook is run by its own _Kernel_ (Python interpreter)
  - The kernel can be interrupted or restarted through the Kernel menu
  - **Always** run `Kernel -> Restart & Run All` before submitting homework to make sure that your notebook behaves as expected
- All cells share a single namespace. This means that a variable defined in one (executed) cell will be available in other cells:

In [12]:
my_name = 12

In [13]:
my_name = my_name + 1
my_name

13

Cells can be run in arbitrary order. The execution count shows the order in which they were actually run:

In [14]:
print("this is run first")

this is run first


In [15]:
print("this is run afterwards. Note the execution count on the left.")

this is run afterwards. Note the execution count on the left.


**Kernel->Restart & Run All** runs the cells from top to bottom. Don't forget to use this to test your notebook before submitting it as homework. This is how the instructor is going to test it.

# The Python programming language

## History of Python


- Python started in 1989 as a hobby project of the Dutch programmer Guido van Rossum.
- Python 1.0 in 1994
- Python 2.0 in 2000
  - Cycle-detecting garbage collector
  - Unicode support
- Python 3.0 in 2008
  - Proposal in 2006 [PEP 3000](https://peps.python.org/pep-3000/)
  - **Backward incompatible** with earlier versions
- Python 2 End-of-Life (EOL) date was postponed from 2015 to 2020
  - [Sunsetting Python 2](https://www.python.org/doc/sunset-python-2/)

## Guido van Rossum, <s>Benevolent Dictator for Life</s> Stepped down in 2018
 
Guido van Rossum at OSCON 2006, by
[Doc Searls](https://www.flickr.com/photos/docsearls/)
licensed under [CC BY 2.0](https://creativecommons.org/licenses/by/2.0/)
 <img width="400" alt="portfolio_view" src="https://upload.wikimedia.org/wikipedia/commons/6/66/Guido_van_Rossum_OSCON_2006.jpg">

## Python community and development

- Python Software Foundation, a nonprofit organization based in Delaware, US
- Managed through PEPs (Python Enhancement Proposals)
    - Public discussion for example [PEP 3000 about Python 3.0](https://www.python.org/dev/peps/pep-3000/)
- Strong community inclusion
- Large standard library
- Very large third-party module repository called PyPI (Python Package Index)
- pip installer
  - Can install modules from PyPI
  - Can install custom modules (we will do that in the upcoming lectures)

## Python neologisms

- the Python community has a number of made-up expressions
- _Pythonic_: following Python's conventions, Python-like
- _Pythonist_ or _Pythonista_: good Python programmer

## Python memes

In [16]:
import antigravity

# Developing in Python

## Notebooks

- Jupyter
- [JupyterLab](https://jupyterlab.readthedocs.io/en/stable/): Jupyter + IDE-like features
- [Google Colab](https://colab.research.google.com): online notebook with GPU access

## IDEs

- [VSCode](https://code.visualstudio.com/): free, cross-platform, Python plugin, command line support
- [PyCharm](https://www.jetbrains.com/pycharm/): free Community edition, cross-platform

## Command line tools

- [VIM](https://www.vim.org/) or [neovim](https://neovim.io/) + [tmux](https://github.com/tmux/tmux/wiki): small, runs everywhere, modal editing, steep learning curve, Python plugins, very mature
    - VSCode and PyCharm have VIM editing mode
- [Emacs](https://www.emacswiki.org/emacs/PythonProgrammingInEmacs): another CLI editor, built-in Python support

# PEP8, the Python style guide

- widely accepted style guide for Python
- [PEP8](https://www.python.org/dev/peps/pep-0008/) by Guido himself, 2001

Specifies:

- indentation
- line length
- module imports
- class names, function names etc.

We shall use PEP8 throughout this course. You are expected to follow it in the homework.

We use the [Black autoformatter library](https://black.readthedocs.io/en/stable/). Black is available as a Jupyter extension as well.

# General properties of Python

## Whitespace

Whitespace indentation instead of curly braces, no need for semicolons:

In [17]:
n = 10
if n % 2 == 0:
    print("n is even")
else:
    print("n is odd")

n is even


## Dynamic typing

Type checking is performed at run-time, as opposed to compile-time as in C++. The same name can be rebound to values of different types:

In [18]:
n = 2
print(type(n))

n = 2.1
print(type(n))

n = "foo"
print(type(n))

<class 'int'>
<class 'float'>
<class 'str'>

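A consequence of run-time type checking is that a type error only surfaces when the offending line is actually executed. The sketch below uses a hypothetical helper, `add_one`, to illustrate this; it is not part of the course material.

```python
def add_one(value):
    # No type is declared; the check happens when this line runs.
    return value + 1


print(add_one(2))    # works for int
print(add_one(2.5))  # works for float

try:
    add_one("foo")   # str + int is not defined
except TypeError as err:
    print("TypeError at run-time:", err)
```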

# Simple statements

## Conditional expressions

### if, elif, else

In [19]:
n = int(input())
# n = 12

if n < 0:
    print("N is negative")
elif n > 0:
    print("N is positive")
else:
    print("N is neither positive nor negative")

0
N is neither positive nor negative


### Ternary conditional operator

- a one-line conditional expression: an `if`/`else` that produces a value
- the order of operands is different from C's `?:` operator. The C version of abs would look like this:

~~~C
int x = -2;
int abs_x = x>=0 ? x : -x;
~~~
- should only be used for very short statements


`<expr1> if <condition> else <expr2>`

In [20]:
n = -2
abs_n = n if n >= 0 else -n
abs_n

2
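
Because the ternary form is an expression, it can appear anywhere a value is expected, for example in a `return` statement. A small sketch with a hypothetical `parity` function:

```python
def parity(n):
    # <expr1> if <condition> else <expr2>
    return "even" if n % 2 == 0 else "odd"


print(parity(10))
print(parity(7))
```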

## Lists

- lists are the most frequently used built-in containers
- basic operations: indexing, length, append, extend
- lists will be covered in detail next week

In [21]:
l = []  # l = list()
l.append(2)
l.append(2)
l.append("foo")
# l = [2, 2, "foo"]

len(l), l

(3, [2, 2, 'foo'])
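
The cell above shows `append` and `len`. Indexing and `extend`, the other basic operations listed, might look like the following sketch (the values are arbitrary):

```python
numbers = [10, 20, 30]

first = numbers[0]   # indexing starts at 0
last = numbers[-1]   # negative indices count from the end

numbers.append(40)        # adds a single element
numbers.extend([50, 60])  # adds each element of another list

print(first, last, len(numbers), numbers)
```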

## Iteration

### Iterating a list

In [22]:
for e in ["foo", "bar"]:
    print(e)

foo
bar


## `enumerate`: iterating with an index

In [23]:
for idx, element in enumerate(["foo", "bar"]):
    print(idx, element)

0 foo
1 bar
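
`enumerate` also accepts an optional `start` argument, which is handy when a count should begin at 1 rather than 0. A minimal sketch:

```python
# Collect the (index, element) pairs so they can be inspected afterwards.
pairs = list(enumerate(["foo", "bar"], start=1))

for position, element in pairs:
    print(position, element)
```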


## `range`: Iterating over a range of integers

By default, `range` starts from 0 and is right-open: the stop value itself is not included.

In [24]:
for i in range(5):
    print(i)

0
1
2
3
4


In [25]:
for _ in range(5):
    print("Printing 5 times.")

Printing 5 times.
Printing 5 times.
Printing 5 times.
Printing 5 times.
Printing 5 times.


Specifying the start of the range:

In [26]:
for i in range(2, 5):
    print(i)
    
start = 4
window_size = 5
for i in range(start, start+window_size):
    print(i)

2
3
4
4
5
6
7
8


Specifying the step. Note that in this case we need to specify all three positional arguments.

In [27]:
for i in range(0, 10, 2):
    print(i)

0
2
4
6
8


Negative values:

In [28]:
for i in range(-3, 0):
    print(i)

-3
-2
-1


In [29]:
for i in range(-3, 0, -1):
    print(i)

In [30]:
for i in range(0, -3, -1):
    print(i)

0
-1
-2
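
`range(-3, 0, -1)` printed nothing because a negative step counts downwards, and -3 is already below the stop value of 0, so the range is empty. Converting ranges to lists makes this easy to inspect:

```python
empty = list(range(-3, 0, -1))     # start is already below stop: nothing to yield
downward = list(range(0, -3, -1))  # counts down from 0, stops before -3
by_twos = list(range(5, 0, -2))    # counts down in steps of 2, stops before 0

print(empty)
print(downward)
print(by_twos)
```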


`range` only accepts integers:

In [31]:
# range(0, 1, 0.1)  # raises TypeError

The `numpy` module's `arange` function is more flexible and accepts non-integer steps:

In [32]:
import numpy as np

np.arange(0, 1, 0.1)

array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])
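
If `numpy` is not available, one possible workaround is to iterate over integers with `range` and scale each value. This is only a sketch of that idea, not a replacement for `np.arange`:

```python
values = []
for i in range(10):
    # Scale the integer index to get a step of 0.1.
    values.append(i / 10)

print(values)
```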

## `break` and `continue`

`break`: allows early exit from a loop

In [33]:
for i in range(10):
    if i > 4:
        break
    print(i)

0
1
2
3
4


`continue`: allows early jump to next iteration

In [34]:
for i in range(10):
    if i % 2 == 0:
        continue
    print(i)

1
3
5
7
9


## `else`

__`else`__ can be used with `for`: the `else` block runs only if the loop finishes without hitting `break`. This feature is largely unknown, so we do not recommend using it:

In [35]:
numbers = [3, -1, 5, 3, 7]

for n in numbers:
    if n % 2 == 0:
        print("Breaking out of the iteration.")
        break
else:
    print("Found no even numbers.")

Found no even numbers.
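
To see both outcomes side by side, the loop can be wrapped in a function and called with two different lists. The function name `report_even` is a hypothetical one chosen for this sketch.

```python
def report_even(numbers):
    for n in numbers:
        if n % 2 == 0:
            result = f"First even number: {n}"
            break
    else:
        # Runs only if the loop finished without hitting break.
        result = "Found no even numbers."
    return result


print(report_even([3, -1, 5, 3, 7]))
print(report_even([3, 4, 5]))
```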


## while

In [36]:
i = 0
while i < 5:
    print(i)
    i += 1
i

0
1
2
3
4


5

There is no `do...while` loop in Python.
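
A common way to emulate it is an infinite `while True` loop with a `break` at the end of the body, so the body runs at least once. A minimal sketch with made-up input data:

```python
values = [7, 3, 0, 5]  # hypothetical input data
index = 0
processed = []

while True:
    current = values[index]  # the body runs at least once
    processed.append(current)
    index += 1
    if current == 0:         # the condition is checked at the end
        break

print(processed)
```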

# Functions

Functions can be defined using the `def` keyword:

In [37]:
def foo():
    print("this is a function")


foo()

this is a function


## Function arguments

1. positional
2. named or keyword arguments


In [38]:
def foo(arg1, arg2, arg3):
    print("arg1 ", arg1)
    print("arg2 ", arg2)
    print("arg3 ", arg3)


# foo(1, 2, "asdfs")
# foo(1, arg3="asdfs", arg2=2)
foo(1, "asdfs", 2)

arg1  1
arg2  asdfs
arg3  2


Keyword arguments must follow positional arguments:

In [39]:
# foo(1, arg3="asdfs", 2)  # raises SyntaxError
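
The error is raised when the code is parsed, before anything runs. One way to observe it without breaking the surrounding code is to pass the call as a string to the built-in `compile`, as in this sketch (`show_args` is a hypothetical stand-in for `foo`):

```python
def show_args(arg1, arg2, arg3):
    return arg1, arg2, arg3


# Positional arguments first, keyword arguments afterwards, in any order.
print(show_args(1, arg3="asdfs", arg2=2))

# compile() only parses the source, so the SyntaxError can be caught here.
try:
    compile('show_args(1, arg3="asdfs", 2)', "<example>", "eval")
except SyntaxError as err:
    print("SyntaxError:", err.msg)
```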

## Default arguments

- arguments can have default values
- default arguments must follow non-default arguments

In [40]:
# def foo(arg1, arg2=2, arg3):  # raises SyntaxError
def foo(arg1, arg2, arg3=3):
    print("arg1 ", arg1)
    print("arg2 ", arg2)
    print("arg3 ", arg3)


foo(1, 2)

arg1  1
arg2  2
arg3  3


Default arguments need not be specified when calling the function

In [41]:
foo(1, 2)

arg1  1
arg2  2
arg3  3


Keyword arguments can be specified in any order:

In [42]:
foo(arg1=1, arg3=33, arg2=222)

arg1  1
arg2  222
arg3  33


If more than one argument has a default value, any of them can be omitted:

In [43]:
def foo(arg1, arg2=2, arg3=3):
    print("arg1 ", arg1)
    print("arg2 ", arg2)
    print("arg3 ", arg3)

foo(11)
print()
foo(11, 33)
print("")
foo(11, arg3=33)

arg1  11
arg2  2
arg3  3

arg1  11
arg2  33
arg3  3

arg1  11
arg2  2
arg3  33


This mechanism allows functions to have a very large number of arguments, most of which callers never need to set.

The popular data analysis library `pandas` has functions with dozens of arguments, for example:

~~~python
pandas.read_csv(filepath_or_buffer, sep=',', delimiter=None, header='infer', names=None, index_col=None, usecols=None, squeeze=False, prefix=None, mangle_dupe_cols=True, dtype=None, engine=None, converters=None, true_values=None, false_values=None, skipinitialspace=False, skiprows=None, skipfooter=0, nrows=None, na_values=None, keep_default_na=True, na_filter=True, verbose=False, skip_blank_lines=True, parse_dates=False, infer_datetime_format=False, keep_date_col=False, date_parser=None, dayfirst=False, cache_dates=True, iterator=False, chunksize=None, compression='infer', thousands=None, decimal='.', lineterminator=None, quotechar='"', quoting=0, doublequote=True, escapechar=None, comment=None, encoding=None, dialect=None, error_bad_lines=True, warn_bad_lines=True, delim_whitespace=False, low_memory=True, memory_map=False, float_precision=None)
 ~~~

## The return statement

- functions may return more than one value
  - a tuple of the values is returned
- without an explicit return statement `None` is returned
- an empty return statement returns `None`

In [44]:
def foo(n):
    if n < 0:
        return "negative"
    if 0 <= n < 10:
        return "positive", n
    # return None
    # return


print(foo(-2))
print(foo(3), type(foo(3)))
print(foo(12))

negative
('positive', 3) <class 'tuple'>
None
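
When a function returns several values, the caller can either keep the tuple or unpack it into separate names. The helper `min_max` below is a hypothetical example, not part of the course material.

```python
def min_max(values):
    # Two values separated by a comma are packed into a tuple.
    return min(values), max(values)


result = min_max([3, 1, 4])
print(result, type(result))

smallest, largest = min_max([3, 1, 4])  # tuple unpacking
print(smallest, largest)
```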


# Some useful resources

The [official documentation](https://docs.python.org/3/contents.html) is excellent. Make sure you select the correct Python version at the top of the page.

[Python Conferences (PyCons)](https://pycon.org/)
- recorded presentations
- from beginner to advanced level
- tutorials
- search videos on https://pyvideo.org/

YouTube channels:
- https://www.youtube.com/c/mCodingWithJamesMurphy
- https://www.youtube.com/c/ArjanCodes

Subreddits:
- https://www.reddit.com/r/Python/ (1M subscribers)
- https://www.reddit.com/r/learnpython/ (600k subscribers)
- https://www.reddit.com/r/MachineLearning/ (2.5M subscribers)

In [45]:
import this

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
