## Module 4: Python


# Introduction
<br>

Asel Kushkeyeva<br>
Data Science Institute, University of Toronto<br>
2022

# Agenda:

1. Prerequisites and Readings
2. Hello, Python
3. Functions Design
4. Strings
5. True or False
6. Lists, Sets, Tuples, Dictionaries
7. Loops
8. Databases
9. Pandas
10. NumPy
11. Visualization

# Prerequisites

This module introduces the Python programming language, starting from the basics. Students are expected to know how to operate a computer and to be able to install programs on their machines. No prior programming knowledge is required.
<br>

Students are highly encouraged to install the following programs in advance:

- Python 3.10.0 from https://www.python.org/downloads/
- Anaconda from https://www.anaconda.com/products/individual#Downloads
- Jupyter Notebook (within Anaconda)


# Readings

- Gries, P., Campbell, J., Montojo, J., & Coron, T. (2017). *Practical programming: An introduction to computer science using Python 3.6.*

- Adhikari, DeNero, and Wagner, (2021). *Computational and Inferential Thinking: The Foundations of Data Science.*

- Boykis, V. (2019). Neural nets are just people all the way down. https://vicki.substack.com/p/neural-nets-are-just-people-all-the



# Hello, Python

- Install Anaconda from https://www.anaconda.com/products/individual#Downloads.
- Launch Jupyter Notebook.
- On the top right corner of the home page, open a new Python 3 Notebook.
<br>

![jupyter%20new%20notebook.png](attachment:jupyter%20new%20notebook.png)

We will work in this notebook. Rename it (File -> Rename) so that you can find it later.

__Important:__ Our main text is based on Python IDLE, and you are encouraged to try coding in it on your own. In this module, we will mainly work in Jupyter Notebook.

## Arithmetic in Python

In [1]:
9 + 3

12

In [2]:
34 - 4

30

In [3]:
5 * 3

15

In [4]:
20 / 2

10.0

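The last result above is `10.0` rather than `10` because `/` always returns a float. A short sketch of a few other arithmetic operators follows; the variable names are chosen here only for illustration.

```python
# Division with / always produces a float, even when the result is whole.
quotient = 20 / 2        # 10.0

# Floor division (//) discards the fractional part.
whole_part = 20 // 3     # 6

# Modulo (%) gives the remainder of the division.
remainder = 20 % 3       # 2

# Exponentiation (**) raises a number to a power.
power = 2 ** 5           # 32

print(quotient, whole_part, remainder, power)
```
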
## Variables and Computer Memory

In [5]:
degrees_celsius = 24.0

In [6]:
degrees_celsius

24.0

In [7]:
speed = 50

In [8]:
speed

50

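A variable is a name that refers to a value stored in memory. The sketch below, which uses the illustrative name `old_speed`, suggests what happens when a variable is reassigned: the name points to a new value, and other names that held the old value are unaffected.

```python
speed = 50
old_speed = speed        # old_speed now refers to the value 50

# The right-hand side is evaluated first, then speed is bound to the result.
speed = speed + 10

print(speed)             # 60
print(old_speed)         # 50
```
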
## Errors

Syntax error:

In [9]:
2 +

SyntaxError: invalid syntax (<ipython-input-9-8b8e0fea6fab>, line 1)

## Errors

Semantic error:

In [10]:
3 + apple

NameError: name 'apple' is not defined

# Functions Design

## Built-in functions

In [11]:
abs(-31)

31

In [13]:
abs(7.4)

7.4

In [14]:
int(9.3)

9

In [15]:
float(7)

7.0

In [16]:
round(52.9374)

53

In [17]:
round(3.3)

3

## User-defined functions

In [20]:
def length_in_cm(inches):
    return inches * 2.54

In [21]:
length_in_cm(5)

12.7

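When designing a function, it helps to document what it takes and what it returns. The following is a minimal sketch, assuming a hypothetical function `convert_to_celsius` that is not part of the notebook; the docstring holds example calls and a one-line description.

```python
def convert_to_celsius(fahrenheit: float) -> float:
    """Return the temperature in degrees Celsius for the given
    temperature in degrees Fahrenheit.

    >>> convert_to_celsius(212)
    100.0
    >>> convert_to_celsius(32)
    0.0
    """
    return (fahrenheit - 32) * 5 / 9


print(convert_to_celsius(212))   # 100.0
```

Calling `help(convert_to_celsius)` in a notebook cell displays the docstring.
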
# Strings

In [22]:
'friend'

'friend'

In [23]:
"my friend"

'my friend'

Quotes need to match at the beginning and end:

In [24]:
'We code"

SyntaxError: EOL while scanning string literal (<ipython-input-24-c126b3bb79c4>, line 1)

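Because either quote style is allowed, one style can enclose the other. A small sketch, with example strings made up for illustration:

```python
# Double quotes let the string contain an apostrophe.
phrase = "Don't panic"

# Single quotes let the string contain double quotes.
quoted = 'She said "hello"'

# len() counts every character, including quote characters inside the string.
print(phrase, len(phrase))   # Don't panic 11
print(quoted, len(quoted))   # She said "hello" 16
```
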
## A multiline string

In [25]:
numbers = """one
two
three"""

In [26]:
numbers

'one\ntwo\nthree'

We will discuss `\n` (the newline character) and other escape characters later in the module.

In [27]:
print(numbers)

one
two
three


# True or False

The type `bool` has only two values: __True__ and __False__.
<br>

Boolean operators: 
- and;
- or;
- not. 


In [28]:
cold = True
windy = False

In [29]:
(not cold) and windy

False

In [30]:
not (cold and windy)

True

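The three Boolean operators can be sketched with the same `cold` and `windy` values. The result names (`both`, `either`, and so on) are introduced here only for illustration. Note that `not` is applied before `and`, so parentheses change the result.

```python
cold = True
windy = False

both = cold and windy        # True only if both operands are True
either = cold or windy       # True if at least one operand is True
not_cold = not cold          # flips True to False and False to True

# Without parentheses, not binds more tightly than and.
no_parens = not cold and windy       # same as (not cold) and windy
with_parens = not (cold and windy)

print(both, either, not_cold)        # False True False
print(no_parens, with_parens)        # False True
```
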
__Relational operators__

|Symbol|Meaning|
|------|-------|
|>|Greater than|
|<|Less than|
|>=|Greater than or equal to|
|<=|Less than or equal to|
|==|Equal to|
|!=|Not equal to|

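Each relational operator in the table produces a value of type `bool`. A brief sketch, assuming an illustrative variable `temperature`:

```python
temperature = 24.0

is_hot = temperature > 30            # False
is_at_most_24 = temperature <= 24    # True
is_exactly_24 = temperature == 24    # True: 24.0 and 24 compare as equal
is_different = temperature != 24     # False

# Comparisons can be chained to test a range.
is_mild = 18 <= temperature < 30     # True

print(is_hot, is_at_most_24, is_exactly_24, is_different, is_mild)
```
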

# if, elif, else

General form:

__if__ `condition`:
<br>

> `block`

*if* example:

In [39]:
letter = input('Enter a letter:')

Enter a letter:a


In [41]:
if letter in 'aeoiu':
    print(letter, 'is a vowel.')

a is a vowel.


The general form of *if* with *elif* extends a single *if* statement with __as many *elif* statements as required.__ Conditions are checked from top to bottom, and only the block of the first true condition runs.

__if__ `condition`:
<br>

> `if_block`
<br>

__elif__ `condition`:
<br>
> `elif_block`

*if* and *elif* example:

In [51]:
number_x = int(input('Enter a number:'))

Enter a number:0


In [52]:
if number_x > 0:
    print(number_x, 'is positive.')
elif number_x < 0:
    print(number_x, 'is negative.')
elif number_x == 0:
    print(number_x, 'is zero.')

0 is zero.


*if* and *else* general form:

__if__ `condition`:
<br>

> `if_block`
<br>

__else__:
<br>
> `else_block`

*if* and *else* example:

In [42]:
letter = input('Enter a letter:')

Enter a letter:d


In [53]:
if letter in 'aeoiu':
    print(letter, 'is a vowel.')
else:
    print(letter, 'is a consonant.')

d is a consonant.

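The three keywords can be combined in one statement. The sketch below wraps the number example in a hypothetical function, `describe_number`, so that it can be tried with several values without typing input each time.

```python
def describe_number(number):
    """Return 'positive', 'negative', or 'zero' for the given number."""
    if number > 0:
        return 'positive'
    elif number < 0:
        return 'negative'
    else:
        # Reached only when neither condition above is true.
        return 'zero'


print(describe_number(7))    # positive
print(describe_number(-2))   # negative
print(describe_number(0))    # zero
```
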

# Lists, Sets, Tuples, Dictionaries

A __list__ is a type of data storage in Python. It holds a collection of objects and is itself an object. Lists are written as [`object1`, `object2`,..., `objectN`]. The following list shows the number of students in seven classrooms:


In [54]:
students = [23, 30, 15, 20, 25, 32, 18]

Each item in the list has an *index*. In Python, items are indexed starting from 0, so classroom 1 is at index 0. To get the number of students in classrooms 1, 2, 6, and 7:

In [55]:
students[0]

23

In [56]:
students[1]

30

In [57]:
students[5]

32

In [58]:
students[6]

18

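A few more ways to look inside the same list are sketched here. The operations are standard Python; the variable names on the left are illustrative only.

```python
students = [23, 30, 15, 20, 25, 32, 18]

number_of_classrooms = len(students)   # 7 items in the list
last_classroom = students[-1]          # negative indices count from the end: 18
first_two = students[0:2]              # a slice from index 0 up to, not including, 2
total_students = sum(students)         # 163

print(number_of_classrooms, last_classroom, first_two, total_students)
```
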
__Sets, tuples, and dictionaries__ are other types of data storage in Python. Some differences between them are:
- Items in a set are unordered and unique. A set is written as {`object1`, `object2`,..., `objectN`}.

- The content of a tuple cannot be changed. Items in a tuple are grouped in parentheses: (`object1`, `object2`,..., `objectN`).

- Dictionaries consist of key-value pairs: {`key1 : value1`, `key2 : value2`,..., `keyN : valueN`}

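The differences listed above can be sketched in a few lines. All names and values here (`rooms`, `point`, `enrolment`) are made up for illustration.

```python
# A set keeps only one copy of each item.
rooms = {101, 102, 102, 103}
print(len(rooms))                # 3

# A tuple is created once and cannot be modified afterwards.
point = (3, 4)
print(point[0])                  # 3
# point[0] = 10 would raise a TypeError, because tuples are immutable.

# A dictionary maps keys to values; values are looked up by key.
enrolment = {'room_101': 23, 'room_102': 30}
enrolment['room_103'] = 15       # add a new key-value pair
print(enrolment['room_102'])     # 30
print(len(enrolment))            # 3
```
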

# Loops

## A *for* loop

A *for* loop repeats a block of code once for each item in a collection of data. The collection could be a list, a string, or a range of numbers. The general form is as follows.

__for__ `variable` in `list`:
<br>

> `block`

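The general form above can be sketched for each kind of collection mentioned: a list, a string, and a range of numbers. The variable names are illustrative.

```python
# Loop over a list: add up the students in all classrooms.
students = [23, 30, 15, 20, 25, 32, 18]
total = 0
for count in students:
    total = total + count
print(total)        # 163

# Loop over a string: count the vowels, one character at a time.
vowels = 0
for letter in 'python':
    if letter in 'aeiou':
        vowels = vowels + 1
print(vowels)       # 1

# Loop over a range: range(4) produces 0, 1, 2, 3.
squares = []
for number in range(4):
    squares.append(number * number)
print(squares)      # [0, 1, 4, 9]
```
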
## A *while* loop

A *while* loop is used when the number of iterations is not known in advance. It keeps executing its block as long as the condition is true and stops once the condition becomes false. For example, the following code stops printing the number of rabbits when the condition `rabbits > 0` becomes false.


In [59]:
rabbits = 3

In [60]:
while rabbits > 0:
    print(rabbits)
    rabbits = rabbits - 1

3
2
1

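In the example above the number of iterations is easy to predict. The sketch below shows a case where it is less obvious. It assumes, purely for illustration, a population that doubles every year, and counts the years until it reaches 1000.

```python
population = 3
years = 0

# Keep doubling until the population reaches at least 1000.
while population < 1000:
    population = population * 2
    years = years + 1

print(years)        # 9
print(population)   # 1536
```

If the condition never becomes false, the loop runs forever, so the block must change something that the condition depends on.
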

# Databases

*Relational databases* are currently the most widely used type of database. In essence, a relational database consists of spreadsheet-like tables. __SQL__ is a specialized language for interacting with such databases. In this course, we will use an open-source database called *SQLite*, because Python includes the `sqlite3` module for working with it. All programs will start with this line:


In [61]:
import sqlite3

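As a rough preview of what follows the import, the sketch below creates a temporary in-memory database, stores a small table, and queries it with SQL. The table name `classrooms` and its columns are assumptions made for this example only.

```python
import sqlite3

# ':memory:' creates a temporary database that disappears when closed.
connection = sqlite3.connect(':memory:')
cursor = connection.cursor()

# Create a table and insert three rows.
cursor.execute('CREATE TABLE classrooms (room INTEGER, students INTEGER)')
cursor.executemany(
    'INSERT INTO classrooms VALUES (?, ?)',
    [(1, 23), (2, 30), (3, 15)],
)
connection.commit()

# Ask for the classrooms with more than 20 students.
cursor.execute(
    'SELECT room, students FROM classrooms WHERE students > 20 ORDER BY room'
)
rows = cursor.fetchall()
print(rows)         # [(1, 23), (2, 30)]

connection.close()
```
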
# Pandas

*Pandas* is a Python library that helps to explore, clean, and analyze data.
<br>

- Type `conda install pandas` in the first empty cell of the Jupyter Notebook you created earlier.
- Run the cell with the Run button on the top panel. The command installs the *pandas* library.
- The next line of code will be:


In [62]:
import pandas as pd

__Tables in pandas are called DataFrames.__
<br>

![pandas%20dataframe.png](attachment:pandas%20dataframe.png)

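A minimal sketch of building a DataFrame by hand, assuming the hypothetical column names `room` and `students`. Each dictionary key becomes a column, and each list holds that column's values.

```python
import pandas as pd

classrooms = pd.DataFrame({
    'room': [1, 2, 3],
    'students': [23, 30, 15],
})

print(classrooms)                     # the whole table
print(classrooms.shape)               # (3, 2): three rows, two columns
print(classrooms['students'].max())   # 30

# Keep only the rows where the condition is true.
busy = classrooms[classrooms['students'] > 20]
print(len(busy))                      # 2
```
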
# NumPy

*NumPy* is a Python library and a great tool for scientific computing. NumPy’s main object is a multidimensional array. Its array class is called *ndarray*.
<br>

Similar to *pandas*, first install numpy in the notebook:
- Type `conda install numpy`.
- Then import the library:


In [63]:
import numpy as np

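A small sketch of an `ndarray` with two dimensions. The name `grid` and its values are chosen for illustration; the attributes and operations shown are standard NumPy.

```python
import numpy as np

# A two-dimensional array with 2 rows and 3 columns.
grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

print(type(grid).__name__)   # ndarray
print(grid.ndim)             # 2
print(grid.shape)            # (2, 3)

# Arithmetic is applied to every element at once.
doubled = grid * 2
print(doubled.tolist())      # [[2, 4, 6], [8, 10, 12]]
print(grid.sum())            # 21
```
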
# Visuals

*matplotlib* and *seaborn* are Python libraries for creating static and interactive visualizations.
<br>

Install them in your notebook:

- `conda install matplotlib`
- `conda install seaborn`
<br>

Then import the libraries:

In [64]:
import matplotlib.pyplot as plt
import seaborn as sns

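As a first taste, the sketch below draws a bar chart of the classroom list used earlier with *matplotlib* alone. The axis labels and title are assumptions made for this example.

```python
import matplotlib.pyplot as plt

students = [23, 30, 15, 20, 25, 32, 18]
classrooms = [1, 2, 3, 4, 5, 6, 7]

# Create a figure with one set of axes and draw one bar per classroom.
fig, ax = plt.subplots()
bars = ax.bar(classrooms, students)

ax.set_xlabel('Classroom')
ax.set_ylabel('Number of students')
ax.set_title('Students per classroom')

# In Jupyter Notebook the chart appears below the cell.
plt.show()
```
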
### Jupyter Notebook as a Slideshow

To see this notebook as a live slideshow, we need to install RISE (Reveal.js - Jupyter/IPython Slideshow Extension):

1. Insert a cell and execute the following code: `conda install -c conda-forge rise`
2. Restart the Jupyter Notebook.
3. On the top of your notebook you have a new icon that looks like a bar chart; hover over the icon to see 'Enter/Exit RISE Slideshow'.
4. Click on the RISE icon and enjoy the slideshow.
5. You can edit the notebook in slideshow mode by double-clicking a cell.

*The installation is done only once. Afterwards, all your notebooks will have the RISE extension (unless you re-install the Jupyter Notebook).*

# References

- Adhikari, DeNero, and Wagner, 2021, *Computational and Inferential Thinking: The Foundations of Data Science*
- Anaconda. https://www.anaconda.com
- Gries, Campbell, and Montojo, 2017, *Practical Programming: An Introduction to Computer Science Using Python 3.6*
- Matplotlib. https://matplotlib.org
- NumPy. https://numpy.org
- Pandas. https://pandas.pydata.org
- Seaborn. https://seaborn.pydata.org