(**You can also open this notebook in Google Colab**)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/xiangshiyin/data-programming-with-python/blob/main/2023-fall/2023-08-29/notebook/code_demo.ipynb)

# Jupyter Notebook demo

# General principle of program execution

## Example 1 - Simple addition

In [1]:
x = 1
y = 2
x + y

3

## Example 2 - Order of Operations

Compute `2 * 3 + 1`

In [2]:
x = 2
x = x * 3
x = x + 1
print(x)

7


If we swap the order of the statements, the result changes:

In [3]:
x = 2
x = x + 1
x = x * 3
print(x)

9


In [6]:
del x

In [7]:
x = x * 3
x = x + 1
x = 2
print(x)

NameError: name 'x' is not defined

# Primitive Data Types

## Numbers
* To define a variable of a numeric data type, follow the syntax `VarName = value`
* Use the function `int()` to convert a value to an integer (a float is truncated toward zero, not rounded)
* Use the function `float()` to convert a value to a floating-point number
* Use `type()` to check the data type of an object

In [8]:
x = 123
x + 1

124

In [9]:
x = 123.1
x + 1

124.1

In [10]:
int(123.6)

123

In [11]:
round(123.6, 0)

124.0

In [12]:
float(123)

123.0

In [13]:
type(float(123))

float

**In fact, you can use the `type()` function to check the data type of any object in Python**

**Common arithmetic operators**

| Operation | Result                           |
| --------- | -------------------------------- |
| x + y     | sum of x and y                   |
| x - y     | difference of x and y            |
| x * y     | product of x and y               |
| x / y     | quotient of x and y              |
| x // y    | floored quotient of x and y      |
| x % y     | remainder of x / y               |
| -x        | x negated                        |
| +x        | x unchanged                      |
| abs(x)    | absolute value or magnitude of x |
| pow(x, y) | x to the power y                 |
| x ** y    | x to the power y                 |

In [14]:
7 / 3

2.3333333333333335

In [15]:
7 // 3

2

In [16]:
7 % 3

1

In [17]:
(7 // 3) * 3 + (7 % 3)

7

In [18]:
pow(2, 3)

8

In [19]:
2 ** 3

8

In [21]:
2 ** 0.5

1.4142135623730951
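
The identity `(7 // 3) * 3 + (7 % 3) == 7` shown earlier also holds for negative operands, because `//` rounds toward negative infinity and `%` takes the sign of the divisor. The following is a minimal sketch of that behavior; the variable names are illustrative and not part of the lecture.

```python
# Floor division rounds toward negative infinity, not toward zero.
a, b = -7, 3
quotient = a // b        # -3, because -7 / 3 is about -2.33 and its floor is -3
remainder = a % b        # 2, the remainder takes the sign of the divisor

# The identity quotient * b + remainder == a still holds.
rebuilt = quotient * b + remainder

# divmod() returns the floored quotient and the remainder together.
pair = divmod(a, b)

# int() truncates toward zero, so it differs from // for negative results.
truncated = int(a / b)

print(quotient, remainder, rebuilt, pair, truncated)
```

For positive operands `//` and `int(x / y)` agree, which is why the difference is easy to miss.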

## Strings

### Characters and strings
* Character: a single letter, number, or symbol
  * Example: `'a', '1', '\n'`
* String: a sequence of characters
  * Example: `"I like programming"`
* Can be expressed in a variety of ways:
  * Single quotes: `'single quotes'`
  * Double quotes: `"double quotes"`
  * Triple quoted: `'''Three single quotes'''` or `"""Three double quotes"""`
* You can use `str()` to cast the value type to string
* Special characters
  * `\n` - new line 
  * `\t` - tab (often displayed as up to 8 spaces)

In [22]:
x = 'a'
x

'a'

In [23]:
y = "abc"
type(y)

str

In [24]:
y = """abc"""
type(y)

str

In [25]:
print(y)

abc


In [27]:
z = '
the first line
the second line
the third line
'
z

SyntaxError: unterminated string literal (detected at line 1) (3015490244.py, line 1)

In [26]:
z = """
the first line
the second line
the third line
"""
z

'\nthe first line\nthe second line\nthe third line\n'

In [28]:
print(z)


the first line
the second line
the third line



In [29]:
xx = 123
type(xx)

int

In [30]:
str(xx)

'123'

In [31]:
type(str(xx))

str

In [32]:
## special characters
yy = 'a\nb\nc'
print(yy)

a
b
c


In [33]:
## special characters
zz = 'a\tb\tc'
print(zz)

a	b	c


### String properties
With `string` data, you can
* Check the length of a string with `len()`
* Concatenate and repeat strings with the `+` and `*` operators
* Change the case with `str.lower()` and `str.upper()`
* Replace part of the string with `str.replace()`
* Check if a substring is part of a given string with the `in` operator
* Index and slice the string
  * Each character of the string is assigned an index number representing its position in the string, and index numbers start from 0
  * General slicing format - `StringValue[<lower_index>:<upper_index>]`
    * `<lower_index>` is inclusive
    * `<upper_index>` is exclusive
  * Negative indexing counts from the end of the string: `-1` is the last character

In [34]:
x = 'adj;gja[gdjg;ajg;g]'
len(x)

19

In [35]:
y = 'x\ty'
len(y)

3

In [36]:
z = 'a'
z * 5

'aaaaa'

In [38]:
x2 = x.upper()
print(type(x2))
print(x2)

<class 'str'>
ADJ;GJA[GDJG;AJG;G]


In [39]:
x.upper().lower()

'adj;gja[gdjg;ajg;g]'

In [40]:
x = 'abababababab'
y = x.replace('a', '0')
print(y)

0b0b0b0b0b0b


In [41]:
x = 'abfdabfgabfh'
y = x.replace('abf','0')
print(y)

0d0g0h


In [42]:
'b' in x

True

In [43]:
'i' in x

False

In [44]:
xx = 'abcdefghijklmnopqrstuvwxyz'
len(xx)

26

In [47]:
# print(xx[0])
# print(xx[1])
print(xx[25])

z


In [48]:
xx[26]

IndexError: string index out of range

In [49]:
xx[1:5]

'bcde'

In [50]:
# negative indexing
xx[-1]

'z'

In [51]:
xx[-2]

'y'

In [52]:
xx[-3:-1]

'xy'

In [53]:
xx[-2:-5]

''

In [54]:
xx[5:1]

''

In [55]:
xx[2:1]

''

In [56]:
xx[1:1]

''

In [58]:
xx[-3:25]

'xy'
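
The cells above do not show string concatenation with `+`, or slices with an omitted bound. The sketch below fills that gap and also shows the optional third slice value (the step), which explains why `xx[-2:-5]` is empty: with the default step of `+1`, a slice whose lower index is at or past its upper index selects nothing. The variable names are illustrative.

```python
alphabet = 'abcdefghijklmnopqrstuvwxyz'

# + concatenates strings into a new string.
greeting = 'data' + ' ' + 'programming'

# Omitting the lower index starts at the beginning;
# omitting the upper index runs to the end.
first_three = alphabet[:3]
last_three = alphabet[-3:]

# A third value sets the step; a negative step walks backwards.
every_fifth = alphabet[::5]
reversed_tail = alphabet[-1:-4:-1]

print(greeting, first_three, last_three, every_fifth, reversed_tail)
```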

### String formatting [[Official Documentation](https://docs.python.org/3.8/library/string.html#formatstrings)]

#### "Old C style" string formatting
The `%` operator formats one value, or several values enclosed in a `tuple` (a fixed-size sequence, covered later in this class), according to a format string that contains normal text together with `argument specifiers`.

Common `argument specifiers` include:
* `%s` - String (or any object with a string representation, like numbers)
* `%d` - Integers
* `%f` - Floating point numbers (by default, it keeps 6 decimal digits)
* `%.<number of digits>f` - Floating point numbers with a fixed amount of digits to the right of the dot.


In [59]:
## format string with 1 placeholder
name = 'John'
'My name is %s' % name

'My name is John'

In [60]:
age = 21
'His age is %d' % age

'His age is 21'

In [61]:
## format string with 2 placeholders
name = 'Xiangshi'
balance = 123.4
'Hello %s. Your current bank account balance is $%.2f' % (name, balance)

'Hello Xiangshi. Your current bank account balance is $123.40'
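
The default precision of `%f` and the behavior of `%s` and `%d` with a float are listed above but not demonstrated. A small sketch, assuming an arbitrary example value:

```python
value = 3.14159265          # arbitrary example value

default_float = '%f' % value    # six digits after the decimal point
two_digits = '%.2f' % value     # rounded to two digits
as_string = '%s' % value        # same text as str(value)
as_integer = '%d' % value       # %d drops the fractional part

print(default_float, two_digits, as_string, as_integer)
```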

#### String formatting via the `format()` method
* In Python 3, you can also format strings by calling the `.format()` method on a string object. `{}` is used as a replacement field for values you'd like to plug in, and also a container for `format specifications`.
* A general convention is that <ins>an empty format specification produces the same result as if you had called the function `str()` on the value</ins>. A non-empty format specification typically modifies the result.
* The common pattern of a replacement field is like `{field_name:format_spec}`
* Check the [official documentation](https://docs.python.org/3.8/library/string.html#formatstrings) for more details on `format specifications` pattern

* Example 1: no format modification
```
"{} {}".format(a,b), "{0} {1}".format(a,b), or "{A} {B}".format(A=a,B=b)
```

In [63]:
a = 1
b = 2
c = a + b
# '{} plus {} is {}'.format(b,a,c) 
'{B} plus {A} is {C}'.format(B=b,A=a,C=c) 

'2 plus 1 is 3'


* Example 2: floating point number

In [66]:
a = 1
b = 2
c = a + b
# '{A:f} plus {B:f} is {C:f}'.format(A=a,B=b,C=c) 
'{A:d} plus {B:d} is {C:d}'.format(A=a,B=b,C=c) 
# you can also use index as the field_name
# '{0:f} plus {1:f} is {2:f}'.format(a,b,c) 
# '{:f} plus {:f} is {:f}'.format(a,b,c) 

'1 plus 2 is 3'

In [65]:
## control the precision
'{A:.2f} plus {B:.2f} is {C:.2f}'.format(A=a,B=b,C=c) 

'1.00 plus 2.00 is 3.00'

In [67]:
## align the number
print('{:>6.0f}'.format(1))
print('{:>6.1f}'.format(1))
print('{:>6.2f}'.format(1))
print('{:>6.3f}'.format(1))
print('{:>6.4f}'.format(1))

     1
   1.0
  1.00
 1.000
1.0000


In [69]:
print('{:<6.0f}'.format(1))
print('{:<6.1f}'.format(1))

1     
1.0   


In [70]:
print('{:^6.0f}'.format(1))
print('{:^6.1f}'.format(1))

  1   
 1.0  


#### `f-string` [[Real Python tutorial](https://realpython.com/python-f-strings/)]

* `f-string` formatting has been available since Python 3.6
* In general it uses the same format specifications as the `format()` method and is even more concise


In [None]:
a = 3
b = 2
c = a + b
f'{b} plus {a} is {c}'
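
As a sketch of how the earlier `format()` examples translate to f-strings, the block below reuses the `name` and `balance` values from the `%` example. The part after the colon inside the braces is the same format specification used by `format()`.

```python
name = 'Xiangshi'
balance = 123.4

# Precision is controlled the same way as in '{:.2f}'.format(balance).
message = f'Hello {name}. Your current bank account balance is ${balance:.2f}'

# Alignment and width work the same way as in '{:>10}'.format('Test').
right_aligned = f'{"Test":>10}'

# Any expression can appear inside the braces.
a, b = 3, 2
summary = f'{b} plus {a} is {a + b}'

print(message)
print(right_aligned)
print(summary)
```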

### Challenge
Right-align a string within a predefined width

In [72]:
"{:>10}".format("Test")

'      Test'

In [73]:
"{:^10}".format("Test")

'   Test   '

In [74]:
"{:<10}".format("Test")

'Test      '

In [75]:
"{:>10}".format("This is our first class, I enjoy meeting everyone here")

'This is our first class, I enjoy meeting everyone here'

In [76]:
# A different way!!
"Test".ljust(10, '*')

'Test******'

In [77]:
"Test".rjust(10, '*')

'******Test'

## Boolean
The Boolean type is a built-in data type with only two values: `True` and `False`. In arithmetic they behave like the integers 1 and 0. Booleans are useful in conditional and comparison expressions.
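
The relationship between Booleans and the integers 1 and 0 can be checked directly. This is a minimal sketch; the comparison values are made up for illustration.

```python
# bool is a subclass of int, so True and False act like 1 and 0 in arithmetic.
is_int_subclass = issubclass(bool, int)
equal_to_one = (True == 1)
total = True + True + False

# Summing comparison results counts how many of them are True.
passed = sum([55 >= 60, 72 >= 60, 90 >= 60, 48 >= 60])

# They are not identical in every respect: the text form differs.
as_text = str(True)

print(is_int_subclass, equal_to_one, total, passed, as_text)
```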

In [83]:
x = 1
# x == 1
# print(x==1)
print(not x==2)
# print(x!=2)

True


In [84]:
(100>10) and (100<200)

True

In [85]:
(100>10) or (100>200)

True

In [86]:
((100>10) & (100<200))==True
# ((100>10) & (100<200))==False

True

In [87]:
((100>10) | (100>200)) == True

True

**Logical operators**

| Operator                                                              | Description                                                                        |
|-----------------------------------------------------------------------|------------------------------------------------------------------------------------|
| or                                                                    | Boolean OR                                                                         |
| and                                                                   | Boolean AND                                                                        |
| not x                                                                 | Boolean NOT                                                                        |
| in, not in, is, is not, <, <=, >, >=, !=, ==                          | Comparisons, including membership tests and identity tests                         |
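
The cells above use `&` and `|` with parentheses around each comparison. The sketch below suggests why the parentheses matter: `&` and `|` are bitwise operators that bind more tightly than comparisons, whereas `and` and `or` bind more loosely. The numbers are chosen only to make the difference visible.

```python
# `and` has lower precedence than comparisons: (5 > 3) and (4 > 1)
with_and = 5 > 3 and 4 > 1

# `&` has higher precedence than comparisons, so this is parsed as
# 5 > (3 & 4) > 1, that is 5 > 0 > 1, which is False.
with_ampersand = 5 > 3 & 4 > 1

# Parentheses restore the intended meaning.
with_parentheses = (5 > 3) & (4 > 1)

print(with_and, with_ampersand, with_parentheses)
```

For plain Boolean logic, `and`, `or`, and `not` are usually the safer choice.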

## Exercise

In [88]:
## input student name, output the name in upper case
name = input('Type in a name:\n')
name2 = name.upper()
print(f'The transformed name is {name2}')

Type in a name:
 xiangshi yin


The transformed name is XIANGSHI YIN


In [90]:
## input student name, check if the input name is John
name = input('Type in a name:\n')
check = name == 'John'
print(f'Name check result: {check}')

Type in a name:
 Adam


Name check result: False


In [91]:
name = 'Adam'
name == 'John'

False

In [92]:
name = 'John'
name == 'John'

True

In [93]:
name = 'John1'
name == 'John'

False

In [94]:
name = 'john'
name == 'John'

False

In [96]:
## input student name, check if the input name is John
name = input('Type in a name:\n')
check = name.upper() == 'John'.upper()
print(f'Name check result: {check}')

Type in a name:
 JoHN


Name check result: True


## Revisit the example from last class

In [97]:
x = 1
y = x
print(id(x))
print(id(y))

140444222914800
140444222914800


In [98]:
x = 123
y = 123
print(id(x))
print(id(y))

140444222918704
140444222918704


In [99]:
x = 'abc'
y = 'abc'
print(id(x))
print(id(y))

140443954601904
140443954601904


In Python, there are two types of data types: `immutable` and `mutable`. Immutable data types cannot be changed once they are created, while mutable data types can be changed.
* `Immutable` data types include:
    * Numbers
    * Strings
    * Tuples
* `Mutable` data types include:
  * Lists
  * Sets
  * Dictionaries

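For some `immutable` values, such as small integers and short strings, the Python interpreter (CPython) saves memory by reusing a single object, which is why `x` and `y` above report the same `id`. This is an implementation detail and should not be relied on; for example, two equal tuples can have different ids.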
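
A minimal sketch of what immutable and mutable mean in practice, and of the difference between equal values (`==`) and the same object (`is`). The variable names are illustrative, and `try`/`except` is used only to capture the error instead of stopping the cell.

```python
text = 'abcd'
try:
    text[0] = 'E'              # strings do not support item assignment
except TypeError as err:
    string_error = str(err)

# Methods such as replace() return a new string; the original is untouched.
changed = text.replace('a', 'E')

numbers = [1, 2, 3]
before = id(numbers)
numbers[0] = 99                # lists can be changed in place
same_object = (id(numbers) == before)

# == compares values, `is` compares identity.
a = [1, 2, 3]
b = [1, 2, 3]
equal_values = (a == b)
same_identity = (a is b)       # two separate list objects

print(string_error, text, changed, numbers, same_object, equal_values, same_identity)
```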

In [None]:
x = 1
y = x
print(id(x))
print(id(y))

In [None]:
x = 2
print(y)

In [None]:
print(id(x))
print(id(y))

In [102]:
x = 'abcd'
print(f'The id of variable x before replace: {id(x)}')
print(f"The id of the new string returned by replace: {id(x.replace('a', 'E'))}")

The id of variable x before replace: 140443962178544
The id of the new string returned by replace: 140444230926256


# Non-primitive Data Types

## List
[[official documentation](https://docs.python.org/3/tutorial/datastructures.html)]
* A list is a mutable sequence, typically used to store a collection of separate values. It is written as comma-separated values (items) between square brackets.
* It is normally used to store homogeneous items. However, items in a list don't need to be of the same data type.
* <span style="color:blue">Each item of a list is assigned an index number representing its position, and the index starts from 0</span> (Does this sound familiar?)

**That's because both `string` and `list` are `sequence` types!!**

In [103]:
# Create an empty list
x = []

In [104]:
x = list()
type(x)

list

In [105]:
# Create a list of multiple elements
x = [1,2,3,4,5]

In [106]:
# Create a list of mixed data types
y = ['a',1,'b',2,'c']

In [107]:
x

[1, 2, 3, 4, 5]

In [108]:
y

['a', 1, 'b', 2, 'c']

### Common properties between `string` and `list`

#### Measure size with `len()`

In [109]:
x = [1,2,3]
len(x)

3

#### Indexing

In [110]:
x = [1,2,3,4,5]
x[1:3]

[2, 3]

In [111]:
x[1:]

[2, 3, 4, 5]

In [112]:
x[-1]

5

#### Expand

In [114]:
x = [1,2,3]
print(id(x))
x = x + [4]
# x
print(id(x))

140443962123904
140443962145856


In [115]:
x = 'abc'
x = x + 'def'
print(x)

abcdef


In [116]:
x = [1,2,3]
x * 5

[1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3]

#### Check if an element exists

In [117]:
4 in [1,2,3]

False

### "NEW" property for `list` (because it is mutable ...)

#### Expand ... in place

In [118]:
x = [1,2,3]
id(x)

140443962084352

In [119]:
x.append(4)
print(x)
print(id(x))

[1, 2, 3, 4]
140443962084352


In [120]:
x.extend([4,5,6])
print(x)
print(id(x))

[1, 2, 3, 4, 4, 5, 6]
140443962084352


In [121]:
# insert to a specific position
x.insert(1,9)
print(x)
print(id(x))

[1, 9, 2, 3, 4, 4, 5, 6]
140443962084352
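
`append()` and `extend()` behave differently when the argument is itself a list. A short sketch with illustrative variable names:

```python
x = [1, 2, 3]
x.append([4, 5])      # adds the list itself as one element
nested = x            # [1, 2, 3, [4, 5]]

y = [1, 2, 3]
y.extend([4, 5])      # adds each element of the argument
flat = y              # [1, 2, 3, 4, 5]

z = [1, 2, 3]
z_id = id(z)
z += [4, 5]           # += on a list also extends it in place
same_list = (id(z) == z_id)

print(nested, len(nested))
print(flat, len(flat))
print(z, same_list)
```

Note the contrast with `x = x + [4]` in the "Expand" cell above, which builds a new list and therefore changes the `id`.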


#### Element/value changes

In [123]:
x = [1,2,3]
print(f'Before change, x = {x}, id = {id(x)}')
x[1] = 5
print(f'After change, x = {x}, id = {id(x)}')

Before change, x = [1, 2, 3], id = 140443962124032
After change, x = [1, 5, 3], id = 140443962124032


In [124]:
x = [1,2,3,4]
print(f'Before pop, x = {x}, id = {id(x)}')
x.pop()
print(f'After pop, x = {x}, id = {id(x)}')

Before pop, x = [1, 2, 3, 4], id = 140443962141760
After pop, x = [1, 2, 3], id = 140443962141760


In [125]:
x = [1,2,3,4]
print(f'Before pop, x = {x}')
x.pop(1)
print(f'After pop, x = {x}')

Before pop, x = [1, 2, 3, 4]
After pop, x = [1, 3, 4]


#### Sort (because there is a sequence)

In [126]:
## list.sort()
x = [4,5,2,1]
print(f'Before sort, x = {x}, id = {id(x)}')
x.sort() # x.sort() does in-place sorting
print(f'After sort, x = {x}, id = {id(x)}')

Before sort, x = [4, 5, 2, 1], id = 140443962176320
After sort, x = [1, 2, 4, 5], id = 140443962176320


In [127]:
x = [4,5,2,1]
print(f'Before sort, x = {x}, id = {id(x)}')
x.sort(reverse=True) # x.sort() does in-place sorting
print(f'After sort, x = {x}, id = {id(x)}')

Before sort, x = [4, 5, 2, 1], id = 140443962084352
After sort, x = [5, 4, 2, 1], id = 140443962084352


In [128]:
## sorted()
x = [4,5,2,1]
print(f'Before sort, x = {x}, id = {id(x)}')
print(f'Sort result: {sorted(x)}') # sorted(x) output a new list holding the sorted values
print(f'After sort, x = {x}, id = {id(x)}')

Before sort, x = [4, 5, 2, 1], id = 140443962129856
Sort result: [1, 2, 4, 5]
After sort, x = [4, 5, 2, 1], id = 140443962129856
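
One consequence of in-place sorting is that `list.sort()` returns `None`, so assigning its result is a common mistake. A minimal sketch with illustrative names:

```python
x = [4, 5, 2, 1]
result = x.sort()                      # sorts x in place and returns None

y = [4, 5, 2, 1]
descending = sorted(y, reverse=True)   # new list, y itself is unchanged

# sorted() accepts any sequence, including a string, and returns a list.
letters = sorted('python')

print(result, x)
print(descending, y)
print(letters)
```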


#### The `del` statement

In [130]:
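del x  # removes the name x; the NameError below most likely comes from running this cell again after x was already deleted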

NameError: name 'x' is not defined

In [131]:
x

NameError: name 'x' is not defined

### Revisit the "address" problem

In [132]:
x = [1,2,3]
y = x
print(f"""
Before the element change in x: 
    x = {x}
    y = {y}
    id of x: {id(x)}
    id of y: {id(y)}
""")
x[1] = 5
print(f"""
After the element change in x: 
    x = {x}
    y = {y}
    id of x: {id(x)}
    id of y: {id(y)}
""")



Before the element change in x: 
    x = [1, 2, 3]
    y = [1, 2, 3]
    id of x: 140443962298496
    id of y: 140443962298496


After the element change in x: 
    x = [1, 5, 3]
    y = [1, 5, 3]
    id of x: 140443962298496
    id of y: 140443962298496



In [133]:
x = 1
y = x
x = 2
print(y)

1


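**In the list example, the value of `y` changes along with `x` because both names refer to the same list object, which is modified in place. In the integer example, `x = 2` rebinds `x` to a different object, so `y` still refers to `1`.**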

In [134]:
# Any way to prevent this from happening??
x = [1,2,3]
y = x.copy()
print(f"""
Before the element change in x: 
    x = {x}
    y = {y}
    id of x: {id(x)}
    id of y: {id(y)}
""")
x[1] = 5
print(f"""
After the element change in x: 
    x = {x}
    y = {y}
    id of x: {id(x)}
    id of y: {id(y)}
""")



Before the element change in x: 
    x = [1, 2, 3]
    y = [1, 2, 3]
    id of x: 140443962177600
    id of y: 140443962081600


After the element change in x: 
    x = [1, 5, 3]
    y = [1, 2, 3]
    id of x: 140443962177600
    id of y: 140443962081600



In [135]:
# Any way to prevent this from happening??
import copy

x = [1,2,3]
y = copy.copy(x)
print(f"""
Before the element change in x: 
    x = {x}
    y = {y}
    id of x: {id(x)}
    id of y: {id(y)}
""")
x[1] = 5
print(f"""
After the element change in x: 
    x = {x}
    y = {y}
    id of x: {id(x)}
    id of y: {id(y)}
""")



Before the element change in x: 
    x = [1, 2, 3]
    y = [1, 2, 3]
    id of x: 140443962083648
    id of y: 140444230595968


After the element change in x: 
    x = [1, 5, 3]
    y = [1, 2, 3]
    id of x: 140443962083648
    id of y: 140444230595968
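
Both `x.copy()` and `copy.copy(x)` make a shallow copy: the outer list is new, but any lists nested inside it are still shared. The sketch below, using a made-up nested list, contrasts this with `copy.deepcopy()` from the same standard-library module.

```python
import copy

grid = [[1, 2], [3, 4]]          # a list that contains other lists
shallow = copy.copy(grid)        # new outer list, inner lists are shared
deep = copy.deepcopy(grid)       # inner lists are copied as well

grid[0][0] = 99                  # modify an inner list in place

print(f'grid    = {grid}')
print(f'shallow = {shallow}')    # the change is visible through the shallow copy
print(f'deep    = {deep}')       # the deep copy is unaffected
```

For a flat list of numbers or strings, as in the cells above, a shallow copy is sufficient.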



## Tuple
* Tuples are immutable sequences, typically used to store heterogeneous items.
* A tuple is normally written as comma-separated values (items) surrounded by parentheses

*Summary*:
* Lists, strings, and tuples are all `sequence` types

*Major differences from `list`:*
* `()` instead of `[]`
* `immutable` vs. `mutable`

In [136]:
## Create a tuple
x = (1,2)
x

(1, 2)

In [138]:
y = (1,2)
print(f'id of x = {id(x)}')
print(f'id of y = {id(y)}')

id of x = 140443962097408
id of y = 140444226980672


In [139]:
# is it really immutable??
x[1] = 3

TypeError: 'tuple' object does not support item assignment

In [140]:
# is it really immutable?
x.sort()

AttributeError: 'tuple' object has no attribute 'sort'

In [141]:
# how about the other way to sort??
y = sorted(x)

In [142]:
print(f'id of x = {id(x)}')
print(f'id of y = {id(y)}')

id of x = 140443962097408
id of y = 140443961995072


In [143]:
y

[1, 2]

In [144]:
type(y)

list

In [145]:
## Tuples can be constructed with or without parentheses
x = 1,2
x

(1, 2)

In [146]:
## Indexing and slicing
x = (1,2,3)
x[1]

2

In [147]:
## Value in tuple, value not in tuple
x = (1,2,3)
1 in x

True

In [148]:
## Unpacking tuples
x,y = (1,2)
print(x,y)

1 2


In [149]:
z = (1,2)
x = z[0]
y = z[1]
print(x, y)

1 2


In [150]:
x, y = [3,4]
print(x)
print(y)

3
4
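
Because the comma, not the parentheses, is what makes a tuple, a one-element tuple needs a trailing comma. The sketch below shows this, along with swapping two variables through unpacking; the names are illustrative.

```python
not_a_tuple = (1)        # just the integer 1 in parentheses
single = (1,)            # the trailing comma makes it a tuple
also_single = 1,         # parentheses are optional here too
empty = ()               # the empty tuple is written with bare parentheses

# Packing and unpacking in one statement swaps two variables.
a, b = 1, 2
a, b = b, a

print(type(not_a_tuple), type(single), also_single, len(empty))
print(a, b)
```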


## Set
* A set is an unordered collection with no duplicate elements, like the mathematical concept of a `set`
* Set objects support mathematical operations like union, intersection, difference, and symmetric difference
* A good tutorial on Set operations: https://www.geeksforgeeks.org/python-set-operations-union-intersection-difference-symmetric-difference/

### Create a set
Use `{...}` with at least one element, or `set()` (an empty `{}` creates a dictionary, not a set)

In [151]:
x = {1,2,3,4}
print(x)
print(type(x))

{1, 2, 3, 4}
<class 'set'>


In [152]:
x = set([1,2,3,4])
print(x)
print(type(x))

{1, 2, 3, 4}
<class 'set'>


In [153]:
x = set() # the only way to create an empty set
print(type(x))
print(f'Length of the set x: {len(x)}')

<class 'set'>
Length of the set x: 0


In [154]:
x.add('a')
print(f'Length of the set x: {len(x)}')

Length of the set x: 1


### Unordered

In [155]:
x[0]

TypeError: 'set' object is not subscriptable

### De-dup

In [156]:
set([1,2,3,3,4,4,5])

{1, 2, 3, 4, 5}
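
Since a set is unordered, de-duplicating through `set()` does not keep the original order of the items. The sketch below shows two common follow-ups on a made-up list of names: sorting the unique values, or keeping first-seen order with `dict.fromkeys()`, which relies on dictionaries preserving insertion order (Python 3.7+).

```python
names = ['bob', 'alice', 'bob', 'carol', 'alice']   # hypothetical data

# Unique values in sorted order.
unique_sorted = sorted(set(names))

# Unique values in the order they first appeared.
unique_in_order = list(dict.fromkeys(names))

print(unique_sorted)
print(unique_in_order)
```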

### Check if a value exists
Use the `in` operator

In [157]:
1 in x

False

### Set operations
![](../pics/set_operations.png)

| Operation            | Python Code                            |
|----------------------|----------------------------------------|
| union                | `A \| B` or `A.union(B)`               |
| intersection         | `A & B` or `A.intersection(B)`         |
| difference           | `A - B` or `A.difference(B)`           |
| symmetric difference | `A ^ B` or `A.symmetric_difference(B)` |

In [158]:
x = {1,2,3}
y = {2,3,4,5}
print(x | y)
print(x.union(y))

{1, 2, 3, 4, 5}
{1, 2, 3, 4, 5}


In [159]:
x = {1,2,3}
y = {2,3,4,5}
print(x & y)
print(x.intersection(y))

{2, 3}
{2, 3}


In [160]:
x = {1,2,3}
y = {2,3,4,5}
print(x - y)
print(x.difference(y))

{1}
{1}


In [161]:
x = {1,2,3}
y = {2,3,4,5}
print(x ^ y)
print(x.symmetric_difference(y))

{1, 4, 5}
{1, 4, 5}


## Dictionary
* A dictionary is the most commonly used data structure for storing key-value pairs
* Keys are unique within one dictionary, and lookup by key takes [constant time](https://en.wikipedia.org/wiki/Time_complexity) on average
* The general format of a dictionary: `{key1:value1, key2:value2}`
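
Fast lookup by key works because keys are hashed, so a key must be an immutable (hashable) value such as a number, string, or tuple. The following sketch uses made-up data to show a tuple key, a rejected list key, and what happens when the same key appears twice.

```python
# Tuples can be keys (hypothetical grade data).
grades = {('John', 'math'): 90, ('John', 'art'): 85}
john_math = grades[('John', 'math')]

# Lists are mutable and cannot be keys.
try:
    bad = {['John', 'math']: 90}
except TypeError as err:
    key_error_message = str(err)   # the message mentions an unhashable type

# Keys are unique: a repeated key keeps only the last value.
repeated = {'a': 1, 'a': 2}

print(john_math, key_error_message, repeated)
```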

### Create an empty dictionary

In [162]:
x = {}
print(type(x))
print(f'Length of the dictionary x: {len(x)}')

<class 'dict'>
Length of the dictionary x: 0


In [163]:
x = dict()
print(type(x))
print(f'Length of the dictionary x: {len(x)}')

<class 'dict'>
Length of the dictionary x: 0


### Create a non-empty dictionary

In [164]:
x = {'a': 1, 'b': 2}
print(type(x))
print(f'Length of the dictionary x: {len(x)}')

<class 'dict'>
Length of the dictionary x: 2


### Key-value lookup

In [165]:
x['a']

1

In [166]:
x['c']

KeyError: 'c'

In [167]:
x.get('a')

1

In [168]:
x.get('c', -1)

-1

### If a key exists
Use the `in` operator

In [169]:
'a' in x

True

In [170]:
'c' in x

False

### Update a dictionary
- Change the value of a given key
- Introduce new key-value pairs

In [171]:
# update the value associated with a key
x = {'a':1, 'b':2}
x['a'] = 4
x

{'a': 4, 'b': 2}

In [172]:
# update the dictionary
x = {'a':1, 'b':2}
x.update({'c':3, 'd':4})
x

{'a': 1, 'b': 2, 'c': 3, 'd': 4}

In [173]:
x = {'a':1, 'b':2}
x.update({'b':3, 'd':4})
x

{'a': 1, 'b': 3, 'd': 4}
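
Updating a dictionary also includes removing keys, which the cells above do not cover. A minimal sketch using the standard `pop()` method and `del` statement, followed by the `keys()`, `values()`, and `items()` views; the dictionary contents are illustrative.

```python
x = {'a': 1, 'b': 2, 'c': 3}

removed = x.pop('c')            # removes the key and returns its value
del x['b']                      # removes the key without returning anything
missing = x.pop('zzz', None)    # a default avoids KeyError for absent keys

x['d'] = 4                      # assigning to a new key adds a pair

keys = list(x.keys())
values = list(x.values())
pairs = list(x.items())         # each pair is a (key, value) tuple

print(removed, missing, x)
print(keys, values, pairs)
```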

## Exercise

In [None]:
## input student name, check if the student is registered in the class



In [None]:
## input student name, output the student's grade of the class
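
One possible approach to both exercises, offered as a sketch rather than a reference solution. The `grades` roster and its contents are hypothetical, and the name is hard-coded so the block runs without typing; in the notebook it could come from `input('Type in a name:\n')`.

```python
# Hypothetical roster: student name (lower case) -> grade.
grades = {'john': 'A', 'adam': 'B+', 'xiangshi': 'A-'}

name = ' Adam '                      # stand-in for input('Type in a name:\n')
key = name.strip().lower()           # ignore surrounding spaces and letter case

# Exercise 1: is the student registered?
is_registered = key in grades

# Exercise 2: look up the grade, with a fallback for unknown names.
grade = grades.get(key, 'not registered')

print(f'Registered: {is_registered}')
print(f'Grade: {grade}')
```

Storing the names in lower case and normalizing the input the same way mirrors the case-insensitive comparison from the earlier exercise.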


# Control Structure

## Conditional Statements

## Exercise