(**You can also open this notebook in Google Colab**)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/xiangshiyin/data-programming-with-python/blob/main/2023-fall/2023-08-29/notebook/code_demo.ipynb)

# General principle of program execution

## Example 1 - Simple addition

In [None]:
x = 1
y = 2
x + y

## Example 2 - Order of Operations

Compute `2 * 3 + 1`

In [None]:
x = 2
x = x * 3
x = x + 1
print(x)

If we swap the order of the statements, the result changes

In [None]:
x = 2
x = x + 1
x = x * 3
print(x)
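
The two cells above can also be written as single expressions. This is a minimal sketch (the names `first` and `second` are chosen only for illustration) of how operator precedence and parentheses reproduce the two orders:

```python
# Multiplication binds tighter than addition, so this matches the first cell
first = 2 * 3 + 1

# Parentheses force the addition to happen first, matching the swapped cell
second = (2 + 1) * 3

print(first, second)
```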

# Primitive Data Types

## Numbers
* To define a variable of numeric data type, follow the syntax `VarName = value`
* You could use the function `int()` to cast the value type to integer number
* You could use the function `float()` to cast the value type to floating point number
* You could use `type()` to check the datatype of an object

In [None]:
x = 123
x + 1

In [None]:
x = 123.1
x + 1

In [None]:
int(123.6)

In [None]:
float(123)

In [None]:
type(float(123))

**In fact, you can use the `type()` function to check the data type of any object in Python**

**Common arithmetic operators**
| Operation | Result                           |
| --------- | -------------------------------- |
| x + y     | sum of x and y                   |
| x - y     | difference of x and y            |
| x * y     | product of x and y               |
| x / y     | quotient of x and y              |
| x // y    | floored quotient of x and y      |
| x % y     | remainder of x / y               |
| -x        | x negated                        |
| +x        | x unchanged                      |
| abs(x)    | absolute value or magnitude of x |
| pow(x, y) | x to the power y                 |
| x ** y    | x to the power y                 |
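
The behavior of `//` and `%` is easiest to see with a negative operand. The sketch below (variable names are illustrative) shows that floor division rounds toward negative infinity and that the remainder takes the sign of the divisor:

```python
# Floor division rounds toward negative infinity, not toward zero
q = -7 // 3      # floor(-2.33...) is -3
r = -7 % 3       # the remainder takes the sign of the divisor

# The identity x == (x // y) * y + (x % y) still holds
rebuilt = q * 3 + r

print(q, r, rebuilt)
print(abs(-7), pow(2, 3), 2 ** 3)
```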

In [None]:
7 / 3

In [None]:
7 // 3

In [None]:
7 % 3

In [None]:
(7 // 3) * 3 + (7 % 3)

In [None]:
2 ** 3

In [None]:
2 ** 0.5

## Strings

### Characters and strings
* Character: a single letter, number, or symbol
  * Example: `'a', '1', '\n'`
* String: a sequence of characters
  * Example: `"I like programming"`
* Can be expressed in a variety of ways:
  * Single quotes: `'single quotes'`
  * Double quotes: `"double quotes"`
  * Triple quoted: `'''Three single quotes'''` or `"""Three double quotes"""`
* You can use `str()` to cast the value type to string
* Special characters
  * `\n` - new line 
  * `\t` - tab (often displayed as up to 8 spaces)

In [None]:
x = 'a'
x

In [None]:
y = "abc"
type(y)

In [None]:
z = """
the first line
the second line
the third line
"""
z

In [None]:
print(z)

In [None]:
xx = 123
type(xx)

In [None]:
str(xx)

In [None]:
type(str(xx))

In [None]:
## special characters
yy = 'a\nb\nc'
print(yy)

In [None]:
## special characters
zz = 'a\tb\tc'
print(zz)

### String properties
With `string` data, you can
* Check the length of a string with `len()`
* Concatenate and repeat strings with the `+` and `*` operators
* Change the case with `str.lower()` and `str.upper()`
* Replace part of the string with `str.replace()`
* Check if a substring is part of a given string with the `in` operator
* String indexing
  * Each character of the string is assigned an index number representing its position in the string, and index numbers start from 0
  * General indexing format - `StringValue[<lower_index>:<upper_index>]`
    * `<lower_index>` is inclusive
    * `<upper_index>` is exclusive
  * Negative indexing: index `-1` refers to the last character, `-2` to the second to last, and so on
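
As a quick illustration of the indexing rules listed above, here is a small sketch (the variable names are illustrative only):

```python
letters = 'abcdefghij'

middle = letters[2:5]     # indexes 2, 3, 4 (the upper index 5 is excluded)
head = letters[:3]        # omitting the lower index starts from 0
tail = letters[7:]        # omitting the upper index runs to the end
last_three = letters[-3:] # negative indexes count from the end
last = letters[-1]        # the last character
joined = 'ab' + 'cd'      # + concatenates two strings
repeated = 'ab' * 3       # * repeats a string

print(middle, head, tail, last_three, last, joined, repeated)
```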

In [None]:
x = 'adj;gja[gdjg;ajg;g]'
len(x)

In [None]:
y = 'x\ty'
len(y)

In [None]:
z = 'a'
z * 5

In [None]:
x.upper()

In [None]:
x.upper().lower()

In [None]:
x = 'abababababab'
y = x.replace('a', '0')
print(y)

In [None]:
'b' in x

In [None]:
xx = 'abcdefghijklmnopqrstuvwxyz'
len(xx)

In [None]:
print(xx[0])
print(xx[1])
print(xx[25])

In [None]:
xx[26]  # raises IndexError: the valid indexes are 0 through 25

In [None]:
xx[1:5]

In [None]:
# negative indexing
xx[-1]

### String formatting [[Official Documentation](https://docs.python.org/3.8/library/string.html#formatstrings)]

#### "Old C style" string formatting
The `%` operator formats a set of values enclosed in a `tuple` (a fixed-size sequence, covered later in this class) according to a format string, which contains normal text together with `argument specifiers`.

Common `argument specifiers` include:
* `%s` - String (or any object with a string representation, like numbers)
* `%d` - Integers
* `%f` - Floating point numbers (by default, it keeps 6 decimal digits)
* `%.<number of digits>f` - Floating point numbers with a fixed amount of digits to the right of the dot.
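
The specifiers above can be compared side by side. This is a minimal sketch with an illustrative value `pi_approx`:

```python
pi_approx = 3.14159265

default_float = '%f' % pi_approx    # six digits after the decimal point
two_digits = '%.2f' % pi_approx     # two digits after the decimal point
as_string = '%s' % pi_approx        # uses the value's string representation
as_int = '%d' % 42                  # integer placeholder

print(default_float, two_digits, as_string, as_int)
```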


In [None]:
## format string with 1 placeholder
name = 'John'
'My name is %s' % name

In [None]:
age = 21
'His age is %d' % age

In [None]:
## format string with 2 placeholders
name = 'Xiangshi'
balance = 123.4
'Hello %s. Your current bank account balance is $%.2f' % (name, balance)

#### String formatting via the `format()` method
* In Python 3, you can also format strings by calling the `.format()` method on a string object. `{}` is used as a replacement field for values you'd like to plug in, and also as a container for `format specifications`.
* A general convention is that <ins>an empty format specification produces the same result as if you had called the function `str()` on the value</ins>. A non-empty format specification typically modifies the result.
* The common pattern of a replacement field is like `{field_name:format_spec}`
* Check the [official documentation](https://docs.python.org/3.8/library/string.html#formatstrings) for more details on `format specifications` pattern
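
To make the `{field_name:format_spec}` pattern concrete, here is a small sketch (the names `value` and `amount` are illustrative). It also checks the convention that an empty format specification matches `str()`:

```python
value = 3.5

# An empty format specification behaves like str()
plain = '{}'.format(value)
same_as_str = plain == str(value)

# field_name goes before the colon, format_spec after it
by_position = '{0:.2f}'.format(value)
by_name = '{amount:>8.1f}'.format(amount=value)

print(plain, same_as_str)
print(by_position)
print(repr(by_name))  # repr() makes the padding spaces visible
```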

* Example 1: no format modification
```
"{} {}".format(a,b), "{0} {1}".format(a,b), or "{A} {B}".format(A=a,B=b)
```

In [None]:
a = 1
b = 2
c = a + b
# '{} plus {} is {}'.format(b,a,c) 
'{B} plus {A} is {C}'.format(B=b,A=a,C=c) 


* Example 2: floating point number

In [None]:
a = 1
b = 2
c = a + b
'{A:f} plus {B:f} is {C:f}'.format(A=a,B=b,C=c) 
# '{A:d} plus {B:d} is {C:d}'.format(A=a,B=b,C=c) 
# you can also use index as the field_name
# '{0:f} plus {1:f} is {2:f}'.format(a,b,c) 
# '{:f} plus {:f} is {:f}'.format(a,b,c) 

In [None]:
## control the precision
'{A:.2f} plus {B:.2f} is {C:.2f}'.format(A=a,B=b,C=c) 

In [None]:
## align the number
print('{:>6.0f}'.format(1))
print('{:>6.1f}'.format(1))
print('{:>6.2f}'.format(1))
print('{:>6.3f}'.format(1))
print('{:>6.4f}'.format(1))

In [None]:
print('{:<6.0f}'.format(1))
print('{:<6.1f}'.format(1))

In [None]:
print('{:^6.0f}'.format(1))
print('{:^6.1f}'.format(1))

### `f-string` [[Real Python tutorial](https://realpython.com/python-f-strings/)]

* Starting from Python 3.6, `f-string` formatting became available
* It generally follows the same coding style as the `format()` method and is even more concise
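
The same format specifications used with `format()` also work inside an f-string, after a colon. A minimal sketch with illustrative variable names:

```python
name = 'Test'
price = 123.456

aligned = f'{name:>10}'       # right-align in a field 10 characters wide
rounded = f'{price:.2f}'      # keep two digits after the decimal point
expression = f'{2 * 3 + 1}'   # any expression can go inside the braces

print(repr(aligned), rounded, expression)
```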


In [None]:
a = 3
b = 2
c = a + b
f'{b} plus {a} is {c}'

### Challenge
Align a string to the right within a field of predefined width

In [None]:
"{:>10}".format("Test")

In [None]:
"{:^10}".format("Test")

In [None]:
"{:<10}".format("Test")

In [None]:
"{:>10}".format("This is our first class, I enjoy meeting everyone here")

In [None]:
# A different way: str.ljust() left-aligns and pads with a fill character (str.rjust() right-aligns)
"Test".ljust(10, '*')

## Boolean
The built-in Boolean data type takes one of two values, `True` and `False`. In Python, `bool` is a subclass of `int`, so `True` and `False` behave like the integers 1 and 0 in most contexts. Booleans are useful in conditional and comparison expressions.
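
The relationship between Booleans and the integers 1 and 0 can be checked directly. A short sketch (variable names are illustrative):

```python
equal_to_one = True == 1               # True compares equal to 1
equal_to_zero = False == 0             # False compares equal to 0
is_int = isinstance(True, int)         # bool is a subclass of int
total = True + True                    # so arithmetic works on Booleans
count = sum([True, False, True])       # a common way to count True values

print(equal_to_one, equal_to_zero, is_int, total, count)
```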

In [None]:
x = 1
x == 1
# print(x==1)
# print(not x==2)
print(x!=2)

In [None]:
(100>10) and (100<200)

In [None]:
(100>10) or (100>200)

In [None]:
((100>10) & (100<200))==True
# ((100>10) & (100<200))==False

In [None]:
((100>10) & (100<200))==1

**Logical operators**

| Operator                                                              | Description                                                                        |
|-----------------------------------------------------------------------|------------------------------------------------------------------------------------|
| or                                                                    | Boolean OR                                                                         |
| and                                                                   | Boolean AND                                                                        |
| not x                                                                 | Boolean NOT                                                                        |
| in, not in, is, is not, <, <=, >, >=, !=, ==                          | Comparisons, including membership tests and identity tests                         |
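
Among these operators, `or` has the lowest precedence, followed by `and`, then `not`, with comparisons binding tightest. The sketch below (variable names are illustrative) shows how that affects unparenthesized expressions:

```python
# 'and' binds tighter than 'or': read as True or (False and False)
a = True or False and False

# Parentheses change the grouping and therefore the result
b = (True or False) and False

# 'not' binds tighter than 'or': read as (not True) or True
c = not True or True

# Comparisons are evaluated before 'not': read as not (1 == 2)
d = not 1 == 2

# Comparisons can be chained: read as (10 < 100) and (100 < 200)
e = 10 < 100 < 200

print(a, b, c, d, e)
```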

## Revisit the example from last class

In [None]:
x = 1
y = 1
print(id(x))
print(id(y))

In [None]:
x = 'abc'
y = 'abc'
print(id(x))
print(id(y))

In Python, data types fall into two groups: `immutable` and `mutable`. An object of an immutable type cannot be changed once it is created, while an object of a mutable type can be changed in place.
* `Immutable` data types include:
  * Numbers
  * Strings
  * Tuples
* `Mutable` data types include:
  * Lists
  * Sets
  * Dictionaries

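For some `immutable` values, such as small integers and short strings, the CPython interpreter saves memory by reusing a single object, which is why the two `id()` calls above print the same number. This is an implementation detail rather than a language guarantee, so compare values with `==` rather than with `id()`.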
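
The practical difference between the two groups shows up when two names refer to the same object. A minimal sketch (names such as `alias` are illustrative):

```python
# Immutable: "changing" a string actually builds a new object
s = 'abc'
t = s              # both names refer to the same string object
s = s + 'd'        # s now refers to a new string; t is untouched
print(s, t, s is t)

# Mutable: a list can be changed in place, and every name sees the change
nums = [1, 2, 3]
alias = nums       # both names refer to the same list object
nums.append(4)     # modifies that one list in place
print(nums, alias, nums is alias)
```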

In [None]:
x = 1
y = x
print(id(x))
print(id(y))

In [None]:
x = 2
print(y)

In [None]:
print(id(x))
print(id(y))

# Non-primitive Data Types

## List
[[official documentation](https://docs.python.org/3/tutorial/datastructures.html)]
* A list is a mutable sequence, typically used to store a collection of separate values. It is written as comma-separated values (items) inside square brackets.
* It is normally used to store homogeneous items. However, items in a list don't necessarily need to be of the same data type.
* <span style="color:blue">Each item of a list is assigned an index number representing its position, and the index starts from 0</span> (Does this sound familiar?)

**That's because both `string` and `list` are `Sequence` types!!**

In [None]:
# Create an empty list
x = []

In [None]:
x = list()
type(x)

In [None]:
# Create a list of multiple elements
x = [1,2,3,4,5]

In [None]:
# Create a list of mixed data types
y = ['a',1,'b',2,'c']

### Common properties between `string` and `list`

#### Measure size with `len()`

In [None]:
x = [1,2,3]
len(x)

#### Indexing

In [None]:
x = [1,2,3,4,5]
x[1:3]

In [None]:
x[1:]

In [None]:
x[-1]

#### Expand

In [None]:
x = [1,2,3]
x = x + [4]
x

In [None]:
x = [1,2,3]
x * 5

#### Check if an element exists

In [None]:
4 in [1,2,3]

### "NEW" property for `list` (because it is mutable ...)

#### Expand ... in place

In [None]:
x = [1,2,3]
id(x)

In [None]:
x.append(4)
print(x)
print(id(x))

In [None]:
x.extend([4,5,6])
print(x)
print(id(x))

In [None]:
# insert to a specific position
x.insert(1,9)
print(x)
print(id(x))

#### Element/value changes

In [None]:
x = [1,2,3]
print(f'Before change, x = {x}')
x[1] = 5
print(f'After change, x = {x}')

In [None]:
x = [1,2,3,4]
print(f'Before pop, x = {x}')
x.pop()
print(f'After pop, x = {x}')

In [None]:
x = [1,2,3,4]
print(f'Before pop, x = {x}')
x.pop(1)
print(f'After pop, x = {x}')

#### Sort (because there is a sequence)

In [None]:
## list.sort()
x = [4,5,2,1]
print(f'Before sort, x = {x}, id = {id(x)}')
x.sort() # x.sort() does in-place sorting
print(f'After sort, x = {x}, id = {id(x)}')

In [None]:
x = [4,5,2,1]
print(f'Before sort, x = {x}, id = {id(x)}')
x.sort(reverse=True) # reverse=True sorts in descending order, still in place
print(f'After sort, x = {x}, id = {id(x)}')

In [None]:
## sorted()
x = [4,5,2,1]
print(f'Before sort, x = {x}, id = {id(x)}')
print(f'Sort result: {sorted(x)}') # sorted(x) returns a new list holding the sorted values
print(f'After sort, x = {x}, id = {id(x)}')

#### The `del` statement

In [None]:
del x  # removes the name x
x      # raises NameError because x no longer exists

### Revisit the "address" problem

In [None]:
x = [1,2,3]
y = x
print(f"""
Before the element change in x: 
    x = {x}
    y = {y}
    id of x: {id(x)}
    id of y: {id(y)}
""")
x[1] = 5
print(f"""
After the element change in x: 
    x = {x}
    y = {y}
    id of x: {id(x)}
    id of y: {id(y)}
""")


**The value of variable `y` changes along with variable `x` since they point to the same memory address!!**

In [None]:
# Any way to prevent this from happening??
x = [1,2,3]
y = x.copy()
print(f"""
Before the element change in x: 
    x = {x}
    y = {y}
    id of x: {id(x)}
    id of y: {id(y)}
""")
x[1] = 5
print(f"""
After the element change in x: 
    x = {x}
    y = {y}
    id of x: {id(x)}
    id of y: {id(y)}
""")


In [None]:
# Any way to prevent this from happening??
import copy

x = [1,2,3]
y = copy.copy(x)
print(f"""
Before the element change in x: 
    x = {x}
    y = {y}
    id of x: {id(x)}
    id of y: {id(y)}
""")
x[1] = 5
print(f"""
After the element change in x: 
    x = {x}
    y = {y}
    id of x: {id(x)}
    id of y: {id(y)}
""")

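Both `x.copy()` and `copy.copy(x)` make a shallow copy: the outer list is new, but nested objects are still shared. The sketch below (with illustrative names) shows the difference using `copy.deepcopy()` from the same standard-library module:

```python
import copy

nested = [[1, 2], [3, 4]]
shallow = copy.copy(nested)      # new outer list, same inner lists
deep = copy.deepcopy(nested)     # new outer list and new inner lists

nested[0][0] = 99                # change an item of an inner list

print(shallow)   # the shared inner list shows the change
print(deep)      # the deep copy is unaffected
```
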

## Tuple
* Tuples are immutable sequences, typically used to store heterogeneous items.
* A tuple is normally written as comma-separated values (items) surrounded by parentheses

*Summary*:
* Lists, strings, and tuples are all `Sequence` types

*Major differences from `list`:*
* `()` instead of `[]`
* `immutable` vs. `mutable`
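
What immutability means in practice can be checked without stopping on an error. This is a small sketch (variable names are illustrative) that catches the exception instead:

```python
point = (1, 2)

try:
    point[1] = 3                     # item assignment is not supported
except TypeError as err:
    assign_error = type(err).__name__

has_sort = hasattr(point, 'sort')    # tuples have no in-place sort() method
as_sorted = sorted((3, 1, 2))        # sorted() works, but returns a new list

print(assign_error, has_sort, as_sorted, type(as_sorted))
```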

In [None]:
## Create a tuple
x = (1,2)
x

In [None]:
y = (1,2)
print(f'id of x = {id(x)}')
print(f'id of y = {id(y)}')

In [None]:
# is it really immutable??
x[1] = 3

In [None]:
# is it really immutable?
x.sort()

In [None]:
# how about the other way to sort??
y = sorted(x)

In [None]:
print(f'id of x = {id(x)}')
print(f'id of y = {id(y)}')

In [None]:
## Tuples can be constructed with or without parentheses
x = 1,2
x

In [None]:
## Indexing and slicing
x = (1,2,3)
x[1]

In [None]:
## Value in tuple, value not in tuple
x = (1,2,3)
1 in x

In [None]:
## Unpacking tuples
x,y = (1,2)
print(x,y)

In [None]:
z = (1,2)
x = z[0]
y = z[1]
print(x, y)

## Set
* A set is an unordered collection with no duplicate elements, like the mathematical concept of a `set`
* Set objects support mathematical operations like union, intersection, difference, and symmetric difference
* A good tutorial on Set operations: https://www.geeksforgeeks.org/python-set-operations-union-intersection-difference-symmetric-difference/

### Create a set
Use `{...}` with at least one element, or `set()`. Empty braces `{}` create a dictionary, not a set.
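
A quick check of what each form produces, as a minimal sketch with illustrative names:

```python
empty_braces = {}        # empty braces give a dictionary
empty_set = set()        # set() is the way to make an empty set
non_empty = {1, 2, 3}    # braces with elements give a set

print(type(empty_braces), type(empty_set), type(non_empty))
```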

In [None]:
x = {1,2,3,4}
print(x)
print(type(x))

In [None]:
x = set([1,2,3,4])
print(x)
print(type(x))

In [None]:
x = set()
print(type(x))
print(f'Length of the set x: {len(x)}')

In [None]:
x.add('a')
print(f'Length of the set x: {len(x)}')

### Unordered

In [None]:
x[0]  # raises TypeError: sets are unordered, so they do not support indexing

### De-dup

In [None]:
set([1,2,3,3,4,4,5])

### If a value exists
Use `in`

In [None]:
1 in x

### Set operations
![](../pics/set_operations.png)

| Operation            | Python Code                            |
|----------------------|----------------------------------------|
| union                | `A \| B` or `A.union(B)`               |
| intersection         | `A & B` or `A.intersection(B)`         |
| difference           | `A - B` or `A.difference(B)`           |
| symmetric difference | `A ^ B` or `A.symmetric_difference(B)` |

In [None]:
x = {1,2,3}
y = {2,3,4,5}
print(x | y)
print(x.union(y))

In [None]:
x = {1,2,3}
y = {2,3,4,5}
print(x & y)
print(x.intersection(y))

In [None]:
x = {1,2,3}
y = {2,3,4,5}
print(x - y)
print(x.difference(y))

In [None]:
x = {1,2,3}
y = {2,3,4,5}
print(x ^ y)
print(x.symmetric_difference(y))

## Dictionary
* A dictionary is the most commonly used data structure for storing key-value pairs
* Keys are unique within one dictionary, and lookup by key takes [constant time](https://en.wikipedia.org/wiki/Time_complexity) on average
* The general format of a dictionary: `{key1:value1, key2:value2}`

### Create an empty dictionary

In [None]:
x = {}
print(type(x))
print(f'Length of the dictionary x: {len(x)}')

In [None]:
x = dict()
print(type(x))
print(f'Length of the dictionary x: {len(x)}')

### Create a non-empty dictionary

In [None]:
x = {'a': 1, 'b': 2}
print(type(x))
print(f'Length of the dictionary x: {len(x)}')

### Key-value lookup

In [None]:
x['a']

In [None]:
x['c']

In [None]:
x.get('a')

In [None]:
x.get('c', -1)
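
The two lookup styles differ in how they handle a missing key. The sketch below (the dictionary `prices` and the other names are illustrative) puts them side by side:

```python
prices = {'a': 1, 'b': 2}

# Square brackets raise KeyError when the key is missing
try:
    prices['c']
    missing = False
except KeyError:
    missing = True

# get() returns None for a missing key unless a default is supplied
none_default = prices.get('c')
custom_default = prices.get('c', -1)

print(missing, none_default, custom_default)
```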

### If a key exists
Use the `in` operator

In [None]:
'a' in x

In [None]:
'c' in x

### Update a dictionary
- Change the value of a given key
- Introduce new key-value pairs

In [None]:
# update the value associated with a key
x = {'a':1, 'b':2}
x['a'] = 4
x

In [None]:
# update the dictionary
x = {'a':1, 'b':2}
x.update({'c':3, 'd':4})
x

In [None]:
x = {'a':1, 'b':2}
x.update({'b':3, 'd':4})
x