<a href="https://colab.research.google.com/github/statrliu/data-science-letures/blob/main/Introduction_to_Python_for_Data_Science_1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Full Cycle of a Data Science Project**
  1. Defining the Problem to Solve 
  2. Collecting Data
  3. Manipulating Data
  4. Building, Evaluating, and Selecting Models
  5. Delivering Results 

# **Overview of Data Manipulation**
+ Cleaning the data
  + Investigating errors/inconsistency/outliers
  + Investigating missing values 
+ Formatting and combining multiple data sets
+ Generating new features




# **Python for Data Science**


```
# Check Python version
import sys
sys.version
```

## _Object and Type (Class)_

**Object:** An object can hold data (attributes) and perform operations (methods). For example:
+ An integer object can hold integer values and support arithmetic operations such as addition and subtraction. 
+ A string object can hold text values and support operations such as concatenation and substring extraction.

**Type/Class:** A Python type is a classification of objects based on their characteristics and the operations they support. 

***In Python, almost everything is an object.***

***task***: Find the type of an object
```
# Using the built-in type() function
type(10)
type(1.5)
```

***task***: List the attributes (both data and methods) of a type or an object.
```
dir(10)
dir(int)
```

***task***: Access an object's data/method using `.` notation 
```
(10).real
(10).__str__() # same as str(10)
```





**Variables (Object reference)**

In Python, a variable holds a reference to an object rather than the object itself.
+ Every object has an identity that is unique for as long as the object exists; the built-in `id()` function returns it. In CPython, this identity is the object's memory address.
+ An object reference is essentially a pointer to that memory location.

If you create a variable `x` and assign it a value of 25, Python creates an integer object with a value of 25 and assigns a reference to that object to the variable `x`. You can then perform operations on that object, such as adding `x` to another integer.

| Variable | Value   | Type     | Memory Address |
|:--------:|:-------:|:--------:|:--------------|
| x        | 25      | Integer  | 0x00123        |
| y        | 3.14    | Float    | 0x00456        |
| z        | "hello" | String   | 0x009AB        |
| arr      | [1,2,3] | List     | 0x00DEF        |
| my_dict  | {"prop1": "value1", "prop2": 42} | Dict | 0x0112F        |






***task:*** Find the unique id (memory address) for an object.
```
# Using the built-in id() function
id(20)
id("data")
```
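
To make the idea of a reference concrete, here is a minimal sketch (the variable names are illustrative only) showing that assignment binds a name to an existing object instead of copying it:

```python
# Two names can refer to the same object; assignment never copies.
x = [1, 2, 3]
y = x                      # y now refers to the same list object as x
same_object = x is y       # identity check
same_id = id(x) == id(y)   # one object, therefore one id
print(same_object, same_id)  # True True

y.append(4)                # the change is visible through both names
print(x)                   # [1, 2, 3, 4]

x = [0]                    # rebinding x makes it refer to a new object
print(y)                   # [1, 2, 3, 4] (y still refers to the old list)
```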

**Assignment Statement:** 

```
small_int = 10 # snake case
largeFloat = 1234567.89 # camel case
HighVolume = 6789 # pascal case
```

**Naming rules:**

+ A variable name must start with a letter or the underscore character

+ A variable name can only contain alphanumeric characters and underscores (`A-Z`, `a-z`, `0-9`, and `_`)

+ Variable names are case-sensitive (`sd` and `SD` are different )

+ A name cannot coincide with one of Python’s reserved words (keywords).
```  
False       class       finally     is          return
None        continue    for         lambda      try
True        def         from        nonlocal    while
and         del         global      not         with
as          elif        if          or          yield
assert      else        import      pass        async
break       except      in          raise       await
```

+ Use meaningful names (`counter` is better than `c`)


## *Mutable and Immutable Objects*

In Python, objects are either immutable or mutable. 

**Immutable Object**

An object is immutable if its value cannot be changed after it is created.

+ If you want to change the value of an immutable object, you must create a new object with the desired value. 
+ Examples of immutable objects in Python include numbers (`int`, `float`), strings, and tuples.

**Mutable Object**

An object is mutable if its value can be changed after it is created.

+ Changes to mutable objects affect the original object, rather than creating a new object. 
+ Examples of mutable objects in Python include lists, dictionaries, and sets.
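
The difference can be observed with a second name that keeps pointing at the original object. The following is a small sketch with illustrative variable names; it uses `+=` on a string (immutable) and on a list (mutable):

```python
# Immutable: += on a str builds a new object and rebinds the name.
s = "data"
s_alias = s
s += " science"
print(s_alias)            # data (the original string is untouched)
str_rebound = s is not s_alias

# Mutable: += on a list changes the existing object in place.
items = [1, 2]
items_alias = items
items += [3]
print(items_alias)        # [1, 2, 3] (both names see the change)
list_in_place = items is items_alias

print(str_rebound, list_in_place)  # True True
```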





## _Fundamental Data Types_

### **Integer and Float**
Python's `int` type represents integers (positive, negative, or zero) of arbitrary size.

Python's `float` type (floating-point) represents a real number with a fractional part.

```
# integers
type(3)
type(100_000)

# real numbers
type(3.0) # A float with a decimal point
type(2.0e-4) # A float with an exponent
```




***task:*** Type conversion
```
# int to float
float(3_000)

# float to int
int(3.1415) # 3: truncation drops the fractional part
int(-3.1415) # -3: truncation is toward zero, not downward
```

A `float` object may take less memory than an `int` object.

```
import sys

print(sys.getsizeof(1_000_000_000)) # memory size in bytes
print(sys.getsizeof(1e9)) # memory size in bytes
```


Floating-point numbers are stored in binary, so most decimal fractions are only approximated
(see https://docs.python.org/3/tutorial/floatingpoint.html).

```
0.1 + 0.1 + 0.1 == 0.3
format(0.1, '.20f')
``` 

When comparing floating-point numbers, don't use `==` to test for equality. Use the `isclose` function in the `math` module instead.
```
import math
math.isclose(0.1 + 0.1 + 0.1, 0.3, rel_tol = 0.001, abs_tol = 0.001)
```

#### *Arithmetic Operators*
```
# +, - as unary operators
+5
-12.5

# +, - as binary operators
3 + 5
type(3 + 5.0) # float: Python implicitly converts the int to a float.
23 - 30

# multiplication and exponentiation
5 * 3 
type(5*3)
type(5*3.0)

5 ** 3 
type(5**3) 
5.5 ** 3.5

# True division '/' always returns a float
4/2
type(4/2)
```






**Floor Division "//":** 

Divides one number by another and returns the largest whole number that is less than or equal to the result of the true division. The result is an `int` when both operands are `int`, and a `float` otherwise.
```
# Returns an int object
print(7//3)
print(type(7//3))

print(7/-3)
print(7//-3)

# Returns a float object
print(8.5//2.1)
print(type(8.5//2.1))


```

**Modulo "%":**

Returns the remainder of dividing one number by another.

Python uses the following equation to find the remainder (`m % n`):
```
m = n * (m//n) + m % n
```

```
print(17//3)
print(17%3)
print(3 * (17//3) + (17%3) )

print(-17//3)
print(-17%3)
print(3 * (-17//3) + (-17%3) )

print(17//-3)
print(17%-3)
print(-3 * (17//-3) + (17%-3) )

print(25.5//3.6)
print(25.5%3.6)
print(3.6 * (25.5//3.6) + (25.5%3.6))
```
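
The built-in `divmod()` function returns the floor quotient and the remainder together, which makes the identity above easy to check. The sketch below uses a helper, `check_identity`, that is written only for this illustration:

```python
def check_identity(m, n):
    """Return True if m == n * (m // n) + (m % n) for integers m and n."""
    q, r = divmod(m, n)          # same as (m // n, m % n)
    return m == n * q + r

print(divmod(17, 3))    # (5, 2)
print(divmod(-17, 3))   # (-6, 1)
print(divmod(17, -3))   # (-6, -1)

# For integers, a non-zero remainder always takes the sign of the divisor n.
all_hold = all(check_identity(m, n) for m in (17, -17) for n in (3, -3))
print(all_hold)         # True
```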

**Round() function**

The `round()` function in Python is used to round a floating-point number to a specified number of decimal places or to the nearest integer. The function takes two arguments: 
+ the number to be rounded, 
+ an optional second argument that specifies the number of decimal places to round to.

```
print(round(1233.1415))
print(round(1233.1415, 2))
print(round(1233.1415, -1))
```

**Note**

The `round()` function uses a rule called "round half to even": when a value lies exactly halfway between two candidates, it rounds to the candidate whose last kept digit is even. Because most decimal fractions are not stored exactly as floats, a value that looks like a tie (such as `12.45`) may not actually be one.
```
print(round(2.5))
print(round(3.5))

print(round(12.45, 1)) # 12.45 is stored as a binary approximation, so this is not an exact tie
```
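
As a sketch of how this rule plays out, the snippet below rounds a few exact ties and then uses the standard `decimal` module to inspect the value a float really stores. The `ROUND_HALF_UP` part is an alternative that is not part of `round()` itself; it is shown only for comparison.

```python
from decimal import Decimal, ROUND_HALF_UP

# Exact ties (these values are representable in binary) go to the even neighbour.
ties = [round(0.5), round(1.5), round(2.5), round(3.5)]
print(ties)  # [0, 2, 2, 4]

# Decimal(float) shows the exact binary value behind the literal 12.45.
print(Decimal(12.45))

# Working with exact decimal strings allows "round half up" when that is required.
half_up = Decimal("12.45").quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)
print(half_up)  # 12.5
```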

#### *Augmented Assignment Operators*

In Python, augmented assignment operators are shorthand operators that combine an arithmetic or bitwise operation with an assignment operation. 

They allow you to perform an operation and assign the result to a variable in a single step.
```
x = 3
x += 2    # equivalent to x = x + 2
print(x)  # Output: 5
```
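
The same pattern works for the other arithmetic operators. A short sketch, with the running value shown in the comments:

```python
x = 10
x -= 3     # x = x - 3   -> 7
x *= 2     # x = x * 2   -> 14
x //= 4    # x = x // 4  -> 3
x **= 2    # x = x ** 2  -> 9
x %= 5     # x = x % 5   -> 4
x /= 2     # x = x / 2   -> 2.0 (true division always gives a float)
print(x)   # 2.0
```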



### **None Type**
`None` is the only value of type `NoneType`. 
+ `None` is commonly used to represent a missing value.
+ If a function does not have a `return` statement, the function will automatically return `None`.
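
A minimal sketch of both points, using a hypothetical function `log_message` that has no `return` statement:

```python
def log_message(msg):
    # No return statement, so a call evaluates to None.
    print(msg)

result = log_message("hello")
print(result is None)   # True

# None is a single object, so test for it with `is` rather than `==`.
missing = None
label = "n/a" if missing is None else str(missing)
print(label)            # n/a
```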

### **Bool**

The `bool` type has two values: `True` and `False`.
  + `True` and `False` are objects. You can use `id(True)` and `id(False)` to check their identities.
  + The type of `True` and `False` is `bool`.
    ```
    type(True)
    # Output: bool
    ```  
  + Type conversion (casting)
    + The condition of an `if` or `while` statement is implicitly converted to `bool`.

      ```
      name = ""
      if name: # empty string is evaluated as False.
         print(name)
      else:
         print("name is empty")

      # Output: name is empty       
      ```

    + Use `bool()` for explicit conversion.   
    + Truthy and falsy values. 
      + An empty string is evaluated as `False` when used in a condition, so it is falsy. All other strings are evaluated as `True`, so they are truthy.
        ```
        bool("")
        # Output: False

        bool(" ") # one space
        # Output: True
        ``` 
      + For numbers such as `int` and `float`, zero is falsy; all other values are truthy.
        ```
        bool(0) # int 
        # Output: False

        bool(0.0) # float 
        # Output: False
        ``` 
      + Empty `tuple`, `list`, `set`, and `dict` objects are falsy.
      + `None` is falsy.
      + In general, the dunder method `__bool__()` is called when a conversion to `bool` is requested. 
        ```
        bool(10)
        # Output: True

        # It is equivalent to:
        (10).__bool__()
        # Output: True
        ``` 

        + So for a custom class, we can implement `__bool__()` to tell Python how to convert an instance of our class to `bool`.

          ```
          class FalseClass:
              def __bool__(self):
                  return False
                  
          tmp_ins = FalseClass()
          bool(tmp_ins)
          # Output: False        
          ``` 
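
When a class defines no `__bool__()`, Python falls back to `__len__()`, and an object with neither method is always truthy. The sketch below illustrates this with two hypothetical classes, `Basket` and `Plain`:

```python
class Basket:
    """Hypothetical container that defines __len__ but not __bool__."""
    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)

class Plain:
    """Defines neither __bool__ nor __len__."""
    pass

print(bool(Basket([])))          # False, because len() is 0
print(bool(Basket(["apple"])))   # True, because len() is non-zero
print(bool(Plain()))             # True, the default for any object
```

This fallback is why empty built-in containers are falsy.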




#### *Comparison Operators*
In Python, comparison operators are used to compare two values and return a Boolean value (`True` or `False`) based on the comparison result. 

+ `==` (equal to): returns `True` if the values on both sides of the operator are equal.
+ `!=` (not equal to): returns `True` if the values on either side of the operator are not equal.
+ `<` (less than): returns `True` if the value on the left side of the operator is less than the value on the right side.
+ `>` (greater than): returns `True` if the value on the left side of the operator is greater than the value on the right side.
+ `<=` (less than or equal to): returns `True` if the value on the left side of the operator is less than or equal to the value on the right side.
+ `>=` (greater than or equal to): returns `True` if the value on the left side of the operator is greater than or equal to the value on the right side.

```
x = 5
y = 10

# Check if x is less than y
print(x < y)  # Output: True

# Check if x is equal to y
print(x == y)  # Output: False

# Check if x is not equal to y
print(x != y)  # Output: True

# Check if y is greater than or equal to x
print(y >= x)  # Output: True

```

##### **Equality vs Identity**

+ In Python the `==` operator checks for value equality, not object identity. 
+ If you want to check whether two objects are the same object (i.e., they have the same memory address), you can use the `is` operator. 
```
x = [1, 2, 3]
y = [1, 2, 3]
print(x == y)  # Output: True
print(x is y)  # Output: False
print(id(x))
print(id(y))
```

#### *Logical Operators*
In Python, logical operators are used to perform logical operations on Boolean values. There are three logical operators available in Python:

+ `and` - returns `True` if both operands are `True`, otherwise returns `False`.
+ `or` - returns `True` if at least one operand is `True`, otherwise returns `False`.
+ `not` - returns the opposite Boolean value of the operand. If the operand is `True`, it returns `False`, and if the operand is `False`, it returns `True`.

```
x = 5
y = 10
z = 15

# Using the "and" operator
if x < y and y < z:
    print("x is less than y, and y is less than z")
else:
    print("Either x is greater than or equal to y, or y is greater than or equal to z")

# Using the "or" operator
if x < y or x < z:
    print("x is less than either y or z")
else:
    print("x is not less than either y or z")

# Using the "not" operator
if not x == y:
    print("x is not equal to y")
else:
    print("x is equal to y")

```

##### **Short-circuit Evaluation**

In Python, short-circuit evaluation is a behavior exhibited by logical operators `and` and `or`. It is a way of optimizing the evaluation of boolean expressions by evaluating only what is necessary.

+ When using the `and` operator, if the left-hand operand evaluates to `False`, the right-hand operand is not evaluated because the entire expression is guaranteed to be `False` regardless of the value of the right-hand operand. 
+ Similarly, when using the `or` operator, if the left-hand operand evaluates to `True`, the right-hand operand is not evaluated because the entire expression is guaranteed to be `True` regardless of the value of the right-hand operand.

+ Short-circuit evaluation can be useful in situations where evaluating an expression could be time-consuming, or to avoid exceptions.

```
# Using the "and" operator
x = 5
y = None
if y is not None and x/y > 2:
    print("x is more than twice y")

x/y # without the guard: TypeError: unsupported operand type(s) for /: 'int' and 'NoneType'

# Using the "or" operator
my_list = []
if len(my_list) == 0 or my_list[0] == "Hello":
    print("The list is empty or the first element is 'Hello'")

my_list[0] == "Hello" # without the guard: IndexError: list index out of range
```
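
One detail worth knowing: `and` and `or` return one of their operands, not necessarily a `bool`. The sketch below shows this, and uses a hypothetical helper `record` to confirm that the right-hand operand is skipped when the result is already decided:

```python
# `or` returns the first truthy operand (or the last operand if none is truthy).
print(0 or "default")          # default
print("value" or "default")    # value

# `and` returns the first falsy operand (or the last operand if all are truthy).
print([] and "unused")         # []
print(1 and 2)                 # 2

calls = []

def record(value):
    """Remember that this call happened, then return the value unchanged."""
    calls.append(value)
    return value

result = record(False) and record(True)
print(calls)                   # [False], so record(True) never ran
```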



### **Collections**
A collection object refers to any data structure that can hold multiple elements.
+ Sequence type
+ Mapping type

#### *Sequence Type*
A sequence is an **ordered** collection of objects. All sequence types implement the `__getitem__()` method, which lets you extract objects with the index operator `[]`.

##### **String**
A string object is an ordered **immutable** sequence of characters. 


###### *Create A String*
```
# Use ' or " 
str_1 = 'data science'
str_2 = "statistics"
str_3 = "it's hard"
# str_4 = 'it's hard'  # SyntaxError: the apostrophe ends the string early

# To include ' inside a single-quoted string,
# escape it with a backslash (\).
str_4 = 'it\'s hard'

# Use ''' or """ to create multiline strings
mstr_1 = '''this is a 
multiline string
''' 
```
 

###### *Subset a String*
```
string = "Hello, world!" # 13 characters
# Use an index
print(string[0])  # H
print(string[7])  # w

# Use slice object
print(string[0:5])   # Hello
print(string[7:])    # world!
print(string[:5])    # Hello
print(string[-6:])   # world!

```

###### *Operators, Functions and Methods for String Objects*
***task:*** Concatenate multiple strings
```  
str_1 = "You can "
str_2 = "subset a string "
str_3 = "using indexing and slicing."

str_12 = str_1 + str_2 
str_12
# this is equivalent to:
str_12 = str_1.__add__(str_2)

str_123 = str_1 + str_2 + str_3
str_123
```



***task:*** Repeat a string multiple times
```
str_1 = "Data"
str_1 * 3
# or
str_1.__mul__(3)
```

***task:*** Find the index of a substring
```
# find() method
str_1 = "tennis"
str_1.find("is") # 4. The lowest index where the substring is found.
str_1.find("ball") # -1. The substring is not found.
str_1.find("n") # 2. Only the first match is reported.

str_1.find("n", 3, None) # 3. Search from index 3 to the end of the string.

# index() method is very similar to find() method
# But index() method will raise ValueError when the substring is not found

str_1.index("ball") # ValueError: substring not found
```

***task:*** Replace some or all occurrences of a substring by a new string.
```
# replace() method
str_1 = "cat " * 4
print(str_1)

str_1.replace("cat", "dog")
str_1.replace("cat", "dog", 3)
str_1.replace("cat", "dog", 6)
str_1.replace("lion", "dog")
```

***tasks:*** Case Conversion
```
str_1 = "data sciEnCe"
str_1.lower()
str_1.upper()
str_1.casefold() # more aggressive than lower(); useful for caseless matching of non-English text

str_1.swapcase()
str_1.title()
str_1.capitalize()
```





***tasks:*** Remove character(s) from the beginning/end of a string.
```
str_1 = " Hello world! \t \n"
str_1.strip()
str_1.rstrip()
str_1.lstrip()

str_1.strip(" He") # removes any leading/trailing ' ', 'H', or 'e' characters, not the substring " He"
```

***task:*** Format string dynamically
```
# Use f-string (Python 3.6 or later)
lang_1 = "R"
lang_2 = "Python"
version = 3.6
f"{lang_1} is easy to learn, so is {lang_2} {version}"
```
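
An f-string placeholder can also carry a format specification after a colon. The following sketch uses made-up values to show a few common specifications:

```python
price = 1234.5678
ratio = 0.256
name = "Python"

two_decimals = f"{price:.2f}"     # fixed-point with 2 decimals
with_commas = f"{price:,.1f}"     # thousands separator, 1 decimal
as_percent = f"{ratio:.1%}"       # multiply by 100 and append %
right_aligned = f"{name:>10}|"    # right-align in a field of width 10
with_repr = f"{name!r}"           # use repr() instead of str()

print(two_decimals)    # 1234.57
print(with_commas)     # 1,234.6
print(as_percent)      # 25.6%
print(right_aligned)   #     Python|
print(with_repr)       # 'Python'
```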

##### **List**
A list is a variable-length, **mutable** sequence whose elements can be of different types (heterogeneous).


###### *Create a List*
***task:*** Create a list
```
# Use []
list_1 = [1, 2, 3]
list_2 = [1, 3.5, "string"]
    
# Using list(obj) to convert an iterable object to a list.

list_1 = list("string")
list_2 = list(range(5))
list_3 = list(range(0, 5))
list_4 = list(range(0, 10, 2))
```
***task:*** Get the length of a list
```
# Use len() function
list_1 = [1, 2, 3] 
print(len(list_1))
```

###### *Subset a List*
***task:*** Access an element of a list
```
list_1 = [1, 2, 3, 4, 5]
# Using [] with a single index:

list_1[0] # 1. Returns a single object, not a list.
list_1[-2] # 4. A negative index counts from the end: it is converted to len(list_1) + index, so -2 becomes 5 + (-2) = 3.
  
list_1[10] # IndexError: list index out of range
```




We can also use `[]` with a slice object to get a subset of a list.

###### *Slice Object*
    
* In Python, a slice object is used to define a range of elements to extract from a sequence, such as a string, list, or tuple.
 
* A slice object is created with the built-in `slice()` function, which takes up to three arguments: start, stop, and step (start and step are optional).

```
my_list = [1,3,5,7,9,11,13]
my_slice = slice(2, 5)
my_list_slice = my_list[my_slice]
print(my_list_slice)

# You can also use shorthand notation to create a slice object:
my_list_slice = my_list[2:5]
print(my_list_slice) 
```



    


More Examples:

```
 
list_1 = [1, 2, 3, 4, 5]
list_1[0:4:2] # [1, 3]
# equivalent to: 
list_1[slice(0, 4, 2)]

list_1[0:1] # [1] returns a list!

list_1[2:4] # [3, 4]  
# equivalent to:
list_1[slice(2,4)]

list_1[1:] # [2, 3, 4, 5]
# equivalent to:
list_1[slice(1, None, None)]

list_1[1::2] # [2, 4]
list_1[:3] # [1, 2, 3]
list_1[:3:2] # [1, 3]

list_1[1:-2] # <=> list_1[1:3] -> [2, 3]
list_1[3:1] # <=> list_1[3:1:1] -> []

## The following are less common
list_1[-9::1] # effective slice(0, 5, 1) -> [1, 2, 3, 4, 5]
list_1[-9:-8:1] # effective slice(0, 0, 1) -> []

list_1[10::-1] # starts at index 4 and runs back through index 0 -> [5, 4, 3, 2, 1]
list_1[10:9:-1] # start and stop are both clamped to index 4 -> []
list_1[-6::-1] # start is clamped to just before index 0 -> []

list_1[3:1:-1] # [4, 3]. A negative step value means moving backward.
list_1[0:4:-1] # []
```

When a slice object is applied to a sequence, Python computes the effective start and stop values from the length of the sequence.

```
seq[i:j]
if i > len(seq) -> len(seq);  if j > len(seq) -> len(seq)
if i < 0 -> max(0, len(seq) + i);  if j < 0 -> max(0, len(seq) + j)

if i omitted or None -> 0
if j omitted or None -> len(seq)
```

```
seq[i:j:k] with k < 0
if i >= len(seq) -> len(seq) - 1;  if j >= len(seq) -> len(seq) - 1
if i < 0 -> max(-1, len(seq) + i);  if j < 0 -> max(-1, len(seq) + j)

if i omitted or None -> len(seq) - 1
if j omitted or None -> -1
```
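
These rules do not have to be applied by hand: every slice object has an `indices(length)` method that returns the effective `(start, stop, step)` for a sequence of the given length. A small sketch, with an illustrative dictionary named `effective`:

```python
seq = [1, 2, 3, 4, 5]

examples = [
    slice(1, -2),         # seq[1:-2]
    slice(-9, None),      # seq[-9:]
    slice(10, None, -1),  # seq[10::-1]
    slice(-6, None, -1),  # seq[-6::-1]
]

effective = {}
for s in examples:
    triple = s.indices(len(seq))            # clamped (start, stop, step)
    effective[(s.start, s.stop, s.step)] = triple
    # range(*triple) visits exactly the positions that the slice selects
    assert [seq[i] for i in range(*triple)] == seq[s]
    print(s, "->", triple, seq[s])
```

With a negative step, a stop of `-1` in the returned triple means "just before index 0", which is why it cannot be written back as a literal `seq[4:-1:-1]`.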


###### *Operations on one list*


***task:*** Add one element to a list
+ Append an element
  ```
  list_1 = [1, 2, 3, 4, 5]
  list_1.append(6) 
  # in-place method, return None. 
  # Append 6 to the end of the sequence 
  
  list_1 # [1, 2, 3, 4, 5, 6]
  ```
+ Insert an element  
  ```  
  list_1 = [1, 2, 3, 4, 5]
  list_1.insert(1, 'new') 
  # in-place method, return None. 
  # Insert an element at the given index.

  list_1 # [1, 'new', 2, 3, 4, 5]

  list_1.insert(10, 'new') 
  # [1, 'new', 2, 3, 4, 5, 'new']
  # If the index is out of bounds on the right-hand side, the element is appended to the end.

  list_1.insert(-1, 'aa') 
  # [1, 'new', 2, 3, 4, 5, 'aa', 'new'] 
  # negative index will be converted to a positive index as usual.

  list_1.insert(-len(list_1), 'test') # ['test', 1, 'new', 2, 3, 4, 5, 'aa', 'new']

  list_1 = [1, 2, 3, 4, 5]
  list_1.insert(-6, "test") 
  # ['test', 1, 2, 3, 4, 5] If the index is out of bounds on the left-hand side, the element is inserted at the beginning of the list.
  ```




***tasks:*** Update one or more elements of a list
```
my_list = [1,2,3,4,5,6]
my_list[1] = 20
print(my_list)

my_list[1:3] = [200] # replaces two elements with one; my_list[1:3] = 200 raises TypeError (an iterable is required)
print(my_list)

my_list[2:4] = [400, 500, 600, 700]
print(my_list)
```

***tasks*** Remove an element from a list
```
# Use del statement
my_list = [1,2,5,3,5,6,7]
del my_list[2]
print(my_list)

# Use remove method 
my_list = [1,2,5,3,5,6,7]
print(my_list.remove(5)) # None: remove() modifies the list in place
print(my_list) # [1, 2, 3, 5, 6, 7]: only the first occurrence is removed

my_list.remove(100) # ValueError: list.remove(x): x not in list

# Use pop method 
my_list = [1,2,5,3,5,6,7]
print(my_list.pop())
print(my_list)

print(my_list.pop(2))
print(my_list)

my_list.pop(20) # IndexError: raised if the list is empty or the index is out of range.
```

***task:*** Membership checking
```
list_1 = [1,2,3]
print(1 in list_1)
print(11 not in list_1)
```

***task:*** Find the index value of a given object in the list (or subset of the list).
```
## Using index() method.
## li.index(x[, i[, j]]) index of the first 
## occurrence of x in list li (at or after index i and before index j)

my_list = [4, 5, 6, 5, 7, 8, 7]
print(my_list.index(5)) # 1
my_list.index(20) # ValueError: 20 is not in list
print(my_list.index(5, 2)) # 3
# my_list.index(7, , 5) # SyntaxError: an argument cannot be skipped; use my_list.index(7, 0, 5) instead
```  

***task:*** Count frequency of elements in a list
```
## Using count method
my_list = [1, 2, 2, 2, 3, 4, 2]
print(my_list.count(2)) # 4
print(my_list.count(20)) # 0
```

***tasks:*** Replicate a list multiple times
```
## Using * operator
['hello', 'world'] * 3 # ['hello', 'world', 'hello', 'world', 'hello', 'world']
```

Note that the objects themselves are not copied, only the references to them are copied. (This is called **shallow copy**)
```
list_1 = [[1,2], [3,4]] 
list_2 = list_1 * 2
print(list_2) # [[1, 2], [3, 4], [1, 2], [3, 4]]

list_1[0].append(10)
print(list_2) # [[1, 2, 10], [3, 4], [1, 2, 10], [3, 4]]
  
list_2[0].append(15)
print(list_2) 
# [[1, 2, 10, 15], [3, 4], [1, 2, 10, 15], [3, 4]] -> the first and the third element share same memory address.
  
print(list_1) # [[1, 2, 10, 15], [3, 4]]
```
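
If independent inner lists are needed, one option is to build each inner list separately with a list comprehension, or to use `copy.deepcopy()` from the standard library. A minimal sketch with illustrative names:

```python
import copy

# Replication shares one inner list among all positions.
shared = [[0, 0]] * 3
shared[0][0] = 9
print(shared)        # [[9, 0], [9, 0], [9, 0]]

# A list comprehension creates a fresh inner list on every iteration.
independent = [[0, 0] for _ in range(3)]
independent[0][0] = 9
print(independent)   # [[9, 0], [0, 0], [0, 0]]

# deepcopy() duplicates the nested objects as well as the outer list.
original = [[1, 2], [3, 4]]
deep = copy.deepcopy(original)
original[0].append(10)
print(deep)          # [[1, 2], [3, 4]]
```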

***task*** Sort a list
```
# Use the sort() method to sort the list in place
numbers = [4, 2, 7, 1, 3]
numbers.sort()
print(numbers) 

numbers = [4, 2, 7, 1, 3]
numbers.sort(reverse = True) # sort in descending order
print(numbers)

strings = ["dog", "rabbit", "horse", "cat", "dragon"]
strings.sort(key = len) # sort by the value returned by len()
print(strings) # the sort is stable: elements with equal keys keep their original order

# Use the built-in sorted() function to return a new sorted list,
# leaving the original list unchanged
strings = ["dog", "rabbit", "horse", "cat", "dragon"]
print(sorted(strings, key = len))
print(strings)
```
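
The `key` argument accepts any function of one element, including one that returns a tuple, in which case ties on the first component are broken by the second. A short sketch using the same list of strings:

```python
strings = ["dog", "rabbit", "horse", "cat", "dragon"]

# Sort by length only: equal-length words keep their original relative order.
by_length = sorted(strings, key=len)
print(by_length)             # ['dog', 'cat', 'horse', 'rabbit', 'dragon']

# Sort by length first, then alphabetically within each length.
by_length_then_alpha = sorted(strings, key=lambda s: (len(s), s))
print(by_length_then_alpha)  # ['cat', 'dog', 'horse', 'dragon', 'rabbit']
```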

###### *Operations on two or more lists*



***task*** Combine two lists
```
# "+" operator
list_1 = [1,2,3,4]
list_2 = [5,6,7]
print(list_1 + list_2)

# extend() method
# adds multiple elements to the end of a list. 
# It takes an iterable (e.g., list, tuple, set, string, etc.) as an argument
# modifies the original list in-place.

print(list_1.extend(list_2)) # None: extend() modifies list_1 in place
print(list_1)
```

***tasks*** Combine a list of strings into one string
```
# Use string method join()
list_strings = ["abc", "def", "gh"]
",".join(list_strings)
```

###### *List Unpacking*   
  
***task:*** Multiple assignments  
```
## assign a list to several variables
my_list = [1,2,3]
a, b, c = my_list # equivalent to a,b,c = 1,2,3
print(a) #1
    
d, e = my_list # ValueError: too many values to unpack (expected 2)
    
my_list = [1, 2, [3,4]]
a,b, (c, d) = my_list
print(d) #4
```

***task:*** Swap values of two variable   
```
## swap the values of two variables using unpacking
a, b = [1, 2]
b, a = [a, b] # the right-hand side is evaluated first, then assigned to the left-hand side.
print(a, b) # 2 1
```  

List unpacking when the `*` operator is on the left-hand side of `=`
```
values = [1,2,3,4,5]
a, b, *rest = values
print(a, b) # 1 2
print(rest) # [3, 4, 5] 
    
a, *rest, b = values
print(a, b) # 1 5
print(rest) # [2, 3, 4]
```
  
List unpacking when the `*` operator is on the right-hand side of `=`  
```
list_1 = [1, 2]
list_2 = [3, 4]
list_3 = [*list_1, *list_2]
print(list_3) # [1, 2, 3, 4] 
```    

##### **Tuple**

Important characteristics:
  * Fixed length, **immutable** sequence.
  * Elements can be of different types.

###### *Create a tuple*
  
***tasks:*** Create a tuple
  + `tup = 1, 2, 3` # called tuple packing
  + `tup = (1, 2, 3)`
  + `tup = tuple([4, 0, 2])` 
    * Use the tuple constructor (`tuple()`) to convert any sequence or other iterable to a tuple.
    * `tup = tuple('str') # ('s', 't', 'r')`. A string is a sequence type.
    
  + `tup = (5, )` The trailing comma is required to create a tuple with a single element.
  + The members of a tuple can be of different types: `(1, 4.5, 'hello')`  

###### *Subset a Tuple*
***tasks:*** Access one or more elements in a tuple

Using `[]`, Python will call `__getitem__()` method of the object.

  ```
  tup = tuple('hello')
    
  # Using [] with a single index. index starts from 0 not 1. 
  tup[1] # 'e'
  tup[10] # IndexError: tuple index out of range
  tup[-1] # 'o'. A negative index is converted to len(tup) + index, so -1 becomes 5 + (-1) = 4.
    
  tup[2, 4] # TypeError: tuple indices must be integers or slices, not tuple
    
  # Using slice object
  tup[slice(0, 3)] # ('h', 'e', 'l'). slice() accepts positional arguments only.
    
  tup[slice(0, 5, 2)] # ('h', 'l', 'o')
    
  tup[0:5:2] # same as above ('h', 'l', 'o')
  ```

You cannot add, reassign, or delete an element of a tuple, because it is **immutable**.

***know***
```
tup[1] = 't' # TypeError: 'tuple' object does not support item assignment
del tup[1] # TypeError: 'tuple' object doesn't support item deletion
```
  
**But**, if one element of a tuple is mutable, you can modify the element in-place.
      
```
tup = (10, [3,4,5], 'end')
tup[1].append(6)
tup # (10, [3, 4, 5, 6], 'end')
```

###### *Operations on one or more tuples*

***task:*** Length of the tuple
```
tup = (1,2,3)
len(tup) # 3 or tup.__len__()
```

***task:*** Check whether an object is in the given tuple
```
## Membership or contains
tup = (1, 2, 3)
1 in tup # True
2 not in tup # False
```

***task:*** Find the index value of a given object in the tuple (or subset of the tuple).
```
## Using index() method.
## s.index(x[, i[, j]]) index of the first 
## occurrence of x in tuple s (at or after index i and before index j)

tup = (4, 5, 6, 5, 7, 8, 7)
tup.index(5) # 1
tup.index(20) # ValueError: tuple.index(x): x not in tuple
tup.index(5, 2) # 3
# tup.index(7, , 5) # SyntaxError: an argument cannot be skipped; use tup.index(7, 0, 5) instead
```  


***task:*** Count frequency of elements in a tuple
```
## Using count method
a = (1, 2, 2, 2, 3, 4, 2)
a.count(2) # 4
a.count(20) # 0
```
    

***tasks:*** Replicate a tuple multiple times
```
## Using * operator
('hello', 'world') * 3 # ('hello', 'world', 'hello', 'world', 'hello', 'world')


```

Note that the objects themselves are not copied, only the references to them.
```
tup = ([1,2], [3,4]) 
tup_2 = tup * 2
print(tup_2) # ([1, 2], [3, 4], [1, 2], [3, 4])

tup[0].append(10)
print(tup_2) # ([1, 2, 10], [3, 4], [1, 2, 10], [3, 4]) 
  
tup_2[0].append(15)
print(tup_2) 
# ([1, 2, 10, 15], [3, 4], [1, 2, 10, 15], [3, 4]) -> the first and the third element share same memory address.
  
print(tup) # ([1, 2, 10, 15], [3, 4])
```
  
    



***tasks:*** Concatenate two or more tuples
```
## Using + operator
(1, 2, 3) + (4, 5) # (1, 2, 3, 4, 5)
``` 

***tasks*** Combine a tuple of strings into one string
```
tup_strings = ("abc", "def", "gh")
":".join(tup_strings)
```

###### *Tuple Unpacking*   
  
***task:*** Multiple assignments  
```
## assign a tuple to several variables
tup = (1,2,3)
a, b, c = tup # equivalent to a,b,c = 1,2,3
a #1
    
d, e = tup # ValueError: too many values to unpack (expected 2)
    
tup = (1, 2, (3,4))
a,b, (c, d) = tup
d #4
```

***task:*** Swap values of two variable   
```
## swap the values of two variables using unpacking
a, b = 1, 2
b, a = a, b # the right-hand side is evaluated first, then assigned to the left-hand side.
a, b # (2, 1)
```  
 
  



Tuple unpacking when the `*` operator is on the left-hand side of `=`
```
values = 1,2,3,4,5
a, b, *rest = values
a, b # (1, 2)
rest # [3, 4, 5] it is a list not a tuple!!
    
a, *rest, b = values
a #1
b #5
rest # [2, 3, 4]
```
  
Tuple unpacking when the `*` operator is on the right-hand side of `=`  
```
tup = (1,2)
# tup_s = (*tup) # SyntaxError: can't use starred expression here
# A starred expression must appear inside a list or tuple display,
# so we can use the following instead:
tup_s = [*tup]
tup_s # [1, 2]
# or
tup_s = (*tup, *tup)
tup_s # (1, 2, 1, 2) 
```    
  
  

#### *Mapping Type (Associative Array)*
A collection of keys and their associated values. Values are looked up by key rather than by position.

##### **Dictionary**

In Python, a dictionary is a collection of key-value pairs. 
+ It is implemented as a hash table and is also known as an associative array. 
+ Dictionaries are written with curly braces `{}`; key-value pairs are separated by commas, and each key is separated from its value by a colon `:`. 
+ Each key in a dictionary is unique.
+ Adding, removing, and looking up a key take constant time on average.

```
my_dict = {"apple": 1, "banana": 2, "orange": 3}
print(my_dict)
```

A dictionary's keys must be hashable objects.

**Hashable Object**

+ An object is hashable if it has a hash value that remains the same throughout its lifetime and it can be compared to other objects for equality.
+ The hash value is an integer used to look up the object in a hash table. Objects that compare equal must have the same hash value; unequal objects may occasionally share one.
+ Hashable objects in Python include immutable data types such as numbers, strings, and tuples (provided every element of the tuple is itself hashable).
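
Hashability can be probed directly with the built-in `hash()` function, which raises `TypeError` for unhashable objects. The helper `is_hashable` below is written only for this illustration:

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

print(is_hashable("apple"))      # True
print(is_hashable((1, 2)))       # True
print(is_hashable([1, 2]))       # False, lists are mutable
print(is_hashable((1, [2, 3])))  # False, the tuple contains a list

# Equal objects share a hash value, so 1 and 1.0 count as the same key.
print(hash(1) == hash(1.0))      # True
```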

###### *Dictionary Functions/Methods*

***Create a dictionary***
```
# Use "{}"
my_dict = {"apple": 1, "banana": 2, "orange": 3}
print(my_dict)

my_dict = {}
print(my_dict)

# dict() constructor
my_dict = dict(apple = 1, banana = 2, orange = 3)
print(my_dict)

my_list = [('apple', 1), ('banana', 2), ('orange', 3), ('banana', 25)]
my_dict = dict(my_list)
print(my_dict) # for the duplicate key 'banana', the last value (25) is kept
```

***Length of a dictionary***
```
my_dict = {"apple": 1, "banana": 2, "orange": 3}
len(my_dict)
```

***Check if a given key exists in a dictionary***
```
# Use "in" operator
my_dict = {"apple": 1, "banana": 2, "orange": 3}
print("apple" in my_dict)
```


***Fetch an item in a dictionary***
```
# Use [] with key
my_dict = {"apple": 1, "banana": 2, "orange": 3}
print(my_dict["banana"])

my_dict["grape"] # KeyError: 'grape'

# Use get() method
print(my_dict.get("banana"))
print(my_dict.get("grape")) # If the key is not present, it returns the default value (if provided) or None.

print(my_dict.get("grape", -1))
```



***Modify an item or add a new item***
```
# Use assignment operator =
my_dict = {"apple": 1, "banana": 2, "orange": 3}
my_dict["apple"] = 10
print(my_dict)

my_dict["peach"] = 20
print(my_dict)
```
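
Assignment combined with `get()` gives a compact way to update a value that may not exist yet. As an illustration with a made-up list of words, the sketch below counts how often each word occurs:

```python
words = ["apple", "pear", "apple", "fig", "pear", "apple"]

counts = {}
for word in words:
    # get() supplies 0 the first time a word is seen, so no KeyError is raised.
    counts[word] = counts.get(word, 0) + 1

print(counts)  # {'apple': 3, 'pear': 2, 'fig': 1}
```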

***Delete an item***
```
# Use "del" statement
my_dict = {"apple": 1, "banana": 2, "orange": 3, "peach": 4, "pear": 5}
del my_dict["orange"]
print(my_dict)

del my_dict["lettuce"] # KeyError: 'lettuce'

# Use pop method
my_dict = {"apple": 1, "banana": 2, "orange": 3, "peach": 4, "pear": 5}
result = my_dict.pop("apple")
print(result)
print(my_dict)

result = my_dict.pop("lettuce") #KeyError: 'lettuce'
result = my_dict.pop("lettuce", -1)
print(result)
print(my_dict)

```

***Convert a dictionary to a list***
```
my_dict = {"apple": 1, "banana": 2, "orange": 3, "peach": 4, "pear": 5}

# items() method
print(list(my_dict.items()))

# keys() method
print(list(my_dict.keys()))

# values() method
print(list(my_dict.values()))
```

***Combine two dictionaries***
```
my_dict1 = {"apple": 1, "banana": 2, "orange": 3}
my_dict2 = { "banana": 25, "peach": 4, "pear": 5}

# update() method modifies my_dict1 in place and returns None
print(my_dict1.update(my_dict2))  # None
print(my_dict1)
print(my_dict2)

```
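
Because `update()` changes `my_dict1` in place, it is not suitable when both originals must stay untouched. One option, sketched below, is to build a new dictionary with dictionary unpacking. The name `merged` is only illustrative, and the sketch assumes that the later value should win for duplicate keys, as it does with `update()`.

```python
my_dict1 = {"apple": 1, "banana": 2, "orange": 3}
my_dict2 = {"banana": 25, "peach": 4, "pear": 5}

# Build a new dictionary; for duplicate keys the right-most value wins
merged = {**my_dict1, **my_dict2}
print(merged)    # {'apple': 1, 'banana': 25, 'orange': 3, 'peach': 4, 'pear': 5}
print(my_dict1)  # {'apple': 1, 'banana': 2, 'orange': 3}  (unchanged)
```

On Python 3.9 and later, `my_dict1 | my_dict2` gives the same result.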

##### **Set**

In Python, a set is an unordered collection of unique elements. 

##### *Set Functions/Methods*

***Create a set***
```
my_set = {1, 2, 3}
print(my_set)  # Output: {1, 2, 3}

# Creating a set from a list
my_list = [1, 2, 2, 3, 3, 3]
my_set = set(my_list)
print(my_set)  # Output: {1, 2, 3}

empty_set = set()
print(empty_set)
```



***Add an element to the set.***
```
# add() method
my_set = {1, 2, 3}
my_set.add(4)
print(my_set)  # Output: {1, 2, 3, 4}
```

***Add elements from an iterable (e.g. list, set, tuple) to the set.***
```
my_set = {1, 2, 3}
my_set.update([3, 4, 5])
print(my_set)  # Output: {1, 2, 3, 4, 5}
```


 
***Remove an element from the set.***
```
# remove() method
my_set = {1, 2, 3}
print(my_set.remove(2))  # None: remove() modifies the set in place
print(my_set)  # Output: {1, 3}

my_set.remove(4) # KeyError: 4

# discard() method
my_set = {1, 2, 3}
print(my_set.discard(2))  # None: discard() modifies the set in place
print(my_set)  # Output: {1, 3}
my_set.discard(4) # Does not raise an error if the element is not found.
print(my_set)  # Output: {1, 3}

# pop() method removes and returns an arbitrary element from the set.
my_set = {1, 2, 3}
elem = my_set.pop()
print(elem)  
print(my_set)  

set().pop() # KeyError: 'pop from an empty set'

# clear() method removes all elements from the set.
my_set = {1, 2, 3}
my_set.clear()
print(my_set)  # Output: set()
```


Sets support various operations such as union, intersection, difference, and symmetric difference. 

```
set1 = {1, 2, 3}
set2 = {3, 4, 5}

# Union of two sets
union = set1.union(set2)
print(union)  # Output: {1, 2, 3, 4, 5}

# Intersection of two sets
intersection = set1.intersection(set2)
print(intersection)  # Output: {3}

# Difference of two sets
difference = set1.difference(set2)
print(difference)  # Output: {1, 2}

# Symmetric difference of two sets
symmetric_difference = set1.symmetric_difference(set2)
print(symmetric_difference)  # Output: {1, 2, 4, 5}

```
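
The same four operations are also available as operators. The short sketch below reuses `set1` and `set2` from the example above and shows one difference: the methods accept any iterable, while the operators expect a set on both sides.

```python
set1 = {1, 2, 3}
set2 = {3, 4, 5}

print(set1 | set2)  # union: {1, 2, 3, 4, 5}
print(set1 & set2)  # intersection: {3}
print(set1 - set2)  # difference: {1, 2}
print(set1 ^ set2)  # symmetric difference: {1, 2, 4, 5}

# Methods accept any iterable, here a list
print(set1.union([4, 5]))  # {1, 2, 3, 4, 5}
```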

#### **Iterable and Iterator**
*Iterable*
+ In Python, an iterable is any object that can be looped over. 
+ Examples of iterables include lists, tuples, sets, dictionaries, and strings. 
+ An iterable is defined by implementing the `__iter__()` method, which returns an iterator object.

*Iterator*
+ An iterator is an object that produces the values of an iterable one at a time. 
+ In Python, an iterator implements the `__next__()` method, which returns the next value in the sequence, and the `__iter__()` method, which returns the iterator itself. 
+ When there are no more values to return, the `__next__()` method raises the `StopIteration` exception.
+ An iterator can only be iterated over once; after it is exhausted, it yields no more values.

```
numbers = [1, 2, 3]
iterator = iter(numbers)

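print(next(iterator))  # 1
print(next(iterator))  # 2
print(next(iterator))  # 3
next(iterator)         # StopIteration: the iterator is exhausted

list(iterator)         # [] because an exhausted iterator yields nothing more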
```
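
To make the two methods concrete, here is a minimal sketch of a hand-written iterator. The class `Countdown` is hypothetical and exists only for illustration; it is not part of Python.

```python
class Countdown:
    """Hypothetical iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__()
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration  # signals that there are no more values
        value = self.current
        self.current -= 1
        return value

countdown = Countdown(3)
first_pass = list(countdown)
second_pass = list(countdown)
print(first_pass)   # [3, 2, 1]
print(second_pass)  # []  (the iterator is already exhausted)
```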



## *Flow Control*



### **If-elif-else Statements**
In Python, `if`, `elif`, and `else` statements are used for conditional execution of code: they decide which block runs based on the truth value of a condition.

```
if condition:
    # code to be executed if condition is True
elif condition2:
    # code to be executed if condition2 is True and condition is False
else:
    # code to be executed if both condition and condition2 are False

```

Example:
```
x = -2

if x > 0:
    print("x is positive")
elif x < 0:
    print("x is negative")
else:
    print("x is zero")

```
**Important** 
In Python, indentation is used to define the scope of a code block. 
+ Code blocks are used in control structures such as `if`, `for`, `while`, `def`, `class`, etc. 
+ In order to indicate which lines of code belong to a code block, you must indent them with whitespace (usually 4 spaces, but sometimes a tab is used instead).


In [None]:
x = -2

if x > 0:
    print("x is positive")
elif x < 0:
    print("x is negative")
else:
    print("x is zero")

x is negative


### **For Loop**
A `for` loop is used to iterate over a collection of elements (iterable, iterator), such as a list, tuple, string, or dictionary.
```
for variable in collection:
    # code to be executed
```

```
# iterate over a list
for i in [2,3,4,5]:
    print(i)

# iterate over a dictionary
my_dict = {"apple": 1, "banana": 2, "orange": 3}
for key, value in my_dict.items():
    print(key, value)

```
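
Behind the scenes, a `for` loop relies on the iterator protocol described earlier. The sketch below is a rough equivalent written with `iter()`, `next()`, and `while`; the list `collected` is added only so the result can be inspected.

```python
numbers = [2, 3, 4, 5]

# Roughly what "for i in numbers: print(i)" does
collected = []
iterator = iter(numbers)
while True:
    try:
        i = next(iterator)
    except StopIteration:
        break  # the for loop ends silently at this point
    collected.append(i)
    print(i)
```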



#### *Break and Continue*

The `break` statement breaks out of the innermost enclosing `for` or `while` loop.
```
fruits = ["apple", "banana", "cherry", "orange", "kiwi"]
for fruit in fruits:
    if fruit == "orange":
        break
    print(fruit)

```

The `continue` statement skips the rest of the current iteration and continues with the next iteration of the loop:
```
fruits = ["apple", "banana", "cherry", "orange", "kiwi"]
for fruit in fruits:
    if fruit == "orange":
        continue
    print(fruit)

```


### **While Loop**
The `while` statement is used for repeated execution as long as an expression is true:
```
count = 1
while count <= 5:
    print(count)
    count += 1

```

### **Loop with `else` Clause**

Loop statements may have an `else` clause:
+ The `else` clause is executed when the loop terminates through exhaustion of the iterable (with `for`) or when the condition becomes `False` (with `while`). 
+ It is not executed when the loop is terminated by a `break` statement. 
```
fruits = ["apple", "banana", "cherry", "orange", "kiwi"]
for fruit in fruits:
    if fruit == "pineapple":
        print("Pineapple is in the list!")
        break
else:
    print("Pineapple is not in the list.")
```
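
The same rule applies to `while`. In the sketch below (the numbers are arbitrary), the condition eventually becomes `False` without `break` being reached, so the `else` clause runs.

```python
count = 1
found = False
while count <= 5:
    if count * count == 30:
        found = True
        break
    count += 1
else:
    # Runs because the condition became False without hitting break
    print("No integer from 1 to 5 has a square of 30.")
```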

## *Functions*
A function is a block of organized, reusable code that performs a specific task. 

Functions provide a way to break down complex programs into smaller, modular pieces, making them easier to understand, reuse, and maintain.



### **Define a Function**
+ Functions are defined using the `def` keyword, followed by the function name, parentheses containing the function's parameters (if any), and a colon `:`. 
+ The function code block is then indented, typically by four spaces or a tab. 
```
def add_numbers(a, b):
    result = a + b
    return result
```
+ The `return` statement is optional. If there is no `return` statement in the body of the function, `None` will be returned when the function is called.

To use this function, you can simply call it with two arguments, like this:
```
total = add_numbers(3, 5)  # avoid the name sum, which would shadow the built-in sum()
print(total)
```



In [None]:
def add_numbers(a, b):
  result = a + b
  return result
total = add_numbers(3, 5)
print(total)

8


Functions are first-class citizens in Python
  + A function can be assigned to a name in an assignment statement
  ```
  def fcn():
       print("I am a function")
  fcn_1 = fcn  
  ```
  + Functions can be passed to another function as arguments (higher-order function).
  + A function can be returned by another function (higher-order function).
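
A small sketch of both higher-order cases follows. The names `apply_twice` and `make_multiplier` are made up for illustration.

```python
def apply_twice(func, value):
    # A function received as an argument is called like any other function
    return func(func(value))

def make_multiplier(factor):
    # The inner function is created here and returned to the caller
    def multiply(x):
        return x * factor
    return multiply

double = make_multiplier(2)
print(double(5))               # 10
print(apply_twice(double, 5))  # 20
```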

Functions are callables: objects that can be invoked using `()` and that return a value.
```
import math
math.sqrt(4) # calling the function sqrt with argument 4 returns 2.0
```
Functions are Python objects
  + Can have attributes
  + Can have methods 
  ```
  def fcn(a):
     print(a)

  fcn.__name__  # 'fcn'
  ```  


#### *Parameter vs Argument*

* When you define a function, the names (object references) in the function definition are called **parameters** 

* When the function is called, the values or names (object references) passed to the function are called **arguments**.

```
def fcn(parameter_1, parameter_2):
    print(parameter_1, parameter_2)
    return 0

print(fcn(20, "abc")) # 20 and "abc" are positional arguments.
print(fcn(12, parameter_2 = 20)) # 20 is passed as a keyword argument.
```


##### **Function parameters**
+ Positional-only parameter
+ Positional-keyword parameter
+ Variable-length positional parameter
+ Keyword-only parameter
+ Variable-length keyword parameter
    



###### _positional-only parameter (Python 3.8 and later)_
+ When the function is called, only a positional argument can be used for this type of parameter; a keyword argument cannot be used.
      
  ```
  ## All parameters defined before / are positional-only
  def fcn(p1, /): 
      print(p1)
            
  fcn(10) # positional argument
  # 10
  fcn(p1 = 10) # keyword argument
  #TypeError: fcn() got some positional-only arguments passed as keyword arguments: 'p1'
         
  ## Positional-only parameter can have a default value. 
  ## It is also called an optional (default) positional-only parameter
         
  def fcn(p1, p2 = 10, /): # p2 is optional
      print(p1, p2)
             
  fcn(20) 
  # Output: 20 10
  fcn(20, 30)
  # Output: 20 30
  ```

###### _Positional-keyword parameter_ 
+ Can be called using either positional or keyword argument.
            
  ```
  # p1 is positional-keyword parameter, as no / in presence.
  def fcn(p1): 
      print(p1)
            
  fcn(10)
  # 10
  fcn(p1 = 10)
  # 10
            
  # p1 is positional-only, p2 is positional-keyword parameter.          
  def fcn(p1, /, p2): 
      print(p1, p2)
            
  fcn(10, 20)   
  # 10 20
  fcn(10, p2 = 20)
  # 10 20
  ```
          

###### *Variable-length positional parameter*
  
```
def fcn(*var_p): ## var_p is a variable length positional parameter
    print(var_p)
              
fcn(10, 30, 50)  
# (10, 30, 50) 
# notice that var_p is a tuple.
          
def fcn(p1, p2, /, p3, p4, *var_p):
    print(p1, p2, p3, p4, var_p)
          
fcn(10, 20, 30, 40, 50, 60, 70)
# 10 20 30 40 (50, 60, 70)          
```

###### *Keyword-only parameter*
      
+ keyword-only parameter without default value (required keyword-only parameter)
        
+ keyword-only parameter with default value (optional keyword-only parameter)
         
  ```
  ## anything after * is keyword-only parameter
  def fcn(*, kp1, kp2 =20): 
      print(kp1, kp2)
             
  fcn(10)
  # TypeError: fcn() takes 0 positional arguments but 1 was given
  fcn(kp1 = 10)
  # 10 20
  fcn(kp2 = 30, kp1 = 10)
  # 10 30
  ```

###### *Variable-length keyword parameter*

  ```
  ## kwvar_p is a variable-length keyword parameter
  def fcn(**kwvar_p): 
      print(kwvar_p)
             
  fcn(10, 20)
  # TypeError: fcn() takes 0 positional arguments but 2 were given
          
  fcn(p1 = 10, p2 = 20)
  # {'p1': 10, 'p2': 20}
  # notice that kwvar_p is a dictionary.
  ```


###### _The order of parameters in a function definition_

  1. Positional-only parameter(s)
  2. Positional-keyword parameter(s)
  3. Variable length positional parameter
  4. Keyword-only parameter(s)
  5. Variable-length keyword parameter
    
For example:
      
  ```
  def fcn(pos_only, /, pos_keyword, *var_pos, keyword_only, **var_keyword):
      pass
  ```
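
To see all five kinds working together, the sketch below defines a hypothetical function `describe` (Python 3.8 or later is assumed because of `/`) and uses the standard `inspect` module to report how Python classifies each parameter.

```python
import inspect

def describe(pos_only, /, pos_or_kw, *var_pos, kw_only, **var_kw):
    return (pos_only, pos_or_kw, var_pos, kw_only, var_kw)

result = describe(1, 2, 3, 4, kw_only=5, extra=6)
print(result)  # (1, 2, (3, 4), 5, {'extra': 6})

# Ask Python for the kind of each parameter, in definition order
kinds = [p.kind.name for p in inspect.signature(describe).parameters.values()]
print(kinds)
# ['POSITIONAL_ONLY', 'POSITIONAL_OR_KEYWORD', 'VAR_POSITIONAL', 'KEYWORD_ONLY', 'VAR_KEYWORD']
```

Note that Python's own documentation calls the second kind "positional-or-keyword".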

+ A non-default (required) positional parameter cannot follow a default (optional) positional parameter.
  
  ```
  def fcn(p1, p2 = 2,/, p3):
      pass
  # SyntaxError: non-default argument follows default argument
    
  def fcn(p2=2, p3):
      pass
  # SyntaxError: non-default argument follows default argument   
    
  def fcn(p1 , /, p2=2, p3, p4):
      pass
    
  # SyntaxError: non-default argument follows default argument   
  ```

+ But this rule does NOT apply to keyword-only parameters. 
  ```
  def fcn(p1, p2=10, /, *, kp1):
      print(p1, p2, kp1)
    
  fcn(5,20, kp1 =30)
  # 5 20 30
    
  def fcn(p1, p2=10, /, p3 = 5, *, kp1, kp2 = 20, kp3):
      print(p1, p2, p3, kp1, kp2, kp3)
        
  fcn(5, 15, 25, kp1 = 30, kp3 = 40)
  # 5 15 25 30 20 40    
  ```  

  
###### _The order of arguments in a function call_

  1. Positional argument(s)
  2. Keyword argument(s)
  
For example:
```
fcn(10, 'abc', arg1 = 30, arg2 = 50)
```

When calling a function, any argument after a keyword argument MUST be a keyword argument.
  
  ```
  def fcn(p1, p2): # p1, p2 are positional-keyword parameters
      print(p1, p2)
      
  fcn(p1=2, p2)    
  # SyntaxError: positional argument follows keyword argument
  ```
  

    





##### *Matching Arguments to Parameters*
 
When a function is called, 
+ the positional argument(s) are first matched to positional parameter(s) (positional-only, positional-keyword, variable length positional)
+ then keyword argument(s) are matched to corresponding parameters with the same names. 

Be aware of the following error:
  
  ```
  def fcn(p1, p2):
      print(p1, p2)
        
  fcn(10, 20, p1 = 30) 
  # TypeError: fcn() got multiple values for argument 'p1'
  ```
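
The matching step can be inspected without running the function body. The sketch below uses `inspect.signature(...).bind(...)` from the standard library; the function `fcn` and its parameters are only an example.

```python
import inspect

def fcn(p1, p2, *var_p, kp1=0):
    pass

# bind() matches arguments to parameters without calling the function
bound = inspect.signature(fcn).bind(10, 20, 30, 40, kp1=50)
matched = dict(bound.arguments)
print(matched)  # {'p1': 10, 'p2': 20, 'var_p': (30, 40), 'kp1': 50}

# The same conflict as above: p1 is given both by position and by keyword
try:
    inspect.signature(fcn).bind(10, 20, p1=30)
    conflict = False
except TypeError as error:
    conflict = True
    print("TypeError:", error)
```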
    


##### _Function call with `*` and `**`_

A function can also be called with 'tuple unpacking' (`*`) and 'dictionary unpacking' (`**`).
  
```
def fcn(p1, p2, p3, p4):
    print(p1, p2, p3, p4)

tup1 = (1,2)
dict1 = {"p3": 3, "p4": 4}   #here keys must match the name of the function parameters

fcn(*tup1, **dict1)
# 1 2 3 4

fcn(**dict1, *tup1)
# SyntaxError: iterable argument unpacking follows keyword argument unpacking
```  



##### _Local Namespace_  
When a function is called, Python will 
+ create an empty dictionary
+ create `key:value` pairs. Parameters are keys, and arguments are values.
+ run the code with this dict (it is called the local namespace). While the code is running, the local namespace may change as new local variables are defined.
+ when the function call is done, the local namespace is eliminated.

We can use `locals()` inside the function to see the local namespace.
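
As a minimal sketch, the function below takes two snapshots of its local namespace, one right after the call starts and one after a new local variable is defined. The names are illustrative only.

```python
def fcn(p1, p2=10):
    before = dict(locals())  # snapshot right after the call starts
    total = p1 + p2          # defines a new local variable
    after = dict(locals())   # snapshot that now also contains total
    return before, after

before, after = fcn(5)
print(before)          # {'p1': 5, 'p2': 10}
print(after["total"])  # 15
```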
  

##### *Pass by Object Reference*

In a function call, variables are passed by object reference. This means that when you pass a variable to a function, you are passing a reference to the object in memory that the variable is referencing, rather than a copy of the object itself.

For example, consider the following code:

```
def my_function(my_list):
    my_list.append(4)  # modifies the same list object that a refers to
    print("Inside function: my_list =", my_list)

a = [1, 2, 3]
my_function(a)
print("Outside function: a =", a)  # a = [1, 2, 3, 4]

# Rebinding the parameter makes my_list a local name for a new object
def my_function(my_list):
    my_list = [5,6,7]
    my_list.append(4)  # modifies the new list only
    print("Inside function: my_list =", my_list)

a = [1, 2, 3]
my_function(a)
print("Outside function: a =", a)  # a = [1, 2, 3]

```
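
The built-in `id()` function can confirm what happens in the two cases above. In this sketch, `mutate` and `rebind` are hypothetical helper names.

```python
def mutate(my_list):
    my_list.append(4)    # changes the object the caller also references
    return id(my_list)

def rebind(my_list):
    my_list = [5, 6, 7]  # the local name now points to a new object
    return id(my_list)

a = [1, 2, 3]
same_object = mutate(a) == id(a)
new_object = rebind(a) != id(a)
print(same_object, new_object, a)  # True True [1, 2, 3, 4]
```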


### *Lambda function*

In Python, a lambda function (also known as an anonymous function) is a small, single-expression function that does not have a name. 

+ Lambda functions are defined using the `lambda` keyword 
+ followed by the function's parameters (if any), a colon `:`, and the expression to be evaluated. 
+ The expression is then returned as the result of the function.
```
add = lambda a, b: a + b  # named add so the built-in sum() is not shadowed
```

The lambda function can be called like a regular function:

```
result = add(3, 5)
print(result)
```

Lambda functions are often used in combination with higher-order functions like `map`, `filter`, and `reduce` to create concise and expressive code. 

```
numbers = [1, 2, 3, 4, 5]
squares = map(lambda x: x**2, numbers)
print(list(squares))

squares_2 = map(lambda x,y: x**2+y, numbers, numbers)
print(list(squares_2))
```






```
numbers = [1, 2, 3, 4, 5, 6]
filtered_numbers = filter(lambda x: x % 2 != 0, numbers)
print(list(filtered_numbers))
```


**Reduce function**
In Python, `reduce` is a higher-order function provided by the standard-library `functools` module (it is not a built-in, so it must be imported). 
+ It takes in a function and an iterable as arguments. 
+ It returns a single value that is the result of applying the function to the elements of the iterable in a cumulative way. 
+ The function passed into `reduce` must take in two arguments and return a single value.

```
from functools import reduce

# Product of a list of numbers
numbers = [1, 2, 3, 4, 5]
product = reduce(lambda x, y: x * y, numbers)
print(product)

# Maximum value in a list of numbers
numbers = [1, 5, 2, 7, 3, 8, 4]
def find_max(x, y):
    if x > y:
        return x
    else:
        return y

max_number = reduce(find_max, numbers)
print(max_number)
```
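
To show what "cumulative" means, the sketch below writes the product example as an explicit loop and then uses the optional third argument of `reduce`, the starting value.

```python
from functools import reduce

numbers = [1, 2, 3, 4, 5]

# reduce(lambda x, y: x * y, numbers) written as an explicit loop
accumulated = numbers[0]
for value in numbers[1:]:
    accumulated = accumulated * value
print(accumulated)  # 120

# The optional third argument is the starting value; it is also
# the result when the iterable is empty
with_start = reduce(lambda x, y: x * y, numbers, 1)
empty_case = reduce(lambda x, y: x * y, [], 1)
print(with_start)  # 120
print(empty_case)  # 1
```

Without a starting value, calling `reduce` on an empty iterable raises a `TypeError`.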

## *Comprehensions in Python*


### **List Comprehension**

List comprehension is a concise way to create new lists in Python. 
+ It allows you to create a new list by applying an **expression** to each element of an existing list (or other iterable object)
+ You can also filter the elements based on a condition. 

+ The syntax of a list comprehension is as follows:
```
new_list = [expression for item in iterable if condition]
```
    + `new_list` is the new list that will be created
    + `expression` is the operation to be applied to each item in the iterable
    + `item` is a variable that represents each item in the iterable
    + `iterable` is the existing list or other iterable that the list comprehension is based on
    + `condition` is an optional condition that filters the elements in the iterable


Examples:
```
# create a list of squares of numbers from 1 to 10
squares = [x**2 for x in range(1, 11)]
print(squares)  # Output: [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

# Create a list of even numbers from 1 to 10:
even_numbers = [x for x in range(1, 11) if x % 2 == 0]
print(even_numbers)  # Output: [2, 4, 6, 8, 10]

# Create a list of common elements from two given lists:
list1 = [1, 2, 3, 4, 5]
list2 = [3, 4, 5, 6, 7]
common_elements = [x for x in list1 if x in list2]
print(common_elements)  # Output: [3, 4, 5]

# Create a 3x3 matrix (list of lists)
matrix = [[i+j for j in range(3)] for i in range(0, 9, 3)]
print(matrix)  # Output: [[0, 1, 2], [3, 4, 5], [6, 7, 8]]

```
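
A comprehension is a compact form of a loop that appends to a list. The sketch below puts both forms side by side; the variable names are illustrative.

```python
# Comprehension form: expression, loop, and condition on one line
even_squares = [x**2 for x in range(1, 11) if x % 2 == 0]

# Equivalent loop form
even_squares_loop = []
for x in range(1, 11):
    if x % 2 == 0:
        even_squares_loop.append(x**2)

print(even_squares)                       # [4, 16, 36, 64, 100]
print(even_squares == even_squares_loop)  # True
```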

### **Dictionary Comprehension**
Dictionary comprehension is similar to list comprehension, but instead of creating a new list, it creates a new dictionary 
+ It applies an expression to each `key-value` pair in an existing dictionary (or other iterable) 
+ It can also filter the `key-value` pairs based on a condition. 

The syntax of a dictionary comprehension is as follows:
```
new_dict = {key_expression: value_expression for item in iterable if condition}
```

Examples:
```
# Create a dictionary of squares of numbers from 1 to 5
squares = {x: x**2 for x in range(1, 6)}
print(squares)  # Output: {1: 1, 2: 4, 3: 9, 4: 16, 5: 25}

# Create a new dictionary that contains only the items with values greater than or equal to 3 from an existing dictionary:
original_dict = {'apple': 2, 'banana': 3, 'orange': 4, 'peach': 1}
new_dict = {k: v for k, v in original_dict.items() if v >= 3}
print(new_dict)  # Output: {'banana': 3, 'orange': 4}

# Create a new dictionary that maps each letter in a string to its frequency:
sentence = 'Dictionary comprehension is similar to list comprehension'
letter_freq = {letter: sentence.count(letter) for letter in sentence if letter.isalpha()}
print(letter_freq)  # {'D': 1, 'i': 8, 'c': 3, 't': 3, 'o': 6, 'n': 5, 'a': 2, 'r': 4, 'y': 1, 'm': 3, 'p': 2, 'e': 4, 'h': 2, 's': 5, 'l': 2}


```

### **Generator Expression**

A generator expression in Python is a concise way to create an iterator that generates a sequence of values on-the-fly, rather than storing them all in memory like a list or tuple. It has a similar syntax to a list comprehension, but with parentheses instead of square brackets. Here's the basic syntax of a generator expression:
```
generator = (expression for item in iterable if condition)

```

#### *Lazy Evaluation*
Generator expressions are useful when you need to iterate over a large sequence of items, but you don't want to store all of them in memory at once.

Instead, the generator expression generates the values one at a time, only when they are needed.

Examples:
```
# Create a generator that yields the squares of numbers from 1 to 5
squares = (x**2 for x in range(1, 6))
print(squares)  # Output: <generator object <genexpr> at ........>
print(list(squares))  # Output: [1, 4, 9, 16, 25]

```
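
The sketch below illustrates lazy evaluation in two ways: the generator object stays small no matter how many values it can produce, and values are consumed as they are requested. Exact sizes depend on the Python build, so only the comparison is shown.

```python
import sys

squares_list = [x**2 for x in range(100_000)]
squares_gen = (x**2 for x in range(100_000))

# The list stores every value; the generator stores only its current state
list_is_bigger = sys.getsizeof(squares_list) > sys.getsizeof(squares_gen)
print(list_is_bigger)  # True

first = next(squares_gen)   # 0, computed on demand
second = next(squares_gen)  # 1
total = sum(squares_gen)    # consumes all remaining values
leftover = list(squares_gen)
print(first, second, leftover)  # 0 1 []
```

Like any iterator, a generator can be consumed only once.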

## *Packages and Modules*

### **Module**
A Python module is a Python code file (`.py` file)  
+ It is a way to organize code into reusable, standalone components that can be easily imported and used in other Python programs. 
+ Modules allow you to break up your code into logical, reusable pieces that can be shared and reused by other developers.
+ Python modules can define functions, classes, variables, and constants that can be used in other Python programs. 
+ Modules can also include executable code that runs when the module is imported, allowing you to define initialization logic or other setup code that should be executed when the module is loaded.

### **Package**

A Python package is a way to organize related modules into a single namespace, making it easier to manage and import modules in your Python program. 
+ A package is simply a directory containing Python modules, along with a special file called `__init__.py` that is executed when the package is imported.
+ Packages can contain sub-packages, which are themselves packages nested within the main package. This allows for a hierarchical structure of packages and modules that can be organized in a logical way.
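
As a rough, self-contained sketch, the code below builds a throwaway package in a temporary directory and imports a module from it. The names `demo_pkg`, `helpers`, and `double` are hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Layout created on disk:
#   demo_pkg/
#       __init__.py
#       helpers.py
root = Path(tempfile.mkdtemp())
package_dir = root / "demo_pkg"
package_dir.mkdir()
(package_dir / "__init__.py").write_text("print('demo_pkg is being imported')\n")
(package_dir / "helpers.py").write_text("def double(x):\n    return 2 * x\n")

sys.path.insert(0, str(root))  # make the new directory importable
importlib.invalidate_caches()

from demo_pkg import helpers   # runs __init__.py, then loads helpers.py
print(helpers.double(21))      # 42
```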



***tasks:*** Import Packages/Modules

```
# Use import statement
import pandas # the name pandas now refers to the pandas package object

import pandas as pd # the name pd refers to the same package object

pd.Series([1,2])
pandas.Series([1,2]) # works here only because of the plain "import pandas" above;
                     # with "import pandas as pd" alone: NameError: name 'pandas' is not defined

# Use from ... import ... statement
from math import sqrt, cos # sqrt and cos are added to the namespace and can be called directly, BUT math is not in the namespace
```

You can access the functions, classes, and constants in the module using dot notation, for example:
```
pd.DataFrame([[1,2], [3,4]])
```

***Note***

When the import statement is executed in Python, several things happen under the hood:

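1. Search for module: The Python interpreter searches for the module in the search path (`sys.path`). 
  + The search path is a list of directories where Python looks for modules. 
  + The search path includes 
    + the current directory (or the directory of the script being run) 
    + directories listed in the `PYTHONPATH` environment variable 
    + installation-dependent default directories, such as the standard library and `site-packages`. 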

2. Compile the module: If the module is found, the Python interpreter compiles it into byte code. This is done to improve the speed of execution when the module is used repeatedly.

3. Create a module object: The compiled byte code is loaded into memory and a module object is created to represent the module. This module object contains the names defined in the module's namespace, including functions, classes, and variables.

4. Execute the module code: The module code is executed and the names defined in the module's namespace are made available for use in the importing module.

5. Bind the module object to a name: Finally, the module object is bound to a name in the importing module's namespace. This name can be used to access the names defined in the imported module.

If the module has already been imported in the current session, Python finds it in the `sys.modules` cache and skips the search, compilation, and execution steps; it only binds the existing module object to the name. This is done to save time and resources.
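
The cache and the search path can both be inspected through the standard `sys` module. A minimal sketch, using `math` as the example module:

```python
import sys

import math                    # first import: math is loaded and cached
cached = "math" in sys.modules
print(cached)                  # True

import math as m               # second import: reuses the cached module object
same_module = m is sys.modules["math"]
print(same_module)             # True

print(type(sys.path))          # <class 'list'>: the directories searched, in order
```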






## *Unicode and UTF-8 Encoding*

Unicode and UTF-8 are related but distinct concepts in computer science. Unicode defines what each character means, while UTF-8 defines how each character is represented in binary form.

+ Unicode is a standard character encoding system that assigns **a unique numeric code point to every character** in the world's writing systems. 
https://en.wikipedia.org/wiki/List_of_Unicode_characters

+ UTF-8 is a variable-length character encoding system that can **represent all Unicode code points using one to four bytes**. UTF-8 is the most widely used encoding for the World Wide Web and other modern computing applications. 
https://en.wikipedia.org/wiki/UTF-8

+ Besides UTF-8, there are other encoding systems, such as ASCII, UTF-16, and UTF-32.




In Python, Unicode strings are represented using the `str` data type, which can store characters from any writing system in the world.

Python 3 uses UTF-8 by default when reading source files and when calling `encode()` and `decode()` without an explicit encoding. 
+ A `str` object itself stores Unicode code points rather than bytes, so a string literal written with quotes (either single or double) can contain characters from any writing system. For example:

```
my_string = "Hello, 世界!"
print(my_string)
```

You can also create Unicode strings using escape sequences that represent Unicode code points in hexadecimal notation. For example:

```
#  The escape sequences use the \u prefix followed by a four-digit hexadecimal code point for each character.

my_string = "\u0048\u0065\u006C\u006C\u006F, \u4E16\u754C!"
print(my_string)
```
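
The relationship between code points and UTF-8 bytes can be checked with `ord()` and `encode()`. The four characters in this sketch were chosen to need one, two, three, and four bytes; the last one uses the `\U` prefix, which takes eight hexadecimal digits.

```python
# "A", "é", "世", and an emoji, written as escape sequences
chars = ["A", "\u00e9", "\u4e16", "\U0001F600"]

byte_lengths = []
for ch in chars:
    code_point = ord(ch)          # Unicode code point as an integer
    encoded = ch.encode("utf-8")  # UTF-8 representation as bytes
    byte_lengths.append(len(encoded))
    print(hex(code_point), len(encoded))

print(byte_lengths)  # [1, 2, 3, 4]
```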







### **Encode() and Decode()**

*Encode() string method*

The `encode()` method is used to convert a string from its Unicode representation to a specified character encoding. 

It returns a bytes object that contains the encoded version of the string. Here's an example:
```
string = "Hello, world!"
encoded_string = string.encode("utf-8")
print(encoded_string)
```

*Decode() byte string method*

The `decode()` method is used to convert a bytes object into a string by interpreting the bytes with a specified character encoding. Here's an example:

```
encoded_string = b'Hello, world!'
decoded_string = encoded_string.decode("utf-8")
print(decoded_string)
```

**Note**

When using `encode()` and `decode()`, use the same character encoding for both methods to ensure that the data is converted correctly. 
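
The sketch below shows what can go wrong when the encodings do not match. The sample text is arbitrary; `latin-1` and `ascii` are standard codec names.

```python
text = "caf\u00e9"  # "café"
data = text.encode("utf-8")
print(data)  # b'caf\xc3\xa9'

round_trip = data.decode("utf-8")  # same encoding: original text is restored
garbled = data.decode("latin-1")   # wrong encoding: no error, but wrong text
print(round_trip == text)  # True
print(garbled == text)     # False

try:
    data.decode("ascii")   # wrong encoding that cannot represent these bytes
    failed = False
except UnicodeDecodeError as error:
    failed = True
    print("UnicodeDecodeError:", error)
```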

### **Byte String**

In Python, a byte string (a `bytes` object) is an immutable sequence of bytes, that is, integers in the range 0 to 255. Byte strings are often used to store binary data or encoded text.

```
byte_string = b"hello world" # create a byte string using the b prefix
byte_string = bytes([0x68, 0x65, 0x6c, 0x6c, 0x6f, 0x20, 0x77, 0x6f, 0x72, 0x6c, 0x64]) # create a byte string using the bytes() constructor and a list of byte values
```

+ Byte strings can be used to represent non-textual data, such as images, audio files, and other binary data. 
+ They are also used for network communication, file input/output, and other operations where data needs to be transmitted in a raw byte format.

Byte strings are not the same as Unicode strings, which represent human-readable text. While Unicode strings can be encoded into byte strings using a specific character encoding, byte strings cannot be reliably decoded into Unicode strings unless the original encoding is known.
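
A few operations make the difference from `str` visible. In this sketch, indexing a byte string returns an integer, and `bytearray` is used as the mutable counterpart of `bytes`.

```python
byte_string = b"hello"

first_byte = byte_string[0]
print(first_byte)         # 104  (indexing gives an integer, not a character)
print(list(byte_string))  # [104, 101, 108, 108, 111]
print(byte_string[:2])    # b'he'  (slicing gives bytes)

mutable = bytearray(byte_string)  # bytes is immutable; bytearray is not
mutable[0] = ord("j")
changed = bytes(mutable)
print(changed)            # b'jello'
```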


# **Python Learning Resources**

+ Python.org - The official Python website offers beginner guides, documentation, and tutorials. https://www.python.org/

+ Python for Everybody - A free online course offered by the University of Michigan that covers the basics of Python programming. https://www.py4e.com/

+ Python Crash Course - A book by Eric Matthes that covers the basics of Python programming and includes hands-on projects.

+ Real Python - A website that offers a collection of Python tutorials, articles, and resources for learners of all levels. https://realpython.com/

These resources should give you a good starting point to learn Python.
