First, set up VS Code and a Python environment to run this notebook.

# On local PC

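* Install Miniconda and create a virtual environment for these learning activities.
  * `conda create --name=learn python=3.13`
  * `conda activate learn`
* Then install `ipykernel` inside that environment.
  * `which pip` (confirm that `pip` resolves to the `learn` environment)
  * `pip install ipykernel`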

# On VS Code

* Install the `Python` and `Jupyter` extensions from Microsoft.
* Go to the Extensions view by clicking on the icon on the sidebar or pressing Ctrl+Shift+X (Windows/Linux) or Cmd+Shift+X (macOS).
* Search for and install:
  * "Jupyter" extension by Microsoft. This enables Jupyter notebook integration.
  * "Python" extension by Microsoft for enhanced Python support.
* Set up your environment:
  * Open the Command Palette (Ctrl+Shift+P or Cmd+Shift+P).
  * Type "Python: Select Interpreter" and run that command.
  * Choose the interpreter from the `learn` environment, or enter its path manually.
* Create a new Jupyter notebook:
  * Create a new file with a `.ipynb` extension.
  * Write code in a cell, then press Ctrl+Enter to run the current cell or Shift+Enter to run it and move to the next cell.
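
A quick sanity check for the first cell of a new notebook, as a minimal sketch: it prints which interpreter the kernel is running. The expectation that the path points inside the `learn` environment assumes the environment name used above.

```python
import sys

# Path of the interpreter behind the current kernel; with the setup above
# it should live inside the `learn` conda environment.
interpreter_path = sys.executable
print(interpreter_path)

# Major and minor version of that interpreter, e.g. (3, 13).
python_version = tuple(sys.version_info[:2])
print(python_version)
```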

Let's start with the basics: a generator that yields Fibonacci numbers one at a time.

In [14]:
def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

fib = fibonacci()
for _ in range(9):
    print(next(fib))

0
1
1
2
3
5
8
13
21
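
The same generator can also be consumed without a manual `next()` loop. This is a small sketch that uses `itertools.islice` from the standard library to take the first nine values; the name `first_nine` is only for illustration.

```python
from itertools import islice

def fibonacci():
    """Yield Fibonacci numbers forever, starting from 0."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# islice stops after 9 items, so the infinite generator is safe to use here.
first_nine = list(islice(fibonacci(), 9))
print(first_nine)  # [0, 1, 1, 2, 3, 5, 8, 13, 21]
```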


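OK, let's continue. Python passes arguments by object reference (sometimes called "call by sharing"): each parameter is a new local name bound to the same object the caller passed in. Whether the caller sees a change depends on whether the function mutates that object or merely rebinds the local name.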

* Mutable objects (like lists or dictionaries) can be modified within the function.
* Immutable objects (like integers or strings) cannot be changed and reassigning them inside the function doesn’t affect the original object.

In [15]:
def fun(a, b):
    a += ", world"
    # `+=` on a str builds a new string and rebinds the local name `a`; the global `a` is untouched
    b.append(", world")
    # `b` still refers to the caller's list, so the append is visible outside the function
    print(a)

a = "hello"
b = ["hello"]
fun(a, b)
print(a)
print(b)


hello, world
hello
['hello', ', world']
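
To make the difference between rebinding and mutating explicit, here is a minimal sketch with two hypothetical helpers, `rebind` and `mutate`. The `is` checks show whether the returned object is the very same list the caller passed in.

```python
def rebind(items):
    # `items + [...]` creates a new list; the assignment only changes
    # what the local name refers to.
    items = items + ["rebound"]
    return items

def mutate(items):
    # append() changes the list object itself, which the caller also sees.
    items.append("mutated")
    return items

original = ["hello"]

rebound = rebind(original)
print(original, rebound is original)  # ['hello'] False

mutated = mutate(original)
print(original, mutated is original)  # ['hello', 'mutated'] True
```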


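Let's try some data structures: a set for removing duplicates, and the different ways to delete items from a list. A set is unordered, so the sorted-looking set output below is an implementation detail rather than a guarantee.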

In [None]:
my_list = [6, 1, 2, 3, 2, 4, 5, 1]
print(my_list)
my_set = set(my_list)
print(my_set)
print(list(my_set))

# deletions
del my_list[6] # remove the element at index 6
print(my_list)

my_list.pop(3) # remove the element at index 3
print(my_list)

my_list.remove(1) # remove the first occurrence of value 1
print(my_list)

del my_list[:] # remove all elements
print(my_list)

[6, 1, 2, 3, 2, 4, 5, 1]
{1, 2, 3, 4, 5, 6}
[1, 2, 3, 4, 5, 6]
[6, 1, 2, 3, 2, 4, 1]
[6, 1, 2, 2, 4, 1]
[6, 2, 2, 4, 1]
[]
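
Converting to a set drops duplicates but also loses the original order. If the order matters, one common alternative (a sketch, not the only option) is `dict.fromkeys`, which keeps the first occurrence of each value because dicts preserve insertion order.

```python
my_list = [6, 1, 2, 3, 2, 4, 5, 1]

# Dict keys are unique and keep insertion order, so this removes
# duplicates while preserving the position of each first occurrence.
deduped = list(dict.fromkeys(my_list))
print(deduped)  # [6, 1, 2, 3, 4, 5]
```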


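**Comprehension**: a syntax for building a list, set, or dict from an iterable in a single expression. In CPython it is often somewhat faster than an equivalent `for` loop that calls `append`, although the gain depends on the workload.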

In [24]:
# list, set, and dict comprehensions
old_list = [1, 3, 4, 6, 7, 8, 3, 4, 9]
new_list = [x*2 for x in old_list]
print(new_list)
a_set = {x*2 for x in old_list}
print(a_set)
a_dict = {x: x*4 for x in a_set}
print(a_dict)

[2, 6, 8, 12, 14, 16, 6, 8, 18]
{2, 6, 8, 12, 14, 16, 18}
{2: 8, 6: 24, 8: 32, 12: 48, 14: 56, 16: 64, 18: 72}
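
Comprehensions can also filter with a trailing `if` and choose values with a conditional expression. A short sketch using the same input list; `evens_doubled` and `parity` are illustrative names.

```python
old_list = [1, 3, 4, 6, 7, 8, 3, 4, 9]

# A trailing `if` keeps only the items that pass the test.
evens_doubled = [x * 2 for x in old_list if x % 2 == 0]
print(evens_doubled)  # [8, 12, 16, 8]

# A conditional expression picks the value for each key;
# duplicate keys simply overwrite earlier ones.
parity = {x: ("even" if x % 2 == 0 else "odd") for x in old_list}
print(parity)
```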


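Some file operations: try to detect a file's encoding and convert it to UTF-8. Anki still displays garbled characters afterwards, most likely because `latin-1` can decode any byte sequence, so a successful `latin-1` decode does not prove the file really is text in that encoding.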

In [34]:
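# file="/home/xzhao/Downloads/Certifications__2CiscoDev__ENAUTO.apkg"
file="/tmp/apg/713"

def detect_encoding(file):
    # The standard `codecs` module has no detect_encoding(); this version assumes
    # the third-party `chardet` package is installed (`pip install chardet`).
    import chardet
    with open(file, "rb") as f:
        result = chardet.detect(f.read())
        return result['encoding']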

def detect_encoding2(file):
    with open(file, "rb") as f:
        data = f.read()

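        # "latin-1" and "iso-8859-1" are aliases, and latin-1 accepts every byte,
        # so this list effectively means "utf-8, else fall back to latin-1".
        possible_encodings = ["utf-8", "latin-1", "iso-8859-1"]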
        for enc in possible_encodings:
            try:
                data.decode(enc)
            except UnicodeDecodeError as e:
                print(f"Failed to decode with {enc}, error is {e}")
                continue
            else:
                return enc

def convert_file(file, old_enc, new_enc):
    with open(file, "rb") as f:
        data = f.read()

    new_file = f'{file}.new.txt'
    new_data = data.decode(old_enc)

    with open(new_file, "w", encoding=new_enc) as f:
        f.write(new_data)

enc = detect_encoding2(file)
print(enc)
if enc != "utf-8":  
    convert_file(file, enc, "utf-8")
    print(f"Converted {file} from {enc} to utf-8")


Failed to decode with utf-8, error is 'utf-8' codec can't decode byte 0xb5 in position 1: invalid start byte
latin-1
Converted /tmp/apg/713 from latin-1 to utf-8
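
One possible explanation for the garbled result is that the file is not text at all. The sketch below first shows that `latin-1` round-trips every byte value, then checks the leading bytes against a few well-known binary signatures. `MAGIC_SIGNATURES` and `sniff_format` are hypothetical helpers, not part of any library. In the output above the failing byte `0xb5` sits at position 1, which would be consistent with a Zstandard header, but that is only a guess about this particular file.

```python
# Leading "magic" bytes of a few common binary formats.
MAGIC_SIGNATURES = {
    b"PK\x03\x04": "zip archive",
    b"\x28\xb5\x2f\xfd": "Zstandard-compressed data",
    b"SQLite format 3\x00": "SQLite database",
}

def sniff_format(data):
    """Return a format name if data starts with a known signature, else None."""
    for signature, name in MAGIC_SIGNATURES.items():
        if data.startswith(signature):
            return name
    return None

# latin-1 maps each of the 256 byte values to one code point,
# so decoding with it can never raise UnicodeDecodeError.
every_byte = bytes(range(256))
round_trip = every_byte.decode("latin-1").encode("latin-1")
print(round_trip == every_byte)  # True

print(sniff_format(b"\x28\xb5\x2f\xfd rest of frame"))  # Zstandard-compressed data
print(sniff_format("plain text".encode("utf-8")))       # None
```

If `sniff_format` recognizes the data, decompressing or opening it with the matching tool is likely to be more useful than re-encoding it as text.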


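Let's look at the dictionary, which is essentially a hash table. CPython's `dict` resolves collisions with open addressing; the toy class below uses separate chaining (one list per bucket) instead because it is easier to follow. String hashes are randomized per process, so the bucket numbers in the output will differ between runs.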

In [1]:
# Simple example of separate chaining using lists as buckets
class HashTable:
    def __init__(self):
        self.buckets = [[] for _ in range(5)]
    
    def get_bucket(self, key_hash):
        return self.buckets[key_hash % len(self.buckets)]

    def insert(self, key, value):
        h = hash(key) % len(self.buckets)  # bucket index; labelled "hash" in the print below
        bucket = self.buckets[h]
        print(f"key: {key}, hash: {h}, bucket: {bucket}")
        # Search the bucket for the key
        for i, (k, v) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)
                return
        # If not found, append to the bucket
        bucket.append((key, value))

    def delete(self, key):
        bucket = self.get_bucket(hash(key))
        for i, (k, v) in enumerate(bucket):
            if k == key:
                del bucket[i]
                return
        raise KeyError(key)

    def get(self, key):
        bucket = self.get_bucket(hash(key))
        for k, v in bucket:
            if k == key:
                return v
        raise KeyError(key)

# Create a hash table and insert some values
ht = HashTable()
ht.insert("apple", 1)
ht.insert("banana", 2)
ht.insert("apply", 3)
ht.insert("apple", 5)

try:
    print(ht.get("apple"))   # Output: 5 (the second insert overwrote the value 1)
    print(ht.get("banana"))  # Output: 2
    print(ht.get("apply"))   # Output: 3
    print(ht.get("melon"))   # raises KeyError
except KeyError as e:
    print(f"get error: {e} is not in the hash table")

ht.delete("apply")
try:
    print(ht.get("apply"))
except KeyError as e:
    print(f"get error: {e} is not in the hash table")


key: apple, hash: 3, bucket: []
key: banana, hash: 1, bucket: []
key: apply, hash: 1, bucket: [('banana', 2)]
key: apple, hash: 3, bucket: [('apple', 1)]
5
2
3
get error: 'melon' is not in the hash table
get error: 'apply' is not in the hash table
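
The built-in `dict` handles collisions transparently as well. In this sketch, `CollidingKey` is a hypothetical key type whose instances all return the same hash, so every key collides; lookups still work because the dict falls back to `__eq__` to tell keys apart.

```python
class CollidingKey:
    """Hypothetical key type whose instances all share one hash value."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return 42  # force every key to collide

    def __eq__(self, other):
        return isinstance(other, CollidingKey) and self.name == other.name

table = {CollidingKey("apple"): 1, CollidingKey("banana"): 2}
table[CollidingKey("apple")] = 5  # equal key: overwrites instead of adding

print(len(table))                     # 2
print(table[CollidingKey("apple")])   # 5
print(table[CollidingKey("banana")])  # 2
```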


A lambda function can take any number of parameters, but its body must be a single expression (statements such as assignments or `return` are not allowed).

In [19]:
a = lambda x, y: x + y
print(a(7, 19))

26
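
Lambdas are most useful as short, throwaway arguments to other functions. A minimal sketch with made-up sample data: the lambda is passed as the `key` of `sorted` to order tuples by their second field.

```python
products = [("Laptop", 500), ("Phone", 700), ("Tablet", 400)]

# The key function extracts the sales figure; reverse=True sorts descending.
by_sales = sorted(products, key=lambda item: item[1], reverse=True)
print(by_sales)  # [('Phone', 700), ('Laptop', 500), ('Tablet', 400)]
```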


Now let's work with dictionaries: a list of nested dictionaries, accessed and modified by index and key.

In [28]:
sales_data = [
    {
        "region": "North America",
        "products": [
            {"product_id": 101, "name": "Laptop", "sales": 500},
            {"product_id": 102, "name": "Phone", "sales": 700}
        ]
    },
    {
        "region": "Europe",
        "products": [
            {"product_id": 201, "name": "Tablet", "sales": 400},
            {"product_id": 202, "name": "Smartwatch", "sales": 350}
        ]
    }
]

# Accessing nested data
print(sales_data[0]["products"][0]["name"])  # Output: Laptop

# Modifying sales data for a product in Europe
sales_data[1]["products"][1]["sales"] = 400
# print(sales_data)

for item in sales_data:
    for product in item["products"]:
        print(item["region"], product["name"], product["sales"])

Laptop
North America Laptop 500
North America Phone 700
Europe Tablet 400
Europe Smartwatch 400
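
The nested loop above can be condensed when only an aggregate is needed. This sketch rebuilds a trimmed copy of the data (with the Smartwatch figure already changed to 400) and computes per-region totals with a dict comprehension; `totals` and `best_region` are illustrative names.

```python
sales_data = [
    {"region": "North America",
     "products": [{"name": "Laptop", "sales": 500},
                  {"name": "Phone", "sales": 700}]},
    {"region": "Europe",
     "products": [{"name": "Tablet", "sales": 400},
                  {"name": "Smartwatch", "sales": 400}]},
]

# One entry per region, summing the sales of its products.
totals = {
    item["region"]: sum(product["sales"] for product in item["products"])
    for item in sales_data
}
print(totals)  # {'North America': 1200, 'Europe': 800}

# max() over the keys, compared by their totals.
best_region = max(totals, key=totals.get)
print(best_region)  # North America
```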


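About `*args` and `**kwargs`: inside the function, `args` is a tuple of the extra positional arguments and `kwargs` is a dict of the extra keyword arguments, so a `for` loop can iterate over either one.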

In [29]:
def example_function(arg1, *args, **kwargs):
    print("First argument:", arg1)
    
    print("\nAdditional arguments (*args):")
    for arg in args:
        print(arg)
    
    print("\nKeyword arguments (**kwargs):")
    for key, value in kwargs.items():
        print(f"{key}: {value}")

# Example usage
example_function(1, 2, 3, 4, name="Alice", age=30, city="New York")

First argument: 1

Additional arguments (*args):
2
3
4

Keyword arguments (**kwargs):
name: Alice
age: 30
city: New York
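
The same `*` and `**` syntax works in the other direction at the call site, unpacking a sequence or a dict into individual arguments. A small sketch with a hypothetical `describe` function:

```python
def describe(name, age, city="unknown"):
    return f"{name}, {age}, {city}"

positional = ("Alice", 30)
keyword = {"city": "New York"}

# *positional fills name and age; **keyword supplies city by name.
summary = describe(*positional, **keyword)
print(summary)  # Alice, 30, New York
```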


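Docstrings: a string literal placed as the first statement of a function documents it and is available at runtime through `__doc__`.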

In [30]:
def get_total_sales(data):
    """
    Calculate the total sales from the sales data.

    Args:
        data (list): A list of dictionaries containing sales data. Each dictionary represents a region and contains a list of products with their sales.

    Returns:
        int: The total sales from all regions and products.
    """
    total_sales = 0
    for region in data:
        for product in region["products"]:
            total_sales += product["sales"]
    return total_sales

# Example usage
total_sales = get_total_sales(sales_data)
print(f"Total Sales: {total_sales}")

print(get_total_sales.__doc__)

Total Sales: 2000

Calculate the total sales from the sales data.

Args:
    data (list): A list of dictionaries containing sales data. Each dictionary represents a region and contains a list of products with their sales.

Returns:
    int: The total sales from all regions and products.
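
Besides reading `__doc__` directly, the standard library's `inspect.getdoc` returns the docstring with leading blank lines and common indentation removed, which is convenient for display. A minimal sketch with a hypothetical `area` function:

```python
import inspect

def area(width, height):
    """
    Return the area of a rectangle.

    Args:
        width (float): Horizontal size.
        height (float): Vertical size.
    """
    return width * height

# getdoc() cleans up the indentation and surrounding blank lines.
doc = inspect.getdoc(area)
print(doc.splitlines()[0])  # Return the area of a rectangle.
```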



In [33]:
import math

print(5/2)
print(5//2)
print(math.ceil(5/2))

2.5
2
3
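
The cell above contrasts true division (`/`), floor division (`//`), and `math.ceil`. The sketch below adds two details that are easy to miss: `//` rounds toward negative infinity rather than toward zero, and ceiling division can be done with integers only. `ceil_div` is a hypothetical helper name.

```python
def ceil_div(a, b):
    """Ceiling of a / b using only integer arithmetic."""
    return -(-a // b)

print(-5 / 2)        # -2.5
print(-5 // 2)       # -3  (floors toward negative infinity)
print(int(-5 / 2))   # -2  (int() truncates toward zero)
print(divmod(5, 2))  # (2, 1)  quotient and remainder together
print(ceil_div(5, 2))  # 3
```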
