# Vectors' graphical representation

- A quick reminder of the graphical representation of vector addition, subtraction, and resultant vectors.
![vector_add.jpg](attachment:aa2692a6-5b7b-4966-b217-79e6b03d17a7.jpg)

# For Loop vs Generators

- When an iterable (e.g. a list) is small, a plain `for` loop over it works well.
- The `for` loop itself stores nothing; the list does. A list holds every element in memory at once, which becomes inefficient when the data is very large.

- In such cases, we use generators.
- Generators are functions that produce one value at a time, and only when it is needed.
- They don't store the complete data in memory at once.
- They compute the next value at each iteration and `yield` it.

- Let's see an example:

In [1]:
# Use for loop for small dataset

small_list = [1,2,3,4]

for number in small_list:
    print(number)

1
2
3
4


In [5]:
# Use generator function for large dataset

# Define a generator function
# Each time execution reaches yield, the function pauses and hands back a value
def count_up_to(n):
    count = 1
    while count <= n:
        yield count
        count +=1

# Iterate over the generator (n is kept small here so the output stays short)
counter = count_up_to(5)
for number in counter:
    print(number)
    

1
2
3
4
5


In [15]:
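# In compact form, a generator can be written as a generator expression.
# Note the parentheses: square brackets would build a full list instead.

numbers = (i for i in range(1, 6))
print(list(numbers))  # list() consumes the generator so its values can be printed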

[1, 2, 3, 4, 5]
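
To make the memory argument concrete, here is a minimal sketch that compares a list with a generator over the same range. The size of 100,000 is an arbitrary choice for illustration, and `sys.getsizeof` reports only the container itself (not the integers it refers to), so treat the numbers as a rough indication rather than a precise measurement.

```python
import sys


def count_up_to(n):
    """Yield 1, 2, ..., n one value at a time."""
    count = 1
    while count <= n:
        yield count
        count += 1


# The list holds references to all 100,000 values at once
big_list = list(range(1, 100_001))

# The generator only remembers where it is in the loop
big_gen = count_up_to(100_000)

list_size = sys.getsizeof(big_list)
gen_size = sys.getsizeof(big_gen)
print(list_size > gen_size)  # the list container is far larger

# Values are produced only when asked for
first = next(big_gen)
second = next(big_gen)

# sum() consumes the remaining values (3 through 100000) without building a list
total = sum(big_gen)
print(first, second, total)
```

A generator can be consumed only once: after `sum(big_gen)` finishes, the generator is exhausted and yields nothing further.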


# Queue

- A queue is a data structure that follows the First-In-First-Out (FIFO) principle.
- This means that the first element added to the queue will be the first one to be removed, just like a line of people waiting for a service, where the person at the front of the line gets served first.

**Key Characteristics of a Queue:**
1. FIFO Order: The first element added is the first one to be removed.

**Operations:**

1. Enqueue: Add an element to the back of the queue.
2. Dequeue: Remove an element from the front of the queue.
3. Peek/Front: View the element at the front of the queue without removing it.

**Use Cases:**

Queues are used in scenarios where order matters, such as:
- Task scheduling
- Managing requests in web servers
- Breadth-first search in graphs

**Example in Python:**
Python's standard library provides `deque` (double-ended queue) in the `collections` module, which can be used as a queue:

In [1]:
from collections import deque

# Create a new queue
queue = deque()

# Enqueue elements
queue.append('a')
queue.append('b')
queue.append('c')

# Dequeue elements
first_element = queue.popleft()  # 'a'
second_element = queue.popleft()  # 'b'

# Peek at the front element
front_element = queue[0]  # 'c'
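
A common pattern is to keep dequeuing until the queue is empty. The sketch below illustrates this with task scheduling; the task names are made up for the example.

```python
from collections import deque

# Hypothetical task names, enqueued in arrival order
tasks = deque(["parse", "validate", "save"])
tasks.append("notify")  # a late arrival joins the back of the queue

processed = []
while tasks:  # an empty deque is falsy, so the loop stops on its own
    task = tasks.popleft()  # always take from the front (FIFO)
    processed.append(task)

print(processed)
```

`deque.popleft()` runs in constant time, whereas `list.pop(0)` has to shift every remaining element, which is why `deque` is usually preferred over a plain list for queues.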

# Dictionary definition

## `dict1 = {}` vs `dict2 = defaultdict(default_factory)`

1. `dict1 = {}`
- When you access a key that doesn't exist, it raises a `KeyError`.

2. `dict2 = defaultdict(default_factory)`
- When you access a key that doesn't exist, it automatically creates the key with the default value returned by `default_factory` (e.g. `defaultdict(int)` gives `0` and `defaultdict(list)` gives `[]`).

In [9]:
from collections import defaultdict

dict1 ={}
dict1["key1"] = 1
print(dict1["key1"])  # Output 1
print(dict1["key2"])  # KeyError because key2 is not defined

1


KeyError: 'key2'

In [10]:
dict2 = defaultdict(int)  # default_factory is int here
dict2["key1"] = 1
print(dict2["key1"])  # Output 1
print(dict2["key2"])  # Outputs 0

1
0
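
The same idea works with `list` as the `default_factory`, which is handy for grouping. This is a small sketch with made-up words; it also shows that merely reading a missing key inserts it.

```python
from collections import defaultdict

# Hypothetical data: group words by their first letter
words = ["apple", "avocado", "banana", "blueberry", "cherry"]

groups = defaultdict(list)  # missing keys start out as an empty list
for word in words:
    groups[word[0]].append(word)  # no need to check whether the key exists

print(dict(groups))

# Reading a missing key creates it with the default value
had_z_before = "z" in groups
value = groups["z"]
has_z_after = "z" in groups
print(had_z_before, value, has_z_after)
```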


## Sorting a Dictionary

In [13]:
# let's say we have a dict
d = {0: 0.38578114178754597, 1: 0.5147899259830341, 2: 0.5147899259830341, 3: 0.47331418859372887, 
     4: 0.23360704386672448, 5: 0.1501486531300769, 6: 0.08355074577979703, 7: 0.08355074577979703, 
     8: 0.07284270783167102, 9: 0.02729231437283049}

# We want to sort it by its values, largest first

sorted_d = sorted(d.items(), key=lambda item: item[1], reverse=True)

# d.items() yields (key, value) pairs, so item[1] is the value of each pair
# The result is a list of (key, value) tuples ordered by value
# reverse=True sorts from max -> min (the default is min -> max)

print(sorted_d)

[(1, 0.5147899259830341), (2, 0.5147899259830341), (3, 0.47331418859372887), (0, 0.38578114178754597), (4, 0.23360704386672448), (5, 0.1501486531300769), (6, 0.08355074577979703), (7, 0.08355074577979703), (8, 0.07284270783167102), (9, 0.02729231437283049)]
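
Because `sorted()` returns a list of tuples rather than a dictionary, it is often useful to convert the result back or slice off the top entries. A minimal sketch, using a small made-up dictionary in place of `d`:

```python
# Hypothetical scores, standing in for the larger dict above
scores = {"a": 0.2, "b": 0.9, "c": 0.5}

# List of (key, value) tuples, largest value first
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)

# dict() keeps insertion order (Python 3.7+), so the sorted order is preserved
ranked_dict = dict(ranked)

# The two highest-scoring entries
top_two = ranked[:2]

print(ranked_dict)
print(top_two)
```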


# BFS algorithm

- Breadth-first search (BFS) is a graph traversal algorithm used to explore nodes and edges in a graph systematically.

- It is especially useful for finding the shortest path in unweighted graphs and for level-order traversal.

1. Starting Point: Begin traversal from a chosen starting node (or vertex).

2. Queue Utilization: Use a queue data structure to keep track of nodes to be explored. The queue ensures that nodes are explored in the order they are discovered.

3. Visited Nodes Tracking: Maintain a list or set to keep track of nodes that have been visited. This prevents processing the same node multiple times, avoiding infinite loops in cyclic graphs.

4. Level-wise Processing: Dequeue a node from the front of the queue. Explore all its unvisited neighbors and enqueue them for further exploration. Process each node level-by-level, ensuring that all nodes at the current level are explored before moving to the next level.

5. Repeat Until Completion: Continue the process until the queue is empty, meaning all reachable nodes have been visited.

**Example** (a small graph in which A is connected to B and C, and B is connected to D):

- Step 1: Start at node A, mark it as visited, and enqueue it.
> - queue = [A]

- Step 2: Dequeue A, explore its neighbors (B and C), mark them as visited, and enqueue them.
> - queue = [B, C] (A out)

- Step 3: Dequeue B, explore its unvisited neighbor (D), mark D as visited, and enqueue it.
> - queue = [C, D] (B out)

- Step 4: Dequeue C (no new neighbors to explore).
> - queue = [D] (C out)

- Step 5: Dequeue D (no new neighbors to explore).
> - queue = [ ] (D out)

- Step 6: The queue is now empty, and all reachable nodes have been visited.
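
The steps above can be sketched in Python as follows. The adjacency-list dictionary `graph` and the function name `bfs` are assumptions made for this example; the edges are chosen to match the walkthrough (A-B, A-C, B-D).

```python
from collections import deque


def bfs(graph, start):
    """Return the nodes reachable from start in breadth-first order."""
    visited = {start}       # mark the start node as visited
    queue = deque([start])  # and enqueue it
    order = []

    while queue:                    # repeat until the queue is empty
        node = queue.popleft()      # dequeue from the front
        order.append(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)   # mark when enqueuing, not when dequeuing
                queue.append(neighbor)
    return order


# Undirected graph stored as an adjacency list (assumed for illustration)
graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A"],
    "D": ["B"],
}

print(bfs(graph, "A"))  # ['A', 'B', 'C', 'D']
```

Marking a node as visited at the moment it is enqueued ensures that no node enters the queue twice, even in a graph with cycles.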


# float('inf') 

- `float('inf')` is typically used to initialize a variable (here, `min_path_length`) with a value representing positive infinity.
- Here's what it means and why it's used:

1. Positive Infinity: `float('inf')` creates a floating-point number representing positive infinity.
   - This value is greater than any finite number.
   - It's useful in algorithms where you need to compare values or find a minimum.

2. Initialization:
   - When you set `min_path_length = float('inf')`, you're initializing `min_path_length` to a value larger than any real path length.
   - This ensures that the first valid path length found later will be smaller and thus will update this variable.
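
A minimal sketch of this pattern, with made-up path lengths:

```python
# Hypothetical path lengths found while searching
path_lengths = [7, 3, 9]

min_path_length = float('inf')  # larger than any real path length

for length in path_lengths:
    if length < min_path_length:  # the first comparison always succeeds
        min_path_length = length

print(min_path_length)  # 3

# Infinity compares greater than any finite number, even a very large integer
is_larger = float('inf') > 10**100
print(is_larger)  # True
```

If no path is ever found, `min_path_length` is still `float('inf')`, which doubles as a convenient "nothing found" signal.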

# dict.get()

`.get(key, default_value)`:
- Returns the value associated with `key` in the dictionary. If the key is not found, it returns `default_value` (or `None` when no default is given) instead of raising a `KeyError`.
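
A short sketch of `.get()` in use; the dictionary contents are made up for illustration.

```python
inventory = {"apple": 3}  # hypothetical data

apple = inventory.get("apple", 0)  # key exists, so its value is returned
pear = inventory.get("pear", 0)    # key missing, so the default is returned
missing = inventory.get("pear")    # no default given, so None is returned

print(apple, pear, missing)

# Unlike defaultdict, get() does not insert the missing key
print("pear" in inventory)

# A common use: counting without checking for the key first
counts = {}
for ch in "banana":
    counts[ch] = counts.get(ch, 0) + 1
print(counts)
```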



# Eigenvalue and Eigenvector

- Let $A$ be a square matrix that operates on a non-zero vector $v$.
- If the resulting vector lies along the same line as $v$ (i.e. the result is a scalar multiple of $v$),
- then we call $v$ an ***eigenvector*** of $A$,
- and the scalar multiplier is called the ***eigenvalue***.

- Mathematically,
  $$ A \cdot v = \lambda \cdot v $$
  here, $v$ is the eigenvector and $\lambda$ is the eigenvalue.

1. **Matrix and Vector:**
   - Consider a matrix $A$ of size $n \times n$.
   - Let $v$ be a non-zero vector of size $n \times 1$.

2. **Eigenvector and Eigenvalue Equation:**
   $$ A \cdot v = \lambda \cdot v $$

3. **Finding Eigenvalues:**
   - Solve the characteristic equation:
     $$ \det(A - \lambda I) = 0 $$

4. **Finding Eigenvectors:**
   - Substitute each eigenvalue into:
     $$ (A - \lambda I) \cdot v = 0 $$

**Example**

1. Given Matrix
$$ A = \begin{bmatrix} 4 & 1 \\ 2 & 3 \end{bmatrix} $$

2. Finding Eigenvalues
- Solve $\det(A - \lambda I) = 0$:
  $$ \det\left(\begin{bmatrix} 4 - \lambda & 1 \\ 2 & 3 - \lambda \end{bmatrix}\right) = (4 - \lambda)(3 - \lambda) - 2 = \lambda^2 - 7\lambda + 10 = 0 $$
- Roots are $\lambda = 5$ and $\lambda = 2$.

3. Finding Eigenvectors

- For $\lambda = 5$, solve $(A - 5I) \cdot v = 0$:
  $$ \begin{bmatrix} -1 & 1 \\ 2 & -2 \end{bmatrix} \cdot \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} $$
- Both rows reduce to $-x + y = 0$, so $x = y$.

- Eigenvector (any non-zero multiple of it also works):
  $$ v = \begin{bmatrix} 1 \\ 1 \end{bmatrix} $$
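
As a quick sanity check, the sketch below verifies $A \cdot v = \lambda \cdot v$ numerically in plain Python. The helper names are assumptions for this example. The eigenvector for $\lambda = 2$ is not derived in the notes above; it comes from $(A - 2I) \cdot v = 0$, which gives $2x + y = 0$, so $v = (1, -2)$ is one choice.

```python
def mat_vec(A, v):
    """Multiply matrix A (list of rows) by vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def is_eigenpair(A, eigenvalue, v):
    """Check A.v == eigenvalue * v (exact, since the values here are integers)."""
    return mat_vec(A, v) == [eigenvalue * x for x in v]


A = [[4, 1],
     [2, 3]]

print(mat_vec(A, [1, 1]))           # [5, 5], i.e. 5 * [1, 1]
print(is_eigenpair(A, 5, [1, 1]))   # True

# Eigenvector for lambda = 2, worked out in the lead-in above (not in the notes)
print(mat_vec(A, [1, -2]))          # [2, -4], i.e. 2 * [1, -2]
print(is_eigenpair(A, 2, [1, -2]))  # True

# A vector that is not an eigenvector changes direction
print(is_eigenpair(A, 5, [1, 0]))   # False
```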



## Power Iteration 

**Concept**

Power Iteration is a method to approximate the dominant eigenvector of a matrix $A$, i.e. the eigenvector associated with the eigenvalue of largest absolute value.


1. **Initial Vector Direction**:
   - Start with an initial vector $v$. This vector points in some direction in the vector space.

2. **Matrix Transformation**:
   - Multiply the matrix $A$ with the vector $v$. This transformation changes the vector's direction and magnitude.
   - Matrix $A$ acts as a linear transformation, stretching, shrinking, or rotating the vector.

3. **Dominant Eigenvector**:
   - The dominant eigenvector is associated with the eigenvalue of $A$ that is largest in absolute value. It represents the direction where the transformation $A$ has the maximum stretching effect.

4. **Iteration Process**:
   - **Multiply**: Each iteration involves multiplying the current vector by the matrix $A$, making the vector more aligned with the dominant eigenvector.
   - **Normalize**: Normalize the vector to focus on direction, preventing it from becoming too large or too small.

**Convergence with Angles**

- **Initial Angle**: The angle between the initial vector $v$ and the dominant eigenvector may be large.
- **Iteration**: With each iteration, the angle between the transformed vector and the dominant eigenvector decreases. The vector gradually aligns with the direction of the dominant eigenvector.
- **Final Alignment**: After sufficient iterations, the vector aligns closely with the dominant eigenvector, illustrating convergence.

**Physical Analogy**

Imagine you are in a boat on a river:

- **Initial Heading**: Your initial direction (the vector) might not align with the flow of the river.
- **Matrix Transformation**: The current (matrix $A$) pushes you, changing your direction.
- **Iteration**: With each push (iteration), you adjust your heading to better align with the flow.
- **Convergence**: Eventually, you are moving in the same direction as the river's flow (the dominant eigenvector), despite your initial direction.

**Summary**

The power iteration method works by iteratively transforming and normalizing a vector, progressively aligning it with the direction of the dominant eigenvector. The process reduces the angle between the vector and the eigenvector, demonstrating convergence to the dominant direction.
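
The multiply-and-normalize loop can be sketched in plain Python as below. The function names, the starting vector, and the fixed count of 50 iterations are assumptions for this example; a practical implementation would usually stop once the vector stops changing. The matrix is the 2x2 example from the eigenvalue section, whose dominant eigenvalue is 5 with eigenvector direction $(1, 1)$.

```python
import math


def mat_vec(A, v):
    """Multiply matrix A (list of rows) by vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def normalize(v):
    """Scale v to unit length so only its direction matters."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def power_iteration(A, v, num_iterations=50):
    """Approximate the dominant eigenvalue and eigenvector of A."""
    v = normalize(v)
    for _ in range(num_iterations):
        v = normalize(mat_vec(A, v))  # multiply, then normalize
    # For a unit vector v, the estimate v . (A v) approaches the eigenvalue
    Av = mat_vec(A, v)
    eigenvalue = sum(a * b for a, b in zip(Av, v))
    return eigenvalue, v


A = [[4, 1],
     [2, 3]]

eigenvalue, eigenvector = power_iteration(A, [1.0, 0.0])
print(round(eigenvalue, 6))                # 5.0
print([round(x, 6) for x in eigenvector])  # [0.707107, 0.707107]
```

The result is the unit-length version of $(1, 1)$. Convergence here is fast because the ratio of the two eigenvalues, $2/5$, is small; the closer that ratio is to 1, the more iterations are needed.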


# `__annotations__`

In Python, `__annotations__` is a special attribute that stores type annotations for a function's parameters and return value. When you define a function and use type hints to specify the expected types of its arguments and return value, these hints are stored in the function's `__annotations__` attribute as a dictionary.

**How `__annotations__` Works**
When you define a function with type hints, the hints can be read back from the function object, like this:


In [2]:
def add(a: int, b: int) -> int:
    return a + b

add.__annotations__ # Output: {'a': <class 'int'>, 'b': <class 'int'>, 'return': <class 'int'>}



{'a': int, 'b': int, 'return': int}

**Why Use `__annotations__`**

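1. Type Safety: Annotations document which types the functions used for computing new columns are expected to return. Python does not enforce annotations at runtime, so a static type checker or an explicit check is still needed to catch mismatches.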

2. Readability: Type annotations make it clear what types are expected and returned by functions, improving code readability.

3. Automatic Type Handling: By extracting the return type annotations, the code can automatically determine the types of new columns without additional input.

This approach allows the select method to dynamically handle additional columns by understanding the types of values that the provided functions are expected to return.
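
A minimal sketch of the "automatic type handling" idea: reading each function's return annotation to decide the type of a new column. The function names, the `new_columns` mapping, and the `return_type` helper are all hypothetical; the actual `select` method these notes refer to is not shown here.

```python
def full_name(first: str, last: str) -> str:
    return f"{first} {last}"


def age_in_months(age_years: int) -> int:
    return age_years * 12


def return_type(func):
    """Return the annotated return type, or None if there is no annotation."""
    return func.__annotations__.get("return")


# Hypothetical mapping from new column name to the function that computes it
new_columns = {"full_name": full_name, "age_in_months": age_in_months}

# Infer each new column's type from the return annotation alone
column_types = {name: return_type(func) for name, func in new_columns.items()}
print(column_types)

# A function without annotations yields None
print(return_type(lambda x: x))
```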