# Week 1 Notes – Basic Algorithms
## 2024 Data Structures & Algorithms Challenge
### Notes by Cyril Michino | Zindua School

## Before We Begin...
### Why learn data structures and algorithms (Computer Programming Fundamentals)
- Build optimised code: increase speed and reduce cost (Particularly at scale)
- Advance your problem-solving skills with code (Paradigms for solving code problems)
- Crack technical interviews (Typical questions for Big Tech companies)

### What this course will cover:
1. Week 1: Basic Algorithms
    - Big-O Notation
    - Arrays & Hashmaps
    - Search
    - Sorting
2. Week 2: Data Structures
    - Linear Data Structures: Linked Lists, Stacks, Queues
    - Non-Linear Data Structures: Trees, Graphs
3. Week 3: Divide & Conquer Algorithms (Real problem-solving starts)
    - Recursion
    - Dynamic Programming
    - Greedy Algorithms
4. Week 4: Advanced Algorithms
    - Dynamic Programming in Graphs & Grid
    - Advanced Graph Algorithms: Search, Pathfinding, Vertex Coloring
    - Optimise Greedy Algorithms: Hill Climbing (Gradient Descent)
    - NP-Completeness
5. Week 5: Bonus Sessions
    - 2 career sessions by Gebeya
    - 2 bonus sessions (unstuck)
    - Bonus: Numpy, Monte Carlo, Markov Chains

## Day 1: Introduction to Algorithms 
Objectives: Understand Big-O Notation, learn array search algorithms (Linear vs Binary Search)

### Big-O Notation
Recommended Readings:
1. [FreeCodeCamp, What is Big O Notation Explained: Space and Time Complexity](https://www.freecodecamp.org/news/big-o-notation-why-it-matters-and-why-it-doesnt-1674cfa8a23c/)
2. [Big-O cheatsheet, Know thy complexities](https://www.bigocheatsheet.com/)

**Why the notation:** Different computers run at different speeds, so instead of measuring time in seconds, this notation describes how the work an algorithm does grows with the size of its input. Note that we use Big-O notation for both runtime and space complexity. We'll use runtime complexity (the number of operations your code performs) to understand this notation better.

Keep these notations in mind when assessing the complexity of code:
- Big O (O()) describes an upper bound on the complexity (commonly used for the worst case)
- Omega (Ω()) describes a lower bound on the complexity (commonly used for the best case)
- Theta (Θ()) describes a tight bound on the complexity (both an upper and a lower bound)

When gauging the complexity of our code, we usually focus on the worst-case scenario. Hence, we use Big-O notation to describe the complexity of our code. However, an upper bound can be arbitrarily loose (e.g. an O(n) algorithm is also O(n^2) and O(n^3)), so most of the time when we talk of Big-O, we actually mean Theta (Θ()), i.e. the tight bound on the complexity.

#### Runtime Complexity Examples
#### Here is code that runs in constant time O(1)

In [7]:
## Constant time, operations remain the same regardless of the scale of the elements
arr = [3,4,6,7,8]
print(arr) # Whether the array has 2 elements or a million elements, print is called only once

[3, 4, 6, 7, 8]


In [16]:
4 + 5 # Regardless of the size of the numbers, this operation also runs in constant time

9

#### Here is code that runs in linear time O(n)

In [17]:
arr = [3,4,6,7,8]
for i in arr:
    print(i) ## Number of print operations will be equal to the length of the array

3
4
6
7
8


#### Here is code that runs in quadratic time O(n<sup>2</sup>)

In [18]:
arr = [[3,4,6,7,8],[3,4,6,7,8],[3,4,6,7,8],[3,4,6,7,8],[3,4,6,7,8]]

for i in arr: # This loop runs n times
    for j in i: # The sub-loop runs n-times every time it is called
        print(j) ## Hence, we have n^2 print operations

3
4
6
7
8
3
4
6
7
8
3
4
6
7
8
3
4
6
7
8
3
4
6
7
8


Here are the different complexities ranked from fastest to slowest:

O(1) < O(log n) < O(n) < O(n log n) < O(n<sup>2</sup>) < O(n<sup>k</sup>) < O(2<sup>n</sup>) < O(n!) | where n is the number of elements and k is a constant greater than 2

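Note that the base of the log is taken to be 2 unless stated otherwise, given that we'll be dealing mostly with binary operations (such as halving). In Big-O terms the base does not matter, because logs of different bases differ only by a constant factor.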
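As a quick check on the point about log bases, here is a minimal sketch using the standard `math` module. The values of n are arbitrary; the ratio between the two logs should come out the same for each of them.

```python
import math

# The ratio log2(n) / log10(n) is the same constant for every n,
# so O(log2 n) and O(log10 n) describe the same growth rate.
ratios = []
for n in [10, 1_000, 1_000_000]:
    ratio = math.log2(n) / math.log10(n)
    ratios.append(ratio)
    print(n, round(ratio, 4))
```

Because Big-O drops constant factors, the choice of base changes the operation count but not the complexity class.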

![Rank of Complexities](https://www.freecodecamp.org/news/content/images/2021/06/1_KfZYFUT2OKfjekJlCeYvuQ.jpeg)
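To get a feel for the ranking, the sketch below evaluates a few of these complexity classes at small input sizes. The helper name `growth_row` and the chosen values of n are illustrative only.

```python
import math

def growth_row(n):
    """Return rough operation counts for common complexity classes at input size n."""
    return {
        "log n": math.log2(n),
        "n": n,
        "n log n": n * math.log2(n),
        "n^2": n ** 2,
        "2^n": 2 ** n,
    }

for n in [8, 16, 32]:
    print(n, growth_row(n))
```

Even at n = 32 the exponential term is already in the billions, while the logarithmic term has only reached 5.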

### Linear Search
We check the elements of an array one at a time, starting from the first index, until we find the element we are looking for.
- Worst-case scenario O(n)
- Best-case scenario Ω(1)

In [15]:
def linearsearch(arr,element):
    
    for i in range(len(arr)):
        if arr[i] == element:
            return i
        
    return None ## If element is not in array

arr = [2,45,34,67,4,23,56]
print(linearsearch(arr,4))

4
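To make the best and worst cases concrete, here is a sketch of the same search that also counts comparisons. The function name `count_comparisons` is hypothetical and not part of the course code.

```python
def count_comparisons(arr, element):
    """Linear search that also reports how many comparisons it made."""
    comparisons = 0
    for i in range(len(arr)):
        comparisons += 1
        if arr[i] == element:
            return i, comparisons
    return None, comparisons

arr = [2, 45, 34, 67, 4, 23, 56]
print(count_comparisons(arr, 2))   # best case: the first element matches
print(count_comparisons(arr, 56))  # worst case (found): the last element matches
print(count_comparisons(arr, 99))  # worst case (missing): every element is checked
```

The first call needs a single comparison, while the other two need one comparison per element in the array.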


### Binary Search
Only works for sorted arrays. We look at the mid-point of the array and ask whether the element is at the mid-point, on the right side (higher values), or on the left side (lower values). If the element is at the mid-point, we return its index. If not, we focus on the side (sub-array) where it must lie and repeat the process until we find the element or the sub-array is empty.
- Worst-case scenario O(log n), i.e. each step halves the remaining search space, so doubling the number of elements adds only one more step
- Best-case scenario Ω(1)

In [21]:
### Day 1 Challenge: Write a script to search for an element in an array using binary search
def binarysearch(arr,element):
    left = 0
    right = len(arr) - 1 ## Last valid index; len(arr) itself would be out of range

    while left<=right:
        mid = (left+right)//2 ## Floor division (//) drops the decimal part
        if arr[mid] == element:
            return mid
        elif arr[mid] > element:
            right = mid - 1 ## Element can only be in the left half
        else:
            left = mid + 1 ## Element can only be in the right half
    return None ## If element is not in the array


arr = [2,4,6,7,9,10,13] ## Binary search only works for a sorted array
element = 10

print(binarysearch(arr,element))

5
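To see the O(log n) behaviour in practice, the sketch below counts loop iterations on much larger sorted arrays. The function name `binarysearch_steps` and the array sizes are assumptions made for illustration.

```python
def binarysearch_steps(arr, element):
    """Binary search on a sorted list; returns (index or None, loop iterations used)."""
    left, right = 0, len(arr) - 1
    steps = 0
    while left <= right:
        steps += 1
        mid = (left + right) // 2
        if arr[mid] == element:
            return mid, steps
        elif arr[mid] > element:
            right = mid - 1
        else:
            left = mid + 1
    return None, steps

for n in [1_000, 1_000_000]:
    arr = list(range(n))                        # already sorted
    _, steps = binarysearch_steps(arr, n - 1)   # search for the last element
    print(n, steps)
```

Going from a thousand to a million elements multiplies the input by 1,000 but only roughly doubles the number of steps, since the step count never exceeds floor(log2 n) + 1.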


## Day 2: Introduction to Data Structures

### Big-O Notation – Space Complexity
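Space complexity applies the same notation to the extra memory an algorithm needs, rather than to the number of operations it performs. A minimal sketch, with function names that are illustrative and not from the course material:

```python
def total_constant_space(arr):
    """Sum the values using one running total: O(1) extra space."""
    total = 0
    for value in arr:
        total += value
    return total

def doubled_linear_space(arr):
    """Build a new list as long as the input: O(n) extra space."""
    doubled = []
    for value in arr:
        doubled.append(value * 2)
    return doubled

arr = [3, 4, 6, 7, 8]
print(total_constant_space(arr))
print(doubled_linear_space(arr))
```

Both functions run in O(n) time, but only the second one needs memory that grows with the size of the input.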

### What are Data Structures?
A way to store, organise and manage the data/information that we use in our programs. There are primitive data types (Integer, Float, String, Boolean) and non-primitive data structures. Here is how we can categorise non-primitive data structures:
1. Linear Data Structures
    - Arrays
    - Linked Lists
    - Stacks
    - Queues
2. Non-Linear Data Structures
    - Hashmaps
    - Trees
    - Graphs

Here are the operations we care about when choosing data structures:

- Accessing data
- Searching for data
- Inserting data
- Deleting data

In [None]:
arr = [4, "we", 5]
arr = ["Cyril", "Gerald", "Ivy", "Shadrach"]
arr = [4,5,6,7,8]
arr = [[4,5,6,7,8],[4,5,6,7,8],[4,5,6,7,8],[4,5,6,7,8]]

In [15]:
arr = [4,5,6,7,8]
arr[2] = 9
arr

[4, 5, 9, 7, 8]

In [14]:
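arr = [4,5,6,7,8] ## Reset the array so this search does not depend on the previous cell
for i in range(len(arr)):
    if arr[i] == 6:
        print(i) ## Linear search: prints the index of the value 6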

2


### Arrays
- List of items of the same data type (typically a primitive data type).
- Array size cannot be changed once created (fixed size). There are caveats: Python lists can grow, and Java offers both fixed-size arrays and dynamic ArrayLists

#### Common operations in arrays
- Accessing an element (Use of indexes, usually integers, to access data) O(1)
- Replacing values in an array O(1)
- Searching O(n)
- Inserting O(n)
- Deleting O(n)
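Inserting and deleting are O(n) because every element after the affected position has to be shifted by one slot. The sketch below spells out that shifting by hand; Python's built-in `list.insert` and `del` do the equivalent work internally. The helper names `insert_at` and `delete_at` are hypothetical.

```python
def insert_at(arr, index, value):
    """Insert value at index by shifting later elements one slot right: O(n)."""
    arr.append(None)                          # make room at the end
    for i in range(len(arr) - 1, index, -1):
        arr[i] = arr[i - 1]                   # shift each element right
    arr[index] = value

def delete_at(arr, index):
    """Delete the element at index by shifting later elements one slot left: O(n)."""
    for i in range(index, len(arr) - 1):
        arr[i] = arr[i + 1]                   # shift each element left
    arr.pop()                                 # drop the duplicated last slot

arr = [4, 5, 6, 7, 8]
insert_at(arr, 1, 99)
print(arr)
delete_at(arr, 0)
print(arr)
```

The closer the position is to the front of the array, the more elements have to move, which is why the worst case is linear.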

#### Limitations of Arrays
- Fixed Size
- Inefficient when inserting and deleting

#### Alternative to Arrays: Dynamic Arrays (ArrayLists)
- Arrays that do not have a fixed size, e.g. ArrayList in Java and std::vector in C++
- ArrayLists take more memory to save the same amount of data (the underlying array holds references to objects stored elsewhere in memory, not the values themselves)
- ArrayLists are slower for the same reason (each access follows a reference to reach the actual value)
- ArrayLists cannot hold primitive values directly (they only take in objects; autoboxing is used to make this seamless)
- Python lists are dynamic arrays: they resize automatically and can hold items of any type
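One common way to build a dynamic array on top of fixed-size storage is to allocate a bigger block and copy everything over whenever the current block fills up. The toy class below sketches that idea. The class name `DynamicArray` and the growth factor of 2 are assumptions for illustration; real implementations choose their own growth strategy.

```python
class DynamicArray:
    """Toy dynamic array backed by a fixed-size list that doubles when full."""

    def __init__(self):
        self.capacity = 1
        self.size = 0
        self.slots = [None] * self.capacity

    def append(self, value):
        if self.size == self.capacity:
            self._resize(self.capacity * 2)   # assumed growth factor of 2
        self.slots[self.size] = value
        self.size += 1

    def _resize(self, new_capacity):
        new_slots = [None] * new_capacity
        for i in range(self.size):            # copy existing values: O(n)
            new_slots[i] = self.slots[i]
        self.slots = new_slots
        self.capacity = new_capacity

    def get(self, index):
        if not 0 <= index < self.size:
            raise IndexError("index out of range")
        return self.slots[index]

numbers = DynamicArray()
for value in [4, 5, 6, 7, 8]:
    numbers.append(value)
print(numbers.size, numbers.capacity)
```

Most appends are O(1); only the occasional resize costs O(n), and the spare capacity it leaves behind is part of why dynamic arrays use more memory than fixed-size ones.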

In [17]:
students = {1:"John",56:"Maureen",345:"Cyril"}
students[56]

'Maureen'

In [18]:
students[78] = "Edwin"
students

{1: 'John', 56: 'Maureen', 345: 'Cyril', 78: 'Edwin'}

In [19]:
students.pop(345)
students

{1: 'John', 56: 'Maureen', 78: 'Edwin'}

### Hashmaps (Dictionaries)
- Stores data as key-value pairs (keys replace the indexes used in arrays; keys have to be unique)
- Hashmaps use a hash function and a hash table to map each key to an index (memory address) where its value is stored
    - Hash functions should minimise collisions (two keys mapping to the same index)
    - To handle collisions: Open Addressing (probe for the next free address), Closed Addressing (chain entries in a linked list)
- Time complexity for hashmap operations:
    - O(1) for all operations on average
    - O(n) in the worst case, e.g. if the hash function is poor and many keys land on one address (one long linked list)
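The closed addressing idea can be sketched with a small toy class. This is not how Python's `dict` is implemented; the class name `ChainedHashMap`, the bucket count of 8, and the use of Python lists in place of linked lists are all assumptions made to keep the example short.

```python
class ChainedHashMap:
    """Toy hash map using closed addressing (each bucket is a chain of entries)."""

    def __init__(self, bucket_count=8):
        self.buckets = [[] for _ in range(bucket_count)]

    def _bucket(self, key):
        # The hash function maps a key to one bucket index
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)      # key already present: overwrite
                return
        bucket.append((key, value))           # new key: add to the chain

    def get(self, key):
        for existing_key, value in self._bucket(key):
            if existing_key == key:
                return value
        raise KeyError(key)

students = ChainedHashMap()
students.put(1, "John")
students.put(9, "Maureen")   # 9 % 8 == 1, so this key collides with key 1
print(students.get(1), students.get(9))
```

Both keys end up in the same bucket, so a lookup has to walk that chain. If every key collided, the chain would hold all n entries and lookups would degrade to O(n).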

In [None]:
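safaricom = {'01/01/2024':234,'02/02/2024':231, '23/11/2024':456}
## Convert each date key (DD/MM/YYYY) into its day of the year, e.g. 02/02/2024 is day 33
## Only January to March are handled in this sketch
for key in safaricom:
    day, month = int(key[:2]), int(key[3:5])
    if month == 1:
        print(day)
    if month == 2:
        print(day+31)
    if month == 3:
        print(day+31+29) ## 2024 is a leap year, so February has 29 days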