# Course info

**Title:** Data Structures and Algorithms (for the students of Computer Science)

**Venue:** Shiraz University, Shiraz, Iran

**Instructor:** Reza Rezazadegan

**Course webpage:** https://www.dreamintelligent.com/data-structures

**Course GitHub:** https://github.com/rezareza007/datastructs2024

**Texts:**
- Introduction to Algorithms from MIT Open Courseware, available [here](https://ocw.mit.edu/courses/6-006-introduction-to-algorithms-spring-2020/).
- Rance D. Necaise, Data Structures and Algorithms Using Python, 2011

**Prerequisites:** familiarity with Python programming, including lists, loops, conditionals and classes

**Course evaluation:**
- In-class problem solving (40 minutes a week, 5 points)
- Coding project and presentation (4 points). Projects must be in the form of a Jupyter notebook. Please check the subject of your project with me in advance.  
- Written exam (11 points)


**Course syllabus**

0- Introduction: Problems, algorithms and efficiency  
1- Sequence structure: arrays  
2- Sequence structure: linked lists  
3- Sequence structure: Queues and Stacks  
4- Sets and Maps  
5- Sorting and Searching  
6- Hashing  
7- Advanced sorting  
8- Binary Trees  
9- AVL Trees  
10- Breadth First Search    
11- Depth First Search  
12- Weighted Shortest Paths  
13- Dijkstra's algorithm  
14- Recursive algorithms  
15- Complexity
 



- A **problem** is a binary relation connecting problem inputs to correct outputs.  
If $I$ and $O$ are the spaces of inputs and outputs, respectively, then a problem is a subset $R\subset I\times O$. (In set theory, $R$ is called a relation.)

    - Example 1: $I$ can be the space of photos (e.g. 100x100 black and white photos) and $O$ the space of texts, and the relation $R$ between a photo and a text holds if the text is a description (caption) for the photo. 

    - Example 2 (optimization): Given a set $S$, $I$ is the set of functions $f: S\to \mathbb{R}$ and $O=\mathbb{R}$. The relation $R$ pairs each function $f$ with its global minimum value. Note: $S$ may be discrete or of very high dimension.

- A (deterministic) **algorithm** is a procedure that maps each input to a *single* output: $f: I\to O$. 
- An algorithm **solves** a problem if for every problem input it returns a correct output. 
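
The definitions above can be made concrete on a tiny, finite example. The following is a minimal sketch, not part of the original notes: the relation `is_correct`, the algorithm `first_max_index` and the input space `small_inputs` are hypothetical names chosen for illustration. The problem here is "return an index of a maximum element", so one input may have several correct outputs, while the algorithm returns a single one.

```python
from itertools import product

def is_correct(lst, i):
    """The relation R: (lst, i) is in R if lst[i] is a maximum of lst."""
    return 0 <= i < len(lst) and lst[i] == max(lst)

def first_max_index(lst):
    """A deterministic algorithm f: I -> O returning one output per input."""
    best = 0
    for i in range(1, len(lst)):
        if lst[i] > lst[best]:
            best = i
    return best

# A small finite input space: all non-empty lists of length <= 3 over {0, 1, 2}.
small_inputs = [list(t) for n in range(1, 4) for t in product(range(3), repeat=n)]

# The algorithm solves the problem (on this input space) if every output is correct.
solves = all(is_correct(lst, first_max_index(lst)) for lst in small_inputs)
print(solves)  # True
```

For the input `[1, 2, 2]` both indices 1 and 2 are correct outputs; the algorithm returns index 1.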

We want to not only solve problems, but to **communicate** to others that a solution to a problem is both **correct** and **efficient**. 

In this class, we try to solve problems which generalize to inputs that may be arbitrarily large. 

**Sample problem:** Given the students in this class, return either the names of two students who share the same birth date (day, month and year), or state that no such pair exists. 

**Solution:**
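
One possible solution, sketched in Python: compare every pair of students and return the first pair whose birth dates match. The record format (a list of `(name, birth_date)` tuples) and the function name `birthday_match` are assumptions made for this illustration, and the sample names and dates are made up.

```python
from datetime import date

def birthday_match(students):
    """Return the names of two students with equal birth dates, or None.

    `students` is a list of (name, birth_date) tuples.
    """
    n = len(students)
    for i in range(n):
        for j in range(i + 1, n):
            if students[i][1] == students[j][1]:
                return students[i][0], students[j][0]
    return None  # no such pair exists

students = [
    ("Ali", date(2004, 3, 21)),
    ("Sara", date(2003, 11, 2)),
    ("Omid", date(2004, 3, 21)),
]
print(birthday_match(students))  # ('Ali', 'Omid')
```

For $n$ students this version performs at most $n(n-1)/2$ comparisons.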

## Correctness 
Any computer program you write will have finite size, while an input it acts on may be arbitrarily large. Thus every algorithm we discuss in this class will need to repeat commands in the algorithm via loops or recursion, and we will be able to prove correctness of the algorithm via **induction** on the size of input. 
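
As an illustration of this style of argument, the sketch below (an example chosen here, not taken from the course material) states a loop invariant for a simple maximum-finding loop and checks it with `assert` at every iteration. The invariant plays the role of the induction hypothesis: if it holds after processing the first $k$ elements, the loop body makes it hold after $k+1$.

```python
def maximum(items):
    """Return the largest element of a non-empty list."""
    best = items[0]
    # Base case: best is the maximum of the first element alone.
    for k in range(1, len(items)):
        # Induction hypothesis: best == max(items[:k]).
        assert best == max(items[:k])
        if items[k] > best:
            best = items[k]
        # Inductive step: now best == max(items[:k + 1]).
        assert best == max(items[:k + 1])
    return best

print(maximum([3, 1, 4, 1, 5, 9, 2, 6]))  # 9
```

The assertions are only there to make the invariant visible; they recompute the maximum at every step and would be removed in real code.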


**Proof of correctness for the birthday matching algorithm:**

## Efficiency

One program is said to be more efficient than another if it can solve the same problem input using fewer resources (most importantly memory and running time).  

The resources used by a program depend on the algorithm, the hardware on which the program is run, the programming language, code optimization and even the operating system. 

We compare algorithms based on their _asymptotic performance_ relative to problem input size. This lets us ignore constant-factor differences in hardware performance.
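
A rough way to see what asymptotic comparison captures is to count basic operations instead of measuring time. The sketch below (with the hypothetical helper `count_pairwise_comparisons`) counts the comparisons made by a loop over all pairs. The count depends only on the input size $n$, not on the hardware, and doubling $n$ roughly quadruples it.

```python
def count_pairwise_comparisons(n):
    """Count the comparisons made when every pair of n items is compared once."""
    count = 0
    for i in range(n):
        for j in range(i + 1, n):
            count += 1  # one comparison of item i with item j
    return count

for n in (10, 20, 40):
    print(n, count_pairwise_comparisons(n))
# 10 45
# 20 190
# 40 780
```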

## Model of computation


To analyze running time we need a model of computation, i.e. a model of how long a computer takes to perform basic operations.

A **machine word** is a sequence of $w$ bits representing an integer from the set $\{0, \ldots, 2^w - 1\}$. 
In ordinary computers, the word length is either 32 or 64 bits. However, registers of up to 512 bits have been used (e.g. the vector registers of the Intel Xeon Phi).

A **Word-RAM processor** can perform basic binary operations on two machine words in constant 
time, including addition, subtraction, multiplication, integer division, modulo, bitwise operations, 
and binary comparisons. 

A processor can address at most $2^{w}$ memory locations. With byte addressing, this is 4 GB for a 32-bit machine and about $1.7\times 10^{10}$ GB (16 exabytes) for a 64-bit machine.
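
The arithmetic behind these figures can be checked directly. The following sketch assumes byte-addressable memory (one address per byte) and binary units (1 GiB = $2^{30}$ bytes); the helper names are hypothetical. It also emulates addition on $w$-bit words, where results wrap around modulo $2^w$, since Python integers themselves are not limited to one machine word.

```python
def addressable_gib(w):
    """Memory reachable with w-bit addresses, in GiB, assuming one byte per address."""
    return 2**w // 2**30

def add_words(a, b, w):
    """Add two w-bit machine words; the result is truncated to w bits."""
    mask = (1 << w) - 1  # largest w-bit word, 2**w - 1
    return (a + b) & mask

print(addressable_gib(32))          # 4
print(addressable_gib(64))          # 17179869184, about 1.7e10
print(add_words(2**32 - 1, 1, 32))  # 0 (wraps around)
```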


## Asymptotic notation

### Asymptotic upper bound: O Notation:  
A non-negative function g(n) is in O(f(n)) if and only if g(n) ≤ c · f(n) for some positive constant c and for all n greater than some constant $n_0$.   

E.g. $3x^2+2x+4 \in O(?)$
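
To see how the definition is used, one can pick candidate constants and check the inequality. The sketch below tests the candidate $f(x)=x^2$ with $c=9$ and $n_0=1$ (one valid choice among many, picked here for illustration): for $x\ge 1$ we have $2x\le 2x^2$ and $4\le 4x^2$, so $3x^2+2x+4\le 9x^2$. A numerical check over a finite range is a sanity test, not a proof.

```python
def g(x):
    return 3 * x**2 + 2 * x + 4

def f(x):
    return x**2

c, n0 = 9, 1  # candidate constants from the definition of O
holds = all(g(x) <= c * f(x) for x in range(n0, 10_000))
print(holds)  # True
```
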
### Asymptotic lower bound: Ω Notation:  
A non-negative function g(n) is in Ω(f(n)) if and only if c · f(n) ≤ g(n) for some positive constant c and for all n greater than some constant $n_0$.   


### Asymptotic tight bound: Θ Notation:
$\Theta(f(n))=O(f(n))\cap \Omega(f(n))$


### Examples of operations:
It is common to use the variable $n$ to represent a parameter that is linear in the problem input size.

$O(1)$: accessing the element at a specific index in an array (random access).

$O(\log n)$: repeatedly dividing an integer by a constant greater than 1 until the result becomes $<1$; binary search in a sorted array.

$O(n)$: finding the length of a null-terminated string; traversing a linked list.  

$O(n\log n)$: the merge sort algorithm.

$O(n^2)$: algorithms that compare all pairs of elements in a set, such as solving the birthday matching problem by pairwise comparison.  

$O(n^3)$: multiplying two $n\times n$ matrices with the standard algorithm; finding the minimum free energy structure of an RNA sequence of length $n$.

$O(2^n)$ (exponential): listing all the subsets of a set of $n$ elements; some community detection algorithms in network analysis are also exponential. 
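
Two of the logarithmic examples above can be sketched in a few lines. The function names are chosen here for illustration, and `binary_search` assumes the input list is sorted in ascending order. Both loops shrink the remaining work by a constant factor per iteration, which is why they take $O(\log n)$ steps.

```python
def count_halvings(n):
    """Count how many times n can be divided by 2 before the result drops below 1."""
    steps = 0
    while n >= 1:
        n /= 2
        steps += 1
    return steps

def binary_search(sorted_items, target):
    """Return an index of target in sorted_items, or -1 if it is absent."""
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            lo = mid + 1  # target can only be in the right half
        else:
            hi = mid - 1  # target can only be in the left half
    return -1

print(count_halvings(1024))                   # 11
print(binary_search([1, 3, 5, 7, 9, 11], 7))  # 3
```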


