# Course material
It's based on the edX course [**Data structures: an active learning approach**](https://stepik.org/course/579/syllabus).

# Recap on time complexity

## Big O

$f(x) = O(g(x))$ means:
- There exist constants $c > 0$ and $k$ such that $|f(x)| \le c|g(x)|$ for all $x \ge k$.
- $f(x)$ is bounded **above** by some $c \cdot g(x)$ (it grows no faster than $g(x)$).
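
As a quick worked example (the function $f(x) = 3x + 5$ is chosen here purely for illustration, it is not from the course), the constants $c = 4$ and $k = 5$ are enough to show that $f(x) = O(x)$:

$$ 3x + 5 \le 3x + x = 4x \quad \text{for all } x \ge 5 $$

Any larger $c$ or $k$ would work as well; the definition only asks for one pair to exist.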

## Big Ω
It's the mirror image of Big O:
- There exist constants $c > 0$ and $k$ such that $|f(x)| \ge c|g(x)|$ for all $x \ge k$.
- $f(x)$ is bounded **below** by some $c \cdot g(x)$ (it grows at least as fast as $g(x)$).

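## Big Θ
$f(n) = Θ(g(n))$
when $f(n) = O(g(n))$
AND $f(n) = Ω(g(n))$

Formally, there exist positive constants $k_1$, $k_2$ and $n_0$ such that, for all $n \ge n_0$,
$$k_1 \cdot g(n) \le f(n) \le k_2 \cdot g(n)$$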
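
The two-sided bound can be checked numerically. The sketch below assumes a made-up function $f(n) = 2n^{2} + 3n + 1$ and the comparison function $g(n) = n^{2}$, with the constants $k_1 = 2$ and $k_2 = 3$. None of these come from the course; they are only one choice that happens to work from $n = 4$ onward.

```c++
#include <cstdint>

// Hypothetical example function: f(n) = 2n^2 + 3n + 1.
std::uint64_t f(std::uint64_t n) { return 2 * n * n + 3 * n + 1; }

// Comparison function: g(n) = n^2.
std::uint64_t g(std::uint64_t n) { return n * n; }

// Returns true when k1 * g(n) <= f(n) <= k2 * g(n), with k1 = 2 and k2 = 3.
bool is_sandwiched(std::uint64_t n) {
    const std::uint64_t k1 = 2, k2 = 3;
    return k1 * g(n) <= f(n) && f(n) <= k2 * g(n);
}
```

With these constants the check fails at $n = 3$ and holds for every $n \ge 4$, which is why the definition only requires the bound beyond some threshold.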

## Notes
- Remember, complexity is measured against the **input size**, **not the value of the input**. This brings us back to the definition of [pseudo-polynomial vs polynomial algorithm](https://stackoverflow.com/questions/19647658/what-is-pseudopolynomial-time-how-does-it-differ-from-polynomial-time).
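
To make the size-versus-value distinction concrete, here is a minimal sketch of a naive primality test (an illustrative example, not taken from the course). The function names and the `steps` out-parameter are assumptions made for this sketch. The loop runs roughly `n` times, which is linear in the *value* of `n`, but a number written in `b` bits can be as large as $2^{b} - 1$, so the work is exponential in the input *size*.

```c++
#include <cstdint>

// Naive primality test: tries every candidate divisor below n.
// `steps` (a hypothetical out-parameter) reports how many divisors were tried.
bool is_prime_naive(std::uint64_t n, std::uint64_t& steps) {
    steps = 0;
    if (n < 2) return false;
    for (std::uint64_t d = 2; d < n; ++d) {
        ++steps;
        if (n % d == 0) return false;
    }
    return true;
}

// Number of bits needed to write n down: this is the input *size*.
int bit_length(std::uint64_t n) {
    int bits = 0;
    while (n > 0) {
        ++bits;
        n >>= 1;
    }
    return bits;
}
```

For the prime 251 (8 bits) the loop tries 249 divisors; for the prime 65521 (16 bits) it tries 65519. Doubling the number of bits multiplied the work by a factor of about 256.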

## Examples
## O(n)
```c++
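#include <iostream>
#include <vector>
using namespace std;

// Prints every element and the average: one pass over the input, so O(n).
void print_info(const vector<int>& a) {
    int n = a.size();
    float avg = 0.0;
    for(int i = 0; i < n; i++) {
        cout << "Element #" << i << " is " << a[i] << endl;
        avg += a[i];
    }
    if(n > 0) avg /= n;  // avoid dividing by zero on an empty vector
    cout << "Average is " << avg << endl;
}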
```
## $O(n^{2})$
```c++
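#include <iostream>
#include <vector>
using namespace std;

// Prints the difference for every pair (i, j) with i < j.
void dist(const vector<int>& a) {
    int n = a.size();
    for(int i = 0; i < n-1; i++) {
        for(int j = i+1; j < n; j++) {
            cout << a[j] << " - " << a[i] << " = " << (a[j]-a[i]) << endl;
        }
    }
}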
```
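The inner loop body runs $(n-1) + (n-2) + ... + 1 = n(n-1)/2$ times, so the tightest bound is $O(n^{2})$. A looser bound such as $O(n^{3})$ is still technically correct, just less informative.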
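
A small counting sketch can confirm the number of inner-loop iterations. The helper name `count_pairs` is made up for this illustration; it mirrors the loop structure of `dist` without printing anything.

```c++
#include <cstddef>

// Counts how many times the inner loop body of dist() would run
// for an input of n elements.
std::size_t count_pairs(std::size_t n) {
    std::size_t count = 0;
    for (std::size_t i = 0; i + 1 < n; i++) {
        for (std::size_t j = i + 1; j < n; j++) {
            count++;
        }
    }
    return count;
}
```

For every `n`, the returned count matches $n(n-1)/2$, for example 6 for `n = 4` and 45 for `n = 10`.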

## Trickier
```c++
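#include <iostream>
using namespace std;

// The inner loop runs n times, then n/2, then n/4, ... until n reaches 0.
void tricky(int n) {
    int operations = 0;
    while(n > 0) {
        for(int i = 0; i < n; i++) {
            cout << "Operations: " << operations++ << endl;
        }
        n /= 2;
    }
}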
```
The `while` loop continues until `n` reaches 0. For example, if `n` starts at 8, it takes the values:
```
n = 8
n = 4
n = 2
n = 1
n = 0
```
∴ the body of the `while` loop runs $\lfloor log_{2}n \rfloor + 1$ times (4 times for $n = 8$).

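But does this tell you the (tightest) time complexity is $O(n \log n)$? Not quite, because the inner `for` loop does not run `n` times on every pass.

On successive passes of the `while` loop, the `for` loop runs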
```
n
n/2
n/4
n/8
n/16
...
```
times, one line per pass of the `while` loop.

Adding these counts up gives a geometric series. **When $n$ is a power of 2**, say $n = 2^{k}$:
$$n + n/2 + n/4 + n/8 + ... + 1$$
$$ = 2^{k} + 2^{k}/2 + 2^{k}/4 + 2^{k}/8 + ... + 1$$
$$ = 2^{k} + 2^{k-1} + 2^{k-2} + 2^{k-3} + ... + 1 = 2^{k+1} - 1 = 2n - 1 $$

because the sum of a geometric series with first term $a$, ratio $r$ and $m$ terms is:
$$a + ar + ar^{2} + ar^{3} + ... + ar^{m-1} = a(1-r^{m}) / (1 - r) $$
and here:
$$a = 2^{k}$$
$$r = 2^{k-1} / 2^{k} = 1/2$$
$$m = k + 1$$
so:
$$ a(1-r^{m}) / (1 - r) $$
$$ = 2^{k}(1-(1/2)^{k+1})/(1-1/2) $$
$$ = 2^{k} \cdot 2(1-(1/2)^{k+1}) $$
$$ = 2^{k}(2-(1/2)^{k}) $$
$$ = 2 \cdot 2^{k} - 2^{k}/2^{k} $$
and because $2^{k} = n$:
$$ = 2n - n/n = 2n - 1 $$

Therefore,
$$ tricky(n) = O(2n-1) = O(n) $$

For more, look at:
- https://math.stackexchange.com/questions/401937/how-is-nn-2n-4-1-equal-to-2n-1-using-the-formula-for-geometric-series
- https://courses.edx.org/courses/course-v1:UCSanDiegoX+CSE100x+1T2018/discussion/forum/course/threads/5ac1bcd984452a083000348b

## +Alpha
$O(n \log n)$ is still a valid upper bound for `tricky(n)`, just not the tightest one.

For example, if $n = 8$, the total number of iterations is only

$$8 + 4 + 2 + 1 = 15$$

which is exactly $2n - 1$, i.e. $O(n)$.

The $O(n \log n)$ estimate would instead give

$$8 \log_{2} 8 = 8 * 3 = 24$$

which is a looser bound.
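
The iteration count can also be checked by running the loop. The sketch below uses a hypothetical helper, `count_tricky`, that keeps the loop structure of `tricky` but returns the number of inner-loop iterations instead of printing them.

```c++
#include <cstdint>

// Same loop structure as tricky(), but returns the number of
// inner-loop iterations instead of printing each one.
std::uint64_t count_tricky(std::uint64_t n) {
    std::uint64_t operations = 0;
    while (n > 0) {
        for (std::uint64_t i = 0; i < n; i++) {
            operations++;
        }
        n /= 2;
    }
    return operations;
}
```

`count_tricky(8)` returns 15 and `count_tricky(1024)` returns 2047, both equal to $2n - 1$. When $n$ is not a power of 2 the integer division rounds down, so the count stays at or below $2n - 1$ (for example, $10 + 5 + 2 + 1 = 18$ for $n = 10$).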

## Visual recap
![time complexities graphs](https://ucarecdn.com/257b05a0-5c91-44c4-b23f-93e39ca1cc1d/)

# Data structures
- A **data structure**, as implied by the name, is a particular structured way of storing data in a computer so that it can be used efficiently.
- For a data structure, both **time** complexity and **space** complexity matter.

## Computational problems
- Algorithms are simply solutions to computational problems. **A single computational problem can have numerous algorithms as solutions.**
- For example, for the problem of finding the largest element in a given array of `int`s, there are many different algorithms, and many ways to implement each of them.
- So there needs to be a way to describe the difficulty of the computational problem itself, independent of any particular algorithm.
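
As an illustration of one problem having several algorithms, here are two sketches for finding the largest element. The function names are made up for this example, and both assume a non-empty input.

```c++
#include <algorithm>
#include <vector>

// Algorithm 1: a single linear scan, O(n) time.
// Assumes `a` is non-empty.
int max_by_scan(const std::vector<int>& a) {
    int best = a[0];
    for (int x : a) {
        if (x > best) best = x;
    }
    return best;
}

// Algorithm 2: sort a copy and take the last element, O(n log n) time.
// Assumes `a` is non-empty.
int max_by_sort(std::vector<int> a) {
    std::sort(a.begin(), a.end());
    return a.back();
}
```

Both return the same answer for the same input, yet their running times differ. The problem is the same, only the algorithm changes.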

## Classes of computational problems: P, NP, NP-Hard, and NP-Complete

### P
**Polynomial**. Any computational problem that can be solved in polynomial time (or better, of course) is considered a member of P.

### NP
**Nondeterministic Polynomial time**. 
- Computational problems whose proposed solutions can be verified in polynomial time **(whether or not you can solve them in polynomial time)**
- A correct answer does not necessarily need to be computable in polynomial time
- It is NOT necessary for there to exist a polynomial-time algorithm that solves the problem optimally
- Anything you can solve in polynomial time you can also verify in polynomial time. Therefore, **P is a subset of NP**, but [it has not yet been proven whether P = NP](https://en.wikipedia.org/wiki/Millennium_Prize_Problems#P_versus_NP). => It is unknown whether every problem in NP can be solved in polynomial time.
- Then what would a problem that is in NP but **not in P** look like? Whether any such problem exists is exactly the open P versus NP question.
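
To show what "verified in polynomial time" means in practice, here is a minimal sketch of a verifier for the subset sum problem: given a list of numbers and a target, is there a subset that adds up to the target? The function name and the choice of a list of indices as the proposed solution are assumptions made for this illustration.

```c++
#include <cstddef>
#include <vector>

// Checks a proposed solution to a subset-sum instance: do the elements of
// `numbers` at the positions listed in `chosen` add up to `target`?
// Runs in time linear in the sizes of `numbers` and `chosen`.
bool verify_subset_sum(const std::vector<long long>& numbers,
                       const std::vector<std::size_t>& chosen,
                       long long target) {
    std::vector<bool> used(numbers.size(), false);
    long long sum = 0;
    for (std::size_t idx : chosen) {
        // Reject out-of-range or repeated positions.
        if (idx >= numbers.size() || used[idx]) return false;
        used[idx] = true;
        sum += numbers[idx];
    }
    return sum == target;
}
```

Checking a proposed answer takes a single pass over it, no matter how hard that answer was to find.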

### NP-hard
**Nondeterministic Polynomial-time hard**.
- A problem can be considered NP-Hard **if it is at least as hard as the hardest problems in NP**.
- A problem H is NP-Hard when every problem L in NP can be "reduced," or transformed, to problem H in polynomial time. 
- Example of an NP-hard problem: [subset sum problem](https://en.wikipedia.org/wiki/Subset_sum_problem)
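
For a feel of why subset sum is expensive to solve by plain search, the sketch below tries every non-empty subset using a bitmask. It only illustrates the naive approach, which examines up to $2^{n} - 1$ subsets, and says nothing about whether a faster algorithm exists. The function name and the `subsets_tried` out-parameter are assumptions made for this illustration.

```c++
#include <cstddef>
#include <cstdint>
#include <vector>

// Brute-force subset sum: tries every non-empty subset via a bitmask.
// Assumes numbers.size() < 64 so that the mask fits in 64 bits.
// `subsets_tried` (a hypothetical out-parameter) counts the subsets examined.
bool subset_sum_brute_force(const std::vector<long long>& numbers,
                            long long target,
                            std::uint64_t& subsets_tried) {
    subsets_tried = 0;
    const std::size_t n = numbers.size();
    const std::uint64_t total = std::uint64_t{1} << n;
    for (std::uint64_t mask = 1; mask < total; ++mask) {
        ++subsets_tried;
        long long sum = 0;
        for (std::size_t i = 0; i < n; ++i) {
            // Bit i of the mask decides whether numbers[i] is in the subset.
            if (mask & (std::uint64_t{1} << i)) sum += numbers[i];
        }
        if (sum == target) return true;
    }
    return false;
}
```

With the six numbers `{3, 34, 4, 12, 5, 2}` and a target of 30, no subset works, so all $2^{6} - 1 = 63$ subsets are examined before the function returns `false`. Each extra number doubles that worst case.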

### NP-complete
**Intersection between NP and NP-hard**.
- An NP-Hard problem is considered NP-Complete if it can be verified in polynomial time (i.e., it is also in NP).
- Example problem: [Boolean satisfiability problem](https://en.wikipedia.org/wiki/Boolean_satisfiability_problem)
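
The "also in NP" half of the definition can be sketched for Boolean satisfiability: checking whether a given assignment satisfies a formula is a single pass over the formula. The encoding below (a formula in conjunctive normal form, with literals stored as signed integers) and the name `satisfies` are assumptions made for this illustration.

```c++
#include <cstddef>
#include <cstdlib>
#include <vector>

// A literal is a non-zero int: +v means variable v, -v means NOT v
// (variables are numbered from 1). A clause is an OR of literals and a
// formula is an AND of clauses. This encoding is an assumption of the sketch.
using Clause = std::vector<int>;
using Formula = std::vector<Clause>;

// Returns true if `assignment` satisfies `formula`.
// assignment[v] holds the value of variable v; index 0 is unused.
bool satisfies(const Formula& formula, const std::vector<bool>& assignment) {
    for (const Clause& clause : formula) {
        bool clause_true = false;
        for (int lit : clause) {
            const bool value = assignment[static_cast<std::size_t>(std::abs(lit))];
            // A positive literal is true when the variable is true;
            // a negative literal is true when the variable is false.
            if ((lit > 0) == value) {
                clause_true = true;
                break;
            }
        }
        if (!clause_true) return false;
    }
    return true;
}
```

For the formula `{{1, -2}, {2, 3}}`, meaning (x1 OR NOT x2) AND (x2 OR x3), the assignment x1 = true, x2 = false, x3 = true is accepted. Verifying it takes time proportional to the size of the formula, while finding a satisfying assignment may require searching among $2^{n}$ candidates with the naive approach.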

## Diagrams to help understand the classes
![classes of computational problems](https://ucarecdn.com/12483788-925a-493c-a227-751ea01f54d2/)

Left: if P ≠ NP 

Right: if P = NP (all problems that can be verified in polynomial time can also be solved in polynomial time)

### Questions to help
These are hard to answer without looking at the diagram. Which statements are true?

- [ ] All problems in NP-Complete are also in NP-Hard
- [ ] All problems in NP-Complete are also in NP
- [ ] All problems in NP are also in NP-Complete
- [ ] All problems in P are also in NP
- [ ] P = NP
- [ ] All problems in NP-Hard are also in NP-Complete
- [ ] P ≠ NP
- [ ] All problems in NP are also in P

## Implications
Knowing the class of a given computational problem lets you estimate how much time to spend on it, or even give up on finding an exact solution, because no polynomial-time algorithm for it is known.