### Software Engineering | Interview Crash Course | Data Structure and Algorithm 

* This is a collection of helpful resources for interviews and solving Data Structures & Algorithms problems.

**Time complexity (Big O) cheat sheet**

<img width="823" alt="Big-O Complexity" src="https://github.com/Gpower01/DataStructures_-_Algorithms/assets/51031593/3039d0ee-2d91-46fd-a199-ba5f356a64ab">

#### Tips:
**Brainstorming DS&A**
- Try to figure out what data structure or algorithm is applicable. Break the problem down and try to find common patterns that you've learned. Figure out what the problem needs you to do, and think about what data structure or algorithm can accomplish it with a good time complexity.

<p>

* Once you have decided which data structures/algorithms to use, you need to construct your actual algorithm.

* **Before coding, you should think of the rough steps of the algorithm.**

### Big-O Complexity
* First let's talk about the time complexity of common operations, split by data structure/algorithm.

<p>

---

1) **Arrays (Dynamic array/list)**

Given `n = arr.length`

* Add or remove element at the end: **O(1)** - amortized
* Add or remove element at arbitrary index: **O(n)**
* Access or modify element at arbitrary index: **O(1)**
* Check if element exists: **O(n)**
* Two pointers: **O(n.k)**, where `k` is the work done at each iteration, including sliding window
* Building a prefix sum: **O(n)**
* Finding the sum of a subarray given a prefix sum: **O(1)**
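
The last two bullets can be illustrated with a minimal sketch. The helper names `build_prefix` and `range_sum` are hypothetical, chosen only for this example; the prefix array is built once in **O(n)**, after which any range sum is a single subtraction.

```python
from itertools import accumulate


def build_prefix(nums):
    # prefix[i] holds the sum of nums[0..i-1]; prefix[0] is 0 (empty range).
    return [0] + list(accumulate(nums))


def range_sum(prefix, left, right):
    # Sum of nums[left..right] inclusive, answered in O(1).
    return prefix[right + 1] - prefix[left]


nums = [5, 2, 1, 6, 3, 8]
prefix = build_prefix(nums)
print(range_sum(prefix, 1, 3))  # 2 + 1 + 6 = 9
```

The extra leading `0` avoids a special case when the range starts at index `0`.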

* Amortized **O(1)** example using a **Queue** built from two lists (each element is moved between the lists at most once):

```python
class Queue:
    def __init__(self):
        self.input = []   # Stores elements that are enqueued
        self.output = []  # Stores elements in reversed order, ready to be dequeued

    def enqueue(self, element):
        self.input.append(element)  # Append the element to the input list

    def dequeue(self):
        # Raises IndexError if the queue is empty.
        if not self.output:  # If the output list is empty
            # Move all elements from the input list to the output list, reversing the order
            while self.input:  # While the input list is not empty
                self.output.append(self.input.pop())  # Pop from input, append to output

        return self.output.pop()  # Pop and return the last element from the output list
```
    

2) **Strings (immutable)**

Given `n = s.length`
<p>

* Add or remove character: **O(n)**
* Access element at arbitrary index: **O(1)**
* Concatenation between two strings: **O(n + m)**, where `m` is the length of the other string
* Two pointers: **O(n.k)**, where `k` is the work done at each iteration, including sliding windows
* Building a string from joining an array, stringbuilder, etc.: **O(n)**
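
As a rough sketch of why joining is preferred over repeated concatenation: the two hypothetical helpers below produce the same string, but the second one may copy the partial result on every iteration because strings are immutable.

```python
def build_with_join(parts):
    # Collect the pieces first, then join once: O(n) in the total length.
    return "".join(parts)


def build_with_concat(parts):
    # Each += can copy everything built so far, so the worst case is O(n^2).
    # (CPython sometimes optimizes this, but that is not guaranteed.)
    result = ""
    for part in parts:
        result += part
    return result


parts = ["data", " ", "structures"]
print(build_with_join(parts))  # data structures
```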

<p>

---


3) **Linked Lists**

Given `n` as the number of nodes in the linked list,

<p>

* Add or remove element given pointer before add/removal location: **O(1)**
* Add or remove element given pointer at add/removal location: **O(1)** - if doubly linked 
* Add or remove element at arbitrary location/position without pointer: **O(n)**
* Access element at arbitrary position without pointer: **O(n)**
* Check if element exists: **O(n)**
* Reverse the nodes between positions `i` and `j`: **O(j - i)**
* Detect a cycle: **O(n)** using fast-slow pointers or a hash map
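
A minimal sketch of the fast-slow pointer approach to cycle detection. The `ListNode` class and `has_cycle` function are illustrative names, not part of any library.

```python
class ListNode:
    def __init__(self, val, next=None):
        self.val = val
        self.next = next


def has_cycle(head):
    # slow moves one step, fast moves two; they can only meet inside a cycle.
    slow = fast = head
    while fast and fast.next:
        slow = slow.next
        fast = fast.next.next
        if slow is fast:
            return True
    return False  # fast reached the end, so the list terminates
```

This uses **O(1)** extra space, whereas the hash map variant stores every visited node.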

<p>

---

4) **Hash table/dictionary**

Given `n = dic.length`

* Add or remove key-value pair: **O(1)**
* Check if key exists: **O(1)**
* Check if value exists: **O(n)**
* Access or modify value associated with a key: **O(1)**
* Iterate over all keys, values or both: **O(n)**

**NOTE:**
- The **O(1)** operations are constant relative to `n`. In reality, the hashing algorithm might be expensive. For example, if your keys are strings, then hashing will cost **O(m)**, where `m` is the length of the string. The operations only take constant time relative to the size of the hash map.
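
A short sketch of the difference between key and value lookups, using Python's built-in `dict`. The sample data is made up for illustration.

```python
ages = {"ana": 31, "bo": 25, "cy": 31}

has_key = "bo" in ages            # hashes the key: O(1) on average
has_value = 25 in ages.values()   # scans the values one by one: O(n)

ages["bo"] = 26                   # modify the value for an existing key: O(1)
del ages["cy"]                    # remove a key-value pair: O(1)
```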

<p>

---

5) **Set**

Given `n = set.length`

* Add or remove element: **O(1)**
* Check if element exists: **O(1)**

**NOTE:**
- The above `NOTE` also applies here.

<p>

---

6) **Stack**

Stack operations are dependent on their implementation. A stack is only required to support `pop` and `push`. If implemented with a dynamic array:

Given `n = stack.length`

* Push element: **O(1)**
* Pop element: **O(1)**
* Peek (see element at top of stack): **O(1)**
* Access or modify element at arbitrary index: **O(1)**
* Check if element exists: **O(n)**
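
In Python, a plain `list` is a dynamic array and is commonly used as a stack. A minimal sketch:

```python
stack = []
stack.append(1)        # push: O(1) amortized
stack.append(2)
stack.append(3)

top = stack[-1]        # peek at the top: O(1)
popped = stack.pop()   # pop from the top: O(1)
exists = 1 in stack    # membership check scans the list: O(n)
```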

<p>

---

7) **Queue**

Queue operations are dependent on their implementation. A queue is only required to support `dequeue` and `enqueue`. If implemented with a doubly linked list:

Given `n = queue.length`

* Enqueue element: **O(1)**
* Dequeue element: **O(1)**
* Peek (see element at front of queue): **O(1)**
* Access or modify element at arbitrary index: **O(n)**
* Check if element exists: **O(n)**

**NOTE:**
* Most programming languages implement queues in a more sophisticated manner than a simple doubly linked list. Depending on implementation, accessing elements by index may be faster than ***O(n)***, or ***O(n)*** but with a significant constant divisor.
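
As one example of such an implementation, Python's `collections.deque` supports **O(1)** appends and pops at both ends. A minimal sketch of using it as a queue:

```python
from collections import deque

queue = deque()
queue.append(1)            # enqueue at the back: O(1)
queue.append(2)
queue.append(3)

front = queue[0]           # peek at the front: O(1)
first = queue.popleft()    # dequeue from the front: O(1)
```

Using `list.pop(0)` instead would cost **O(n)** per dequeue, because every remaining element has to shift left.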

<p>

---

8) **Binary tree problems (DFS/BFS)**

Given `n` as the number of nodes in the tree.

Most algorithms will run in **O(n.k)** time, where _k_ is the work done at each node, usually **O(1)**. This is just a general rule and not always the case. We are assuming here that **BFS** is implemented with an efficient queue. 
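
A minimal BFS sketch that visits each node once and does **O(1)** work per node. `TreeNode` and `level_order` are illustrative names; the queue is a `collections.deque` so that removing from the front is **O(1)**.

```python
from collections import deque


class TreeNode:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def level_order(root):
    # Returns node values grouped by depth, top to bottom.
    if root is None:
        return []
    levels = []
    queue = deque([root])
    while queue:
        level = []
        for _ in range(len(queue)):  # process exactly one depth per pass
            node = queue.popleft()
            level.append(node.val)
            if node.left:
                queue.append(node.left)
            if node.right:
                queue.append(node.right)
        levels.append(level)
    return levels
```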

<p>

---

9) **Binary search tree**

Given n as the number of nodes in the tree:

* Add or remove element: **O(n)** worst case, **O(log n)** average case
* Check if element exists: **O(n)** worst case, **O(log n)** average case

**NOTE:**
* The average case is when the tree is well balanced - each depth is close to full. The worst case is when the tree is just a straight line.
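
To make the difference concrete, here is a rough sketch of an unbalanced BST (no self-balancing; `BSTNode`, `insert` and `height` are hypothetical names). Inserting keys in sorted order produces the "straight line" case, where the height equals the number of nodes.

```python
class BSTNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    # Walks down from the root; cost is proportional to the tree height.
    if root is None:
        return BSTNode(key)
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = BSTNode(key)
                return root
            node = node.left
        elif key > node.key:
            if node.right is None:
                node.right = BSTNode(key)
                return root
            node = node.right
        else:
            return root  # duplicate keys are ignored in this sketch


def height(root):
    # Number of nodes on the longest root-to-leaf path.
    if root is None:
        return 0
    return 1 + max(height(root.left), height(root.right))


balanced = None
for key in [4, 2, 6, 1, 3, 5, 7]:
    balanced = insert(balanced, key)

line = None
for key in [1, 2, 3, 4, 5, 6, 7]:
    line = insert(line, key)

print(height(balanced), height(line))  # 3 7
```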

<p>

---


10) **Heap/Priority Queue**

Given `n = heap.length` and talking about _min_ heaps.

* Add an element: **O(log n)**
* Delete the minimum element: **O(log n)**
* Find the minimum element: **O(1)**
* Check if element exists: **O(n)**
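
A minimal sketch using Python's `heapq` module, which maintains a min heap on top of a regular list:

```python
import heapq

heap = []
for value in [5, 1, 8, 3]:
    heapq.heappush(heap, value)   # add an element: O(log n)

smallest = heap[0]                # find the minimum: O(1)
popped = heapq.heappop(heap)      # delete the minimum: O(log n)
found = 8 in heap                 # membership check scans the list: O(n)
```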

<p>

---


11) **Binary Search**

Binary search runs in **O(log n)** in the worst case, where _n_ is the size of your initial search space.
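
A minimal sketch on a sorted list. The search space halves on every iteration, which is where the **O(log n)** bound comes from. The function name is illustrative; Python's standard library offers the `bisect` module for the same purpose.

```python
def binary_search(arr, target):
    # arr must be sorted in ascending order; returns an index or -1.
    left, right = 0, len(arr) - 1
    while left <= right:
        mid = (left + right) // 2
        if arr[mid] == target:
            return mid
        if arr[mid] < target:
            left = mid + 1    # discard the left half
        else:
            right = mid - 1   # discard the right half
    return -1
```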

<p>

---

12) **Miscellaneous**

* Sorting: **O(n.log n)**, where _n_ is the size of the data being sorted
* DFS and BFS on a graph: **O(n.k+e)**, where _n_ is the number of nodes, _e_ is the number of edges, and _k_ is the work done at each node other than iterating over edges. This simplifies to **O(n+e)** if each node is handled in **O(1)**.
* DFS and BFS **Space complexity**: typically **O(n)**, but if it's in a graph, might be **O(n+e)** to store the graph
* Dynamic programming **Time complexity**: **O(n.k)**, where _n_ is the number of states and _k_ is the work done at each state
* Dynamic programming **Space complexity**: **O(n)**, where _n_ is the number of states.
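
A rough sketch of BFS on a graph, assuming the graph is given as an adjacency list (a `dict` mapping each node to a list of neighbors; this representation is an assumption made for the example). Each node is enqueued once and each edge is examined once, giving **O(n+e)**.

```python
from collections import deque


def bfs_order(graph, start):
    seen = {start}          # O(n) extra space for visited nodes
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph[node]:     # each edge is looked at once
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return order


graph = {0: [1, 2], 1: [3], 2: [3], 3: []}
print(bfs_order(graph, 0))  # [0, 1, 2, 3]
```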

<p>

---

13) **Input Sizes vs time complexity**

The constraints of a problem can be considered hints because they indicate an upper bound on what your solution's time complexity should be. Being able to figure out the expected time complexity of a solution given the input size is a valuable skill to have.

<p>

---

**n<=10**

The expected time complexity likely has a factorial or an exponential with a base larger than `2`, for example **O(n^2 . n!)** or **O(4^n)**.

You should think about backtracking or any brute-force-esque recursive algorithm. `n <= 10` is extremely small and usually **any** algorithm that correctly finds the answer will be fast enough. 

---

**10<n<=20**

The expected time complexity likely involves **O(2^n)**. Any higher base or a factorial will be too slow (3^20 = ~3.5 billion, and 20! is much larger). A 2^n usually implies that given a collection of elements, you are considering all subsets/subsequences: for each element, there are two choices, take it or don't take it.

Again, this bound is very small, so most algorithms that are correct will probably be fast enough. Consider backtracking and recursion.
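
The "take it or don't take it" idea can be sketched with a small backtracking function (`all_subsets` is a hypothetical name). For `n` elements it produces `2^n` subsets, which is why this pattern only fits small inputs.

```python
def all_subsets(items):
    subsets = []
    current = []

    def backtrack(index):
        if index == len(items):
            subsets.append(current.copy())  # one complete set of choices
            return
        backtrack(index + 1)                # choice 1: don't take items[index]
        current.append(items[index])        # choice 2: take items[index]
        backtrack(index + 1)
        current.pop()                       # undo the choice before returning

    backtrack(0)
    return subsets


print(len(all_subsets([1, 2, 3])))  # 8
```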

---

**20<n<=100**

At this point, exponentials will be too slow. The upper bound will likely involve **O(n^3)**.

There may be solutions that run in **O(n)**, but the small bound allows brute force solutions to pass (finding the linear time solution might not be considered "easy").

Consider brute force solutions that involve nested loops. If you come up with a brute force solution, try analyzing the algorithm to find what steps are "slow", and try to improve on those steps using tools like hash map or heaps.

<p>

---

**100<n<=1,000**

In this range, a quadratic time complexity **O(n^2)** should be sufficient, as long as the constant factor isn't too large. 

Similar to the previous range, you should consider nested loops. The difference between this range and the previous one is that **O(n^2)** is usually the expected/optimal time complexity in this range, and it might not be possible to improve.

<p>

---

**1,000<n<=100,000**

`n<=10^5` is the most common constraint you will see on LeetCode. In this range, the slowest acceptable common time complexity is **O(n.log n)**, although a linear time approach **O(n)** is commonly the goal.

In this range, ask yourself if sorting the input or using a heap can be helpful. If not, then aim for an **O(n)** algorithm. Nested loops that run in **O(n^2)** are unacceptable. You will probably need to make use of a technique learned in this course to simulate a nested loop's behaviour in **O(1)** or **O(log n)**:

* Hash map
* A two pointers implementation like sliding window 
* Monotonic stack 
* Binary search 
* Heap 
* A combination of any of the above

If you have an **O(n)** algorithm, the constant factor can be reasonably large (around 40). One common theme for string problems involves looping over the characters of the alphabet at each iteration, resulting in a time complexity of **O(26n)**.
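
As a sketch of replacing a nested loop with one of the techniques listed above, the sliding window below finds the longest subarray whose sum is at most `limit`. The problem and the function name are made up for illustration, and the approach assumes all numbers are non-negative. A brute force version would check every pair of endpoints in **O(n^2)**; here each index enters and leaves the window at most once, so the total work is **O(n)**.

```python
def longest_subarray_at_most(nums, limit):
    # Assumes nums contains only non-negative numbers and limit >= 0.
    left = 0
    window_sum = 0
    best = 0
    for right, value in enumerate(nums):
        window_sum += value
        while window_sum > limit:       # shrink from the left until valid
            window_sum -= nums[left]
            left += 1
        best = max(best, right - left + 1)
    return best


print(longest_subarray_at_most([3, 1, 2, 7, 4, 2, 1, 1, 5], 8))  # 4
```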

<p>

---

**100,000<n<=1,000,000**

`n <= 10^6` is a rare constraint, and will likely require a time complexity of **O(n)**. In this range, **O(n.log n)** is usually safe as long as it has a small constant factor. You will very likely need to incorporate a hash map in some way.

<p>

---

**1,000,000<n**

With huge inputs, typically in the range of 10^9 or more, the most common acceptable time complexity will be logarithmic **O(log n)** or constant **O(1)**. In these problems, you must either significantly reduce your search space at each iteration (usually binary search) or use clever tricks to find information in constant time (like with math or a clever use of hash maps).
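
As an illustration of reducing the search space on a huge input, the sketch below computes the integer square root of `n` by binary searching over the answer. The problem choice and the name `integer_sqrt` are assumptions made for this example. Even for `n = 10^18` the loop runs only about 60 times.

```python
def integer_sqrt(n):
    # Largest integer x such that x * x <= n, for n >= 0.
    lo, hi = 0, n
    while lo < hi:
        mid = (lo + hi + 1) // 2   # round up so the loop always makes progress
        if mid * mid <= n:
            lo = mid               # mid is feasible, answer is mid or larger
        else:
            hi = mid - 1           # mid is too large
    return lo
```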

**NOTE:**

Other time complexities are possible like O($\sqrt{n}$), but this is very rare and will usually only be seen in very advanced problems.

<p>

---

14) **Sorting algorithms**

All major programming languages have a built-in method for sorting. It is usually correct to assume and say sorting costs **O(n.log n)**, where _n_ is the number of elements being sorted. For completeness, here is a chart that lists many common sorting algorithms and their complexities.

The algorithm implemented by a programming language varies; for example, Python uses **Timsort**, but in C++ the specific algorithm is not mandated and varies.


<img width="833" alt="Big-O-chart" src="https://github.com/Gpower01/DataStructures_-_Algorithms/assets/51031593/4c2e85ed-399b-4feb-9c85-3efd02e953f6">

- \*Selection sort can be implemented as a stable sort if, rather than swapping the minimum value into place, the minimum value is inserted into the first position and the intervening values are shifted up. However, this modification either requires a data structure that supports efficient insertions or deletions, such as a linked list, or leads to **O(n^2)** writes.

<p>

---

Definition of a stable sort from [wikipedia](https://en.wikipedia.org/wiki/Category:Stable_sorts): "Stable sorting algorithms maintain the relative order of records with equal keys (i.e. values). That is, a sorting algorithm is stable if whenever there are two records R and S with the same key and with R appearing before S in the original list, R will appear before S in the sorted list."
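
Python's built-in `sorted` is guaranteed to be stable, so it can be used to see the definition in action. In the sketch below (sample data made up for illustration), records with the same age keep their original relative order.

```python
records = [("bob", 25), ("alice", 30), ("carol", 25), ("dave", 30)]

# Sort by age only; names are not part of the key.
by_age = sorted(records, key=lambda record: record[1])

print(by_age)
# [('bob', 25), ('carol', 25), ('alice', 30), ('dave', 30)]
```

"bob" still comes before "carol", and "alice" before "dave", exactly as in the input.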

<p>

---

15) **General DSA flowchart**

Here's a flowchart that can help you figure out which data structure or algorithm should be used. Note that this is very general as it would be impossible to cover every single scenario.

**NOTE:**
- Advanced algorithms like Dijkstra's are excluded.

<img width="822" alt="DSA flowchart" src="https://github.com/Gpower01/DataStructures_-_Algorithms/assets/51031593/bbd109ef-f0e9-4e60-b333-83b537cd8aa3">

<p>

---

<img width="820" alt="DSA flowchart2" src="https://github.com/Gpower01/DataStructures_-_Algorithms/assets/51031593/cd47b84b-95c8-43b5-829c-08395530dcd7">


* Do you think the algorithm could be improved in terms of complexity? The answer is usually yes, especially if your algorithm is slower than **O(n)**.