## Notes
<hr>

- `data structure` → <u>A format for organizing data in an efficient way. Practically, it can be split into two things ⇒ **the interface and the implementation**</u>

- `interface` → <u>Like a "coding contract" for how we interact w/ a data structure (**How it can be used** {ex: what operations can be performed, what inputs can be expected, what outputs can be expected, etc})</u>

- `implementation` → <u>The code that makes the data structure actually work</u>
    - Details how data is **stored**
    - Details how each operation is carried out on that stored data
    - Ex: Implementation of a *Dynamic Array* might involve...
        1. Allocating memory for the list
        2. Tracking the size
        3. Rearranging the elements when an operation like `remove` is called
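
The three steps above can be sketched as a toy class. This is only an illustration, not how any particular language implements its lists: the class name `DynamicArray`, the doubling growth factor, and the helper `to_list` are all assumptions made for the example.

```python
class DynamicArray:
    """Toy dynamic array (hypothetical names; doubling growth is an assumption)."""

    def __init__(self, capacity: int = 4):
        self._capacity = max(1, capacity)        # 1. memory allocated for the list
        self._size = 0                           # 2. how many slots are actually used
        self._slots = [None] * self._capacity

    def append(self, value) -> None:
        if self._size == self._capacity:         # out of room: allocate a bigger block
            self._capacity *= 2
            new_slots = [None] * self._capacity
            new_slots[:self._size] = self._slots[:self._size]
            self._slots = new_slots
        self._slots[self._size] = value
        self._size += 1

    def remove(self, value) -> None:
        for i in range(self._size):
            if self._slots[i] == value:
                # 3. shift every later element one slot to the left
                for j in range(i, self._size - 1):
                    self._slots[j] = self._slots[j + 1]
                self._slots[self._size - 1] = None
                self._size -= 1
                return
        raise ValueError(f"{value!r} is not in the array")

    def __len__(self) -> int:
        return self._size

    def to_list(self) -> list:
        return self._slots[:self._size]
```

The interface here is just `append`, `remove`, and `len(...)`. Everything else (capacity, shifting, resizing) is implementation detail that a caller never has to see.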

<blockquote>
<p>The more important thing is to understand the <span style="color: red;"> interface </span>. Most major data structures have built-in implementations in most major programming languages. In an interview, you are expected to know how to use the built-in data structures, but you usually won't be asked to implement them yourself.</p>
</blockquote>
<hr>

## Hashing
- `hash function` → <u>A hash function is a function that takes an input and deterministically converts it to an integer that is less than a fixed size set by the programmer</u>
    - Inputs are called `keys`
    - The same input will always be converted to the same integer
    - The following example is a hash algorithm for a string of lowercase English letters:

<ol>
<li>Declare an integer <code>total</code>.</li>
<li>Iterate over the string. For each character, convert it to its position in the alphabet. For example, <code>a -&gt; 1</code>, <code>c -&gt; 3</code>, <code>z -&gt; 26</code>.</li>
<li>Take that value, and multiply it by the current position in the string (index + 1). Add this to <code>total</code>. For example, given the string <code>"abc"</code>, the <code>b</code> is at position <code>2</code> in the alphabet and position <code>2</code> in the string, so it would contribute <code>2 * 2 = 4</code> towards <code>total</code>.</li>
<li>After going through every character, <code>total</code> is the converted value.</li>
</ol>
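
A minimal sketch of the four steps above, assuming the input contains only lowercase `a`-`z`. The names `hash_string` and `bucket_index` are made up for this example. The steps as written do not cap the result, so `bucket_index` adds a modulo to keep the value below a fixed size; that last step is an assumption, not part of the listed algorithm.

```python
def hash_string(s: str) -> int:
    """Follow the listed steps: sum of (alphabet position * string position)."""
    total = 0
    for index, char in enumerate(s):
        alphabet_position = ord(char) - ord("a") + 1   # a -> 1, c -> 3, z -> 26
        total += alphabet_position * (index + 1)       # string position is index + 1
    return total


def bucket_index(s: str, size: int) -> int:
    """Assumed extra step: squeeze the hash into the range [0, size)."""
    return hash_string(s) % size


print(hash_string("abc"))        # 1*1 + 2*2 + 3*3 = 14
print(bucket_index("abc", 10))   # 14 % 10 = 4
```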
<hr>

- Normally, array indices must be integers → a hash function converts a key of (almost) any type into an integer → removes that constraint
- `Hash map` → <u>also the same thing as a `hash table` or a `dictionary`, it is just a **hash function** combined with an **array**</u>
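    - Ex: In Python the built-in hash map is `dict`, and an empty one is written `{}` (avoid naming a variable `dict`, since that shadows the built-in type)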
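
To make "hash function + array" concrete, here is a toy sketch. It is not how Python's `dict` is implemented. The class name `ToyHashMap`, the default size of 8, and the choice to keep a small list per slot for keys that land on the same index are all assumptions for illustration. It uses Python's built-in `hash()` as the hash function.

```python
class ToyHashMap:
    """Toy hash map: a fixed-size array of buckets plus a hash function."""

    def __init__(self, size: int = 8):
        # The array. Each slot holds a list of (key, value) pairs.
        self._buckets = [[] for _ in range(size)]

    def _index(self, key) -> int:
        # The hash function: any hashable key -> integer below the array size.
        return hash(key) % len(self._buckets)

    def put(self, key, value) -> None:
        bucket = self._buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:          # key already present: update it
                bucket[i] = (key, value)
                return
        bucket.append((key, value))          # new key: insert it

    def get(self, key):
        for existing_key, value in self._buckets[self._index(key)]:
            if existing_key == key:
                return value
        raise KeyError(key)


m = ToyHashMap()
m.put("apple", 3)
m.put("apple", 5)        # same key, so the value is overwritten
print(m.get("apple"))    # 5
```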
<hr>
<h3 id="comparison-with-arrays">Comparison with arrays</h3>
<p>In terms of time complexity, hash maps blow arrays out of the water. The following operations are all $O(1)$ on average for a hash map:</p>
<ul>
<li>Add an element and associate it with a value</li>
<li>Delete an element if it exists</li>
<li>Check if an element exists</li>
</ul>
<p>A hash map also has many of the same useful properties as an array with the same time complexity:</p>
<ul>
<li>Find length/number of elements</li>
<li>Update values</li>
<li>Iterate over elements</li>
</ul>
<blockquote>
<p>Hash maps are also just easier/cleaner to work with. Even if your keys are integers and you could get away with using an array, if you don't know the maximum value of your keys, then you don't know how large to make the array. With hash maps, you don't need to worry about that, since each key is converted to a new integer within the size limit anyway.</p>
</blockquote>
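
A small sketch of that point, using made-up data (`readings` and its labels are purely illustrative). With an array, the largest key dictates the allocation; with a `dict`, only the keys actually stored take up entries.

```python
# Hypothetical sparse integer keys with a label each.
readings = [(7, "low"), (1_000_000, "high"), (42, "mid")]

# Array approach: the list must be big enough for the largest key.
max_key = max(key for key, _ in readings)
as_array = [None] * (max_key + 1)
for key, label in readings:
    as_array[key] = label

# Hash map approach: no need to know the largest key up front.
as_map = {}
for key, label in readings:
    as_map[key] = label

print(len(as_array))   # 1000001 slots, almost all of them empty
print(len(as_map))     # 3 entries
```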
<p>However, from a practical perspective, there are some disadvantages to using hash maps, and it's important to know them as it is common in interviews to talk about tradeoffs.</p>
<p>The biggest disadvantage of hash maps is that for smaller input sizes, they can be slower due to overhead. Because big O ignores constants, the $O(1)$ time complexity can sometimes be deceiving: the hidden constant is noticeably larger than for a plain array index (informally, more like "$O(10)$" than $O(1)$), because every key needs to go through the hash function.</p>
<p>Hash tables can also take up more space. <span style="color: red;">Dynamic arrays are actually fixed-size arrays that resize themselves when they go beyond their capacity.</span> Hash tables are also implemented using a fixed size array - remember that the size is a limit set by the programmer. The problem is, resizing a hash table is much more expensive because every existing key needs to be re-hashed, and also a hash table may use an array that is significantly larger than the number of elements stored, resulting in a huge waste of space. Let's say you chose your limit as 10,000 items, but you only end up storing 10. Okay, you could argue that 10,000 is too large, but then what if your next test case ends up needing to store 100,000 elements? The point is, when you don't know how many elements you need to store, arrays are more flexible with resizing and not wasting space.</p>
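
One rough way to see the space overhead is `sys.getsizeof`. This is a sketch under a few assumptions: it only measures the container itself (not the integers stored in it), the exact byte counts depend on the Python version and platform, and the comparison is made on CPython, where a `dict` is typically several times larger than a `list` holding the same number of items.

```python
import sys

n = 1_000
as_list = list(range(n))
as_dict = {i: i for i in range(n)}

list_bytes = sys.getsizeof(as_list)   # size of the list object only
dict_bytes = sys.getsizeof(as_dict)   # size of the dict object only

print(f"list container: {list_bytes} bytes")
print(f"dict container: {dict_bytes} bytes")
```
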
<blockquote>
<p>Note: remember that time complexity functions only involve the variables you define. When we say that hash map operations are $O(1)$, the variable we are concerned with is usually $n$, which is the size of the hash map. However, this may be misleading.
For example, hashing a string requires $O(m)$ time, where $m$ is the length of the string. The constant time operations are only constant <strong>relative to the size of the map</strong>.</p>
</blockquote>
<hr>
<h3 id="sets">Sets</h3>
<p>A set is another data structure that is very similar to a hash table. It uses the same mechanism for hashing keys into integers. The difference between a set and hash table is that sets do not map their keys to anything. Sets are more convenient to use when you only care about checking if elements exist. You can add, remove, and check if an element exists in a set all in $O(1)$.</p>
<p>An important thing to note about sets is that they don't track frequency. If you have a set and add the same element 100 times, the first operation adds it and the next 99 do nothing.</p>
<blockquote>
<p>A set is basically a hash map if you only consider the keys.</p>
</blockquote>
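
A short sketch of the set behavior described above, using Python's built-in `set`. The variable names are only for illustration.

```python
seen = set()

# Adding the same element 100 times: the first add stores it, the rest do nothing.
for _ in range(100):
    seen.add("x")

size_after_adds = len(seen)
print(size_after_adds)     # 1
print("x" in seen)         # True
print("y" in seen)         # False

seen.remove("x")           # raises KeyError if the element is missing
print("x" in seen)         # False
```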
<hr>
<h3 id="arrays-as-keys">Arrays as keys?</h3>
<p>Hash map keys usually need to be immutable. Arrays are mutable, so how do we store an ordered collection of elements as a key? Depending on the language you're using, there are several ways to convert an array into a unique immutable key. In Python, tuples are immutable, so it's as easy as doing <code>tuple(arr)</code>. Another trick is to convert the array into a string, delimited by some character that is guaranteed to not show up in any element. For example, use a comma to separate integers: <code>[1, 51, 163] --&gt; "1,51,163"</code>.</p>
<blockquote>
<p>In some languages, there may be data structures that allow you to use mutable data structures as keys. For example, in C++ there is <code>std::map</code>. Note that these are not hash maps, but they can be used to solve similar problems.</p>
</blockquote>
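
Both tricks in Python, as a minimal sketch. The dictionary name `groups` and the stored strings are placeholders. The last part shows what happens if the list itself is used as a key.

```python
arr = [1, 51, 163]
groups = {}

# Trick 1: convert the list to an immutable tuple.
tuple_key = tuple(arr)
groups[tuple_key] = "stored under a tuple key"

# Trick 2: join the elements into a string with a safe delimiter.
string_key = ",".join(str(x) for x in arr)
groups[string_key] = "stored under a string key"

print(tuple_key)    # (1, 51, 163)
print(string_key)   # 1,51,163

# Using the list directly fails because lists are mutable (unhashable).
try:
    groups[arr] = "this will not work"
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```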

<hr>
<h3 id="interface-guide">Interface guide</h3>
<p>Here's a quick run-through of the interface for a hash map in Python:</p>

```python
# Declaration: a hash map is declared like any other variable. The literal syntax is {}
hash_map: dict = {}

# If you want to initialize it with some key-value pairs, use the following syntax:
hash_map = {1: 2,
            5: 3,
            7: 2}

# Checking if a key exists: simply use the `in` operator
1 in hash_map # True
9 in hash_map # False

# Accessing a value given a key: use square brackets, similar to an array.
# A missing key raises KeyError.
hash_map[5] # 3

# Adding or updating a key: use square brackets, similar to an array.
# If the key already exists, the value will be updated
hash_map[5] = 6

# If the key doesn't exist yet, the key-value pair will be inserted
hash_map[9] = 15

# Deleting a key: use the del keyword. The key must exist or you will get a KeyError.
del hash_map[9]

# Get size
len(hash_map) # 3

# Get keys: use .keys(). You can iterate over this using a for loop.
keys = hash_map.keys()
for key in keys:
    print(key)

# Get values: use .values(). You can iterate over this using a for loop.
values = hash_map.values()
for val in values:
    print(val)
```

Output:

```
1
5
7
2
6
2
```
