# Topic 03: Hash Tables

## Learning Objectives
- Understand how hash tables work internally
- Master Python's dict and set for O(1) lookups
- Solve problems using frequency counting
- Handle collisions and understand trade-offs

## Prerequisites
- Topic 01: Big O Notation
- Topic 02: Arrays & Strings

---

## 1. What is a Hash Table?

A hash table maps keys to values using a **hash function** that converts keys to array indices.
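
As a rough sketch of that idea, the snippet below folds a key's hash into a slot of a fixed-size array. The helper name `bucket_index` and the capacity of 8 are assumptions made for illustration; CPython's real `dict` uses a more elaborate scheme.

```python
def bucket_index(key, capacity: int = 8) -> int:
    """Map a hashable key to a slot in an array of `capacity` buckets (illustrative helper)."""
    # hash() returns an integer; the modulo folds it into range(capacity).
    return hash(key) % capacity


# Small non-negative ints hash to themselves in CPython, so this result is predictable.
print(bucket_index(42))  # 2, because 42 % 8 == 2

# String hashes are randomized per process, so only the range is guaranteed.
print(0 <= bucket_index("apple") < 8)  # True
```

Two different keys can land in the same slot, which is why every hash table needs a collision strategy.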

### Key Operations

| Operation | Average | Worst Case |
|-----------|---------|------------|
| Insert | O(1) | O(n) |
| Delete | O(1) | O(n) |
| Search | O(1) | O(n) |

The worst case occurs when many keys collide, that is, hash to the same index, so each operation degrades to a linear scan over the colliding entries.
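
To make the collision cost concrete, here is a minimal sketch of a hash table that resolves collisions by separate chaining. The class `ChainedHashTable`, its methods, and the injectable `hash_fn` parameter are hypothetical names for this example, and the table never resizes; it is not how Python's `dict` is implemented.

```python
class ChainedHashTable:
    """Toy hash table: each bucket is a list of (key, value) pairs."""

    def __init__(self, capacity: int = 8, hash_fn=hash):
        self._buckets = [[] for _ in range(capacity)]
        self._hash_fn = hash_fn  # injectable so a bad hash can be simulated

    def _bucket(self, key):
        return self._buckets[self._hash_fn(key) % len(self._buckets)]

    def put(self, key, value) -> None:
        bucket = self._bucket(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))

    def get(self, key, default=None):
        # Scans only the one bucket the key hashes to.
        for existing_key, value in self._bucket(key):
            if existing_key == key:
                return value
        return default

    def longest_chain(self) -> int:
        return max(len(bucket) for bucket in self._buckets)


# A degenerate hash sends every key to bucket 0, so lookups scan one long chain.
bad = ChainedHashTable(hash_fn=lambda key: 0)
for n in range(100):
    bad.put(n, n * n)
print(bad.get(7), bad.longest_chain())  # 49 100

# With the built-in hash, the same 100 integer keys spread over all 8 buckets.
good = ChainedHashTable()
for n in range(100):
    good.put(n, n * n)
print(good.longest_chain())  # 13
```

Real implementations also grow the bucket array as it fills, which keeps chains short and preserves the O(1) average.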

### Python Implementation
- **dict**: Key-value pairs
- **set**: Just keys (no values)
- **collections.Counter**: Frequency counting
- **collections.defaultdict**: Dict with default values
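
A quick tour of the four types listed above may help; the variable names and sample data are made up for illustration.

```python
from collections import Counter, defaultdict

# dict: key-value pairs
ages = {"ada": 36, "alan": 41}
ages["grace"] = 85
print(ages.get("linus", -1))  # -1 (missing key, so the default is returned)

# set: membership testing
seen = {3, 1, 4}
print(4 in seen, 5 in seen)  # True False

# Counter: frequency counting
letters = Counter("banana")
print(letters["a"], letters["z"])  # 3 0 (missing keys count as zero)

# defaultdict: a missing key is created with a default value on first access
positions = defaultdict(list)
for i, ch in enumerate("abca"):
    positions[ch].append(i)
print(positions["a"])  # [0, 3]
```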

## 2. Common Patterns

### Pattern 1: Frequency Counting
```python
from collections import Counter
freq = Counter([1, 2, 2, 3, 3, 3])
print(freq)  # Counter({3: 3, 2: 2, 1: 1})
```
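
For comparison, the same counts can be built by hand with a plain `dict`. The function name `count_frequencies` is an assumption for this sketch; it is meant to show roughly what `Counter` saves you from writing.

```python
from collections import Counter


def count_frequencies(values):
    """Count occurrences with a plain dict (illustrative helper)."""
    freq = {}
    for value in values:
        # get() returns 0 the first time a value is seen.
        freq[value] = freq.get(value, 0) + 1
    return freq


data = [1, 2, 2, 3, 3, 3]
manual = count_frequencies(data)
print(manual)                         # {1: 1, 2: 2, 3: 3}
print(manual == dict(Counter(data)))  # True
print(Counter(data).most_common(1))   # [(3, 3)]
```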

### Pattern 2: Two-Pass with Hash Map
```python
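arr = [2, 7, 11, 15]  # example input, chosen for illustration
# First pass: map each value to its index (a repeated value keeps its last index)
index_map = {val: i for i, val in enumerate(arr)}
# Second pass: look up related values in index_map in O(1) average time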
```
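
One possible use of this pattern is finding two positions whose values add up to a target. The function `two_sum_indices` below is a hypothetical example, not one of the exercises in this topic.

```python
from typing import Optional


def two_sum_indices(arr: list[int], target: int) -> Optional[tuple[int, int]]:
    """Return indices (i, j), i != j, with arr[i] + arr[j] == target, or None."""
    # First pass: value -> last index where it occurs.
    index_map = {val: i for i, val in enumerate(arr)}
    # Second pass: for each value, look up its complement in O(1) average time.
    for i, val in enumerate(arr):
        j = index_map.get(target - val)
        if j is not None and j != i:  # an element cannot pair with itself
            return (i, j)
    return None


print(two_sum_indices([2, 7, 11, 15], 9))  # (0, 1)
print(two_sum_indices([3, 2, 4], 6))       # (1, 2)
print(two_sum_indices([1, 2], 10))         # None
```

Storing the last index for duplicates is what lets an input like `[3, 3]` with target 6 still resolve to two distinct positions.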

### Pattern 3: Grouping by Key
```python
from collections import defaultdict
# `items` and `get_key` are placeholders for your data and your key function
groups = defaultdict(list)
for item in items:
    groups[get_key(item)].append(item)
```
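
Here is a runnable version of the grouping pattern with concrete data. The wrapper `group_by` and the sample words are assumptions for illustration; any function that returns a hashable key can be used as `get_key`.

```python
from collections import defaultdict


def group_by(items, get_key):
    """Group items into lists keyed by get_key(item) (illustrative helper)."""
    groups = defaultdict(list)
    for item in items:
        groups[get_key(item)].append(item)
    return dict(groups)  # convert so later lookups of missing keys raise KeyError


words = ["apple", "avocado", "banana", "blueberry", "cherry"]

by_first_letter = group_by(words, get_key=lambda w: w[0])
print(by_first_letter)
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}

by_length = group_by(words, len)
print(by_length)
# {5: ['apple'], 7: ['avocado'], 6: ['banana', 'cherry'], 9: ['blueberry']}
```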

---

## 3. Exercises

### Setup

In [None]:
import sys
sys.path.insert(0, '..')
from dsa_checker import check

---

### Exercise 1: First Unique Character
**Difficulty:** ⭐ Easy

**Problem:**
Given a string, find the first non-repeating character and return its index. Return -1 if none exists.

**Target Complexity:** O(n) time, O(1) space (fixed alphabet)

**Examples:**
```
Input: s = "leetcode"
Output: 0  # 'l' is first unique

Input: s = "loveleetcode"
Output: 2  # 'v' is first unique

Input: s = "aabb"
Output: -1
```

---

**🧠 Think About:**
- How do you know if a character is unique? You need to count occurrences.
- Why might you need two passes through the string?

**⚠️ Edge Cases:**
- Empty string
- All duplicates
- All unique

<details>
<summary>💡 Hint</summary>
First pass: count frequencies. Second pass: find first character with count 1.
</details>

In [None]:
def first_unique_char(s: str) -> int:
    """
    Find the index of the first non-repeating character.
    
    Args:
        s: Input string
        
    Returns:
        Index of first unique character, or -1 if none
    """
    # Your code here
    pass

In [None]:
check(first_unique_char)

---

### Exercise 2: Group Anagrams
**Difficulty:** ⭐⭐ Medium

**Problem:**
Group strings that are anagrams of each other.

**Target Complexity:** O(n × k log k) where n = number of strings, k = max length

**Examples:**
```
Input: strs = ["eat", "tea", "tan", "ate", "nat", "bat"]
Output: [["eat", "tea", "ate"], ["tan", "nat"], ["bat"]]

Input: strs = [""]
Output: [[""]]
```

---

**🧠 Think About:**
- What property do all anagrams share?
- How can you create a "signature" or key for each anagram group?

**⚠️ Edge Cases:**
- Empty strings
- All anagrams (one group)
- No anagrams (each in own group)

<details>
<summary>💡 Hint 1</summary>
Sorted characters make a good key: "eat" → "aet", "tea" → "aet"
</details>

<details>
<summary>💡 Hint 2</summary>
Use `defaultdict(list)` to group strings by their key.
</details>

In [None]:
def group_anagrams(strs: list[str]) -> list[list[str]]:
    """
    Group anagrams together.
    
    Args:
        strs: List of strings
        
    Returns:
        List of groups, where each group contains anagrams
    """
    # Your code here
    pass

In [None]:
check(group_anagrams)

---

### Exercise 3: Isomorphic Strings
**Difficulty:** ⭐ Easy

**Problem:**
Determine whether two strings are isomorphic. Strings s and t are isomorphic if the characters in s can be replaced to get t while maintaining a one-to-one mapping between characters.

**Target Complexity:** O(n) time, O(1) space (fixed alphabet)

**Examples:**
```
Input: s = "egg", t = "add"
Output: True  # e→a, g→d

Input: s = "foo", t = "bar"
Output: False  # o cannot map to both 'a' and 'r'

Input: s = "paper", t = "title"
Output: True
```

---

**🧠 Think About:**
- A mapping must be consistent: same input always gives same output
- But also: different inputs must give different outputs (one-to-one!)

**⚠️ Edge Cases:**
- Different lengths
- All same characters

<details>
<summary>💡 Hint</summary>
You need to check the mapping in BOTH directions. Use two dictionaries.
</details>

In [None]:
def isomorphic_strings(s: str, t: str) -> bool:
    """
    Check if two strings are isomorphic.
    
    Args:
        s: First string
        t: Second string
        
    Returns:
        True if isomorphic, False otherwise
    """
    # Your code here
    pass

In [None]:
check(isomorphic_strings)

---

### Exercise 4: Word Pattern
**Difficulty:** ⭐ Easy

**Problem:**
Given a pattern and a string, determine if the string follows the same pattern.

**Target Complexity:** O(n) time, O(n) space

**Examples:**
```
Input: pattern = "abba", s = "dog cat cat dog"
Output: True

Input: pattern = "abba", s = "dog cat cat fish"
Output: False
```

---

**🧠 Think About:**
- This is similar to isomorphic strings but with characters → words
- What's the first thing to check?

**⚠️ Edge Cases:**
- Different number of pattern chars and words
- Same word for different pattern characters

<details>
<summary>💡 Hint</summary>
Split the string into words first. Then apply the same bidirectional mapping logic as isomorphic strings.
</details>

In [None]:
def word_pattern(pattern: str, s: str) -> bool:
    """
    Check if string s follows the given pattern.
    
    Args:
        pattern: Pattern of characters
        s: Space-separated string of words
        
    Returns:
        True if s follows pattern
    """
    # Your code here
    pass

In [None]:
check(word_pattern)

---

### Exercise 5: Intersection of Two Arrays
**Difficulty:** ⭐ Easy

**Problem:**
Find the intersection of two arrays. Each element in the result must be unique.

**Target Complexity:** O(n + m) time, O(min(n, m)) space

**Examples:**
```
Input: nums1 = [1, 2, 2, 1], nums2 = [2, 2]
Output: [2]

Input: nums1 = [4, 9, 5], nums2 = [9, 4, 9, 8, 4]
Output: [9, 4] or [4, 9]
```

---

**🧠 Think About:**
- What data structure automatically removes duplicates?
- What operation finds common elements between sets?

**⚠️ Edge Cases:**
- No common elements
- One or both arrays empty

<details>
<summary>💡 Hint</summary>
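Convert both arrays to sets and find their intersection. To meet the O(min(n, m)) space target, build a set from the smaller array only and scan the larger one against it.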
</details>

In [None]:
def intersection_of_arrays(nums1: list[int], nums2: list[int]) -> list[int]:
    """
    Find the intersection of two arrays.
    
    Args:
        nums1: First array
        nums2: Second array
        
    Returns:
        List of unique elements in both arrays
    """
    # Your code here
    pass

In [None]:
check(intersection_of_arrays)

---

### Exercise 6: Longest Consecutive Sequence
**Difficulty:** ⭐⭐ Medium

**Problem:**
Given an unsorted list of integers, find the length of the longest run of consecutive integers (values n, n+1, n+2, ...) that can be formed from its elements. Must run in O(n) time.

**Target Complexity:** O(n) time, O(n) space

**Examples:**
```
Input: nums = [100, 4, 200, 1, 3, 2]
Output: 4  # [1, 2, 3, 4]

Input: nums = [0, 3, 7, 2, 5, 8, 4, 6, 0, 1]
Output: 9
```

---

**🧠 Think About:**
- Sorting would work but is O(n log n). How can you do O(n)?
- When you find a number, how do you check if consecutive numbers exist?
- How do you avoid counting the same sequence multiple times?

**⚠️ Edge Cases:**
- Empty array
- All same elements
- Already sorted

<details>
<summary>💡 Hint 1</summary>
Use a set for O(1) lookups.
</details>

<details>
<summary>💡 Hint 2</summary>
Only start counting from a number that's the START of a sequence (i.e., num-1 is not in the set).
</details>

In [None]:
def longest_consecutive(nums: list[int]) -> int:
    """
    Find length of longest consecutive sequence.
    
    Args:
        nums: List of integers
        
    Returns:
        Length of longest consecutive sequence
    """
    # Your code here
    pass

In [None]:
check(longest_consecutive)

---

### Exercise 7: Subarray Sum Equals K
**Difficulty:** ⭐⭐ Medium

**Problem:**
Count the number of contiguous subarrays whose elements sum to k.

**Target Complexity:** O(n) time, O(n) space

**Examples:**
```
Input: nums = [1, 1, 1], k = 2
Output: 2

Input: nums = [1, 2, 3], k = 3
Output: 2  # [1,2] and [3]
```

---

**🧠 Think About:**
- Brute force checks all O(n²) subarrays. How can hash maps help?
- If prefix_sum[j] - prefix_sum[i] = k, what does that tell you?
- What should you store in your hash map?

**⚠️ Edge Cases:**
- Negative numbers
- k = 0
- Single element equals k

<details>
<summary>💡 Hint 1</summary>
Use prefix sums. If the current prefix minus k was seen before, you found a valid subarray.
</details>

<details>
<summary>💡 Hint 2</summary>
Store counts of prefix sums in a hash map. Initialize it to handle subarrays starting at index 0.
</details>

In [None]:
def subarray_sum_equals_k(nums: list[int], k: int) -> int:
    """
    Count subarrays with sum equal to k.
    
    Args:
        nums: List of integers
        k: Target sum
        
    Returns:
        Number of subarrays with sum k
    """
    # Your code here
    pass

In [None]:
check(subarray_sum_equals_k)

---

### Exercise 8: Top K Frequent Elements
**Difficulty:** ⭐⭐ Medium

**Problem:**
Return the k most frequent elements. Answer can be in any order.

**Target Complexity:** O(n) average time

**Examples:**
```
Input: nums = [1, 1, 1, 2, 2, 3], k = 2
Output: [1, 2]

Input: nums = [1], k = 1
Output: [1]
```

---

**🧠 Think About:**
- How do you count frequencies efficiently?
- Once you have frequencies, how do you find the top k?
- Can you avoid full sorting?

**⚠️ Edge Cases:**
- k equals number of unique elements
- All same frequency

<details>
<summary>💡 Hint 1</summary>
First count frequencies using a dictionary or Counter.
</details>

<details>
<summary>💡 Hint 2</summary>
Use `Counter.most_common(k)` to get the k most common elements, or use bucket sort for O(n) time.
</details>

In [None]:
def top_k_frequent(nums: list[int], k: int) -> list[int]:
    """
    Find k most frequent elements.
    
    Args:
        nums: List of integers
        k: Number of top elements to return
        
    Returns:
        List of k most frequent elements
    """
    # Your code here
    pass

In [None]:
check(top_k_frequent)

---

## 4. Summary

- Hash tables provide O(1) average-case lookup, insertion, and deletion; the worst case is O(n) when many keys collide
- Use `dict` for key-value pairs, `set` for membership testing
- `Counter` simplifies frequency counting
- `defaultdict` supplies a default value for missing keys, avoiding `KeyError`
- Common patterns: frequency counting, grouping, prefix sums with hash maps

## Next Steps
Continue to **Topic 04: Two Pointers & Sliding Window** for efficient array traversal techniques.