# Data Structures Interview Problems

This notebook contains common data structure problems asked in technical interviews, organized by category with multiple solution approaches.

## Table of Contents
1. [Arrays & Strings](#arrays)
2. [Linked Lists](#linked-lists)
3. [Trees](#trees)
4. [Hash Tables](#hash-tables)

<a id='arrays'></a>
## 1. Arrays & Strings

### Problem 1: Two Sum (Easy)

**Problem Statement:**
Given an array of integers `nums` and an integer `target`, return indices of the two numbers that add up to `target`.

**Constraints:**
- Each input has exactly one solution
- You may not use the same element twice
- 2 <= nums.length <= 10^4

**Examples:**
```
Input: nums = [2,7,11,15], target = 9
Output: [0,1]
Explanation: nums[0] + nums[1] = 2 + 7 = 9

Input: nums = [3,2,4], target = 6
Output: [1,2]
```

In [None]:
# Solution 1: Brute Force
def two_sum_brute(nums: list[int], target: int) -> list[int]:
    """
    Time Complexity: O(n²) - nested loops
    Space Complexity: O(1) - no extra space
    """
    for i in range(len(nums)):
        for j in range(i + 1, len(nums)):
            if nums[i] + nums[j] == target:
                return [i, j]
    return []

# Solution 2: Hash Map (Optimized)
def two_sum_optimal(nums: list[int], target: int) -> list[int]:
    """
    Time Complexity: O(n) - single pass
    Space Complexity: O(n) - hash map storage
    
    Approach: Store seen numbers with their indices in a hash map.
    For each number, check if (target - number) exists in the map.
    """
    seen = {}  # value -> index
    for i, num in enumerate(nums):
        complement = target - num
        if complement in seen:
            return [seen[complement], i]
        seen[num] = i
    return []

# Test cases
assert two_sum_optimal([2, 7, 11, 15], 9) == [0, 1]
assert two_sum_optimal([3, 2, 4], 6) == [1, 2]
assert two_sum_optimal([3, 3], 6) == [0, 1]
print("✓ All Two Sum tests passed!")

**Common Pitfalls:**
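- Using the same element twice (the pair must have i != j)
- Mishandling duplicate values: `[3, 3]` with target 6 must return `[0, 1]`, which works because the map is checked before the current number is inserted
- Assuming the array is sorted (it is not guaranteed to be)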

**Follow-up Questions:**
- What if the array is sorted? (Use two pointers)
- What if we need all pairs? (Return list of pairs)
- What about Three Sum? (Sort first, then fix one element and run two pointers on the rest: O(n²))
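
For the sorted-array follow-up, a two-pointer scan avoids the hash map entirely. The sketch below assumes `nums` is already sorted in ascending order; `two_sum_sorted` is a name introduced here for illustration, not part of the solutions above.

```python
def two_sum_sorted(nums: list[int], target: int) -> list[int]:
    """Two-pointer search. Assumes nums is sorted in ascending order."""
    left, right = 0, len(nums) - 1
    while left < right:
        total = nums[left] + nums[right]
        if total == target:
            return [left, right]
        if total < target:
            left += 1    # sum too small: move to a larger left value
        else:
            right -= 1   # sum too large: move to a smaller right value
    return []


print(two_sum_sorted([2, 7, 11, 15], 9))  # [0, 1]
```

This runs in O(n) time and O(1) extra space, but only because the input is sorted; sorting an unsorted array first would cost O(n log n) and lose the original indices.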

### Problem 2: Valid Parentheses (Easy)

**Problem Statement:**
Given a string containing just the characters `'(', ')', '{', '}', '[', ']'`, determine if the input string is valid.

**Valid means:**
1. Open brackets must be closed by the same type of brackets
2. Open brackets must be closed in the correct order

**Examples:**
```
Input: "()"
Output: true

Input: "()[]{}"
Output: true

Input: "(]"
Output: false
```

In [None]:
def is_valid_parentheses(s: str) -> bool:
    """
    Time Complexity: O(n) - single pass through string
    Space Complexity: O(n) - stack can grow to n in the worst case (e.g. all opening brackets)
    
    Approach: Use a stack to track opening brackets.
    When we see a closing bracket, verify it matches the most recent opening bracket.
    """
    stack = []
    pairs = {'(': ')', '{': '}', '[': ']'}
    
    for char in s:
        if char in pairs:  # Opening bracket
            stack.append(char)
        else:  # Closing bracket
            if not stack or pairs[stack.pop()] != char:
                return False
    
    return len(stack) == 0  # All brackets must be closed

# Test cases
assert is_valid_parentheses("()")
assert is_valid_parentheses("()[]{}")
assert not is_valid_parentheses("(]")
assert not is_valid_parentheses("([)]")
assert is_valid_parentheses("{[]}")
assert is_valid_parentheses("")
assert not is_valid_parentheses("(")
print("✓ All Valid Parentheses tests passed!")

**Common Pitfalls:**
- Forgetting to check if stack is empty before popping
- Not checking if stack is empty at the end (unclosed brackets)
- Wrong mapping of bracket pairs

**Follow-up Questions:**
- How would you handle nested structures like `"((()))"`?
- What if we need to count the minimum number of additions to make it valid?
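- Can you solve it without using extra space? (With a single bracket type a counter is enough; with several types a stack is the practical way to remember which bracket is currently open)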
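
The minimum-additions follow-up has a compact greedy answer when the string contains a single bracket type. The sketch below assumes the input holds only `'('` and `')'`, which is a narrower setting than the three-type problem above; `min_add_to_make_valid` is a hypothetical name.

```python
def min_add_to_make_valid(s: str) -> int:
    """Count insertions needed to balance a string of '(' and ')' only."""
    unmatched_open = 0    # '(' seen so far that still need a ')'
    missing_open = 0      # ')' seen with no '(' available to match
    for char in s:
        if char == '(':
            unmatched_open += 1
        elif unmatched_open > 0:
            unmatched_open -= 1   # this ')' closes an earlier '('
        else:
            missing_open += 1     # this ')' needs a '(' inserted before it
    return unmatched_open + missing_open


print(min_add_to_make_valid("())"))  # 1
```

Only two counters are kept, so the extra space is O(1).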

### Problem 3: Longest Substring Without Repeating Characters (Medium)

**Problem Statement:**
Given a string `s`, find the length of the longest substring without repeating characters.

**Examples:**
```
Input: "abcabcbb"
Output: 3
Explanation: "abc" is the longest substring without repeating characters

Input: "bbbbb"
Output: 1
Explanation: "b" is the longest

Input: "pwwkew"
Output: 3
Explanation: "wke" is the longest
```

In [None]:
# Solution 1: Brute Force
def longest_substring_brute(s: str) -> int:
    """
    Time Complexity: O(n³) - check all substrings and verify uniqueness
    Space Complexity: O(min(n, m)) - where m is charset size
    """
    def all_unique(substring: str) -> bool:
        return len(set(substring)) == len(substring)
    
    max_len = 0
    for i in range(len(s)):
        for j in range(i + 1, len(s) + 1):
            if all_unique(s[i:j]):
                max_len = max(max_len, j - i)
    return max_len

# Solution 2: Sliding Window (Optimized)
def longest_substring_optimal(s: str) -> int:
    """
    Time Complexity: O(n) - single pass with sliding window
    Space Complexity: O(min(n, m)) - hash set storage
    
    Approach: Use sliding window with two pointers.
    Expand right pointer and add characters to set.
    When duplicate found, shrink from left until duplicate removed.
    """
    char_set = set()
    left = 0
    max_len = 0
    
    for right in range(len(s)):
        # Remove characters from left until no duplicate
        while s[right] in char_set:
            char_set.remove(s[left])
            left += 1
        
        char_set.add(s[right])
        max_len = max(max_len, right - left + 1)
    
    return max_len

# Solution 3: Optimized Sliding Window with Hash Map
def longest_substring_optimal_v2(s: str) -> int:
    """
    Time Complexity: O(n) - single pass
    Space Complexity: O(min(n, m)) - hash map storage
    
    Approach: Use hash map to store last seen index of each character.
    Jump left pointer directly to after the duplicate.
    """
    char_index = {}  # character -> last seen index
    left = 0
    max_len = 0
    
    for right, char in enumerate(s):
        # If char seen and in current window, move left pointer
        if char in char_index and char_index[char] >= left:
            left = char_index[char] + 1
        
        char_index[char] = right
        max_len = max(max_len, right - left + 1)
    
    return max_len

# Test cases
assert longest_substring_optimal("abcabcbb") == 3
assert longest_substring_optimal("bbbbb") == 1
assert longest_substring_optimal("pwwkew") == 3
assert longest_substring_optimal("") == 0
assert longest_substring_optimal("abcdefg") == 7
assert longest_substring_optimal_v2("tmmzuxt") == 5  # "mzuxt"
print("✓ All Longest Substring tests passed!")

**Key Pattern: Sliding Window**
- Used for contiguous subarray/substring problems
- Two pointers: expand right, contract left
- Track window state (set/map)
- Update max on each valid window

**Follow-up Questions:**
- What if we need the actual substring? (Track start index)
- What if we allow at most k repeating characters? (Modified sliding window)
- How would you handle Unicode characters? (Same approach, just larger charset)
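
For the "actual substring" follow-up, one possible sketch extends the hash-map sliding window by remembering where the best window started. The function name `longest_substring_text` is introduced here for illustration; when several windows tie, this version keeps the first one found.

```python
def longest_substring_text(s: str) -> str:
    """Return the first longest substring of s with no repeated characters."""
    last_seen = {}            # character -> last index where it appeared
    left = 0                  # start of the current window
    best_start, best_len = 0, 0

    for right, char in enumerate(s):
        # Jump left past the previous occurrence if it is inside the window
        if char in last_seen and last_seen[char] >= left:
            left = last_seen[char] + 1
        last_seen[char] = right

        window_len = right - left + 1
        if window_len > best_len:          # strict '>' keeps the earliest winner
            best_start, best_len = left, window_len

    return s[best_start:best_start + best_len]


print(longest_substring_text("pwwkew"))  # wke
```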

### Problem 4: Group Anagrams (Medium)

**Problem Statement:**
Given an array of strings, group anagrams together. The groups, and the words within each group, may be returned in any order.

**Examples:**
```
Input: ["eat","tea","tan","ate","nat","bat"]
Output: [["bat"],["nat","tan"],["ate","eat","tea"]]
```

In [None]:
from collections import defaultdict

# Solution 1: Sorted String as Key
def group_anagrams_v1(strs: list[str]) -> list[list[str]]:
    """
    Time Complexity: O(n * k log k) - where n is array length, k is max string length
    Space Complexity: O(n * k) - storing all strings
    
    Approach: Use sorted string as hash key.
    Anagrams will have the same sorted representation.
    """
    anagram_map = defaultdict(list)
    
    for word in strs:
        sorted_word = ''.join(sorted(word))
        anagram_map[sorted_word].append(word)
    
    return list(anagram_map.values())

# Solution 2: Character Count as Key
def group_anagrams_v2(strs: list[str]) -> list[list[str]]:
    """
    Time Complexity: O(n * k) - where n is array length, k is max string length
    Space Complexity: O(n * k)
    
    Approach: Use character count tuple as hash key.
    Avoids the per-word sort by counting into 26 fixed slots.
    Assumes every word contains only lowercase a-z.
    """
    anagram_map = defaultdict(list)
    
    for word in strs:
        # Count each character (a-z)
        count = [0] * 26
        for char in word:
            count[ord(char) - ord('a')] += 1
        
        # Use tuple as key (lists can't be dict keys)
        anagram_map[tuple(count)].append(word)
    
    return list(anagram_map.values())

# Test cases
result = group_anagrams_v1(["eat","tea","tan","ate","nat","bat"])
# Sort for consistent comparison
result = [sorted(group) for group in result]
result.sort()
expected = [["ate","eat","tea"], ["bat"], ["nat","tan"]]
expected.sort()
assert result == expected

assert group_anagrams_v2([""]) == [[""]]
assert group_anagrams_v2(["a"]) == [["a"]]
print("✓ All Group Anagrams tests passed!")

**Key Insights:**
- Hash maps are perfect for grouping problems
- Choose the right key representation
- Sorting vs counting trade-offs

**Follow-up Questions:**
- What if strings contain Unicode? (Use Counter or larger array)
- How would you handle very long strings? (Counting is O(k) per string versus O(k log k) for sorting, so counting scales better as strings grow)
- Can you do it with O(1) space? (No, need to store groups)
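
For the Unicode follow-up, a `Counter`-based key removes the 26-letter assumption. This is a sketch rather than a drop-in replacement: `group_anagrams_unicode` is a hypothetical name, and it compares raw code points without any Unicode normalization, so visually identical characters built from different code points would land in different groups.

```python
from collections import Counter, defaultdict


def group_anagrams_unicode(strs: list[str]) -> list[list[str]]:
    """Group anagrams over any characters, keyed by character counts."""
    groups = defaultdict(list)
    for word in strs:
        # A frozenset of (char, count) pairs is hashable and order-independent
        key = frozenset(Counter(word).items())
        groups[key].append(word)
    return list(groups.values())


print(group_anagrams_unicode(["añb", "bña", "xyz"]))
```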

<a id='linked-lists'></a>
## 2. Linked Lists

In [None]:
# Helper class for linked list problems
class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next
    
    def __repr__(self):
        return f"ListNode({self.val})"

def list_to_linkedlist(arr: list) -> ListNode:
    """Helper to create linked list from array"""
    if not arr:
        return None
    head = ListNode(arr[0])
    current = head
    for val in arr[1:]:
        current.next = ListNode(val)
        current = current.next
    return head

def linkedlist_to_list(head: ListNode) -> list:
    """Helper to convert linked list to array for testing"""
    result = []
    while head:
        result.append(head.val)
        head = head.next
    return result

### Problem 5: Reverse Linked List (Easy)

**Problem Statement:**
Reverse a singly linked list.

**Examples:**
```
Input: 1 -> 2 -> 3 -> 4 -> 5
Output: 5 -> 4 -> 3 -> 2 -> 1
```

In [None]:
# Solution 1: Iterative
def reverse_list_iterative(head: ListNode) -> ListNode:
    """
    Time Complexity: O(n)
    Space Complexity: O(1)
    
    Approach: Iterate through list, reversing pointers.
    Use three pointers: prev, current, next.
    """
    prev = None
    current = head
    
    while current:
        next_node = current.next  # Save next
        current.next = prev       # Reverse pointer
        prev = current            # Move prev forward
        current = next_node       # Move current forward
    
    return prev  # New head

# Solution 2: Recursive
def reverse_list_recursive(head: ListNode) -> ListNode:
    """
    Time Complexity: O(n)
    Space Complexity: O(n) - recursion stack
    
    Approach: Recursively reverse from the end.
    Base case: empty or single node.
    Recursive case: reverse rest, then fix pointers.
    """
    # Base cases
    if not head or not head.next:
        return head
    
    # Reverse the rest of the list
    new_head = reverse_list_recursive(head.next)
    
    # Fix pointers: head.next should point back to head
    head.next.next = head
    head.next = None
    
    return new_head

# Test cases
head = list_to_linkedlist([1, 2, 3, 4, 5])
reversed_head = reverse_list_iterative(head)
assert linkedlist_to_list(reversed_head) == [5, 4, 3, 2, 1]

head = list_to_linkedlist([1, 2])
reversed_head = reverse_list_recursive(head)
assert linkedlist_to_list(reversed_head) == [2, 1]

assert reverse_list_iterative(None) is None
print("✓ All Reverse Linked List tests passed!")

**Key Pattern: Three Pointers**
- prev: tracks previous node
- current: tracks current node
- next: saves next node before modifying pointers

**Follow-up Questions:**
- Can you reverse a linked list in place? (Yes, iterative solution)
- How would you reverse nodes in groups of k?
- What about reversing a doubly linked list?
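
For the doubly linked list follow-up, one way to think about it is that reversing amounts to swapping `prev` and `next` on every node. The `DNode` class and the helper names below are hypothetical and exist only for this sketch.

```python
class DNode:
    """Hypothetical doubly linked list node used only in this sketch."""
    def __init__(self, val):
        self.val = val
        self.prev = None
        self.next = None


def build_doubly(values):
    """Build a doubly linked list from a Python list and return its head."""
    head = tail = None
    for val in values:
        node = DNode(val)
        if tail is None:
            head = node
        else:
            tail.next = node
            node.prev = tail
        tail = node
    return head


def reverse_doubly(head):
    """Swap prev/next on each node; the last node visited is the new head."""
    current = head
    new_head = None
    while current:
        current.prev, current.next = current.next, current.prev
        new_head = current
        current = current.prev   # prev now holds the original next node
    return new_head


node = reverse_doubly(build_doubly([1, 2, 3]))
values = []
while node:
    values.append(node.val)
    node = node.next
print(values)  # [3, 2, 1]
```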

### Problem 6: Detect Cycle in Linked List (Easy)

**Problem Statement:**
Given a linked list, determine if it has a cycle.

**Follow-up:** Can you solve it using O(1) memory?

In [None]:
# Solution 1: Hash Set
def has_cycle_hashset(head: ListNode) -> bool:
    """
    Time Complexity: O(n)
    Space Complexity: O(n) - hash set storage
    
    Approach: Track visited nodes in a set.
    If we see a node twice, there's a cycle.
    """
    seen = set()
    current = head
    
    while current:
        if current in seen:
            return True
        seen.add(current)
        current = current.next
    
    return False

# Solution 2: Floyd's Cycle Detection (Fast & Slow Pointers)
def has_cycle_optimal(head: ListNode) -> bool:
    """
    Time Complexity: O(n)
    Space Complexity: O(1) - only two pointers
    
    Approach: Use two pointers at different speeds.
    If there's a cycle, fast will eventually meet slow.
    If no cycle, fast will reach the end.
    """
    if not head or not head.next:
        return False
    
    slow = head
    fast = head.next
    
    while slow != fast:
        if not fast or not fast.next:
            return False
        slow = slow.next
        fast = fast.next.next
    
    return True

# Test cases
# Create a cycle: 1 -> 2 -> 3 -> 4 -> 2 (cycle back to 2)
head = ListNode(1)
node2 = ListNode(2)
node3 = ListNode(3)
node4 = ListNode(4)
head.next = node2
node2.next = node3
node3.next = node4
node4.next = node2  # Create cycle

assert has_cycle_optimal(head)

# No cycle
head = list_to_linkedlist([1, 2, 3, 4])
assert not has_cycle_optimal(head)

assert not has_cycle_optimal(None)
print("✓ All Cycle Detection tests passed!")

**Key Pattern: Fast & Slow Pointers (Floyd's Algorithm)**
- Also called "tortoise and hare"
- Slow moves 1 step, fast moves 2 steps
- If cycle exists, they will meet
- Space-efficient for cycle detection

**Follow-up Questions:**
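- How do you find where the cycle begins? (In the standard form of Floyd's algorithm, where both pointers start at the head, reset one pointer to the head after they meet and advance both one step at a time; they meet again at the cycle start)
- What's the length of the cycle? (Once they meet, keep one pointer fixed and count the steps the other takes to return to it)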
- Can you break the cycle? (Yes, find cycle start and break link)
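
The "where does the cycle begin" follow-up can be sketched with the standard two-phase form of Floyd's algorithm. Note that this version starts both pointers at the head, unlike `has_cycle_optimal` above, because the second phase relies on that. `ListNode` is repeated so the snippet runs on its own, and `detect_cycle_start` is a name introduced here.

```python
class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next


def detect_cycle_start(head):
    """Return the node where the cycle begins, or None if there is no cycle."""
    slow = fast = head
    while fast and fast.next:
        slow = slow.next
        fast = fast.next.next
        if slow is fast:
            # Phase 2: restart one pointer at the head and move both
            # one step at a time; they meet at the cycle's first node.
            slow = head
            while slow is not fast:
                slow = slow.next
                fast = fast.next
            return slow
    return None  # fast reached the end, so the list has no cycle


# 1 -> 2 -> 3 -> 4 -> back to 2
nodes = [ListNode(v) for v in (1, 2, 3, 4)]
for a, b in zip(nodes, nodes[1:]):
    a.next = b
nodes[3].next = nodes[1]
print(detect_cycle_start(nodes[0]).val)  # 2
```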

<a id='trees'></a>
## 3. Trees

In [None]:
# Helper class for tree problems
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right
    
    def __repr__(self):
        return f"TreeNode({self.val})"

### Problem 7: Maximum Depth of Binary Tree (Easy)

**Problem Statement:**
Given a binary tree, find its maximum depth (number of nodes along the longest path from root to leaf).

**Examples:**
```
    3
   / \
  9  20
     / \
    15  7
Output: 3
```

In [None]:
# Solution 1: Recursive DFS
def max_depth_recursive(root: TreeNode) -> int:
    """
    Time Complexity: O(n) - visit each node once
    Space Complexity: O(h) - recursion stack, h is height
    
    Approach: Recursively find max depth of left and right subtrees.
    Current depth is 1 + max(left_depth, right_depth).
    """
    if not root:
        return 0
    
    left_depth = max_depth_recursive(root.left)
    right_depth = max_depth_recursive(root.right)
    
    return 1 + max(left_depth, right_depth)

# Solution 2: Iterative BFS (Level-order)
def max_depth_iterative(root: TreeNode) -> int:
    """
    Time Complexity: O(n)
    Space Complexity: O(w) - where w is max width of tree
    
    Approach: Level-order traversal using queue.
    Count number of levels.
    """
    if not root:
        return 0
    
    from collections import deque
    queue = deque([root])
    depth = 0
    
    while queue:
        depth += 1
        # Process all nodes at current level
        for _ in range(len(queue)):
            node = queue.popleft()
            if node.left:
                queue.append(node.left)
            if node.right:
                queue.append(node.right)
    
    return depth

# Test cases
root = TreeNode(3)
root.left = TreeNode(9)
root.right = TreeNode(20)
root.right.left = TreeNode(15)
root.right.right = TreeNode(7)

assert max_depth_recursive(root) == 3
assert max_depth_iterative(root) == 3
assert max_depth_recursive(None) == 0
assert max_depth_recursive(TreeNode(1)) == 1
print("✓ All Max Depth tests passed!")

**Tree Traversal Patterns:**
- **DFS (Depth-First):** Recursion or stack, explores depth first
  - Preorder: root, left, right
  - Inorder: left, root, right
  - Postorder: left, right, root
- **BFS (Breadth-First):** Queue, explores level by level
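
The three DFS orders differ only in where the root is visited relative to its subtrees. A minimal recursive sketch follows; `TreeNode` is repeated so the snippet is self-contained, and the function names are illustrative.

```python
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def preorder(root):
    """root, left, right"""
    if not root:
        return []
    return [root.val] + preorder(root.left) + preorder(root.right)


def inorder(root):
    """left, root, right"""
    if not root:
        return []
    return inorder(root.left) + [root.val] + inorder(root.right)


def postorder(root):
    """left, right, root"""
    if not root:
        return []
    return postorder(root.left) + postorder(root.right) + [root.val]


# Same tree as the Maximum Depth example: 3 -> (9, 20 -> (15, 7))
root = TreeNode(3, TreeNode(9), TreeNode(20, TreeNode(15), TreeNode(7)))
print(preorder(root))   # [3, 9, 20, 15, 7]
print(inorder(root))    # [9, 3, 15, 20, 7]
print(postorder(root))  # [9, 15, 7, 20, 3]
```

List concatenation keeps the sketch short but copies lists at every level; appending to a shared result list is the usual choice when performance matters.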

**Follow-up Questions:**
- What about minimum depth?
- How would you find diameter of tree?
- What if tree is n-ary instead of binary?
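
For the minimum-depth follow-up, BFS is a natural fit because the first leaf it reaches is the shallowest one. The sketch below is one possible approach; `min_depth` is a name introduced here, and `TreeNode` is repeated so the snippet runs standalone.

```python
from collections import deque


class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def min_depth(root):
    """Number of nodes on the shortest root-to-leaf path (0 for an empty tree)."""
    if not root:
        return 0
    queue = deque([(root, 1)])   # (node, depth of that node)
    while queue:
        node, depth = queue.popleft()
        if not node.left and not node.right:
            return depth         # first leaf seen in BFS order is the shallowest
        if node.left:
            queue.append((node.left, depth + 1))
        if node.right:
            queue.append((node.right, depth + 1))
    return 0  # not reached for a non-empty tree


root = TreeNode(3, TreeNode(9), TreeNode(20, TreeNode(15), TreeNode(7)))
print(min_depth(root))  # 2
```

A common mistake in the recursive version is taking `min(left, right)` when one child is missing, which would report depth 1 for a root with a single child; stopping at the first true leaf avoids that.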

### Problem 8: Binary Tree Level Order Traversal (Medium)

**Problem Statement:**
Given a binary tree, return the level order traversal of its nodes' values (left to right, level by level).

**Examples:**
```
    3
   / \
  9  20
     / \
    15  7
Output: [[3], [9, 20], [15, 7]]
```

In [None]:
from collections import deque

def level_order(root: TreeNode) -> list[list[int]]:
    """
    Time Complexity: O(n) - visit each node once
    Space Complexity: O(w) - where w is max width
    
    Approach: BFS using queue.
    Process nodes level by level, storing each level's values.
    """
    if not root:
        return []
    
    result = []
    queue = deque([root])
    
    while queue:
        level_size = len(queue)
        level_values = []
        
        # Process all nodes at current level
        for _ in range(level_size):
            node = queue.popleft()
            level_values.append(node.val)
            
            if node.left:
                queue.append(node.left)
            if node.right:
                queue.append(node.right)
        
        result.append(level_values)
    
    return result

# Test cases
root = TreeNode(3)
root.left = TreeNode(9)
root.right = TreeNode(20)
root.right.left = TreeNode(15)
root.right.right = TreeNode(7)

assert level_order(root) == [[3], [9, 20], [15, 7]]
assert level_order(None) == []
assert level_order(TreeNode(1)) == [[1]]
print("✓ All Level Order tests passed!")

**Key Pattern: Level-Order Traversal**
- Use queue (BFS)
- Track level size to group nodes
- Process entire level before moving to next

**Variations:**
- Zigzag level order (alternate left-right, right-left)
- Right side view (last node of each level)
- Bottom-up level order (reverse result)
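
As an illustration of the zigzag variation, the sketch below reuses the level-by-level BFS loop and reverses every second level. `zigzag_level_order` is a hypothetical name, and `TreeNode` is repeated so the snippet is self-contained.

```python
from collections import deque


class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def zigzag_level_order(root):
    """Level order traversal that alternates direction on each level."""
    if not root:
        return []
    result = []
    queue = deque([root])
    left_to_right = True
    while queue:
        level = []
        for _ in range(len(queue)):      # process exactly one level
            node = queue.popleft()
            level.append(node.val)
            if node.left:
                queue.append(node.left)
            if node.right:
                queue.append(node.right)
        # Reverse the collected values on right-to-left levels
        result.append(level if left_to_right else level[::-1])
        left_to_right = not left_to_right
    return result


root = TreeNode(3, TreeNode(9), TreeNode(20, TreeNode(15), TreeNode(7)))
print(zigzag_level_order(root))  # [[3], [20, 9], [15, 7]]
```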

<a id='hash-tables'></a>
## 4. Hash Tables

### Problem 9: LRU Cache (Medium)

**Problem Statement:**
Design a data structure that follows Least Recently Used (LRU) cache constraints.

Implement:
- `LRUCache(capacity)` - Initialize with positive size
- `get(key)` - Return value if exists, -1 otherwise
- `put(key, value)` - Update or insert value. If full, evict LRU item

Both operations should run in O(1) average time.

In [None]:
class Node:
    """Doubly linked list node"""
    def __init__(self, key=0, value=0):
        self.key = key
        self.value = value
        self.prev = None
        self.next = None

class LRUCache:
    """
    Time Complexity: O(1) for both get and put
    Space Complexity: O(capacity)
    
    Approach: Combine hash map + doubly linked list.
    - Hash map: O(1) lookup
    - Doubly linked list: O(1) insertion/deletion and maintain order
    - Most recent at head, least recent at tail
    """
    
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.cache = {}  # key -> Node
        
        # Dummy head and tail for easier list operations
        self.head = Node()
        self.tail = Node()
        self.head.next = self.tail
        self.tail.prev = self.head
    
    def _remove(self, node: Node):
        """Remove node from list"""
        node.prev.next = node.next
        node.next.prev = node.prev
    
    def _add_to_head(self, node: Node):
        """Add node right after head (most recent)"""
        node.next = self.head.next
        node.prev = self.head
        self.head.next.prev = node
        self.head.next = node
    
    def _move_to_head(self, node: Node):
        """Move existing node to head (mark as recently used)"""
        self._remove(node)
        self._add_to_head(node)
    
    def get(self, key: int) -> int:
        if key not in self.cache:
            return -1
        
        node = self.cache[key]
        self._move_to_head(node)  # Mark as recently used
        return node.value
    
    def put(self, key: int, value: int) -> None:
        if key in self.cache:
            # Update existing key
            node = self.cache[key]
            node.value = value
            self._move_to_head(node)
        else:
            # Add new key
            node = Node(key, value)
            self.cache[key] = node
            self._add_to_head(node)
            
            # Check capacity and evict if needed
            if len(self.cache) > self.capacity:
                # Remove least recently used (tail.prev)
                lru = self.tail.prev
                self._remove(lru)
                del self.cache[lru.key]

# Test cases
cache = LRUCache(2)
cache.put(1, 1)
cache.put(2, 2)
assert cache.get(1) == 1       # returns 1
cache.put(3, 3)                # evicts key 2
assert cache.get(2) == -1      # returns -1 (not found)
cache.put(4, 4)                # evicts key 1
assert cache.get(1) == -1      # returns -1 (not found)
assert cache.get(3) == 3       # returns 3
assert cache.get(4) == 4       # returns 4
print("✓ All LRU Cache tests passed!")

**Key Design Pattern: Hash Map + Doubly Linked List**
- Hash map for O(1) lookup
- Doubly linked list for O(1) insertion/deletion anywhere
- Dummy nodes simplify edge cases
- Track recency by position in list

**Follow-up Questions:**
- How would you implement LFU (Least Frequently Used) cache?
- What if we need to support expiration times?
- How would you make this thread-safe?
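
For the thread-safety follow-up, the simplest option is probably a single lock around every operation. The sketch below swaps the hand-written linked list for `collections.OrderedDict`, which keeps insertion order and can move a key to the end in O(1); the class name `ThreadSafeLRUCache` is an assumption made for illustration. Here the most recent entry sits at the end of the dict and the least recent at the front.

```python
import threading
from collections import OrderedDict


class ThreadSafeLRUCache:
    """LRU cache guarded by one lock (coarse-grained, simple to reason about)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._data = OrderedDict()       # oldest entry first, newest last
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            if key not in self._data:
                return -1
            self._data.move_to_end(key)  # mark as most recently used
            return self._data[key]

    def put(self, key, value) -> None:
        with self._lock:
            if key in self._data:
                self._data.move_to_end(key)
            self._data[key] = value
            if len(self._data) > self.capacity:
                self._data.popitem(last=False)  # evict least recently used


cache = ThreadSafeLRUCache(2)
cache.put(1, 1)
cache.put(2, 2)
cache.get(1)
cache.put(3, 3)          # evicts key 2
print(cache.get(2))      # -1
```

A single lock serializes all access, which is easy to verify but can become a bottleneck under heavy contention; sharding the cache by key is one direction to discuss if the interviewer pushes on throughput.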

## Summary

### Key Patterns Covered
1. **Hash Maps** - O(1) lookups, grouping, counting
2. **Two Pointers** - Arrays, strings, linked lists
3. **Sliding Window** - Contiguous subarrays/substrings
4. **Fast & Slow Pointers** - Cycle detection, finding middle
5. **DFS/BFS** - Tree and graph traversal
6. **Combined Data Structures** - Hash map + linked list (LRU)
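
The "finding middle" use of fast and slow pointers is not worked through above, so here is a brief sketch. `middle_node` is a name introduced for illustration, and `ListNode` is repeated so the snippet runs on its own. For an even-length list this version returns the second of the two middle nodes.

```python
class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next


def middle_node(head):
    """Return the middle node (second middle for even lengths), or None."""
    slow = fast = head
    while fast and fast.next:
        slow = slow.next          # one step
        fast = fast.next.next     # two steps
    return slow


# Build 1 -> 2 -> 3 -> 4 -> 5
head = None
for val in (5, 4, 3, 2, 1):
    head = ListNode(val, head)
print(middle_node(head).val)  # 3
```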

### Problem-Solving Checklist
- [ ] Clarify requirements and constraints
- [ ] Walk through examples (including edge cases)
- [ ] Identify pattern/approach
- [ ] Discuss time/space complexity
- [ ] Code solution
- [ ] Test with examples
- [ ] Optimize if needed

### Next Steps
- Practice more problems in each category
- Review [problems_algorithms.ipynb](problems_algorithms.ipynb) for algorithmic patterns
- Check [problems_python_specific.ipynb](problems_python_specific.ipynb) for Python-specific challenges
- Study [problems_system_design.ipynb](problems_system_design.ipynb) for design patterns