# Problem Solving with Python Sets and Dictionaries

This notebook explores a variety of problems that can be efficiently solved using Python's `set` and `dict` data structures. We will cover common use cases and demonstrate how the unique properties of sets (uniqueness, fast membership testing) and dictionaries (key-value mapping) make them powerful tools for programmers.

## 1. Dictionary Problems

### Problem 1: Word Frequency Counter

**Problem Statement:** Given a piece of text, count the frequency of each word. [11, 17] This is a classic problem in text analysis. [2, 3]

**Why Dictionaries?** Dictionaries are perfect for this task because they allow you to store a unique key (the word) and associate it with a value (its count). The fast key lookup makes updating the count for each word very efficient.

In [None]:
import re
from collections import Counter

text = """
Python is a high-level, interpreted, general-purpose programming language. 
Its design philosophy emphasizes code readability with its notable use of significant whitespace. 
Its language constructs and object-oriented approach aim to help programmers write clear, logical code for small and large-scale projects.
Python is dynamically-typed and garbage-collected. It supports multiple programming paradigms, including structured (particularly, procedural), object-oriented, and functional programming.
"""

def word_frequency(text):
    """Counts the frequency of each word in a given text."""
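    # \w+ matches runs of letters, digits, and underscores, so a hyphenated
    # word such as "high-level" is counted as two words: "high" and "level".
    words = re.findall(r'\b\w+\b', text.lower())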
    freq_dict = {}
    for word in words:
        freq_dict[word] = freq_dict.get(word, 0) + 1
    return freq_dict

def word_frequency_counter(text):
    """Counts word frequency using collections.Counter."""
    words = re.findall(r'\b\w+\b', text.lower())
    return Counter(words)

frequencies = word_frequency(text)
print("Word Frequencies (manual):")
print(frequencies)

print("\nWord Frequencies (using Counter):")
frequencies_counter = word_frequency_counter(text)
print(frequencies_counter)
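
Once the counts exist, a common next step is to ask for the most frequent words. The sketch below uses a short, made-up sample string (`sample` is a hypothetical name, not part of the notebook's data) to show `Counter.most_common` and the way a `Counter` handles missing keys.

```python
import re
from collections import Counter

# Hypothetical sample text, chosen so the counts are easy to verify by hand
sample = "the cat and the dog and the bird"
counts = Counter(re.findall(r"\b\w+\b", sample.lower()))

# most_common(n) returns the n highest counts as (word, count) pairs
top_two = counts.most_common(2)
print(top_two)  # [('the', 3), ('and', 2)]

# Unlike a plain dict, a Counter returns 0 for a missing key
# instead of raising KeyError, and it does not insert the key
print(counts["fish"])  # 0
```

With a plain dictionary, the same lookup would need `freq_dict.get("fish", 0)` to avoid a `KeyError`.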

### Problem 2: Grouping Anagrams

**Problem Statement:** Given a list of words, group the anagrams together. Anagrams are words that have the same letters but in a different order (e.g., "eat", "tea", "ate").

**Why Dictionaries?** We can use a sorted version of a word as a key. All anagrams will have the same sorted form. The values in the dictionary will be lists of the anagrams.

In [None]:
from collections import defaultdict

words = ["eat", "tea", "tan", "ate", "nat", "bat"]

def group_anagrams(words):
    """Groups anagrams together from a list of words."""
    anagram_map = defaultdict(list)
    for word in words:
        sorted_word = "".join(sorted(word))
        anagram_map[sorted_word].append(word)
    return list(anagram_map.values())

anagram_groups = group_anagrams(words)
print("Anagram Groups:")
print(anagram_groups)
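
The sorted-string key is not the only option. As an alternative sketch, the key can be a tuple of letter counts, which avoids sorting each word. The function name `group_anagrams_by_count` is hypothetical, and the code assumes every word contains only lowercase ASCII letters `a` to `z`.

```python
from collections import defaultdict

def group_anagrams_by_count(words):
    """Groups anagrams using a 26-slot letter-count tuple as the key."""
    groups = defaultdict(list)
    for word in words:
        counts = [0] * 26
        for ch in word:
            # Assumption: ch is a lowercase ASCII letter
            counts[ord(ch) - ord("a")] += 1
        # Lists are unhashable, so the counts are converted to a tuple
        groups[tuple(counts)].append(word)
    return list(groups.values())

print(group_anagrams_by_count(["eat", "tea", "tan", "ate", "nat", "bat"]))
# [['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]
```

For short words the difference is negligible, and the sorted-string key is usually easier to read. The count-based key only matters when the words are long.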

### Problem 3: Merging Dictionaries

**Problem Statement:** Combine two dictionaries into one. If there are overlapping keys, the value from the second dictionary should be used.

**Why Dictionaries?** This is a fundamental dictionary operation. Python provides several ways to do this, including the `update()` method and dictionary unpacking (`**`).

In [None]:
dict1 = {'a': 1, 'b': 2, 'c': 3}
dict2 = {'b': 20, 'd': 40}

# Method 1: Using the update() method
merged_dict1 = dict1.copy()
merged_dict1.update(dict2)
print(f"Merged with update(): {merged_dict1}")

# Method 2: Using dictionary unpacking (Python 3.5+)
merged_dict2 = {**dict1, **dict2}
print(f"Merged with unpacking: {merged_dict2}")
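
A third option, assuming the notebook runs on Python 3.9 or later, is the dictionary merge operator `|`. The sketch below repeats the two input dictionaries so it can run on its own. The names `merged_dict3` and `merged_dict4` are introduced here for illustration.

```python
dict1 = {'a': 1, 'b': 2, 'c': 3}
dict2 = {'b': 20, 'd': 40}

# Method 3: the merge operator (requires Python 3.9+); returns a new dict
# and leaves both operands unchanged
merged_dict3 = dict1 | dict2
print(f"Merged with |: {merged_dict3}")
# Merged with |: {'a': 1, 'b': 20, 'c': 3, 'd': 40}

# The in-place variant |= updates the left-hand dict, like update()
merged_dict4 = dict1.copy()
merged_dict4 |= dict2
print(f"Merged with |=: {merged_dict4}")
```

As with `update()` and unpacking, the value from the right-hand dictionary wins for overlapping keys, while the key keeps the position it had in the left-hand dictionary.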

## 2. Set Problems

### Problem 1: Finding Unique Elements

**Problem Statement:** Given a list with duplicate elements, find all the unique elements.

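**Why Sets?** Sets by definition only store unique elements, so converting a list to a set is a concise and efficient way to get the unique items. [14] Two caveats apply: the elements must be hashable, and a set does not preserve the order in which the elements appeared in the list.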

In [None]:
numbers = [1, 2, 2, 3, 4, 4, 4, 5, 6, 1]

unique_numbers = set(numbers)
print(f"Original list: {numbers}")
print(f"Unique numbers (set): {unique_numbers}")
# A set has no guaranteed order, so sort it when a predictable list is needed
print(f"Unique numbers (sorted list): {sorted(unique_numbers)}")
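
When the original order of first appearance matters, a set alone is not enough. One possible approach, sketched below, relies on the fact that dictionaries preserve insertion order (guaranteed since Python 3.7). The `words` list is a hypothetical example.

```python
# Hypothetical input with duplicates
words = ["pear", "apple", "pear", "fig", "apple"]

# dict.fromkeys builds a dict whose keys are the items, in first-seen order;
# duplicate keys are simply ignored
unique_in_order = list(dict.fromkeys(words))
print(unique_in_order)  # ['pear', 'apple', 'fig']
```

Like the set-based version, this requires the elements to be hashable.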

### Problem 2: Finding Common and Different Elements

**Problem Statement:** Given two lists, find the elements that are common to both, and the elements that are unique to each list.

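**Why Sets?** Sets provide built-in operations for exactly these tasks: intersection (`&`), union (`|`), difference (`-`), and symmetric difference (`^`), each also available as a method. Because they rely on hashing, they avoid the nested loops that comparing two lists element by element would require.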

In [None]:
list1 = [1, 2, 3, 4, 5]
list2 = [4, 5, 6, 7, 8]

set1 = set(list1)
set2 = set(list2)

# Common elements (intersection)
common_elements = set1.intersection(set2) # or set1 & set2
print(f"Common elements: {common_elements}")

# Elements only in list1 (difference)
unique_to_list1 = set1.difference(set2) # or set1 - set2
print(f"Elements only in list1: {unique_to_list1}")

# Elements only in list2 (difference)
unique_to_list2 = set2.difference(set1) # or set2 - set1
print(f"Elements only in list2: {unique_to_list2}")

# All unique elements from both lists (union)
all_elements = set1.union(set2) # or set1 | set2
print(f"All unique elements: {all_elements}")

# Elements in one list but not both (symmetric difference)
symmetric_difference = set1.symmetric_difference(set2) # or set1 ^ set2
print(f"Elements in one list but not both: {symmetric_difference}")
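
One practical difference between the method and operator forms: the methods accept any iterable, while the operators require both operands to be sets. The short sketch below illustrates this. The variable `operator_error` is introduced only to capture the error message.

```python
set1 = {1, 2, 3, 4, 5}
list2 = [4, 5, 6, 7, 8]

# The method form accepts a list directly, with no explicit set() conversion
common = set1.intersection(list2)
print(sorted(common))  # [4, 5]

# The operator form raises TypeError when the right-hand side is not a set
operator_error = None
try:
    set1 & list2
except TypeError as exc:
    operator_error = str(exc)
    print(f"Operator form failed: {operator_error}")
```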

### Problem 3: Fast Membership Testing

**Problem Statement:** You have a large collection of items, and you need to repeatedly check if a given item is in that collection.

**Why Sets?** Checking for the existence of an element in a set is, on average, an O(1) operation (constant time). This is much faster than searching through a list, which is an O(n) operation (linear time), where n is the number of elements in the list. [14]

In [None]:
import time

# Create a large list and a large set with the same elements
large_list = list(range(10_000_000))
large_set = set(large_list)

# The last element is the worst case for the list, which is scanned front to back
element_to_find = 9_999_999

# Time the search in the list.
# perf_counter() is a high-resolution clock intended for measuring short durations.
start_time = time.perf_counter()
result_list = element_to_find in large_list
end_time = time.perf_counter()
print(f"Time to find in list: {end_time - start_time:.6f} seconds")

# Time the search in the set
start_time = time.perf_counter()
result_set = element_to_find in large_set
end_time = time.perf_counter()
print(f"Time to find in set:  {end_time - start_time:.6f} seconds")
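
A single timed lookup can be noisy, especially for the set, where one lookup finishes in well under a microsecond. As a rough sketch, the standard library's `timeit` module can repeat each lookup and report an average. The collection size, the repeat count, and the names `data_list`, `data_set`, and `target` are arbitrary choices made here, not values from the cell above.

```python
import timeit

n = 1_000_000  # smaller collection, chosen to keep the run short
data_list = list(range(n))
data_set = set(data_list)
target = n - 1  # last element: worst case for the list scan

repeats = 20
# timeit.timeit calls the function `number` times and returns the total seconds
list_time = timeit.timeit(lambda: target in data_list, number=repeats)
set_time = timeit.timeit(lambda: target in data_set, number=repeats)

print(f"list: {list_time / repeats:.6f} s per lookup")
print(f"set:  {set_time / repeats:.9f} s per lookup")
```

Exact numbers depend on the machine, but the list lookup should be slower by several orders of magnitude.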