# Intermediate Collections

Most of these come from the standard library's `collections` module.

## Counter

| Concept & Usage                               | Description                                                                                                                                                                                                                                     | Example                                 |
|-----------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------|
| Python Counter Class                         | `collections.Counter` is a subclass of a dictionary that counts hashable elements. It's used to count the occurrences of elements in an iterable (e.g., list, tuple, string).                                                                  | `from collections import Counter`<br>`data = [1, 2, 3, 1, 2, 4, 1, 4, 5, 1]`<br>`counter = Counter(data)`<br>`print(counter)`<br>Output: `Counter({1: 4, 2: 2, 4: 2, 3: 1, 5: 1})` |
| Basic Operations                             | - Access count of an element: Use `counter[element]`. If the element is not present, it returns 0. <br> - Adding Counters: Use the `+` operator to combine two Counters; the counts for the same element are added together.<br> - Subtracting Counters: Use the `-` operator to subtract the counts in the right Counter from the left Counter. Elements whose count becomes zero or negative are dropped from the result.          | `counter1 = Counter([1, 2, 3, 1, 2])`<br>`print(counter1[2])`<br>Output: `2`<br><br>`counter2 = Counter([2, 2, 3, 4])`<br>`combined_counter = counter1 + counter2`<br>`print(combined_counter)`<br>Output: `Counter({2: 4, 1: 2, 3: 2, 4: 1})`<br><br>`counter3 = Counter([1, 2, 3, 4, 1, 2, 1])`<br>`subtracted_counter = counter3 - counter2`<br>`print(subtracted_counter)`<br>Output: `Counter({1: 3})` |
| Useful Methods                               | - `elements()`: Returns an iterator over the elements in the Counter, each repeated as many times as its count. <br> - `most_common(n)`: Returns a list of the n most common elements and their counts, in descending order of count. <br> - `update(iterable)`: Adds the elements from an iterable (or the counts from a mapping) to the existing counts instead of replacing them. <br> - `clear()`: Resets the counter by removing all elements and counts.  | `data = [1, 2, 3, 1, 2, 4, 1, 4, 5, 1]`<br>`counter = Counter(data)`<br>`print(list(counter.elements()))`<br>Output: `[1, 1, 1, 1, 2, 2, 3, 4, 4, 5]`<br>`print(counter.most_common(2))`<br>Output: `[(1, 4), (2, 2)]`<br>`counter.update([1, 6, 6])`<br>`print(counter)`<br>Output: `Counter({1: 5, 2: 2, 4: 2, 6: 2, 3: 1, 5: 1})`<br>`counter.clear()`<br>`print(counter)`<br>Output: `Counter()` |


In [1]:
from collections import Counter

students = [
    {'name': 'Amit', 'house': 'Shivaji'},
    {'name': 'Rahul', 'house': 'Tagore'},
    {'name': 'Sneha', 'house': 'Raman'},
    {'name': 'Kavita', 'house': 'Shivaji'},
    {'name': 'Ankit', 'house': 'Ashoka'},
    {'name': 'Smita', 'house': 'Tagore'},
    {'name': 'Prakash', 'house': 'Raman'},
    {'name': 'Manju', 'house': 'Shivaji'},
    {'name': 'Rajesh', 'house': 'Tagore'},
]

house_counter = Counter(student['house'] for student in students)

print(house_counter)

Counter({'Shivaji': 3, 'Tagore': 3, 'Raman': 2, 'Ashoka': 1})


In [2]:
group1_survey = Counter({'option1': 25, 'option2': 12, 'option3': 5})
group2_survey = Counter({'option1': 15, 'option2': 8, 'option3': 10})
group3_survey = Counter({'option1': 18, 'option2': 22, 'option3': 7})

overall_survey = group1_survey + group2_survey + group3_survey
print(overall_survey)

Counter({'option1': 58, 'option2': 42, 'option3': 22})


In [3]:
twinkle_twinkle = """Twinkle, twinkle, little star
How I wonder what you are
Up above the world so high
Like a diamond in the sky
Twinkle, twinkle little star
How I wonder what you are

When the blazing sun is gone
When he nothing shines upon
Then you show your little light
Twinkle, twinkle, all the night
Twinkle, twinkle, little star
How I wonder what you are"""

words = twinkle_twinkle.lower().split()

word_counter = Counter(words)

# Get the five most common words
most_common_words = word_counter.most_common(5)
print(most_common_words)

[('twinkle,', 7), ('little', 4), ('you', 4), ('the', 4), ('star', 3)]
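
Note that `split()` keeps punctuation attached, so `'twinkle,'` and `'twinkle'` are counted as different words above. Below is a minimal sketch of one way to normalize the tokens first, assuming a simple regular expression is good enough for this kind of text. The shortened `sample` string and the pattern are illustrative choices, not part of the original cell.

```python
import re
from collections import Counter

# Shortened sample: the same word appears with and without a trailing comma
sample = "Twinkle, twinkle, little star\nTwinkle, twinkle little star"

# Naive split keeps the comma as part of the token
naive_counter = Counter(sample.lower().split())

# Keep only runs of letters and apostrophes, which drops the punctuation
clean_words = re.findall(r"[a-z']+", sample.lower())
clean_counter = Counter(clean_words)

print(naive_counter["twinkle,"], naive_counter["twinkle"])  # 3 1
print(clean_counter["twinkle"])                             # 4
```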


In [4]:
# Initial inventory quantities
inventory_beginning = Counter({'apple': 50, 'banana': 30, 'orange': 25, 'grapes': 40})

# Inventory quantities at the end of the month
inventory_end = Counter({'apple': 20, 'banana': 35, 'kiwi': 15, 'grapes': 60})

# Calculate the change in inventory
inventory_change = inventory_end.copy()
inventory_change.subtract(inventory_beginning)

print(inventory_change)

Counter({'grapes': 20, 'kiwi': 15, 'banana': 5, 'orange': -25, 'apple': -30})
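
The cell above uses `subtract()` rather than the `-` operator because `-` drops zero and negative counts, which would hide the items whose stock went down. A small sketch of the difference, using made-up quantities:

```python
from collections import Counter

start = Counter({'apple': 5, 'banana': 2})
end = Counter({'apple': 3, 'banana': 4})

# The - operator returns a new Counter and keeps only positive results
positive_only = end - start

# subtract() works in place and keeps zero and negative counts
signed_change = end.copy()
signed_change.subtract(start)

print(positive_only)   # Counter({'banana': 2})
print(signed_change)   # Counter({'banana': 2, 'apple': -2})
```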


## defaultdict

| Concept & Usage                               | Description                                                                                                                                                                                                                                     | Example                                 |
|-----------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------|
| Python defaultdict Class                     | `collections.defaultdict` is a subclass of `dict` that takes a default factory as its first argument. The factory is a callable (such as `int` or `list`) that is called with no arguments to produce the value for a missing key. If no factory is provided, `default_factory` is `None` and a missing key raises `KeyError`, just as with a regular `dict`. | `from collections import defaultdict`<br>`my_dict = defaultdict(int)`<br>`my_dict['a'] += 1`<br>`print(my_dict)`<br>Output: `defaultdict(<class 'int'>, {'a': 1})` |
| Common Scenario: Counting occurrences        | One common use case of `defaultdict` is counting occurrences of elements in an iterable. With the `int` factory, a missing key starts at 0, so you can increment counts directly without worrying about key errors. | `from collections import defaultdict`<br>`data = ['apple', 'banana', 'apple', 'orange', 'banana', 'apple']`<br>`counter = defaultdict(int)`<br>`for item in data:`<br>&nbsp;&nbsp;&nbsp;&nbsp;`counter[item] += 1`<br>`print(counter)`<br>Output: `defaultdict(<class 'int'>, {'apple': 3, 'banana': 2, 'orange': 1})` |
| Common Scenario: Grouping data               | `defaultdict` is helpful when you want to group data elements by a common attribute. With the `list` factory, keys represent the attribute and values are lists of the elements that share it. | `from collections import defaultdict`<br>`data = [('apple', 'fruit'), ('carrot', 'vegetable'), ('banana', 'fruit'), ('broccoli', 'vegetable')]`<br>`grouped_data = defaultdict(list)`<br>`for item, category in data:`<br>&nbsp;&nbsp;&nbsp;&nbsp;`grouped_data[category].append(item)`<br>`print(grouped_data)`<br>Output: `defaultdict(<class 'list'>, {'fruit': ['apple', 'banana'], 'vegetable': ['carrot', 'broccoli']})` |
| Common Scenario: Handling missing keys        | `defaultdict` is useful when dealing with nested dictionaries, where some keys may not exist yet. Instead of checking for key existence before accessing or updating values, you can create a nested `defaultdict` whose factory returns another `defaultdict`, so intermediate levels are created on first access. | `from collections import defaultdict`<br>`nested_dict = lambda: defaultdict(nested_dict)`<br>`my_dict = nested_dict()`<br>`my_dict['a']['b']['c'] = 42`<br>`print(my_dict)`<br>Output: `defaultdict(<function <lambda> at 0x...>, {'a': defaultdict(<function <lambda> at 0x...>, {'b': defaultdict(<function <lambda> at 0x...>, {'c': 42})})})` |


In [5]:
from collections import defaultdict

# Weather conditions recorded over six days
weather_conditions = ['rainy', 'sunny', 'moderate', 'sunny', 'moderate', 'sunny']

# Equivalent counting with a plain dict, kept for comparison:
# weather_counter = dict()

# for condition in weather_conditions:
#     if condition not in weather_counter:
#         weather_counter[condition] = 1
#     else:
#         weather_counter[condition] += 1

# Create a defaultdict to count the occurrences of weather conditions
weather_counter = defaultdict(int)

# Count the number of days for each weather condition
for condition in weather_conditions:
    weather_counter[condition] += 1

# Print the results
print(f"Rainy days: {weather_counter['rainy']} days")
print(f"Sunny days: {weather_counter['sunny']} days")
print(f"Moderate days: {weather_counter['moderate']} days")

Rainy days: 1 days
Sunny days: 3 days
Moderate days: 2 days


In [7]:
from collections import defaultdict

words = ['apple', 'banana', 'ant', 'bat', 'ball']

word_group = defaultdict(list)
for word in words:
    word_group[word[0]].append(word)

print(word_group)

defaultdict(<class 'list'>, {'a': ['apple', 'ant'], 'b': ['banana', 'bat', 'ball']})


In [6]:
from collections import defaultdict

# Employee feedback data
feedback_data = [
    {'question': 'Work-Life Balance', 'rating': 4},
    {'question': 'Job Satisfaction', 'rating': 3},
    {'question': 'Work-Life Balance', 'rating': 5},
    {'question': 'Benefits', 'rating': 4},
    {'question': 'Job Satisfaction', 'rating': 5},
    {'question': 'Work-Life Balance', 'rating': 4},
    {'question': 'Benefits', 'rating': 3},
    {'question': 'Job Satisfaction', 'rating': 5},
]

# Create a defaultdict to count ratings for each question
ratings_counter = defaultdict(lambda: defaultdict(int))

# Count the occurrences of each rating for each question
for feedback in feedback_data:
    question = feedback['question']
    rating = feedback['rating']
    ratings_counter[question][rating] += 1

# Print the results
for question, ratings in ratings_counter.items():
    print(f"Question: {question}")
    for rating, count in ratings.items():
        print(f"Rating {rating}: {count} times")
    print()

Question: Work-Life Balance
Rating 4: 2 times
Rating 5: 1 times

Question: Job Satisfaction
Rating 3: 1 times
Rating 5: 2 times

Question: Benefits
Rating 4: 1 times
Rating 3: 1 times



In [8]:
from collections import defaultdict

names = ['Alice', 'Bob', 'Charlie']

name_ages = defaultdict(lambda: 25)
name_ages['Alice'] = 30

for name in names:
    print(f"{name} - Age: {name_ages[name]}")

Alice - Age: 30
Bob - Age: 25
Charlie - Age: 25
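
Two details are easy to miss when relying on defaults. Reading a missing key with `[]` inserts that key into the dictionary, whereas `get()` and `in` do not call the factory. A `defaultdict` created without a factory raises `KeyError` like a plain `dict`. A short sketch (the variable names are illustrative):

```python
from collections import defaultdict

ages = defaultdict(lambda: 25)

# get() and membership tests do not call the factory
print(ages.get('Bob'))   # None
print('Bob' in ages)     # False

# Indexing calls the factory and stores the result under the key
print(ages['Bob'])       # 25
print('Bob' in ages)     # True

# Without a factory, a missing key raises KeyError
no_factory = defaultdict()
try:
    no_factory['missing']
    raised = False
except KeyError:
    raised = True
print(raised)            # True
```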


## deque

Python's `deque` (double-ended queue) from the `collections` module supports appends and pops at both ends in approximately O(1) time, which makes it a good fit whenever items are added or removed at either end of a sequence. Here are some common use cases for `deque`, with examples:

1. Implementing Queues and Stacks:

`deque` can be used to implement both queues and stacks efficiently. For a queue, you can use the `append()` method to enqueue elements and the `popleft()` method to dequeue elements. For a stack, you can use the `append()` method to push elements onto the stack and the `pop()` method to pop elements from the stack.

Example - Implementing a Queue:
    

In [9]:
from collections import deque

queue = deque()
queue.append('Alice')
queue.append('Bob')
queue.append('Charlie')

print(queue.popleft())  # Output: Alice
print(queue.popleft())  # Output: Bob

Alice
Bob
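
The queue above removes items from the left. The stack usage described earlier pairs `append()` with `pop()` instead, so the most recently added item comes out first. A minimal sketch (the page names are invented for illustration):

```python
from collections import deque

stack = deque()
stack.append('page1')
stack.append('page2')
stack.append('page3')

# pop() removes from the right end: last in, first out
last = stack.pop()
second_last = stack.pop()

print(last, second_last)  # page3 page2
print(stack)              # deque(['page1'])
```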


2. Sliding Window Operations:

`deque` is useful for sliding window operations, where you maintain a fixed-size window over a sequence and efficiently update it as new elements are added.

Example - Finding Maximum in a Sliding Window:

In [10]:
from collections import deque

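def max_in_sliding_window(nums, k):
    """Return the maximum of every window of k consecutive items in nums."""
    max_values = []
    window = deque()  # indices of candidates; their values are non-increasing

    for i, num in enumerate(nums):
        # Drop candidates smaller than the new value: they can never be a maximum again
        while window and nums[window[-1]] < num:
            window.pop()

        window.append(i)

        # Drop the front index once it falls outside the current window
        if window[0] <= i - k:
            window.popleft()

        # From the first full window onward, the front index holds the maximum
        if i >= k - 1:
            max_values.append(nums[window[0]])

    return max_values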

nums = [1, 3, -1, -3, 5, 3, 6, 7]
k = 3
print(max_in_sliding_window(nums, k))  # Output: [3, 3, 5, 5, 6, 7]

[3, 3, 5, 5, 6, 7]
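
For simpler fixed-size windows, `deque` accepts a `maxlen` argument: once the deque is full, each `append()` discards the item at the opposite end. The sketch below builds a moving average on that behavior. The function name `moving_average` is hypothetical and chosen only for this example.

```python
from collections import deque

def moving_average(values, k):
    """Return the average of each full window of k consecutive values."""
    window = deque(maxlen=k)  # the oldest item is dropped automatically when full
    averages = []
    for value in values:
        window.append(value)
        if len(window) == k:
            averages.append(sum(window) / k)
    return averages

print(moving_average([1, 3, 5, 7, 9], 3))  # [3.0, 5.0, 7.0]
```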


3. Reversing Elements:

`deque` supports in-place reversal with the `reverse()` method, just as `list` does. This is handy when the data already lives in a `deque` and needs to be processed in reverse order.

Example - Reversing a List:

In [11]:
from collections import deque

nums = [1, 2, 3, 4, 5]
dq = deque(nums)
dq.reverse()
reversed_list = list(dq)
print(reversed_list)  # Output: [5, 4, 3, 2, 1]

[5, 4, 3, 2, 1]


4. Memory Efficiency:

`deque` is well suited to large sequences that grow or shrink at the ends: `pop()` and `popleft()` remove an element in constant time, without shifting or copying the remaining elements.

Example - Efficiently Removing from Both Ends:

In [12]:
from collections import deque

items = list(range(1, 1000001))
dq = deque(items)

# Efficiently remove elements from both ends
while dq:
    dq.pop()
    dq.popleft()

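In this example, using a `deque` instead of a regular list speeds up removal from the front, because removing the first element of a list with `pop(0)` takes time proportional to the list's length. The gap widens as the sequence grows. Note that the loop assumes an even number of items; with an odd count, the final `popleft()` would run on an empty deque and raise `IndexError`.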
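
A rough way to observe the difference is to time both containers while draining them from the front. This is only a sketch: the size `n` is arbitrary, and timings vary by machine, so no specific numbers are claimed.

```python
from collections import deque
from timeit import timeit

def drain_list(n):
    items = list(range(n))
    while items:
        items.pop(0)      # O(n) per call: the remaining items are shifted

def drain_deque(n):
    items = deque(range(n))
    while items:
        items.popleft()   # O(1) per call

n = 20_000
list_seconds = timeit(lambda: drain_list(n), number=1)
deque_seconds = timeit(lambda: drain_deque(n), number=1)

print(f"list.pop(0):     {list_seconds:.4f} s")
print(f"deque.popleft(): {deque_seconds:.4f} s")
```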

## namedtuple

Python's `namedtuple` from the `collections` module is a factory function that creates lightweight tuple subclasses with named fields. It is used in situations where you need simple, immutable objects with named attributes, and it makes code more expressive than regular tuples or dictionaries. Here are some common use cases for `namedtuple`, with examples:

1. Representing Structured Data:

`namedtuple` is an excellent choice for representing structured data with named fields, providing a convenient way to access elements by their names.

Example - Representing a Point:

In [13]:
from collections import namedtuple

# Define a namedtuple for representing a Point
Point = namedtuple('Point', ['x', 'y'])

# Create an instance of Point
p1 = Point(1, 2)

# Access the attributes using names
print(p1.x)  # Output: 1
print(p1.y)  # Output: 2

1
2
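
Because a `namedtuple` is still a tuple, it supports indexing and unpacking, and its fields cannot be reassigned. To get a modified copy, use `_replace()`. A short sketch:

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
p1 = Point(1, 2)

# Works like a regular tuple
x, y = p1
print(p1[0], x, y)   # 1 1 2

# Fields are read-only
try:
    p1.x = 10
    mutated = True
except AttributeError:
    mutated = False
print(mutated)       # False

# _replace() returns a new instance and leaves the original untouched
p2 = p1._replace(x=10)
print(p1, p2)        # Point(x=1, y=2) Point(x=10, y=2)
```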


2. Returning Multiple Values from a Function:

`namedtuple` can be used to return multiple values from a function in a more readable and self-documenting way.

Example - Returning Min and Max Values:

In [14]:
from collections import namedtuple

# Define the result type once, rather than recreating it on every call
Result = namedtuple('Result', ['min', 'max'])

def min_max(numbers):
    return Result(min(numbers), max(numbers))

numbers = [10, 3, 7, 15, 4]
result = min_max(numbers)
print(result.min)  # Output: 3
print(result.max)  # Output: 15

3
15


3. Data Serialization and Deserialization:

`namedtuple` can be useful for data serialization and deserialization when working with JSON or CSV data: `_asdict()` converts an instance to a dictionary, and unpacking a dictionary with `**` rebuilds the instance.

Example - Serializing and Deserializing Data:

In [15]:
from collections import namedtuple
import json

# Define a namedtuple for representing a Person
Person = namedtuple('Person', ['name', 'age', 'city'])

# Serialize data to JSON
person_data = Person(name='Alice', age=30, city='New York')
json_data = json.dumps(person_data._asdict())

# Deserialize JSON data back to namedtuple
loaded_data = json.loads(json_data)
person = Person(**loaded_data)

print(person.name)  # Output: Alice
print(person.age)   # Output: 30
print(person.city)  # Output: New York

Alice
30
New York
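
For CSV data, where each row arrives as a list of strings, `_make()` builds an instance from any iterable. The sketch below uses an in-memory file with invented sample rows. Every field stays a string unless it is converted explicitly.

```python
import csv
import io
from collections import namedtuple

Person = namedtuple('Person', ['name', 'age', 'city'])

# In-memory stand-in for a CSV file with a header row (sample data is made up)
csv_text = "name,age,city\nAlice,30,New York\nBob,25,Pune\n"

reader = csv.reader(io.StringIO(csv_text))
next(reader)  # skip the header row

people = [Person._make(row) for row in reader]

# csv yields strings, so convert age explicitly
people = [p._replace(age=int(p.age)) for p in people]

for person in people:
    print(person)
```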


4. Better Code Readability:

In situations where you have data with multiple attributes, using `namedtuple` can significantly improve code readability compared to using plain tuples or dictionaries.

Example - Representing a Student:

In [16]:
from collections import namedtuple

# Define a namedtuple for representing a Student
Student = namedtuple('Student', ['name', 'age', 'major'])

# Create an instance of Student
student = Student('Bob', 21, 'Computer Science')

# Access the attributes using names
print(student.name)   # Output: Bob
print(student.age)    # Output: 21
print(student.major)  # Output: Computer Science

Bob
21
Computer Science
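
The gain is clearest when the same data is also handled as plain tuples. The sketch below finds the youngest record both ways; the records are made up for illustration.

```python
from collections import namedtuple

Student = namedtuple('Student', ['name', 'age', 'major'])

rows = [('Bob', 21, 'Computer Science'), ('Asha', 19, 'Physics')]
students = [Student(*row) for row in rows]

# Plain tuples: the reader has to remember that index 1 is the age
youngest_row = min(rows, key=lambda row: row[1])

# namedtuple: the field name documents the intent
youngest_student = min(students, key=lambda s: s.age)

print(youngest_row[0])        # Asha
print(youngest_student.name)  # Asha
```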
