# Introduction to HashSets

In simple terms, a HashSet is a collection or a group of unique elements. Think of it as a party where everyone has a different name; no two people can have the same name. In coding terms, this means HashSets do not allow duplicate values.

## The Difference at a Glance: HashTable vs. HashSet

HashTable is like a dictionary, full of word (key) and definition (value) pairs. In contrast, HashSet is more like a unique-word collector, not bothered about definitions, just the words.
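As a rough illustration of this difference, the sketch below uses Python's built-in `dict` as the hash table and `set` as the hash set. The variable names `definitions` and `words` are made up for this example.

```python
# A dict (hash table) maps each key to a value.
definitions = {"apple": "a round fruit", "kiwi": "a small green fruit"}

# A set (hash set) stores only the keys; no values are attached.
words = {"apple", "kiwi"}

print(definitions["apple"])  # look up the value stored under a key
print("apple" in words)      # membership test: True
print("mango" in words)      # membership test: False
```

The dictionary answers "what is stored under this key?", while the set only answers "is this element present?".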

## Core Characteristics of HashSets

- **Uniqueness**: The first thing to remember about HashSets is that every element is unique. It’s like having a basket of fruits, and no two fruits in it are the same. This uniqueness is the superpower of HashSets!

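- **Null Elements**: A HashSet can include a null element, but only once, since every element must be unique. In Python, this corresponds to storing `None` in a set. It’s like having an empty slot in your fruit basket, and that’s perfectly okay!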

- **Order**: When it comes to order, HashSets like to keep it casual. They do not guarantee any specific order of the elements. It’s like reaching into your fruit basket blindfolded – you never know which fruit you’ll grab first!
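The three characteristics above can be seen in a short sketch with Python's built-in `set`. The `basket` and `fruits` names are illustrative only, and `None` stands in for the null element.

```python
basket = set()
basket.add("apple")
basket.add("banana")
basket.add("apple")      # duplicate: silently ignored
print(len(basket))       # 2

basket.add(None)         # None plays the role of a null element
print(None in basket)    # True

# Iteration order is not guaranteed, so sort when a stable order matters.
fruits = sorted(item for item in basket if item is not None)
print(fruits)            # ['apple', 'banana']
```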

## Internal Working of a HashSet

- **Handling Duplicates**: So, what happens when HashSets encounter a duplicate? They simply ignore it! If you try to add a fruit that’s already in your basket, HashSet will just keep the original and discard the duplicate.

- **Hashing Function**: HashSets, like HashTables, use a hashing function to decide where each element should go. It’s like assigning a specific spot in your basket for each fruit based on its name.

- **Resizing the Set**: What happens when your basket gets full? When a HashSet starts running out of room, it switches to a bigger basket and moves the existing elements into it, making space for more unique elements.
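To make these three ideas concrete, here is a minimal sketch of a hash set built on a list of buckets. It is a teaching toy under stated assumptions, not how Python's `set` or any production library is implemented: the class name `ToyHashSet`, the starting capacity of 4, the 0.75 load-factor threshold, and the doubling strategy are all choices made for this example.

```python
class ToyHashSet:
    """Illustrative hash set using one list (bucket) per slot."""

    def __init__(self, capacity=4):
        self._buckets = [[] for _ in range(capacity)]
        self._size = 0

    def _bucket_for(self, item):
        # Hashing function: the item's hash decides which bucket it lives in.
        return self._buckets[hash(item) % len(self._buckets)]

    def add(self, item):
        bucket = self._bucket_for(item)
        if item in bucket:
            return False              # duplicate: keep the original, ignore the new one
        bucket.append(item)
        self._size += 1
        # Resize once the set is more than 75% full (assumed threshold).
        if self._size > 0.75 * len(self._buckets):
            self._resize()
        return True

    def _resize(self):
        # Double the number of buckets and re-place every existing item.
        old_items = [item for bucket in self._buckets for item in bucket]
        self._buckets = [[] for _ in range(2 * len(self._buckets))]
        for item in old_items:
            self._bucket_for(item).append(item)

    def __contains__(self, item):
        return item in self._bucket_for(item)

    def __len__(self):
        return self._size

    def capacity(self):
        return len(self._buckets)


toy = ToyHashSet()
for fruit in ["apple", "banana", "cherry", "apple", "date", "elderberry"]:
    toy.add(fruit)

print(len(toy))          # 5, because the second "apple" was ignored
print(toy.capacity())    # 8, because the table doubled from 4 when it filled up
print("cherry" in toy)   # True
```

Because each lookup only scans one bucket, membership checks stay fast as long as the buckets remain short, which is what resizing is for.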

In the next section, we will solve some problems related to HashSets.

!["hash_set"](images/hashset.svg)

**Problem Statement:**

Given a list of integers, determine the count of numbers for which there exists another number in the list that is greater by exactly one unit.

In other words, for each number x in the list, if x + 1 also exists in the list, then x is considered for the count.

**Examples:**

*Example 1:*

Input: [4, 3, 1, 5, 6]

Expected Output: 3

Justification: The numbers 4, 3, and 5 have 5, 4, and 6 respectively in the list, which are greater by exactly one unit.

*Example 2:*

Input: [7, 8, 9, 10]

Expected Output: 3

Justification: The numbers 7, 8, and 9 have 8, 9, and 10 respectively in the list, which are greater by exactly one unit.

*Example 3:*

Input: [11, 13, 15, 16]

Expected Output: 1

Justification: Only the number 15 has 16 in the list, which is greater by exactly one unit.

**Constraints:**

1 <= arr.length <= 1000

0 <= arr[i] <= 1000

**Solution:**

To solve this problem, the first step is to add all elements of the given array into a HashSet. This allows us to check for the presence of any number in constant time on average (O(1)). After populating the HashSet, iterate through the array once more. During this iteration, for each element, check if its direct successor (the element itself plus one) is present in the HashSet. If the successor is found, increment a counter. This counter keeps track of the number of elements that have their direct successors within the array. Its final value, after the iteration over the array is complete, is the solution to this problem.

**Algorithm Walkthrough:**

Given the input list [4, 3, 1, 5, 6]:

1. Initialize an empty hashset.
2. Insert each number from the list into the hashset: {4, 3, 1, 5, 6}.
3. Initialize a counter to 0.
4. Traverse the list:
   - For 4, check if 5 exists in the hashset. It does, so increment the counter.
   - For 3, check if 4 exists in the hashset. It does, so increment the counter.
   - For 1, check if 2 exists in the hashset. It doesn't, so move on.
   - For 5, check if 6 exists in the hashset. It does, so increment the counter.
   - For 6, check if 7 exists in the hashset. It doesn't, so move on.
5. The counter now holds the value 3, which is the final result.

!["counting_elements"](images/counting_elements.svg)

```python
from typing import List

class Solution:
    def countElements(self, arr: List[int]) -> int:
        # Convert the list to a set for O(1) lookups
        num_set = set(arr)

        # Initialize a counter to keep track of valid elements
        valid_count = 0

        # For each number in the list, check if its next consecutive number exists in the set
        for current_num in arr:
            if current_num + 1 in num_set:
                valid_count += 1

        return valid_count

if __name__ == "__main__":
    sol = Solution()
    # Test cases
    print(sol.countElements([4, 3, 1, 5, 6]))   # Expected: 3
    print(sol.countElements([7, 8, 9, 10]))     # Expected: 3
    print(sol.countElements([11, 13, 15, 16]))  # Expected: 1
```

Output:

```
3
3
1
```


**Time Complexity**: O(n), where n is the length of the input list arr, as both the creation of the set and the iteration through the list take linear time.

**Space Complexity**: O(n), as the space required by the set num_set grows linearly with the size of the input list arr.

**Problem Statement**

You are given two strings. The first string represents types of jewels, where each character is a unique type of jewel. The second string represents the stones you have, where each character represents a type of stone. Determine how many of the stones you have are also jewels.

**Examples:**

Example 1:

Input: 
```
Jewels = "abc", Stones = "aabbcc"
```
Expected Output: 
```
6
```
Justification: 
```
All the stones are jewels.
```

Example 2:

Input: 
```
Jewels = "aA", Stones = "aAaZz"
```
Expected Output: 
```
3
```
Justification: 
```
There are 2 'a' and 1 'A' stones that are jewels.
```

Example 3:

Input: 
```
Jewels = "zZ", Stones = "zZzZzZ"
```
Expected Output: 
```
6
```
Justification: 
```
All the stones are jewels.
```

**Constraints:**

- 1 <= jewels.length, stones.length <= 50
- jewels and stones consist of only English letters.
- All the characters of jewels are unique.

**Solution**

To solve this problem, we can use a HashSet for its property of constant-time lookups. First, we add each character (jewel) from the first string to a HashSet. This set now represents all unique jewels. Next, we iterate through each character (stone) in the second string and check if it is present in the HashSet. If it is, it means this stone is also a jewel. We keep a count of such stones. The final count gives us the total number of stones that are also jewels. The efficiency of this solution lies in the HashSet's ability to quickly check for the presence of a jewel, making the overall process highly efficient for large strings.

**Algorithm Walkthrough:**

Given the jewels "aA" and the stones "aAaZz":

1. Initialize an empty hashset and a counter set to 0.
2. Traverse the jewels "aA" and populate the hashset: {'a', 'A'}.
3. Traverse the stones "aAaZz". For each character:
   - Check if it exists in the hashset.
   - If it does, increment the counter.
4. After traversing all the stones, the counter will be 3, which is the result.

For the third example, with jewels "zZ" and stones "zZzZzZ":

1. Initialize an empty hashset and a counter set to 0.
2. Traverse the jewels "zZ" and populate the hashset: {'z', 'Z'}.
3. Traverse the stones "zZzZzZ". For each character:
   - Check if it exists in the hashset.
   - Since all the stones are jewels in this case, the counter will increment for each stone.
4. After traversing all the stones, the counter will be 6, which is the result.

![jewels_stone](images/jewels_stone.svg)

```python
class Solution:
    def numJewelsInStones(self, jewels: str, stones: str) -> int:
        # Create a set to store unique jewel types
        jewel_types = set(jewels)

        # Initialize a counter to track the number of stones that are jewels
        jewel_count = 0

        # Iterate through each stone
        for stone in stones:
            # Check if the stone is a jewel
            if stone in jewel_types:
                # Increment the jewel count
                jewel_count += 1

        # Return the total count of jewels found in the stones
        return jewel_count

if __name__ == "__main__":
    sol = Solution()
    # Test cases
    print(sol.numJewelsInStones("abc", "aabbcc"))    # Expected: 6
    print(sol.numJewelsInStones("aA", "aAaZz"))      # Expected: 3
    print(sol.numJewelsInStones("zZ", "zZzZzZ"))     # Expected: 6
```

Output:

```
6
3
6
```


**Time Complexity:**  
The time complexity of this solution is O(n + m), where n is the length of the jewels string and m is the length of the stones string. This complexity arises from creating the jewel set and iterating through each stone to check if it is a jewel.

**Space Complexity:**  
The space complexity is O(n), where n is the number of unique jewel types. This is because we store the unique jewel types in a set.

## Unique Number of Occurrences (easy)


**Problem Statement**

Given an array of integers, determine if the number of times each distinct integer appears in the array is unique.

In other words, the occurrences of each integer in the array should be distinct from the occurrences of every other integer.

**Examples:**

Input: `[4, 5, 4, 6, 6, 6]`

Expected Output: `true`

Justification: The number 4 appears 2 times, 5 appears 1 time, and 6 appears 3 times. All these occurrences (1, 2, 3) are unique.

Input: `[7, 8, 8, 9, 9, 9, 10, 10]`

Expected Output: `false`

Justification: The number 7 appears 1 time, 8 appears 2 times, 9 appears 3 times, and 10 appears 2 times. The occurrences are not unique since the count 2 appears twice.

Input: `[11, 12, 13, 14, 14, 15, 15, 15]`

Expected Output: `false`

Justification: The number 11 appears 1 time, 12 appears 1 time, 13 appears 1 time, 14 appears 2 times, and 15 appears 3 times. These occurrences are not unique since the count 1 appears three times.

**Constraints:**

- 1 <= arr.length <= 1000
- -1000 <= arr[i] <= 1000

**Solution**

The solution to this problem involves two main steps. First, we count the occurrences of each element in the array using a hashmap. Each element of the array serves as a key, and its frequency as the value.

After we have the frequencies, the second step is to ensure the uniqueness of these frequencies. We achieve this by storing each frequency in a hashset. Since a hashset does not allow duplicates, if we encounter a frequency that's already in the set, it indicates that there are at least two elements with the same frequency, and thus, we return false. If all frequencies are unique, our final result is true.

Here is the step-by-step solution.

1. **Counting Occurrences:** Start by counting the occurrences of each integer in the array. This can be done using a hashmap where the key is the integer and the value is its count.

2. **Checking Uniqueness:** Once we have the counts, we need to check if these counts are unique. This can be done by inserting each count into a hashset. If at any point, we try to insert a count that already exists in the set, we can conclude that the occurrences are not unique.

3. **Return Result:** If all counts are successfully inserted into the set without any repetition, then the occurrences are unique, and we return true. Otherwise, we return false.

This approach ensures that we determine the uniqueness of occurrences in an efficient manner. By using a hashmap to count occurrences and a hashset to check for uniqueness, we can solve the problem in linear time.

**Algorithm Walkthrough:**

Given the input `[4, 5, 4, 6, 6, 6]`:

- Initialize an empty hashmap `countMap`.
- Traverse the array. For each integer, increase its count in `countMap`.
- After traversal, `countMap` will be: `{4:2, 5:1, 6:3}`
- Initialize an empty hashset `uniqueCounts`.
- Traverse the values of `countMap`. For each count, try to insert it into `uniqueCounts`.
- Insert 2 (from integer 4) into `uniqueCounts`.
- Insert 1 (from integer 5) into `uniqueCounts`.
- Insert 3 (from integer 6) into `uniqueCounts`.
- Since all counts were inserted without repetition, return true.

![unique_occurance](images/unique_occurance.svg)

```python
from collections import Counter

class Solution:
    def uniqueOccurrences(self, arr):
        # Count occurrences of each number
        count_map = Counter(arr)

        # Check if the occurrences are unique
        unique_counts = set()  # Set to store unique counts
        for count in count_map.values():
            if count in unique_counts:  # If count already exists in the set
                return False
            unique_counts.add(count)  # Add count to the set
        return True

# Test the function
sol = Solution()
print(sol.uniqueOccurrences([4, 5, 4, 6, 6, 6]))
print(sol.uniqueOccurrences([7, 8, 8, 9, 9, 9, 10, 10]))
print(sol.uniqueOccurrences([11, 12, 13, 14, 14, 15, 15, 15]))
```

Output:

```
True
False
False
```


### Time and Space Complexity Analysis

- **Time Complexity:** The solution has a time complexity of O(n), where n is the length of the input array: it iterates through the array once to count occurrences and then through the distinct counts (at most n of them) to check for uniqueness.

- **Space Complexity:** The space complexity is O(n) in the worst case, since countMap and uniqueCounts each hold at most one entry per distinct element of the input array.
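The uniqueness check can also be written more compactly by comparing sizes: if putting all counts into a set loses nothing, no two counts were equal. The sketch below is an alternative formulation of the same idea, not a replacement for the step-by-step version; the function name `unique_occurrences` is chosen for this example. Unlike the loop, it does not stop early at the first repeated count.

```python
from collections import Counter

def unique_occurrences(arr):
    # Map each distinct integer to the number of times it appears.
    counts = Counter(arr)
    # A set drops duplicate counts, so the sizes match only if every count is distinct.
    return len(set(counts.values())) == len(counts)

print(unique_occurrences([4, 5, 4, 6, 6, 6]))            # True
print(unique_occurrences([7, 8, 8, 9, 9, 9, 10, 10]))    # False
```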

## Longest Substring Without Repeating Characters (medium)

### Problem Statement
Given a string, identify the length of its longest segment that contains distinct characters. In other words, find the maximum length of a substring that has no repeating characters.

#### Examples:

**Example 1:**

Input: "abcdaef"  
Expected Output: 6  
Justification: The longest segment with distinct characters is "bcdaef", which has a length of 6.

**Example 2:**

Input: "aaaaa"  
Expected Output: 1  
Justification: The entire string consists of the same character. Thus, the longest segment with unique characters is just "a", with a length of 1.

**Example 3:**

Input: "abrkaabcdefghijjxxx"  
Expected Output: 10  
Justification: The longest segment with distinct characters is "abcdefghij", which has a length of 10.

#### Constraints:

- 0 <= s.length <= 5 * 10^4
- s consists of English letters, digits, symbols, and spaces.

### Algorithm Description:

To solve the problem, we slide a window over the string while maintaining a HashSet of the characters currently inside the window. As we traverse the string, each new character is checked against the HashSet. If the character is not present, there is no repetition: we add it to the HashSet, update the length of the longest substring found so far (if necessary), and continue. If the character is already in the HashSet, a repetition has occurred, so we shrink the window from the left, removing characters from the HashSet until the earlier occurrence of the repeated character has been removed. This process continues until we have traversed the entire string. The final result is the length of the longest substring without repeating characters.

#### Initialization: 
Begin with two pointers, start and end, both at the start of the string. The hashset will initially be empty.

#### Sliding Window Expansion:
Progressively move the end pointer to the right until you come across a character that's already in the hashset, indicating a repetition.

#### Adjusting Start Pointer: 
Upon detecting a repeated character, remove the character at the start position from the hashset and then move the start pointer one position to the right. Repeat this until the repeated character is no longer in the hashset. This ensures that the window only contains unique characters.

#### Result Calculation: 
At each step, calculate the length of the current window (from start to end). Keep track of the maximum length observed.

By the end of this process, the maximum length observed will be the length of the longest segment of unique characters in the string.

### Algorithm Walkthrough:

Given the string "abrkaabcdefghijjxxx":

- Initialize start and end to 0, and an empty hashset.
- As you move end from 0 to the end of the string:
  - When end is at position 4 (character 'a'), since 'a' is already in the hashset, we will remove the character at position 0 ('a') from the hashset and move start to position 1.
  - Continue this process, always ensuring the characters between start and end are unique.
  - Calculate the length of the segment at each step (start -> end) and update the maximum length.
- The maximum length observed will be 10, corresponding to the segment "abcdefghij".
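To check the walkthrough by hand, it can help to see which segment produced the maximum, not just its length. The sketch below is a variant of the sliding-window idea that also returns the winning substring. The function name `longest_window` and its tuple return value are assumptions made for this illustration.

```python
def longest_window(s: str):
    """Return (length, substring) of the longest window without repeated characters."""
    seen = set()                  # characters currently inside the window
    start = 0                     # left edge of the window
    best_start, best_len = 0, 0

    for end, ch in enumerate(s):
        # Shrink from the left until ch is no longer inside the window.
        while ch in seen:
            seen.remove(s[start])
            start += 1
        seen.add(ch)
        # Record the window if it is the longest seen so far.
        if end - start + 1 > best_len:
            best_start, best_len = start, end - start + 1

    return best_len, s[best_start:best_start + best_len]

print(longest_window("abrkaabcdefghijjxxx"))  # (10, 'abcdefghij')
print(longest_window("abcdaef"))              # (6, 'bcdaef')
```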


![longest_substring](images/longest_substring.svg)

```python
class Solution:
    def lengthOfLongestSubstring(self, s: str) -> int:
        # Set to store unique characters
        unique_chars = set()
        # Maximum length of the substring without repeating characters
        max_length = 0
        # Pointers for the start and end of the current substring
        start_ptr, end_ptr = 0, 0

        # Traverse the string with the end pointer
        while end_ptr < len(s):
            # If the character is not in the set, it's a unique character for the current substring
            if s[end_ptr] not in unique_chars:
                unique_chars.add(s[end_ptr])
                # Update the maximum length if needed
                max_length = max(max_length, end_ptr - start_ptr + 1)
                # Move the end pointer
                end_ptr += 1
            else:
                # If we find a repeating character, remove the character at the start pointer and move the start pointer
                unique_chars.remove(s[start_ptr])
                start_ptr += 1

        return max_length

if __name__ == "__main__":
    sol = Solution()
    # Test cases
    print(sol.lengthOfLongestSubstring("abcdaef"))       # Expected: 6
    print(sol.lengthOfLongestSubstring("aaaaa"))         # Expected: 1
    print(sol.lengthOfLongestSubstring("abrkaabcdefghijjxxx")) # Expected: 10
```

Output:

```
6
1
10
```


- **Time Complexity:** O(n), where n is the length of the input string.
- **Space Complexity:** O(min(n, k)), where n is the length of the input string and k is the size of the character set (English letters, digits, symbols, and spaces).