# Hashtables - The Dance Move 🥰

## `dict`'s of Python

The `dict` type is not only widely used in our programs but also a fundamental part of the Python implementation. 

Class and instance attributes, module namespaces, and function keyword arguments are some of the core Python constructs represented by dictionaries in memory. 

In the real world, the domain-name system (**DNS**) maps a host name, such as `www.wiley.com`, to an Internet-Protocol (IP) address, such as `208.215.179.146`.

DNS does not serve the web page itself: it resolves the name to an address, usually within milliseconds, so that your browser knows which server to contact.

It's like a phone book for the web.

---

In this notebook we will cover `dict`, `set`, `frozenset` and more (including some classes from the `collections` module) in Python.

### What are `dict`'s?

Dicts are associative arrays, also called maps. 

They are implemented as hash tables, whose hash function has two parts: a **hash code** and a **compression function**.

The `hash code` takes a key (any hashable object, not only a string) and returns an integer, which can be arbitrarily large and may be negative.

The `compression function` maps that integer to an index between $0$ and $N - 1$, where $N$ is the bucket array size.

#### Cool info 😉

Python dicts keep the hash value of each key around instead of recomputing it, because calling the hash method again on every resize or probe would be costly.
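
A small experiment can make this visible. The sketch below assumes CPython and uses a hypothetical `CountingKey` class (not part of any library) that counts how often its `__hash__` method runs while the dict grows and resizes.

```python
class CountingKey:
    """Hypothetical key type that counts how often __hash__ runs."""
    calls = 0

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        CountingKey.calls += 1
        return hash(self.name)

    def __eq__(self, other):
        return isinstance(other, CountingKey) and self.name == other.name


table = {CountingKey("a"): 0}       # inserting the key hashes it
calls_after_insert = CountingKey.calls

# Add enough other keys to force several resizes of the bucket array.
for i in range(1000):
    table[i] = i

calls_after_growth = CountingKey.calls
print(calls_after_insert, calls_after_growth)
```

On CPython the two counts should match: growing the table does not call `__hash__` on the existing key again.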

### For the Curious 🤔 - For more detail, see Chapter 10 of the [Brown Book](https://www.amazon.com/Structures-Algorithms-Python-Michael-Goodrich/dp/1118290275).

### Collision handling schemes ? 

#### Separate Chaining or Open Addressing

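Separate chaining keeps a small secondary container (for example a list) in each slot of the bucket array, so that colliding keys can share a slot.

Open addressing stores every item directly in the bucket array and probes for another slot on a collision. Common variants:

- linear probing - if the slot is occupied, try the next slot, then the one after that, and so on.
- quadratic probing - if the slot is occupied, try the slots at offsets $i^2$ for $i = 1, 2, 3, \dots$
- double hashing - use a second hash function to compute the step between probes.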
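
To make linear probing concrete, here is a minimal sketch of an open-addressing map. `LinearProbingMap` is a hypothetical toy class: it has no deletion and no resizing, and it is not how CPython implements `dict`.

```python
class LinearProbingMap:
    """Toy open-addressing map with linear probing (no deletion, no resizing)."""

    def __init__(self, capacity=8):
        self._slots = [None] * capacity    # each slot is None or a (key, value) pair

    def _probe(self, key):
        """Yield slot indices starting at the key's home slot, wrapping around."""
        n = len(self._slots)
        start = hash(key) % n
        for step in range(n):
            yield (start + step) % n

    def __setitem__(self, key, value):
        for i in self._probe(key):
            slot = self._slots[i]
            if slot is None or slot[0] == key:
                self._slots[i] = (key, value)
                return
        raise RuntimeError("table is full")

    def __getitem__(self, key):
        for i in self._probe(key):
            slot = self._slots[i]
            if slot is None:
                break                      # an empty slot ends the probe sequence
            if slot[0] == key:
                return slot[1]
        raise KeyError(key)


m = LinearProbingMap(capacity=4)
m[1] = "one"
m[5] = "five"      # 5 % 4 == 1, so this collides with key 1 and lands in slot 2
print(m[1], m[5])  # one five
```

Quadratic probing and double hashing would only change the sequence of indices that `_probe` yields.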

### Cool info

Python dicts use open addressing with a pseudo-random probe sequence, and a load factor threshold of 2/3.

With open addressing the load factor can never exceed 1, because every item occupies its own slot. For Python dicts, once it goes over 2/3 the bucket array is resized.

By default, a Python dictionary starts with a bucket array of size 8.
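
The resizing can be observed indirectly. The sketch below assumes CPython, where `sys.getsizeof` reports the memory footprint of a dict; the helper name `resize_points` is made up for this example, and the exact byte counts differ between Python versions.

```python
import sys

def resize_points(n_items=100):
    """Return (item_count, size_in_bytes) each time the dict's footprint changes."""
    d = {}
    last_size = sys.getsizeof(d)
    points = [(0, last_size)]
    for i in range(n_items):
        d[i] = None
        size = sys.getsizeof(d)
        if size != last_size:
            points.append((len(d), size))
            last_size = size
    return points

for count, size in resize_points():
    print(f"{count:>3} items -> {size} bytes")
```

The footprint grows in a few jumps rather than with every insert, which is what a resized bucket array looks like from the outside.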

### Sorted Maps

Keys are kept in sorted order. This makes sorted maps really good for finding the nearest key and for range queries. 

They provide both fast key:value lookups and ordered traversal of the keys.

EXAMPLE - FLIGHT DATABASES - Maxima Sets
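
Python has no built-in sorted map, so here is a rough sketch of the idea using a `dict` plus a sorted list of keys maintained with `bisect`. The `SortedMap` class and the flight data are hypothetical, and inserts cost $O(n)$ here, unlike a tree or skip list.

```python
from bisect import bisect_left, insort

class SortedMap:
    """Toy sorted map: a dict for lookups plus a sorted list of keys."""

    def __init__(self):
        self._data = {}
        self._keys = []

    def __setitem__(self, key, value):
        if key not in self._data:
            insort(self._keys, key)        # O(n) insert that keeps keys sorted
        self._data[key] = value

    def __getitem__(self, key):
        return self._data[key]

    def ceiling(self, key):
        """Smallest stored key >= key, or None if there is none."""
        i = bisect_left(self._keys, key)
        return self._keys[i] if i < len(self._keys) else None

    def range(self, low, high):
        """All (key, value) pairs with low <= key < high, in key order."""
        start = bisect_left(self._keys, low)
        stop = bisect_left(self._keys, high)
        return [(k, self._data[k]) for k in self._keys[start:stop]]


# Made-up flight data: departure time (HHMM) -> flight number
flights = SortedMap()
for departure, flight in [(930, "TK1"), (1215, "LH7"), (1840, "BA3"), (705, "AF9")]:
    flights[departure] = flight

print(flights.ceiling(1000))      # 1215
print(flights.range(700, 1300))   # [(705, 'AF9'), (930, 'TK1'), (1215, 'LH7')]
```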

### Skip Lists

They store sorted data in layered linked lists and perform operations like insert, delete, and search in expected logarithmic time.

##### REALLY GOOD

For sorted data management and range queries, and they are easier to implement than balanced TREES.

##### BAD

They take extra memory for all those additional lists, and they are not widely known.
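
As an illustration, here is a compact skip list sketch with insert and search only. The class names, the maximum height of 16 and the coin-flip probability of 0.5 are assumptions chosen for this example, not a reference implementation.

```python
import random

class _Node:
    def __init__(self, key, value, height):
        self.key, self.value = key, value
        self.next = [None] * height        # next[i] is the successor on level i


class SkipList:
    """Minimal skip list sketch: insert and search only."""

    MAX_HEIGHT = 16

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._head = _Node(None, None, self.MAX_HEIGHT)   # sentinel, holds no data

    def _random_height(self):
        height = 1
        while height < self.MAX_HEIGHT and self._rng.random() < 0.5:
            height += 1
        return height

    def _predecessors(self, key):
        """For every level, the last node whose key is smaller than key."""
        preds = [self._head] * self.MAX_HEIGHT
        node = self._head
        for level in range(self.MAX_HEIGHT - 1, -1, -1):
            while node.next[level] is not None and node.next[level].key < key:
                node = node.next[level]
            preds[level] = node
        return preds

    def insert(self, key, value):
        preds = self._predecessors(key)
        candidate = preds[0].next[0]
        if candidate is not None and candidate.key == key:
            candidate.value = value        # key already present: overwrite
            return
        new = _Node(key, value, self._random_height())
        for level in range(len(new.next)):
            new.next[level] = preds[level].next[level]
            preds[level].next[level] = new

    def search(self, key):
        candidate = self._predecessors(key)[0].next[0]
        if candidate is not None and candidate.key == key:
            return candidate.value
        raise KeyError(key)

    def keys(self):
        node = self._head.next[0]
        while node is not None:
            yield node.key
            node = node.next[0]


sl = SkipList(seed=0)
for k in [30, 10, 20, 40]:
    sl.insert(k, str(k))
print(list(sl.keys()))   # [10, 20, 30, 40]
print(sl.search(20))     # 20
```

The bottom level is an ordinary sorted linked list; the upper levels are the "express lanes" that give the expected logarithmic search time.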

### The lucky charm ? ☀️

Summary about Maps - Hash tables - Skip lists:

Dictionaries are implemented as hash tables.

The hash function consists of a hash code and a compression function (for example the MAD method: multiply, add and divide).

Collision handling schemes - separate chaining (another mini map in each bucket), open addressing (linear probing, quadratic probing, double hashing).

Python dicts use open addressing with a pseudo-random probe sequence - (2/3 load factor)

### Even the basic c = a + b

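at module level boils down to calls to the namespace dictionary's `__getitem__` (to look up `a` and `b`) and `__setitem__` (to store `c`). Inside functions, CPython uses faster array-based local variables instead.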
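
One way to see this is to run the statement against a plain dict. In this sketch `namespace` is just an ordinary dictionary that `exec` uses as the global scope.

```python
namespace = {"a": 2, "b": 3}

# exec runs the statement with `namespace` as its global scope:
# `a` and `b` are looked up in the dict, and `c` is stored back into it.
exec("c = a + b", namespace)

print(namespace["c"])      # 5
print("c" in namespace)    # True
```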

In [1]:
# we use braces to make dictionaries
my_dict = {"key_one": 12, "key_two": 6}
empty_dict = {}

In [2]:
my_second_dict = {"a": 2, "b": 5, "c" : 8}
print(my_second_dict)
print(type(my_second_dict))

{'a': 2, 'b': 5, 'c': 8}
<class 'dict'>


In [3]:
dial_codes = [(55, "Brazil"), (81, "Japan"), (7, "Russia")]

# you can use comprehensions here too!
country_dial = {country:code for code, country in dial_codes}
print(country_dial) # {'Brazil': 55, 'Japan': 81, 'Russia': 7}

{'Brazil': 55, 'Japan': 81, 'Russia': 7}


In [4]:
# you can go deep with comprehensions
conditions = {code: country.upper() 
              for country, code in sorted(country_dial.items()) 
              if code < 70}

print(conditions) # {55: 'BRAZIL', 7: 'RUSSIA'}

{55: 'BRAZIL', 7: 'RUSSIA'}


The `collections.abc` module provides the `Mapping` and `MutableMapping` ABCs describing the interfaces of `dict` and similar types.

The main value of the ABCs is documenting and formalizing the standard interfaces for mappings, and serving as criteria for `isinstance` tests in code that needs to support mappings in a broad sense:

In [3]:
from collections import abc

my_dict = {}
print(isinstance(my_dict, abc.Container))             # True
print(isinstance(my_dict, abc.Collection))            # True
print(isinstance(my_dict, abc.Mapping))               # True
print(isinstance(my_dict, abc.MutableMapping))        # True

True
True
True
True
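
The same checks tell different mapping-like objects apart. This sketch uses a small hypothetical helper, `describe`, together with two standard-library types, `ChainMap` and `MappingProxyType`.

```python
from collections import ChainMap, abc
from types import MappingProxyType

def describe(obj):
    """Return (is a Mapping, is a MutableMapping) for obj."""
    return (isinstance(obj, abc.Mapping), isinstance(obj, abc.MutableMapping))

print(describe({}))                          # (True, True)
print(describe(ChainMap({}, {})))            # (True, True)
print(describe(MappingProxyType({"a": 1})))  # (True, False): read-only view
print(describe([("a", 1)]))                  # (False, False): a list of pairs is not a mapping
```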


In [4]:
workout = {"bench": 120, "squat": 140, "deadlift": 200}

# you can clear the dict
workout.clear()
print(workout) # {}

{}


In [7]:
workout = {"bench": 120, "squat": 140, "deadlift": 200}

# delete a single key
del workout["bench"]

print(workout) # {'squat': 140, 'deadlift': 200}

print(len(workout)) # 2

{'squat': 140, 'deadlift': 200}
2


In [15]:
# you can pop keys
print(workout.pop("squat")) # 140

# now the squat is gone!
print(workout)

# add new records!
workout["bench"] = 140
workout["squat"] = 180

print(workout)

140
{'deadlift': 200}
{'deadlift': 200, 'bench': 140, 'squat': 180}


In [16]:
# a normal for loop will give you keys
# because dictionaries iterate over their keys
for k in workout:
    print(f"Key is {k}")

Key is deadlift
Key is bench
Key is squat


#### Dictionary view objects

The objects returned by `dict.keys()`, `dict.values()` and `dict.items()` are view objects. 

They provide a dynamic view on the dictionary’s entries, which means that when the dictionary changes, the view reflects these changes.
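
A short sketch of what "dynamic" means here, using a separate made-up dictionary so the `workout` example stays untouched:

```python
workout_log = {"bench": 120, "squat": 140}
keys = workout_log.keys()          # a view, not a copy

workout_log["deadlift"] = 200      # change the dict after taking the view
print(list(keys))                  # ['bench', 'squat', 'deadlift']

# keys views also support set operations
print(keys & {"squat", "row"})     # {'squat'}
```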

In [19]:
# you can specifically ask for keys
for k in workout.keys():
    print(f"I only want keys! Here is one: {k}")

I only want keys! Here is one: deadlift
I only want keys! Here is one: bench
I only want keys! Here is one: squat


In [20]:
# you can specifically ask for values
for v in workout.values():
    print(f"I only want values. Here is one {v}")

I only want values. Here is one 200
I only want values. Here is one 140
I only want values. Here is one 180


In [21]:
# you can just get keys and values in your dict:
for k, v in workout.items():
    print(f"For the key: {k}, value is {v}")

For the key: deadlift, value is 200
For the key: bench, value is 140
For the key: squat, value is 180


In [22]:
# you can traverse inside dict, with enumerate
# this will give you index as well as key
for index, key in enumerate(workout):
    print(index, key, workout[key])

0 deadlift 200
1 bench 140
2 squat 180


In [23]:
print(f"Original workout {workout}")

# popitem removes and returns the most recently inserted (key, value) pair
workout.popitem()

# squat is gone
print(workout)

Original workout {'deadlift': 200, 'bench': 140, 'squat': 180}
{'deadlift': 200, 'bench': 140}


Since Python 3.7, dictionaries preserve insertion order, but they are not sorted by key. Access to elements by key is $O(1)$ on average. 😍
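
A quick sketch of how key order behaves in Python 3.7 and later (the names are made up for the example):

```python
order_demo = {}
for name in ["zoe", "adam", "mia"]:
    order_demo[name] = len(name)

print(list(order_demo))        # ['zoe', 'adam', 'mia'] (insertion order, not sorted)

order_demo["zoe"] = 99         # updating a value keeps the key's position
del order_demo["adam"]
order_demo["adam"] = 4         # re-inserting a deleted key moves it to the end
print(list(order_demo))        # ['zoe', 'mia', 'adam']
```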

In [24]:
# here is even more practice

hmap = {"gary" : 1, "alex": 3, "artour" : 7, "greg": 10, "andrej": 20}

print(hmap["gary"]) # 1

print(hmap.get("alexx", 100)) # 100

print("artour" in hmap) # True

print(hmap == {True: 1}) # False

print(max(hmap)) # "greg" - the largest key in lexicographic order

print(max(hmap, key=hmap.get)) # andrej - key for max changed

1
100
True
False
greg
andrej


### Application: Finding The min/max Value based on values of dict

Although Python dicts iterate over their keys, we can steer `max()`/`min()` with the `key` parameter so that comparisons are based on the values:

In [3]:
d = {100:2, 4:45, 7:78, 3: 123}

print("Max valued key is: ", max(d, key= lambda x: d[x])) # 3

print("Min of dict: ", min(d)) # 3

print("Min valued key is : ", min(d, key = lambda x : d[x])) # 100

Max valued key is:  3
Min of dict:  3
Min valued key is :  100


The novel concept for a hash table is the use of a **hash function** to map general keys to corresponding indices in a table.

The goal of a hash function, $h$, is to map each key $k$ to an integer in the range `[0, N − 1]`, where $N$ is the capacity of the bucket array for a hash table.

It is common to view the evaluation of a hash function, $h(k)$, as consisting of two portions:

- a hash code that maps a key $k$ to an integer.

- a compression function that maps the hash code to an integer within a range of indices, $[0, N − 1]$, for a bucket array.
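
The two portions can be sketched as follows. The hash code comes from Python's built-in `hash`, and the compression step uses the MAD (multiply, add and divide) formula $((a \cdot h + b) \bmod p) \bmod N$. The function name and the constants `a` and `b` are arbitrary choices for illustration; `p` is the Mersenne prime $2^{31} - 1$.

```python
def mad_compress(hash_code, n_buckets, p=2**31 - 1, a=1234567, b=7654321):
    """MAD compression: ((a * h + b) mod p) mod N, with prime p > N and 0 < a < p."""
    return ((a * hash_code + b) % p) % n_buckets

N = 8
for key in ["gary", 42, (1, 2), -7]:
    code = hash(key)                   # step 1: hash code, any integer
    index = mad_compress(code, N)      # step 2: compress into [0, N - 1]
    print(f"{key!r}: bucket {index}")
```

String hashes are randomized per process by default, so the bucket printed for `"gary"` can change from run to run, but every index stays inside $[0, N - 1]$.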

In [27]:
# defaultdicts are cool
# they let you have a default value for every key
# we will discover about them more, later in this notebook

from collections import defaultdict

f = defaultdict(list)

print(f) # defaultdict(<class 'list'>, {})

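# indexing a missing key does not raise: it inserts the default value
print(f[123]) # []

# the lookup above created the key, so get() finds [] and ignores the fallback 0
print(f.get(123, 0)) # []

another_def_dict = defaultdict(int)

# you can increment the value of a key
# which did not exist before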
for i in range(4):
    another_def_dict[i] += 1

print(another_def_dict)

defaultdict(<class 'list'>, {})
[]
[]
defaultdict(<class 'int'>, {0: 1, 1: 1, 2: 1, 3: 1})


Dicts are great frequency counters:

In [1]:
v = {}
j  = [1,2,3,2,3,32,32,1,2,3,2,2]
for elem in j:
    v[elem] = v.get(elem, 0) + 1

print(v) # {1: 2, 2: 5, 3: 3, 32: 2}

# or 
print()

from collections import Counter # has most_common

c = Counter(j)

print(c) # Counter({2: 5, 3: 3, 1: 2, 32: 2})

# or 
print()

from collections import defaultdict

d = defaultdict(int)

for elem in j:
    d[elem] += 1

print(d) # defaultdict(<class 'int'>, {1: 2, 2: 5, 3: 3, 32: 2})


{1: 2, 2: 5, 3: 3, 32: 2}

Counter({2: 5, 3: 3, 1: 2, 32: 2})

defaultdict(<class 'int'>, {1: 2, 2: 5, 3: 3, 32: 2})


## Examples

In [7]:
# Can we find the most frequent element in a given sequence?

def most_occurence(seq):
    # named counts, so that the built-in map is not shadowed
    counts = {}
    for elem in seq:
        counts[elem] = counts.get(elem, 0) + 1
    
    # this works
    # return max(counts, key=lambda x: counts[x])
    
    # or
    return max(counts, key=counts.get)

most_occurence([1,2,3,4,4,4,5,3,2,4,4,4]) # 4

4

In [3]:
def twoSum(nums: list[int], target: int) -> list[int]:
    # keys as numbers and indexes as values
    nums_to_index = {}
    
    # i is one of the answers
    for i , elem in enumerate(nums):
        complement = target - elem
        # if complement is in our dict
        if complement in nums_to_index:
            # order does not matter
            # return them!
            return [nums_to_index[complement], i]
        # if not, just add the element to our dictionary
        nums_to_index[elem] = i

print(twoSum([1,2,3,4,5], 7))

[2, 3]


In [5]:
from collections import Counter

def topKFrequent(nums: list[int], k: int) -> list[int]:
    c = Counter(nums)
    # elem[0] because most_common returns a list of tuples like
    # [(element, frequency), (element, frequency), ...]
    return [elem[0] for elem in c.most_common(k)]

print(topKFrequent(nums = [1,1,1,2,2,3], k = 2))

[1, 2]


In [8]:
"""Let B be an array of size n ≥ 6 containing 
integers from 1 to n - 5, inclusive, with exactly five 
repeated. 

Describe a good algorithm for finding the five 
integers in B that are repeated.
"""

def find_most_repeated_integer(data):
    # simplified version of the problem above:
    # return only the single most repeated integer
    
    counter = {}
    
    for elem in data:
        if elem in counter:
            counter[elem] += 1
        else:
            counter[elem] = 1

    return max(counter, key = counter.get)

find_most_repeated_integer([1,2,3,4,4,4,4,4,5,3])

4
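
The function above returns only one value. A possible sketch for the full problem, returning every repeated integer, could look like this (the function name `find_repeated` and the sample array are made up for illustration):

```python
from collections import Counter

def find_repeated(data):
    """Return every value that occurs more than once, in first-seen order."""
    counts = Counter(data)
    return [value for value, count in counts.items() if count > 1]

# n = 11, values from 1 to 6, five of them repeated
print(find_repeated([1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5]))   # [1, 2, 3, 4, 5]
```

This runs in $O(n)$ time with $O(n)$ extra space for the counter.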

In [30]:
"""
You are given an integer array nums and an integer k.

The frequency of an element x is the number of times 
it occurs in an array.

An array is called good if the frequency of each element 
in this array is less than or equal to k.

Return the length of the longest good subarray of nums.

A subarray is a contiguous non-empty sequence of 
elements within an array.

Example 1:

    Input: nums = [1,2,3,1,2,3,1,2], k = 2
    Output: 6
    Explanation: The longest possible good subarray is [1,2,3,1,2,3] 
        since the values 1, 2, and 3 occur at most twice in 
        this subarray. Note that the subarrays [2,3,1,2,3,1] and 
        [3,1,2,3,1,2] are also good.
    
    It can be shown that there are no good subarrays 
    with length more than 6.

Example 2:

    Input: nums = [1,2,1,2,1,2,1,2], k = 1
    Output: 2
    Explanation: The longest possible good subarray is [1,2] since 
    the values 1 and 2 occur at most once in this subarray. Note 
    that the subarray [2,1] is also good.
    
    It can be shown that there are no good subarrays 
    with length more than 2.

Example 3:

    Input: nums = [5,5,5,5,5,5,5], k = 4
    Output: 4
    Explanation: The longest possible good subarray is [5,5,5,5] since 
    the value 5 occurs 4 times in this subarray.
    
    It can be shown that there are no good subarrays 
    with length more than 4.
 

Constraints:

    1 <= nums.length <= 10^5
    1 <= nums[i] <= 10^9
    1 <= k <= nums.length

Takeaway:

    Freq counters -> default dicts are cool.
    
    two pointers, sliding window.

    Update attribute as you move, with att = max(att, new_value)

"""

from collections import defaultdict

class Solution:
    def maxSubarrayLength_(self, nums : list[int], k: int) -> int:
        # get the length
        n = len(nums)

        # make a map for frequency
        m = defaultdict(int)
        
        # a sliding window
        i, j = 0, 0
        ans = 1
        
        while i < n and j < n:
            # we are inside the borders of nums
            
            # increment map
            m[nums[j]] += 1
            
            while m[nums[j]] > k:
                # freq is bigger than k
                # decrease freq
                m[nums[i]] -= 1
                # move up the start
                i += 1
            
            # update answer
            ans = max(ans, j - i + 1)
            
            # increment end
            j += 1
        
        return ans
    
    def maxSubarrayLength(self, nums: list[int], k: int) -> int:
        # we are trying to find the 
        # length of a window

        # so how about using a sliding window LOL

        # define max length
        max_length = 0
        # make a map
        frequency = defaultdict(int)
        # pointer for left side
        start = 0
        
        for end, num in enumerate(nums):
            # increment on the fly
            frequency[num] += 1
            
            # If the frequency of any element exceeds 
            # k, adjust the window
            while frequency[num] > k:
                # drop the first element
                # decrease the freq
                frequency[nums[start]] -= 1
                # move start
                start += 1
            
            # Update the maximum length of the subarray
            max_length = max(max_length, end - start + 1)
            # [1,2,3,4]
            # end 3 - start 0
            # length -> 3 - 0 + 1 = 4
        
        return max_length

sol = Solution()
print(sol.maxSubarrayLength_(nums = [1,2,3,1,2,3,1,2], k = 2)) # 6
print(sol.maxSubarrayLength(nums = [1,2,1,2,1,2,1,2], k = 1)) # 2
print(sol.maxSubarrayLength(nums = [5,5,5,5,5,5,5], k = 4)) # 4

6
2
4


In [31]:
"""
Given an integer array nums and an integer k, 
return the number of good subarrays of nums.

A good array is an array where the number of 
different integers in that array is exactly k.

For example, [1,2,3,1,2] has 3 different 
integers: 1, 2, and 3.

A subarray is a contiguous part of an array.

Example 1:

    Input: nums = [1,2,1,2,3], k = 2
    
    Output: 7
    
    Explanation: 
    
        Subarrays formed with exactly 2 different
        integers: [1,2], [2,1], [1,2], [2,3],
                  [1,2,1], [2,1,2], [1,2,1,2]

Example 2:

    Input: nums = [1,2,1,3,4], k = 3
    
    Output: 3
    
    Explanation: 
    
        Subarrays formed with exactly 3 different 
            integers: [1,2,1,3], [2,1,3], [1,3,4].

Constraints:

    1 <= nums.length <= 2 * 10^4
    
    1 <= nums[i], k <= nums.length

Takeaway:

    Subarrays are just basically asking for sliding windows

"""

from collections import deque

class Solution:
    def subarraysWithKDistinct_(self, nums: list[int], k: int) -> int:
        # brute force
        # Memory limit exceeded
        
        # find all subarrays
        subarrays = []
        for i in range(len(nums)):
            for j in range(i+1, len(nums) +1):
                subarrays.append(nums[i:j])
                
        # print(subarrays)       
        res = 0
        # check if they are good
        for elem in subarrays:
            if len(set(elem)) == k:
                res += 1
        return res
    
    def subarraysWithKDistinct__(self, nums: list[int], k: int) -> int:
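        # DOES NOT WORK, kept only as a record of a failed attempt:
        # deque.append returns None, so set(temp) below raises a TypeError
        # once the window holds k distinct values, and a single window
        # cannot count every qualifying subarray anyway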
        
        # can we solve this with a sliding window ? 
        
        # until we have k distinct elements, we can expand window
        # if we went over k, we should shrink window
        
        window = deque()
        res = 0
        for elem in nums:
            if len(set(window)) < k:
                window.append(elem)
                
            elif len(set(window)) > k:
                window.popleft()
                
            elif len(set(window)) == k:
                res += 1
                # check if next element is unique 
                temp = window.append(elem)
                if len(set(temp)) ==  k + 1:
                    window.append(elem)
                    window.popleft()
            
            else:    
                res += 1
                print(window)
                
        return res
    
    def subarraysWithKDistinct(self,  nums: list[int], k: int) -> int:
        # We will use two sliding windows
        
        # For a window ending at the current element, we keep track of
        # the count of each number in the window. 
        # Let's call this count as window_count
        window_count = {}

        # We maintain two windows, one for left and one for right
        # window_count_left will have count of each number for the left window
        # Initially, both windows are empty

        window_count_left = {}
    
        left1, left2 = 0, 0
        result = 0

        for num in nums:
            window_count[num] = window_count.get(num, 0) + 1
            window_count_left[num] = window_count_left.get(num, 0) + 1
        
            # If the length of the unique elements in the current window
            # is greater than k, we contract the left window
            while len(window_count) > k:
                window_count[nums[left1]] -= 1
                if window_count[nums[left1]] == 0:
                    del window_count[nums[left1]]
                left1 += 1

            # While the second window still holds at least k distinct
            # elements, we contract it from the left
            while len(window_count_left) >= k:
                window_count_left[nums[left2]] -= 1
                if window_count_left[nums[left2]] == 0:
                    del window_count_left[nums[left2]]
                left2 += 1

            # The number of subarrays ending at the current element
            # with exactly k distinct elements
            # is the difference of the two left pointers
            result += left2 - left1

        return result
    

sol = Solution()
print(sol.subarraysWithKDistinct(nums = [1,2,1,2,3], k = 2)) # 7
print(sol.subarraysWithKDistinct(nums = [1,2,1,3,4], k = 3)) # 3

7
3
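
Another common way to phrase the same idea: count the subarrays with at most `k` distinct values, then subtract the count for at most `k - 1`. The sketch below uses made-up helper names and is an alternative formulation, not the solution shown above.

```python
from collections import defaultdict

def at_most_k_distinct(nums, k):
    """Count subarrays with at most k distinct values (sliding window)."""
    counts = defaultdict(int)
    left = 0
    total = 0
    distinct = 0
    for right, num in enumerate(nums):
        if counts[num] == 0:
            distinct += 1
        counts[num] += 1
        while distinct > k:
            counts[nums[left]] -= 1
            if counts[nums[left]] == 0:
                distinct -= 1
            left += 1
        total += right - left + 1      # every start in [left, right] gives a valid window
    return total

def exactly_k_distinct(nums, k):
    return at_most_k_distinct(nums, k) - at_most_k_distinct(nums, k - 1)

print(exactly_k_distinct([1, 2, 1, 2, 3], 2))   # 7
print(exactly_k_distinct([1, 2, 1, 3, 4], 3))   # 3
```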


In [32]:
"""
Given an integer array nums, return true if any value appears 
at least twice in the array, and return false if 
every element is distinct.

Example 1:

    Input: nums = [1,2,3,1]
    Output: true

Example 2:

    Input: nums = [1,2,3,4]
    Output: false

Example 3:

    Input: nums = [1,1,1,3,3,4,3,2,4,2]
    Output: true

Constraints:

    1 <= nums.length <= 10^5
    -10^9 <= nums[i] <= 10^9

Takeaway:

    You can use sets or you can use a dictionary
    
    You can get the largest value of a dict d with: max(d.values())
    
    CAREFUL - max(d, key=d.get) WILL GIVE YOU THE KEY not the VALUE
    
    You can return early inside the loop to exit faster.
    
    sets have add() and remove() methods

"""

class Solution:

    def containsDuplicate__(self, nums) -> bool:
        # my first solution
        freq = {elem : 0 for elem in nums}
        for elem in nums:
            freq[elem] += 1
        max_occurance = max(freq.values())
        return max_occurance > 1
 
    def containsDuplicate_(self, nums) -> bool:
        # another one would be
        freq = {elem : 0 for elem in nums}
        for elem in nums:
            freq[elem] += 1
            if freq[elem] > 1:
                return True
        return False

    def containsDuplicate(self, nums)-> bool:
        # sets are really helpful too.
        hset = set()
        
        for elem in nums:
            if elem in hset:
                return True
            else:
                hset.add(elem)

        return False 


if __name__ == '__main__':
    sol = Solution()
    print(sol.containsDuplicate([1,2,3,4]))
    print(sol.containsDuplicate([1,2,3,4,3]))
    print(sol.containsDuplicate_([1,2,3,4]))
    print(sol.containsDuplicate_([1,2,3,4,3]))
    print(sol.containsDuplicate__([1,2,3,4]))
    print(sol.containsDuplicate__([1,2,3,4,3]))

False
True
False
True
False
True
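
For completeness, the set idea also fits in one line. Unlike the loop with an early `return`, this sketch always scans the whole input; the function name is made up for the example.

```python
def contains_duplicate(nums):
    """True if any value appears at least twice."""
    return len(set(nums)) != len(nums)

print(contains_duplicate([1, 2, 3, 4]))      # False
print(contains_duplicate([1, 2, 3, 4, 3]))   # True
```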


In [33]:
"""
Given two strings s and t, return true if t is an
anagram of s, and false otherwise.

An Anagram is a word or phrase formed by rearranging the 
letters of a different word or phrase, typically using 
all the original letters exactly once.
 
Example 1:

    Input: s = "anagram", t = "nagaram"
    Output: true

Example 2:

    Input: s = "rat", t = "car"
    Output: false
 
Constraints:

    1 <= s.length, t.length <= 5 * 10^4
    s and t consist of lowercase English letters.

Takeaway:

    Counter is just made for the job!
"""

from collections import Counter

class Solution:

    def isAnagram__(self, s: str, t: str) -> bool:
        # my first solution - brute force
        # each count() call scans the whole string: O(n * u) for u distinct characters
        dict_1 = {character: s.count(character)  for character in set(s)}
        dict_2 = {character: t.count(character)  for character in set(t)}
        return dict_1 == dict_2
    
    def isAnagram_(self, s: str, t: str) -> bool:
        # second solution - way better for bigger sized strings
        return Counter(s) == Counter(t)  

    def isAnagram___(self, s: str, t: str) -> bool:
        # adding elements and comparing the dicts would work
        map_1, map_2 = {}, {}

        for elem in s:
            map_1[elem] = map_1.get(elem, 0) + 1

        for elem in t:
            map_2[elem] = map_2.get(elem, 0) + 1

        return map_1 == map_2

    def isAnagram(self, s:str, t:str) -> bool:
        # even faster one

        # basically, we can fill up the dictionary 
        # with one string
        # and empty it with another

        # o(n) time complexity

        if len(s) != len(t):
            return False

        # let's use only a single dict 
        char_count = {}
        for char in s:
            # if it already exists, add 1 to value
            char_count[char] = char_count.get(char, 0) + 1

        for char in t:
            # if t has a different character
            if char not in char_count:
                return False
            char_count[char] -= 1

            # if we expire the count of a character
            if char_count[char] == 0:
                del char_count[char]

        # expecting an empty dict in the end
        return not char_count

sol = Solution()
print(sol.isAnagram(s = "rat", t  = "car")) # False
print(sol.isAnagram(s = "anagram", t  = "nagaram")) # True

print(sol.isAnagram_(s = "rat", t  = "car")) # False
print(sol.isAnagram_(s = "anagram", t  = "nagaram")) # True


False
True
False
True


In [34]:
"""
Given an array of integers nums and an integer target, return 
indices of the two numbers such that they add up to target.

You may assume that each input would have exactly one 
solution, and you may not use the same element twice.

You can return the answer in any order.

Example 1:

    Input: nums = [2,7,11,15], target = 9
    Output: [0,1]
    
    Explanation: Because nums[0] + nums[1] == 9, we return [0, 1].

Example 2:

    Input: nums = [3,2,4], target = 6
    Output: [1,2]

Example 3:

    Input: nums = [3,3], target = 6
    Output: [0,1]

Constraints:

    2 <= nums.length <= 10^4
    -10^9 <= nums[i] <= 10^9
    -10^9 <= target <= 10^9
    Only one valid answer exists.

Takeaway: 

    Hashmaps are doing wonders.
"""

class Solution:
    def twoSum(self, nums: list[int], target: int) -> list[int]:
        # only one solution
        # make index map
        index_map = {}

        for i, elem in enumerate(nums):
            difference = target - elem
            if difference in index_map:
                return [i, index_map[difference]]
            index_map[elem] = i

sol = Solution()

print(sol.twoSum(nums = [2,7,11,15], target = 9))
print(sol.twoSum(nums = [3,2,4], target = 6))

[1, 0]
[2, 1]


In [35]:
"""
Given an array of strings strs, group the anagrams together. 

You can return the answer in any order.

An Anagram is a word or phrase formed by rearranging the letters
of a different word or phrase, typically using all the original 
letters exactly once.

Example 1:

    Input: strs = ["eat","tea","tan","ate","nat","bat"]
    Output: [["bat"],["nat","tan"],["ate","eat","tea"]]

Example 2:

    Input: strs = [""]
    Output: [[""]]

Example 3:

    Input: strs = ["a"]
    Output: [["a"]]

Constraints:

    1 <= strs.length <= 10^4
    0 <= strs[i].length <= 100
    strs[i] consists of lowercase English letters.


Takeaway:

    We can use a dictionary to keep key value relationships 
    between sequence elements and their Counters
    
    We can sort all elements in sequence to unify the frequency of characters
    
    if not seen before in the dictionary, we can initialize an empty list
    later to append on it.

"""

from collections import Counter
from collections import defaultdict

class Solution:
    def groupAnagrams(self, strs: list[str]) -> list[list[str]]:
        # we can hold all strings as counters, together
        groups = defaultdict(list)
        
        for s in strs:
            # we can use a frozenset to unify strings
            # and make them comparable
            groups[frozenset(Counter(s).items())].append(s)

        return list(groups.values())

    def groupAnagrams_(self, strs: list[str]) -> list[list[str]]:
        # we can also solve it with sorting the strings:
        strs_map = {}

        for s in strs:
            # sort the word
            sorted_str = "".join(sorted(s))

            if sorted_str not in strs_map:
                # make a new key for this element
                strs_map[sorted_str] = []
            
            # add the elem to that key
            strs_map[sorted_str].append(s)
            
        return list(strs_map.values())

    def groupAnagrams__(self, strs: list[str]) -> list[list[str]]:
        # we can also take the collections route with tuples
        res = defaultdict(list)
        for s in strs:
            # we can use a tuple for our dict keys
            # sorted returns a list, which is not hashable
            res[tuple(sorted(s))].append(s)
        return list(res.values()) 

sol = Solution()
print(sol.groupAnagrams(strs = ["eat","tea","tan","ate","nat","bat"]))
print(sol.groupAnagrams_(strs = ["eat","tea","tan","ate","nat","bat"]))
print(sol.groupAnagrams__(strs = ["eat","tea","tan","ate","nat","bat"]))

[['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]
[['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]
[['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]


In [36]:
"""
Given an integer array nums and an integer k, return the k most frequent 
elements. You may return the answer in any order.

Example 1:

    Input: nums = [1,1,1,2,2,3], k = 2
    Output: [1,2]

Example 2:

    Input: nums = [1], k = 1
    Output: [1]
 
Constraints:

    1 <= nums.length <= 10^5
    -10^4 <= nums[i] <= 10^ 4
    k is in the range [1, the number of unique elements in the array].
    It is guaranteed that the answer is unique.
 
Follow up: Your algorithm's time complexity must be better 
than O(n log n), where n is the array's size.

Takeaway:

    USE COLLECTIONS. A library that is written by competent 
    people is way better than your weird approaches.

    Counter has a most_common method.

    List the n most common elements and their counts from the most 
    common to the least. If n is None, then list all element counts.

"""

from collections import Counter
from heapq import nlargest

class Solution:
    
    def topKFrequent(self, nums: list[int], k: int) -> list[int]:
        # we can use Counters with most_common
        c = Counter(nums)
        return [elem[0] for elem in c.most_common(k)]


    def topKFrequent_(self, nums: list[int], k: int) -> list[int]:
        # or we can take the heap route just to use, nlargest

        # edge case 
        if k == len(nums):
            return nums

        c = Counter(nums)
        
        return nlargest(k, c.keys(), key = c.get)


sol = Solution()
print(sol.topKFrequent(nums = [1,1,1,2,2,3], k = 2))
print(sol.topKFrequent(nums = [1], k = 1))

print(sol.topKFrequent_(nums = [1,1,1,2,2,3], k = 2))
print(sol.topKFrequent_(nums = [1], k = 1))

[1, 2]
[1]
[1, 2]
[1]
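
For the follow-up (better than $O(n \log n)$), one well-known option is bucket sort on frequencies: no value can occur more than `len(nums)` times, so one bucket per possible frequency is enough. The function name below is made up for this sketch.

```python
from collections import Counter

def top_k_frequent_buckets(nums, k):
    """Top-k frequent values in O(n) using one bucket per possible frequency."""
    counts = Counter(nums)
    buckets = [[] for _ in range(len(nums) + 1)]   # buckets[f] holds values seen f times
    for value, freq in counts.items():
        buckets[freq].append(value)

    result = []
    for freq in range(len(buckets) - 1, 0, -1):    # from most frequent to least
        for value in buckets[freq]:
            result.append(value)
            if len(result) == k:
                return result
    return result

print(top_k_frequent_buckets([1, 1, 1, 2, 2, 3], 2))   # [1, 2]
print(top_k_frequent_buckets([1], 1))                  # [1]
```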


In [1]:
"""
Given an integer array nums, return an array answer such that 
answer[i] is equal to the product of all the elements of nums except nums[i].

The product of any prefix or suffix of nums is guaranteed 
to fit in a 32-bit integer.

You must write an algorithm that runs in O(n) time and 
without using the division operation.

Example 1:

    Input: nums = [1,2,3,4]
    Output: [24,12,8,6]


Example 2:

    Input: nums = [-1,1,0,-3,3]
    Output: [0,0,9,0,0]
 
Constraints:

    2 <= nums.length <= 10^5
    -30 <= nums[i] <= 30
    
    The product of any prefix or suffix of nums is 
        guaranteed to fit in a 32-bit integer.
 

Follow up: Can you solve the problem in O(1) extra space complexity? 
(The output array does not count as extra space for space complexity analysis.)

Takeaway:

    If you multiply the product of everything to the left of an element 
    by the product of everything to its right, you get the product of 
    all elements except that one.

"""

import random

# for an extra test
from time import perf_counter_ns

class Solution:
    def productExceptSelf_(self, nums: list[int]) -> list[int]:
        # can we brute force it?
        # yeah, but it is not o(n), here it is anyway
        n = len(nums)
        result = [1] * n
        for i in range(n):
            product = 1 
            for j in range(n):
                if j != i:
                    product *= nums[j]
            result[i] = product
        
        return result


    def productExceptSelf(self, nums: list[int]) -> list[int]:
        # here is how you solve it in o(n)

        # how can we solve it in o(n) ? 
        # we basically cannot have nested loops,
        # we can iterate over the list only. No sorting.
        
        # decomposition!

        n = len(nums)
        result = [1] * n

        # compute products to the left of each element
        # and store it in result
        left_product = 1
        for i in range(n):
            # multiply the result element with current 
            # left_product
            result[i] *= left_product
            # adjust the left product with given list element
            left_product *= nums[i]

        # compute products to the right of each element
        # and multiply them into result
        right_product = 1

        # in backwards:
        for i in range(n-1, -1, -1):
            # multiply the result element with current 
            # right_product
            result[i] *= right_product
            # adjust the right product with given list element
            right_product *= nums[i]

        return result


if __name__ == "__main__":
    sol = Solution()
    
    print(sol.productExceptSelf_(nums = [1,2,3,4]))
    print(sol.productExceptSelf_(nums = [-1,1,0,-3,3]))
    
    # a = perf_counter_ns()
    # print(sol.productExceptSelf_(random.sample(range(1,234623576), 30)))
    # b = perf_counter_ns()
    # print(F"Brute force took {(b - a)} nanoseconds")
    
    print(sol.productExceptSelf(nums = [1,2,3,4]))
    print(sol.productExceptSelf(nums = [-1,1,0,-3,3]))
    
    # this will work!
    # c = perf_counter_ns()
    # print(sol.productExceptSelf(random.sample(range(1,234623576), 30)))
    # d = perf_counter_ns()
    # print(F"Linear Time solution took {(d - c)} nanoseconds")

[24, 12, 8, 6]
[0, 0, 9, 0, 0]
[24, 12, 8, 6]
[0, 0, 9, 0, 0]


In [38]:
"""
Design a time-based key-value data structure that can store multiple values
for the same key at different time stamps and retrieve the key's value
at a certain timestamp.

Implement the TimeMap class:

    TimeMap() Initializes the object of the data structure.

    void set(String key, String value, int timestamp) Stores the key 'key' 
        with the value 'value' at the given time timestamp.

    String get(String key, int timestamp) Returns a value such that set was 
        called previously, with timestamp_prev <= timestamp. If there are multiple such 
        values, it returns the value associated with the largest timestamp_prev.
        If there are no values, it returns "".
 
Example 1:

    Input

        ["TimeMap", "set", "get", "get", "set", "get", "get"]

        [[], ["foo", "bar", 1], ["foo", 1], ["foo", 3], 
                ["foo", "bar2", 4], ["foo", 4], ["foo", 5]]

    Output

        [null, null, "bar", "bar", null, "bar2", "bar2"]

    Explanation

        TimeMap timeMap = new TimeMap();

        timeMap.set("foo", "bar", 1);  // store the key "foo" and value "bar" 
        along with timestamp = 1.

        timeMap.get("foo", 1);         // return "bar"

        timeMap.get("foo", 3);         // return "bar", since there is no value 
        corresponding to foo at timestamp 3 or timestamp 2, so the only value 
        is the one at timestamp 1, "bar".

        timeMap.set("foo", "bar2", 4); // store the key "foo" and value "bar2" along
         with timestamp = 4.

        timeMap.get("foo", 4);         // return "bar2"

        timeMap.get("foo", 5);         // return "bar2"
 

Constraints:

    1 <= key.length, value.length <= 100
    key and value consist of lowercase English letters and digits.
    1 <= timestamp <= 10^7
    All the timestamps timestamp of set are strictly increasing.
    At most 2 * 10^5 calls will be made to set and get.

Takeaway:

    Obviously we can use a dictionary.
"""

class TimeMap:
    """The idea of yourself is not bad.
    For the get method, you can use the fact that timestamps 
    are strictly increasing, so each key's list is already sorted.
    If that was not the case, you would have to sort the list 
    of timestamps for each get."""

    def __init__(self):
        # key is a string
        # value is a list of value and timestamp
        self.map = {}

    def set(self, key: str, value: str, timestamp: int) -> None:
        if key not in self.map:
            self.map[key] = []

        self.map[key].append([value, timestamp])
        
    def get(self, key: str, timestamp: int) -> str:
        result = ""

        values = self.map.get(key, [])

        # binary search
        l, r = 0 , len(values) - 1
        while l <= r:
            m = l + ((r - l) // 2)
            if values[m][1] <= timestamp:
                # closest we have seen so far
                result = values[m][0]
                # search to the right of the sequence for a later match
                l = m + 1
            else:
                r = m - 1

        return result

# Your TimeMap object will be instantiated and called as such:
# obj = TimeMap()
# obj.set(key,value,timestamp)
# param_2 = obj.get(key,timestamp)
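
Because timestamps arrive in strictly increasing order, each key's list is already sorted, so the hand-written binary search could also be delegated to the standard-library `bisect` module. The sketch below is one possible variant rather than the solution above: the class name `BisectTimeMap` and the two parallel lists per key are assumptions made for illustration.

```python
from bisect import bisect_right
from collections import defaultdict


class BisectTimeMap:
    """Time-based key-value store; assumes timestamps per key only increase."""

    def __init__(self):
        # key -> sorted timestamps, and key -> values in the same order
        self._times = defaultdict(list)
        self._values = defaultdict(list)

    def set(self, key: str, value: str, timestamp: int) -> None:
        self._times[key].append(timestamp)
        self._values[key].append(value)

    def get(self, key: str, timestamp: int) -> str:
        # .get() avoids creating an entry for an unknown key
        times = self._times.get(key, [])
        # index of the first stored timestamp strictly greater than `timestamp`
        i = bisect_right(times, timestamp)
        return self._values[key][i - 1] if i else ""


tm = BisectTimeMap()
tm.set("foo", "bar", 1)
tm.set("foo", "bar2", 4)

print(tm.get("foo", 1))      # bar
print(tm.get("foo", 3))      # bar
print(tm.get("foo", 5))      # bar2
print(repr(tm.get("foo", 0)))      # ''
print(repr(tm.get("missing", 9)))  # ''
```

Keeping timestamps in their own list lets `bisect_right` compare plain integers, at the cost of maintaining two lists in step.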

In [39]:
"""
Given an array of integers nums and an integer target, return 
indices of the two numbers such that they add up to target.

You may assume that each input would have exactly 
one solution, and you may not use the same 
element twice.

You can return the answer in any order.

Example 1:

    Input: nums = [2,7,11,15], target = 9
    
    Output: [0,1]
    
    Explanation: 
    
        Because nums[0] + nums[1] == 9, we return [0, 1].

Example 2:

    Input: nums = [3,2,4], target = 6
    
    Output: [1,2]

Example 3:

    Input: nums = [3,3], target = 6
    
    Output: [0,1]

Constraints:

    2 <= nums.length <= 10^4
    
    -10^9 <= nums[i] <= 10^9
    
    -10^9 <= target <= 10^9
    
    Only one valid answer exists.

Follow-up: 
    
    Can you come up with an algorithm that is 
        less than O(n^2) time complexity?

Takeaway:

    dictionaries are wonderful.

    you can fill them while traversing.

"""

class Solution:
    def twoSum__(self, nums: list[int], target: int) -> list[int]:
        # we are looking for indices
        # use every element once

        # this solution works, but a bit messy

        indices_map = {elem: index for index, elem in enumerate(nums)}

        for i, num in enumerate(nums):
            remainder = target - num
            if remainder in indices_map:
                if i != indices_map[remainder]:
                    return [i, indices_map[remainder]]

    def twoSum_(self, nums: list[int], target: int) -> list[int]:
        # this works too
        # we do not have to populate 
        # the dict right away
        num_map = {}

        for i, num in enumerate(nums):
            complement = target - num
            
            if complement in num_map:
                # found the solution
                return [num_map[complement], i]

            # otherwise, just add the element to map
            num_map[num] = i
        
        return None
    
    def twoSum(self, nums: list[int], target: int) -> list[int]:
        # more compact version: store the complement each
        # element is waiting for, keyed to that element's index
        needed = {}
        # index and element traversal
        for i, n in enumerate(nums):
            if n in needed:
                return [needed[n], i]
            else:
                needed[target - n] = i

sol = Solution()
print(sol.twoSum(nums = [2,7,11,15], target = 9)) # [0, 1]
print(sol.twoSum(nums = [3,2,4], target = 6)) # [1, 2]

[0, 1]
[1, 2]


In [40]:
"""
Roman numerals are represented by seven 
different symbols: 

I, V, X, L, C, D and M.

Symbol       Value
I             1
V             5
X             10
L             50
C             100
D             500
M             1000

For example, 2 is written as II in Roman numeral, just 
two ones added together. 

12 is written as XII, which is simply X + II. 

The number 27 is written as XXVII, which is XX + V + II.

Roman numerals are usually written largest 
to smallest from left to right. 

However, the numeral for four is not IIII. 

Instead, the number four is written as IV. 

Because the one is before the five we subtract 
it making four. 

The same principle applies to the number nine, which 
is written as IX. 

There are six instances where subtraction is used:

    I can be placed before V (5) and X (10) to make 4 and 9. 

    X can be placed before L (50) and C (100) to make 40 and 90. 

    C can be placed before D (500) and M (1000) to make 400 and 900.

Given a roman numeral, convert it to an integer.

Example 1:

    Input: s = "III"
    
    Output: 3
    
    Explanation: III = 3.

Example 2:

    Input: s = "LVIII"
    
    Output: 58
    
    Explanation: L = 50, V= 5, III = 3.

Example 3:

    Input: s = "MCMXCIV"
    
    Output: 1994
    
    Explanation: 
    
        M = 1000, CM = 900, XC = 90 and IV = 4.

Constraints:

    1 <= s.length <= 15
    
    s contains only the characters 
        ('I', 'V', 'X', 'L', 'C', 'D', 'M').
    
    It is guaranteed that s is a valid 
        roman numeral in the range [1, 3999].

Takeaway:

    Control flows require mastery.

    Dict and reverse traversal.

"""

class Solution:

    def romanToInt__(self, s: str) -> int:
        # from a homie - different approach
        total = 0
        
        # dictionary for roman numerals
        val_map = {"0": 0, 
                   "I": 1, 
                   "V": 5, 
                   "X": 10, 
                   "L": 50, 
                   "C": 100, 
                   "D": 500, 
                   "M": 1000}
        
        # Adding a special "0" character to the end 
        # of every input,  

        # so that we iterate from index 1:n and look 
        # back at the previous character.
        s += "0"
        
        for i in range(1, len(s)):
            
            v = val_map[s[i-1]]
            
            if val_map[s[i]] > v:
                total -= v
            
            else:
                total += v
        
        return total   
    
    
    def romanToInt_(self, s: str) -> int:
        # we can make a map for values
        # for each element, lookup the value
        # do not forget the edge cases

        # a sliding window can work

        roman_map = {"I": 1,
                     "V": 5, 
                     "X": 10,
                     "L": 50,
                     "C": 100,
                     "D": 500,
                     "M": 1000}

        edge_cases = {"IV": 4,
                      "IX": 9,
                      "XL": 40,
                      "XC": 90,
                      "CD": 400,
                      "CM": 900} 

        start = 0
        result = 0

        for end in range(1, len(s)+1):

            if start >= end:
                continue

            if s[start:end] in ("I", "X", "C"):
                if s[start:end + 1] in edge_cases:
                    result += edge_cases[s[start:end + 1]]
                    start += 2
                else:
                    result += roman_map[s[start:end]]
                    start += 1
            else:
                result += roman_map[s[start:end]]
                start += 1

        return result

    def romanToInt(self, s: str) -> int:
        # another approach would be traversing in reverse
        
        roman_map = {
            'I': 1,
            'V': 5,
            'X': 10,
            'L': 50,
            'C': 100,
            'D': 500,
            'M': 1000
        }

        # Initialize variables to keep track of 
        # the total value and the previous value
        result = 0
        prev = 0

        # Iterate through the string in reverse
        for i in s[::-1]:
            # integer value of current roman numeral
            curr = roman_map[i]  

            # If the previous value is greater than 
            # the current value, subtract the current value

            # like a "IV", the total should be subtracted by I
            if prev > curr:
                result -= curr
            else:
                # If the current value is greater than or 
                # equal to the previous value, 
                # add the current value
                result += curr
                
                # Update the previous value to the 
                # current value for the next iteration
                prev = curr

        return result  # Return the total integer value


sol = Solution()

print(sol.romanToInt_(s = "LVIII")) # 58
print(sol.romanToInt_(s = "III"))   # 3
print(sol.romanToInt_(s = "MCMXCIV")) # 1994

print()

print(sol.romanToInt(s = "LVIII")) # 58
print(sol.romanToInt(s = "III"))   # 3
print(sol.romanToInt(s = "MCMXCIV")) # 1994

58
3
1994

58
3
1994


## `set`'s of Python

Python’s `set` class represents the mathematical notion of a `set`, namely a collection of elements, **without duplicates**, and without an inherent order to those elements.  

The major advantage of using a `set`, as opposed to a `list`, is that it has **a highly optimized method for checking whether a specific element is contained in the set.**

The set does not maintain the elements in any particular order. 

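Unlike `dict`, which preserves insertion order since Python 3.7, a `set` makes no ordering guarantee: `set.pop()` removes and returns an arbitrary element, so do not rely on it behaving like a queue.

The second restriction is that only `hashable` objects, which in practice means instances of **immutable types**, can be added to a Python set. 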

Python uses curly braces `{}` as delimiters for a set, for example, as `{17}` or ```{ "red" , "green" , "blue" }```. 

The exception to this rule is that `{}` does not represent an empty set; for historical reasons, it represents an empty dictionary. 

Instead, the constructor syntax `set()` produces an empty set.  
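
A quick check of these rules, as a minimal sketch: it confirms what `{}` and `set()` produce, shows duplicates collapsing, and shows that `pop()` hands back some element of the set without promising which one. The variable names are illustrative only.

```python
# {} is an empty dict; set() is the empty set
print(type({}).__name__)      # dict
print(type(set()).__name__)   # set

letters = set("apple")        # the duplicate "p" collapses
print(len(letters))           # 4

before = set(letters)
popped = letters.pop()        # removes an arbitrary element
print(popped in before)       # True
print(len(letters))           # 3
```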

In [41]:
my_set = {"sezai", "fatmanur", "ozlem"}
print(my_set)

empty_set = set() 

print(set("apple")) # {"a", "p", "l", "e"} 

{'ozlem', 'sezai', 'fatmanur'}
{'l', 'p', 'e', 'a'}


Therefore, objects such as integers, floating-point numbers, and character strings are eligible to be elements of a set. It is possible to maintain a set of tuples, but not a set of lists or a set of sets, as lists and sets are mutable. The `frozenset` class is an immutable form of the set type, so it is legal to have a set of `frozensets`.

In [42]:
cold = frozenset((1,))
print(cold)

frozenset({1})
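
The hashability rule can be tried out directly. In this small sketch (the names `points`, `groups` and `error_message` are made up for the example), tuples and frozensets are accepted as elements, while a list is rejected with a `TypeError`.

```python
points = {(0, 0), (1, 2)}                      # tuples of hashable items are hashable
groups = {frozenset({1, 2}), frozenset({3})}   # frozensets can be nested in a set

error_message = ""
try:
    bad = {[1, 2]}                             # a list is mutable, hence unhashable
except TypeError as exc:
    error_message = str(exc)

print(error_message)
print((1, 2) in points)                        # True
print(frozenset({2, 1}) in groups)             # True
```

The exact wording of the error message may differ between Python versions, so it is printed here rather than assumed.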



In [43]:
# There are add and discard methods for sets

s = set()
s.add(1)
s.add(2)
s.add(3)
print(s)

s.discard(4) # No error
s.discard(4) # No error
print(s)

{1, 2, 3}
{1, 2, 3}


In [44]:
# do not worry about this
import traceback

# KeyError: 4
try:
    # try to remove 4 from set
    s.remove(4) 
except Exception as e:
    # print the error if you cannot remove it
    print(traceback.format_exc())

Traceback (most recent call last):
  File "/tmp/ipykernel_35472/1459238347.py", line 7, in <module>
    s.remove(4)
KeyError: 4



In [45]:
# containment check is fantastic, it is O(1) on average
print(s) # {1, 2, 3}

print(3 in s) # True
print(4 not in s) # True

{1, 2, 3}
True
True


### Here are operations that are (kinda) unique to sets: 

`S | T` - `S |= T` UNION

`S - T` - `S -= T` DIFFERENCE

`S & T` - `S &= T` INTERSECTION

`S ^ T` - `S ^= T` SYMMETRIC DIFFERENCE (elements that are in exactly one of S or T)

Sets do not provide order between elements, so comparison is not lexicographic. Instead, `<=`, `<`, `>=` and `>` test subset and superset relationships.

**No orders here**. 

Also, sets do not support indexing or slicing with `[]`. 

`Sets` and `frozensets` support the following operators:  

In [3]:
hset = {1 , 2 , 3, 4}
hset_2 = {6, 7, 9, 4}
hset_3 = {1, 2, 3}

print(f" is {2} in the hset: {2 in hset}") # True
# is 2 in the hset: True

# print([] in hset) # unhashable type: 'list'

print(f" is {7} in the hset: {7 in hset}") # False
# is 7 in the hset: False

print(f"are these equal: {hset == hset_2}" ) # False
# are these equal: False

print( f" smaller or equal ? : {hset <= hset_2}") # False
# smaller or equal ? : False

 is 2 in the hset: True
 is 7 in the hset: False
are these equal: False
 smaller or equal ? : False


In [4]:
print(hset < hset_2 ) # False
print(hset >= hset_3) # True
print(hset > hset_3 ) # True

print(hset | hset_2) 
# {1, 2, 3, 4, 6, 7, 9}

print(hset & hset_2) # {4}

print(f"big set - small set: {hset_2 - hset}")
# big set - small set: {9, 6, 7}

# only in set 1 OR set 2
print(hset ^ hset_2) 
# {1, 2, 3, 6, 7, 9}

hset.add(12)
hset.remove(12)
hset.discard(7) 
# does not give an error even though
# 7 is not in the set

False
True
True
{1, 2, 3, 4, 6, 7, 9}
{4}
big set - small set: {9, 6, 7}
{1, 2, 3, 6, 7, 9}
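
The cells above use the binary operators; the in-place forms (`|=`, `-=`, `&=`, `^=`) listed earlier modify the left-hand set instead of building a new one. A minimal sketch, with `left` and `right` as illustrative names:

```python
left = {1, 2, 3, 4}
right = {4, 6, 7, 9}

u = set(left)
u |= right                 # in-place union
print(u == {1, 2, 3, 4, 6, 7, 9})    # True

d = set(left)
d -= right                 # in-place difference
print(d == {1, 2, 3})                # True

i = set(left)
i &= right                 # in-place intersection
print(i == {4})                      # True

x = set(left)
x ^= right                 # in-place symmetric difference
print(x == {1, 2, 3, 6, 7, 9})       # True

# the named methods accept any iterable, not only sets
print(left.union([10, 11]) == {1, 2, 3, 4, 10, 11})   # True
print(left.issubset(range(10)))                        # True
```

Since `frozenset` is immutable, an augmented assignment on a frozenset rebinds the name to a new object rather than changing the existing one.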


### Here are some examples

In [48]:
# Check If N and Its Double Exist

"""
Given an array arr of integers, check if there exist 
two indices i and j such that :

    i != j
    0 <= i, j < arr.length
    arr[i] == 2 * arr[j]
 
Example 1:

    Input: arr = [10,2,5,3]
    Output: true
    
    Explanation: For i = 0 and j = 2, 
            arr[i] == 10 == 2 * 5 == 2 * arr[j]

Example 2:

    Input: arr = [3,1,7,11]
    Output: false
        
    Explanation: There is no i and j that 
        satisfy the conditions.
 

Constraints:

    2 <= arr.length <= 500
    -10^3 <= arr[i] <= 10^3

Takeaways:

    Using sets is a wonderful tool here.

"""
class Solution:
    def checkIfExist_(self, arr: list[int]) -> bool:
        # brute force: compare every pair, which is O(n^2)
        n = len(arr)
        for i in range(n) :
            for j in range(i) :
                # up until i, check every position
                if (arr[i]==2*arr[j]) or (arr[j]==2*arr[i]) :
                    return True 
        return False    
    
    def checkIfExist(self, arr: list[int]) -> bool:
        # better solution would be using a set
        seen = set()
        
        for i in range(len(arr)):
            # have we encountered before?
            if arr[i] * 2 in seen:
                return True
            # an even number may be the double of something seen;
            # integer division keeps the lookup value an int
            elif arr[i] % 2 == 0 and arr[i] // 2 in seen:
                return True
            # just add to set
            else:
                seen.add(arr[i])
        return False

sol = Solution()
print(sol.checkIfExist(arr = [10,2,5,3]))
print(sol.checkIfExist(arr = [3,1,7,11]))

True
False


In [49]:
"""Genius - Real Insight"""

# C-10.34 
# Computing a hash code can be expensive, especially for lengthy keys.
# 
#  In our hash table implementations, we compute the hash code when first inserting
#  an item, and recompute each item’s hash code each time we resize
# our table. 
# 
# Python’s dict class makes an interesting trade-off. The hash
# code is computed once, when an item is inserted, and the hash code is
# stored as an extra field of the item composite, so that it need not be recomputed.
#
#  Reimplement our HashTableBase class to use such an approach.

from collections.abc import MutableMapping
from random import randrange

class HashTableBase(MutableMapping):
    """Abstract base class for hash table implementations."""

    def __init__(self, cap=11):
        """Make an empty hash table."""
        self._table = cap * [None]                      # Make an empty table (list)
        self._n = 0                                     # Number of entries in the map
        self._prime = 109345121                         # Prime number for MAD compression
        self._scale = 1 + randrange(self._prime - 1)    # Scale for MAD compression
        self._shift = randrange(self._prime)            # Shift for MAD compression
        self._hash_codes = {}                           # Dictionary to store hash codes

    def __len__(self):
        """Return the number of entries in the hash table."""
        return self._n

    def _hash_function(self, k):
        """Compute the bucket index for key k, caching the raw hash code."""
        # CACHING: store the uncompressed hash code only, because the
        # compressed index depends on len(self._table) and would
        # become stale after a resize
        if k not in self._hash_codes:
            self._hash_codes[k] = hash(k)  # Store the hash code
        raw = self._hash_codes[k]          # Use the stored hash code
        return (raw * self._scale + self._shift) % self._prime % len(self._table)

    def __getitem__(self, k):
        """Return the value associated with key k (raise KeyError if not found)."""
        j = self._hash_function(k)
        return self._bucket_getitem(j, k)

    def __setitem__(self, k, v):
        """Assign value v to key k, overwriting existing value if present."""
        j = self._hash_function(k)
        self._bucket_setitem(j, k, v)
        if self._n > len(self._table) // 2:
            self._resize(2 * len(self._table) - 1)

    def __delitem__(self, k):
        """Remove the item associated with key k (raise KeyError if not found)."""
        j = self._hash_function(k)
        self._bucket_delitem(j, k)
        self._n -= 1

    def __iter__(self):
        """Generate an iterator of keys in the hash table."""
        for bucket in self._table:
            if bucket is not None:
                for key in bucket:
                    yield key

# With this implementation, the hash code for each key is computed only once and stored
#  in the _hash_codes dictionary. Subsequent operations that require the hash code can 
# retrieve it from this dictionary, avoiding the need to recompute it. This optimization can
#  reduce the overhead of computing hash codes, especially for lengthy keys.
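
One caveat with a side dictionary is that looking `k` up in it hashes `k` again, so the saving is limited. A sketch closer to what the exercise describes stores the hash code on the item itself and reuses it when resizing. Everything below is an assumption made for illustration, not the book's class: the names `_Item`, `CachedHashMap` and `CountingKey`, the use of separate chaining, and plain `%` compression in place of MAD.

```python
class _Item:
    """Key-value composite that remembers the key's hash code."""
    __slots__ = ("key", "value", "hash_code")

    def __init__(self, key, value, hash_code):
        self.key = key
        self.value = value
        self.hash_code = hash_code


class CachedHashMap:
    """Separate-chaining map that never rehashes a stored key on resize."""

    def __init__(self, cap=11):
        self._table = [[] for _ in range(cap)]
        self._n = 0

    def __len__(self):
        return self._n

    def _find(self, key, hash_code):
        bucket = self._table[hash_code % len(self._table)]
        for item in bucket:
            # compare the cheap stored hash codes before the keys themselves
            if item.hash_code == hash_code and item.key == key:
                return bucket, item
        return bucket, None

    def __setitem__(self, key, value):
        hash_code = hash(key)              # computed once per operation
        bucket, item = self._find(key, hash_code)
        if item is not None:
            item.value = value
            return
        bucket.append(_Item(key, value, hash_code))
        self._n += 1
        if self._n > len(self._table) // 2:
            self._resize(2 * len(self._table) - 1)

    def __getitem__(self, key):
        _, item = self._find(key, hash(key))
        if item is None:
            raise KeyError(key)
        return item.value

    def _resize(self, cap):
        old_items = [item for bucket in self._table for item in bucket]
        self._table = [[] for _ in range(cap)]
        for item in old_items:
            # reuse the stored hash code; hash(item.key) is not called again
            self._table[item.hash_code % cap].append(item)


class CountingKey:
    """Hypothetical key type that counts how often it is hashed."""
    calls = 0

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        CountingKey.calls += 1
        return hash(self.name)

    def __eq__(self, other):
        return isinstance(other, CountingKey) and self.name == other.name


table = CachedHashMap(cap=3)
keys = [CountingKey(f"key{i}") for i in range(20)]
for i, key in enumerate(keys):
    table[key] = i

calls_after_inserts = CountingKey.calls
print(len(table))             # 20
print(calls_after_inserts)    # 20: one hash per insertion, none during resizes
print(table[keys[7]])         # 7
```

The table grows several times while the 20 keys are inserted, yet `__hash__` runs only once per insertion, which is the trade-off the exercise points at: a little extra memory per item in exchange for cheaper resizes.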

In [50]:
"""
Given an unsorted array of integers nums, return the length of the 
longest consecutive elements sequence.

You must write an algorithm that runs in O(n) time.

Example 1:

    Input: nums = [100,4,200,1,3,2]
    Output: 4
    Explanation: The longest consecutive elements sequence is [1, 2, 3, 4].

         Therefore its length is 4.
     
Example 2:

    Input: nums = [0,3,7,2,5,8,4,6,0,1]
    Output: 9
 

Constraints:

    0 <= nums.length <= 10^5
    -10^9 <= nums[i] <= 10^9

Takeaway:

    To get rid of repeated values, you can use a set.

    Simply defining the question in ENGLISH is the first step.

    you can override a value in an iteration with 

        current = max(current, new)

"""

class Solution:
    def longestConsecutive(self, nums: list[int]) -> int:
        # O(n) means no sorting
        
        # we are interested in unique elements
        nums_set = set(nums)

        # we are only interested in longest streak
        longest_streak = 0
        
        for num in nums_set:
            if num - 1 not in nums_set:
                # a new streak starts
                current_num = num
                current_streak = 1
                
                while current_num + 1 in nums_set:
                    # streak continues
                    current_num += 1
                    current_streak +=1

                # update the longest streak         
                longest_streak = max(current_streak, 
                                        longest_streak)
                
        return longest_streak

sol = Solution()

print(sol.longestConsecutive(nums = [100,4,200,1,3,2])) # 4
print(sol.longestConsecutive(nums = [0,3,7,2,5,8,4,6,0,1])) # 9

4
9


In [51]:
"""
Write an algorithm to determine if a 
number n is happy.

A happy number is a number defined by 
the following process:

    Starting with any positive integer, replace 
        the number by the sum of the squares 
        of its digits.
    
    Repeat the process until the number equals 
        1 (where it will stay), or it loops endlessly 
        in a cycle which does not include 1.

    Those numbers for which this process ends 
        in 1 are happy.

Return true if n is a happy number, and false if not.

Example 1:

    Input: n = 19
    
    Output: True
    
    Explanation:
        
        1^2 + 9^2 = 82
        8^2 + 2^2 = 68
        6^2 + 8^2 = 100
        1^2 + 0^2 + 0^2 = 1

Example 2:

    Input: n = 2
    
    Output: False

Constraints:

    1 <= n <= 2^31 - 1

Takeaway:

    To detect a cycle, set is wonderful!
"""

class Solution:
    def isHappy_(self, n: int) -> bool:
        # this cannot escape the infinite loop
        # does NOT Work

        # get all digits of n
        # sum of squares
        # update n

        def get_all_digits(number):
            digits = []
            
            while number:
                remainder = number % 10
                digits.append(remainder)
                number //= 10
            
            return digits 

        temp = n

        while n != 1:
            dig = get_all_digits(n)
            n = sum((elem * elem for elem in dig))
                
        return True

    def isHappy(self, n: int) -> bool:
        seen = set()
        while True:
            if n == 1:
                # end of loop!
                return True
            elif n in seen:
                # we knew this from before
                return False
            else:
                # add the current number to set
                seen.add(n)
                # use strings to access digits simply
                n = sum(int(i)**2 for i in str(n))

sol = Solution()
print(sol.isHappy(19))
print(sol.isHappy(2))

True
False


In [7]:
"""
You are given two strings, order and s. 

All the characters of order are unique and were sorted 
in some custom order previously.

Permute the characters of s so that they match the 
order that order was sorted. 

More specifically, if a character x occurs before a 
character y in order, then x should occur before y in 
the permuted string.

Return any permutation of s that satisfies this property.

Example 1:

    Input:  order = "cba", s = "abcd" 

    Output:  "cbad" 

    Explanation: 
        
        "a", "b", "c" appear in order, so the order 
            of "a", "b", "c" should be "c", "b", and "a".

        Since "d" does not appear in order, it can be at 
            any position in the returned string. 
            
        "dcba", "cdba", "cbda" are also valid outputs.

Example 2:

    Input:  order = "bcafg", s = "abcd" 

    Output:  "bcad" 

    Explanation: 
        
        The characters "b", "c", and "a" from order dictate 
            the order for the characters in s. 
            
        The character "d" in s does not appear in 
            order, so its position is flexible.

        Following the order of appearance in order, "b", 
            "c", and "a" from s should be arranged as 
            "b", "c", "a". "d" can be placed at any position 
            since it's not in order. 
            
        The output "bcad" correctly follows this rule. 
        
        Other arrangements like "bacd" or "bcda" would also 
            be valid, as long as "b", "c", "a" maintain their order.

Constraints:

    1 <= order.length <= 26
    
    1 <= s.length <= 200
    
    order and s consist of lowercase English letters.
    
    All the characters of order are unique.

Takeaway:

    We can sort based on our own dicts!

"""

class Solution:
    def customSortString(self, order: str, s: str) -> str:
        # order was sorted in a custom order
        # so order should give us hints on positioning
        
        # make a priority map
        order_map = {}
        for i, char in enumerate(order):
            order_map[char] = i

        # now we have an ordering map to sort with!

        # sort s based on the custom order;
        # characters missing from order sort last
        sorted_s = sorted(s, key = lambda x: order_map.get(x, float("inf")))

        # join the sorted characters back into a string
        return ''.join(sorted_s)
    

sol = Solution()
print(sol.customSortString(order = "bcafg", s = "abcd"))

bcad


## `Counter` Class of Python

`Counter` is basically a dictionary, but focused on keeping the frequency of things.

It has cool methods too:

In [5]:
import collections

ct = collections.Counter('abracadabra')
print(ct) 

# Counter({'a': 5, 'b': 2, 'r': 2, 'c': 1, 'd': 1})

Counter({'a': 5, 'b': 2, 'r': 2, 'c': 1, 'd': 1})


In [6]:
# what are the 3 most common keys we have?
print(ct.most_common(3)) # [('a', 5), ('b', 2), ('r', 2)]

# how can we access the keys only?
print([elem[0] for elem in ct.most_common(3)])

[('a', 5), ('b', 2), ('r', 2)]
['a', 'b', 'r']


In [7]:
# we can just update the existing counter
ct.update('aaaaazzz') 
# add counts instead of replacing them
print(ct) 
# Counter({'a': 10, 'z': 3, 'b': 2, 'r': 2, 'c': 1, 'd': 1})

Counter({'a': 10, 'z': 3, 'b': 2, 'r': 2, 'c': 1, 'd': 1})


In [8]:
# and subtract too
ct.subtract("bb") 
# subtracts counts instead of replacing them.

print(ct.most_common(3)) 
# [('a', 10), ('z', 3), ('r', 2)]

[('a', 10), ('z', 3), ('r', 2)]


In [9]:
# how many things do we have in total?
import sys

# Counter.total() was added in Python 3.10
if sys.version_info >= (3, 10):
    print("Sum of total counts: ", ct.total()) 
    # sum of total counts # 17
else:
    print("You need Python 3.10 or newer.")

You need Python 3.10 or newer.


### The `Counter` is _displayed_ in [`most_common`](https://docs.python.org/3/library/collections.html#collections.Counter.most_common) order, whose current documentation says:

>  **Elements with equal counts are ordered in the order first encountered**

The implementation uses `sorted`, which is:

> ...guaranteed to be stable. A sort is stable if it guarantees not to change the relative order of elements that compare equal...

hence retaining the insertion order of elements with equal counts.
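
A small illustration of that tie-breaking rule, assuming Python 3.7 or newer (where dictionaries keep insertion order). The fruit names are made up for the example.

```python
from collections import Counter

votes = Counter(["pear", "apple", "fig", "apple", "pear", "kiwi"])

# "pear" and "apple" both have count 2; "pear" was encountered first,
# and the same holds for "fig" before "kiwi" at count 1
print(votes.most_common())
# [('pear', 2), ('apple', 2), ('fig', 1), ('kiwi', 1)]
```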

## `OrderedDict` of Python

OrderedDict is a dictionary subclass in Python's collections module that maintains the order of the keys as they are inserted. 

This means that when you iterate over an OrderedDict, the items will be returned in the order they were inserted.

In [3]:
from collections import OrderedDict

# Creating an ordered dictionary
od = OrderedDict()

# adding key value pairs
od['apple'] = 1
od['banana'] = 2
od['cherry'] = 3

# Iterate over the OrderedDict
for key, value in od.items():
    print(f"{key}: {value}")
# Output:
# apple: 1
# banana: 2
# cherry: 3

apple: 1
banana: 2
cherry: 3


In [59]:
# the classic OrderedDict example is an LRU cache

from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = OrderedDict()

    def get(self, key):
        if key not in self.cache:
            return -1
        # Move the accessed key to the end to mark it 
        # as most recently used
        self.cache.move_to_end(key)
        return self.cache[key]

    def put(self, key, value):
        if key in self.cache:
            # If key exists, update its value and 
            # move it to the end
            self.cache[key] = value
            self.cache.move_to_end(key)
        else:
            # If key doesn't exist, check if cache is full
            if len(self.cache) >= self.capacity:
                # If cache is full, remove the least 
                # recently used item (first item)
                self.cache.popitem(last=False)
            # Add the new key-value pair to the 
            # end of the OrderedDict
            self.cache[key] = value

# Example usage:
cache = LRUCache(2)
cache.put(1, 1)
cache.put(2, 2)
print(cache.get(1))  # Output: 1
cache.put(3, 3)       # Evicts key 2
print(cache.get(2))  # Output: -1 (not found)
cache.put(4, 4)       # Evicts key 1
print(cache.get(1))  # Output: -1 (not found)
print(cache.get(3))  # Output: 3
print(cache.get(4))  # Output: 4

1
-1
-1
3
4


In [60]:
# more Pythonic approach to LRU

class LRU(OrderedDict):
    'Limit size, evicting the least recently looked-up key when full'

    def __init__(self, maxsize=3, *args, **kwds):
        self.maxsize = maxsize
        super().__init__(*args, **kwds)

    def __getitem__(self, key):
        value = super().__getitem__(key)
        self.move_to_end(key)
        return value

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        if len(self) > self.maxsize:
            oldest = next(iter(self))
            del self[oldest]
        self.move_to_end(key)

my_cache = LRU()

my_cache[1] = 1
my_cache[2] = 2
my_cache[3] = 3

print(my_cache)

my_cache[4] = 4

# 1 is dropped 
print(my_cache)

LRU([(1, 1), (2, 2), (3, 3)])
LRU([(2, 2), (3, 3), (4, 4)])


Serialization and deserialization: When you serialize an OrderedDict to JSON with `json.dumps`, the order of the items is preserved; for other formats, such as YAML, whether the order is kept depends on the library used.

This can be useful when you need to maintain the order of insertion, such as when you want to track the order of items in a configuration file or when processing data that depends on the order of keys.
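
A round trip through the standard `json` module shows the order surviving in both directions. This is a minimal sketch; the configuration keys are hypothetical.

```python
import json
from collections import OrderedDict

config = OrderedDict()
config["host"] = "localhost"
config["port"] = 8080
config["debug"] = True

text = json.dumps(config)
print(text)  # {"host": "localhost", "port": 8080, "debug": true}

# object_pairs_hook rebuilds an OrderedDict in document order
restored = json.loads(text, object_pairs_hook=OrderedDict)
print(list(restored))       # ['host', 'port', 'debug']
print(restored == config)   # True
```

Comparing two `OrderedDict` objects is order-sensitive, so the final `True` also confirms that the key order came back unchanged.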

## `defaultdict` of Python

In Python, `defaultdict` is a class available in the collections module that provides a way to make dictionaries where **missing keys automatically have a default value** assigned. 

This default value is produced by a factory callable passed to the constructor, such as `int` or `list`, so it can be of any data type.

The main advantage of using defaultdict is that it **avoids key errors** when trying to access keys that are not present in the dictionary. 

Instead of raising a `KeyError`, defaultdict will create the key and initialize it with the value produced by the factory that was given when the defaultdict object was created.

In [1]:
from collections import defaultdict

# Create a defaultdict whose default_factory is int; missing keys default to int(), which is 0
d = defaultdict(int)
d['a'] = 1
d['b'] = 2

print(d['a'])  # Output: 1
print(d['b'])  # Output: 2
print(d['c'])  # Output: 0 (default value for int)

1
2
0
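
`int` is only one possible factory. The sketch below shows a few other zero-argument callables used the same way. The names `groups`, `unique` and `labels` are illustrative only.

```python
from collections import defaultdict

# list factory: missing keys start as an empty list
groups = defaultdict(list)
groups["fruits"].append("apple")

# set factory: missing keys start as an empty set
unique = defaultdict(set)
unique["tags"].add("python")
unique["tags"].add("python")   # duplicates are ignored by the set

# any zero-argument callable works, including a lambda
labels = defaultdict(lambda: "unknown")
missing_label = labels["missing"]

print(groups["fruits"])   # ['apple']
print(unique["tags"])     # {'python'}
print(missing_label)      # unknown
```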


In [7]:
"""
Given an array of strings strs, group the anagrams together. 

You can return the answer in any order.

An Anagram is a word or phrase formed by rearranging the letters
of a different word or phrase, typically using all the original 
letters exactly once.

Example 1:

    Input: strs = ["eat","tea","tan","ate","nat","bat"]
    Output: [["bat"],["nat","tan"],["ate","eat","tea"]]

Example 2:

    Input: strs = [""]
    Output: [[""]]

Example 3:

    Input: strs = ["a"]
    Output: [["a"]]

Constraints:

    1 <= strs.length <= 10^4
    0 <= strs[i].length <= 100
    strs[i] consists of lowercase English letters.


Takeaway:

    We can use a dictionary to map a canonical form of each word
    to the list of words that share it.
    
    Sorting the characters of a word gives the same key for every anagram.
    
    If a key has not been seen before, defaultdict(list) initializes an
    empty list for it, so we can append right away.

"""

from collections import defaultdict

def groupAnagrams(strs: list[str]) -> list[list[str]]:
    
    # every missing key starts with an empty list as its default
    res = defaultdict(list)

    for s in strs:
        # sorted() returns a list, which is unhashable,
        # so convert it to a tuple to use it as a dict key
        res[tuple(sorted(s))].append(s)

    return list(res.values()) 
    
print(groupAnagrams(["eat","tea","tan","ate","nat","bat"]))

[['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]
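
Sorting each word costs O(k log k) for a word of length k. An alternative, sketched here and not taken from the notebook, is to use a tuple of 26 letter counts as the key. It assumes the stated constraint that words contain only lowercase English letters, and the function name `group_anagrams_by_count` is hypothetical.

```python
from collections import defaultdict

def group_anagrams_by_count(strs: list[str]) -> list[list[str]]:
    groups = defaultdict(list)

    for s in strs:
        # Count how often each of the 26 lowercase letters occurs.
        counts = [0] * 26
        for ch in s:
            counts[ord(ch) - ord("a")] += 1
        # Anagrams share the same counts; a tuple is hashable, a list is not.
        groups[tuple(counts)].append(s)

    return list(groups.values())

result = group_anagrams_by_count(["eat", "tea", "tan", "ate", "nat", "bat"])
print(result)  # [['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]
```

Each word is then processed in O(k) time, at the cost of a longer key.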


In [8]:
s = 'mississippi'
d = defaultdict(int)
for k in s:
    d[k] += 1

sorted(d.items())

# When a letter is first encountered, it is missing from 
# the mapping, so the default_factory function calls 
# int() to supply a default count of zero. 

# The increment operation then builds up the count for each letter.

[('i', 4), ('m', 1), ('p', 2), ('s', 4)]
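
For comparison, here is a sketch of the same count with a plain `dict`, where the zero default has to be supplied by hand on every update. The variable names are illustrative.

```python
s = "mississippi"

counts = {}
for ch in s:
    # .get() supplies the 0 that defaultdict(int) would create for us.
    counts[ch] = counts.get(ch, 0) + 1

print(sorted(counts.items()))  # [('i', 4), ('m', 1), ('p', 2), ('s', 4)]
```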