# Algorithms by Yandex

[YouTube playlist](https://www.youtube.com/playlist?list=PL6Wui14DvQPySdPv5NUqV3i8sDbHkCKC5)

## Lesson 4. Dictionaries, Counting sort

### Counting sort

What is counting sort:
- Suppose we need to sort an array of N integers, each ranging from 0 to K.
- A general-purpose comparison sort takes O(N log N) time.
- Instead, we count the occurrences of each number and then output each number as many times as it appeared. This takes O(N + K) time and O(K) additional memory.
- The range of values can be shifted so that it is not from 0 to K, but from the minimum to the maximum value in the array.

*The function performs counting sort on the input sequence seq. It first determines the range of the values in the sequence and creates an array count for counting their occurrences. Then it loops through the sequence and updates the count array. Finally, it loops through the count array and updates the sequence with the sorted values. The function returns the sorted sequence. The example usage shows how to call the function with an example sequence.*

```python
def countsort(seq):
    # An empty sequence is already sorted; min() and max() would raise on it.
    if not seq:
        return seq
    # Get the minimum and maximum values in the sequence.
    minval = min(seq)
    maxval = max(seq)
    # Calculate the size of the value range and create an array for counting occurrences.
    k = maxval - minval + 1
    count = [0] * k
    # Count the occurrences of each value in the sequence.
    for now in seq:
        count[now - minval] += 1
    # Overwrite the sequence in place with the sorted values based on the count array.
    nowpos = 0
    for val in range(k):
        for _ in range(count[val]):
            seq[nowpos] = val + minval
            nowpos += 1
    # Return the (now sorted) input sequence.
    return seq


# Example usage
countsort([5, 3, 4, 5, 1, 5])
```

```
[1, 3, 4, 5, 5, 5]
```

#### Task 1.

Given two numbers X and Y without leading zeroes, check whether the first number can be obtained from the second by permuting its digits.

*The function checks whether the number X can be obtained from Y by permuting its digits. It first defines a helper function countdigits that counts the occurrences of each digit in a number. The function uses this helper function to count the occurrences of digits in both X and Y. Then, it compares the counts of each digit between X and Y. If they differ for any digit, it returns False, indicating that X cannot be obtained from Y by permuting digits. Otherwise, it returns True. The example usage shows how to call the function with an example pair of numbers.*

```python
def isdigitpermutation(x, y):
    # Helper function to count the occurrences of each digit in a number.
    def countdigits(num):
        digitcount = [0] * 10
        while num > 0:
            lastdigit = num % 10
            digitcount[lastdigit] += 1
            num //= 10
        return digitcount

    # Count the occurrences of each digit in X and Y.
    digitsx = countdigits(x)
    digitsy = countdigits(y)
    # Compare the digit counts to check if X can be obtained from Y by permuting digits.
    for digit in range(10):
        if digitsx[digit] != digitsy[digit]:
            return False
    return True


# Example usage
isdigitpermutation(2021, 1202)
```

```
True
```
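The same check can also be sketched on the decimal strings of the numbers. This is an alternative formulation, not the lesson's solution: the name `is_digit_permutation_str` is hypothetical, and the sketch assumes both arguments are non-negative integers.

```python
from collections import Counter


def is_digit_permutation_str(x, y):
    # Hypothetical variant: compare the multisets of decimal digits.
    # Assumes x and y are non-negative integers.
    return Counter(str(x)) == Counter(str(y))


print(is_digit_permutation_str(2021, 1202))
print(is_digit_permutation_str(2021, 1203))
```

Two `Counter` objects compare equal exactly when every digit occurs the same number of times in both, which is the same condition the digit-array loop checks.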

#### Summary

Counting sort is a good choice when the input is a sequence of integers with a small range of values. Its time complexity is O(n + k), where n is the length of the input sequence and k is the size of the value range. When k is small relative to n, it can be faster than comparison-based sorting algorithms such as quicksort or mergesort, which take O(n log n) time.

Counting sort is particularly useful when the range of values is known and relatively small. For example, if you need to sort a list of test scores between 0 and 100, counting sort can be a very efficient sorting algorithm.
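A minimal sketch of the test-score case, assuming integer scores in a fixed range from 0 to `max_score`. The function name `sort_scores` and its `max_score` parameter are illustrative and do not come from the lesson.

```python
def sort_scores(scores, max_score=100):
    # One counter per possible score: 0, 1, ..., max_score.
    counts = [0] * (max_score + 1)
    for score in scores:
        # Reject values outside the assumed range; a negative index
        # would otherwise silently count from the end of the list.
        if not 0 <= score <= max_score:
            raise ValueError(f"score out of range: {score}")
        counts[score] += 1
    # Emit each score as many times as it was seen, in increasing order.
    result = []
    for value, times in enumerate(counts):
        result.extend([value] * times)
    return result


print(sort_scores([88, 100, 0, 73, 88]))
```

Because the range is known in advance, there is no need to scan for the minimum and maximum first, and the counting array always has 101 entries regardless of how many scores are sorted.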

However, counting sort needs O(k) extra memory for the counting array, which makes it impractical when the range of values is large. In such cases it is usually better to use a comparison-based algorithm such as quicksort or mergesort, even though its O(n log n) time complexity is higher.

### Dictionaries

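- A dictionary is like a set, but each key is associated with a value.
- A dictionary is looked up by key; finding a key by its value requires scanning all entries.
- Dictionary operations take O(1) time on average, but with a noticeably larger constant than array indexing, so when the keys are small integers it is better to count in an array, as counting sort does.
- When the data is sparse (few distinct values spread over a wide range), an array of counters wastes memory, so counting sort is not a reasonable choice and a dictionary fits better.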
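For sparse data, the counting idea can be kept while replacing the array with a dictionary. The sketch below is an illustration under that assumption; `sort_sparse` is a hypothetical name. With U distinct values it uses O(U) memory and O(N + U log U) time, because the distinct keys still have to be sorted.

```python
def sort_sparse(seq):
    # Count occurrences in a dictionary, so memory depends on the number
    # of distinct values rather than on the width of the value range.
    counts = {}
    for value in seq:
        counts[value] = counts.get(value, 0) + 1
    # Only the distinct keys are sorted; each is then repeated by its count.
    result = []
    for value in sorted(counts):
        result.extend([value] * counts[value])
    return result


print(sort_sparse([10**9, 3, 10**9, 0]))
```

An array-based counting sort on the same input would allocate about a billion counters to sort four numbers.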

#### Task 2.

On an N x N chessboard, there are M rooks (a rook attacks squares on the same row or column until the first occupied square).  
Determine how many pairs of rooks attack each other.  
Each rook is given by a pair of numbers I and J, the coordinates of its square.

1 <= N <= 10^9  
0 <= M <= 2*10^5

```python
def countbeatingrooks(rookcoords):
    # function to add a rook to a row or column
    def addrook(roworcol, key):
        if key not in roworcol:
            roworcol[key] = 0
        roworcol[key] += 1

    # function to count pairs of rooks in a row or column
    def countpairs(roworcol):
        pairs = 0
        for key in roworcol:
            # k rooks on one line form k - 1 attacking pairs, because each
            # rook only attacks its nearest neighbour along that line
            pairs += roworcol[key] - 1
        return pairs

    # initialize dictionaries to count rooks in each row and column
    rooksinrow = {}
    rooksincol = {}

    # count rooks in each row and column
    for row, col in rookcoords:
        addrook(rooksinrow, row)
        addrook(rooksincol, col)

    # count pairs of beating rooks in each row and column
    return countpairs(rooksinrow) + countpairs(rooksincol)


# Example usage
rookcoords = [(1, 1), (2, 2), (3, 1), (1, 3), (5, 5)]
print(countbeatingrooks(rookcoords))
```

```
2
```


The function works as follows:

1. It defines two helper functions, addrook and countpairs. addrook takes a dictionary roworcol and a key, and sets the value for that key to 1 if the key is not yet in the dictionary, or increments the value by 1 if it is. countpairs takes a dictionary roworcol and adds up, over all its keys, the number of rooks on that line minus one: since a rook attacks only up to the first occupied square, k rooks on one line form k - 1 attacking pairs.
2. The function initializes two dictionaries rooksinrow and rooksincol to keep track of the rooks in each row and column, respectively.
3. For each rook in the rookcoords list, the function adds the rook to the rooksinrow and rooksincol dictionaries using the addrook helper function.
4. The function then calculates the number of pairs of rooks in the same row or column by calling the countpairs function on rooksinrow and rooksincol, and returns the sum of the two counts.
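The "count minus one" rule can be cross-checked on small inputs with a direct simulation of the attack rule. The sketch below is not part of the lesson: the name `count_attacking_pairs_bruteforce` is hypothetical, and the triple loop is far too slow for M up to 2*10^5, so it is only meant for testing.

```python
from itertools import combinations


def count_attacking_pairs_bruteforce(rook_coords):
    # Slow cross-check: examine every pair of rooks directly.
    # Assumes no two rooks share a square (duplicates are dropped).
    rooks = sorted(set(rook_coords))
    pairs = 0
    for (r1, c1), (r2, c2) in combinations(rooks, 2):
        if r1 == r2:
            # Same row: the pair is blocked if a rook stands strictly between.
            lo, hi = min(c1, c2), max(c1, c2)
            blocked = any(r == r1 and lo < c < hi for r, c in rooks)
        elif c1 == c2:
            # Same column: same check along the rows.
            lo, hi = min(r1, r2), max(r1, r2)
            blocked = any(c == c1 and lo < r < hi for r, c in rooks)
        else:
            continue
        if not blocked:
            pairs += 1
    return pairs


print(count_attacking_pairs_bruteforce([(1, 1), (2, 2), (3, 1), (1, 3), (5, 5)]))
print(count_attacking_pairs_bruteforce([(1, 1), (1, 2), (1, 3)]))
```

Three rooks in one row give two attacking pairs here, not three, because the middle rook shields the outer two from each other.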

#### Task 3. 

Given a string S.  
Print a histogram of character frequencies as in the example, with the columns ordered by character code.

```python
def printchar(s):
    # dictionary to store symbol counts
    symcount = {}
    # variable to store maximum count of a single symbol
    maxsymcount = 0
    # iterate over each symbol in the string
    for sym in s:
        # if symbol is not in symcount, add it with count 0
        if sym not in symcount:
            symcount[sym] = 0
        # increment the count of the current symbol
        symcount[sym] += 1
        # update maxsymcount if the current count is greater than the current max
        maxsymcount = max(maxsymcount, symcount[sym])
    # get a sorted list of unique symbols
    sorteduniqsyms = sorted(symcount.keys())
    # iterate over each row of the histogram
    for row in range(maxsymcount, 0, -1):
        # iterate over each symbol and print a # if its count is greater than or equal to the current row
        for sym in sorteduniqsyms:
            if symcount[sym] >= row:
                print('#', end='')
            else:
                print(' ', end='')
        # print a newline to start a new row
        print()
    # print the sorted list of unique symbols at the bottom of the histogram
    print(''.join(sorteduniqsyms))


# Example usage
printchar('Hello, world!')
```

```
      #   
      ##  
##########
 !,Hdelorw
```


Here is a breakdown of how the function works:

1. `symcount = {}`: Create an empty dictionary to store the count of each character in the string.
2. `maxsymcount = 0`: Initialize a variable to keep track of the maximum count of any character.
3. `for sym in s:`: Loop over each character in the string.
4. `if sym not in symcount:`: If the character is not already in the dictionary, add it and set the count to zero.
5. `symcount[sym] += 1`: Increment the count of the character in the dictionary.
6. `maxsymcount = max(maxsymcount, symcount[sym])`: Update the maximum count if necessary.
7. `sorteduniqsyms = sorted(symcount.keys())`: Get a sorted list of the unique characters in the string.
8. `for row in range(maxsymcount, 0, -1):`: Loop over the rows of the histogram, starting from the maximum count and going down to 1.
9. `for sym in sorteduniqsyms:`: Loop over the characters in the sorted list.
10. `if symcount[sym] >= row:`: If the count of the character is greater than or equal to the current row, print a `#` symbol.
11. `else:`: Otherwise, print a space.
12. `print()`: Print a newline character to move to the next row of the histogram.
13. `print(''.join(sorteduniqsyms))`: Print the sorted list of characters at the bottom of the histogram.


### Optimization

Criteria for algorithm quality:
- Time to run
- Memory usage
- Time to code it
- Maintainability
- Parallelization feasibility
- Required employee expertise for maintenance
- Cost of equipment

#### Task 4.

Group the words that are made up of the same letters.  

Sample input: ['eat', 'tea', 'tan', 'ate', 'nat', 'bat']  
Sample output: [['ate', 'eat', 'tea'], ['nat', 'tan'], ['bat']]

```python
def groupwords(words):
    # create an empty dictionary to store the groups
    groups = {}

    # for each word in the list of words
    for word in words:
        # sort the letters of the word and join them back into a string
        sortedword = ''.join(sorted(word))

        # if the sorted word is not in the groups dictionary, add it with an empty list
        if sortedword not in groups:
            groups[sortedword] = []

        # add the original word to the list of words for the corresponding sorted word
        groups[sortedword].append(word)

    # create a list to store the final answer
    ans = []

    # for each sorted word in the groups dictionary, append the list of words to the answer list
    for sortedword in groups:
        ans.append(groups[sortedword])

    # return the answer list
    return ans


# Example usage
groupwords(['eat', 'tea', 'tan', 'ate', 'nat', 'bat'])
```

```
[['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]
```
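One possible optimization, in the spirit of counting sort, is to replace the sorted-letters key with a tuple of letter counts, which costs O(L) per word of length L instead of O(L log L). The sketch below assumes the words contain only lowercase letters a to z; the name `group_words_by_letter_counts` is illustrative and not from the lesson.

```python
def group_words_by_letter_counts(words):
    groups = {}
    for word in words:
        # Assumption: only lowercase 'a'..'z' occur, so 26 counters suffice.
        counts = [0] * 26
        for ch in word:
            counts[ord(ch) - ord('a')] += 1
        # A tuple is hashable, so the counts can serve as the dictionary key.
        groups.setdefault(tuple(counts), []).append(word)
    # Dictionaries preserve insertion order (Python 3.7+), so groups come
    # out in the order in which their first word appeared.
    return list(groups.values())


print(group_words_by_letter_counts(['eat', 'tea', 'tan', 'ate', 'nat', 'bat']))
```

For short words the difference is negligible, and the 26-element key is larger than the sorted string, so this trades memory per key for less work on long words.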