# Algorithms by Yandex

[YouTube playlist](https://www.youtube.com/playlist?list=PL6Wui14DvQPySdPv5NUqV3i8sDbHkCKC5)

## Lesson 2. Linear search

Linear search is a simple search algorithm used to find a particular item in a collection of items. It works by checking each element in the collection one by one until the target item is found or all elements have been checked.

Here's how the algorithm works:

1. Start from the first element in the collection.
1. Compare the current element with the target item.
1. If the current element is equal to the target item, return its position.
1. If the current element is not equal to the target item, move on to the next element and repeat steps 2 and 3.
1. If all elements have been checked and the target item has not been found, return -1 to indicate that the item was not found.

In [1]:
def linear_search(arr, target):
    for i in range(len(arr)):
        if arr[i] == target:
            return i
    return -1

The time complexity of linear search is O(n) in the worst case, where n is the number of elements in the collection: the running time grows in proportion to the size of the collection. This is acceptable for small collections, and linear search works on unsorted data. On sorted data, however, it is less efficient than binary search, which has a time complexity of O(log n).
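
To make the gap between O(n) and O(log n) more tangible, the sketch below counts how many elements each approach inspects on a sorted list. The helper names `linear_steps` and `binary_steps` are made up for this illustration, and the binary search shown is a plain textbook version that only works because the list is sorted.

```python
def linear_steps(arr, target):
    """Return how many elements linear search inspects before it stops."""
    steps = 0
    for value in arr:
        steps += 1
        if value == target:
            break
    return steps


def binary_steps(arr, target):
    """Return how many probes a textbook binary search makes (arr must be sorted)."""
    lo, hi = 0, len(arr) - 1
    steps = 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if arr[mid] == target:
            break
        if arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps


data = list(range(1_000_000))
print(linear_steps(data, 999_999))  # the target is last, so every element is inspected
print(binary_steps(data, 999_999))  # roughly log2(n) probes
```

For a target at the very end of the list, linear search inspects all one million elements, while binary search needs at most about twenty probes.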

Linear search summary:
- Linear search is a search method that checks the elements one by one.
- The complexity of linear search is linear, i.e. O(n).
- It is usually used to find either a suitable element (one that matches a condition) or the most suitable element (for example, the maximum).

#### Task 1. The first leftmost occurrence

Given a sequence of numbers of length N, find the first (leftmost) occurrence of the positive number X in it or output -1 if the number X has not been encountered.

*The function findx takes two arguments: seq and x. The seq is a list of numbers and x is the number we want to search for in the list.*

In [4]:
def findx(seq, x):
    # Initialize the variable "ans" with -1, which represents the position of the first occurrence of "x"
    ans = -1

    # Iterate over the elements in the list "seq"
    for i in range(len(seq)):
        # Check if the current element in "seq" is equal to "x" and "ans" is still -1
        if ans == -1 and seq[i] == x:
            # If both conditions are true, update the value of "ans" to the current index "i"
            ans = i

    # Return the value of "ans" as the result, which represents the position of the first occurrence of "x"
    return ans

# Call the function "findx" with a list of numbers and a number to search for
findx([1, 2, 3, 2, 1], 2)

1
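
The loop above keeps scanning even after the first match has been found. A possible variant, sketched here under the assumption that returning from the function early is acceptable, stops at the first match. The name `findx_early` is hypothetical and not part of the course code.

```python
def findx_early(seq, x):
    # enumerate yields (index, value) pairs, so no manual indexing is needed
    for i, value in enumerate(seq):
        if value == x:
            # The first match is the leftmost one, so we can stop right away
            return i
    # The loop finished without a match
    return -1


print(findx_early([1, 2, 3, 2, 1], 2))  # 1
```

The worst case is still O(n), but the function no longer visits elements after the first occurrence.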

#### Task 2. The last rightmost occurrence

Given a sequence of numbers of length N, find the last (rightmost) occurrence of the positive number X in it or output -1 if the number X has not been encountered.

In [7]:
def findrightx(seq, x):
    # Initialize the variable "ans" with -1, which represents the position of the last occurrence of "x"
    ans = -1

    # Iterate over the elements in the list "seq"
    for i in range(len(seq)):
        # Check if the current element in "seq" is equal to "x"
        if seq[i] == x:
            # If the condition is true, update the value of "ans" to the current index "i"
            ans = i

    # Return the value of "ans" as the result, which represents the position of the last occurrence of "x"
    return ans

# Call the function "findrightx" with a list of numbers and a number to search for
findrightx([1, 2, 3, 2, 1], 2)

3
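
An alternative sketch for the rightmost occurrence scans the sequence from the end and returns as soon as a match is found. The name `findrightx_reverse` is hypothetical; the result should match `findrightx` above.

```python
def findrightx_reverse(seq, x):
    # Walk the indices from the last one down to 0
    for i in range(len(seq) - 1, -1, -1):
        if seq[i] == x:
            # The first match seen from the right is the rightmost occurrence
            return i
    # No element is equal to x
    return -1


print(findrightx_reverse([1, 2, 3, 2, 1], 2))  # 3
```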

#### Task 3. Find MAX value

Given a sequence of numbers with length N (N>0), find the maximum number in the sequence.

This version of the solution is fast, but it is not optimized for memory consumption.

In [13]:
def findmax(seq):
    # Initialize the variable "ans" with the first element of the list "seq"
    ans = seq[0]

    # Iterate over the elements in the list "seq" starting from the second element
    for i in range(1, len(seq)):
        # Check if the current element in "seq" is greater than the current value of "ans"
        if seq[i] > ans:
            # If the condition is true, update the value of "ans" to the current element
            ans = seq[i]

    # Return the value of "ans" as the result, which represents the maximum number in the list "seq"
    return ans

# Call the function "findmax" with a list of numbers
findmax([1, 2, 3, 1, 2])

3

A better approach in terms of memory consumption is to store not the value itself, but the index of the element with this value. Let's have a look at how to do it:

In [14]:
def findmax(seq):
    # Initialize the variable "ans" with the index 0
    ans = 0

    # Iterate over the elements in the list "seq" starting from the second element
    for i in range(1, len(seq)):
        # Check if the current element in "seq" is greater than the element at the index "ans"
        if seq[i] > seq[ans]:
            # If the condition is true, update the value of "ans" to the current index "i"
            ans = i

    # Return the element at the index "ans" as the result, which represents the maximum number in the list "seq"
    return seq[ans]

# Call the function "findmax" with a list of numbers
findmax([1, 2, 3, 1, 2])

3

Lexicographic order is close to alphabetical (dictionary) order: two sequences are compared element by element, and the first position where they differ decides the result. One difference is that a plain lexicographic comparison of strings treats uppercase and lowercase letters as different characters, while dictionary ordering usually does not. Lexicographic order is often used in computer science to compare and sort strings, lists, and arrays, so the `>` comparison in the search above works for such elements as well as for numbers.
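
In Python, the built-in comparison operators compare strings and lists lexicographically, and string comparison is case-sensitive because it works on character code points. The short sketch below illustrates this; `findmax_index` is a hypothetical helper that applies the index-based search from Task 3 to a list of strings.

```python
# Strings are compared character by character; the first difference decides
assert "apple" < "banana"
assert "abc" < "abd"
# A string that is a prefix of another one is considered smaller
assert "ab" < "abc"
# Uppercase letters have smaller code points than lowercase ones
assert "Zebra" < "apple"
# Lists follow the same rule, element by element
assert [1, 2, 3] < [1, 3]


def findmax_index(seq):
    # Keep the index of the best element seen so far, not the element itself
    ans = 0
    for i in range(1, len(seq)):
        if seq[i] > seq[ans]:
            ans = i
    return ans


words = ["pear", "apple", "zoo", "Zebra"]
print(words[findmax_index(words)])  # zoo
```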

#### Task 4. Find MAX and penultimate MAX value

Given a sequence of numbers of length N (N > 1), find the largest number in the sequence and the second largest number (the one that would be the largest if one occurrence of the maximum were removed from the sequence).

In [17]:
def findmax2(seq):
    # initialize max1 with the maximum of first two elements
    max1 = max(seq[0], seq[1])
    # initialize max2 with the minimum of first two elements
    max2 = min(seq[0], seq[1])
    
    # loop through the remaining elements
    for i in range(2, len(seq)):
        # if the current element is greater than max1, update max2 to the previous max1 value, and max1 to the current element
        if seq[i] > max1:
            max2 = max1
            max1 = seq[i]
        # if the current element is greater than max2 but not greater than max1, update max2 to the current element
        elif seq[i] > max2:
            max2 = seq[i]
    
    # return max1 and max2 as a tuple
    return (max1, max2)

# example usage
findmax2([2, 1, 3, 2, 1])

(3, 2)
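
One way to gain confidence in a single-pass solution like this is to compare it with a slower but obviously correct reference on random inputs. The sketch below repeats `findmax2` so that it runs on its own; `findmax2_sorted` is a hypothetical reference helper based on sorting, and the value ranges are arbitrary.

```python
import random


def findmax2(seq):
    # Same single-pass logic as in the cell above
    max1 = max(seq[0], seq[1])
    max2 = min(seq[0], seq[1])
    for i in range(2, len(seq)):
        if seq[i] > max1:
            max2 = max1
            max1 = seq[i]
        elif seq[i] > max2:
            max2 = seq[i]
    return (max1, max2)


def findmax2_sorted(seq):
    # Reference answer: sort in descending order and take the first two values
    top = sorted(seq, reverse=True)
    return (top[0], top[1])


random.seed(0)
for _ in range(1000):
    seq = [random.randint(-5, 5) for _ in range(random.randint(2, 8))]
    assert findmax2(seq) == findmax2_sorted(seq)

# If the maximum occurs twice, it is also the second largest number
print(findmax2([5, 5, 1]))  # (5, 5)
```

The reference runs in O(n log n) because of the sort, so it is only meant for checking, not as a replacement for the O(n) solution.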

#### Task 5. Find MIN even number

Find the minimum even number in a sequence of numbers of length N, or output -1 if such a number does not exist.

In [21]:
def findmineven(seq):
    # Initialize the answer to -1, to indicate that no even number is found yet
    ans = -1
    
    # Loop through all the elements in the sequence
    for i in range(len(seq)):
        # Check if the current element is even and either ans is still -1 or the current element is smaller than the current value of ans
        if seq[i] % 2 == 0 and (ans == -1 or seq[i] < ans):
            ans = seq[i]
    
    # Return the answer
    return ans

# Example usage
findmineven([1, 3, 4, 5, 12])

4

The same solution, but using a boolean flag variable:  
*This code uses a flag boolean variable to keep track of whether or not an even number has been found in the sequence. The flag variable is initialized to False at the start of the function, indicating that no even number has been found yet. In each iteration of the loop, if the current number is even and either no even number has been found yet (indicated by not flag) or the current number is smaller than the current minimum even number (seq\[i\] < ans), then the minimum even number is updated to the current number and the flag is set to True to indicate that an even number has been found. If the loop finishes without finding an even number, then the function returns -1.*

In [22]:
def findmineven(seq):
    # Initialize the flag variable to False, indicating that we have not found an even number yet.
    flag = False
    # Default answer: it is returned unchanged if the sequence contains no even number.
    ans = -1
    # Iterate through the sequence of numbers.
    for i in range(len(seq)):
        # If the current number is even and (either the flag is False, meaning we haven't found an even number yet,
        # or the current number is smaller than the current minimum even number), then update the minimum even number.
        if seq[i] % 2 == 0 and (not flag or seq[i] < ans):
            ans = seq[i]
            # Set the flag to True to indicate that we have found an even number.
            flag = True
    # Return the minimum even number, or -1 if no even number was found.
    return ans

# Example usage
print(findmineven([1, 3, 4, 5, 12]))

4
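
Another option in Python is to use `None` as the "nothing found yet" marker instead of a separate flag. This is only a sketch of that idea, not the course's solution; the name `findmineven_none` is hypothetical, and the function still returns -1 when there is no even number, as the task requires.

```python
def findmineven_none(seq):
    # None means that no even number has been seen so far
    ans = None
    for value in seq:
        if value % 2 == 0 and (ans is None or value < ans):
            ans = value
    # Translate the internal marker into the -1 required by the task
    return -1 if ans is None else ans


print(findmineven_none([1, 3, 4, 5, 12]))  # 4
print(findmineven_none([1, 3, 5]))         # -1
```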


#### Task 6. Find the shortest words

Given a sequence of words, output all the shortest words separated by spaces.

The solution is in two passes:  
*The shortwords function takes a list of strings words as input. It first finds the minimum length of the words in the list by initializing minlen to the length of the first word in the list and then iterating through the rest of the words, updating minlen if a shorter word is found.*

*Next, the function creates a list ans of all words with the minimum length. This is done by iterating through the words list again and checking if each word has a length equal to minlen. If a word with the minimum length is found, it is appended to ans.*

*Finally, the function returns the list of words with the minimum length as a string, separated by spaces, by using the join method on the list ans.*

In [29]:
def shortwords(words):
    # Find the minimum length of words in the list
    minlen = len(words[0])
    for word in words:
        if len(word) < minlen:
            minlen = len(word)
    
    # Create a list of all words with the minimum length
    ans = []
    for word in words:
        if len(word) == minlen:
            ans.append(word)
    
    # Return the list of words with the minimum length as a string, separated by spaces
    return ' '.join(ans)

shortwords(['aa', 'b', 'cc', 'd'])

'b d'
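
The same result can also be obtained in a single pass, at the cost of discarding the collected words whenever a strictly shorter word appears. The sketch below shows one way to do it; the name `shortwords_onepass` is hypothetical, and unlike the two-pass version it returns an empty string for an empty list.

```python
def shortwords_onepass(words):
    minlen = None  # length of the shortest word seen so far
    ans = []       # all words of that length, in their original order
    for word in words:
        if minlen is None or len(word) < minlen:
            # A strictly shorter word invalidates everything collected so far
            minlen = len(word)
            ans = [word]
        elif len(word) == minlen:
            ans.append(word)
    return ' '.join(ans)


print(shortwords_onepass(['aa', 'b', 'cc', 'd']))  # b d
```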

#### Task 7. Define the volume of water

The game PitCraft takes place in a two-dimensional world consisting of 1 by 1 meter blocks. The player's island is represented by a set of columns of different heights, made up of stone blocks and surrounded by sea. Heavy rain has passed over the island, filling all the valleys with water; any excess water flowed into the sea without raising its level.
Determine the number of water blocks left in the valleys of the island landscape.

<img src="./pics/2_1_pic.png">

In [35]:
def isleflood(h):
    # Find the position of the highest column
    maxpos = 0
    for i in range(len(h)):
        if h[i] > h[maxpos]:
            maxpos = i

    ans = 0
    nowm = 0
    # Count the trapped water in the columns before the highest one
    for i in range(maxpos):
        if h[i] > nowm:
            nowm = h[i]
        ans += nowm - h[i]

    nowm = 0
    # Count the trapped water in the columns after the highest one
    for i in range(len(h) - 1, maxpos, -1):
        if h[i] > nowm:
            nowm = h[i]
        ans += nowm - h[i]

    return ans


isleflood([3, 1, 4, 3, 5, 1, 1, 3, 1])

7
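
The solution above relies on the fact that the water level over a column is limited by the tallest column on each side of it. A more direct formulation of the same idea, which uses O(N) extra memory, computes both limits explicitly. This is a cross-check sketch rather than the course's solution; the name `isleflood_levels` is hypothetical.

```python
def isleflood_levels(h):
    n = len(h)
    if n == 0:
        return 0

    # left[i] is the height of the tallest column among h[0..i]
    left = [0] * n
    left[0] = h[0]
    for i in range(1, n):
        left[i] = max(left[i - 1], h[i])

    # right[i] is the height of the tallest column among h[i..n-1]
    right = [0] * n
    right[n - 1] = h[n - 1]
    for i in range(n - 2, -1, -1):
        right[i] = max(right[i + 1], h[i])

    # Water over column i rises to the lower of the two surrounding peaks
    return sum(min(left[i], right[i]) - h[i] for i in range(n))


print(isleflood_levels([3, 1, 4, 3, 5, 1, 1, 3, 1]))  # 7
```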

#### Task 8. RLE

Given a string (possibly empty), consisting of letters A-Z:  
`AAAABBBCCXYZDDDDEEEFFFAAAAAABBBBBBBBBBBBBBBBBBBBBBBBBBBB`    
You need to write a function RLE, which will output a string in the form:  
`A4B3C2XYZD4E3F3A6B28`  
Explanation: if a symbol appears once, it remains unchanged; if a symbol is repeated several times in a row, the number of repetitions is appended to it.

First, let's practice on a simpler version of this task: return a string in which every run of consecutive repeated letters is collapsed into a single letter.

In [53]:
def easypeasy(s):
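    # the task allows an empty string; without this check s[0] would raise IndexError
    if not s:
        return ''
    # initializing variables to store the last symbol, and the answer list
    lastsym = s[0]
    ans = []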
    # looping through the string starting from the second symbol (index 1)
    for i in range(1, len(s)):
        # if the current symbol is different from the last symbol
        if s[i] != lastsym:
            # append the last symbol to the answer list
            ans.append(lastsym)
            # update the last symbol to be the current symbol
            lastsym = s[i]
    # after the loop is done, append the last symbol to the answer list
    ans.append(lastsym)
    # join all the elements in the answer list into a single string and return it
    return ''.join(ans)


# testing the function
easypeasy("AAAABBBCCXYZDDDDEEEFFFAAAAAABBBBBBBBBBBBBBBBBBBBBBBBBBBB")

'ABCXYZDEFAB'

And now let's solve the task itself. 

In [54]:
def rle(s):
    """
    This function implements the RLE algorithm for compressing a string.
    If a character occurs more than once in the string, it will be compressed into the character followed by the number of repetitions.

    Parameters:
    s (str): The string to be compressed.

    Returns:
    str: The compressed string.
    """
    
    def pack(s, cnt):
        """
        Helper function to format a character and its count into a string.

        Parameters:
        s (str): The character to be packed.
        cnt (int): The number of repetitions of the character.

        Returns:
        str: The packed string.
        """
        if cnt > 1:
            return s + str(cnt)
        return s
    
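    # The task allows an empty string; there is nothing to compress in that case
    if not s:
        return ''

    lastsym = s[0]  # Store the symbol of the current run
    lastpos = 0  # Store the position where the current run starts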
    ans = []  # Create an empty list for the answer
    
    # Iterate over all characters in the string
    for i in range(len(s)):
        # If the current character is different from the last symbol
        if s[i] != lastsym:
            # Add the packed form of the last symbol to the answer list
            ans.append(pack(lastsym, i - lastpos))
            lastpos = i  # Update the last position
            lastsym = s[i]  # Update the last symbol
    
    # Add the packed form of the last symbol to the answer list
    ans.append(pack(s[lastpos], len(s) - lastpos))
    
    # Join all elements in the answer list into a single string
    return ''.join(ans)

rle('AAAABBBCCXYZDDDDEEEFFFAAAAAABBBBBBBBBBBBBBBBBBBBBBBBBBBB')


'A4B3C2XYZD4E3F3A6B28'
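
For comparison, the same encoding can be sketched with `itertools.groupby` from the standard library, which splits a sequence into runs of equal consecutive elements. The name `rle_groupby` is hypothetical. This variant also handles the empty string, because `groupby` simply yields nothing for it.

```python
from itertools import groupby


def rle_groupby(s):
    parts = []
    # groupby yields (symbol, iterator over the run of that symbol)
    for sym, run in groupby(s):
        cnt = sum(1 for _ in run)  # length of the current run
        # A single symbol stays as is; longer runs get their length appended
        parts.append(sym + str(cnt) if cnt > 1 else sym)
    return ''.join(parts)


# Build the example string from the task statement run by run
example = 'A' * 4 + 'B' * 3 + 'C' * 2 + 'XYZ' + 'D' * 4 + 'E' * 3 + 'F' * 3 + 'A' * 6 + 'B' * 28
print(rle_groupby(example))  # A4B3C2XYZD4E3F3A6B28
```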