## Find median in a stream

                            
Given an input stream of n integers, insert each integer into the stream and print the median of the stream after each insertion.



```
Flow in stream: 5, 15, 1, 3
5 goes to stream --> median 5 (5)
15 goes to stream --> median 10 (5, 15)
1 goes to stream --> median 5 (5, 15, 1)
3 goes to stream --> median 4 (5, 15, 1, 3)
```

A simple brute-force solution is to append each new element to a list, sort the list, and read off the median depending on whether the number of elements is even or odd. Sorting costs n.log(n) per insertion (once the list is sorted, the median is a constant-time index lookup), and this is repeated for each of the n elements added, so the overall time complexity is n.(n.log(n)). Let's try to optimize it.
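
For reference, the brute-force idea described above could be sketched roughly as follows. The function name `running_medians_bruteforce` is made up for illustration, and the sketch assumes the whole stream is available as a list.

```python
def running_medians_bruteforce(stream):
    """Return the median after each insertion by re-sorting every time."""
    seen = []
    medians = []
    for x in stream:
        seen.append(x)
        ordered = sorted(seen)          # full sort on every insertion
        mid = len(ordered) // 2
        if len(ordered) % 2 == 0:
            # even count: average of the two middle elements
            medians.append((ordered[mid - 1] + ordered[mid]) / 2)
        else:
            # odd count: the single middle element
            medians.append(ordered[mid])
    return medians

print(running_medians_bruteforce([5, 15, 1, 3]))  # [5, 10.0, 5, 4.0]
```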

In [4]:
def insert_sorted(arr,x):
    """
    Insert item x in list arr, and keep it sorted assuming arr is sorted.
    """
    arr.append(x)
    i = len(arr)-1
    key = arr[i]
    j = i-1
    
    while j>=0 and arr[j]>key:
        arr[j+1] = arr[j]
        j=j-1
    arr[j+1] = key
    return arr

In [7]:
new_ar = insert_sorted([3,5,15,25],10)

In [8]:
new_ar

[3, 5, 10, 15, 25]

In [13]:
if len(new_ar)%2 == 0:
    med = (new_ar[len(new_ar)//2] + new_ar[len(new_ar)//2-1])/2.
else:
    med = new_ar[len(new_ar)//2]

In [14]:
med

10

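The time complexity of this is n^2: each insertion is one insertion-sort step that shifts up to n elements, and there are n elements in the stream. We can still do better.

We can use the built-in `insort` function from the `bisect` module instead of the custom `insert_sorted` to make it faster. `insort` finds the insertion position by binary search in log(n), although the list insertion itself still shifts elements, so the worst case per insertion remains n.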


In [16]:
from bisect import insort
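
A minimal sketch of how `insort` might be combined with the median lookup shown earlier. The function name `running_medians_insort` is hypothetical, and the stream is assumed to be a plain list of integers.

```python
from bisect import insort

def running_medians_insort(stream):
    """Return the median after each insertion, keeping a sorted list via insort."""
    ordered = []
    medians = []
    for x in stream:
        insort(ordered, x)              # binary search for the slot, then insert in place
        mid = len(ordered) // 2
        if len(ordered) % 2 == 0:
            medians.append((ordered[mid - 1] + ordered[mid]) / 2)
        else:
            medians.append(ordered[mid])
    return medians

print(running_medians_insort([5, 15, 1, 3]))  # [5, 10.0, 5, 4.0]
```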

## Is palindrome

In [37]:
st = '12ISAasw; 1'

In [38]:
import re
import string

In [47]:
re.sub(f"([{string.punctuation}] )",'', st.strip())

'12ISAasw1'

In [52]:
re.sub("([%s] )"%(string.punctuation),'', st.strip())

'12ISAasw1'

In [76]:
#code
import re
import string

def is_palindrome(st):
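    # Remove ASCII punctuation, a few typographic symbols, and spaces.
    # re.escape keeps characters such as ] and \ literal inside the character class.
    st2 = re.sub("[{}“”¨«»®´·º½¾¿¡§£₤‘’ ]".format(re.escape(string.punctuation)), '', st)
    st2 = st2.lower()  # compare case-insensitively
    if st2 == st2[::-1]: return 'YES'
    else: return 'NO'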

In [77]:
is_palindrome('I am :IronnorI Ma, i')

'YES'

## k largest (or smallest) elements in an array
Question: Write an efficient program for printing k largest elements in an array. Elements in array can be in any order.
For example, if given array is [1, 23, 12, 9, 30, 2, 50] and you are asked for the largest 3 elements i.e., k = 3 then your program should print 50, 30 and 23.

* The simplest method is to sort the array (`n.log(n)`) and pick the first k elements (`k`). The overall time complexity is `n.log(n)`. Let's optimize it.
* Another approach is to use bubble sort instead of quicksort, but run the outer loop only k times: after k passes the k largest (or smallest) elements have already been pushed to the right end, which is all we need. The time complexity is `k.n` (better than a full sort if k < log(n)).
* Use a min/max heap (heapify).

The heap method keeps the k candidates in a Min Heap instead of a temporary array:

1) Build a Min Heap MH of the first k elements (arr[0] to arr[k-1]) of the given array. This is O(k).

2) For each element after the kth element (arr[k] to arr[n-1]), compare it with the root of MH.
   a) If the element is greater than the root, make it the root and call heapify for MH.
   b) Else ignore it.

   Step 2 is O((n-k)*logk).

3) Finally, MH has the k largest elements, and the root of MH is the kth largest element.

Time Complexity: O(k + (n-k)Logk) without sorted output. If sorted output is needed then O(k + (n-k)Logk + kLogk)

All of the above methods can also be used to find the kth largest (or smallest) element.
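
The min-heap steps above can be sketched as a single function. The name `k_largest` is made up for illustration, and the sketch assumes 0 < k <= len(arr). It uses `heapq.heapreplace`, which pops the root and pushes the new element in O(log k).

```python
import heapq

def k_largest(arr, k):
    """Return the k largest elements of arr in descending order."""
    heap = arr[:k]
    heapq.heapify(heap)                 # step 1: min-heap of the first k elements
    for x in arr[k:]:
        if x > heap[0]:                 # step 2: compare with the root (current kth largest)
            heapq.heapreplace(heap, x)  # replace the root and restore the heap
    return sorted(heap, reverse=True)   # optional extra step for sorted output

print(k_largest([1, 23, 12, 9, 30, 2, 50], 3))  # [50, 30, 23]
```

The standard library's `heapq.nlargest(k, arr)` returns the same result and is usually the more convenient choice in practice.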

In [15]:
import heapq

In [24]:
arr = [1, 13, 12, 9, 30, 23, 50] 

In [25]:
k=3

In [26]:
arr_sub = arr[:k]

In [27]:
heapq.heapify(arr_sub)

In [28]:
arr_sub

[1, 13, 12]

In [29]:
arr_rem = arr[k:]

In [30]:
for x in arr_rem:
    if x > arr_sub[0]:
        # replace the root and restore the heap in O(log k),
        # instead of rebuilding the whole heap with heapify
        heapq.heapreplace(arr_sub, x)
        print(arr_sub)

[9, 13, 12]
[12, 13, 30]
[13, 23, 30]
[23, 50, 30]


In [23]:
arr_sub

[23, 50, 30]
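
The partial bubble sort idea from the list of methods at the start of this section could look roughly like this. It is only a sketch: the name `k_largest_bubble` is hypothetical, and it works on a copy so that the input list is not modified.

```python
def k_largest_bubble(arr, k):
    """Return the k largest elements using only k passes of bubble sort."""
    a = list(arr)                       # copy, so the caller's list is untouched
    n = len(a)
    passes = min(k, n)
    for p in range(passes):             # each pass bubbles the next-largest element right
        for j in range(n - 1 - p):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
    return a[n - passes:][::-1]         # last k slots, largest first

print(k_largest_bubble([1, 23, 12, 9, 30, 2, 50], 3))  # [50, 30, 23]
```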

## Find triplets with zero sum
Given an array A of N elements, complete the function so that it returns true if there is a triplet in A whose sum is zero, and false otherwise.
Output:
For each test case, output will be 1 if triplet exists else 0.

For every pair of elements, search for 0 - (sum of the two elements); if it is found, return true. There are about n.(n-1)/2 pairs, and each list search is O(n), so brute force is n^3 overall.

This version is not optimized.

In [209]:
st_ar = '6 56 93 -12 26 78 79 58 53 52 51 55 77 -2 61 -26 91 16 100 -8 72'

In [210]:
arr = st_ar.split(' ')

In [211]:
arr = [int(a) for a in arr]

In [197]:
def findTriplets(arr):
    for i in range(len(arr)):
        for j in range(i+1,len(arr)):
            su = arr[i] + arr[j]
            if -su in arr[j+1:]: return True
    return False

In [199]:
findTriplets(arr)

False

We can optimize it by first sorting the list (n.log(n)). Then, inside the first loop, rather than using a second inner loop followed by a lookup in the list (which adds n^2), we scan the rest of the list with two pointers moving inward from both ends. This does the loop and the lookup together and adds only n per outer iteration, so the total is n^2.

In [214]:
def find_triplets_optimized(arr):
    arr.sort()
    for i in range(len(arr)-2):  # leave room for two more elements: l = i+1 and r = len(arr)-1
        l=i+1
        r=len(arr)-1
        while l<r:
            if arr[i]+arr[l]+arr[r] < 0: 
                l+=1
            elif arr[i]+arr[l]+arr[r] > 0:
                r-=1
            else:
                return True
    return False

In [215]:
find_triplets_optimized(arr)

False
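
Since the test array above contains no zero-sum triplet, it can help to check the two-pointer scan on an input that does. Below is a sketch of a variant that returns the triplet itself instead of a boolean. The name `find_zero_triplet` and the small example input are made up for illustration, and it sorts a copy rather than the input list.

```python
def find_zero_triplet(arr):
    """Return one triplet (a, b, c) with a + b + c == 0, or None if there is none."""
    nums = sorted(arr)                  # sorted copy; the input list is left unchanged
    for i in range(len(nums) - 2):
        l, r = i + 1, len(nums) - 1
        while l < r:
            total = nums[i] + nums[l] + nums[r]
            if total < 0:
                l += 1                  # sum too small: move the left pointer right
            elif total > 0:
                r -= 1                  # sum too large: move the right pointer left
            else:
                return nums[i], nums[l], nums[r]
    return None

print(find_zero_triplet([0, -1, 2, -3, 1]))  # (-3, 1, 2)
```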

## Longest valid Parentheses
Given a string S consisting of opening and closing parentheses '(' and ')', find the length of the longest valid parentheses substring.



This can be solved in O(n) with indexing.

In [228]:
st = ')()())'

In [233]:
def longest_valid_par(st):
    stck = []
    ctr = 0
    for i,s in enumerate(st):
        stck.append(s)
        if i>0 and stck[-2] == '(' and stck[-1] == ')':
            print(stck)
            ctr+=2
    return ctr

In [234]:
longest_valid_par(st)

[')', '(', ')']
[')', '(', ')', '(', ')']


4

The above code is wrong and fails for cases like `(())`: the longest valid substring has length 4, but it returns 2.

In [239]:
longest_valid_par('(())')

['(', '(', ')']


2

This can be solved using a stack: push every `(`, and on each `)` pop the last `(` to match it.

In [243]:
def longest_valid_par(st):
    stck = []
    ctr = 0
    for i,s in enumerate(st):
        if s == '(':
            stck.append(s)
        elif s == ')' and len(stck)>0:  
            stck.pop()
            ctr+=2
    return ctr

In [262]:
longest_valid_par(')(((())(()()(()))))()(()))(((()))((()(')

32

In [247]:
longest_valid_par(')()')

2

The above answer is still wrong: it keeps counting when an invalid parenthesis interrupts the sequence and more valid ones follow. The question is not to count all valid parentheses, but to find the longest stretch in which every parenthesis is valid, without any invalid one in between. So we restart the count whenever we encounter an invalid one.

In [267]:
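def solve(parenthesis):
    stack = []  # for each unmatched '(', the valid length that ended just before it
    cur = 0     # length of the valid run ending at the current position
    ret = 0     # longest valid run seen so far
    for e in parenthesis:
        if e == '(':
            stack.append(cur)
            cur = 0
        elif e == ')' and len(stack) > 0:
            cur += stack.pop() + 2  # join with the valid run before the matching '('
            ret = max(ret, cur)
        elif e == ')' and len(stack) == 0:  # restarts
            cur = 0
    return ret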

In [268]:
solve(')(((())(()()(()))))()(()))(((()))((()(')

24
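
The index-based idea mentioned at the start of this section could be sketched as follows. This is a commonly used alternative rather than the method above: the stack holds indices instead of lengths, and its bottom entry marks the position just before the current candidate substring. The name `longest_valid_by_index` is made up for illustration.

```python
def longest_valid_by_index(s):
    """Length of the longest valid parentheses substring, using a stack of indices."""
    stack = [-1]                        # index just before the current candidate substring
    best = 0
    for i, ch in enumerate(s):
        if ch == '(':
            stack.append(i)             # remember where this '(' is
        else:
            stack.pop()                 # match the latest '(' (or drop the base marker)
            if not stack:
                stack.append(i)         # unmatched ')': it becomes the new base
            else:
                best = max(best, i - stack[-1])
    return best

print(longest_valid_by_index(')()())'))  # 4
print(longest_valid_by_index('(())'))    # 4
```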