# Longest substring of limited content

You are given an input string and a number k. Find the longest substring that uses a limited dictionary of at most k different characters.

For example, with k = 2 and the input string 'qwaaabbty', a correct answer would be 'aaabb'.
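
Before optimizing, it can help to have a slow but obviously correct reference to compare against. The sketch below is not part of the original notebook: `longest_brute` is a hypothetical helper name, and it assumes that "limited dictionary" means at most k distinct characters, as the example above suggests. It tries every substring, so it is only suitable for short test inputs.

```python
def longest_brute(s, k):
    """Reference answer: try every substring and keep the longest one
    that contains at most k distinct characters (first one wins on ties)."""
    best = ''
    for i in range(len(s)):
        for j in range(i + 1, len(s) + 1):
            candidate = s[i:j]
            # set(candidate) holds the distinct characters of the substring
            if len(set(candidate)) <= k and len(candidate) > len(best):
                best = candidate
    return best

print(longest_brute('qwaaabbty', 2))  # aaabb
```

With roughly n^2 substrings and a set built for each, this is about cubic in the length of the string, which is fine for checking small cases by hand.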

In [3]:
# Tracing first, to practice tracing
"""
We'll use a sliding window with L and R pointers
(R is exclusive, so the window is s[L:R]).
At the start, both pointers are at the left end, so the window is empty.

If the window between L and R is valid:
    check whether it beats the current best (if yes, update the current best by saving L and R)
    check whether we need to stop
    increment R
If it is not valid:
    check whether we need to stop
    increment L

Stop if R is at the right border of the input string and the window is valid,
    or if L reaches the right border of the string
    (technically we can stop earlier, once L is within "current best" length of the end of the string)

Checking that the window is valid:
    keep track of the count of each char within the window
    (update the counts when R and L move)
    if more than k chars have counts > 0, the window is invalid
""";
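
The validity bookkeeping from the plan above can be sketched on its own. The helper names `enter`, `leave` and `is_valid` are made up for this illustration and are not used elsewhere in the notebook. One assumption differs from the plan: entries that drop to zero are deleted, so the number of keys equals the number of distinct characters in the window.

```python
from collections import Counter

def enter(counts, ch):
    """A character joins the window on the right (R moved)."""
    counts[ch] += 1  # Counter treats missing keys as 0

def leave(counts, ch):
    """A character drops out of the window on the left (L moved)."""
    counts[ch] -= 1
    if counts[ch] == 0:
        del counts[ch]  # keep len(counts) equal to the distinct-char count

def is_valid(counts, k):
    """The window is valid if it holds at most k distinct characters."""
    return len(counts) <= k

counts = Counter()
for ch in 'aab':
    enter(counts, ch)
print(is_valid(counts, 2))  # window 'aab'  -> True
enter(counts, 'c')
print(is_valid(counts, 2))  # window 'aabc' -> False
leave(counts, 'a')
leave(counts, 'a')
print(is_valid(counts, 2))  # window 'bc'   -> True
```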

In [17]:
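def find(s, k):
    """Return the longest substring of s with at most k distinct characters."""
    if len(s) == 0:
        return ''
    l = r = 0    # Current window is s[l:r] (r is exclusive)
    bl = br = 0  # Current best window is s[bl:br]
    freq = {}    # Char counts inside the current window

    while True:
        # print(l, r, s[l:r], freq)
        if sum(1 for v in freq.values() if v > 0) <= k:  # Valid current string
            if r - l > br - bl:
                bl, br = l, r
            if r == len(s):
                break
            freq[s[r]] = freq.get(s[r], 0) + 1  # Grow the window to the right
            r += 1
        else:
            if l == len(s):
                break
            freq[s[l]] -= 1  # Shrink the window from the left
            l += 1
    return s[bl:br]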

print(find('qwaabbr',2))
print(find('1',2))
print(find('abtry',2))
print(find('gabababtttry',1))
print(find('gabababtttry',2))
print(find('gabababtttry',3))

aabb
1
ab
ttt
ababab
abababttt
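
The solution above recounts the non-zero entries of `freq` on every step. A possible variant, sketched here under the same "at most k distinct characters" reading, keeps a running `distinct` counter instead, so the validity check is a single comparison. The name `find_fast` is hypothetical, and the loop is restructured (a `for` over R with an inner `while` that moves L), so treat it as an alternative sketch rather than the notebook's method.

```python
def find_fast(s, k):
    """Longest substring of s with at most k distinct characters."""
    if k <= 0:
        return ''  # no characters allowed, so only the empty substring fits
    counts = {}    # per-character counts inside the window s[l:r+1]
    distinct = 0   # how many characters currently have a count > 0
    best_l = best_r = 0
    l = 0
    for r, ch in enumerate(s):
        # Grow the window by one character on the right.
        if counts.get(ch, 0) == 0:
            distinct += 1
        counts[ch] = counts.get(ch, 0) + 1
        # Shrink from the left until the window is valid again.
        while distinct > k:
            counts[s[l]] -= 1
            if counts[s[l]] == 0:
                distinct -= 1
            l += 1
        # Strict comparison keeps the earliest window among equally long ones.
        if r + 1 - l > best_r - best_l:
            best_l, best_r = l, r + 1
    return s[best_l:best_r]

print(find_fast('qwaabbr', 2))
print(find_fast('gabababtttry', 3))
```

Each character enters and leaves the window at most once, so the work is linear in the length of the string. On the test inputs used above it returns the same substrings as `find`.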
