# 3. Longest Substring Without Repeating Characters

### Difficulty: <font color = orange> Medium </font>

---

Given a string s, find the length of the longest substring without repeating characters.

---

**Example 1:**



Input: s = "abcabcbb"

Output: 3

Explanation: The answer is "abc", with the length of 3.

---

**Example 2:**

Input: s = "bbbbb"

Output: 1

Explanation: The answer is "b", with the length of 1.

---

**Example 3:**

Input: s = "pwwkew"

Output: 3

Explanation: The answer is "wke", with the length of 3.

Notice that the answer must be a substring; "pwke" is a subsequence and not a substring.
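To make the substring / subsequence distinction concrete, here is a small sketch. The helper name `is_subsequence` is made up for this illustration and is not part of the problem; Python's `in` operator on strings already tests for a contiguous substring.

```python
def is_subsequence(candidate: str, s: str) -> bool:
    """Return True if candidate's characters appear in s in order (gaps allowed)."""
    remaining = iter(s)
    # `ch in remaining` advances the iterator, so each match must come after the previous one
    return all(ch in remaining for ch in candidate)


s = "pwwkew"

# substring check: the characters must be contiguous in s
print("wke" in s)
print("pwke" in s)

# subsequence check: the characters only need to appear in order
print(is_subsequence("pwke", s))
```

Only the contiguous match counts for this problem, which is why "pwke" does not qualify even though its characters are all distinct.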

---

Constraints:

- $0 <= s.length <= 5 * 10^4$

- s consists of English letters, digits, symbols and spaces

---

## Approach Overview: 

Loop through every character in the string. Each time we encounter a new character, we add it to a hashset that holds the current substring and then compute the substring's length. Each time we encounter a duplicate character (i.e. a character that's previously been added to the substring hashset), we remove characters from the hashset, starting from the left, until the earlier copy of the duplicate has been removed.

This is a sliding window: we grow the window each time we encounter a new character, and we shrink it each time we encounter a duplicate, continuing to shrink until the earlier copy of the duplicate is gone. What's interesting is that we don't make the mistake of just removing the duplicate character from the hashset in one go. No, no, no. We do it in a smarter way, so that we don't lose track of the longest possible non-repeating substring: we remove characters one at a time, starting with the leftmost character of the current window, then the next one, and so on, until we have passed the position of the earlier copy of the duplicate character. This ensures that the left pointer always points to the very start of a non-repeating substring.

Example:

say `s = "abseopsrtakox"`

After first complete substring pass: `substring = "abseop"` , `length of substring = 6`

Once we reach the `'s'` at index `position 6`, we realize that `'s'` is already in the hashset, so we can't add it yet.

What we do now is start removing characters from the hashset, beginning with the very first character we added (i.e. the first character in string s), then the second, and so on, until the earlier copy of the duplicate character (in this case the `'s'` at `index position 2`) has been removed. This ensures we correctly calculate the longest consecutive non-repeating substring. Say we instead just moved the left pointer to `position 6` and started counting a new substring from there: this would be incorrect, because the next non-repeating substring in s clearly starts at `index position 3` (where character `'e'` is), giving `"eops"`, and not at `position 6`.
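The difference between the two strategies can be sketched in code. Both function names below are invented for this comparison: `longest_with_naive_reset` is the incorrect shortcut described above (throw the window away and restart at the duplicate), and `longest_with_left_shrink` removes characters from the left one at a time.

```python
def longest_with_naive_reset(s: str) -> int:
    """Incorrect variant: on a duplicate, discard the whole window and restart there."""
    seen = set()
    best = 0
    for ch in s:
        if ch in seen:
            # throws away characters that could still belong to the next window
            seen = set()
        seen.add(ch)
        best = max(best, len(seen))
    return best


def longest_with_left_shrink(s: str) -> int:
    """Correct variant: drop characters from the left until the duplicate is gone."""
    seen = set()
    left = 0
    best = 0
    for right, ch in enumerate(s):
        while ch in seen:
            seen.remove(s[left])
            left += 1
        seen.add(ch)
        best = max(best, right - left + 1)
    return best


for text in ("abseopsrtakox", "dvdf"):
    print(text, longest_with_naive_reset(text), longest_with_left_shrink(text))
```

On both sample strings the reset variant reports a shorter length than the shrinking variant, because it forgets characters such as `'e'`, `'o'` and `'p'` that are still valid after the duplicate.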


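Our algo knows this and ensures it never loses track of the next consecutive non-repeating substring in s.

The next consecutive non-repeating substring grows to `"eopsrtak"` (length 8) before the second `'o'` at `index position 11` forces the window to shrink again, this time to `"psrtako"`. Adding the final `'x'` gives `"psrtakox"`, also of length 8, so the answer for this string is 8.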

Very sharp algo! 
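To follow the window step by step, here is a minimal tracing sketch. The function name `trace_windows` is hypothetical and exists only for this walkthrough; it records the window after each character is added.

```python
def trace_windows(s: str) -> list[tuple[int, int, str]]:
    """Return (left, right, window) after each character of s joins the window."""
    seen = set()
    left = 0
    steps = []
    for right, ch in enumerate(s):
        # shrink from the left until the earlier copy of ch has been removed
        while ch in seen:
            seen.remove(s[left])
            left += 1
        seen.add(ch)
        steps.append((left, right, s[left:right + 1]))
    return steps


for left, right, window in trace_windows("abseopsrtakox"):
    print(f"left={left} right={right} window={window!r} length={len(window)}")
```

Reading the printed lines top to bottom shows the left pointer jumping forward only when a duplicate arrives, and never further than needed.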



## Key Steps:

**1. As always we start by initializing critical variables that will help us to do our dirty effective work (*inserts evil laugh*).**

These include a right pointer, left pointer, a hashset to store unique characters and maxlength (to store length of longest possible non-repeating substring)

Initialize both left and right to 0 (such that it points at the index of the very first char in string s)

  `left, right = 0, 0` 
  
  `substring = set()`
  
  `maxlength = 0`
  
**2. Now we shall begin traversing each char in string s (in search of the longest possible non-repeating substring)**

To do this we need to loop through every character in s, so let's use a while loop.

We continue to loop until the right pointer is no longer smaller than the length of string s (i.e. until right has moved past the last index of s).

   `while right < len(s):`
   
**3. Next we shall check if the current character s[right] is NOT in the substring hashset**
    
   `if s[right] not in substring:`
    
**4. If the current character is not in the substring hashset then add it in there, then calculate the length of the current substring and THEN increment right**
   
   
   *add the current character to substring hashset*
   
   `substring.add( s[right] )`
   
   *calculate the length of the current substring and check if it's larger than the longest one we've seen so far*
   
   `maxlength = max(maxlength, (right - left) + 1 )`
   
   *move to next consecutive character in string s*
   
   `right += 1`
     
**5. Else if the character is in the substring hashset (the current character is a duplicate), then begin removing characters from the hashset, starting with the leftmost character (until we reach and remove the earlier copy of the duplicate character).**
    
   *if the character is in the substring hashset*
    
   `else:`
    
   *remove the leftmost character of the current window from the substring hashset*
    
   `substring.remove(s[left])`
    
   *move left pointer to next leftmost character in string s / increment left pointer*
    
   `left += 1`

**6. Return length of Longest Substring Without Repeating Characters**

   `return maxlength`

```python
class Solution:
    def lengthOfLongestSubstring(self, s: str) -> int:

        # length of the longest non-repeating substring seen so far
        maxlength = 0

        # initialize left and right pointers to zero
        left, right = 0, 0

        # hashset holding the characters of the current non-repeating substring
        subString = set()

        # loop through each character in string s;
        # exit the loop once right moves past the last character
        while right < len(s):

            # check if character is not in substring hashset
            if s[right] not in subString:

                # if not in hashset, then add character to hashset
                subString.add(s[right])

                # compute the current length of substring and update maxlength
                maxlength = max((right - left) + 1, maxlength)

                # move on to next character in string by incrementing right pointer
                right += 1

            # if character is in hashset
            else:

                # remove the leftmost character of the window from the hashset
                # e.g. if s = "abceq", s[left] = "a" (since left = 0 on the first pass)
                # right does not move here, so this branch repeats, removing one
                # character per pass, until the earlier copy of the duplicate is gone
                subString.remove(s[left])

                # move left pointer to the next character in string s
                left += 1

        # return length of longest substring without repeating characters
        return maxlength
```
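As a quick sanity check, the while-loop version can be run against the three examples from the problem statement, plus the empty string that the constraints allow. The standalone function name `length_of_longest_substring` is my own; it mirrors the method above so the snippet runs without the `Solution` class.

```python
def length_of_longest_substring(s: str) -> int:
    """Standalone copy of the while-loop sliding window, for testing only."""
    max_length = 0
    left = right = 0
    window = set()
    while right < len(s):
        if s[right] not in window:
            # new character: grow the window to the right
            window.add(s[right])
            max_length = max(max_length, right - left + 1)
            right += 1
        else:
            # duplicate: shrink the window from the left by one character
            window.remove(s[left])
            left += 1
    return max_length


assert length_of_longest_substring("abcabcbb") == 3
assert length_of_longest_substring("bbbbb") == 1
assert length_of_longest_substring("pwwkew") == 3
assert length_of_longest_substring("") == 0  # the loop body never runs
print("all examples pass")
```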

## Alternative solution: Using a `for loop` to iterate through every character in string `s`

I believe this one is the cleaner of the two solutions. Both run in O(n) time, since each character is added to and removed from the hashset at most once. A common theme I'm noticing with sliding window algos is that they can be implemented with a for loop, and in all the cases I've encountered so far the `for loop` implementation is generally cleaner and no less efficient.

In this one we discard the right pointer and only declare and initialize the left pointer and the character / substring hashset.

We use a `for loop` to loop through every character in string s. The logic of the algo changes a bit.

**Key Changes:**

- Instead of checking whether the current character is unique / not in the substring hashset, we first check whether it is in the substring hashset.
 
If it is in the hashset, we start removing characters from the hashset, starting with the leftmost character of the current window, and we stop once the earlier copy of the duplicate character has been removed.
 
After we remove the leftmost character we increment left, so that it now points to the next leftmost character in string `s`.
 
We keep repeating this block of code until the left pointer has passed the earlier copy of the duplicate character and that copy has been removed from the hashset. Only then do we add the current character and update maxlength.

---
Very interesting and deep problem. No surprise that LeetCode rates it Medium.  

```python
class Solution:
    def lengthOfLongestSubstring(self, s: str) -> int:

        # initialize left pointer to 0 (such that it points to leftmost / first char in s)
        left = 0

        # initialize maxlength to 0 (length of the longest unique substring seen so far)
        maxlength = 0

        # hashset holding the characters of the current unique substring
        substring = set()

        # iterate through every character in string s
        for i in range(len(s)):

            # repeat while current character is in substring hashset
            while s[i] in substring:

                # remove the leftmost character of the window from the substring hashset
                substring.remove(s[left])

                # increment left pointer such that it now points to the next leftmost char in s
                left += 1

            # add character to substring hashset
            substring.add(s[i])

            # compute the length of the current substring
            # and update maxlength if it's larger than the longest one we've seen so far
            maxlength = max(maxlength, i - left + 1)

        # return the length of the longest substring without repeating characters
        return maxlength
```
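As a rough check on the running time, the sketch below counts how many hashset insertions and removals the for-loop version performs. The function name `count_set_operations` and its counters are additions of mine for illustration. Every character is added exactly once and removed at most once, so the total work stays within about `2 * len(s)` set operations.

```python
def count_set_operations(s: str) -> tuple[int, int, int]:
    """Return (max_length, additions, removals) for the for-loop sliding window."""
    window = set()
    left = 0
    max_length = additions = removals = 0
    for right, ch in enumerate(s):
        # shrink from the left until ch is no longer in the window
        while ch in window:
            window.remove(s[left])
            left += 1
            removals += 1
        window.add(ch)
        additions += 1
        max_length = max(max_length, right - left + 1)
    return max_length, additions, removals


for text in ("abcabcbb", "bbbbb", "pwwkew"):
    length, additions, removals = count_set_operations(text)
    print(f"{text!r}: length={length}, additions={additions}, removals={removals}")
```

The while-loop version performs the same insertions and removals, just spread over more iterations of a single loop, which is why the two versions have the same time complexity.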