<a href="https://colab.research.google.com/github/walkerjian/DailyCode/blob/main/Substring_con_k_caratteri.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Given an integer k and a string s, find the length of the longest substring that contains at most k distinct characters.

For example, given s = "abcba" and k = 2, the longest substring with at most k distinct characters is "bcb", so the answer is 3.
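As a sanity check on the example, a brute-force sketch can enumerate every substring and keep the longest one whose set of characters has size at most k. The name `brute_force_longest` is hypothetical and not part of the original notebook; because it inspects every substring, it is only meant for verifying small inputs.

```python
def brute_force_longest(s, k):
    """Return the longest substring of s with at most k distinct characters.

    Hypothetical reference helper: it checks every substring, so it is
    only practical for short strings.
    """
    best = ""
    for i in range(len(s)):
        for j in range(i + 1, len(s) + 1):
            candidate = s[i:j]
            # set() collapses repeated characters, leaving the distinct ones.
            if len(set(candidate)) <= k and len(candidate) > len(best):
                best = candidate
    return best


example = brute_force_longest("abcba", 2)
print(example, len(example))  # prints: bcb 3
```

Ties are resolved in favour of the leftmost substring, because a candidate replaces the current best only when it is strictly longer.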

To tackle this problem using the Model-View-Controller (MVC) paradigm, we'll break it down into three parts:

1. **Model**: This will contain the core logic to find the longest substring with at most \( k \) distinct characters.
2. **View**: This will handle the representation of the solution, i.e., displaying the result to the user.
3. **Controller**: This will act as an interface between the model and the view, orchestrating the flow of data between them.


In [1]:
def longest_substring(s, k):
    """
    Find the longest substring that contains at most k distinct characters.

    Parameters:
    - s (str): The input string.
    - k (int): The maximum number of distinct characters allowed in the substring.

    Returns:
    - str: The longest substring with at most k distinct characters.
    """

    # Edge case: no non-empty substring qualifies for an empty string or k <= 0
    if not s or k <= 0:
        return ""

    # Initialize pointers and storage
    start = 0
    max_len = 0
    max_start = 0
    char_frequency = {}

    # Sliding window technique
    for end in range(len(s)):
        if s[end] not in char_frequency:
            char_frequency[s[end]] = 0
        char_frequency[s[end]] += 1

        # If there are more than k distinct characters, slide the window
        while len(char_frequency) > k:
            char_frequency[s[start]] -= 1
            if char_frequency[s[start]] == 0:
                del char_frequency[s[start]]
            start += 1

        # Update the max substring details if needed
        if end - start + 1 > max_len:
            max_len = end - start + 1
            max_start = start

    return s[max_start:max_start + max_len]


def display_result(s, k, result):
    """
    Display the result of the longest substring problem.

    Parameters:
    - s (str): The input string.
    - k (int): The maximum number of distinct characters allowed in the substring.
    - result (str): The solution to the problem.

    Returns:
    - str: A formatted string showing the input and the solution.
    """
    return f"For the string '{s}' and k = {k}, the longest substring with at most {k} distinct characters is '{result}'."


def find_longest_substring(s, k):
    """
    Controller function that uses the model to find the solution and the view to display it.

    Parameters:
    - s (str): The input string.
    - k (int): The maximum number of distinct characters allowed in the substring.

    Returns:
    - str: A formatted string showing the input and the solution.
    """
    result = longest_substring(s, k)
    return display_result(s, k, result)


# Test harness
def test_longest_substring():
    """
    Test the solution with various test cases.
    """
    test_cases = [
        ("abcba", 2),
        ("abcba", 3),
        ("aa", 1),
        ("a", 1),
        ("", 2),
        ("abcde", 2),
        ("abaccc", 2),
        ("aaaaaaa", 2),
        ("abcdef", 1),
        ("abba", 2)
    ]

    results = []
    for s, k in test_cases:
        results.append(find_longest_substring(s, k))
    return results


# Run the test harness
test_longest_substring()


["For the string 'abcba' and k = 2, the longest substring with at most 2 distinct characters is 'bcb'.",
 "For the string 'abcba' and k = 3, the longest substring with at most 3 distinct characters is 'abcba'.",
 "For the string 'aa' and k = 1, the longest substring with at most 1 distinct characters is 'aa'.",
 "For the string 'a' and k = 1, the longest substring with at most 1 distinct characters is 'a'.",
 "For the string '' and k = 2, the longest substring with at most 2 distinct characters is ''.",
 "For the string 'abcde' and k = 2, the longest substring with at most 2 distinct characters is 'ab'.",
 "For the string 'abaccc' and k = 2, the longest substring with at most 2 distinct characters is 'accc'.",
 "For the string 'aaaaaaa' and k = 2, the longest substring with at most 2 distinct characters is 'aaaaaaa'.",
 "For the string 'abcdef' and k = 1, the longest substring with at most 1 distinct characters is 'a'.",
 "For the string 'abba' and k = 2, the longest substring with at most 2 distinct characters is 'abba'."]

We can use the sliding window technique (also known as the two-pointer technique) to solve this problem in a single pass over the string. The idea is to maintain two pointers that delimit the current window, which is shrunk whenever necessary so that it holds at most \( k \) distinct characters. The right pointer moves one position to the right on every step, while the left pointer moves to the right only when the number of distinct characters between the two pointers exceeds \( k \).

The time complexity of this approach is \( O(n) \), where \( n \) is the length of the string, because each pointer only moves forward and therefore advances at most \( n \) times.
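One way to see the linear bound is to instrument the window and count how often each pointer advances. The sketch below is illustrative only: `count_pointer_moves` is a hypothetical helper, and it assumes `k >= 0`.

```python
def count_pointer_moves(s, k):
    """Run the sliding window and count the advances of each pointer.

    Hypothetical helper for illustration; assumes k >= 0.
    """
    counts = {}
    left = 0
    right_moves = 0
    left_moves = 0
    for right in range(len(s)):
        right_moves += 1
        counts[s[right]] = counts.get(s[right], 0) + 1
        # Each pass through this loop advances the left pointer exactly once.
        while len(counts) > k:
            counts[s[left]] -= 1
            if counts[s[left]] == 0:
                del counts[s[left]]
            left += 1
            left_moves += 1
    return right_moves, left_moves


print(count_pointer_moves("abcba", 2))  # prints: (5, 3)
```

Neither counter can exceed `len(s)`, so the total work is bounded by about `2 * len(s)` pointer moves plus average constant-time dictionary updates.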

Let's implement this efficient solution:

1. Start both pointers at the beginning of the string.
2. Move the right pointer to the right until the number of distinct characters between the two pointers exceeds \( k \).
3. When this happens, move the left pointer to the right until the number of distinct characters is at most \( k \) again.
4. Repeat until the right pointer reaches the end of the string.

We'll use a dictionary to maintain the frequency count of characters between the two pointers.
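The four steps above can be traced on the example string. The following sketch records the window after each move of the right pointer; `trace_window` is a hypothetical name introduced here for illustration, and it assumes `k >= 0`.

```python
def trace_window(s, k):
    """Return the window s[left:right + 1] after each step of the right pointer.

    Hypothetical tracing helper; assumes k >= 0.
    """
    counts = {}  # frequency of each character inside the current window
    left = 0
    windows = []
    for right, ch in enumerate(s):
        counts[ch] = counts.get(ch, 0) + 1
        # Shrink from the left while the window has too many distinct characters.
        while len(counts) > k:
            counts[s[left]] -= 1
            if counts[s[left]] == 0:
                del counts[s[left]]
            left += 1
        windows.append(s[left:right + 1])
    return windows


print(trace_window("abcba", 2))  # prints: ['a', 'ab', 'bc', 'bcb', 'ba']
```

The longest entry in the trace, `'bcb'`, is the answer for this input.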

In [2]:
def longest_substring_efficient(s, k):
    """
    Efficiently find the longest substring that contains at most k distinct characters.

    Parameters:
    - s (str): The input string.
    - k (int): The maximum number of distinct characters allowed in the substring.

    Returns:
    - str: The longest substring with at most k distinct characters.
    """

    # Edge case: no non-empty substring qualifies for an empty string or k <= 0
    if not s or k <= 0:
        return ""

    # Initialize pointers and storage
    left = 0
    right = 0
    max_len = 0
    max_start = 0
    char_frequency = {}

    while right < len(s):
        # If the current character is not in char_frequency, add it
        if s[right] not in char_frequency:
            char_frequency[s[right]] = 0

        # Increment the frequency count of the current character
        char_frequency[s[right]] += 1

        # If there are more than k distinct characters, move the left pointer to the right
        while len(char_frequency) > k:
            char_frequency[s[left]] -= 1
            if char_frequency[s[left]] == 0:
                del char_frequency[s[left]]
            left += 1

        # Update the max substring details if needed
        if right - left + 1 > max_len:
            max_len = right - left + 1
            max_start = left

        right += 1

    return s[max_start:max_start + max_len]

# Testing the efficient function
test_results_efficient = [longest_substring_efficient(s, k) for s, k in [
    ("abcba", 2),
    ("abcba", 3),
    ("aa", 1),
    ("a", 1),
    ("", 2),
    ("abcde", 2),
    ("abaccc", 2),
    ("aaaaaaa", 2),
    ("abcdef", 1),
    ("abba", 2)
]]

test_results_efficient


['bcb', 'abcba', 'aa', 'a', '', 'ab', 'accc', 'aaaaaaa', 'a', 'abba']
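Since the problem statement asks for the length rather than the substring itself, a length-only variant may be all that is needed. This is a minimal sketch under that reading; `longest_substring_length` is a hypothetical name, and `collections.defaultdict` is swapped in for the explicit membership check used above.

```python
from collections import defaultdict


def longest_substring_length(s, k):
    """Return the length of the longest substring with at most k distinct characters."""
    if k <= 0:
        return 0
    counts = defaultdict(int)  # missing characters start at a count of 0
    left = 0
    best = 0
    for right, ch in enumerate(s):
        counts[ch] += 1
        # Shrink the window until it holds at most k distinct characters.
        while len(counts) > k:
            counts[s[left]] -= 1
            if counts[s[left]] == 0:
                del counts[s[left]]
            left += 1
        best = max(best, right - left + 1)
    return best


print(longest_substring_length("abcba", 2))  # prints: 3
```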

In [5]:
%%writefile longest_substring.cpp
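#include <cstddef>
#include <iostream>
#include <string>
#include <unordered_map>
#include <utility>

std::string longest_substring_efficient(const std::string& s, int k) {
    // No non-empty substring qualifies when k is zero or negative.
    if (s.empty() || k <= 0) {
        return "";
    }

    // Use std::size_t for indices so comparisons with s.size() stay unsigned.
    const std::size_t limit = static_cast<std::size_t>(k);
    std::size_t left = 0;
    std::size_t right = 0;
    std::size_t max_len = 0;
    std::size_t max_start = 0;
    std::unordered_map<char, int> char_frequency;

    while (right < s.size()) {
        char_frequency[s[right]]++;

        // Shrink the window until it holds at most k distinct characters.
        while (char_frequency.size() > limit) {
            char_frequency[s[left]]--;
            if (char_frequency[s[left]] == 0) {
                char_frequency.erase(s[left]);
            }
            left++;
        }

        // Record the window if it is the longest seen so far.
        if (right - left + 1 > max_len) {
            max_len = right - left + 1;
            max_start = left;
        }

        right++;
    }

    return s.substr(max_start, max_len);
}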

int main() {
    std::pair<std::string, int> test_cases[] = {
        {"abcba", 2},
        {"abcba", 3},
        {"aa", 1},
        {"a", 1},
        {"", 2},
        {"abcde", 2},
        {"abaccc", 2},
        {"aaaaaaa", 2},
        {"abcdef", 1},
        {"abba", 2}
    };

    for (const auto& [s, k] : test_cases) {
        std::cout << "For the string '" << s << "' and k = " << k << ", ";
        std::cout << "the longest substring with at most " << k << " distinct characters is '";
        std::cout << longest_substring_efficient(s, k) << "'." << std::endl;
    }

    return 0;
}



Overwriting longest_substring.cpp


In [4]:
!g++-10 -std=c++20 -O3 longest_substring.cpp -o longest_substring
!./longest_substring


/bin/bash: line 1: g++-10: command not found
/bin/bash: line 1: ./longest_substring: No such file or directory


In [7]:
!sudo apt update
!sudo apt install -y g++-10
!g++-10 -std=c++20 -O3 longest_substring.cpp -o longest_substring
!./longest_substring


[33m0% [Working][0m            Hit:1 https://cloud.r-project.org/bin/linux/ubuntu jammy-cran40/ InRelease
[33m0% [Connecting to archive.ubuntu.com] [Connecting to security.ubuntu.com (185.1[0m                                                                               Hit:2 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64  InRelease
Hit:3 http://security.ubuntu.com/ubuntu jammy-security InRelease
Hit:4 http://archive.ubuntu.com/ubuntu jammy InRelease
Hit:5 http://archive.ubuntu.com/ubuntu jammy-updates InRelease
Hit:6 http://archive.ubuntu.com/ubuntu jammy-backports InRelease
Hit:7 https://ppa.launchpadcontent.net/c2d4u.team/c2d4u4.0+/ubuntu jammy InRelease
Hit:8 https://ppa.launchpadcontent.net/deadsnakes/ppa/ubuntu jammy InRelease
0% [Working][0m^C
Reading package lists... Done
^C
^C
Enter the string: ^C


In [1]:
!sudo apt update

[33m0% [Working][0m            Hit:1 https://cloud.r-project.org/bin/linux/ubuntu jammy-cran40/ InRelease
[33m0% [Connecting to archive.ubuntu.com (185.125.190.36)] [Connecting to security.[0m                                                                               Hit:2 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64  InRelease
[33m0% [Waiting for headers] [Connecting to security.ubuntu.com (91.189.91.83)] [Co[0m[33m0% [Waiting for headers] [Waiting for headers] [Connected to ppa.launchpadconte[0m                                                                               Hit:3 http://security.ubuntu.com/ubuntu jammy-security InRelease
Hit:4 http://archive.ubuntu.com/ubuntu jammy InRelease
Hit:5 http://archive.ubuntu.com/ubuntu jammy-updates InRelease
Hit:6 http://archive.ubuntu.com/ubuntu jammy-backports InRelease
Hit:7 https://ppa.launchpadcontent.net/c2d4u.team/c2d4u4.0+/ubuntu jammy InRelease
Hit:8 https://ppa.launchpadcontent.ne

In [2]:
!sudo apt install -y g++-10

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
g++-10 is already the newest version (10.5.0-1ubuntu1~22.04).
0 upgraded, 0 newly installed, 0 to remove and 19 not upgraded.


In [6]:
!g++-10 -std=c++20 -O3 longest_substring.cpp -o longest_substring

In [7]:
!./longest_substring

For the string 'abcba' and k = 2, the longest substring with at most 2 distinct characters is 'bcb'.
For the string 'abcba' and k = 3, the longest substring with at most 3 distinct characters is 'abcba'.
For the string 'aa' and k = 1, the longest substring with at most 1 distinct characters is 'aa'.
For the string 'a' and k = 1, the longest substring with at most 1 distinct characters is 'a'.
For the string '' and k = 2, the longest substring with at most 2 distinct characters is ''.
For the string 'abcde' and k = 2, the longest substring with at most 2 distinct characters is 'ab'.
For the string 'abaccc' and k = 2, the longest substring with at most 2 distinct characters is 'accc'.
For the string 'aaaaaaa' and k = 2, the longest substring with at most 2 distinct characters is 'aaaaaaa'.
For the string 'abcdef' and k = 1, the longest substring with at most 1 distinct characters is 'a'.
For the string 'abba' and k = 2, the longest substring with at most 2 distinct characters is 'abba'.
