Merged
2 changes: 2 additions & 0 deletions DIRECTORY.md
@@ -302,6 +302,8 @@
  * [Test Max Sum Sub Array](https://github.com/BrianLusina/PythonSnips/blob/master/algorithms/sliding_window/max_sum_of_subarray/test_max_sum_sub_array.py)
* Minimum Window Substring
  * [Test Min Window Substring](https://github.com/BrianLusina/PythonSnips/blob/master/algorithms/sliding_window/minimum_window_substring/test_min_window_substring.py)
* Permutation In String
  * [Test Permutation In String](https://github.com/BrianLusina/PythonSnips/blob/master/algorithms/sliding_window/permutation_in_string/test_permutation_in_string.py)
* Repeated Dna Sequences
  * [Test Repeated Dna Sequences](https://github.com/BrianLusina/PythonSnips/blob/master/algorithms/sliding_window/repeated_dna_sequences/test_repeated_dna_sequences.py)
* Substring Concatenation
@@ -6,7 +6,7 @@
     min_cost_dijkstra,
     min_cost_0_1_bfs,
     min_cost_0_1_bfs_2,
-    min_cost_dfs_and_bfs
+    min_cost_dfs_and_bfs,
 )

MIN_COST_TO_MAKE_VALID_PATH_IN_GRID_TEST_CASES = [
159 changes: 159 additions & 0 deletions algorithms/sliding_window/permutation_in_string/README.md
@@ -0,0 +1,159 @@
# Permutation in String

Given two strings s1 and s2, return true if s2 contains a permutation of s1, or false otherwise.

In other words, return true if one of s1's permutations is the substring of s2.

## Examples

Example 1:

```text
Input: s1 = "ab", s2 = "eidbaooo"
Output: true
Explanation: s2 contains one permutation of s1 ("ba").
```

Example 2:
```text
Input: s1 = "ab", s2 = "eidboaoo"
Output: false
```

## Constraints

- 1 <= s1.length, s2.length <= 10^4
- s1 and s2 consist of lowercase English letters.

## Topics

- Hash Table
- Two Pointers
- String
- Sliding Window

## Hints

- Brute force will result in TLE (Time Limit Exceeded). Think of something else.
- How will you check whether one string is a permutation of another string?
- One way is to sort both strings and then compare them. But is there a better way?
- If one string is a permutation of another, the two must share a common metric. What is it?
- If one string is a permutation of another, both must have the same character frequencies. Which data structure should
be used to store frequencies?
- What about a hash table? Or an array of size 26?

## Solution(s)

1. [Sliding Window](#sliding-window)
2. [Optimized Sliding Window](#optimized-sliding-window)

### Sliding Window

A naive approach would be to generate all possible rearrangements of s1 and check whether any appear in s2. But that
would quickly become inefficient for longer strings, because the number of permutations grows factorially. A
three-letter string like "abc" already has 6 permutations; "abcdef" has 720. So, instead of rearranging s1, we look
for a more efficient alternative.
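
To make the growth concrete, here is a minimal sketch of the naive approach. The function name
`check_inclusion_brute_force` is hypothetical and is not part of this package; it is shown only to illustrate why
enumerating rearrangements does not scale.

```python
from itertools import permutations
from math import factorial


def check_inclusion_brute_force(s1: str, s2: str) -> bool:
    """Illustrative only: try every distinct rearrangement of s1 (factorial time)."""
    candidates = {"".join(p) for p in permutations(s1)}
    return any(candidate in s2 for candidate in candidates)


# The number of rearrangements of n distinct letters is n!
print(len(set(permutations("abc"))))     # 6
print(len(set(permutations("abcdef"))))  # 720
print(factorial(10))                     # 3628800
```

With s1 allowed to be up to 10^4 characters long, this enumeration is far out of reach, which is why the counting
approach is used instead.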

Two strings are permutations of each other if and only if they contain the same characters the same number of times. For
example, "abc" and "bca" are permutations because both have one a, one b, and one c. This means that if we find any
substring of s2 that is the same length as s1 and has the same character counts, then that substring must be a
permutation of s1.
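
The counting criterion can be checked directly with `collections.Counter`. The helper name `is_permutation` is an
assumption made for this sketch and does not appear in the package.

```python
from collections import Counter


def is_permutation(a: str, b: str) -> bool:
    """Two strings are permutations of each other iff their character counts match."""
    return Counter(a) == Counter(b)


print(is_permutation("abc", "bca"))  # True
print(is_permutation("abc", "abd"))  # False
print(is_permutation("ab", "aab"))   # False
```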

> Quick recall
> A substring is a contiguous sequence of characters within a larger string. So, if s1 = "ab" and s2 = "acb", even
> though both strings contain the same counts of letters a and b, no two-letter block in s2 contains both of them
> together, so the result would be FALSE.

The same idea is used in this solution, keeping track of the character counts of s1 inside s2. To do this efficiently,
the algorithm uses a sliding window over s2. At any given moment, this window represents a substring of s2 that’s the
same length as s1 and could potentially be one of its permutations. To check that, compare the frequency of each
character in s1 with the frequency of the characters inside this window.

If the counts match, it means the current window contains the same letters as s1, just possibly in a different order,
so a valid permutation is found. In that case, return TRUE. If the counts don’t match, and there are still characters
left to examine in s2, slide the window forward by one character. That means one new character from the right side of
s2 is added to the window, and one old character from the left side is removed. This is important because the window
must always stay the same length as s1. Keep doing this until either a match is found or the end of s2 is reached. If
all possible windows have been checked and none match the character frequencies of s1, then return FALSE.

Let's look at the algorithm steps:

1. Store the lengths of s1 and s2 in n1 and n2.
2. If n1 > n2, return FALSE, as a longer string can’t be a substring of a shorter one.
3. Create two arrays, s1Counts and windowCounts, of length 26. The former stores the frequencies of characters in s1,
   and the latter stores the frequencies in the current window of s2, which initially covers the first n1 characters
   of s2.
4. If the two frequency arrays are identical at this point (s1Counts == windowCounts), return TRUE because
   the first window is already a permutation of s1.
5. If they do not match, slide the window across s2 one character at a time. For each position i from n1 to
   n2-1 in s2:
   - Increment the count of the character entering the window at its right end (s2[i]).
   - Decrement the count of the character leaving the window at its left end (s2[i - n1]).
   - If s1Counts == windowCounts, return TRUE.
6. Once the window has moved across the entire s2 and no match is found, return FALSE.
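
To see which windows the steps above examine, the following sketch lists every window of s2 and whether its counts
match s1. The helper `trace_windows` is hypothetical and is meant for illustration only: it recounts each window from
scratch with `Counter` instead of updating the counts incrementally, so it is simpler but slower than the algorithm
described above.

```python
from collections import Counter
from typing import Iterator, Tuple


def trace_windows(s1: str, s2: str) -> Iterator[Tuple[int, str, bool]]:
    """Yield (start index, window text, counts match s1?) for each window of s2."""
    n1 = len(s1)
    target = Counter(s1)
    for start in range(len(s2) - n1 + 1):
        window = s2[start:start + n1]
        yield start, window, Counter(window) == target


for start, window, matches in trace_windows("ab", "eidbaooo"):
    print(start, window, matches)
```

For Example 1, the only matching window is "ba" at index 3. For Example 2, no window matches.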

![Example 1](./images/solutions/permutation_in_string_solution_1.png)
![Example 2](./images/solutions/permutation_in_string_solution_2.png)
![Example 3](./images/solutions/permutation_in_string_solution_3.png)
![Example 4](./images/solutions/permutation_in_string_solution_4.png)
![Example 5](./images/solutions/permutation_in_string_solution_5.png)
![Example 6](./images/solutions/permutation_in_string_solution_6.png)
![Example 7](./images/solutions/permutation_in_string_solution_7.png)
![Example 8](./images/solutions/permutation_in_string_solution_8.png)
![Example 9](./images/solutions/permutation_in_string_solution_9.png)
![Example 10](./images/solutions/permutation_in_string_solution_10.png)

#### Complexity Analysis

##### Time Complexity

The function runs in linear time because it builds character-frequency counts once, then slides a fixed-size window
across s2 while doing only constant work per step. Each step contributes to the time complexity as follows:

- Counting frequencies in s1 takes O(n1) time (one pass over s1).
- Building the first window in s2 (size of s1.length) takes O(n1) time.
- Sliding the window across s2 takes O(n2). This is because each slide:
  - Adds one character and removes one character (O(1) updates).
  - Compares two frequency arrays of size 26 (O(1), as 26 is constant).

This makes the total time complexity O(n1 + n2). Whenever the sliding step is reached, n1 <= n2, so the total is
dominated by the length of s2. When s1 is longer than s2, the function returns immediately in O(1).
Therefore, the final time complexity is O(n2).

##### Space Complexity

The space complexity of this solution is O(1), meaning it uses constant extra memory. This is because it only maintains
two fixed-size arrays (of size 26) to store letter frequencies and a few simple integer variables.

### Optimized Sliding Window

The previous approach can be optimized. Let s1arr and s2arr be the 26-entry frequency arrays for s1 and for the
current window of s2. Instead of comparing every entry of s1arr against the updated s2arr for each window, we keep
track of how many entries already match and update only that count when the window shifts to the right.

To do so, we maintain a count variable that stores the number of characters (out of the 26 letters) that have the
same frequency in s1 and in the current window of s2. When we slide the window, one character is added and one is
removed. If either update makes that character's frequency equal to its frequency in s1, we increment count by 1. If
an update breaks a frequency that matched before the update, we decrement count by 1. Otherwise, count stays
unchanged. If count equals 26 after the window shifts, every letter matches in frequency, so we return True
immediately.
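
The bookkeeping for a single update can be sketched as follows. The helper `shift_count` is a hypothetical name
introduced for this illustration. It compares the entry before and after the change, which is an equivalent
formulation of the update rule rather than a copy of the code in this package.

```python
from typing import List


def shift_count(target: List[int], window: List[int], index: int, delta: int, matches: int) -> int:
    """Apply delta (+1 or -1) to window[index] and return the updated match count."""
    if window[index] == target[index]:
        matches -= 1  # this letter matched before the change and no longer does
    window[index] += delta
    if window[index] == target[index]:
        matches += 1  # this letter matches after the change
    return matches


target = [0] * 26
window = [0] * 26
for ch in "ab":
    target[ord(ch) - ord("a")] += 1
for ch in "db":  # window of "eidbaooo" starting at index 2
    window[ord(ch) - ord("a")] += 1

matches = sum(1 for t, w in zip(target, window) if t == w)  # 24
matches = shift_count(target, window, ord("a") - ord("a"), +1, matches)  # 'a' enters: 25
matches = shift_count(target, window, ord("d") - ord("a"), -1, matches)  # 'd' leaves: 26
print(matches == 26)  # True
```

Each slide touches at most two entries, so the match count is maintained without rescanning all 26 letters.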

#### Complexity Analysis

Let l1 be the length of string s1 and l2 be the length of string s2.

##### Time complexity: O(l1 + (l2 − l1)) = O(l2)

Populating s1arr and s2arr takes O(l1) time since we iterate over the first l1 characters of both strings. Computing
the initial count compares 26 entries, which is O(1).

The main loop runs l2 − l1 times. In each iteration, we update two characters (one entering and one leaving the window)
in constant time O(1), and we maintain a count of matches. This step takes O(l2 − l1).

Checking if count == 26 also happens in O(1), since it's a constant comparison.

Thus, the total time complexity is: O(l1 + (l2 − l1)) = O(l2)

##### Space complexity: O(1)

Two fixed-size arrays (s1arr and s2arr) of size 26 are used for counting character frequencies. No additional space that
grows with the input size is used.
81 changes: 81 additions & 0 deletions algorithms/sliding_window/permutation_in_string/__init__.py
@@ -0,0 +1,81 @@
def check_inclusion_optimized_sliding_window(s1: str, s2: str) -> bool:
    s1_len, s2_len = len(s1), len(s2)
    # A permutation of s1 cannot fit inside a shorter s2
    if s1_len > s2_len:
        return False

    s1_arr = [0] * 26
    s2_arr = [0] * 26

    # Count letters in s1 and in the first window of s2
    for i in range(s1_len):
        x = ord(s1[i]) - ord("a")
        y = ord(s2[i]) - ord("a")
        s1_arr[x] += 1
        s2_arr[y] += 1

    # count = number of letters whose frequency matches in both arrays
    count = 0
    for i in range(26):
        if s1_arr[i] == s2_arr[i]:
            count += 1

    for i in range(s2_len - s1_len):
        r = ord(s2[i + s1_len]) - ord("a")
        l = ord(s2[i]) - ord("a")

        if count == 26:
            return True
        # Letter entering the window on the right
        s2_arr[r] += 1
        if s2_arr[r] == s1_arr[r]:
            count += 1
        elif s2_arr[r] == s1_arr[r] + 1:
            count -= 1

        # Letter leaving the window on the left
        s2_arr[l] -= 1
        if s2_arr[l] == s1_arr[l]:
            count += 1
        elif s2_arr[l] == s1_arr[l] - 1:
            count -= 1
    return count == 26


def check_inclusion_sliding_window(s1: str, s2: str) -> bool:
    n1 = len(s1)
    n2 = len(s2)

    # If s1 is longer than s2, a permutation of s1 cannot be a substring of s2
    if n1 > n2:
        return False

    # Initialize frequency arrays for s1 and the current sliding window in s2
    # Use 26-element arrays for lowercase English letters 'a' through 'z'
    s1_counts = [0] * 26
    window_counts = [0] * 26

    # Populate s1_counts with character frequencies from s1
    for c in s1:
        s1_counts[ord(c) - ord("a")] += 1

    # Populate window_counts for the initial sliding window (first n1 characters of s2)
    for i in range(n1):
        window_counts[ord(s2[i]) - ord("a")] += 1

    # Check if the initial window is a permutation
    # This can be done by comparing the two frequency arrays
    if s1_counts == window_counts:
        return True

    # Slide the window across the rest of s2
    for i in range(n1, n2):
        # Character entering the window (at index i)
        char_added = ord(s2[i]) - ord("a")
        window_counts[char_added] += 1

        # Character leaving the window (at index i - n1)
        char_removed = ord(s2[i - n1]) - ord("a")
        window_counts[char_removed] -= 1

        # After updating the window, check if the frequencies match
        if s1_counts == window_counts:
            return True

    # If no permutation is found after checking all windows
    return False
@@ -0,0 +1,29 @@
import unittest
from parameterized import parameterized
from algorithms.sliding_window.permutation_in_string import (
    check_inclusion_optimized_sliding_window,
    check_inclusion_sliding_window,
)

PERMUTATION_IN_STRING_TEST_CASES = [
    ("ab", "eidbaooo", True),
    ("ab", "eidboaoo", False),
]


class PermutationInStringTestCase(unittest.TestCase):
    @parameterized.expand(PERMUTATION_IN_STRING_TEST_CASES)
    def test_check_inclusion_optimized_sliding_window(
        self, s1: str, s2: str, expected: bool
    ):
        actual = check_inclusion_optimized_sliding_window(s1, s2)
        self.assertEqual(expected, actual)

    @parameterized.expand(PERMUTATION_IN_STRING_TEST_CASES)
    def test_check_inclusion_sliding_window(self, s1: str, s2: str, expected: bool):
        actual = check_inclusion_sliding_window(s1, s2)
        self.assertEqual(expected, actual)


if __name__ == "__main__":
    unittest.main()