# String Processing

## Look-and-Say Sequence

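In this lesson, we consider the “Look-and-Say” sequence. Each term is obtained by reading the previous term aloud: count each run of identical digits and write the count followed by the digit. For example, 1211 reads as “one 1, one 2, two 1s”, which gives 111221. The first few terms of the sequence are: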

1, 11, 21, 1211, 111221, 312211, 13112221, 1113213211, ... 

In [1]:
# Function to get next number in a look-and-say sequence

def next_number(s):
    result = []
    i = 0
    while i < len(s):
        count = 1
        while i + 1 < len(s) and s[i] == s[i+1]:
            i += 1
            count += 1
        result.append(str(count) + s[i])
        i += 1
    return ''.join(result)

In [2]:
print(next_number('1211'))

111221


In [3]:
s = "1"
print(s)
n = 6
for i in range(n-1):
    s = next_number(s)
    print(s)

1
11
21
1211
111221
312211
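
The run-counting loop above can also be sketched with `itertools.groupby`, which groups consecutive identical characters. This is an alternative formulation, not the lesson's own solution; the names `next_number_groupby` and `look_and_say` are hypothetical.

```python
from itertools import groupby


def next_number_groupby(s):
    # groupby yields one (digit, run) pair per run of identical consecutive digits
    return ''.join(str(len(list(run))) + digit for digit, run in groupby(s))


def look_and_say(n, seed="1"):
    # Collect the first n terms of the sequence, starting from the seed
    terms = []
    s = seed
    for _ in range(n):
        terms.append(s)
        s = next_number_groupby(s)
    return terms


print(next_number_groupby('1211'))  # 111221
print(look_and_say(6))
```

Both versions do the same work per term; `groupby` only hides the index bookkeeping.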


## Spreadsheet Encoding

In this lesson, we implement a function that converts a spreadsheet column ID (i.e., “A”, “B”, “C”, …, “Z”, “AA”, etc.) to the corresponding integer. For example, “A” maps to 1 because it is the first column, while “AA” maps to 27 because it is the 27th column. A column ID behaves like a base-26 number whose digits run from A = 1 to Z = 26.

In [4]:
def spreadsheet_encode_column(col_str):
    num = 0
    count = len(col_str)-1
    for s in col_str:
        num += 26**count * (ord(s) - ord('A') + 1)
        count -= 1
    return num

In [5]:
print(spreadsheet_encode_column("ZZ"))

702
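
Because a column ID behaves like a base-26 number with digits 1 to 26, the same result can be accumulated left to right without computing powers. The sketch below shows that variant, plus an inverse mapping for checking round trips. Both function names are hypothetical, and the code assumes uppercase ASCII letters only.

```python
def spreadsheet_encode_horner(col_str):
    # Shift the running total one base-26 place, then add the next letter's value
    num = 0
    for ch in col_str:
        num = num * 26 + (ord(ch) - ord('A') + 1)
    return num


def spreadsheet_decode_column(num):
    # Inverse mapping: subtract 1 before each division because digits run 1..26, not 0..25
    letters = []
    while num > 0:
        num, rem = divmod(num - 1, 26)
        letters.append(chr(ord('A') + rem))
    return ''.join(reversed(letters))


print(spreadsheet_encode_horner("AA"))  # 27
print(spreadsheet_encode_horner("ZZ"))  # 702
print(spreadsheet_decode_column(702))   # ZZ
```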


## Is Palindrome

In this lesson, we test whether a string is a palindrome in Python, ignoring letter case and non-alphanumeric characters. We start with a concise solution that takes extra space, then code a solution that takes a linear amount of time and a constant amount of space.

In [6]:
s1 = "Was it a cat I saw?"
s2 = "Racecar"
s3 = "Harry Potter"

In [7]:
# Solution 1

# Solution uses extra space proportional to the size of the input string
def is_palindrome_pythonic(input_str):
    # Keep only alphanumeric characters (this also drops spaces) and ignore case
    s = ''.join(i for i in input_str if i.isalnum()).lower()
    return s == s[::-1]

In [8]:
print(is_palindrome_pythonic(s1))
print(is_palindrome_pythonic(s2))
print(is_palindrome_pythonic(s3))

True
True
False


In [9]:
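# Solution 2 (Better Solution): two pointers, O(n) time and O(1) extra space

def is_palindrome_efficient(s):
    i = 0
    j = len(s) - 1

    while i < j:
        # Skip non-alphanumeric characters from both ends
        while not s[i].isalnum() and i < j:
            i += 1
        while not s[j].isalnum() and i < j:
            j -= 1

        # Compare the current pair of characters, ignoring case
        if s[i].lower() != s[j].lower():
            return False
        i += 1
        j -= 1
    return True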

In [10]:
print(is_palindrome_efficient(s1))
print(is_palindrome_efficient(s2))
print(is_palindrome_efficient(s3))

True
True
False


## Is Anagram

In this lesson, we will determine whether two strings are anagrams of each other.

Simply put, two strings are anagrams if one can be formed by rearranging the letters of the other, using every letter exactly once. Spaces and letter case are ignored here.

In [11]:
s1a = 'fairy tales'
s1b = 'rail safety'

s2a = 'shrek drums'
s2b = 'rum shreds'

In [12]:
# Solution 1: Pythonic
# Time Complexity: O(n log n), dominated by sorting

def is_anagram_pythonic(str1,str2):
    str1 = str1.replace(" ","").lower()
    str2 = str2.replace(" ","").lower()
    
    return (sorted(str1) == sorted(str2))

In [13]:
print(is_anagram_pythonic(s1a,s1b))
print(is_anagram_pythonic(s2a,s2b))

True
False


In [14]:
# Solution 2: Efficient
# This solution runs in linear time, which is an improvement on O(n log n).


def is_anagram_linear(str1, str2):
    str1 = str1.replace(" ", "").lower()
    str2 = str2.replace(" ", "").lower()

    ht = dict()

    if len(str1) != len(str2):
        return False

    for i in str1:
        if i in ht:
            ht[i] += 1
        else:
            ht[i] = 1
    for i in str2:
        if i in ht:
            ht[i] -= 1
        else:
            # A character that never appears in str1 rules out an anagram
            return False
    for i in ht:
        if ht[i] != 0:
            return False
    return True

In [15]:
print(is_anagram_linear(s1a,s1b))
print(is_anagram_linear(s2a,s2b))

True
False
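
The hash-table counting above can be written more compactly with `collections.Counter` from the standard library, which builds the character counts in one pass. This is a sketch of the same idea rather than a replacement for the lesson's solution; the name `is_anagram_counter` is hypothetical.

```python
from collections import Counter


def is_anagram_counter(str1, str2):
    # Normalize the same way as the solutions above: drop spaces, ignore case
    str1 = str1.replace(" ", "").lower()
    str2 = str2.replace(" ", "").lower()
    # Two strings are anagrams exactly when their character counts match
    return Counter(str1) == Counter(str2)


print(is_anagram_counter('fairy tales', 'rail safety'))  # True
print(is_anagram_counter('shrek drums', 'rum shreds'))   # False
```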


## Is Palindrome Permutation

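In this lesson, we determine whether a string is a palindrome permutation, that is, whether its characters can be rearranged to form a palindrome. This is possible exactly when at most one character occurs an odd number of times.

Example: 'taco cat' (ignoring the space, it already reads the same in both directions)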

In [16]:
palin_perm = "Taco Cat"
not_palin_perm = "This slime is green"

In [17]:
def is_palin_perm(input_str):
    input_str = input_str.replace(" ", "")
    input_str = input_str.lower()

    d = dict()

    for i in input_str:
        if i in d:
            d[i] += 1
        else:
            d[i] = 1

    odd_count = 0
    for k, v in d.items():
        if v % 2 != 0 and odd_count == 0:
            odd_count += 1
        elif v % 2 != 0 and odd_count != 0:
            return False
    return True

In [18]:
print(is_palin_perm(palin_perm))
print(is_palin_perm(not_palin_perm))

True
False
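
Since only the parity of each count matters, the check can also be sketched with a set that toggles membership: a character is added on its first, third, fifth, ... occurrence and removed on every even occurrence. The name `is_palin_perm_set` is hypothetical; normalization follows the solution above (spaces removed, case ignored).

```python
def is_palin_perm_set(input_str):
    odd_chars = set()  # characters seen an odd number of times so far
    for ch in input_str.replace(" ", "").lower():
        if ch in odd_chars:
            odd_chars.remove(ch)
        else:
            odd_chars.add(ch)
    # A palindrome allows at most one character with an odd count (the middle one)
    return len(odd_chars) <= 1


print(is_palin_perm_set("Taco Cat"))             # True
print(is_palin_perm_set("This slime is green"))  # False
```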


## Check Permutation

Given two strings, write a function to determine if one is a permutation of the other.

In [19]:
is_permutation_1 = "google"
is_permutation_2 = "ooggle"

not_permutation_1 = "not"
not_permutation_2 = "top"

In [20]:
# Approach 1: Sorting
# Time Complexity: O(n log n)
# Space Complexity: O(n), since sorted() builds a new list for each string
def is_perm_1(str_1, str_2):
    str_1 = str_1.lower()
    str_2 = str_2.lower()

    if len(str_1) != len(str_2):
        return False

    str_1 = ''.join(sorted(str_1))
    str_2 = ''.join(sorted(str_2))

    n = len(str_1)

    for i in range(n):
        if str_1[i] != str_2[i]:
            return False
    return True

In [21]:
print(is_perm_1(is_permutation_1, is_permutation_2))
print(is_perm_1(not_permutation_1, not_permutation_2))

True
False


In [22]:
# Approach 2: Hash Table
# Time Complexity: O(n)
# Space Complexity: O(n)
def is_perm_2(str_1, str_2):
    str_1 = str_1.lower()
    str_2 = str_2.lower()

    if len(str_1) != len(str_2):
        return False

    d = dict()
    
    for i in str_1:
        if i in d:
            d[i] += 1
        else:
            d[i] = 1
    for i in str_2:
        if i in d:
            d[i] -= 1
        else:
            return False

    return all(value == 0 for value in d.values())

In [23]:
print(is_perm_2(is_permutation_1, is_permutation_2))
print(is_perm_2(not_permutation_1, not_permutation_2))

True
False


## Exercise: Is Unique

Your task is to implement an algorithm to determine if a string has all unique characters.

Assume that the input string contains only letters or spaces. Uppercase and lowercase versions of a letter count as different characters.

In [24]:
s1 = "I Am Not Unique"
s2 = "heythere"
s3 = "abCedFghI"
s4 = "hi"

In [25]:
# Solution 1: Hash Table, Linear Time Complexity
def is_unique(input_str):
    d = {}
    for i in input_str:
        if i in d:
            return False
        else:
            d[i] = 1
    return True

In [26]:
print(is_unique(s1))
print(is_unique(s2))
print(is_unique(s3))
print(is_unique(s4))

False
False
True
True


In [27]:
# Solution 2: a set drops duplicates, so the lengths match only if all characters are unique
def is_unique_2(input_str):
    return len(set(input_str)) == len(input_str)

In [28]:
# Solution 3: remove each character from the allowed alphabet the first time it is seen
def is_unique_3(input_str):
    alpha = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz "
    for i in input_str:
        if i in alpha:
            alpha = alpha.replace(i, "")
        else:
            return False
    return True
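
Another possible approach, sketched here for comparison, sorts the characters so that any duplicates end up next to each other and then compares neighbours. It runs in O(n log n) time instead of linear time but needs no hash table. The name `is_unique_sorted` is hypothetical, and the comparison is case-sensitive like the solutions above.

```python
def is_unique_sorted(input_str):
    chars = sorted(input_str)
    # After sorting, a repeated character must sit directly beside its duplicate
    return all(a != b for a, b in zip(chars, chars[1:]))


print(is_unique_sorted("I Am Not Unique"))  # False (the space repeats)
print(is_unique_sorted("heythere"))         # False
print(is_unique_sorted("abCedFghI"))        # True
print(is_unique_sorted("hi"))               # True
```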

## Integer to String

In this lesson, we will solve the following problem:

You are given an integer as input (i.e., …, -3, -2, -1, 0, 1, 2, 3, …) and you have to convert it to a string. For example, 123 becomes "123" and -123 becomes "-123".

In [29]:
def int_to_str(input_int):
    
    if input_int < 0:
        is_negative = True
        input_int *= -1
    else:
        is_negative = False

    output_str = []

    if input_int == 0:
        output_str.append('0')
    else:
        # Peel off the least significant digit until none remain
        while input_int > 0:
            output_str.append(chr(ord('0') + input_int % 10))
            input_int //= 10
        # The digits were collected in reverse order
        output_str = output_str[::-1]

    output_str = ''.join(output_str)

    if is_negative:
        return '-' + output_str
    else:
        return output_str

In [30]:
input_int = 123
print(input_int)
print(type(input_int))

print('\n')

output_str = int_to_str(input_int)
print(output_str)
print(type(output_str))

123
<class 'int'>


123
<class 'str'>


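## Exercise: String to Integer

Given a string of decimal digits with an optional leading minus sign, convert it to the corresponding integer.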

In [31]:
def str_to_int(input_str):

    output_int = 0

    if input_str[0] == '-':
        start_idx = 1
        is_negative = True
    else:
        start_idx = 0
        is_negative = False

    for i in range(start_idx, len(input_str)):
        place = 10**(len(input_str) - (i+1))
        digit = ord(input_str[i]) - ord('0')
        output_int += place * digit

    if is_negative:
        return -1 * output_int
    else:
        return output_int

In [32]:
s = "554"
x = str_to_int(s)
print(type(x))

s = "123"
print(str_to_int(s))

s = "-123"
print(str_to_int(s))

<class 'int'>
123
-123
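
The conversion can also be sketched without computing powers of ten: multiply the running total by 10 and add each digit in turn. The variant below additionally rejects empty input and non-digit characters, which is an assumption beyond the exercise as stated; the name `str_to_int_horner` is hypothetical.

```python
def str_to_int_horner(input_str):
    is_negative = input_str.startswith('-')
    digits = input_str[1:] if is_negative else input_str
    if not digits:
        # Assumption: an empty string or a lone '-' is treated as invalid input
        raise ValueError("no digits to convert")

    output_int = 0
    for ch in digits:
        digit = ord(ch) - ord('0')
        if not 0 <= digit <= 9:
            raise ValueError(f"invalid digit: {ch!r}")
        # Shift the running total one decimal place, then add the new digit
        output_int = output_int * 10 + digit

    return -output_int if is_negative else output_int


print(str_to_int_horner("554"))   # 554
print(str_to_int_horner("-123"))  # -123
```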
