# Strings
- A string is a special kind of array of characters; in Python it is immutable

## Tips
- Similar to arrays, string problems often have simple brute-force solutions that use $O(n)$ space, but subtler solutions that use the string itself to **reduce the space complexity** to $O(1)$
- Understand the **implications** of a string type which is **immutable**, e.g., the need to allocate a new string when concatenating immutable strings. Know **alternatives** to immutable strings, e.g., a **list** in Python
- Updating a mutable string from the front is slow, so see if it's possible to **write values from the back**
- Indexing works the same as for lists
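
The immutability and write-from-the-back tips can be sketched as follows. This is a minimal illustration, not a solution from the notes above: the helper name `encode_spaces` and the task of replacing each space with `%20` are assumptions chosen for the example.

```python
s = 'hello'
try:
    s[0] = 'j'               # str does not support item assignment
except TypeError:
    mutated = False
else:
    mutated = True

chars = list(s)              # a list of characters is the mutable alternative
chars[0] = 'j'
result = ''.join(chars)      # build the final string once at the end


def encode_spaces(chars: list) -> list:
    """Replace each ' ' with '%', '2', '0' in place, writing from the back."""
    spaces = chars.count(' ')
    read = len(chars) - 1
    chars.extend([''] * (2 * spaces))    # grow once to the final size
    write = len(chars) - 1
    while read >= 0:
        if chars[read] == ' ':
            chars[write - 2:write + 1] = ['%', '2', '0']
            write -= 3
        else:
            chars[write] = chars[read]
            write -= 1
        read -= 1
    return chars
```

Because the write position never falls behind the read position, no unread character is overwritten, and each character is moved at most once.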

In [39]:
from typing import List, Iterator, Tuple
import bisect
import collections
import math
import functools
import random

from utils import run_tests

## Libraries

In [14]:
s = 'The cow jumped over the moon'
t = 'The moon is made of cheese'
print(s)
print(t)

print('\ns.startswith("The"):       ', s.startswith("The"))

print('\ns.endswith(("moo", "moon"):', s.endswith(("moo", "moon")))     # tuple of suffixes to try

print('\ns + t:                     ', s + t)

strings = ['the', 'cat', 'and', 'the', 'hat']
print('\n', strings)
print('" ".join(strings):           ' , " ".join(strings))

The cow jumped over the moon
The moon is made of cheese

s.startswith("The"):        True

s.endswith(("moo", "moon"): True

s + t:                      The cow jumped over the moonThe moon is made of cheese

 ['the', 'cat', 'and', 'the', 'hat']
" ".join(strings):            the cat and the hat


### Is Palindrome?
A palindrome is a string that reads the same forwards and backwards.  
The key to the optimal solution is to traverse the string forwards and backwards simultaneously.

In [17]:
def is_palindrome(s: str) -> bool:
    # note: ~i = -(i+1)
    return all(s[i] == s[~i] for i in range(len(s) // 2))

inputs, outputs = ('cat', 'aabbaa', 'aba', 'abca'), (False, True, True, False)
run_tests(is_palindrome, inputs, outputs)

Time complexity is $O(n)$ and space complexity is $O(1)$.
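
The `~i` expression pairs index `i` from the front with index `-(i + 1)` from the back. A minimal sketch of that pairing, with an explicit two-pointer version for comparison (the name `is_palindrome_two_pointer` is hypothetical):

```python
# ~i is the bitwise complement, so it walks the string from the end
pairs = [(i, ~i) for i in range(3)]


def is_palindrome_two_pointer(s: str) -> bool:
    left, right = 0, len(s) - 1
    while left < right:                  # stop when the pointers meet
        if s[left] != s[right]:
            return False
        left, right = left + 1, right - 1
    return True
```

Both versions compare at most `len(s) // 2` pairs and keep only a couple of indices, which is where the $O(1)$ space bound comes from.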

### 6.1: Interconvert Strings and Integers

In [43]:
def int_to_string(num: int) -> str:

    if num < 0:
        is_negative, num = True, abs(num)
    else:
        is_negative = False
    
    digits = []
    # peel off one digit at a time, least significant first,
    # so the digits are collected in reverse order
    while True:
        num, digit = num // 10, num % 10
        digits.append(chr(ord('0') + digit))   # code point of '0' plus the digit, converted back to a character
        if num == 0:
            break

    if is_negative:
        digits.append('-')

    return ''.join(reversed(digits))


# should be able to handle '314', '+314' or '-314'
def string_to_int(s: str) -> int:
    string_digits = {ch: val for ch, val in zip(['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'], range(10))}

    sign = -1 if s[0] == '-' else 1

    running_sum = 0
    for c in s[s[0] in '+-':]:             # skips the first character if it is a sign
        running_sum = running_sum * 10 + string_digits[c]  # multiplying by 10 shifts the place value to the left

    return sign * running_sum


def string_to_int_v2(s: str) -> int:
    string_digits = {ch: val for ch, val in zip(['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'], range(10))}
    return (-1 if s[0] == '-' else 1) * functools.reduce(
        lambda running_sum, c: running_sum * 10 + string_digits[c],
        s[s[0] in '+-':], 0
    )

inputs, outputs = (123, -123, 4, -4, 0), ('123', '-123', '4', '-4', '0')
run_tests(int_to_string, inputs, outputs)

inputs, outputs = ('123', '-123', '+123', '4', '-4', '+4', '0'), (123, -123, 123, 4, -4, 4, 0)
run_tests(string_to_int, inputs, outputs)

run_tests(string_to_int_v2, inputs, outputs)

$O(n)$ time and space complexity
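
The running-sum update in `string_to_int` can be traced step by step. A minimal sketch; the helper `trace_string_to_int` is hypothetical and only handles unsigned digit strings:

```python
def trace_string_to_int(s: str) -> list:
    """Return the running sum after each digit is consumed."""
    running_sum, steps = 0, []
    for c in s:
        # shift the existing digits one place left, then add the new digit
        running_sum = running_sum * 10 + (ord(c) - ord('0'))
        steps.append(running_sum)
    return steps


steps = trace_string_to_int('314')   # [3, 31, 314]
```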

In [28]:
print('ord("0") - returns the Unicode code point of a one-character string, e.g.:',  ord('0'))
print('chr(ord("0")) - returns the one-character string for a code point,   e.g.:',  chr(ord('0')))
print('ord("0") + 5:     ',  ord('0') + 5)
print('chr(ord("0") + 5):',  chr(ord('0') + 5))


ord("0") - returns the Unicode code point of a one-character string, e.g.: 48
chr(ord("0")) - returns the one-character string for a code point,   e.g.: 0
ord("0") + 5:      53
chr(ord("0") + 5): 5


In [31]:
{s:d for s, d, in zip(['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'], range(10))}

{'0': 0,
 '1': 1,
 '2': 2,
 '3': 3,
 '4': 4,
 '5': 5,
 '6': 6,
 '7': 7,
 '8': 8,
 '9': 9}

### 6.2: Base Conversion
Generalizing the decimal number system: in base $b$, the string $a_{k-1}a_{k-2}\cdots a_1a_0$, where $0 \leq a_i < b$, denotes the integer $a_0 \times b^0 + a_1 \times b^1 + a_2 \times b^2 + \cdots + a_{k-1} \times b^{k-1}$

Write a function that converts an integer given as a string in base $b_1$ to a string in base $b_2$.   
e.g.: '615', $b_1 = 7$ and $b_2 = 13$ --> '1A7'    

Assume $2 \leq b_1, b_2 \leq 16$; digits above 9 are written as 'A' through 'F'.
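
As a worked instance of the example above (a sketch of the arithmetic only), the conversion goes through the integer value:

$$615_7 = 6 \times 7^2 + 1 \times 7^1 + 5 \times 7^0 = 294 + 7 + 5 = 306$$

$$306 = 1 \times 13^2 + 10 \times 13^1 + 7 \times 13^0 \;\Rightarrow\; \text{'1A7'}_{13} \quad (\text{digit } 10 \text{ is written as A})$$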

In [53]:

def base_conversion(num_as_str: str, b1: int, b2: int) -> str:
    string_int_map = {c: d for d, c in enumerate('0123456789ABCDEF')}   # 16 digit characters, values 0..15
    int_string_map = {d:s for s, d in string_int_map.items()}

    # convert each digit to integer
    is_negative = num_as_str[0] == '-'
    num_as_int = functools.reduce(
        lambda sum_so_far, c: sum_so_far * b1 + string_int_map[c],
        num_as_str[is_negative:], 0
    )

    # peel off base-b2 digits, least significant first (reversed at the end)
    digits_as_string = []
    while True:
        num_as_int, d = num_as_int // b2, num_as_int % b2 
        digits_as_string.append(int_string_map[d])
        if num_as_int == 0:
            break
    
    if is_negative:
        digits_as_string.append('-')

    return ''.join(reversed(digits_as_string))


assert base_conversion('615', b1=7, b2=13) == '1A7'
assert base_conversion('-615', b1=7, b2=13) == '-1A7'



In [51]:
# 615 Base-7 as decimal
5 + (1 * 7) + (6 * 7**2)

306

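Time complexity is $O(n(1 + \log_{b_2}{b_1}))$, where $n$ is the length of the input string.   
First, perform $n$ multiply-and-adds to get the integer $x$ from the string.   
Then, perform $\log_{b_2}x$ divide-and-remainder steps to produce the output digits.   
$x$ is upper-bounded by $b_1^n$, so the second phase takes at most $\log_{b_2}(b_1^n) = n\log_{b_2}b_1$ steps.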
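
One way to sanity-check a hand-written conversion is against Python's built-in `int(s, base)`, which parses strings in bases 2 through 36. The sketch below is an assumption-laden cross-check, not the notebook's solution: `DIGITS` and `to_base` are hypothetical names introduced here.

```python
import string

DIGITS = string.digits + string.ascii_uppercase[:6]   # '0123456789ABCDEF'


def to_base(num: int, base: int) -> str:
    """Render an integer in the given base (2..16), most significant digit first."""
    if num == 0:
        return '0'
    sign, num = ('-' if num < 0 else ''), abs(num)
    out = []
    while num:
        num, d = divmod(num, base)      # d is the next least significant digit
        out.append(DIGITS[d])
    return sign + ''.join(reversed(out))


as_int = int('615', 7)          # parse the base-7 string
result = to_base(as_int, 13)    # render the same value in base 13
```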


### 6.3: Compute the Spreadsheet Column Encoding

In [75]:
def spreadsheet_column_decoder(s: str) -> int:
    num_letters = 26
    string_int_map = {chr(ord('A') + i):i+1 for i in range(num_letters)}

    return functools.reduce(
        lambda sum_so_far, c: sum_so_far * num_letters + string_int_map[c],
        s, 0
    )

inputs, outputs = ('A', 'D', 'Z', 'AA', 'AC', 'DB', 'AZ', 'EZ', 'ZZ'), (1, 4, 26, 27, 29, 26*4+2, 52, 26*5+26, 702)
run_tests(spreadsheet_column_decoder, inputs, outputs)


def spreadsheet_column_decoder_v2(s: str) -> int:
    return functools.reduce(
        lambda sum_so_far, c: sum_so_far * 26 + ord(c) - ord('A') + 1,
        s, 0
    )

inputs, outputs = ('A', 'D', 'Z', 'AA', 'AC', 'DB', 'AZ', 'EZ', 'ZZ'), (1, 4, 26, 27, 29, 26*4+2, 52, 26*5+26, 702)
run_tests(spreadsheet_column_decoder_v2, inputs, outputs)

$O(n)$ time complexity
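
The `reduce` step treats the column ID as a base-26 number whose digits run from 1 ('A') to 26 ('Z') rather than from 0 to 25. A worked instance for two of the test inputs:

$$\text{'DB'} = 4 \times 26^1 + 2 \times 26^0 = 106, \qquad \text{'ZZ'} = 26 \times 26^1 + 26 \times 26^0 = 702$$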

#### Variant: Solve the same problem with "A" corresponding to 0

In [77]:
# mapping 'A' to 0 shifts every 1-based column number down by one
def spreadsheet_column_decoder_A0(s: str) -> int:

    return spreadsheet_column_decoder_v2(s) - 1

inputs, outputs = ('A', 'D', 'Z', 'AA', 'AC', 'DB', 'AZ', 'EZ', 'ZZ'), (0, 3, 25, 26, 28, 25+(26*3)+2, 51, 25+26*4+26, 701)
run_tests(spreadsheet_column_decoder_A0, inputs, outputs)

#### Variant: Convert Integer to Spreadsheet Index

In [116]:
def spreadsheet_column_encoder(num: int) -> str:

    encoding = []
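    # build letters least significant first; reversed at the end
    while True:
        digit = num % 26
        if digit == 0:           # remainder 0 means 'Z' (value 26); there is no zero digit
            encoding.append('Z')
            num = num // 26 - 1   # give back the 26 that 'Z' just consumed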
        else:
            encoding.append(chr(ord('A') + digit - 1))
            num = num // 26
        if num == 0:
            break
    
    return ''.join(reversed(encoding))

inputs, outputs = (1, 4, 26, 27, 29, 26*4+2, 52, 26*5+26, 702), ('A', 'D', 'Z', 'AA', 'AC', 'DB', 'AZ', 'EZ', 'ZZ')
run_tests(spreadsheet_column_encoder, inputs, outputs)
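
An alternative way to handle the missing zero digit is to subtract 1 before each division, which removes the special case for 'Z'. This is a sketch of that swapped-in formulation, not the encoder above; the names `column_to_number` and `number_to_column` are hypothetical, and inputs are assumed to be positive integers.

```python
def column_to_number(s: str) -> int:
    """'A' -> 1, 'Z' -> 26, 'AA' -> 27, ..."""
    num = 0
    for c in s:
        num = num * 26 + ord(c) - ord('A') + 1
    return num


def number_to_column(num: int) -> str:
    """Inverse of column_to_number for num >= 1."""
    letters = []
    while num > 0:
        num, r = divmod(num - 1, 26)    # shift to 0..25 so 'Z' needs no special case
        letters.append(chr(ord('A') + r))
    return ''.join(reversed(letters))


# round trip: encoding then decoding returns the original number
round_trip_ok = all(column_to_number(number_to_column(n)) == n for n in range(1, 2000))
```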

In [56]:
ord('A')
for i in range(26):
    print(chr(ord('A') + i))

A
B
C
D
E
F
G
H
I
J
K
L
M
N
O
P
Q
R
S
T
U
V
W
X
Y
Z


In [79]:
chr(ord('A') + 1)

'B'