# Strings

- http://openbookproject.net/thinkcs/python/english3e/strings.html

## Topics
- learn in-depth about string data types
- methods provided in string objects/data
- operators and various operations
- ways to traverse/step through characters in string

## Introduction
- strings are text data, not numeric data
- strings were briefly covered in an earlier chapter on data and types
- unlike numbers, a string is a compound data type: a sequence of characters
    - a string can also be treated as a single value
- string variables are objects with their own attributes and methods
- `help(str)` lists all the methods
- commonly used methods: `upper(), lower(), swapcase(), capitalize(), endswith(), isdigit(), find(), center(), count(), split()`, etc.

## String methods and operations
- string objects/data come with dozens of methods you can invoke

In [None]:
# let's see built-in documentation on str
help(str)

In [None]:
# let's use some of the methods provided for string objects/data
ss = "Hello there beautiful World!"
tt = ss.upper()
print(tt)

In [None]:
print(ss)

In [None]:
print(tt.capitalize())

In [None]:
alist = tt.split()
print(alist)

In [None]:
# examples of some methods
ss.count('o')

In [None]:
print(ss.swapcase())
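The cells above exercise `upper()`, `capitalize()`, `split()`, `count()` and `swapcase()`. A few of the other methods named in the introduction can be tried the same way. This is only a small sketch; the variable names (`title`, `position`, etc.) are made up for illustration and are not used elsewhere in the notebook.

```python
# Assumption: a fresh sample string, independent of variables from earlier cells
title = "Hello there beautiful World!"

position = title.find("there")        # index where the substring starts
missing = title.find("xyz")           # find() returns -1 when nothing matches
ends_ok = title.endswith("World!")    # True if the string ends with the suffix
all_digits = "2024".isdigit()         # True: every character is a digit
not_digits = "20.24".isdigit()        # False: '.' is not a digit
centered = "hi".center(10)            # pad with spaces on both sides to width 10

print(position, missing, ends_ok, all_digits, not_digits)
print("[" + centered + "]")
```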

## Operators on string
- `*` and `+` operators work on string data type as well
- `+` : concatenates
- `*` : repeats/multiplies

In [None]:
a = "ABC"

In [None]:
a = a + "D" + "E " + "F " + "1 " + "2" + a

In [None]:
print(a)

In [None]:
"A"*5

In [None]:
gene = "AGT"*10
print(gene)

In [None]:
print("hello world!"*10)
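One detail worth trying out: `+` needs a string on both sides, while `*` needs a string and an integer. The short sketch below uses made-up names (`count`, `label`, `dashes`, `empty`) to show the usual workaround of converting a number with `str()` first.

```python
count = 3

# "Items: " + count would raise TypeError, because + needs a string on both sides
label = "Items: " + str(count)
print(label)

# * needs a string and an int; the order of the two operands does not matter
dashes = 3 * "-="
print(dashes)

# repeating zero times gives an empty string
empty = "abc" * 0
print(repr(empty))
```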

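## Working with parts of a string
- a single character can be accessed using the `[index]` bracket operator; indices start at 0
- a substring can be sliced using the `[inclStartIndex : exclEndIndex : step]` bracket operator
- negative indices are allowed; they count from the end of the string, so -1 is the last character
- `len()` built-in function gives the length of a string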

In [None]:
# examples
s = "Pirates of the Caribbean"

In [None]:
# access the second character
s[1]

In [None]:
s[:]

In [None]:
s[1:]

In [None]:
# print just the pirates 
print(s[0:7])

In [None]:
# print "the" from string s
theIndex = s.find("the")
print('the starts at', theIndex)
# the end index is exclusive, so add the length of "the" (3), not 4
print(s[theIndex:theIndex+len("the")])

In [None]:
# TODO
# print Caribbean from string s - hint use find function

In [None]:
lastSpace = s.rfind(' ')
print(lastSpace)

In [None]:
# print the last character
print(s[-1])

In [None]:
# print string in reverse order
reversedS = s[-1::-1]
print(reversedS)

In [None]:
ss = "Pirates of the Caribbean."
print(ss[len(ss)-1])

In [None]:
# test whether a given string is a palindrome
a = "racecar"
print('palindrome' if a == a[::-1] else 'not palindrome')
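Negative indices and the step value can be combined in one slice. The following is a small sketch on a fresh copy of the sample string; the names `title`, `last_four`, `without_ends`, `every_other` and `clipped` are chosen here only for illustration.

```python
# Assumption: a fresh copy of the sample string, so the cell runs on its own
title = "Pirates of the Caribbean"

last_four = title[-4:]        # negative start: the last four characters
without_ends = title[1:-1]    # drop the first and the last character
every_other = title[::2]      # step of 2: characters at indices 0, 2, 4, ...
clipped = title[:100]         # an out-of-range slice end is clipped, no error

print(last_four)
print(without_ends)
print(every_other)
print(clipped == title)
```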

### Strings are immutable
- string objects can't be modified in place
- to update a string, build a new string and reassign the variable

In [None]:
a = 'hello'

In [None]:
# raises TypeError: 'str' object does not support item assignment
a[0] = 'H'
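Since item assignment fails, the usual approach is to build a new string. A minimal sketch, using the hypothetical names `greeting`, `capitalized` and `replaced`, shows two ways of doing that and confirms the original string is left untouched.

```python
greeting = 'hello'

try:
    greeting[0] = 'H'          # item assignment is not supported for str
except TypeError as error:
    print('TypeError:', error)

# slicing and concatenation create a brand-new string object
capitalized = 'H' + greeting[1:]

# replace() also returns a new string; the third argument limits it to one replacement
replaced = greeting.replace('h', 'H', 1)

print(capitalized, replaced, greeting)
```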

## Length of a string

- `len(s)` gives the length (number of characters) of a string `s`

In [None]:
# find length of string stored in variable a
len(a)

## String traversal
- it's common practice to go through every character of a given string
- both `for` and `while` loops can be used to traverse a string

In [None]:
# example using for loop
# traversing string using index
for i in range(len(s)):
    print(s[i], end=' ')

In [None]:
# for loop traversing each character directly (no index needed)
for c in s:
    print(c, end=' ')

In [None]:
someStr = """afAdf@#456'"""

In [None]:
# example using while loop
i = 0
while i < len(someStr):
    print(someStr[i], end=' ')
    i += 1
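When both the index and the character are needed, the built-in `enumerate()` function offers a middle ground between the index-based and the character-based loops. A short sketch, with `word` and `pairs` as illustrative names:

```python
word = "code"
pairs = []

# enumerate() yields (index, character) pairs, so no manual counter is needed
for index, char in enumerate(word):
    pairs.append((index, char))
    print(index, char)
```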

## String comparison
- strings can be compared using comparison operators
- comparison operators are `==, !=, <=, >=, <, >`
- strings are compared lexicographically using the Unicode code point of each character (identical to the ASCII value for ASCII characters)
    - see [ASCII table](http://www.asciitable.com/) for values
- `ord('c')` gives the code point (ASCII value) of the given character
- two strings are compared character by character in corresponding positions; the first pair that differs decides the result

In [None]:
# find ASCII values of uppercase A and lowercase a
print(ord('A'), ord('a'))

In [None]:
# string comparison examples
print("apple" == "Apple")

In [None]:
print("apple" >= "ball")

In [None]:
# <, <=, > and >= are decided by the first pair of corresponding
# characters that differ; here 'a' (97) > 'A' (65), so the result is True
print("apple" >= "Apple")

In [None]:
# 'A' < 'a' at the first position, so the result is True
# even though 'b' > 'B' and 'c' > 'C' in the later positions
print('Abc' <= 'aBC')

In [None]:
# for equality all characters have to match
print("apple" == "Apple") # False

In [None]:
# for inequality, any one pair of unmatched characters will do
print('apple' != 'applE')
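Because uppercase letters have smaller code points than lowercase letters, comparisons and sorting are case-sensitive by default. One common workaround, sketched below with made-up sample data, is to compare lowercased copies of the strings.

```python
words = ["banana", "apple", "Cherry"]

# default ordering: 'C' (67) sorts before 'a' (97) and 'b' (98)
default_order = sorted(words)

# key=str.lower compares lowercased copies but keeps the original strings
ignore_case = sorted(words, key=str.lower)

# the same idea works for a single equality test
same = "Apple".lower() == "apple".lower()

print(default_order)
print(ignore_case)
print(same)
```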

## Substring membership
- the `in` and `not in` operators check whether a substring appears in a string
- they provide a quick test for membership

In [None]:
print("p" in "apple")

In [None]:
print("pe" in "apple")

In [None]:
print("aple" not in "apple")
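The `in` operator pairs naturally with string traversal. As a small sketch, the hypothetical helper `count_vowels` below walks through a string and uses `in` to test each character against a string of vowels.

```python
def count_vowels(text):
    """Count the characters of text that are one of a, e, i, o, u (ignoring case)."""
    total = 0
    for ch in text.lower():
        if ch in "aeiou":   # membership test against the vowels
            total += 1
    return total

print(count_vowels("apple"))
print(count_vowels("Pirates of the Caribbean"))
```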

## Cleaning up strings
- working with strings often involves removing punctuation and unwanted characters
- traverse the string and build a new string that keeps only the wanted characters

In [None]:
# create a new string by removing punctuation from the following string
ss = '"Well, I never did!", said Alice.'
print(ss)

In [None]:
# one solution is to use ASCII value of each character between A..Z and a..z
print(ord('a'), ord('z'), ord('A'), ord('Z'))

In [None]:
newStr = ''
for c in ss:
    if ord(c) == ord(' '): # keep space
        newStr += c
    elif ord(c) >= ord('A') and ord(c) <= ord('Z'):
        newStr += c
    elif ord(c) >= ord('a') and ord(c) <= ord('z'):
        newStr += c
print(newStr)

In [None]:
# convert newStr to lowercase for case insensitive operations
newStr1 = newStr.lower()
print(newStr1)

In [None]:
# convert sentence into list of tokens/terms/words
words = newStr1.split()
print(words)

In [None]:
# traverse through list of words
for w in words:
    print(w)

In [None]:
# next solution uses the string module
# the string module provides constants for different groups of characters
import string
help(string)

In [None]:
# string library has data that can be useful, e.g.
string.punctuation

In [None]:
ss

In [None]:
newStr = ''
for c in ss:
    if c in string.ascii_lowercase:
        newStr += c
    elif c in string.ascii_uppercase:
        newStr += c
    elif c == ' ':
        newStr += ' '
        
print(newStr)

In [None]:
# write a function that keeps only letters and whitespace, removing
# punctuation (and digits); returns the new cleaned-up string
def cleanUp(someStr):
    newStr = ''
    for c in someStr:
        if c.islower():
            newStr += c
        elif c.isupper():
            newStr += c
        elif c.isspace():
            newStr += c
    return newStr

In [None]:
s = cleanUp(ss)
print(s)
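As an alternative to the character-by-character loops, the built-in `str.maketrans()` and `translate()` methods can delete a set of characters in one pass. The sketch below is one possible approach, not the notebook's own solution; the function name `strip_punctuation` and the variable `sample` are assumptions. Unlike the loops above, it keeps digits, because it removes only the characters listed in `string.punctuation`.

```python
import string

def strip_punctuation(text):
    """Return a copy of text with every character in string.punctuation removed."""
    # the third argument of maketrans lists the characters to delete
    table = str.maketrans('', '', string.punctuation)
    return text.translate(table)

sample = '"Well, I never did!", said Alice.'
cleaned = strip_punctuation(sample)
print(cleaned)
```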

## String formatting
- use the `format` method provided by str objects
- use `{}` as a replacement field
- numbers in curly braces are optional; they give the position of the argument to substitute
- each replacement field can also contain a format specification after a colon
    - `<`: left alignment
    - `>`: right alignment
    - `^`: center alignment
    - e.g. `{1:<10}`: left-align argument #1 in a 10-character-wide field
    - type conversions such as `f` for float (`.2f` for two decimal places), `x` for hex, etc. can also be used

In [3]:
name = "Corinr"
age = 25

In [4]:
s1 = "His name is {}!".format(name)
print(s1)

His name is Corinr!


In [None]:
# newer syntax: f-strings (Python 3.6+)
print(f"His name is {name}!")

In [None]:
# note age and name are provided in reverse order
print("His name is {1} and {1} is {0} years old.".format(age, name))

In [5]:
n1 = 4
n2 = 5.5
s3 = "{0} x {1} = {2} and {0} ^ {1} = {3:.2f}".format(n1, n2, n1*n2, n1**n2)
print(s3)

4 x 5.5 = 22.0 and 4 ^ 5.5 = 2048.00


In [6]:
# formatting decimal/float values to a certain number of decimal places
print("Pi to three decimal places is {0:.3f}".format(3.1415926))

Pi to three decimal places is 3.142


In [7]:
n1 = "Paris"
n2 = "Whitney"
n3 = "Hilton"
print("123456789 123456789 123456789 123456789 123456789 123456789")
print("|||{0:<15}|||{1:^15}|||{2:>15}|||Born in {3}|||"
        .format(n1,n2,n3,1981))

123456789 123456789 123456789 123456789 123456789 123456789
|||Paris          |||    Whitney    |||         Hilton|||Born in 1981|||


In [8]:
# formatting decimal int to hexadecimal number
print("The decimal value {0} converts to hex value 0x{0:x}".format(16))

The decimal value 16 converts to hex value 0x10


In [9]:
# formatting decimal int to binary number
print("The decimal value {0} converts to binary value 0b{0:b}".format(8))

The decimal value 8 converts to binary value 0b1000


In [10]:
# formatting decimal int to octal number
print("The decimal value {0} converts to octal value 0o{0:o}".format(8))

The decimal value 8 converts to octal value 0o10


In [None]:
letter = """
Dear {0} {2}.
 {0}, I have an interesting money-making proposition for you!
 If you deposit $10 million into my bank account, I can
 double your money ...
"""
print(letter.format("Paris", "Whitney", "Hilton"))
print(letter.format("Bill", "Warren", "Jeff"))

In [None]:
layout = "{0:>4}{1:>6}{2:>6}{3:>8}{4:>13}{5:>24}"

print(layout.format("i", "i**2", "i**3", "i**5", "i**10", "i**20"))
for i in range(1, 11):
    print(layout.format(i, i**2, i**3, i**5, i**10, i**20))
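The same format specifications also work inside f-strings, where the value goes before the colon and the specification after it. A brief sketch with illustrative variable names, assuming Python 3.6 or newer:

```python
name = "Paris"
pi = 3.1415926
number = 16

# alignment in a 9-character-wide field: left, center, right
line = f"|{name:<9}|{name:^9}|{name:>9}|"

# three decimal places, and hexadecimal conversion
pi_text = f"{pi:.3f}"
hex_text = f"0x{number:x}"

print(line)
print(pi_text, hex_text)
```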

## Exercises

1. Print a neat-looking multiplication table like this:
<pre>
        1   2   3   4   5   6   7   8   9  10  11  12
  :--------------------------------------------------
 1:     1   2   3   4   5   6   7   8   9  10  11  12
 2:     2   4   6   8  10  12  14  16  18  20  22  24
 3:     3   6   9  12  15  18  21  24  27  30  33  36
 4:     4   8  12  16  20  24  28  32  36  40  44  48
 5:     5  10  15  20  25  30  35  40  45  50  55  60
 6:     6  12  18  24  30  36  42  48  54  60  66  72
 7:     7  14  21  28  35  42  49  56  63  70  77  84
 8:     8  16  24  32  40  48  56  64  72  80  88  96
 9:     9  18  27  36  45  54  63  72  81  90  99 108
10:    10  20  30  40  50  60  70  80  90 100 110 120
11:    11  22  33  44  55  66  77  88  99 110 121 132
12:    12  24  36  48  60  72  84  96 108 120 132 144
</pre>

2. Write a program that determines whether a given string is a palindrome. A palindrome is a word, phrase, or sequence that reads the same backward as forward, e.g., madam or nurses run or race car.

2.1 Convert Exercise 2 into a function and write at least two test cases.

3. Write a program that calculates the number of trials required to guess the 3-digit pass code 777 (starting from 000, 001, 002, 003, 004, 005, ..., 010, etc.) using a brute-force technique.

3.1. Convert Exercise 3 into a function and write at least 3 test cases.

4. Write a function that calculates the [run-length encoding](https://en.wikipedia.org/wiki/Run-length_encoding) of a given string. Run-length encoding is a form of lossless data compression in which runs of data are stored as a single data value and count, rather than as the original run. Assume that the data contains only letters (upper and lowercase) and is case-insensitive.
E.g.: 
    - aaaabbc -> 4a2b1c
    - Abcd -> 1a1b1c1d

In [None]:
# 4 solution
# Algorithm:
# for each character:
#   if the current character is same as the previous one
#        increment count
#   else 
#        append the count and the previous character to the encoding
#        reset count and previous character
# after the loop, append the count and character of the final run

def run_length_encoding(text):
    # check for corner case
    if not text: # if text is empty!
        return ''
    
    encoding = ''
    # FIXME: implement the algorithm
    
    
    return encoding
    

In [None]:
# unit testing for run_length_encoding
assert run_length_encoding('') == ''
assert run_length_encoding('aaaabbc') == '4a2b1c'
assert run_length_encoding('abcd') == '1a1b1c1d'
assert run_length_encoding('zzaazyyyYY') == '2z2a1z5y'
# FIXME: Write a few more test cases; what corner cases can you think of
# that would break the run_length_encoding function?
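For comparison after attempting the exercise, here is one possible way to fill in the algorithm outlined in the comments. It is a sketch under the stated assumptions (letters only, case-insensitive, output in lowercase), and the name `rle_encode` is hypothetical so it does not clash with the stub above.

```python
def rle_encode(text):
    """Return the run-length encoding of text, e.g. 'aaaabbc' -> '4a2b1c'."""
    if not text:               # corner case: empty input
        return ''

    text = text.lower()        # case-insensitive: treat 'Y' and 'y' as the same
    encoding = ''
    previous = text[0]
    count = 1
    for ch in text[1:]:
        if ch == previous:
            count += 1
        else:
            # the run ended: record it and start counting the new character
            encoding += str(count) + previous
            previous = ch
            count = 1
    # record the final run, which the loop never gets to write out
    encoding += str(count) + previous
    return encoding

print(rle_encode('aaaabbc'))
```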

5. Write a function that decodes the given run-length encoded compressed data, i.e. decompresses the compressed data to the original string.
e.g. 
    - '' -> ''
    - '1a2b3c' -> 'abbccc'
    - '10a' -> 'aaaaaaaaaa'

## Kattis problems

1. Avion - https://open.kattis.com/problems/avion
2. Apaxiaans - https://open.kattis.com/problems/apaxiaaans
3. Hissing Microphone - https://open.kattis.com/problems/hissingmicrophone
4. Reversed Binary Numbers - https://open.kattis.com/problems/reversebinary
5. Kemija - https://open.kattis.com/problems/kemija08
6. Simon Says - https://open.kattis.com/problems/simonsays
7. Simon Says - https://open.kattis.com/problems/simon
8. Quite a Problem - https://open.kattis.com/problems/quiteaproblem
9. Eligibility - https://open.kattis.com/problems/eligibility
10. Charting Progress - https://open.kattis.com/problems/chartingprogress
11. Pig Latin - https://open.kattis.com/problems/piglatin
12. Battle Simulation - https://open.kattis.com/problems/battlesimulation
13. Palindromic Password - https://open.kattis.com/problems/palindromicpassword
14. Image Decoding - https://open.kattis.com/problems/imagedecoding
15. Viðsnúningur - https://open.kattis.com/problems/vidsnuningur