<a href="https://colab.research.google.com/github/zengmmm00/DASC_PRE_PYTHON/blob/main/02b_Strings.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

> _Self-learning material_  
> **Python workshop - 2b. Strings**

# Strings

## Strings and Tuples

**Strings** and tuples are very similar: both are ordered sequences with a length. In fact, a string can be seen as a tuple of characters.

In [None]:
myTuple = (1, 2, 3)
myStr = "abc"
print('Length of', myTuple, "is", len(myTuple))

Length of (1, 2, 3) is 3


In [None]:
print('Length of', myStr, "is", len(myStr))

Length of abc is 3
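The similarity goes beyond `len()`. The following minimal sketch (the variable names `first_item`, `first_char` and `chars` are illustrative only) shows that both types use the same indexing syntax, and that `tuple()` converts a string into an actual tuple of one-character strings:

```python
myTuple = (1, 2, 3)
myStr = "abc"

# Both types support indexing with the same syntax.
first_item = myTuple[0]   # 1
first_char = myStr[0]     # 'a'

# tuple() turns a string into a tuple of one-character strings.
chars = tuple(myStr)      # ('a', 'b', 'c')

print(first_item, first_char, chars)
```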


## String in for-loop

We can iterate over a string in a **for-loop**, one character at a time:

In [None]:
myStr = "abc"
for c in myStr:
    print(c)

a
b
c


## Slicing

**Slicing** is also possible for strings, using the usual `[start:stop:step]` notation.

In [None]:
myStr = "abcdefg"
print(myStr[1:4])
print(myStr[1:6:2])

bcd
bdf


## String is Immutable

However, since strings are **immutable**, we cannot assign values to an index or a slice.

In [None]:
myStr = "abcdefg"
myStr[1] = 'X'
myStr[1:4] = 'XXX'

- The above code will raise an **error** (a `TypeError`), because strings do not support item assignment.
- To get a modified string, we need to construct a new one using concatenation or string formatting.
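As a minimal sketch of the concatenation approach, the intended replacements can be built from slices of the original string. The names `replaced_one` and `replaced_slice` are illustrative only:

```python
myStr = "abcdefg"

# Replace the character at index 1 by joining the parts around it.
replaced_one = myStr[:1] + 'X' + myStr[2:]

# Replace the slice [1:4] in the same way.
replaced_slice = myStr[:1] + 'XXX' + myStr[4:]

print(replaced_one)    # aXcdefg
print(replaced_slice)  # aXXXefg
print(myStr)           # abcdefg (the original string is unchanged)
```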

# String manipulation

## String construction

Suppose we try to produce a string that shows the result of adding two values:

In [None]:
a = 1
b = 2
c = a + b
s = a + "+" + b + "=" + c
print(s)

- The above code will raise a `TypeError` when executed.
- Operator `+` can be used to concatenate two strings, but it cannot concatenate a string and a number.
- We need to **convert** each non-string value to a string using the `str()` function.

In [None]:
a = 1
b = 2
c = a + b
s = str(a) + "+" + str(b) + "=" + str(c)
print(s)

1+2=3


## String repetition

Operator `*` can be used to repeat a string a given number of times.

In [None]:
print('=-=' * 10)

=-==-==-==-==-==-==-==-==-==-=


## Other useful string operations

| Operation | Effect |
| ---       | --- |
| `str.count(sub)` | Count the number of occurrences of `sub` in `str`. |
| `str.startswith(prefix)` | Check if `str` starts with `prefix`. |
| `str.endswith(suffix)` | Check if `str` ends with `suffix`. |
| `str.find(sub)` | Find the position of the first occurrence of `sub` in `str` (`-1` if not found). |
| `str.split(sep)` | Split `str` into a list of strings using separator `sep`. |
| `str.join(iter)` | Join the strings in list/tuple `iter` into one string, with `str` as the separator. |

Reference: <https://docs.python.org/3/library/stdtypes.html#string-methods>
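A short sketch of these operations in use. The sample string and the names `text` and `parts` are made up for illustration:

```python
text = "banana,apple,cherry"

print(text.count("an"))        # 2
print(text.startswith("ban"))  # True
print(text.endswith("pie"))    # False
print(text.find("apple"))      # 7

# split() and join() are roughly inverse operations.
parts = text.split(",")
print(parts)                   # ['banana', 'apple', 'cherry']
print("-".join(parts))         # banana-apple-cherry
```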

## F-string

F-strings allow us to inject values into a string by specifying placeholders.

In [None]:
a = 1
b = 2
print(f'{a}+{b}={a+b}')

1+2=3


- `f'...'` specifies an f-string.
- `{}` is a placeholder for a value or expression to be put into the string.
- Details will be covered in notes 3f (optional).

# Quiz

## Quiz 2b

1. Give a single statement to follow the code below that appends the number in variable `i`
   to the string in variable `s` using operator `+`.
   Use no spaces in your answer.

```python
s = input()
i = int(input())
# ...
```

2. Given variable `s`, use slicing to give an expression that evaluates to `ape` if `s='apple'`,
   and to `pnape` if `s='pineapple'`.

# Exercises

## Exercise 2-3

- Write a program that reads an input string and prints a list of strings following the specific pattern shown below.
- Suppose the input string is `hello`, the output should be:

```text
hellohello
_ellohell
__llohel
___lohe
____oh
```

Another example output for the input `foobar`:

```text
foobarfoobar
_oobarfooba
__obarfoob
___barfoo
____arfo
_____rf
```

## Exercise 2-4 (VPL available)

- Write a program that reads an input string and prints a string following the specific pattern shown below.
- Suppose the input string is `hello`, the output should be:

```text
hhehelhellhelloellolloloo
```

Note: The output is the concatenation of `'h'`, `'he'`, `'hel'`, `'hell'`, `'hello'`, `'ello'`, `'llo'`, `'lo'`, and `'o'`.

Sample input/output:

| Input | Output |
| --- | --- |
| blah | bblblablahlahahh |
| foobar | ffofoofoobfoobafoobaroobarobarbararr |

# Optional: Better string construction

## Concatenation performance

Consider the following example:

```python
a = "...100 characters..."
s = ""
for i in range(10):
    s += a
```
- Every time the statement `s += a` is executed, a new string is created and all characters of `s` and `a` are copied into it.
- What happens if the for-loop is running on `range(1000)`?
- The total number of characters copied grows roughly with the square of the number of iterations!
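To get a feel for the numbers, here is a rough estimate under a simplified model in which every `s += a` writes the whole new string. The helper `chars_copied` is hypothetical, and a real interpreter may optimise some of these copies away, so treat the figures as an upper bound rather than a measurement:

```python
def chars_copied(n, part_len=100):
    """Estimate characters written by n repeated `s += a` steps.

    Simplified model: each step writes the entire new string.
    """
    total = 0
    length = 0
    for _ in range(n):
        length += part_len   # length of s after this step
        total += length      # the whole new string is written
    return total

print(chars_copied(10))    # 5500
print(chars_copied(1000))  # 50050000
```

Going from 10 to 1000 iterations multiplies the work by roughly 10,000 in this model, not by 100.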

## Better string construction

For better performance, it is very common to construct a string in two steps:
1. Construct a **list** of strings holding the different parts.
2. **Join** the list of strings into a single string.

In this way, we avoid generating intermediate concatenation results.

To join a list of strings, we can use the `join()` method. The statement `x.join(y)` will join the list of strings `y` using string `x` as the separator. For example:

In [None]:
a = "aaa"
b = "bbb"
c = "ccc"
print(','.join([a, b, c]))

aaa,bbb,ccc


To concatenate a list of strings, we join them with an empty string:

In [None]:
a = "aaa"
b = "bbb"
c = "ccc"
print(''.join([a, b, c]))

aaabbbccc


## Improved code

We revisit our previous example:

```python
a = "...100 characters..."
s = ""
for i in range(10):
    s += a
```

We can rewrite it to improve its performance:

```python
a = "...100 characters..."
l = []
for i in range(10):
    l.append(a)
    
s = ''.join(l)
```
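The same pattern is useful when the parts differ on each iteration. A minimal sketch, assuming we want the numbers 0 to 4 separated by commas (the names `parts` and `s` are illustrative):

```python
parts = []
for i in range(5):
    # join() only accepts strings, so convert each number first.
    parts.append(str(i))

s = ','.join(parts)
print(s)  # 0,1,2,3,4
```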

# Optional: Multiline string

## Multiline string

A multi-line string is defined using three single quotes or three double quotes:

```python
a = '''Hello,
world!'''

b = """Hello,
world!"""
```

## Multiline comments

The multi-line string notation is often used to represent multi-line comments:

```python
'''
This is a comment.
This string is not kept in any variable.
'''
```

Multi-line strings are also used to document code (such strings are called docstrings).
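For instance, a multi-line string placed as the first statement of a function becomes that function's documentation and can be read back through its `__doc__` attribute. The function `greet` below is a made-up example:

```python
def greet(name):
    """Return a greeting for `name`.

    This string is the function's docstring.
    """
    return "Hello, " + name + "!"

print(greet("world"))  # Hello, world!
print(greet.__doc__)   # prints the docstring text
```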

# Optional: Character code

## Character code

- Each character is represented by a code internally in Python.
- This is also true for most programming languages.
- We can convert a character to its code by using the `ord()` function, or use the `chr()` function for the reverse.

## `ord()` and `chr()`

Here is an example printing the code for character `a`:

In [None]:
c = "a"
a = ord(c)
c2 = chr(a)
print(c, "has a code of", a)

a has a code of 97


In [None]:
print(a, "is the code of the character", c2)

97 is the code of the character a


## Use of character code

Since the character codes of `a` to `z` are consecutive, we can easily convert the letters `a` to `z` to the range `0` to `25`. For example:

In [None]:
c = "k"
ord_a = ord('a')
print(c, "is", ord(c) - ord_a, "letters after 'a'")

k is 10 letters after 'a'
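The conversion also works in the other direction: adding a position to the code of `a` and passing the result to `chr()` gives back the letter. A minimal sketch, where the names `position`, `letter` and `alphabet` are illustrative:

```python
ord_a = ord('a')

# Letter -> position (0 to 25)
position = ord('k') - ord_a      # 10

# Position -> letter
letter = chr(ord_a + position)   # 'k'

# All 26 lowercase letters rebuilt from their positions.
alphabet = ''.join(chr(ord_a + i) for i in range(26))

print(position, letter, alphabet)
```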


## Case checking

We can check the letter case of a character using a similar method:

In [None]:
def testCase(c):
    if ord(c) >= ord('a') and ord(c) <= ord('z'):
        print(c, "is a lower case letter")
    elif ord(c) >= ord('A') and ord(c) <= ord('Z'):
        print(c, "is a capital letter")
    else:
        print(c, "is not a letter")

testCase("k")
testCase("K")
testCase("*")

k is a lower case letter
K is a capital letter
* is not a letter


Note: We will discuss the use of functions in a later section.
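Python compares single characters by their codes, so the same check can also be written without calling `ord()`. The sketch below is one possible variant. The helper `letter_case` is hypothetical and returns a label instead of printing:

```python
def letter_case(c):
    # Characters are compared by their codes, so chained
    # comparisons can express the range checks directly.
    if 'a' <= c <= 'z':
        return "lower"
    elif 'A' <= c <= 'Z':
        return "upper"
    else:
        return "not a letter"

print(letter_case("k"))  # lower
print(letter_case("K"))  # upper
print(letter_case("*"))  # not a letter
```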

## ROT13

As an example, we can implement [ROT13](https://en.wikipedia.org/wiki/ROT13), a substitution cipher that shifts each letter by 13 positions, in this way:

In [None]:
# ROT13 on string
def rot13(text):
    result = []
    for c in text:
        result.append(rot13c(c))
    return "".join(result)

# ROT13 on one character
def rot13c(c):
    lowerc = ord(c) - ord('a')
    upperc = ord(c) - ord('A')
    if lowerc >= 0 and lowerc < 26:
        return chr(ord('a') + (lowerc + 13) % 26)
    elif upperc >= 0 and upperc < 26:
        return chr(ord('A') + (upperc + 13) % 26)
    else:
        return c

s = input("Input a string: ")
print(rot13(s))

Uryyb, jbeyq!
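Because the alphabet has 26 letters, shifting by 13 twice brings every letter back to where it started, so the same function both encodes and decodes. The sketch below repeats the two functions in compact form and uses a fixed string instead of `input()` so that it runs on its own. The names `encoded` and `decoded` are illustrative:

```python
# ROT13 on one character
def rot13c(c):
    lowerc = ord(c) - ord('a')
    upperc = ord(c) - ord('A')
    if 0 <= lowerc < 26:
        return chr(ord('a') + (lowerc + 13) % 26)
    elif 0 <= upperc < 26:
        return chr(ord('A') + (upperc + 13) % 26)
    else:
        return c

# ROT13 on string
def rot13(text):
    return "".join(rot13c(c) for c in text)

encoded = rot13("Hello, world!")
decoded = rot13(encoded)   # applying ROT13 again undoes it

print(encoded)  # Uryyb, jbeyq!
print(decoded)  # Hello, world!
```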
