# String (`str`) as a Data Type

Strings are one of the most commonly used data types in Python. Python has a built-in string class named `str` with many handy features (there is an older module named "string" which you should not use).

Run the code cell below.

In [None]:
print(type("abc"))

s = '12.4'
print(type(s))

Strings in Python are surrounded by either single quotation marks or double quotation marks.

'hello' is the same as "hello".

Run the code cells below to understand why Python introduces two different quotation mark types for strings.

In [None]:
s1 = "This string contains 'single' quotation marks."
print(s1)

s2 = 'This string contains "double" quotation marks.'
print(s2)

In [None]:
s3 = "This "string" cannot work."
print(s3)

We may convert an `int` value or a `float` value to a `str` value using the function **`str()`**. Run the code below.

In [None]:
a = 3.14
b = 16

print(a + b)

s_a = str(a)
s_b = str(b)
print(s_a + s_b)

It is common to concatenate two string values together with the `+` operator, but we can do much more than that. We can extract partial strings from string values, add or remove spacing, convert letters to lowercase or uppercase, and check that strings are formatted correctly. We can even write Python code to access the clipboard for copying and pasting text.
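As a quick preview of these operations, here is a minimal sketch. The value `raw` is a made-up sample chosen only for illustration, and the slicing syntax used on the third line is explained later in this notebook.

```python
# Hypothetical sample value, chosen only for illustration.
raw = "  Hello, Python!  "

trimmed = raw.strip()            # remove leading and trailing spaces
print(trimmed)                   # Hello, Python!
print(trimmed[0:5])              # extract a partial string: Hello
print(trimmed.lower())           # hello, python!
print(trimmed.upper())           # HELLO, PYTHON!
print(trimmed.endswith("!"))     # check the format: True
```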

# String Methods

Python has a set of **built-in methods** that you can use on strings. **All string methods return new values. They do not change the original string because `str` is an IMMUTABLE data type.**

For example, the method `upper()` converts a string to its upper case. Try the code below.

In [None]:
sentence = "I love Python!"

uppercase = sentence.upper()

print(sentence)
print(uppercase)

The list of built-in methods for strings is as follows. To display this list in Python, simply run `help(str)`.

**Method** | **Description**
:------------- | :-
capitalize()   | Converts the first character to upper case
casefold()     | Converts string into lower case
center()	   | Returns a centered string
count()        | Returns the number of times a specified value occurs in a string
encode()	   | Returns an encoded version of the string
endswith()     | Returns True if the string ends with the specified value
expandtabs()   | Sets the tab size of the string
find()         | Searches the string for a specified value and returns the position where it was found, or -1 if it was not found
format()       | Formats specified values in a string
format_map()   | Formats specified values in a string, taking the values from a mapping such as a dictionary
index()        | Searches the string for a specified value and returns the position where it was found; raises `ValueError` if it was not found
isalnum()	   | Returns True if all characters in the string are alphanumeric
isalpha()	   | Returns True if all characters in the string are in the alphabet
isdecimal()	   | Returns True if all characters in the string are decimals
isdigit()	   | Returns True if all characters in the string are digits
isidentifier() | Returns True if the string is an identifier
islower()	   | Returns True if all characters in the string are lower case
isnumeric()	   | Returns True if all characters in the string are numeric
isprintable()  | Returns True if all characters in the string are printable
isspace()	   | Returns True if all characters in the string are whitespaces
istitle()	   | Returns True if the string follows the rules of a title
isupper()	   | Returns True if all characters in the string are upper case
join()         | Joins the elements of an iterable into one string, using the string as the separator
ljust()	       | Returns a left justified version of the string
lower()	       | Converts a string into lower case
lstrip()	   | Returns a left trim version of the string
maketrans()	   | Returns a translation table to be used in translations
partition()    | Returns a tuple where the string is split into three parts around the first occurrence of a separator
replace()	   | Returns a string where a specified value is replaced with a specified value
rfind()        | Searches the string for a specified value and returns the last position where it was found, or -1 if it was not found
rindex()       | Searches the string for a specified value and returns the last position where it was found; raises `ValueError` if it was not found
rjust()        | Returns a right justified version of the string
rpartition()   | Returns a tuple where the string is split into three parts around the last occurrence of a separator
rsplit()       | Splits the string at the specified separator, starting from the right, and returns a list
rstrip()	   | Returns a right trim version of the string
split()	       | Splits the string at the specified separator, and returns a list
splitlines()   | Splits the string at line breaks and returns a list
startswith()   | Returns True if the string starts with the specified value
strip()	       | Returns a trimmed version of the string
swapcase()	   | Swaps cases, lower case becomes upper case and vice versa
title()	       | Converts the first character of each word to upper case
translate()	   | Returns a translated string
upper()	       | Converts a string into upper case
zfill()	       | Fills the string with a specified number of 0 values at the beginning
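
A few of the methods in the table are easier to understand side by side. The sketch below is illustrative only; the value `line` is a made-up sample and is not part of the original exercises. It also uses `try`/`except`, which may not have been covered yet, purely to show the difference between `find()` and `index()`.

```python
# Hypothetical sample value, chosen only for illustration.
line = "apple,banana,cherry"

fruits = line.split(",")         # split at each comma into a list
print(fruits)                    # ['apple', 'banana', 'cherry']

joined = " | ".join(fruits)      # the string " | " acts as the separator
print(joined)                    # apple | banana | cherry

print(line.count("a"))           # 4
print(line.find("banana"))       # 6
print(line.find("grape"))        # -1, because "grape" is not found
print("7".zfill(3))              # 007

# index() behaves like find(), but raises an error when nothing is found.
try:
    line.index("grape")
except ValueError:
    print("index() raised ValueError")
```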

If you need to view the details of a particular method, for example `replace()`, you may simply run `help(str.replace)`.

In [None]:
help(str.replace)

In [None]:
s = "all strings are immutable!"

s1 = s.replace("a", "A")
s2 = s.replace("a", "A", 1)
s3 = s.replace("a", "A", 2)

print("The original string  :", s)
print("Replacing all a's    :", s1)
print("Replacing first a    :", s2)
print("Replacing first 2 a's:", s3)
      

**Example 9.1**

The strings below store *The Middle*, a poem by *Ogden Nash*. Learn and apply the string methods `center()` and `rjust()` to print the poem in

**(a)** a centralised format, and

**(b)** a right-justified format.

```
title = "The Middle"
s1 = "When I remember bygone days"
s2 = "I think how evening follows morn;"
s3 = "So many I loved were not yet dead,"
s4 = "So many I love were not yet born."
author = "- Ogden Nash"
```

In [None]:
# code (a) here

In [None]:
# code (b) here

**What is the difference between `isdecimal()`, `isdigit()` and `isnumeric()`?**

Apply these methods on the following strings, then print and observe whether there is any difference in the output.
```python
s1 = "123"
s2 = "-45"
s3 = "0.75"
s4 = "12a"
```

In [None]:
# test here
s1 = "123"
#test below
print(s1.isdecimal())
print(s1.isdigit())
print(s1.isnumeric())

In [None]:
s2 = "-45"
#test below


In [None]:
s3 = "0.75"
#test below


In [None]:
s4 = "12a"
#test below


There is no difference between the three methods in the above examples. Now let's look at something different.
```python
s5 = "⓪③⑧"
s6 = "௦௧௨௩௪"
s7 = "一二三四五"
```

In [None]:
s5 = "⓪③⑧"
#test below


In [None]:
s6 = "௦௧௨௩௪"
#test below


In [None]:
s7 = "一二三四五"
#test below


By now you should have observed the differences between these 3 methods.
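
One way to compare the three methods side by side is a small helper that returns all three results at once. The function name `numeric_checks()` is an assumption made for this sketch and is not part of the original notebook.

```python
def numeric_checks(s):
    """Return the results of isdecimal(), isdigit() and isnumeric() as a tuple."""
    return (s.isdecimal(), s.isdigit(), s.isnumeric())

print(numeric_checks("123"))         # (True, True, True)
print(numeric_checks("-45"))         # (False, False, False)
print(numeric_checks("0.75"))        # (False, False, False)
print(numeric_checks("12a"))         # (False, False, False)
print(numeric_checks("⓪③⑧"))        # (False, True, True)
print(numeric_checks("௦௧௨௩௪"))      # (True, True, True)
print(numeric_checks("一二三四五"))   # (False, False, True)
```

In short, `isdecimal()` is the strictest of the three and `isnumeric()` is the most permissive. None of them accepts a sign or a decimal point.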

**Example 9.2**

Write a function `exist()` which
- takes in two strings `s` and `c`, where `c` is a string with a single character;
- returns `True` if `c` can be found in `s`; returns `False` otherwise.

There are at least two ways to do it, using either string method `count()` or `find()`. You may use the `help()` function to explore how these methods work.

*Test cases:*
```python
exist("ABCABC1", "A")  # should return True
exist("ABCABC2", "b")  # should return False
```

In [None]:
#Using count() method

In [None]:
#Using find() method

Besides the methods above, Python provides the keyword `in`, which gives you the answer immediately. Try the code below:

In [None]:
print("A" in "ABCAB1")

In [None]:
if "b" in "ABCABC2":
    print("'b' is found in the string!")
else:
    print("'b' is not found in the string!")

# Length of a String, `len()`

The **built-in function (not a method) `len()`** returns the length of a string as an integer. Run the following test code.

In [None]:
print(len("007"))

s1 = "Good job!"
print(len(s1))

s2 = ""
length = len(s2)
print(length)

**Example 9.3**

Write a function `longerstring()` which
- takes in two strings;
- returns the longer string between the two (if the two strings have the same length, you may return either one).

*Test cases:*
```python
longerstring("ABC", "1234")  # should return "1234"
longerstring("abc", "90")  # should return "abc"
```


# Indexing and Slicing

<p><img alt="String Indexing" src="https://drive.google.com/uc?id=1_C5xW8sTYJfkJ_ga-7UDyGSXASNEOq3m" width = "300" align="center" vspace="0px"></p>

We may access individual characters in a string by indexing. In forward indexing, **`0` is the index of the first character**. Try the code below.

In [None]:
string = "PYTHON"
print(string[0])
print(string[1])
print(string[2])
print(string[3])
print(string[4])
print(string[5])

Note that the index of the last character in a string is **its length - 1**. The code below will result in an error.

In [None]:
print(string[6])
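
The sketch below shows the same rule in a form that does not stop the notebook. It uses `try`/`except`, which may not have been covered yet, and the variable name `last_index` is an assumption made for illustration.

```python
string = "PYTHON"
last_index = len(string) - 1     # 5 for a string of length 6

print(string[last_index])        # N

# Any index beyond the last one raises an IndexError.
try:
    print(string[last_index + 1])
except IndexError as error:
    print("IndexError:", error)
```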

We may also access these characters in backward indexing. **`-1` is the index of the last character**. Try the code below.

In [None]:
print(string[-1])
print(string[-2])
print(string[-3])
print(string[-4])
print(string[-5])
print(string[-6])

**Reminder:**

String is an **IMMUTABLE** data type, thus you can only access its characters by indexing but cannot modify them. The code below will result in an error.

In [None]:
string[1] = "y"

**Example 9.4**

Write a function `same_start_end()` which
- takes in a string;
- returns `True` if its first and last characters are identical; returns `False` otherwise.

```python
same_start_end("aA12A")  # should return False
same_start_end("aBcCbA3a")  # should return True
same_start_end("1")  # should return True
```

In [None]:
# code here

You can return a range of characters in a string by using the slice syntax. Specify the **start index (inclusive)** and the **end index (exclusive)**, separated by a colon, to return a part of the string. Try the code below.

In [None]:
string = "PYTHON" # its length is 6, thus indices are from 0 to 5

print(string[1:5]) # slicing from index 1 to index 4 (not including index 5)
print(string[-3:-1]) # slicing from index -3 to index -2 (not including index -1)

In index slicing, you may omit the start index (which means starting from the first character) or omit the end index (which means ending at the last character). Try the code below.

In [None]:
first3 = string[:3] # slicing from the start to index 2 (not including index 3)
print(first3)

from4 = string[4:] # slicing from index 4 to the end
print(from4)

This applies to backward indexing too. Try the code below.

In [None]:
last3 = string[-3:] # slicing from index -3 to the end
print(last3)

exceptlast = string[:-1] # slicing from the start to index -2 (not including index -1)
print(exceptlast)
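
Because a string is immutable, one common way to "change" a character is to build a new string from slices of the old one. The following is a minimal sketch; the variable name `modified` is an assumption made for illustration.

```python
string = "PYTHON"

# Join the part before index 1, the new character, and the part after index 1.
modified = string[:1] + "y" + string[2:]

print(string)      # PYTHON (the original string is unchanged)
print(modified)    # PyTHON
```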

**Example 9.5**

Write a function `digit_middle()` which
- takes in a string (you may assume the length is at least 3);
- returns `True` if its middle part (excluding its first and last characters) consists only of digits; returns `False` otherwise.

In [None]:
# code here

<p><img alt="Python Cartoon" src="https://drive.google.com/uc?id=17qx7WNiRa-JOtgCgHWPtDZ7RkmwaaUDu" width = "100" align="left" vspace="0px"></p>

# *Go to Assignment 09A*

Now we are able to access all the characters in a string with indexing and a `while`-loop.

**Example 9.6**

Suppose we want to print out all the characters in the string "I love Python!" from the start to the end, separated by "|". The code is given below.
```Python
s = "I love Python!"
l = len(s)
index = 1
keepgoing = (index <= l)

while keepgoing:
    print(s[index], end = "|")
    index = index + 1
    keepgoing = (index <= l)
```
The code above has two critical errors. Copy and paste the code into the cell below and rectify the errors. The expected output is:
```
I| |l|o|v|e| |P|y|t|h|o|n|!|
```

In [None]:
# code here

We have a neater solution to the example above using a `for`-loop as string is an **iterable** data type. Try the code below.

In [None]:
s = "I love Python!"

for c in s:
    print(c, end = "|")

# `for`-loop

A `for`-loop is used for iterating over an iterable. Some of the iterable data types in Python are
- string
- list
- range
- tuple
- dictionary
- set

***Syntax***
```python
for an_iterating_variable in an_iterable :
    
    <statements in the body of the for-loop until the entire iterable is exhausted>
    
<statements after the loop>
```
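
The sketch below applies this syntax to each of the iterable types listed earlier. All values are made-up samples chosen only for illustration.

```python
# A string yields its characters one at a time.
for letter in "abc":
    print(letter, end=" ")
print()

# A list yields its elements in order.
total = 0
for number in [10, 20, 30]:
    total = total + number
print(total)                      # 60

# A range yields integers; range(3) gives 0, 1, 2.
for i in range(3):
    print(i, end=" ")
print()

# A tuple yields its elements in order.
for item in ("x", "y"):
    print(item, end=" ")
print()

# A dictionary yields its keys.
for key in {"one": 1, "two": 2}:
    print(key, end=" ")
print()

# A set yields its elements; the order is not guaranteed in general.
for element in {"only"}:
    print(element)
```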

Below is a visual representation of the `for`-loop as a flowchart.
<p><img alt="Flowchart_FOR_Loop" src="https://drive.google.com/uc?id=1aTcdz9ofa-UiOyUANuHVQlxgKYGcFx1O" width = "500" align="centre" vspace="0px"></p>

# Sentry Variable in a `for`-loop

A `for`-loop is also controlled by a **sentry variable** (or a **loop variable**). Python performs the following steps automatically, following the syntax.
* Step 1: Initialise the sentry variable (the iterating variable used in the `for` statement is initialised to the first item in the iterable).
* Step 2: Check the sentry variable (whether the iterable is exhausted).
* Step 3: Update the sentry variable (the iterating variable is assigned the next item in the iterable after every iteration).
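
To make the three steps concrete, the sketch below writes them out by hand with a `while`-loop and compares the result with the equivalent `for`-loop. This is only a rough analogy for strings, not a description of how Python implements `for` internally; the names `while_output` and `for_output` are assumptions made for illustration.

```python
s = "abc"

# The three steps written out by hand with a while-loop.
while_output = ""
index = 0                        # Step 1: start at the first item
while index < len(s):            # Step 2: check whether the string is exhausted
    c = s[index]                 # the iterating variable takes the current item
    while_output = while_output + c + "|"
    index = index + 1            # Step 3: move on to the next item

# The same work done by a for-loop, which performs the steps automatically.
for_output = ""
for c in s:
    for_output = for_output + c + "|"

print(while_output)              # a|b|c|
print(for_output)                # a|b|c|
```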

**Example 9.7**

Write a function `count_vowels()` which
- takes in a string;
- returns the total number of vowels in the string (including both upper and lower cases).

Use a `for`-loop instead of other methods for this example.

*Test cases:*
```python
count_vowels("Python")  # should return 1
count_vowels("I love Python!")  # should return 4
count_vowels("Computational Thinking is cool!")  # should return 11
```

In [None]:
# code here

**Example 9.8**

Write a function `reverse_string()` which
- takes in a string;
- returns its reversed string, e.g. the reversed string of "abc" is "cba".

*Test cases:*
```Python
reverse_string("abc")  # should return "cba"
reverse_string("Python")  # should return "nohtyP"
reverse_string("End of Term!")  # should return "!mreT fo dnE"
```

In [None]:
# code here

<p><img alt="Python Cartoon" src="https://drive.google.com/uc?id=17qx7WNiRa-JOtgCgHWPtDZ7RkmwaaUDu" width = "100" align="left" vspace="0px"></p>

# *Go to Assignment 09B*

# Solution

**Example 9.1**

In [None]:
#(a)
title = "The Middle"
s1 = "When I remember bygone days"
s2 = "I think how evening follows morn;"
s3 = "So many I loved were not yet dead,"
s4 = "So many I love were not yet born."
author = "- Ogden Nash"

print(title.center(50))
print(s1.center(50))
print(s2.center(50))
print(s3.center(50))
print(s4.center(50))
print(author.center(50))

In [None]:
#(b)
print(title.rjust(50))
print(s1.rjust(50))
print(s2.rjust(50))
print(s3.rjust(50))
print(s4.rjust(50))
print(author.rjust(50))

**Example 9.2**

In [None]:
#Using count() method
def exist(s, c):
    
    if s.count(c) == 0:
        return False
    else:
        return True

In [None]:
#Using find() method
def exist(s, c):
    
    if s.find(c) == -1:
        return False
    else:
        return True

**Example 9.3**



In [None]:
def longerstring(s1, s2):

    if len(s1) >= len(s2):
        return s1
    else:
        return s2

**Example 9.4**

In [None]:
def same_start_end(s):
    
    return s[0] == s[-1]

**Example 9.5**

In [None]:
def digit_middle(s):
    
    return s[1:-1].isdecimal()

**Example 9.6**

In [None]:
s = "I love Python!"
l = len(s)
index = 0
keepgoing = (index < l)

while keepgoing:
    print(s[index], end = "|")
    index = index + 1
    keepgoing = (index < l)

**Example 9.7**

In [None]:
def count_vowels(s):
    
    counter = 0
    vowels = "AEIOUaeiou"
    
    for c in s:
        if (c in vowels):
            counter = counter + 1
    
    return counter


**Example 9.8**

In [None]:
def reverse_string(s):
    
    r_string = ""
    
    for c in s:
        r_string = c + r_string
    
    return r_string

reverse_string("End of Term!")