## Lesson 3 - String Processing

- 3.1 - String Sequence
- 3.2 - Basic Operations of String Object
- 3.3 - Special Character and Escape Sequence
- 3.4 - Methods and Functions of String Object
- 3.5 - String Search
- 3.6 - String Modification
- 3.7 - Transforming Object to a String: repr() and str()
- 3.8 - Formatting String Object: format(), %, and f-string
- 3.9 - Using bytes Object
- 3.10 - Application of print()

In everyday business operations, the **string** is one of the most important and challenging data types in programming: user names, file names, and other text all have to be processed as strings. Python's powerful built-in text processing and formatting tools allow us to complete these tasks efficiently. In this lesson we discuss some of these tools for processing **string** objects in Python.

String literals can be enclosed in either double or single quotes, although single quotes are more commonly used. A double-quoted string literal can contain single quotes without any fuss, and likewise a single-quoted string can contain double quotes.

Reference: https://docs.python.org/3/library/stdtypes.html#text-sequence-type-str

## 3.1 - String Sequence

In the previous lesson, we mentioned that **list** and **tuple** are ordered collections of arbitrary objects, in which the order of the elements is defined when the collection is created. This kind of ordered data type is called a "**sequence**" in Python. The Python **string** object is an immutable sequence of Unicode characters.

Since the Python **string** is a sequence data type, we can apply indexing [n] to access a single character or slicing [n:m] to access a sub-part of the string. Python strings are **immutable**, which means they cannot be changed after they are created. Therefore, an extracted or modified string needs to be assigned to a variable if we want to keep it.

e.g. Python strings
``` Python
x = "Hello"
x[0]  # return "H"
x[-1]  # return "o"
x[1:]  # return "ello"
```

In [None]:
# Create the Python string
x = "Hello"
x[0]  # return "H"
x[-1]  # return "o"
x[1:]  # return "ello"

One application of slicing in text processing is removing an unwanted **escape** character at the end of a string, such as "\n" (the newline character) or "\t" (the tab character). This is a useful step when cleaning raw text data.

e.g. 
``` Python
x = "Goodbye\n"
x = x[:-1]
x  # return 'Goodbye'
```

In [None]:
# Take out the escape character
x = "Goodbye\n"
x = x[:-1]
x  # return 'Goodbye'

The previous example is just one way to remove unnecessary characters from a string. Python has many built-in functions and methods to handle this task.
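
As a small sketch of why a dedicated method can be safer than slicing, the comparison below uses two made-up input strings (the variable names are only illustrative). Slicing with [:-1] always drops the last character, while rstrip("\n") removes a trailing newline only when one is present.

```python
# Two hypothetical input lines: one ends with a newline, one does not.
with_newline = "Goodbye\n"
without_newline = "Goodbye"

# Slicing always drops the last character, whatever it is.
sliced_ok = with_newline[:-1]        # 'Goodbye'
sliced_bad = without_newline[:-1]    # 'Goodby' (a real letter was lost)

# rstrip("\n") removes trailing newlines only if they are present.
stripped_ok = with_newline.rstrip("\n")        # 'Goodbye'
stripped_same = without_newline.rstrip("\n")   # 'Goodbye'
```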

We can also use the Python built-in len( ) function to count the characters in a string. For example,

``` Python
len("Goodbye")  # return 7
```



In [None]:
# Using the len( ) function on string
len("Goodbye")  # return 7

Note: Do not confuse a list with a string. The most obvious difference between the two is that a string is **immutable**, mainly for performance reasons. Any attempt to modify a string in place raises an exception. For instance,

``` Python
string = "Hello"
string.append('c')  # raises AttributeError: 'str' object has no attribute 'append'
string[0] = "A"  # raises TypeError: 'str' object does not support item assignment
```

In the earlier example, when we removed the escape character '\n', we were actually copying part of the original string by slicing it and creating a new string. In the following sections, we will discover some of the functions and methods of string objects, which apply exactly the same logic.
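
The sketch below catches the two exceptions so that the code can run from top to bottom, and then shows the usual workaround of building a new string. The variable names are illustrative only.

```python
string = "Hello"

# In-place modification attempts fail with different exception types.
try:
    string.append("c")          # str has no append() method
except AttributeError as err:
    append_error = type(err).__name__

try:
    string[0] = "A"             # item assignment is not supported
except TypeError as err:
    assign_error = type(err).__name__

# Building a new string from pieces of the old one works.
new_string = "A" + string[1:]   # 'Aello'
```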

In [None]:
# Modifying a string object in place raises an exception
string = "Hello"
string.append('c')  # raises AttributeError
string[0] = "A"  # raises TypeError

## 3.2 - Basic Operations of String Object

When we need to join multiple strings, the easiest (or most popular) way is to connect the strings with the addition operator (+):

``` Python
x = "Hello " + "World"
x  # return 'Hello World'
```

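Python also concatenates adjacent string literals automatically. Note that the space in the result comes from the first literal ("Hello "), not from the whitespace between the two literals: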

``` Python
x = "Hello "   "World"
x  # return 'Hello World'
```

The multiplication operator (*) also works with the string data type (even though it is not often used).

``` Python
8 * "x"  # return 'xxxxxxxx'
```

In [None]:
# Using the addition operator to join two string objects
x = "Hello " + "World"
x  # return 'Hello World'

In [None]:
# Adjacent string literals are concatenated automatically
x = "Hello "   "World"
x  # return 'Hello World'

In [None]:
# Using multiplication operator on string object
8 * "x"  # return 'xxxxxxxx'

## 3.3 - Special Characters and Escape Sequence

In an earlier section, we saw two examples of **escape sequences**: \n means newline and \t means tab. In Python strings, the backslash "\" is a special character, also called the **escape character**. When it is combined with other characters it forms an **escape sequence**, which is used to represent certain whitespace and other special characters. In this section, we introduce some of the most common **escape sequences** and their applications.

##### Basic Escape Sequence

The table below presents the most commonly used **escape sequences**. These **escape sequences** also apply to the **bytes** object covered at the end of this lesson.

|  Escape Sequence  |  Meaning  |
| :---:  |  :---:  |
|  \newline  |  Ignored  |
|  \'  |  single quote (')  |
|  \"  |  double quote (")  |
|  \\  |  Backslash (\)  |
|  \a  |  ASCII Bell (Bell)  |
|  \b  |  ASCII Backspace (Backspace)  |
|  \f  |  ASCII Formfeed (start a new page)  |
|  \n  |  ASCII Linefeed (start a new line)  |
|  \r  |  ASCII Carriage Return (Carriage Return)  |
|  \t  |  ASCII Horizontal Tab (Tab)  |
|  \v  |  ASCII Vertical Tab (Vertical Tab)  |

Besides the escape sequences listed above, any other ASCII character can be written using its numeric code.

#####  Octal and Hexadecimal Escape Sequences

In Python, we can use an octal or a hexadecimal escape sequence to represent any ASCII (American Standard Code for Information Interchange) character. An octal escape sequence is a backslash (\) followed by up to three octal digits: \ooo, where ooo is the character code written in base 8. A hexadecimal escape sequence is a backslash and x (\x) followed by exactly two hexadecimal digits: \xhh, where hh is the character code written in base 16.

e.g. Octal and hexadecimal escape sequences
``` Python
'm'  # return 'm'
'\155'  # return 'm'
'\x6D'  # return 'm'
'\x6d'  # return 'm', the hexadecimal digits can be written in either lowercase or uppercase
```

Applying the same logic, we can write other escape sequence in the following way:

``` Python
'\n'  # return '\n'
'\012'  # return '\n'
'\x0A'  # return '\n'
```
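
To check where the digits in these escape sequences come from, the built-in functions ord( ), chr( ), oct( ), and hex( ) can be used. This is only a quick sketch; the variable names are illustrative.

```python
# ord() gives the integer code point of a one-character string.
code = ord("m")                    # 109

# The same number written as octal and hexadecimal integer literals.
same_value = (code == 0o155 == 0x6D)

# oct() and hex() show the digits used in the escape sequences.
octal_digits = oct(code)           # '0o155'
hex_digits = hex(code)             # '0x6d'

# chr() converts a code point back to its character.
char = chr(0x6D)                   # 'm'
newline_matches = ("\012" == "\x0A" == "\n")
```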

Starting with Python 3, every string object is a Unicode string, which allows us to represent characters from many different languages. Here are some simple examples that demonstrate the use of a Unicode name and a hexadecimal code point.

``` Python
# Using the Unicode name to represent a Unicode character
unicode_a = '\N{LATIN SMALL LETTER A}'
unicode_a  # return 'a'

unicode_a_with_acute = '\N{LATIN SMALL LETTER A WITH ACUTE}'
unicode_a_with_acute  # return 'á'

# Using \u followed by four hexadecimal digits to represent a Unicode character
'\u00E1'  # return 'á' by Unicode code point
```
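
The standard unicodedata module can translate between characters and their Unicode names, which is a convenient way to find the name to put inside \N{...}. A minimal sketch:

```python
import unicodedata

# name() returns the official Unicode name of a character.
name_of_a_acute = unicodedata.name("\u00E1")   # 'LATIN SMALL LETTER A WITH ACUTE'

# lookup() goes the other way: from a name to the character.
char_from_name = unicodedata.lookup("LATIN SMALL LETTER A WITH ACUTE")

# All three spellings produce the same one-character string.
all_equal = (char_from_name == "\u00E1" == "\N{LATIN SMALL LETTER A WITH ACUTE}")
```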

In [1]:
# Using the Unicode name to represent a Unicode character
unicode_a = '\N{LATIN SMALL LETTER A}'
unicode_a  # return 'a'

'a'

In [None]:
# Using the Unicode name to represent a Unicode character
unicode_a_with_acute = '\N{LATIN SMALL LETTER A WITH ACUTE}'
unicode_a_with_acute  # return 'á'

In [None]:
# Using \u followed by four hexadecimal digits to represent a Unicode character
'\u00E1'  # return 'á' by Unicode code point

##### Using the print( ) function to show escape sequences

To see the effect of an escape sequence, we can pass the string to the print( ) function, which displays the characters that the sequence stands for. Here are some examples:

e.g. the string 'a\n\tb' versus print('a\n\tb')

``` Python
# The interactive interpreter shows the string in its escaped form
'a\n\tb'  # return 'a\n\tb' 

# print( ) interprets the escape sequences and displays the actual characters
print('a\n\tb')  # prints a and b separated by a newline and a tab
```

In this example, the first case shows the string in its original (escaped) form. In the second case, the print( ) function interprets the escape sequences and displays the actual characters, so the newline and the tab become visible.

In [None]:
# The interactive interpreter shows the string in its escaped form
'a\n\tb'  # return 'a\n\tb' 

In [None]:
# print( ) interprets the escape sequences and displays the actual characters
print('a\n\tb')  # prints a and b separated by a newline and a tab

By default, the print( ) function appends a newline to its output. When a string already ends with a newline escape sequence and we want to avoid a doubled newline, we can pass the argument end="" to print( ) so that no additional newline is added.

e.g.
``` Python
print('Hello World\n')  # prints 'Hello World' followed by two newlines

print('Hello World\n', end="")  # prints 'Hello World' followed by one newline
```
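
Because newlines are hard to see on screen, one way to verify this behaviour is to send the output to an in-memory buffer through the file argument of print( ) and inspect the result. This is only a sketch; the buffer names are made up for the example.

```python
import io

# Write into in-memory buffers instead of the screen so the exact
# characters can be inspected afterwards.
default_buf = io.StringIO()
print("Hello World\n", file=default_buf)

no_extra_buf = io.StringIO()
print("Hello World\n", end="", file=no_extra_buf)

default_output = default_buf.getvalue()     # 'Hello World\n\n'
no_extra_output = no_extra_buf.getvalue()   # 'Hello World\n'
```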

In [2]:
# With additional newline from print( ) function
print('Hello World\n')

Hello World



In [3]:
# Without additional newline from print( ) function
print('Hello World\n', end="")

Hello World


## 3.4 - Methods and Functions of String Object

Python has a set of built-in string methods that we can apply directly for text processing. Furthermore, the string module contains a number of useful constants and classes, as well as some deprecated functions that are also available as methods on strings. In this section, we focus only on the Python built-in methods and their applications. To call a built-in method on a string, we only need to add a period (.) after the string, followed by the method name:

Reference: https://docs.python.org/2.5/lib/string-methods.html

e.g.  stringName.method()
``` Python
# returns lowercased string
x = 'HELLO WORLD'
x.lower()  # return new string 'hello world'

# returns uppercased string
x.upper()  # return new string 'HELLO WORLD'

# converts first character to capital letter
x.capitalize()  # return new string 'Hello world'
```

Note: Keep in mind that a string is an immutable sequence, so a string method actually returns a new string rather than modifying the original string.

In [4]:
# returns lowercased string
x = 'HELLO WORLD'
x.lower()  # return new string 'hello world'

'hello world'

In [5]:
# returns uppercased string
x.upper()  # return new string 'HELLO WORLD'

'HELLO WORLD'

In [6]:
# converts first character to capital letter
x.capitalize()  # return new string 'Hello world'

'Hello world'

##### Using split( ) and join( ) 

Both the split( ) and join( ) methods are very powerful tools for text processing. They have exactly opposite functions: the split( ) method breaks a string object into a list of strings, and the join( ) method connects a list of strings to form a new string object.

Using (+) to join strings is convenient. However, when a large number of strings need to be joined, repeated use of (+) becomes inefficient. For instance, when we join the two strings "Hello " and "world", three objects are involved: "Hello ", "world", and the new string "Hello world". Every additional (+) creates another intermediate string that is discarded as soon as the next one is built, so joining many strings this way creates a large number of useless string objects during the process.

The join( ) method is a more efficient way to perform the same task.

Syntax: str.join(sequence) 

e.g. join a list of strings with empty spaces between each element
``` Python
# join a list of strings with empty spaces between them
" ".join(["join", "puts", "spaces", "between", "elements"])
# return 'join puts spaces between elements'
```

We only need to change the string in front of the join( ) method to choose the separator placed between the strings from the list.

e.g. join a list of strings with ::
``` Python
# join a list of strings with double colon
"::".join(["Separated", "with", "colons"])
# return 'Separated::with::colons'
```

We can also join a list of strings with an empty string, so that nothing is placed between the elements.

``` Python
# join a list of strings with an empty string
"".join(["Separated", "by", "nothing"])
# return 'Separatedbynothing'
```
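
One detail worth illustrating is that join( ) accepts only strings. The sketch below uses a made-up list of integers to show the resulting TypeError, the usual fix of converting each element with str( ), and the equivalent loop with (+) for comparison.

```python
numbers = [1, 2, 3]   # hypothetical data: a list of integers

# join() accepts only strings, so passing integers raises TypeError.
try:
    ", ".join(numbers)
except TypeError as err:
    join_error = type(err).__name__

# Converting each element with str() first makes the join work.
joined = ", ".join(str(n) for n in numbers)   # '1, 2, 3'

# Building the same text with repeated + gives an equal result,
# but creates a new intermediate string on every iteration.
built = ""
for n in numbers:
    built = built + str(n) + ", "
built = built[:-2]                            # drop the trailing ', '
```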

In [None]:
# join a list of strings with empty spaces between them
" ".join(["join", "puts", "spaces", "between", "elements"])

In [None]:
# join a list of strings with double colon
"::".join(["Separated", "with", "colons"])

In [None]:
# join a list of strings with an empty string
"".join(["Separated", "by", "nothing"])

The split( ) method splits a string into a list. The default separator is any run of whitespace, but we can specify a different separator. Here are some examples:

Syntax: str.split(sep, maxsplit)

e.g.
``` Python
x = "You\t\t can have tabs\t\n \t and newlines \n\n " "mixed in"
x.split()  # return ['You', 'can', 'have', 'tabs', 'and', 'newlines', 'mixed', 'in']

x = "Mississippi"
x.split("ss")  # return ['Mi', 'i', 'ippi']
```

We can also use the second argument to limit the number of splits. Given a value of n, split( ) performs at most n splits, producing a list of at most n+1 elements; it stops earlier if the string cannot be split any further. Here are some examples:

``` Python
x = 'a b c d e'
x.split(' ', 1)  # return ['a', 'b c d e']
x.split(' ', 2)  # return ['a', 'b', 'c d e']
x.split(' ', 9)  # return ['a', 'b', 'c', 'd', 'e']
```

If we want to split a string on whitespace while also specifying the second argument, we can pass None as the first argument.

``` Python
x = 'a\nb c d'
x.split(' ', 2)  # return ['a\nb', 'c', 'd']
x.split(None, 2)  # return ['a', 'b', 'c d']
```
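
The difference between an explicit separator and the default whitespace behaviour shows up most clearly with consecutive separators. The following sketch uses made-up sample strings:

```python
# With an explicit separator, every occurrence splits, so consecutive
# separators produce empty strings.
csv_like = "a,,b"
by_comma = csv_like.split(",")       # ['a', '', 'b']

spaced = "a  b"                      # two spaces between a and b
by_space = spaced.split(" ")         # ['a', '', 'b']

# With no separator, runs of whitespace count as one separator and
# leading/trailing whitespace is ignored.
by_whitespace = "  a  b \n".split()  # ['a', 'b']
```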

split( ) and join( ) are popular tools for working with textual data. For structured text formats such as CSV or JSON, however, it is recommended to use the Python standard csv and json modules.

In [None]:
x = "You\t\t can have tabs\t\n \t and newlines \n\n " "mixed in"
x.split()

In [None]:
x = "Mississippi"
x.split("ss")

In [None]:
x = 'a b c d e'
x.split(' ', 1)

In [None]:
x.split(' ', 2)

In [None]:
x.split(' ', 9)

In [None]:
x = 'a\nb c d'
x.split(' ', 2)

In [None]:
x.split(None, 2)

In [None]:
# Try it yourself!

# How to use split( ) and join( ) to change all the whitespaces in a string
# to '-'?
# e.g.
# "this is a test"  =>  "this-is-a-test"

##### Using int( ) and float( )

Integers and floats are data types that deal with numbers. Python has the built-in functions int( ) and float( ) for converting strings to integers or floats. If the string cannot be interpreted as a number, the functions raise a ValueError.

The int( ) function takes an optional second argument that specifies the base of the number in the string. The default base is 10. Here are some examples:

e.g. 
``` Python
float('123.456')  # return 123.456

float('xxyy')  # raises ValueError

int('3333')  # return 3333

int('123.456')  # raises ValueError

int('10000', 8)  # return 4096

int('101', 2)  # return 5

int('ff', 16)  # return 255

int('123456', 6)  # raises ValueError (6 is not a valid digit in base 6)
```
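
When the input comes from a user or a file, the ValueError usually has to be handled. The helper below, to_number( ), is a hypothetical function written for this example (it is not part of Python); it tries int( ) first, then float( ), and returns None if both fail.

```python
def to_number(text):
    """Hypothetical helper: return an int or float, or None if text is not numeric."""
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            continue
    return None

parsed_int = to_number("3333")        # 3333
parsed_float = to_number("123.456")   # 123.456
parsed_bad = to_number("xxyy")        # None
```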

In [None]:
# Try it yourself!

# Try to convert the following into numbers. Which ones raise an error, and why?

int('a1')

int('12G', 16)

float('12345678901234567890')

int('12*2')

##### Using strip( ), lstrip( ), rstrip( ) to remove whitespaces

strip( ), lstrip( ), and rstrip( ) are simple methods for removing whitespace at the beginning or end of a string. strip( ) removes it from both ends, while lstrip( ) and rstrip( ) remove it from the left end and the right end respectively. Here are some examples:

e.g.

``` Python
x = " Hello, World\t\t"
x.strip()  # return 'Hello, World'

x.lstrip()  # return 'Hello, World\t\t'

x.rstrip()  # return ' Hello, World'
```

Note: Whitespace covers more than the space character. We can check which characters count as whitespace using string.whitespace from the **string** module. Here are the whitespace characters returned on a Windows operating system.

``` Python
import string
string.whitespace  # return the whitespace characters

" \t\n\r\v\f"  # return ' \t\n\r\x0b\x0c'
```

The returned values **\x0b** and **\x0c** are the **line tabulation** (vertical tab) and **form feed** characters. We should be careful not to change any of the whitespace characters in the original string when using strip( ), because it may return an unexpected result.

Another option is to pass the characters to be removed as an argument, so that strip( ), lstrip( ), and rstrip( ) remove those specific characters from the ends of a string.

e.g.
``` Python
x = "www.python.org"
x.strip("w")  # return '.python.org'

x.strip("gor")  # return 'www.python.'

x.strip('.gorw')  # return 'python'
```

Note: strip( ) treats its argument as a set of characters. Regardless of the order of the characters in the argument, it removes all of them from both ends of the string until it reaches a character that is not in the set.

rstrip( ) is extremely helpful when we are dealing with a text file with multiple lines, because it can effectively remove the newline and other whitespace at the end of each line.
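
The set-of-characters behaviour can be seen with a small made-up sample string. Characters in the middle of the string are left alone, because stripping stops at the first character that is not in the set.

```python
text = "xx-a-x-b-xx"   # hypothetical sample string

# The argument is a set of characters: order and repetition do not matter.
result_1 = text.strip("x-")    # 'a-x-b'
result_2 = text.strip("-x")    # 'a-x-b'

# Stripping stops at the first character not in the set, so the
# 'x' and '-' in the middle of the string are left alone.
left_only = text.lstrip("x-")  # 'a-x-b-xx'
right_only = text.rstrip("x-") # 'xx-a-x-b'
```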

In [1]:
# Check the whitespace characters
import string
string.whitespace

' \t\n\r\x0b\x0c'

In [2]:
" \t\n\r\v\f"

' \t\n\r\x0b\x0c'

In [None]:
# Using the strip( ) argument to remove specific characters from a string
x = "www.python.org"
x.strip("w") 

In [None]:
x.strip("gor")

In [None]:
x.strip('.gorw')

In [None]:
# Try it yourself!

# If string x = "(name, date),\n", which of the following returns the string "(name, date)"?

x.rstrip("),")

x.strip("),\n")

x.strip("\n)(,")

##### Using isdigit( ), isalpha( ), islower( ), and isupper( )

Python has a few useful methods for checking the content of a string, such as whether it contains only digits or only letters, or whether all of its letters are uppercase or lowercase. Here are some examples of these methods:

e.g.

``` Python
x = "123"

# Check if the string consists of digits only
x.isdigit()  # return True

# Check if the string consists of letters only
x.isalpha()  # return False

x = "ABC"

# Check if the characters in the string are all lowercase
x.islower()  # return False

# Check if the characters in the string are all uppercase
x.isupper()  # return True
```
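
A few edge cases are easy to get wrong, so the sketch below spells them out. The sample strings are made up for illustration.

```python
# Signs and decimal points are not digits, so these are all False.
negative_is_digit = "-3".isdigit()      # False
decimal_is_digit = "12.5".isdigit()     # False
empty_is_digit = "".isdigit()           # False

# isupper()/islower() look only at cased characters, but need at least one.
mixed_upper = "ABC1".isupper()          # True
digits_upper = "123".isupper()          # False
```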

## 3.5 - String Search

Python provides some built-in methods for string search.  

##### Using "in" keyword

We have mentioned that a string is a sequence data type, so we can use the "**in**" keyword to search for a substring in a string:

e.g.
``` Python
x = "The string"
"str" in x  # return True

"sTr" in x  # return False

"e s" in x  # return True
```
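
Since "in" is case-sensitive, a common workaround is to convert both strings to the same case before searching. A minimal sketch (the name needle is only illustrative):

```python
x = "The string"
needle = "sTr"

case_sensitive = needle in x                      # False
case_insensitive = needle.lower() in x.lower()    # True
```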



In [None]:
x = "The string"
"str" in x 

In [None]:
"sTr" in x

In [None]:
"e s" in x 

##### Using find( ), rfind( ), index( ), and rindex( )

We can also search within a string using some Python built-in methods, such as find( ), rfind( ), index( ), and rindex( ).

We introduce the find( ) method first. When using find( ), we pass in the substring to search for as an argument. The method returns the lowest index at which the substring is found in the given string. If the substring is not found, it returns -1.

e.g. 
``` Python
x = "Mississippi"
x.find("is")  # return 1

x.find("zz")  # return -1
```

The find( ) method has two optional arguments and can be called in the form find(substring, start, end), where both **start** and **end** must be integers. **start** is the index at which the search begins, and **end** is the index at which the search stops (the character at **end** itself is not included).

e.g.
``` Python
x = "Mississippi"
x.find("s")  # return 2

x.find("s", 2)  # return 2

x.find("s", 4)  # return 5

x.find("s", 4, 5)  # return -1

x.find("ss", 3)  # return 5

x.find("ss", 0, 3)  # return -1
```
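
The **start** argument makes it possible to collect every occurrence of a substring by resuming the search after each match. find_all( ) below is a hypothetical helper written for this sketch, not a built-in method.

```python
def find_all(text, sub):
    """Hypothetical helper: return the index of every occurrence of sub in text."""
    positions = []
    start = text.find(sub)
    while start != -1:
        positions.append(start)
        # Resume one character later so overlapping matches are found too.
        start = text.find(sub, start + 1)
    return positions

ss_positions = find_all("Mississippi", "ss")      # [2, 5]
issi_positions = find_all("Mississippi", "issi")  # [1, 4]
```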

The rfind( ) method works exactly like find( ), but searches in the opposite direction. It starts at the end of the string and moves toward the beginning, so it returns the position of the last occurrence of the substring.

e.g.
``` Python
x = "Mississippi"
x.rfind("ss")  # return 5
```

rfind( ) also accepts the same optional **start** and **end** arguments. They delimit the searched range exactly as in find( ); only the direction of the search within that range is reversed.

In [None]:
x = "Mississippi"
x.find("is")

In [None]:
x.find("zz")

In [None]:
x = "Mississippi"
x.find("s") 

In [None]:
x.find("s", 2)

In [None]:
x.find("s", 4) 

In [None]:
x.find("s", 4, 5)

In [None]:
x.find("ss", 3)

In [None]:
x.find("ss", 0, 3)

In [None]:
x = "Mississippi"
x.rfind("ss")

index( ) and rindex( ) are similar to find( ) and rfind( ), but index( ) and rindex( ) raise a ValueError when the substring cannot be found.

e.g.
``` Python
x = "Mississippi"
x.index("s")  # return 2

x.index("s", 4)  # return 5

x.index("ss", 0, 3)  # raises ValueError

x.rindex("ss")  # return 5
```
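
The two families of methods call for different checks. The sketch below shows a common pitfall with find( ) (a match at index 0 is falsy) and the try/except pattern that index( ) requires.

```python
x = "Mississippi"

# Pitfall: find() returns 0 for a match at the very start, and 0 is falsy.
# Compare against -1 instead of testing the result directly.
found_at_start = x.find("M")            # 0
is_found = x.find("M") != -1            # True

# index() signals "not found" with an exception instead of -1.
try:
    position = x.index("zz")
except ValueError:
    position = None
```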

In [1]:
x = "Mississippi"
x.index("s")

2

In [2]:
x.index("s", 4)

5

In [4]:
x.index("ss", 0, 3)

ValueError: substring not found

In [5]:
x.rindex("ss")

5

##### count( ) Method

The string object has a **count( )** method, which counts the non-overlapping occurrences of a substring in a string object. To use count( ), we pass the substring as the argument.

e.g.
``` Python
x = "Mississippi"
x.count("ss")  # return 2
```
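
Note that count( ) counts non-overlapping occurrences only, which the following sketch illustrates:

```python
# "aa" fits into "aaaa" at indexes 0, 1, and 2, but matches may not
# overlap, so only the ones at 0 and 2 are counted.
non_overlapping = "aaaa".count("aa")          # 2

# "issi" occurs at indexes 1 and 4 of "Mississippi"; the two overlap.
issi_count = "Mississippi".count("issi")      # 1
```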

In [None]:
x = "Mississippi"
x.count("ss")

##### startswith( ) and endswith( ) Methods

We can also use the startswith( ) and endswith( ) methods to check whether a string starts or ends with a specific substring. Both methods return a Boolean.

e.g.
``` Python
x = "Mississippi"
x.startswith("Miss")  # return True

x.startswith("Mist")  # return False

x.endswith("pi")  # return True

x.endswith("p")  # return False
```

Note: Both startswith( ) and endswith( ) allow searching for multiple substrings at once, but the substrings need to be passed to the methods as a tuple. Here is an example:

e.g.
``` Python
x.endswith(("i", "u"))  # return True
```


In [None]:
x = "Mississippi"
x.startswith("Miss")

In [None]:
x.startswith("Mist")

In [None]:
x.endswith("pi")

In [None]:
x.endswith("p") 

In [None]:
# Check if x ends with either i or u
x.endswith(("i", "u"))

In [None]:
# Try it yourself!

# If you want to check whether a line ends with "rejected",
# which method would you use, and is there another way to achieve the same task?



## 3.6 - String Modification

As mentioned earlier, a string is an immutable object in Python. However, there are a few methods that operate on a string object and return a new, modified string. These methods satisfy the demands of most everyday text operations.

First of all, we can use **replace( )** to replace every occurrence of a substring and return a new string. The first argument is the substring to be replaced and the second argument is its replacement. This method also has an optional third argument, which is described in the Python documentation.

e.g.
``` Python
x = "Mississippi"
x.replace("ss", "+++")  # return 'Mi+++i+++ippi'
```
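
For reference, the optional third argument limits how many occurrences are replaced, counting from the left. A short sketch:

```python
x = "Mississippi"

replace_all = x.replace("ss", "+++")        # 'Mi+++i+++ippi'
replace_first = x.replace("ss", "+++", 1)   # 'Mi+++issippi'

# The original string is unchanged because strings are immutable.
original_unchanged = (x == "Mississippi")
```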



In [None]:
x = "Mississippi"
x.replace("ss", "+++")

We can also use **maketrans( )** together with **translate( )** to create a one-to-one mapping from each character to its translation/replacement. These methods are not commonly used, but they can be useful when we are trying to simplify our code. Here is an example:

e.g.
``` Python
x = "~x ^ (y % z)"
table = x.maketrans("~^()", "!&{}")
x.translate(table)  # return '!x & {y % z}'
```
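
maketrans( ) can also be called with a single dictionary that maps each character to its replacement, which some readers find easier to read than two parallel strings. The mapping below is a made-up example.

```python
# A dictionary maps each single character to its replacement.
table = str.maketrans({"l": "1", "o": "0"})

translated = "hello world".translate(table)   # 'he110 w0r1d'
```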

In [None]:
x = "~x ^ (y % z)"
table = x.maketrans("~^()", "!&{}")
x.translate(table)

##### lower( ), upper( ), capitalize( ), title( ), swapcase( ), and expandtabs( )

The following methods are commonly used in text editing:

**lower( )**: returns a lowercased copy of the given string  
**upper( )**: returns an uppercased copy of the given string  
**capitalize( )**: converts the first character of the string to a capital letter and the remaining characters to lowercase, and returns a new string  
**title( )**: converts the first character of each word to a capital letter and the remaining characters to lowercase, and returns a new string  
**swapcase( )**: converts all uppercase characters in the given string to lowercase and vice versa, and returns a new string  
**expandtabs( )**: replaces each "\t" in the given string with spaces up to the next tab stop (the tab size defaults to 8 and can be passed as an argument), and returns a new string  

In [7]:
x = "This Is A Test"
x.lower()

'this is a test'

In [8]:
x.upper()

'THIS IS A TEST'

In [9]:
x = "this is a test"
x.capitalize()

'This is a test'

In [10]:
x.title()

'This Is A Test'

In [11]:
x = "ThIs Is A TeSt"
x.swapcase()

'tHiS iS a tEsT'

In [16]:
x = "this\tis\ta\ttest"
x.expandtabs()

'this    is      a       test'
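
Two details of these methods are sketched below with made-up strings: the optional tab size argument of expandtabs( ), and the way title( ) treats an apostrophe as a word boundary.

```python
x = "a\tb"

# Each tab is replaced by enough spaces to reach the next tab stop.
default_tabs = x.expandtabs()     # tab stops every 8 columns
narrow_tabs = x.expandtabs(4)     # tab stops every 4 columns: 'a   b'

# title() treats any non-letter, including an apostrophe, as a word boundary.
titled = "they're here".title()   # "They'Re Here"
```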

In [None]:
# Try it yourself!

# Try to use all of the methods to play around with different strings.



##### ljust( ), rjust( ), center( ), zfill( )

We can apply the following methods to pad a string with a specific character up to a given length.

**ljust( )**: returns a new string of the given length, padding the right side of the original string with a given character (a space by default)  
**rjust( )**: returns a new string of the given length, padding the left side of the original string with a given character (a space by default)  
**center( )**: returns a new string of the given length in which the original string is centered and padded on both sides with the specified character  
**zfill( )**: returns a copy of the string padded on the left side with '0' characters up to the given length

In [20]:
x = "test"
x.ljust(10)

'test      '

In [21]:
x.rjust(10)

'      test'

In [22]:
x.center(10)

'   test   '

In [23]:
x.zfill(10)

'000000test'
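
The cells above use the default fill character (a space). As a sketch of the optional fill argument, and of how zfill( ) handles a leading sign:

```python
x = "test"

left = x.ljust(10, ".")      # 'test......'
right = x.rjust(10, "*")     # '******test'
middle = x.center(10, "-")   # '---test---'

# zfill() keeps a leading sign in front of the zeros.
signed = "-42".zfill(6)      # '-00042'
```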

In [None]:
# Try it yourself!

# Try to use all of the methods to play around with different strings.

##### Using list to modify string object

Since a string is an immutable object in Python, we cannot modify it the way we modify a list object. Although the methods mentioned above create new string objects, sometimes we still want to edit the characters directly. We can achieve this by converting the string to a list, modifying the characters in the list, and finally joining the modified list of characters back into a string. Here is an example:

e.g. 
``` Python
text = "Hello, World"

# Change the string to a list of characters
wordList = list(text)

# Remove all characters after the comma
wordList[6:] = []

# Reverse the order of the characters
wordList.reverse()

# Join the characters in the list with no space
text = "".join(wordList)

# Print the text
print(text)  # prints ',olleH'
```

Note: We can also use tuple( ) to transform the string into a tuple and use "".join( ) to turn it back into a string. However, neither of the two is a recommended method for heavy text processing in Python, because the process keeps creating and deleting objects, which is relatively expensive. Efficiency can be significantly affected when the program is designed to process a massive amount of text (hundreds of millions of words).
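
For this particular example, the same result can be obtained with slicing alone, without the intermediate list. This is offered as an alternative sketch, not as the method used above.

```python
text = "Hello, World"

# Keep everything up to and including the comma, then reverse it.
# text[:6] is 'Hello,' and the [::-1] slice walks it backwards.
result = text[:6][::-1]    # ',olleH'
```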

In [None]:
text = "Hello, World"

# Change the string to a list of characters
wordList = list(text)

# Remove all characters after the comma
wordList[6:] = []

# Reverse the order of the characters
wordList.reverse()

# Join the characters in the list with no space
text = "".join(wordList)

print(text)

In [None]:
# Try it yourself!

# Try to replace all symbol characters in a string with spaces



## 3.7 - Transforming Object to a String: repr() and str()

In Python, almost everything can be transformed into a string by the built-in function repr( ). The repr( ) function returns a printable representational string of the given object. We can use the following example to demonstrate how to transform a list object into a string.

e.g.
``` Python
# Using repr( ) to transform a list object to a string
repr([1, 2, 3])  # return '[1, 2, 3]'

# Suppose we create a list and assign to x
x = [1]
x.append(2)
x.append([3, 4])

# Print a string with x
'the list x is ' + repr(x)  # return 'the list x is [1, 2, [3, 4]]'
```

In this example, we use repr( ) to transform a list object into a string and concatenate it with another string. If we use the "+" operator to join the string with the list object directly, Python cannot tell what we are trying to do and raises a TypeError.

repr( ) can be used to transform basically any Python object into a string. We can also try using repr( ) on some of the built-in objects in Python.

e.g. 

``` Python
repr(len)  # return '<built-in function len>'

repr(tuple)  # return "<class 'tuple'>"

repr(input)  # return '<bound method Kernel.raw_input of <ipykernel.ipkernel.IPythonKernel object at 0x00000259B7E39208>>'
```

repr( ) does not return the source code of the function; it only returns a string stating that len( ) is a built-in function in Python. The same goes for the tuple object: repr( ) only returns a string describing tuple as a class in Python. In general, repr( ) is an effective tool for debugging.

Python also has a built-in function str( ) to transform different objects into strings. The difference between repr( ) and str( ) is that repr( ) returns what Python calls the "formal string representation" of an object, which, for many built-in types, can be used to re-create the original object.

On the other hand, str( ) returns the "informal string representation", which generally cannot be used to re-create the original object. In simple terms, str( ) returns a string meant for people to read and repr( ) returns a string meant for Python to interpret. We can see the difference in the following example:

e.g.
``` Python
from datetime import datetime
now = datetime.now()

# Using print( ) to print the current date and time
print(now)

# Using str( ) to print the current date and time as string
str(now)

# Using repr( ) to interpret the datetime object 
repr(now)
```
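
The idea of "re-creating the original object" can be sketched with a list, whose repr is valid Python source. eval( ) is used here purely for illustration and should never be called on untrusted text. The second half shows how str( ) and repr( ) differ for a string that contains a newline.

```python
x = [1, 2, [3, 4]]

# For a list, the repr is valid Python source, so evaluating it
# re-creates an equal (but separate) list object.
# eval() is used here only for illustration; never call it on untrusted text.
text_form = repr(x)            # '[1, 2, [3, 4]]'
rebuilt = eval(text_form)

# For a string, str() returns the text itself, while repr() adds the
# quotes and escapes needed to write it as a literal.
s = "a\nb"
informal = str(s)              # the 3 characters a, newline, b
formal = repr(s)               # the 6 characters 'a\nb' (quotes included)
```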

Beginners often think the two functions are the same and are unsure when to use one rather than the other. The distinction becomes handy when we need to identify the type of an object in our code. In many cases, we apply str( ) to create a readable string and use repr( ) when we need to inspect the object's representation.

In [None]:
# Using repr( ) to transform a list object to a string
repr([1, 2, 3])

In [None]:
# Suppose we create a list and assign to x
x = [1]
x.append(2)
x.append([3, 4])

# Print a string with x
'the list x is ' + repr(x)

In [14]:
repr(len)

'<built-in function len>'

In [17]:
repr(tuple)

"<class 'tuple'>"

In [16]:
repr(input)

'<bound method Kernel.raw_input of <ipykernel.ipkernel.IPythonKernel object at 0x00000259B7E39208>>'

In [18]:
from datetime import datetime
now = datetime.now()

# Using print( ) to print the current date and time
print(now)

2020-05-07 13:23:52.897323


In [19]:
# Using str( ) to print the current date and time as string
str(now)

'2020-05-07 13:23:52.897323'

In [20]:
# Using repr( ) to interpret the datetime object 
repr(now)

'datetime.datetime(2020, 5, 7, 13, 23, 52, 897323)'

## 3.8 - Formatting a String

Python 3 has three different ways of formatting strings. In this section we first introduce format( ). format( ) is a powerful formatting tool in Python that provides multiple ways to format data into a string. We demonstrate a few basic formatting examples here:

Reference: https://docs.python.org/3/library/string.html#format-string-syntax

##### Placeholders and Numbered Indexes

format( ) is a method of the string object. The most common way to use it is to put empty placeholders in a string and let format( ) fill them with its arguments in order.

e.g. 
``` Python
# Using empty placeholders
"{} is the {} of {}".format("Ambrosia", "food", "the gods")  # return 'Ambrosia is the food of the gods'

# Using empty placeholders with curly brackets in the string
"{{Ambrosia}} is the {} of {}".format("food", "the gods")  # return '{Ambrosia} is the food of the gods'
```

In this example, the curly brackets { } are placeholders in the original string, which are replaced by the argument values in order. When a string needs to contain literal curly brackets, we double them ({{ and }}) so that they are not treated as placeholders.

Not only strings can be passed into format( ). Any Python object can be passed into format( ), and Python will transform it into a string.

e.g.
``` Python
# Passing in number and mathematical operation to format( )
"{} + {} = {}".format(1, 2, 1+2)  # return '1 + 2 = 3'

# Passing in a list object to format( )
x = [1, 2, 'three']
"The {} contains: {}".format("list", x)  # return "The list contains: [1, 2, 'three']"
```

We can also use numbered indexes to assign the arguments to specific placeholders in a string. Here is an example:

e.g.
``` Python
# Use number indexes
"{2} is the {0} of {1}".format("food", "the gods", "Ambrosia")  # return 'Ambrosia is the food of the gods'
```

In this example, we have three numbered indexes {0}, {1}, and {2}, which identify the positions of the arguments. Since the placeholders refer to the arguments by index, we can also repeat an index in the string. Here is an example:

e.g. 
``` Python
# Repeat the indexes
"{0} is not {1} and {1} is not {0}".format("abc", "cba")  # return 'abc is not cba and cba is not abc'
```
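
A few failure modes of positional placeholders are worth knowing. The sketch below catches the exceptions so that it runs from top to bottom; the variable names are illustrative.

```python
# Fewer arguments than placeholders raises IndexError.
try:
    "{} and {}".format("abc")
except IndexError as err:
    missing_error = type(err).__name__

# Empty {} and numbered {0} placeholders cannot be mixed in one string.
try:
    "{} and {0}".format("abc", "cba")
except ValueError as err:
    mixing_error = type(err).__name__

# Extra arguments, on the other hand, are silently ignored.
extra_ok = "{0}".format("abc", "cba")   # 'abc'
```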


In [None]:
# Using empty placeholders
"{} is the {} of {}".format("Ambrosia", "food", "the gods")

In [None]:
# Using empty placeholders with curly brackets in the string
"{{Ambrosia}} is the {} of {}".format("food", "the gods") 

In [None]:
# Passing in number and mathematical operation to format( )
"{} + {} = {}".format(1, 2, 1+2) 

In [None]:
# Passing in a list object to format( )
x = [1, 2, 'three']
"The {} contains: {}".format("list", x) 

In [None]:
# Use number indexes
"{2} is the {0} of {1}".format("food", "the gods", "Ambrosia") 

In [21]:
# Repeat the indexes
"{0} is not {1} and {1} is not {0}".format("abc", "cba")

'abc is not cba and cba is not abc'

##### Using named parameter

format( ) also allows us to fill a specific placeholder by name, using keyword arguments. Here is an example:

e.g.
``` Python
# Use the named parameter
"{food} is the food of {user}".format(food="Ambrosia", user="the gods")  # return 'Ambrosia is the food of the gods'
```
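
When the values already live in a dictionary, they can be passed as named parameters by unpacking the dictionary with **. The dictionary below is a made-up record used only for this sketch.

```python
# Hypothetical record whose keys match the placeholder names.
data = {"food": "Ambrosia", "user": "the gods"}

# ** unpacks the dictionary into keyword arguments for format().
sentence = "{food} is the food of {user}".format(**data)

# Positional and named placeholders can be combined in one string.
mixed = "{0} is the {kind} of {1}".format("Ambrosia", "the gods", kind="food")
```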

