# Strings and Characters in Python
In computer science, a **character** is a letter, a digit or a symbol. A sequence of characters is called a **string**.<br />
Strings are declared between **"** or between **'**, and their type is called **str**.

In [46]:
# Declaration of a string using ""
type("Hello, World!")

str

In [47]:
# Declaration of a string using ''
type('Hello, World!')

str

There is **no practical difference** between strings declared using " and ', as long as they have the same value.

In [48]:
# Comparison between strings declared with "" and ''
"Hello, World!" == 'Hello, World!'

True

Unlike some other programming languages, **Python does not have a character type**. Any isolated character is simply a **string** of length 1.<br /> 
Characters are therefore of lesser importance.

In [49]:
# Declaration of a single character using ''
type('c')

str

However, characters can still be useful.<br />
The function **'ord()'** returns the **Unicode value** (code point) of a character.<br />
Conversely, the function **'chr()'** returns the character corresponding to a Unicode value.

In [50]:
# Unicode value of character c
ord('c')

99

In [51]:
# Character corresponding to Unicode value 99
chr(99)

'c'

In [52]:
# But the function chr still returns a string
type(chr(99))

str
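
Because 'ord()' and 'chr()' undo each other, they can be combined to move from one character to another. The short sketch below is only an illustration; the variable names are made up for the example.

```python
# ord() and chr() undo each other
letter = 'c'
code = ord(letter)
same_letter = chr(code)

# Moving to the next character by adding 1 to the Unicode value
next_letter = chr(ord(letter) + 1)

print(code, same_letter, next_letter)   # 99 c d
```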

## The importance of the backslash
### Escaping
One of the most critical aspects of declaring a string is paying attention to its content.<br />
For example, if a string is declared with **'**, what happens if the string itself contains a '?<br />

In [53]:
string = 'I'm very happy to learn Python'

SyntaxError: invalid syntax (<ipython-input-53-efc1f3453059>, line 1)

The simplest answer would be to use the other option to declare the string.<br />
In this case **"**.

In [54]:
string = "I'm very happy to learn Python"
string

"I'm very happy to learn Python"

Works like a charm! However, there is a more universal solution to this problem: **escaping**.<br />
Placing a backslash (**\\**) before the ' or the " tells Python to treat it as an ordinary character instead of the end of the string.

In [55]:
string = 'I\'m very happy to learn Python'
string

"I'm very happy to learn Python"

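As a quick check, the sketch below compares the escaped spelling with the one that uses the other quote character, and applies the same idea to a " inside a string declared with ". The variable names are only illustrative.

```python
# Two ways of writing the same string
escaped_single = 'I\'m very happy to learn Python'
plain_double = "I'm very happy to learn Python"

# Escaping also works for " inside a string declared with "
quote = "She said \"Hello!\""

print(escaped_single == plain_double)   # True
print(quote)                            # She said "Hello!"
```
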
### Formatting
The backslash is also used to perform **formatting** within a string: followed by certain letters, it stands for a special character such as a tab or a new line.

In [56]:
# Using \t for tab
"Hello! \tGoodbye!"

'Hello! \tGoodbye!'

At first glance, it seems to do nothing. That is because in **interactive mode** (Jupyter Notebook or Python console), evaluating a string without printing it displays its representation, in which the \t is shown as it was typed.<br />
For the formatting to be rendered, the string needs to be displayed with the function **"print()"**.

In [57]:
# Using \t for tab, and seeing the result
print("Hello! \tGoodbye!")

Hello! 	Goodbye!
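
The difference between the two displays can be reproduced explicitly with the built-in 'repr()' function. The sketch below also shows that \t counts as a single character, even though it is typed with two. The variable names are only illustrative.

```python
text = "Hello! \tGoodbye!"

# repr() gives the same representation that interactive mode displays
shown = repr(text)

# The escape sequence \t is a single character inside the string
length = len(text)

print(shown)    # 'Hello! \tGoodbye!'
print(length)   # 16
```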


In [58]:
# Using \b for backspace
print("Hello! \bGoodbye!")

Hello! Goodbye!


In [59]:
# Using \n for new line
print("Hello! \nGoodbye!")

Hello! 
Goodbye!


In [60]:
# Using \r for carriage return
print("Hello!\rGoodbye!")

Hello!Goodbye!


Since \ has a particular meaning within a string, having a backslash in your string can cause problems.

In [61]:
print("The date is 30\11\2020 ")

The date is 30	0 


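Here, \11 was read as an octal escape (character number 9, a tab) and \202 as another octal escape (an invisible control character), leaving only the final 0.<br />
The solution is to **escape the backslash** as well, writing two consecutive backslashes.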

In [62]:
print("The date is 30\\11\\2020 ")

The date is 30\11\2020 
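
When a string contains many backslashes, a raw string (prefix r) is a common alternative, because backslashes inside it are kept as they are. The sketch below is a minimal comparison of both spellings; it also checks how Python reads \11 when it is not escaped.

```python
escaped = "The date is 30\\11\\2020"
raw = r"The date is 30\11\2020"

# Both spellings produce exactly the same string
same = (escaped == raw)

# Without escaping, \11 is read as an octal escape: character number 9, a tab
is_tab = ("\11" == "\t")

print(raw)            # The date is 30\11\2020
print(same, is_tab)   # True True
```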


## String comparison
It is possible to check whether two strings are equal using the **==** operator, and whether they differ using the **!=** operator.

In [63]:
#Checking equality
"Hello" == "Goodbye"

False

In [64]:
#Checking inequality
"Hello" != "Goodbye"

True
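
The **is** operator answers a different question from **==**: it checks whether two names refer to the very same object in memory, not whether two strings have the same value. For comparing the content of strings, == and != are the operators to use. A minimal sketch, with variable names chosen only for illustration:

```python
first = "Hello, World!"
# Build an equal string at run time, so that it is a separate object
second = "".join(["Hello, ", "World!"])

print(first == second)   # True: same value
print(first is second)   # False in CPython: two distinct objects
```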

It is possible to know if a substring is inside a string using the **in** operator.

In [65]:
#Checking if a substring is in a string
"friend" in "Hello, my friend!"

True

In [66]:
#Checking if a substring is not in a string
"friend" not in "Hello, my friend!"

False
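
Two details of the **in** operator are worth knowing: it distinguishes upper and lower case, and the empty string is considered to be contained in every string. A small sketch with illustrative names:

```python
sentence = "Hello, my friend!"

# The in operator distinguishes upper and lower case
exact = "friend" in sentence
wrong_case = "Friend" in sentence

# The empty string is contained in every string
empty = "" in sentence

print(exact, wrong_case, empty)   # True False True
```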

## String manipulation
### String length and string slicing
A string behaves much like a **list of characters**, except that it cannot be modified in place.<br /> 
As such, the function **"len()"** can be used to determine the number of characters included in a string.

In [67]:
# Displaying number of characters in string
len("foo")

3

Like a list, a string can be **sliced** using the indices of its characters. Indices start at 0, and the end index is excluded.

In [68]:
# Keeping letters 3 to 6 of a string
string = "foobar"
string[2:6]

'obar'

It is also possible to give a **step** when slicing, to keep only one letter out of every few.

In [69]:
# Keeping letters 2 to 8 of a string, keeping only 1 letter out of 3
string = "foobarfoo"
string[1:8:3]

'oao'
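
Each of the three parts of a slice (start, end, step) can be omitted, and indices and steps can be negative. The sketch below shows a few common combinations on an illustrative string.

```python
word = "foobar"

first_three = word[:3]    # start omitted: from the beginning
last_three = word[-3:]    # negative index: counted from the end
every_other = word[::2]   # only a step: one letter out of 2
backwards = word[::-1]    # negative step: reversed string

print(first_three, last_three, every_other, backwards)
# foo bar foa raboof
```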

## Exercise
Slice a string made of all consecutive letters of the alphabet to generate the following substring:<br /> 
cfilorux

In [3]:
# The alphabet string can be imported as follows
import string
alphabet = string.ascii_lowercase
print(alphabet)

# TODO


abcdefghijklmnopqrstuvwxyz


### Concatenation
String concatenation is the action of **putting strings back to back**, literally adding one to the other.<br /> 
In Python, this is done using the operator **+**.

In [70]:
# Concatenating two strings
'foo' + 'bar'

'foobar'
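
The + operator only concatenates strings with strings. Adding a number to a string raises a TypeError, so the number has to be converted with 'str()' first. The sketch below demonstrates this; the variable names are only illustrative.

```python
year = 2020

try:
    # Concatenating a string and an integer is not allowed
    message = 'The year is ' + year
except TypeError:
    # Converting the integer to a string first solves the problem
    message = 'The year is ' + str(year)

print(message)   # The year is 2020
```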

### Duplication
If a string needs to be repeated a certain number of times, the operator * can be used.

In [1]:
# Duplicating the string 'foo' 4 times
# TODO


Interestingly, Python accepts a negative number of repetitions. However, it then returns an empty string.

In [72]:
'foo' * -2

''
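
As an illustration with another string, the sketch below uses the * operator to draw a separator line, and shows that the number can be written on either side of the operator.

```python
# Repeating a one-character string to draw a separator line
line = '-' * 10

# The number can be placed on either side of the operator
also_line = 10 * '-'

# Zero repetitions give an empty string, like a negative number
nothing = 'foo' * 0

print(line)                # ----------
print(line == also_line)   # True
print(repr(nothing))       # ''
```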

### Replacing
Part of a string can be replaced using the **"replace()"** function, which takes the substring to replace and its replacement as arguments:

In [73]:
# Replace 'Hello' in the following string by 'Goodbye'
string = 'Hello, my friend!'
# TODO


'Goodbye, my friend!'
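
A few properties of "replace()" are easier to see on an example. The sketch below, written with a different sentence, shows that every occurrence is replaced by default, that an optional third argument limits the number of replacements, and that the original string is left untouched.

```python
sentence = 'one fish, two fish, red fish'

# Every occurrence is replaced by default
all_replaced = sentence.replace('fish', 'cat')

# The optional third argument limits the number of replacements
first_only = sentence.replace('fish', 'cat', 1)

print(all_replaced)   # one cat, two cat, red cat
print(first_only)     # one cat, two fish, red fish
print(sentence)       # one fish, two fish, red fish
```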

## String built-in functions
### Format
The **"format()"** function makes it easy to **insert values into a string**. Each value is inserted where a **{}** placeholder appears in the string.

In [1]:
"The year is {}".format(2020)

'The year is 2020'

This function also allows values to be **placed by name**. For this, you just have to put a name inside the {}.

In [2]:
"The date is {day}/{month}/{year}".format(day=30, month=12, year=2020)



'The date is 30/12/2020'
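
Placeholders can also refer to arguments by position, and a format specification after a colon controls how a value is written. The sketch below is a small illustration of both; the values are arbitrary.

```python
# Numbers inside {} refer to the position of the arguments
by_position = "{1} comes after {0}".format("April", "May")

# A format specification after ':' controls how a value is written;
# here 02d pads an integer with zeros up to 2 digits
padded = "The date is {:02d}/{:02d}/{}".format(5, 3, 2020)

print(by_position)   # May comes after April
print(padded)        # The date is 05/03/2020
```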

### Lower case
The **"lower()"** function converts all characters in the string to lower case.

In [80]:
# Put the string in lower case
string = "HELP, MY CAPS LOCK KEY IS BROKEN"
# TODO


'help, my caps lock key is broken'
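
A common use of "lower()" is to compare two strings without taking case into account. A minimal sketch, assuming two illustrative values:

```python
typed = "PyThOn"
expected = "python"

# A direct comparison is case-sensitive
same_exactly = (typed == expected)

# Converting both sides to lower case ignores the case
same_ignoring_case = (typed.lower() == expected.lower())

print(same_exactly, same_ignoring_case)   # False True
```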

### Split
The **"split()"** function breaks a string into a list of substrings, cutting at every occurrence of a given separator.

In [88]:
# Split the string every '/'
string = '30/11/2020'
# TODO


['30', '11', '2020']
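
Two variations of "split()" are often useful: called without any argument it cuts on whitespace, and an optional second argument limits the number of cuts. The strings in the sketch below are only illustrative.

```python
sentence = "  Hello,   my friend!  "

# Without argument, split() cuts on any run of whitespace
words = sentence.split()

# A second argument limits the number of cuts
parts = "key=value=more".split('=', 1)

print(words)   # ['Hello,', 'my', 'friend!']
print(parts)   # ['key', 'value=more']
```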

### Join
Conversely, the **"join()"** function concatenates every string in a list, separated by a given substring. However, having a non-string element in the list will lead to an error.

In [89]:
list_to_join = ['April', 'May', 'June']
'/'.join(list_to_join)

'April/May/June'
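
When the list contains elements that are not strings, one possible workaround is to convert each element with 'str()' before joining. A minimal sketch, with an illustrative list of integers:

```python
numbers = [30, 11, 2020]

try:
    # Joining integers directly is not allowed
    date = '/'.join(numbers)
except TypeError:
    # Converting each element to a string first solves the problem
    date = '/'.join(str(number) for number in numbers)

print(date)   # 30/11/2020
```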

### Capitalize
The **"capitalize()"** function returns a copy of the string with only its first character capitalized.

In [74]:
'hello, my friend! goodbye!'.capitalize()

'Hello, my friend! goodbye!'
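
Note that "capitalize()" also converts the remaining characters to lower case. To capitalize the first letter of every word instead, the "title()" function exists. The sketch below illustrates both on made-up strings.

```python
shouted = 'hELLO, MY FRIEND!'

# capitalize() also converts the remaining characters to lower case
capitalized = shouted.capitalize()

# title() capitalizes the first letter of every word instead
titled = 'hello, my friend!'.title()

print(capitalized)   # Hello, my friend!
print(titled)        # Hello, My Friend!
```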

### Find
The **"find()"** function returns the lowest index in the string where a substring is found. If the substring is not contained in the string, this function will return -1.

In [75]:
# Find the lowest index of the word 'friend' in the string
string = 'Hello, my friend! goodbye my friend!'
# TODO


10

In [76]:
"Hello, World!".find('friend')

-1

In addition, the **"rfind()"** function returns the highest index (it is a reverse search).

In [77]:
# Finds the highest index of the word friend in the string
string = 'Hello, my friend! goodbye my friend!'
# TODO


29

Both "find()" and "rfind()" have an alternative, respectively **"index()"** and **"rindex()"**.<br /> 
However, "index()" and "rindex()" will give an **error if the substring is not contained in the string**.

In [78]:
"Hello, World!".index('friend')

ValueError: substring not found
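
The choice between the two families mostly depends on how a missing substring should be handled. The sketch below shows both styles side by side; the variable names are only illustrative.

```python
text = "Hello, World!"

# With find(), a missing substring is signalled by -1
position = text.find('friend')
found_with_find = (position != -1)

# With index(), it is signalled by an exception
try:
    text.index('friend')
    found_with_index = True
except ValueError:
    found_with_index = False

print(found_with_find, found_with_index)   # False False
```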

### Count
The **"count()"** function returns the number of times a substring occurs within a string.

In [79]:
# Find the number of occurrences of the substring 'friend' in a string
string = 'Hello, my friend! goodbye my friend!'
# TODO


2
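
Two details of "count()" can be surprising: occurrences are counted without overlap, and the search is case-sensitive. A small sketch on illustrative strings:

```python
# Occurrences are counted without overlap
overlapping = 'aaaa'.count('aa')

# count() is case-sensitive
case = 'Friend or friend?'.count('friend')

print(overlapping)   # 2
print(case)          # 1
```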

### Upper case
The **"upper()"** function converts all characters in the string to upper case.

In [81]:
# Put the string in upper case
string = "help, i cannot find my shift key"
# TODO


'HELP, I CANNOT FIND MY SHIFT KEY'

### Swap case
The **"swapcase()"** function converts upper case characters to lower case, and lower case characters to upper case.

In [82]:
# Swap the case of the string
my_string = "i REALLY SHOULD LEARN TO TYPE WITHOUT LOOKING AT THE KEYBOARD"
# TODO


'I really should learn to type without looking at the keyboard'

### Removing trailing and leading spaces
The **"strip()"** function returns a copy of the string with leading and trailing whitespace removed.

In [83]:
# Removing trailing and leading spaces
string = " Hello, World! "
# TODO


'Hello, World!'

In [84]:
# Only removing leading spaces
" Hello, World! ".lstrip()

'Hello, World! '

In [85]:
# Only removing trailing spaces
" Hello, World! ".rstrip()

' Hello, World!'

The **"strip()"**, **"lstrip()"** and **"rstrip()"** functions take an optional argument that specifies which characters to remove instead of whitespace.

In [86]:
# Removing trailing and leading 'H'
"Hello, World! ".strip('H')

'ello, World! '

In [87]:
# Removing trailing and leading 'H' and '!'
"Hello, World!".strip('H!')

'ello, World'
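
The optional argument is treated as a set of characters, not as a prefix or suffix: any of its characters is removed from both ends, in any order, until a character that is not in the set is met. The sketch below illustrates this on made-up strings.

```python
# Every character of the argument is removed from both ends, in any order
domain = 'www.example.com'.strip('cmow.')

# Characters in the middle of the string are never removed
inner_kept = 'xxhexxlloxx'.strip('x')

print(domain)       # example
print(inner_kept)   # hexxllo
```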