## Strings are sequences of characters

*Strings* or *character strings* consist of single characters in a certain sequence. 
Python knows several data types in which multiple elements are stored in a particular order. Such data types are called *sequence types*, and they can all be used in a similar way. Many of the functions and methods we will learn about in connection with strings can later be applied to the other sequence types as well. Among them are:

* Strings
* bytes
* byte arrays
* lists
* tuples

Some of these operations, for example `len()`, also work with other data structures such as dictionaries or sets.
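As a small sketch of what "used in a similar way" means, the following cell applies `len()`, indexing, and slicing to one value of each sequence type from the list above. The name `samples` and the values in it are made up for illustration only.

```python
# One sample value per sequence type (values chosen only for illustration)
samples = [
    "abc",                 # str
    b"abc",                # bytes
    bytearray(b"abc"),     # bytearray
    ["a", "b", "c"],       # list
    ("a", "b", "c"),       # tuple
]

for seq in samples:
    # len(), indexing and slicing are written the same way for every type
    print(type(seq).__name__, len(seq), seq[0], seq[1:])
```

The printed values differ in detail (a single element of a `bytes` object is a number, for instance), but the syntax is identical for all five types.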

In [None]:
from IPython.display import Image
Image("img/Python-data-structure.jpg")

But let's start with a first sequence type: strings.

Strings are ordered sequences of characters. 

![string1.png](img/string1.png)

A string is created by enclosing a value in quotes (whether single or double):

~~~
name = 'Santa Claus'
name = "Santa Claus"
~~~

We have already learned this and other basics about strings in the notebook on data types.

### Determine the length of the sequence
The number of elements in the sequence (i.e. the number of characters in the string) can be determined with the function `len()`:

In [None]:
sentence = 'A string is a chain of characters.'
len(sentence)

### Addressing individual elements
Each element in the sequence can be addressed individually by its position:

![String2.png](img/string2.png)

Note that the first element of the sequence has index 0!

In [None]:
# For the following examples a short string is clearer
sentence = "A string"
sentence[0]

<div class="alert alert-block alert-info">
<b>Exercise 1</b>
    

Try to address more characters from the string in the code cell above. For example, output the element with index 1. You will see that a space is a normal character in the string.

</div>

<div class="alert alert-block alert-info">
<b>Exercise 2</b>
    
What happens if we try to access `sentence[10]` (a non-existing character)?

</div>



In [None]:
sentence[10]
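The valid index numbers run from `0` to `len(sentence) - 1`. Anything beyond that raises an `IndexError`. The sketch below catches the error with `try`/`except` (a construct not covered in this section) only so that the cell keeps running; the names `last_valid` and `raised` are illustrative.

```python
sentence = "A string"

# Valid indexes run from 0 to len(sentence) - 1
last_valid = len(sentence) - 1
print(last_valid)    # 7

raised = False
try:
    sentence[10]     # there is no character at index 10
except IndexError as error:
    raised = True
    print('IndexError:', error)
```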

### Negative index numbers
Python has a nice feature that has since been adopted by other languages: You can use negative index numbers to move from right to left in the sequence. `-1` is the last element,
`-2` the second to last and so on.

![string3.png](img/string3.png)

In [None]:
sentence[-1]

Without the negative indexing we would always have to determine the length of the string first and calculate the index of the last element starting from this value:

In [None]:
sentence[len(sentence)-1]

So, ``sentence[-1]`` is easier to use.
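The relation between the two notations can be checked for every position, not just the last one. A minimal sketch (the names `last` and `second_to_last` are chosen for illustration):

```python
sentence = "A string"

# Counting k steps from the right gives the same character as the
# computed index len(sentence) - k
for k in range(1, len(sentence) + 1):
    assert sentence[-k] == sentence[len(sentence) - k]

last = sentence[-1]
second_to_last = sentence[-2]
print(last, second_to_last)    # g n
```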

<div class="alert alert-block alert-info">
<b>Exercise 3</b>
<p>
Experiment with different index values.
</p>
</div>

### Slicing: Cutting out a substring
By specifying two values separated by a colon, you can extract a substring from a string. The first value is the index of the first character to be included; the second value is the index of the first character that is no longer included:

![string4.png](img/string4.png)

In [None]:
sentence[0:3]

If the first value is `0`, it can be omitted:

In [None]:
sentence[:3]

If the second value is omitted, it is equivalent to "to the end of the string":

In [None]:
sentence[3:]
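Because the second value is excluded, two slices that share a boundary fit together without overlap. The following sketch illustrates this with the short example string; `head` and `tail` are illustrative names.

```python
sentence = "A string"

head = sentence[:3]    # characters at index 0, 1 and 2
tail = sentence[3:]    # everything from index 3 to the end

print(repr(head))      # 'A s'
print(repr(tail))      # 'tring'

# The two parts add up to the original string
print(head + tail == sentence)    # True

# For indexes inside the string, a slice [start:stop] has stop - start characters
print(len(sentence[1:4]))         # 3
```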

<div class="alert alert-block alert-info">
<b>Exercise 4</b>
    
<p>
Slice the <tt>sentence</tt> in a way that the result is <tt>stri</tt>.
</p>
</div>

In [None]:
sentence[2:6]

Slicing also works with negative index values:

In [None]:
sentence[-4:]

<div class="alert alert-block alert-info">
<b>Exercise 5</b>
    
<p>
Slice the <tt>sentence</tt> in a way that the result is <tt>stri</tt>. Use negative values.
</p>
</div>

In [None]:
sentence[-6:-2]

Inside the square brackets there can be a third numeric value that defines the step size:

In [None]:
print(sentence)
sentence[0:8:2]

If the third value is set to `2`, only every second character is taken into account, with `3` only every third character and so on.

This third value can also be negative, which reverses the direction. This makes it very easy to "flip" a string:

In [None]:
sentence[::-1]
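The effect of different step sizes can be compared side by side. This is a small sketch with illustrative variable names; the palindrome check at the end is just one possible use of a reversed string.

```python
sentence = "A string"

every_second = sentence[::2]    # index 0, 2, 4, 6
every_third = sentence[::3]     # index 0, 3, 6
backwards = sentence[::-1]      # from the last character to the first

print(every_second)       # Asrn
print(every_third)        # Atn
print(repr(backwards))    # 'gnirts A'

# A word is a palindrome if it reads the same in both directions
word = "level"
print(word == word[::-1])    # True
```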

## Strings are immutable

Unlike some other data types that we will learn about a little later, a string cannot be changed once it has been created. Whatever looks like a change produces a new string object with its own identity, which can be checked with `id()`.

In [None]:
mystring = 'foo'
print(id(mystring))
mystring = 'bar'
print(id(mystring))

Adding a character, for example, also results in a new object:

In [None]:
mystring = 'foo'
print(id(mystring))
mystring = mystring + 'x'
print(id(mystring))
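Immutability also means that a single character cannot be overwritten through its index. The sketch below shows the resulting `TypeError` (caught with `try`/`except` only to keep the cell running) and one way of building a new string instead; the name `changed` is illustrative.

```python
mystring = 'foo'

try:
    mystring[0] = 'b'    # strings do not support item assignment
except TypeError as error:
    print('TypeError:', error)

# To get a "changed" string, build a new one from pieces of the old one
changed = 'b' + mystring[1:]
print(changed)     # boo
print(mystring)    # foo (the original is untouched)
```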

## String methods

The string data type (``str``) provides a set of methods, i.e. functionality that can be applied to a string. As we will learn in the section on object orientation, methods are functions bound to an object. For now, we just need to know how to call a method: append a period and the name of the method to the value (or to the variable referring to the value), followed by parentheses. Here is a simple example:

In [None]:
'abc'.upper()

### upper() and lower()

These two methods create a new string with all characters of the original string converted to upper and lower case respectively.

In [None]:
sentence = 'I am a string'
print(sentence.upper())
print(sentence.lower())

### replace()

The `replace()` method returns a new string in which all occurrences of a character (or substring) are replaced by another character (or substring). The first argument is the substring to be replaced, the second is the replacement:

In [None]:
sentence = 'I am a string'
sentence.replace('a', 'y')

In [None]:
'I am a string'.replace('a string', 'a cat')
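Since strings are immutable, `replace()` leaves the original string as it is. A short sketch of what that means in practice (the name `result` is illustrative):

```python
sentence = 'I am a string'

result = sentence.replace('a', 'y')
print(result)      # I ym y string
print(sentence)    # I am a string (unchanged)

# To keep the result, bind it to a name (here the same name again)
sentence = sentence.replace('a string', 'a cat')
print(sentence)    # I am a cat
```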

### find()

The `find()` method returns the position (index) of the first occurrence of the searched character in the string.

In [None]:
sentence = 'I am a string'
sentence.find('I')

`find()` also works with more than one character:

In [None]:
sentence.find('am')

If the searched character or string is not found, `find()` returns `-1`:

In [None]:
sentence.find('y')
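The return value `-1` deserves some care, because `-1` is also a valid index (the last character). A minimal sketch of checking the result before using it follows; the names `position` and `missing` are illustrative, and the `in` operator at the end is an addition that is not covered in the text above.

```python
sentence = 'I am a string'

position = sentence.find('string')
missing = sentence.find('cat')

print(position)    # 7
print(missing)     # -1

# Compare the result with -1 before using it as an index
if missing == -1:
    print("'cat' does not occur in the sentence")

# If only a yes/no answer is needed, the in operator is enough
print('am' in sentence)    # True
```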

The data type `str` knows many more methods, which we will get to know bit by bit.

## String Formatting / String Templates
Until now, if we wanted to print more than one value, we passed the values to the `print()` function separated by commas:

~~~
print('foo', 'bar')
~~~

`print()` accepts any number of arguments and writes them separated by a space. This is handy for quick output, but it gives little control over the formatting. Normally, you should use one of the methods presented below for such cases.
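For reference, here is a small sketch of how `print()` treats several arguments. The keyword argument `sep` is part of the built-in `print()` function; the sample values are arbitrary.

```python
# print() accepts any number of arguments and puts a space between them
print('foo', 'bar')             # foo bar

# The keyword argument sep changes the separator
print('foo', 'bar', sep='-')    # foo-bar

# A tuple is only printed if a tuple is passed as a single argument
print(('foo', 'bar'))           # ('foo', 'bar')
```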

Python knows several ways to include variables or expressions in a string.

### f-Strings
f-strings (*formatted string literals*) have been available since Python 3.6 and are the recommended way of formatting strings. (The examples therefore only work with Python 3.6 or later.) Their great advantage is that variables or expressions can be written directly inside the curly braces, at the place where their value should appear. An f-string is marked by an `f` in front of the opening quote.

In [None]:
name = input('Input your name: ')
age = int(input('Input your age: '))
print(f'Hi {name}. You are approximately {age * 365.25} days old.')

A very detailed description of f-strings can be found at https://docs.python.org/3/reference/lexical_analysis.html#f-strings. Since this may be too formal for beginners, you may prefer to search the web for 'python f-string'; you will find a number of simple introductions.
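The curly braces are not limited to plain variables. The following sketch uses fixed sample values instead of `input()` so that it runs without interaction; the values and the names `greeting` and `shout` are made up for illustration.

```python
# Fixed sample values instead of input()
name = 'Ada'
age = 30

# An arithmetic expression inside the curly braces
greeting = f'Hi {name}. You are approximately {age * 365.25} days old.'
print(greeting)    # Hi Ada. You are approximately 10957.5 days old.

# Method calls and function calls work as well
shout = f'{name.upper()} has {len(name)} letters.'
print(shout)       # ADA has 3 letters.
```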

### format()
A somewhat older but still widely used way of formatting strings is the string method `format()`. It is available in every Python 3 version (it was introduced with Python 2.6 and 3.0). In the simplest form, each pair of curly braces is replaced by a value, in the order of the arguments to `format()`.

The parameters of the ``format()`` method can be:

* Literals: ``"=> {}".format("Santa Claus")`` (a literal does not really make sense here, since it could be written directly into the string)
* Variables: ``"=> {}".format(name)``
* Expressions: ``"=> {}".format(age * 365.25)``
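Besides empty braces, `format()` also understands position numbers and names inside the braces. This goes slightly beyond the cases listed above; the sketch uses made-up sample values and illustrative variable names.

```python
name = 'Ada'
age = 30

# Empty braces are filled from left to right
by_order = '{} is {} years old.'.format(name, age)
print(by_order)    # Ada is 30 years old.

# Numbers in the braces refer to argument positions and may repeat
by_index = '{0}, {0}, {1}!'.format('ho', 'hey')
print(by_index)    # ho, ho, hey!

# Names in the braces refer to keyword arguments
by_name = '{who} is {years} years old.'.format(who=name, years=age)
print(by_name)     # Ada is 30 years old.
```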

The output can also be formatted. Here, for example, `:.2f` specifies that the floating point number is output with two decimal places:

In [None]:
print("Hi {}! You are approx. {:.2f} days old.".format(name, age * 365.25))

If needed, you can find all formatting options here: https://docs.python.org/3/library/string.html#formatstrings

### Formatting via the % operator

The oldest kind of string formatting works, as in other languages, via the `%` operator. This type is still widely used, but is no longer recommended.
Here, `%s` stands as a placeholder for a string, `%d` as a placeholder for an integer, and `%f` for a float:

In [None]:
print('Hi %s! You are approx. %f days old!' % (name, age * 365.25))
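The three placeholders can be combined in one string, and `%f` accepts a precision just like `format()` does. A short sketch with made-up sample values (the name `line` is illustrative):

```python
name = 'Ada'
age = 30

# %s inserts a string, %d an integer, %.2f a float with two decimal places
line = 'Hi %s! You are %d years old, approx. %.2f days.' % (name, age, age * 365.25)
print(line)    # Hi Ada! You are 30 years old, approx. 10957.50 days.

# Without a precision, %f prints six decimal places
print('%f' % (age * 365.25))    # 10957.500000
```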

<div class="alert alert-block alert-info">
<b>Exercise 6</b>
    
<p>
Write a program that: 
<ol>

<li>Prompts for the input of a name </li>
<li>Assigns this input to a variable </li>
<li>Generates the following output: </li>
    </ol>
<pre>Your name is XYZ and consists of n characters.
It starts with X and ends with Z. 
Read backwards, it is ZYX.</pre>

</p>
</div>