# Compound data types: Strings

In `Python`, strings are a compound data type similar to `numpy` arrays. The building blocks of a string are individual characters.

- They are provided by the built-in `str` type, which is always available when you start `python`; no import is needed. Besides the string data type itself, `str` provides methods to operate on strings.
- Strings form a *homogeneous* array of individual characters.

In [10]:
# The str type is built in and available as soon as you start a Python kernel.

# The dir command shows you all available functions/methods and variables
# of a type, object, or module
#dir(str)

# To obtain help on a specific method of str:
#help(str.replace)

In [6]:
message = "Hello Thomas"

print(type(message))
print(len(message))

<class 'str'>
12


## Basic String creation

Strings are usually written directly as literals or built with format strings (see the section on f-strings).

In [None]:
# Strings can be enclosed within single or double quotes.
# Both are equivalent.
message_1 = "Hello Thomas"
message_2 = 'Hello World'

print(message_1)
print(message_2)

In [None]:
# The different quotes can be used to include the other quote within a string:
message_1 = "Bob's results are impressive."

# another possibility is to escape the quote with a backslash:
message_2 = 'Bob\'s results are impressive.'

print(message_1, message_2)

In [None]:
# multiline strings are created with triple quotes:
message = """This is a longer message
over several lines."""

print(message)
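
A small sketch of what a multiline string contains: the line break is stored as an ordinary newline character `\n`, and the standard `splitlines` method recovers the individual lines. The variable names `lines` and `has_newline` are only illustrative.

```python
message = """This is a longer message
over several lines."""

# The line break is stored as the newline character "\n".
has_newline = "\n" in message

# splitlines() returns the individual lines as a list of strings.
lines = message.splitlines()

print(has_newline)
print(lines)
```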

## Accessing string elements

Accessing string elements works the same way as for `numpy` arrays (element access by index, slicing).

In [None]:
m = "Hello Thomas"

print(m[0], m[6])
print(m[1:5])
print(m[6:])
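
As a further sketch, negative indices and a slice step also work on strings, just as they do for one-dimensional `numpy` arrays. The variable names below are chosen only for illustration.

```python
m = "Hello Thomas"

last_char = m[-1]      # negative indices count from the end
last_word = m[-6:]     # the last six characters
every_second = m[::2]  # every second character
reversed_m = m[::-1]   # a negative step reverses the string

print(last_char)
print(last_word)
print(every_second)
print(reversed_m)
```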

**Attention:** Individual elements of a string cannot be changed once it has been created (strings are *immutable* objects)!

In [None]:
m = "Hello Thomas"
# The following assignment raises a TypeError because strings are immutable:
m[0] = "A"
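
One possible way to work around immutability is to build a *new* string from pieces of the old one. The sketch below first catches the error and then creates the modified string by slicing and concatenation; `m_new` is just an illustrative name.

```python
m = "Hello Thomas"

# Item assignment on a string raises a TypeError.
try:
    m[0] = "A"
except TypeError as err:
    print("Not allowed:", err)

# Instead, create a new string from the replacement and a slice of the old one.
m_new = "A" + m[1:]

print(m)      # the original string is unchanged
print(m_new)
```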

Iteration also works as for `numpy` arrays.

In [None]:
message = "Hello World"

for char in message:
    print(char)
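
If the position of each character is needed as well, the built-in `enumerate` can be combined with the loop. This is a minimal sketch; the names `positions_of_l` and `count_l` are assumptions made for the example.

```python
message = "Hello World"

# Collect the indices at which the character "l" occurs.
positions_of_l = []
for index, char in enumerate(message):
    if char == "l":
        positions_of_l.append(index)

# The count method gives the number of occurrences directly.
count_l = message.count("l")

print(positions_of_l)
print(count_l)
```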

## String Operations

Two arithmetic operators are defined on strings: (1) addition, which concatenates strings, and (2) multiplication with an integer, which repeats a string.

In [12]:
# You can use '+' between two strings to concatenate them:
m1 = "Hello "
m2 = "World"

print(m1 + m2)

# multiplication with a number repeats a string
m = "Hello"

print(2 * m)
print(m * 3)

Hello World
HelloHello
HelloHelloHello


The `str` type provides numerous methods to operate on strings.

In [None]:
message = "Hello World!"
help(message.replace)

In [None]:
# create a new string from an old one substituting some string part with something else:
message = "Hello World!"
print(message)

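# replace returns a new string; the result is discarded here,
# so 'message' itself stays unchanged:
message.replace('World', 'Universe')
# To keep the result, assign it to a variable:
#message = message.replace('World', 'Universe')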

print(message)
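
The following sketch shows a few other commonly used string methods. All of them return a new object and leave the original string untouched; the variable names are illustrative only.

```python
message = "Hello World!"

new_message = message.replace("World", "Universe")  # substitute a substring
upper = message.upper()                             # all upper case
words = message.split()                             # split at whitespace
joined = "-".join(words)                            # join a list with a separator
starts = message.startswith("Hello")                # test the beginning

print(new_message, upper, words, joined, starts)
print(message)  # the original string is unchanged
```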

**Note:** Carefully read the documentation on operations of compound data types! In general, they can either modify an existing object (*substitution in place*; not possible for strings, as they are immutable) or create a new object.

In [None]:
import numpy as np

a = np.array([9, 8, 7, 6])

help(a.sort)

In [None]:
import numpy as np

a = np.array([9, 8, 7, 6])
print(a)

# The following command directly modifies 'a':
a.sort()
print(a)
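
The same distinction can be sketched with a plain Python list, which is used here only as an assumption to keep the example free of dependencies: the built-in `sorted` creates a new object, whereas the `sort` method modifies the list in place and returns `None`.

```python
values = [9, 8, 7, 6]

# sorted() creates a new list; 'values' is unchanged afterwards.
sorted_copy = sorted(values)
print(values, sorted_copy)

# sort() modifies 'values' in place and returns None.
result = values.sort()
print(values, result)
```

Assigning the return value of an in-place method, as done with `result` above, is a common source of unexpected `None` values.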

## Format Strings or f-strings

Strings can be composed from substrings by concatenating them with the `+` operator. This is useful for including the values of variables in dynamically built strings.

In [None]:
file_name = "test.txt"
error_code = 51

m1 = "I cannot read file " + file_name + "."
print(m1)

# Note the conversion of the integer 51 to a string with
# str():
m1 = m1 + " I quit with error code " + str(error_code) + "."
print(m1)

Such string compositions are more conveniently done with format strings (f-strings):

In [None]:
file_name = "test.txt"
error_code = 51

m1 = f"I cannot read file {file_name}. I quit with error code {error_code}."
print(m1)
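
The curly braces of an f-string accept arbitrary expressions, not only variable names. The sketch below illustrates this with an arithmetic expression, a method call, and the `!r` conversion, which inserts the `repr` of a value; the names `m2` to `m4` are illustrative.

```python
file_name = "test.txt"
error_code = 51

m2 = f"The next error code would be {error_code + 1}."
m3 = f"File name in upper case: {file_name.upper()}"
m4 = f"File name with quotes: {file_name!r}"

print(m2)
print(m3)
print(m4)
```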

See the appendix for some interesting additional possibilities to format numbers included in f-strings.

## Raw strings or r-strings

*Raw* strings do not treat the backslash as a special (escape) character. They are the convenient way to label matplotlib plots with `LaTeX` expressions, which typically contain many backslashes.

In [None]:
# \n is the newline character
m1 = "Thomas \n Erben"
print(m1)

# The r marks a 'raw' string:
m2 = r"Thomas \n Erben"
print(m2)
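
A short sketch of the difference in terms of string length: in a normal string `\n` is one single character, whereas in a raw string it is the two characters backslash and `n`. A raw string is equivalent to a normal string with doubled backslashes.

```python
normal = "a\nb"   # three characters: 'a', newline, 'b'
raw = r"a\nb"     # four characters: 'a', backslash, 'n', 'b'

print(len(normal), len(raw))

# The raw string equals a normal string with an escaped backslash.
same = raw == "a\\nb"
print(same)
```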

In [None]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0.2, 2.0 * np.pi, 100)
y = 0.5 * np.sin(x) / x

# Without a raw string, '\f' would be read as a form-feed character
# and the LaTeX expression would not be parsed correctly:
plt.title(r'$\frac {\sin(x)}{x}$')
plt.plot(x, y)

# Appendix: Some useful options to format numbers in f-strings

No formatting of the number

In [None]:
x = 2. / 3.

f"Two thirds are {x}"

Round the float to three decimal places.

In [None]:
x = 2. / 3.

# The 'f' stands for float.
f"Two thirds are {x:.3f}"

Round the float to three decimal places and print the number with a total width of 10 characters.

In [None]:
f"Two thirds are {x:10.3f}"

We can do something similar for integers (without the number of decimal places)

In [None]:
n = 42
f"The truth is {n:10d}"

We can also add leading zeros

In [None]:
n = 42
f"The truth is {n:010d}"

or align to the left

In [None]:
n = 42
f"The truth is {n:<10d}. This we must accept!"
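
A few more format specifications that may be useful, given as a sketch: explicit right and centre alignment, a thousands separator, scientific notation, and percentages. The values and variable names are chosen only for illustration.

```python
n = 42
x = 1234567.891

right = f"{n:>10d}"        # right-aligned in a field of width 10
centered = f"{n:^10d}"     # centred in a field of width 10
thousands = f"{x:,.2f}"    # comma as thousands separator, two decimals
scientific = f"{x:.3e}"    # scientific notation with three decimals
percent = f"{0.256:.1%}"   # multiply by 100 and append a percent sign

print(repr(right))
print(repr(centered))
print(thousands)
print(scientific)
print(percent)
```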