# Strings, lists and tuples

## $ \S 1 $ Strings<a name="strings"></a>

### $ 1.1 $ Strings as sequences of characters
A __string__ is a sequence of characters enclosed in either single `'` or double `"` quotes. The type corresponding to strings is denoted by `str`.

To get the $ i $-th character of a string called, say, $ s $, use `s[i]`; the output is also a string (albeit one having only $ 1 $ character).

<div class="alert alert-warning">In Python, indices are <i>always</i> counted starting from <b> $ 0 $  (zero)</b>, not $ 1 $. To avoid confusion, we adapt our terminology accordingly to speak of, e.g., 'm' as the <i>0-th</i> character of the string 'magic', 'a' as its first character, and so on...</div>

📝 By prefixing an index with a minus sign $ - $, we start counting to the 'left' from the 0-th character. For example, `s[-1]` is the _last_ character of $ s $, `s[-2]` its *next-to-last* character, and so on.

__Example:__

In [None]:
g = "Gandalf"
s = "Sauron"
explosion = "BOOM!"

character = g[0]
print(character, type(character))

another_character = s[4]
print(another_character)

yet_another_character = explosion[-1]
print(yet_another_character)

# Since these characters are actually strings, we can concatenate them using '+':
print(character + another_character + yet_another_character)

G <class 'str'>
o
!
Go!


### $ 1.2 $ Operations on strings

As in the preceding example, strings can be __concatenated__ using the binary operator __+__:

In [None]:
string_1 = "ancient"
string_2 = "magic"
string_3 = "spells"

print(string_1 + string_2)
print(string_1 + " " + string_2 + " " + string_3)

ancientmagic
ancient magic spells


The function `len` applied to a string returns its **length**, i.e., the number of characters it contains, which is always a non-negative integer.

__Example:__

In [None]:
print(len(string_1))
print(len(string_2))
print(len(string_3))

7
5
6


🚫 A string is an __immutable__ object, meaning that its individual characters _cannot_ be modified during the program. Trying to do so will make the interpreter throw a `TypeError`.

In [None]:
# Let's try to modify the 0-th character of g to see what happens:
g[0] = 'R'

TypeError: 'str' object does not support item assignment

The **colon operator `:`**, as in `[i:j]`, is used to __slice__ a string from its $ i $-th character (inclusive) to its $ j $-th character (exclusive).

__Example:__

In [None]:
string = 'magic'

print(string[0:2])   # Slice from the 0th character to the 2nd (not including the 2nd).
print(string[2:5])   # Slice from the 2nd character to the 5th (not including the 5th).

print(string[:2])    # Omit the first index to slice from the beginning to the second index.
print(string[2:])    # Omit the second index to slice from the first index to the end.

ma
gic
ma
gic
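
Because strings are immutable, "changing" a character really means building a new string. The following is a minimal sketch of one way to do this, combining slicing with concatenation; the names `r`, `h` and `i` are illustrative and do not appear elsewhere in this notebook.

```python
g = "Gandalf"

# Build a new string: the replacement character followed by
# everything in g from index 1 onwards. g itself is left untouched.
r = "R" + g[1:]

print(g)    # Gandalf
print(r)    # Randalf

# Replacing the character at an arbitrary index i works the same way.
i = 3
h = g[:i] + "D" + g[i + 1:]
print(h)    # GanDalf
```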


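⚠️ A *complete slice* `[:]` produces a string equal to the original. Because strings are immutable, the plain assignment `string_2 = string_1` would be just as safe here; the complete slice becomes essential only for mutable sequences such as lists, discussed in $ \S 2 $.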

__Example:__

In [None]:
string_1 = 'potion'
string_2 = string_1[:]    # Omit both indices to make an independent _copy_ of the original string.

string_1 = 'magic'

print(string_1, string_2)

magic potion


📝 In analogy with the interpretation of `+` as concatenation of strings, 'multiplying' a string by a positive integer $ n $, using `*`, yields a new string which consists of $ n $ copies of the original concatenated, one after another; if $ n \le 0 $, the result is the empty string. The operators `-`, `/` and `//` cannot be applied to strings, and `%` does not compute a remainder on them (for strings it is reserved for formatting).

__Example:__

In [None]:
s = 'ha'
t = 5 * s       # t consists of 5 copies of s.
print(t)

u = (-4) * s    # What happens if we multiply s by 0 or by a negative integer?
print(u)

hahahahaha



In [None]:
print(t // s)   # We can't divide two strings!

TypeError: unsupported operand type(s) for //: 'str' and 'str'

In [None]:
v = 2.71 * s    # Can we multiply a string by a float?

TypeError: can't multiply sequence by non-int of type 'float'

### $ 1.3 $ Comparing strings

📝 All of the comparison operators introduced in the previous notebook work for strings as well. Strings are ordered according to the __lexicographic__ (or __dictionary__) __order__:

In [None]:
a = "potion"
b = "portion"
q = "quarterstaff"  
r = "robe"

print(a < b)
print(b < q)
print(a == q)
print(q != r)
print(q <= r)

False
True
False
True
True
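
To see why `a < b` is `False` above, it may help to look at how the comparison proceeds: strings are compared character by character, and the first position where they differ decides the result. The sketch below uses the built-in function `ord` (not introduced in this notebook) to display the integer code of a character; note that, unlike in a printed dictionary, uppercase ASCII letters come before lowercase ones.

```python
a = "potion"
b = "portion"

# The first two characters agree; the comparison is decided at index 2.
print(a[2], b[2])              # t r
print(ord(a[2]), ord(b[2]))    # 116 114
print(a < b)                   # False, because 't' comes after 'r'

# A proper prefix is smaller than any longer string that extends it.
print("port" < "portion")      # True

# Characters are compared by their integer codes, so every uppercase
# ASCII letter precedes every lowercase one.
print("Zebra" < "apple")       # True
```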


## $ \S 2 $ Lists

### $ 2.1 $ The `list` type

A **list** (that is, an object of type `list`) consists of zero, one or several objects ordered in sequence. The items of a list are allowed to be of *any* type, and the types of different elements do not have to be the same. In particular, one can create lists which contain integers, floats and strings; lists whose elements are other lists or tuples; lists of lists of functions, and so on.

A list is represented using *brackets* `[]`, with its elements separated by commas. The function `len` can be used to count the number of items contained in a list.

__Example:__

In [None]:
fruits = ["acai", 'apple', "apricot", 'avocado']
numbers = [0, 'eight', -53, 12.34, (3 + 4j)]       # The elements of a list can be of different types!
empty = []                                         # This is an empty list.
mages = ['Delfador']                               # This list has a single element.

print(len(fruits))                                 # Use 'len' to get the length of a list.
print(len(empty))

new_list = fruits + mages                          # We can concatenate two lists using '+'.
print(new_list)

4
0
['acai', 'apple', 'apricot', 'avocado', 'Delfador']


📝 Just like strings, lists can be **concatenated** with the `+` operator, 'multiplied' by positive integers using `*` and **sliced** with the `:` operator.
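
A short sketch of repetition and slicing applied to a list, reusing the `fruits` list from the previous example; the name `doubled` is illustrative.

```python
fruits = ["acai", "apple", "apricot", "avocado"]

# Repetition: the result is a new list containing the items twice over.
doubled = 2 * fruits[:2]
print(doubled)         # ['acai', 'apple', 'acai', 'apple']

# Slicing follows the same inclusive/exclusive convention as for strings.
print(fruits[1:3])     # ['apple', 'apricot']
print(fruits[2:])      # ['apricot', 'avocado']
print(fruits[-1])      # avocado

# A slice of a list is itself a list, even when it has a single item.
print(fruits[0:1])     # ['acai']
```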

### $ 2.2 $ Modifying lists

In contrast to strings, lists are **mutable** objects, meaning that their individual elements can be modified by assignments.

__Example:__

In [None]:
movies = ["Gone with the Wind",
         "Interstellar",
         "E.T.",
         "It's a Wonderful Life",
         "Rain Man"]

print(movies)
# "It's a Wonderful Life" appears inside double quotes
# when printed because this string contains a single quote.

movies[1] = "Forrest Gump"     # Modify the 1st (not 0th!) element of the list.
print(movies)

# Modifying more than one element at once:
movies[2:4] = ["Modern Times", "Paths of Glory"]
print(movies)

print(movies[:2])              # Print the first two elements.

['Gone with the Wind', 'Interstellar', 'E.T.', "It's a Wonderful Life", 'Rain Man']
['Gone with the Wind', 'Forrest Gump', 'E.T.', "It's a Wonderful Life", 'Rain Man']
['Gone with the Wind', 'Forrest Gump', 'Modern Times', 'Paths of Glory', 'Rain Man']
['Gone with the Wind', 'Forrest Gump']


⚠️ Before a list can store an item whose index is $ k $, it must have items associated with every index between $ 0 $ and $ k - 1 $. Trying to modify or access in any way the element of index $ k $ in a list which currently does not have such an element generates an `IndexError`.

__Example:__

In [None]:
drinks = ["coffee", "tea", "water"]
print(drinks[3])

IndexError: list index out of range

In [None]:
drinks = ["coffee", "tea", "water"]
drinks[3] = "orange juice"

IndexError: list assignment index out of range

### $ 2.3 $ Some methods defined on lists

Lists also support several methods (a **method** is a function associated with a specific class or type). Here are examples of how some of them are used.

__Example:__

In [None]:
fruits = ["avocado", 'apricot', "acai", 'apple']

fruits.append('apple')            # Append an element to the end of a list.
print(fruits)

fruits.insert(0, "strawberry")    # Insert an element in a specified position.
print(fruits)

fruits.remove('apple')            # Remove the _first occurrence_ of an element.
print(fruits)

a = fruits.pop(2)                 # Remove the element having the specified index               
print(fruits)                     # and return it as output. 

b = fruits.pop()                  # Use 'pop' without any arguments to remove the
print(fruits)                     # last item of a list and return it as output.

print(a, b)

fruits.sort()
print(fruits)

['avocado', 'apricot', 'acai', 'apple', 'apple']
['strawberry', 'avocado', 'apricot', 'acai', 'apple', 'apple']
['strawberry', 'avocado', 'apricot', 'acai', 'apple']
['strawberry', 'avocado', 'acai', 'apple']
['strawberry', 'avocado', 'acai']
apricot apple
['acai', 'avocado', 'strawberry']


Note that, in each case, the name of the list appears before the method, and is separated from it by a period (`.`). More formally:

* `append(x)` can be used to append an element $ x $ to the _end_ of a list.
* `insert(i, x)` is used to insert an element $ x $ in an arbitrary position $ i $ of a list. Elements having indices $ < i $ remain in their original position, while those having indices $ \ge i $ are shifted one position to the right.
* `remove(x)` is used to remove the *first occurrence* of an item $ x $ of a list. If the list does not contain any instances of this element, then the interpreter throws a `ValueError`.
* `sort()` is used to **sort** the elements of a list in ascending order, provided that its items can be compared with one another.
* `pop(i)` removes the item of the list at index $ i $ and returns it; if $ i $ is omitted, the last item is removed.

📝 The first four of these methods modify the list as described, _but they return_ `None` as output. However, `pop` returns the popped element as output.

__Example:__

In [None]:
a = fruits.insert(0, "banana")
print(a)

b = fruits.sort()
print(b)
print(fruits)

None
None
['acai', 'avocado', 'banana', 'strawberry']
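
The following sketch shows what happens when `remove` is asked to delete an item that the list does not contain: Python raises a `ValueError`. It relies on a `try`/`except` block and on the membership operator `in`, neither of which is introduced in this notebook, so it should be read as an illustration rather than as part of the material above.

```python
fruits = ["acai", "avocado", "banana", "strawberry"]

# Removing an item that is absent raises ValueError.
try:
    fruits.remove("mango")
except ValueError as error:
    print("ValueError:", error)

# One way to avoid the exception: test membership with 'in' first.
if "banana" in fruits:
    fruits.remove("banana")

print(fruits)    # ['acai', 'avocado', 'strawberry']
```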


<div class="alert alert-warning">If $ x $ stores a <i>mutable</i> object, then the assignment <code>y = x</code> does not result in a new <i>object</i> named $ y $; instead, this just makes $ y $ a second name for (a reference to) the object stored by $ x $. Because of this, any in-place modification made through $ x $ is visible through $ y $, and vice-versa.</div>

__Example:__

In [None]:
x = [0, 1, 2]
y = x

x.pop()         # Popping an element from x also affects y,
y               # since they refer to the same object!

[0, 1]

In [None]:
x = [0, 1, 2]    # To create an independent copy of x, use a complete slice:
y = x[:]

x.pop()
y                # y has not been affected by the modification of x.

[0, 1, 2]
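
The difference between the two cells above can also be checked directly. This sketch uses the identity operator `is`, which is not introduced in this notebook: it tests whether two names refer to the very same object, whereas `==` only compares contents. The names `alias` and `duplicate` are illustrative.

```python
x = [0, 1, 2]
alias = x              # A second name for the same list object.
duplicate = x[:]       # A new list object with the same items.

print(alias is x)      # True
print(duplicate is x)  # False
print(duplicate == x)  # True: equal contents, different objects

x.append(3)
print(alias)           # [0, 1, 2, 3]
print(duplicate)       # [0, 1, 2]
```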

## $ \S 3 $ Tuples

### $ 3.1 $ The `tuple` type

Another sequential data type is `tuple`, the type of __tuples__. Like a list, a tuple is a sequence of zero or more objects of arbitrary types, separated by commas. However, tuples are enclosed by _parentheses_ `()` instead of brackets. Also, tuples are __immutable__ (like strings), so that their individual elements _cannot_ be modified.

### $ 3.2 $ Operations on tuples

As for the other sequential types that we have considered (strings and lists), tuples can be concatenated with `+`, their length can be retrieved using `len`, and their elements and slices can be accessed using `[]` and the `:` operator.

__Example:__

In [None]:
# Each of the following tuples records some data about famous scientists:
record_1 = ('Albert', 'Einstein', 'physicist', 26, 'Germany')
record_2 = ('Marie', 'Curie', 'chemist', 32, 'Poland')
record_3 = ('Charles', 'Darwin', 'biologist', 50, 'England')

# Each of them is indeed of type 'tuple':
print(type(record_1))

# Accessing individual elements:
print(record_1[0])
print(record_2[2])
print(record_3[-1])

<class 'tuple'>
Albert
chemist
England


In [None]:
# Slicing:
full_name = record_1[:2]
print(full_name)

('Albert', 'Einstein')


In [None]:
# To convert a tuple to a list, use 'list' as a function:
data = list(record_1)
print(data, type(data))

# Similarly, to convert a list to a tuple, use 'tuple' as a function:
philosophers = ["Plato", "Aristotle", "Seneca", "Socrates"]
tuple_of_philosophers = tuple(philosophers)
print(tuple_of_philosophers, type(tuple_of_philosophers))

['Albert', 'Einstein', 'physicist', 26, 'Germany'] <class 'list'>
('Plato', 'Aristotle', 'Seneca', 'Socrates') <class 'tuple'>


### $ 3.3 $ Some warnings

⚠️ To define a tuple consisting of a single item, a comma must still be used, so that the tuple can be disambiguated from an expression surrounded by parentheses:

In [None]:
language = ('Sindarin', )         # To define a tuple, we must include a comma!
print(language, type(language))

lang = ('Sindarin')               # This is not a tuple, but rather a string;
print(lang, type(lang))           # the parentheses play no role in this case.


('Sindarin',) <class 'tuple'>
Sindarin <class 'str'>


🚫 Since a tuple is _immutable_, an attempt to modify one or more of its elements results in a `TypeError`:

In [None]:
coordinates = (1.234, 5.678)
coordinates[0] = 0.123

TypeError: 'tuple' object does not support item assignment

⚠️ We emphasize that even if $ x $ and $ y $ are two tuples or lists of the same length and whose items are of the same numerical type, `x + y` is *not* obtained by summing their respective elements; it is instead the *concatenation* of $ x $ and $ y $. Similarly, if $ a $ is an integer, then `a * x` is *not* obtained by multiplying each item of $ x $ by $ a $; it consists of $ a $ copies of $ x $ concatenated. If $ a $ is a scalar which is not an integer, then `a * x` results in a `TypeError`.
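
A minimal sketch contrasting what `+` and `*` actually do on tuples with an explicit element-wise computation. The element-wise versions use `zip` and a generator expression, which are not covered in this notebook and are shown only as one possible approach; the names `elementwise_sum` and `scaled` are illustrative.

```python
x = (1, 2, 3)
y = (10, 20, 30)

print(x + y)    # (1, 2, 3, 10, 20, 30)
print(2 * x)    # (1, 2, 3, 1, 2, 3)

# Element-wise results have to be built explicitly, item by item.
elementwise_sum = tuple(a + b for a, b in zip(x, y))
scaled = tuple(2 * a for a in x)

print(elementwise_sum)    # (11, 22, 33)
print(scaled)             # (2, 4, 6)
```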

<div class="alert alert-warning">Neither lists nor tuples are adequate data structures to represent <b>vectors</b> (in the sense of linear algebra). The most adequate type for this task is an <b>array</b> (type: <code>ndarray</code>, provided by the <a href="https://scipy.github.io/old-wiki/pages/Numpy_Example_List.html"><b>NumPy</b></a> module), which we will consider later.</div>

📝 **Question:** Do we really need both lists and tuples? 

*Answer:* No, strictly speaking we could always get by using only one of them. However, having both available has practical advantages, which follow from the differences listed in the next answer.

📝 **Question:** What is the difference between lists and tuples?

*Answer:* The main difference is that lists are _mutable_ while tuples are _immutable_. In particular:
* We cannot modify the value of individual elements of a tuple as we can with lists.
* We cannot remove or add elements to a tuple. Tuples have no methods equivalent to `append`, `pop`, `insert`, `remove`, etc. In particular, tuples have a fixed length. Even though it is possible to assign a tuple to a variable and then assign another tuple of different length to the same variable, this is not the same operation as modifying the original tuple.
* Tuples are generally slightly faster to create than lists holding the same items, and they use slightly less memory.
* Because tuples are immutable, they offer a better choice when storing information which should be protected from modification to avoid unforeseen behavior.
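
The distinction between reassigning a variable and modifying a tuple can be sketched as follows. The names `point` and `original` are illustrative, and the built-in `hasattr` (not introduced in this notebook) is used only to check whether a method exists.

```python
point = (1.0, 2.0)
original = point            # A second name for the same tuple object.

# This does not modify the tuple; it binds the name 'point' to a new,
# longer tuple built by concatenation.
point = point + (3.0,)

print(point)       # (1.0, 2.0, 3.0)
print(original)    # (1.0, 2.0)

# Tuples provide no in-place modification methods.
print(hasattr(point, "append"))    # False
print(hasattr([], "append"))       # True
```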

