# Chapter 4 – Data Structures
## 4.1 Sequences and Lists
### Data From Data
We have often mentioned the importance of *types* in programming. Python is more flexible about typing than many languages, but you should still always be able to work out what types to expect in any given situation. Sometimes it's the only way to make the code work: `"Number " + 1` raises a `TypeError`, but `"Number " + str(1)` produces the desired result. The same line of code can do very different things depending on the exact types involved, so keeping track of what type you expect a variable to be can make debugging much easier.
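
As a small illustration of this point, the sketch below shows the same `+` operator doing different things depending on the types of its operands. The variable names are our own, chosen purely for illustration:

```python
# The + operator behaves differently depending on the types of its operands.
number_sum = 1 + 1            # integer addition
text_join = "1" + "1"         # string concatenation
label = "Number " + str(1)    # convert the integer first, then concatenate

# Mixing a string and an integer directly raises a TypeError.
try:
    "Number " + 1
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(number_sum, text_join, label, mixed_ok)
```

Here `number_sum` is the integer `2` while `text_join` is the string `"11"`, even though the two lines look almost identical.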

Many of the types we've used so far are what some languages might call *primitives*. While this word is not used in this context in the official Python documentation, it is common in other guides and tutorials. A *primitive* is an *atomic* data type – it is its own thing, and you cannot break it down into smaller parts. In Python this includes integers, floats, strings, and Booleans.

In another programming language called Java, *characters* are a *primitive* data type, and strings are made up of multiple characters. This makes the string a *composite* data type in Java.

A **data structure** is a composite data type where data is organised in a way that provides certain benefits – often speed or space efficiency. For some problems, a suitable choice of data structure can make the solution more efficient and easier to write.

In Python, strings *can* be broken down into characters, but each character is just a single-character string. The *object* you get as a result of *indexing* the string is itself another string:

In [2]:
text = "hello"
character = text[0]
type(text) == type(character)

True

Strings in Python are an example of a *sequence* data type. Sequences are collections which support certain operations like indexing, slicing (e.g. `text[3:5]`), and so on.
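
A short sketch of these sequence operations on a string follows. The variable names are hypothetical; the operations themselves are the indexing and slicing described above:

```python
text = "hello"

first = text[0]        # indexing: the first character
last = text[-1]        # negative indices count back from the end
middle = text[1:4]     # slicing: from index 1 up to, but not including, index 4
tail = text[3:5]       # the slice used as an example in the text

print(first, last, middle, tail)
```

Every result here is itself a string, whether it holds one character or several.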

Another sequence type you will use a lot is the list.

### Lists
A **list** is an ordered collection of objects. It is a sequence type data structure. Lists are written with square brackets, so we can write a list containing the numbers from `1` to `5`:

In [4]:
[1, 2, 3, 4, 5]

[1, 2, 3, 4, 5]

Or a list containing just the number `31`:

In [29]:
[31]

[31]

Or a list containing the string `"Python"`:

In [8]:
["Python"]

['Python']

Or a list containing nothing:

In [28]:
[] 

[]

Or a list containing the number `31`, the string `"Python"`, and a list containing the numbers `1` to `5` (we can have lists inside lists):

In [10]:
[31, "Python", [1, 2, 3, 4, 5]]

[31, 'Python', [1, 2, 3, 4, 5]]

Lists support the exact same sequence operations that you have already learned from strings:

In [18]:
my_list = [31, "Python", [1, 2, 3, 4, 5]]
print(f"The length of my list is {len(my_list)} and the first element is {my_list[0]}")

The length of my list is 3 and the first element is 31
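
Slicing and negative indexing carry over in the same way. The sketch below uses the same list, with illustrative variable names of our own, and also shows how to reach into the inner list by indexing twice:

```python
my_list = [31, "Python", [1, 2, 3, 4, 5]]

last_item = my_list[-1]        # the last element, which is the inner list
first_two = my_list[0:2]       # slicing a list produces a new list
inner_third = my_list[2][2]    # index the outer list, then the inner list

print(last_item)
print(first_two)
print(inner_third)
```

Note that slicing a list gives back a list, just as slicing a string gives back a string.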


Unlike strings, lists are **mutable**, meaning that we can change their contents after they have been created. With a string, we can always reassign the variable to a different string:

In [12]:
text = "hello"
text = "goodbye"
print(text)

goodbye


But we cannot change the contents of the string itself:

In [14]:
text = "hello"
text[0] = "g"

TypeError: 'str' object does not support item assignment
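
Because strings are immutable, one common way to get the effect we wanted is to build a *new* string and reassign the variable. This is a minimal sketch of one approach, using slicing:

```python
text = "hello"

# Build a new string from "g" plus everything after the first character,
# then reassign the variable so it refers to the new string.
text = "g" + text[1:]

print(text)
```

This creates a new string rather than modifying the old one; the item assignment `text[0] = "g"` itself still fails.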

However, with a list, this item assignment operation will work:

In [16]:
my_list = [1, 2, 3, 4, 5]
my_list[0] = 5
my_list[1] = 4
print(my_list)

[5, 4, 3, 4, 5]


### List Methods
Lists, like strings, have many useful *methods*. These are subroutines that are called like this:

In [30]:
my_list = [1, 2, 3, 4, 5]
my_list.index(3)

2

`my_list.index(obj)` returns the index of the first occurrence of the object `obj` in the list `my_list`. If the object is not found then this raises a `ValueError`. This is in contrast to `text.find(ss)`, which returns `-1` if the substring `ss` is not found in the string `text`. You can use `.index` with strings but you cannot use `.find` with lists.
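
The difference can be seen in the sketch below. Using `None` to stand for "not found" is our own convention for this example, not something built into lists:

```python
my_list = [1, 2, 3, 4, 5]
text = "hello"

# str.find returns -1 when the substring is missing.
find_result = text.find("z")

# list.index raises a ValueError when the object is missing,
# so we catch the error and fall back to None.
try:
    position = my_list.index(42)
except ValueError:
    position = None

# Checking membership first avoids the error altogether.
safe_position = my_list.index(3) if 3 in my_list else None

print(find_result, position, safe_position)
```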

Do you remember the difference between a function and a procedure? (If not, go back to [Section 2.1](../Chapter%202/2.1.ipynb)!)

We specifically pointed out the fact that string methods do *not* work like procedures: they return a new string rather than modifying the existing one:

In [21]:
text = "hello"
text.replace('e', 'u')
print(text)

hello


But similar-looking methods on lists ***do*** modify the object; they *are* procedures:

In [23]:
my_list = [1, 2, 3, 4, 5]
my_list.reverse()
print(my_list)

[5, 4, 3, 2, 1]


And this can lead to some really confusing mistakes, because these procedures specifically *do not* return values:

In [27]:
my_list = [1, 2, 3, 4, 5]
new_list = my_list.reverse()
print(new_list)

None


But other methods of a list object ***are*** functions so they *do* return values:

In [26]:
my_list = [1, 2, 3, 4, 5]
new_list = my_list.copy()
print(new_list)

[1, 2, 3, 4, 5]
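
The two kinds of method can be combined. As a hedged sketch (the name `reversed_list` is our own), one way to get a reversed list while keeping the original unchanged is to copy first and then reverse the copy:

```python
my_list = [1, 2, 3, 4, 5]

reversed_list = my_list.copy()   # copy() is a function: it returns a new list
reversed_list.reverse()          # reverse() is a procedure: it changes reversed_list in place

print(my_list)         # the original list is untouched
print(reversed_list)
```

The catch is remembering which methods return a value and which modify the list in place.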


Unfortunately this is something you simply have to get used to. Remember you can always consult the documentation for any function, either online or using built-in tools, which should explain how it works. Alternatively, if you are ever at all unsure, just try it out in a Jupyter cell or Python interpreter instance!

In [32]:
help(my_list.copy)

Help on built-in function copy:

copy() method of builtins.list instance
    Return a shallow copy of the list.
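
The help text mentions a *shallow* copy. Trying it out in a cell, as suggested above, gives a feel for what that means: the new list is a separate object, but any lists nested inside it are shared with the original. This is a small sketch with example values of our own:

```python
my_list = [31, "Python", [1, 2, 3]]
new_list = my_list.copy()

new_list[0] = 0        # replaces an item in new_list only
new_list[2][0] = 99    # changes the inner list, which both lists share

print(my_list)
print(new_list)
```

The first assignment affects only `new_list`, while the second is visible through both lists because they contain the same inner list object.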

