# Collections
Many types in Python are "collections". Collections can be used to hold and operate on multiple objects at once. A simple example would be a list of integers.

Inside Python, this means an instance of a collection actually contains "references" to a number of other objects that hold individual "items".

Items in collections can be "read through" in order (this will be covered in the "Loops" notebook). Alternatively, individual items can be accessed using the syntax:

```python
collections_name[item_identifier]
```

The exact nature of the item identifier varies between collections.

We'll now look at the two most common collections - strings and lists. There's also a collection called dictionaries.
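As a brief preview of how the item identifier can differ between collections, here is a minimal sketch. The variable names and values are made up for illustration, and the dictionary syntax is only shown for comparison; it is not covered in detail here.

```python
# A list uses an integer position as its item identifier
temperatures = [18.5, 21.0, 19.2]
first_temperature = temperatures[0]
print(first_temperature)

# A string also uses an integer position
word = "Python"
first_letter = word[0]
print(first_letter)

# A dictionary uses a "key" (here a string) as its item identifier
ages = {"Alice": 31, "Bob": 27}
alice_age = ages["Alice"]
print(alice_age)
```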

## Strings
A string is used to store text - it is a collection of individual characters (for example, a single letter or digit). Strings are defined by placing the characters between a pair of double or single quotation marks. For instance:

In [None]:
a = "Hello world!"
print(a)
print(type(a))

### Accessing elements of a string

The character at index ```n``` of a string named ```string1``` may be returned (as a string with a single character) using the syntax:

```python
string1[n]
```

The index inside the square brackets is an integer. Note that in Python the index of the first character is 0, not 1. This is true for many programming languages, but some start index numbering at 1.

If this index-numbering system seems counter-intuitive to you, you're not alone. The good news is that it becomes natural after working with Python for a while. It may also help to think about the index as an "offset" from the start of the collection.

In Python, it is also possible to use a negative index. This returns an entry in the string counting backward from the end of the string. For example, to access the last character of ```string1``` you may use the syntax:

```python
string1[-1]
```
For example:

In [None]:
a = "Hello back!"
print(a[0])
print(a[1])
print(a[-1])

### String slicing

It's possible to use indices to create a string containing multiple characters of another string using the following syntax:

```python
string1[start_index:stop_index:step]
```
Here, ```start_index``` gives the index of the first character to be included, ```stop_index``` gives the first index not to be included and ```step``` gives the spacing between the characters to be included (e.g. a step of 1 means every character, 2 means every other character, etc). For example:

In [None]:
my_string = "Hello there!"
print(my_string[3: 11: 2])



If the ```start_index``` is omitted (but the colons remain), the result will run from the start of the string. If the ```stop_index``` is omitted (but the colons remain), the result will run until the end of the string. If the ```step``` is omitted, a step size of 1 will be assumed. Multiple values may be omitted. For instance:

In [None]:
string1="An example"
a=string1[::4]
print(a)

print(string1[0::2])
print(string1[1:2:])
print(string1[::])
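
To make the default values concrete, the sketch below compares each shortened slice with its fully written form. The variable names are illustrative only; the string has 10 characters, so an omitted ```stop_index``` behaves like 10 here.

```python
text = "An example"

# Omitting start_index is the same as starting from index 0
from_start = text[:3]
print(from_start == text[0:3])

# Omitting stop_index is the same as stopping after the last character
# (this string has 10 characters, so the stop index is 10)
to_end = text[3:]
print(to_end == text[3:10])

# Omitting step is the same as a step of 1
print(text[1:5] == text[1:5:1])

# Omitting everything returns the whole string
whole = text[::]
print(whole == text)
```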

### String length

You can find the length of a string using the ```len``` function, using the syntax:

```python
len(string1)
```

For example:

In [None]:
print(len("snake"))
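
Because indexing starts at 0, the length of a string is always one more than its last valid index. A small sketch of that relationship, using illustrative variable names:

```python
word = "snake"
length = len(word)

# The first index is 0, so the last valid index is length - 1
last_by_position = word[length - 1]

# A negative index counts back from the end, so -1 is also the last character
last_by_negative = word[-1]

print(length)
print(last_by_position)
print(last_by_negative)
```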

### String operators

The ```+``` operator has a special meaning for strings. For strings, it acts as the concatenation operator, meaning it joins two strings together into one larger new string.

The ```*``` operator concatenates a string with itself multiple times. For example

```python
string1 + string1
```
is the same as

```python
string1 * 2
```

In [None]:
string1 = "I'm the best "
string2 = "at coding in Python"
joined_string = string1 + string2
print(joined_string)

string3 = "Definitely! "
print(string3 * 3)
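
The equivalence between repeated ```+``` and ```*``` can be checked directly. This is a minimal sketch with made-up variable names; it also shows that neither operator alters the original string.

```python
word = "ab"

# Repeating with * gives the same result as chaining +
added = word + word + word
repeated = word * 3
print(added)
print(repeated)
print(added == repeated)

# Neither operator changes the original string
print(word)
```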

There are a number of other very useful string operators and functions in Python, but we won't be examining them in this notebook.

### Exercise 4.1

For the following example, write down what you think the results will be before running the example. Remember that spaces count as characters.

In [None]:
string1 = "Python"
string2 = "is my favourite language"

excerpt1 = string1[-1]
print(excerpt1)

excerpt2 = string2[1:10:3]
print(excerpt2)

excerpt3 = string2[1::2]
print(excerpt3)

### Exercise 4.2

Try the following exercises in the cell below:
- Define a string of your choice with at least ten characters
- Print the first character of the string
- Make two new strings
    - One from the first three characters
    - One from the last three characters
- Join these strings together to form a new string
- Make another string containing every other character of your original string beginning with the second character

In [None]:
# type your code here

### Exercise 4.3
What happens if you use a negative ```step``` in the index? What happens if you do this and make your ```start_index``` value higher than your ```stop_index``` value?

For the variable ```code_string``` in the example below, print every third value, counting backward from the penultimate character. You should get a message.

In [None]:
# add your code here 
code_string = "q!2refrdgho2!c73 h#eg4hfet@f gvdd4e kkfgc1dab,r3fcgh fguthofeYe"

## Lists
Lists are another kind of collection that use an integer as their index. Lists are very useful for grouping together related data within a program.

A list can be created using the syntax:

```python
list1 = [item0, item1, item2]
```
where you may have 0 or more items specified inside the square brackets (think of the square brackets as putting items in a "box"). An empty list can be created with:
```python
list1 = []
```

Items within the list can be accessed in the same way as characters of a string:

In [None]:
shopping_list = ["apples", "bananas", "bread", "mushrooms"]
print(shopping_list[0])
print(shopping_list[-1])
print(shopping_list[1:4:2])

Note that, in this case, requesting multiple items with a slice returns a new list containing the relevant items, whereas a single index returns the item itself.
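
The difference between a single index and a slice can be seen by checking the type of the result. The sketch below reuses the shopping list from above; the other variable names are illustrative.

```python
shopping_list = ["apples", "bananas", "bread", "mushrooms"]

# A single index returns the item itself (here, a string)
single_item = shopping_list[1]
print(single_item, type(single_item))

# A slice returns a new list, even if it contains only one item
one_item_slice = shopping_list[1:2]
print(one_item_slice, type(one_item_slice))
```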

### List methods

You can also add items to the end of a list using the ```append``` method. Methods are pieces of code that belong to a type (also called a class) and act on the contents of an object of that type. They may be accessed using the syntax:

```python
variable_name.method_name(argument1)
```

An argument is a value written in parentheses which gives the method the information it needs to do its job. A method may require 0 or more arguments (depending on the requirements of the method in question).

For the ```append``` method of the list class, the syntax is:

```python
list1.append(item_to_be_appended)
```

Another way to add an item to a list is the ```insert``` method, which takes two arguments: the first is the index at which the value is to be inserted and the second is the value itself. Note that a list may hold items of several different types. An item in a list can itself be any type of object, including a list. For example:


In [None]:
assorted_data = ["shoes", 1]

assorted_data.append(False)
print(assorted_data)

assorted_data.insert(1, 3.14)
print(assorted_data)

assorted_data.append([1,2])
print(assorted_data)

print(type(assorted_data))
print(type(assorted_data[0]))
print(type(assorted_data[1]))
print(type(assorted_data[2]))
print(type(assorted_data[3]))
print(type(assorted_data[4]))
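
Methods differ in how many arguments they need. The sketch below puts three cases side by side; ```upper``` is a string method not otherwise used in this notebook, chosen here only as an example of a method that takes no arguments.

```python
greeting = "hello"

# upper() takes no arguments, but the parentheses are still required
shouted = greeting.upper()
print(shouted)

numbers = [1, 2, 3]

# append() takes one argument: the item to add
numbers.append(4)

# insert() takes two arguments: the index, then the item
numbers.insert(0, 0)
print(numbers)
```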

### Changing items inside a list

It's possible to change the value of an item in a list by assigning a new value to its index. For example:

In [None]:
my_list = [1,2,34]
print(my_list)

my_list[1] = 4
print(my_list)

### Exercise 4.4

In the cell below:
- Create a list named ```cuddly_animals``` with the names of at least two cuddly animals as items
- Append the name of another cuddly animal to the end of the list
- Insert the name of yet another cuddly animal so it is the first entry in this list
- What happens if you use the ```+``` operator between the lists ```shopping_list``` and ```cuddly_animals```?
- What happens if you try to print the value of an item of a list using an index greater than the number of items in the list?
- What happens if you try to set the value of an item of a list using an index greater than the number of items in the list?

In [None]:
# type your code here

## Immutable and Mutable Types

We've now discussed a number of different types of variables. In Python, every variable will always reference an object, which is a specific instance of a type. For example, a variable which has the value ```3``` actually references an object of the ```int``` type which has the value of 3.

One useful subdivision within objects is the distinction between mutable and immutable objects. 

### Immutable strings

A string is an example of an immutable object and a list is an example of a mutable object. Mutable objects may have their value or part of their value changed, but immutable objects may not. However, the following code is valid:

In [None]:
string1 = "bananas"
print(string1)
string1 = "oranges"
print(string1)

We saw that the value of the variable ```string1``` changed in the second assignment, but we just said that a string was immutable and so couldn't have its value changed. So what's going on?

### Objects are stored in computer memory

To resolve the "bananas" and "oranges" conundrum, we have to talk about computer memory or RAM (random access memory). This is fast (but usually not so large) temporary storage that is close to the computer's central processing unit. When your program is being executed, your data (in the form of objects) are held in RAM. To organise all objects, every position in RAM has an address associated with it. So, variable names are essentially attached to different memory addresses.

In the first assignment in the code cell above, we create a variable named ```string1``` that is attached to an object "bananas". This object will exist in a location of the memory of your machine. This string object cannot be altered - it is immutable. In the second assignment, we create an entirely new string object and reference it with the variable name ```string1```, discarding the old one.

We can use the function ```id()```, which returns a unique identifier for an object (in the standard Python interpreter, CPython, this is the object's memory address). We also use the ```hex()``` function, which turns this value into hexadecimal format - a commonly used form for memory addresses.

When we say immutable, it means that the original memory location where "bananas" was stored cannot be modified.

In [None]:
string1 = "bananas"
print(hex(id(string1)))
string1 = "oranges"
print(hex(id(string1)))

We see that the memory address of the object referenced by the variable named ```string1``` changes after the second assignment - the variable is referencing an entirely new object. It is Python's way of dealing with immutable data types.

The immutability of strings is the reason why they lack methods such as ```.insert()``` or ```.append()``` that a list has.
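
As a minimal sketch of what immutability means in practice, the example below tries to change one character of a string in place. It uses ```try``` and ```except``` (not covered in this notebook) only so that the error is reported without stopping the cell; the variable names are illustrative.

```python
word = "bananas"

# Trying to change a character in place raises a TypeError
try:
    word[0] = "B"
except TypeError as error:
    print("Not allowed:", error)

# Instead, build a new string from pieces of the old one
capitalised = "B" + word[1:]
print(capitalised)

# The original string object is unchanged
print(word)
```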



### Mutable lists

When we try something similar with a list, the following happens:

In [None]:
# create a list
list1 = ["turnips"]
print(list1)
print(hex(id(list1)))

# append to a list
list1.append("swede")
print(list1)
print(hex(id(list1)))

# assign a new list to the same variable
list1 = ["avocado"]
print(list1)
print(hex(id(list1)))

We see that appending a new value to the list does not change its address - we are modifying the same object in place in the memory. However, assigning a new list to the variable named ```list1``` still creates a new ```list``` object at a new location in the memory.

But what happens to the first ```list``` object we created? The answer is that Python employs a software tool named "garbage collection". Python will periodically check through the stored objects in the memory. If an object is not referred to by a variable that exists in the program then there is no way to access this object, but it's taking up memory. As a result, Python will delete this object from the memory, freeing up the space to be used for something else.

Most of this happens automatically and you won't need to think about it. But you do need to think about the relationship between variables and the objects they relate to and mutability is important in this discussion. First, consider the following, which occurs for a string (which is immutable):

In [None]:
# instantiate a variable that references a string
string1 = "bananas"
print(string1)
print(hex(id(string1)))
print("")

# make a new variable that references the same string
string2 = string1
print(string1)
print(string2)
print(hex(id(string1)))
print(hex(id(string2)))
print("")

# change the value of the first variable
string1 = "apples"
print(string1)
print(string2)
print(hex(id(string1)))
print(hex(id(string2)))
print("")

When we created the variable ```string2``` and assigned ```string1``` to it, we didn't create a new object in memory with the same value as ```string1```; we actually created a variable which references the **same** object as ```string1```. Then we created a new string and referenced it with ```string1```. Now ```string1``` references the new object, while ```string2``` still references the old one.

Now, we will try to do something similar for a list (which is mutable):

In [None]:
# create a list
list1 = ["bananas"]
print(list1)
print(hex(id(list1)))
print("")

# make a second variable that references the same list
list2 = list1
print(list1)
print(list2)
print(hex(id(list1)))
print(hex(id(list2)))
print("")

# append to the list using the second variable name
list2.append("oranges")
print(list1)
print(list2)
print(hex(id(list1)))
print(hex(id(list2)))
print("")

# change the value of an item in the list using the second variable name
list2[0] = "mangoes"
print(list1)
print(list2)
print(hex(id(list1)))
print(hex(id(list2)))
print("")

# assign a new list to the first variable
list1 = ["apples"]
print(list1)
print(list2)
print(hex(id(list1)))
print(hex(id(list2)))

The behaviour of which variable points to which object remains the same. However, because lists are mutable, we're able to change some properties of the object midway through. When we append "oranges" to the list by accessing the ```.append()``` method of ```list2``` or assign a new value to the zeroth entry of the list, we actually change the object that both ```list1``` and ```list2``` reference in-place (i.e. without changing the relationships between the variable names and the underlying object).

This is an important concept to grasp as you may have multiple different variables all referencing the same object and, if the object is mutable, changing it by referencing one of these variables will change the underlying object that is referenced by all of these variables.
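
If you want a second list that can be changed independently, one option is the list's ```copy``` method, which creates a new list object holding the same items. The sketch below contrasts this with a plain assignment; the variable names are illustrative. Note that ```copy``` only copies the outer list, so any lists nested inside it are still shared.

```python
original = ["bananas", "oranges"]

# alias refers to the same object as original
alias = original

# copy() creates a new list object holding the same items
independent = original.copy()

print(id(alias) == id(original))        # same object
print(id(independent) == id(original))  # different objects

# Changing the list through one name is visible through its alias,
# but not through the independent copy
original.append("mangoes")
print(alias)
print(independent)
```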



### Extension Exercise 4.5
In the code cell below, experiment with assignment and changing the values of:

* A float
* An int
* A complex number (if you did the relevant extension exercise above)

Use the ```hex(id())``` functions to examine the memory addresses of each variable. Experiment with different operators and assignments on variables with these types to see which will create a new object and which won't. Can you work out (or make an educated guess) which of these types are mutable and which are immutable? What happens when you independently assign the same value to two separate ```int``` objects (i.e. don't assign either ```int``` to the other)? Can you think why Python might do this?

In [None]:
# type your code here

### Why is this important?

Thinking about your program in terms of memory locations is a good practice that will develop with your experience. 

For example, some programming languages do not have "garbage collection" and you have to make sure that you don't leave any objects that cannot be accessed lying around in your RAM. 

Keeping memory lean and changing objects in-place is often leveraged when optimising code (making it as efficient and fast as possible). Optimisation is an important part of tackling large problems involving long computations and big data.
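
As a small illustration of the difference between changing an object in place and creating a new one, the sketch below adds items to a list in two ways and compares the result of ```id()``` before and after. The variable names are illustrative, and this is not a performance measurement.

```python
values = [1, 2, 3]
address_before = id(values)

# append() modifies the existing list in place: same object as before
values.append(4)
same_after_append = id(values) == address_before
print(same_after_append)

# values + [5] builds a brand new list, and the name is rebound to it
values = values + [5]
same_after_plus = id(values) == address_before
print(same_after_plus)

print(values)
```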
