## Data Structures as Collections and Their Operations

Now that we have covered the fundamentals, let us layer them on each other to derive more complex, secondary patterns.

1. Lists - list operations.
2. Tuples - tuple operations.
3. Sets - set operations.
4. Dictionaries - dictionary operations.
5. Collection operations - converting data structures.
6. Further characteristics.
7. Debugging.
8. Why!!

### Lists

**A List Is a Sequence**

Like a string, a list is a sequence of values. In a string, the values are characters; in a
list, they can be any type. The values in a list are called elements or sometimes items.
There are several ways to create a new list; the simplest is to enclose the elements in
square brackets (`[` and `]`):

`[10, 20, 30, 40]`

`['crunchy frog', 'ram bladder', 'lark vomit']`

The first example is a list of four integers. The second is a list of three strings. The
elements of a list don’t have to be the same type. The following list contains a string, a
float, an integer, and (lo!) another list:
`['spam', 2.0, 5, [10, 20]]`

A list within another list is nested.

A list that contains no elements is called an empty list; you can create one with **empty
brackets**, `[]`.
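As a small sketch of the points above (the names `mixed` and `nothing` are made up for this illustration, and `len` and indexing are explained in more detail later in this section), a nested list counts as a single element of the outer list, and a second pair of brackets reaches inside it:

```python
# Illustrative names only; they are not used elsewhere in this section.
mixed = ['spam', 2.0, 5, [10, 20]]
nothing = []

print(len(mixed))    # 4: the nested list counts as one element
print(mixed[3])      # [10, 20]
print(mixed[3][0])   # 10: first element of the nested list
print(len(nothing))  # 0: an empty list has no elements
```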

As you might expect, you can assign list values to variables:




In [1]:
cheeses = ['Cheddar', 'Edam', 'Gouda']
numbers = [42, 123]
empty = []
print(cheeses, numbers, empty)

['Cheddar', 'Edam', 'Gouda'] [42, 123] []


### String Lists

To get started, let’s take a deep dive into strings.

#### A String Is a Sequence

A string is a sequence of characters. You can access the characters one at a time with the bracket operator:


In [None]:
fruit = 'banana'
letter = fruit[1]

The second statement selects character number `1` from `fruit` and assigns it to `letter`.

The expression in brackets is called an `index`. The `index` indicates which character in the `sequence` you want **(hence the name)**.

But you might not get what you expect:

In [None]:
print(letter)

For most people, the first letter of `'banana'` is `b`, not `a`. But for computer scientists,
the index is an offset from the beginning of the string, and the offset of the first letter
is **zero**.

In [None]:
letter = fruit[0]
letter

So `b` is the `0th` letter **(“zero-eth”)** of **'banana'**, `a` is the `1th` letter **(“one-eth”)**, and `n` is
the `2th` letter **(“two-eth”)**.
As an index, you can use an expression that contains variables and operators:


In [None]:
i = 1
fruit[i]

In [None]:
fruit[i+1]

In [None]:
letter = fruit[1.5]
print(letter)

The value of the index has to be an integer. Otherwise, as with `fruit[1.5]` above, you get an error along the lines of:
`TypeError: string indices must be integers`

**len**

`len` is a built-in function that returns the number of characters in a string:

In [None]:
fruit = 'banana'
len(fruit)

To get the last letter of a string, you might be tempted to try something like this:

In [None]:
length = len(fruit)
last = fruit[length]

`IndexError: string index out of range`
    
The reason for the `IndexError` is that there is no **letter in 'banana'** with the index `6`.
Since we started counting at `zero`, the `six letters` are numbered `0 to 5`. To get the last
character, you have to **subtract 1 from length**:

In [None]:
last = fruit[length-1]
last

Or you can use negative indices, which count backward from the end of the string.
The expression `fruit[-1]` yields the last letter, `fruit[-2]` yields the second to last,
and so on.
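A quick sketch of negative indexing, reusing the `fruit` string from the cells above:

```python
fruit = 'banana'

print(fruit[-1])  # a (the last letter)
print(fruit[-2])  # n (the second to last letter)

# fruit[-1] selects the same character as fruit[len(fruit) - 1].
print(fruit[-1] == fruit[len(fruit) - 1])  # True
```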

**String Slices**

A segment of a string is called a `slice`. Selecting a `slice` is similar to selecting a character:

In [2]:
s = 'Monty Python'
s[0:5]

'Monty'

In [None]:
s[6:12]

The operator `[n:m]` returns the part of the string from the **“n-eth”** character to the
**“m-eth”** character, including the first but excluding the last. This behavior is counterintuitive,
but it might help to imagine the indices pointing between the characters, as
in the table below.

|‘b’ |‘a’ |‘n’ |‘a’ |‘n’ |‘a’
-----|-----|-----|-----|-----|-----
|0 |1 |2 |3 |4 |5

If you omit the first index **(before the colon)**, the slice starts at the beginning of the
string. If you omit the **second index**, the slice goes to the end of the string:

In [None]:
fruit = 'banana'
fruit[:3]

In [None]:
fruit[3:]

If the first index is **greater than or equal to the second**, the result is an empty string,
represented by two quotation marks:

In [None]:
fruit = 'banana'
fruit[3:3]

An **empty string** contains no characters and has **length 0**, but other than that, it is the same as any other string.
Continuing this example, what do you think `fruit[:]` means? Try it and see.

**Strings Are Immutable**

It is tempting to use the `[]` operator on the left side of an assignment, with the intention of changing a character in a string. For example:


In [4]:
greeting = 'Hello, world!'
greeting[0] = 'J'

`TypeError: 'str' object does not support item assignment`
    
The reason for the error is that strings are **immutable**, which means you **can’t change
an existing string**. The best you can do is create a new string that is a variation on the
original:


In [None]:
greeting = 'Hello, world!'
new_greeting = 'J' + greeting[1:]
new_greeting

This example concatenates a new **first letter** onto a **slice of greeting**. It has no effect
on the original string.

#### Creating a list that contains items of the string data type

In [None]:
sea_creatures = ['shark', 'cuttlefish', 'squid', 'mantis shrimp', 'anemone']
print(sea_creatures)

As an ordered sequence of elements, each item in a list can be called individually, through indexing. Lists are a compound data type made up of smaller parts, and are very flexible because they can have values added, removed, and changed. When you need to store a lot of values or iterate over values, and you want to be able to readily modify those values, you’ll likely want to work with list data types.

**Indexing Lists**

Each item in a list corresponds to an index number, which is an integer value, starting with the index number 0.

For the list sea_creatures, the index breakdown looks like this:

|‘shark’	|‘cuttlefish’	|‘squid’	|‘mantis shrimp’	|‘anemone’
-----|-----|----- |----- |-----
|0	|1	|2	|3	|4


The first item, the string **'shark'**, starts at index `0`, and the list ends at index `4` with the item **'anemone'**.

Because each item in a Python list has a corresponding index number, we’re able to access and manipulate lists in the same ways we can with other sequential data types.

Now we can call a discrete item of the list by referring to its index number:

In [None]:
print(sea_creatures[1])

The index numbers for this list range from `0-4`, as shown in the table above. So to call any of the items individually, we would refer to the index numbers like this:

* `sea_creatures[0] = 'shark'`
* `sea_creatures[1] = 'cuttlefish'`
* `sea_creatures[2] = 'squid'`
* `sea_creatures[3] = 'mantis shrimp'`
* `sea_creatures[4] = 'anemone'`

If we call the list `sea_creatures` with any index number **greater than 4**, the index is out of range and Python raises an `IndexError`:

In [None]:
print(sea_creatures[18])

In addition to positive index numbers, we can also access items from the list with a negative index number, by counting backwards from the end of the list, starting at -1. This is especially useful if we have a long list and we want to pinpoint an item towards the end of a list.

For the same list sea_creatures, the negative index breakdown looks like this:

|‘shark’	|‘cuttlefish’	|‘squid’	|‘mantis shrimp’	|‘anemone’
-----|-----|----- |----- |-----
|-5	|-4	|-3	|-2	|-1

So, if we would like to print out the item `'squid'` by using its negative index number, we can do so like this:

In [None]:
print(sea_creatures[-3])

We can concatenate string items in a list with other strings using the `+` operator:

In [None]:
print('Sammy is a ' + sea_creatures[0])

We were able to concatenate the string item at index number 0 with the string `'Sammy is a '`. We can also use the `+` operator to **concatenate 2 or more lists together**.

With index numbers that correspond to items within a list, we’re able to access each item of a list discretely and work with those items.

#### Lists Are Mutable

The syntax for accessing the elements of a list is the same as for accessing the characters of a string: the **bracket operator** `[]`. The expression inside the brackets specifies the
index. Remember that the **indices start at 0**:


In [None]:
sea_creatures[0]

Unlike strings, **lists are mutable**. When the bracket operator appears on the left side of
an assignment, it identifies the element of the list that will be assigned.

**Modifying Items in Lists**

We can use indexing to change items within the list, by setting an index number equal to a different value. This gives us greater control over lists as we are able to modify and update the items that they contain.

If we want to change the string value of the item at index `1` from `'cuttlefish'` to `'octopus'`, we can do so like this:

In [None]:
sea_creatures[1] = 'octopus'

Now when we print `sea_creatures`, the list will be different:

In [None]:
print(sea_creatures)

We can also change the value of an item by using a **negative index** number instead:

In [None]:
sea_creatures[-3] = 'blobfish'
print(sea_creatures)

Now `'blobfish'` has replaced `'squid'` at the negative index number of `-3` **(which corresponds to the positive index number of 2)**.

Being able to modify items in lists gives us the ability to change and update lists in an efficient way.

**Slicing Lists**

We can also call out a few items from the list, similar to string slicing. Let’s say we would like to print only the middle items of `sea_creatures`; we can do so by **creating a slice**. With slices, we can call multiple values by creating a range of index numbers separated by a colon `[x:y]`:

In [None]:
print(sea_creatures[1:4])

When creating a slice, as in `[1:4]`, the **first index number** is where the slice starts **(inclusive)**, and the **second index number** is where the slice ends **(exclusive)**, which is why in our example above the items at positions 1, 2, and 3 are the items that print out.

If we want to include either end of the list, we can omit one of the numbers in the `list[x:y]` syntax. For example, if we want to **print the first 3 items of the list sea_creatures**— which would be `'shark', 'octopus', 'blobfish'` — we can do so by typing:

In [None]:
print(sea_creatures[:3])

This printed the beginning of the list, stopping right before index `3`.

To include all the items at the end of a list, we would reverse the syntax:

In [None]:
print(sea_creatures[2:])

We can also use negative index numbers when slicing lists, similar to positive index numbers:

In [None]:
print(sea_creatures[-4:-2])
print(sea_creatures[-3:])

One last parameter that we can use with slicing is called **stride**, which refers to how many items to move forward after the first item is retrieved from the list. So far, we have omitted the stride parameter, and **Python defaults to the stride of 1**, so that every item between two index numbers is retrieved.

The syntax for this construction is `list[x:y:z]`, with `z` referring to stride. Let’s make a larger list, then slice it, and give the stride a value of `2`:

In [None]:
numbers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

print(numbers[1:11:2])

Our construction `numbers[1:11:2]` selects the values from index `1` **(inclusive)** up to index `11` **(exclusive)**; the stride value of `2` then tells the program to print only every other item in that range.

We can omit the first two parameters and use stride alone as a parameter with the syntax `list[::z]`:

In [None]:
print(numbers[::3])

By printing out the list numbers with the stride set to 3, only every third item is printed:

**0**, 1, 2, **3**, 4, 5, **6**, 7, 8, **9**, 10, 11, **12**

Slicing lists with both positive and negative index numbers and indicating stride provides us with the control to manipulate lists and receive the output we’re trying to achieve.

#### List Operators
Operators can be used to make modifications to lists. We’ll review using the + and * operators and their compound forms `+=` and `*=`.

The `+` operator can be used to concatenate two or more lists together:

In [None]:
sea_creatures = ['shark', 'octopus', 'blobfish', 'mantis shrimp', 'anemone']
oceans = ['Pacific', 'Atlantic', 'Indian', 'Southern', 'Arctic']

print(sea_creatures + oceans)

Because the `+` operator can concatenate, it can be used to add an item (or several) in list form to the end of another list. Remember to place the item in square brackets:
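A minimal sketch of this, using a shorter, hypothetical list named `creatures` so that the running `sea_creatures` example is left untouched:

```python
# 'creatures' and 'longer' are illustrative names, not part of the running example.
creatures = ['shark', 'octopus', 'blobfish']

# Wrap the single item in square brackets so that both operands are lists.
longer = creatures + ['yeti crab']

print(longer)     # ['shark', 'octopus', 'blobfish', 'yeti crab']
print(creatures)  # ['shark', 'octopus', 'blobfish'] (the original is unchanged)
```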

The `*` operator can be used to multiply lists. Perhaps you need to make copies of all the files in a directory onto a server, or share a playlist with friends — in these cases you would need to multiply collections of data.

Let’s multiply the `sea_creatures` list by `2` and the `oceans` list by `3`:

In [None]:
print(sea_creatures * 2)
print(oceans * 3)

By using the `*` operator we can replicate our lists by the number of times we specify.

We can also use compound forms of the `+` and `*` operators with the assignment operator `=`. The `+=` and `*=` compound operators can be used to populate lists in a quick and automated way. You can use these operators to fill in lists with placeholders that you can modify at a later time with user-provided input, for example.

Let’s add an item in list form to the list `sea_creatures`. This item will act as a placeholder, and we’d like to add this placeholder item several times. To do this, we’ll use the `+=` operator with a `for loop`.

In [None]:
for x in range(1,4):
    sea_creatures += ['fish']
    print(sea_creatures)

For each iteration of the for loop, an extra list item of 'fish' is added to the original list sea_creatures.
`sea_creatures += ['fish']` gives the same result as `sea_creatures = sea_creatures + ['fish']`, although `+=` extends the existing list in place rather than building a new one.

We will talk more about `for` loops in a later exercise.

The `*=` operator behaves in a similar way:

In [None]:
sharks = ['shark']

for x in range(1,4):
    sharks *= 2
    print(sharks)

`sharks *= 2` gives the same result as `sharks = sharks * 2`

#### List Methods


Python provides methods that operate on lists. For example, **append** adds a new element to the end of a list:


In [None]:
t = ['a', 'b', 'c']
t.append('d')
t

**extend** takes a list as an argument and appends all of the elements:

In [5]:
t1 = ['a', 'b', 'c']
t2 = ['d', 'e']
t1.extend(t2)
t1


['a', 'b', 'c', 'd', 'e']

This example leaves `t2` unmodified.

**sort** arranges the elements of the list from low to high:


In [None]:
t = ['d', 'c', 'e', 'b', 'a']
t.sort()
t

Most `list` methods are void; they modify the list and return `None`. If you accidentally
write `t = t.sort()`, you will be disappointed with the result.

#### Deleting Elements
There are several ways to delete elements from a list. If you know the index of the
element you want, you can use `pop`:

In [8]:
t = ['a', 'b', 'c']
x = t.pop(1)
t



['a', 'c']

In [9]:
x


'b'

`pop` modifies the `list` and returns the element that was removed. If you don’t provide
an `index`, it **deletes and returns the last element**.
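For instance, a short sketch of `pop` called without an index (the variable name `last` is chosen only for this illustration):

```python
t = ['a', 'b', 'c']
last = t.pop()  # no index given, so the last element is removed

print(last)  # c
print(t)     # ['a', 'b']
```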

Items can be removed from lists by using the `del` statement. This will delete the value at the index number you specify within a list.

From the `sea_creatures` list, let’s remove the item `'octopus'`. This item is located at the `index` position of `1`. To remove the item, we’ll use the `del` statement, then call the list variable and the index number of that item:

In [None]:
del sea_creatures[1]
print(sea_creatures)

Now the item at index position `1`, the string `'octopus'`, is no longer in our list `sea_creatures`.

We can also specify a `range` with the `del` statement. Say we wanted to remove not only the item `'octopus'`, but also `'blobfish'` and `'mantis shrimp'` as well. We can call a range in sea_creatures with the del statement to accomplish this:

In [None]:
sea_creatures = ['shark', 'octopus', 'blobfish', 'mantis shrimp', 'anemone', 'yeti crab']

del sea_creatures[1:4]
print(sea_creatures)

By using a range with the `del` statement, we were able to remove the `items` between the index number of `1` **(inclusive)**, and the index number of `4` **(exclusive)**, leaving us with a list of 3 items following the removal of 3 items.

The `del` statement allows us to remove specific items from the list data type.

If you know the element you want to remove (but not the index), you can use `remove`:


In [10]:
t = ['a', 'b', 'c']
t.remove('b')
t

['a', 'c']

#### Lists and Strings


A string is a **sequence of characters** and a list is a **sequence of values**, but a list of characters is not the same as a string. To convert from a string to a list of characters, you can use `list`:

In [None]:
s = 'spam'
t = list(s)
t

Because `list` is the name of a built-in function, you should **avoid using it as a variable name**. I also avoid `l` because it looks too much like `1`. So that’s why I use `t`.

The `list` function breaks a string into individual letters. If you want to break a string
into words, you can use the `split` method:


In [None]:
s = 'pining for the fjords'
t = s.split()
t


An optional argument called a **delimiter** specifies which characters to use as word
boundaries. The following example uses a hyphen as a delimiter:


In [None]:
s = 'spam-spam-spam'
delimiter = '-'
t = s.split(delimiter)
t

**join** is the inverse of `split`. It takes a list of strings and concatenates the elements.
`join` is a string method, so you have to invoke it on the delimiter and pass the list as an
argument:


In [None]:
t = ['pining', 'for', 'the', 'fjords']
delimiter = ' '
s = delimiter.join(t)
s


In this case the **delimiter** is a space character, so `join` puts a space between words. To
concatenate strings without spaces, you can use the empty string, `''`, as a delimiter.
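A brief sketch of joining with the empty string (the names `letters` and `word` are illustrative):

```python
letters = ['s', 'p', 'a', 'm']

# Invoking join on the empty string puts nothing between the elements.
word = ''.join(letters)

print(word)                   # spam
print(list(word) == letters)  # True: list() splits it back into letters
```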


#### Aliasing
If `a` refers to an `object` and you assign `b = a`, then both variables refer to the same
object:

In [None]:
a = [1, 2, 3]
b = a
b is a


The **association of a variable** with an object is called a **reference**. In this example,
there are two references to the same object.
An object with **more than one reference** has more than one name, so we say that the
object is **aliased**.
If the aliased object is mutable, changes made with one alias affect the other:


In [None]:
b[0] = 42
a

Although this behavior can be useful, it is **error-prone**. In general, it is safer to avoid
aliasing when you are working with mutable objects.
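One way to avoid aliasing, sketched here under the assumption that a shallow copy is enough for your data, is to copy the list with a full slice before changing it:

```python
a = [1, 2, 3]
b = a[:]        # a slice of the whole list creates a new list object

print(b is a)   # False: a and b are now different objects

b[0] = 42
print(a)        # [1, 2, 3] (unaffected by the change to b)
print(b)        # [42, 2, 3]
```

Note that the copy is shallow: if the list contained nested lists, those inner lists would still be shared between `a` and `b`.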
For immutable objects like strings, aliasing is not as much of a problem. In this example:

In [None]:
a = 'banana'
b = 'banana'

It almost never makes a difference whether `a` and `b` refer to the same string or not.

### Debugging
Careless use of lists (and other mutable objects) can lead to long hours of debugging.
Here are some common pitfalls and ways to avoid them:

1. Most list methods modify the argument and return **None**. This is the opposite of
the string methods, which return a new string and leave the original alone.
If you are used to writing string code like this:


In [None]:
word = word.strip()

It is tempting to write list code like this:

In [None]:
t = t.sort() # WRONG!

Because **sort** returns `None`, the next operation you perform with t is likely to fail.
Before using **list methods and operators**, you should read the documentation
carefully and then test them in interactive mode.

2. Pick an idiom and stick with it. Part of the problem with lists is that there are too many ways to do things. For
example, to remove an element from a list, you can use `pop`, `remove`, `del`, or even
a slice assignment.
To add an element, you can use the `append` method or the `+` operator. Assuming
that `t` is a list and `x` is a list element, these are correct:

In [None]:
t.append(x)
t = t + [x]
t += [x]


And these are wrong:

In [None]:
t.append([x]) # WRONG!
t = t.append(x) # WRONG!
t + [x] # WRONG!
t = t + x # WRONG!

Try out each of these examples in interactive mode to make sure you understand
what they do. Notice that only the last one causes a runtime error; the other three
are legal, but they do the wrong thing.

3. Make copies to avoid aliasing.
If you want to use a method like `sort` that modifies the argument, but you need
to keep the original list as well, you can make a `copy`:

In [13]:
t = [3, 1, 2]
t2 = t[:]
t2.sort()
print(t)
print(t2)


[3, 1, 2]
[1, 2, 3]


In this example you could also use the built-in function `sorted`, which returns a
new, sorted list and leaves the original alone:

In [None]:
t2 = sorted(t)
print(t)
print(t2)
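Returning to the four forms marked WRONG in pitfall 2, the sketch below shows what each one does. It assumes `x` is the string `'d'` and uses separate, hypothetical lists `t1` to `t4` so that each case starts from the same contents:

```python
x = 'd'  # assumed element to add; any non-list value behaves the same way

# append([x]) adds a nested list rather than the element itself.
t1 = ['a', 'b', 'c']
t1.append([x])
print(t1)  # ['a', 'b', 'c', ['d']]

# append returns None, so the assignment throws the list away.
t2 = ['a', 'b', 'c']
t2 = t2.append(x)
print(t2)  # None

# t3 + [x] builds a new list, but the result is never stored.
t3 = ['a', 'b', 'c']
t3 + [x]
print(t3)  # ['a', 'b', 'c']

# A list and a string cannot be concatenated, so this raises TypeError.
t4 = ['a', 'b', 'c']
try:
    t4 = t4 + x
except TypeError as err:
    print('TypeError:', err)
```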

### Bug-Fixing Exercises

#### Bug-Fixing Exercise 1
The code below tries to extract `'b'` from the list, but there is an **error**. Try to fix the code.



In [None]:
elements = ['a', 'b', 'c']
print(elements(1))

#### Bug-Fixing Exercise 2
The code below aims to update `'b'` with `'x'` in elements. However, the output of the code is still `['a', 'b', 'c']`. Try to fix the code so `'b'` is replaced with `'x'`.


In [None]:
elements = ['a', 'b', 'c']
new = 'x'
new = elements[1]
print(elements)