## Composite Types

Previously, we learned about some of Python's basic types: ints, floats, and Booleans.  We sometimes call these scalar types (or atomic types) because they hold one indivisible piece of information.  Of course, things get much more complicated, and we often have to hold many pieces of information together in one place. In data science, we often have to store things like measurements over time, different attributes for a single person, or entire tables of data.  To do these things, we need to organize atomic types together into larger objects; these are what we call *composite types*.

There are many different ways to organize items into larger objects. If I have n objects, I can:

- Line them up in the order that I receive them and keep adding to these as I get more
- Put each one in a container under a specific label or name, like an entry in a dictionary
- Make a package of several of them that I can't add to any longer

I can even do combinations of what is mentioned above; I could have a list of dictionaries, for example.

Now, while these are rough explanations using simple English, let's see what they actually look like in Python. We'll start with the basics and move on to things that are slightly more advanced.
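
As a quick preview before the details, the three arrangements above map roughly onto a list, a dictionary, and a tuple. This is only a sketch: the variable names and values below are invented for illustration.

```python
# A list: items kept in arrival order; we can keep adding to it.
temperatures = [21.5, 22.0]
temperatures.append(22.4)

# A dictionary: each value is stored under a label (a key).
person = {"name": "Ada", "age": 36}

# A tuple: a fixed package of items that cannot be added to.
point = (3, 4)

# A combination: a list of dictionaries.
people = [person, {"name": "Grace", "age": 45}]

print(temperatures)       # [21.5, 22.0, 22.4]
print(person["name"])     # Ada
print(point)              # (3, 4)
print(people[1]["age"])   # 45
```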

## Sequences

A sequence is an ordered arrangement of items, one after the other. A sequence has a first element, a second element, and so on. To represent a sequence in Python, we have a choice of several object types: lists, tuples, arrays, and others. The most general-purpose of these types is the list. You'll end up using lists again and again, so it's important to know how they work.

Like all sequences, a list contains items arranged in an order. The elements of a list can be a mix of types; they can include strings, integers, floats--virtually anything. You can also repeat values in a list, so the same value can appear multiple times.
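
To make that concrete, here is a small sketch of a list that mixes types and one that repeats a value. The names `mixed` and `repeats` are made up for this example.

```python
# One list holding a string, an int, a float, a Boolean, and another list.
mixed = ["spam", 42, 3.14, True, [1, 2]]

# The same value can appear more than once.
repeats = ["a", "b", "a", "a"]

print(len(mixed))       # 5
print(type(mixed[0]))   # <class 'str'>
print(repeats)          # ['a', 'b', 'a', 'a']
```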

There are several ways to create a list. The simplest is a pair of square brackets, which creates an empty list.

In [1]:
[]

[]

We can also make a list by separating items with commas and putting them within square brackets.

In [2]:
[1,2,1,2]

[1, 2, 1, 2]

We can also build a list from another sequence with the built-in `list()` function. We began learning about strings in our last lesson, so let's look a bit deeper at how they behave when passed to `list()`.

You might expect that `list('hello')` would give you a list with one item, `'hello'`. It does not.

In [3]:
list('hello')

['h', 'e', 'l', 'l', 'o']

You can see that what actually happens is that each letter becomes an item.  In fact, strings are also a sequence type in Python; they are treated as a sequence of characters.

The name `list` is a built-in in Python: it refers to the list type itself. It is not a reserved keyword, so Python will let you assign to it, but doing so hides the built-in for the rest of your code. (Hint: Don't name your list `list`.)

In [1]:
list

list
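
To see why reusing the name `list` causes trouble, here is a minimal sketch. The function name `shadow_demo` is hypothetical, and wrapping the assignment in a function is just a way to keep the damage contained so the rest of the session is unaffected.

```python
def shadow_demo():
    # Inside this function, the name `list` now refers to our data,
    # not to the built-in list type.
    list = [1, 2, 3]
    try:
        list("hello")          # tries to call [1, 2, 3] like a function
    except TypeError as error:
        return str(error)

message = shadow_demo()
print(message)                 # 'list' object is not callable

# Outside the function the built-in is untouched.
print(list("hi"))              # ['h', 'i']
```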

Many other names are fair game, so let's try a different name.

In [2]:
first_list = [1,2,3,4,5]
second_list = [6,7,8,9]

In [3]:
first_list

[1, 2, 3, 4, 5]

In [4]:
second_list

[6, 7, 8, 9]

## General Sequence Methods

A list is just one example of a sequence in Python, but it has a lot in common with other sequences. In fact, there are quite a few operations that work on nearly all sequences in Python. We'll demonstrate some of these, using lists as our example.

We can check if a certain item exists in a list with the `in` keyword.

`in` will give you an answer to the question "Does this value exist in the following list?"

It will return an object of type Boolean, which will be either `True` or `False`.

In [5]:
3 in first_list

True

In [6]:
7 in first_list

False

In [7]:
7 in second_list

True

In [9]:
3 in second_list

False

We can also check if an item is not in a list by adding the `not` keyword.  This turns a True into a False and vice versa.

In [11]:
3 not in first_list

False
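
Membership tests are not limited to lists. The sketch below tries `in` and `not in` on a tuple, a string, and a range; the variable `vowels` is an invented example. For strings, `in` also matches runs of several characters.

```python
vowels = ("a", "e", "i", "o", "u")    # a tuple
in_vowels = "e" in vowels
not_in_vowels = "y" not in vowels
print(in_vowels)                      # True
print(not_in_vowels)                  # True

# For strings, `in` also matches runs of characters (substrings).
in_string = "ell" in "hello"
print(in_string)                      # True

# range(1, 4) holds 1, 2, 3, so 4 is not a member.
in_range = 4 in range(1, 4)
print(in_range)                       # False
```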

So we can check for existence in a pretty straightforward way. Another useful operation is joining two sequences together. For example, `first_list` and `second_list` could be joined to give us the numbers from 1 to 9. All we have to do to join them is use the plus operator (`+`). This will leave the original lists unchanged and return a new list.

In [12]:
first_list + second_list

[1, 2, 3, 4, 5, 6, 7, 8, 9]

In [13]:
print(first_list)

[1, 2, 3, 4, 5]


In [14]:
print(second_list)

[6, 7, 8, 9]


In [15]:
third_list = first_list + second_list

In [16]:
third_list

[1, 2, 3, 4, 5, 6, 7, 8, 9]

Now we've got an entirely new list.  It's important to notice that our original lists are still there and unchanged.
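
One way to convince yourself that the result really is a separate list is sketched below. It uses a new name, `combined`, so nothing defined earlier is overwritten. It also relies on `is` (which checks whether two names refer to the same object) and on item assignment, neither of which has been covered yet.

```python
first_list = [1, 2, 3, 4, 5]
second_list = [6, 7, 8, 9]
combined = first_list + second_list

# The result is a separate object from either input.
print(combined is first_list)   # False

# Changing the new list leaves the originals alone.
combined[0] = 100
print(combined)                 # [100, 2, 3, 4, 5, 6, 7, 8, 9]
print(first_list)               # [1, 2, 3, 4, 5]

# + works the same way on other sequences of matching type.
print((1, 2) + (3,))            # (1, 2, 3)
print("ab" + "cd")              # abcd
```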

## Indices and Slices

We can use square brackets to extract individual items from a list, or entire slices.  The format is exactly the same as we used for strings, but we will review it here.

First, let's extract a single item from a list using its index.

In [29]:
third_list[1]

2

Notice that, just like strings, we count the items in a list starting from zero. The first item is 0 and the second one is 1 and so on. The index represents the *offset* from the first position.

That means that in order to get the first item, we have to use 0.

In [32]:
third_list[0]

1

In order to get the first five items, we need to slice from 0 to 5, because a slice stops just before the ending index: it includes indices 0 through 4, but not index 5.

In [17]:
third_list[0:5]

[1, 2, 3, 4, 5]

We can also include a step option in a slice.  The following statement will get every third number from the list, starting at the second position (index 1).

In [None]:
third_list[1::3]

Notice that we've omitted the ending index, which would normally appear between the two colons.  Remember that when we omit the second value it defaults to the end of the list.
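
To make the step concrete, here is a small sketch using the same nine-item list. The comments note which indices each slice visits.

```python
third_list = [1, 2, 3, 4, 5, 6, 7, 8, 9]

every_third_from_1 = third_list[1::3]
every_third_from_0 = third_list[0::3]
bounded = third_list[1:6:2]

print(every_third_from_1)   # [2, 5, 8]  indices 1, 4, 7
print(every_third_from_0)   # [1, 4, 7]  indices 0, 3, 6
print(bounded)              # [2, 4, 6]  indices 1, 3, 5
```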

If we want to start at the beginning of the list, we can similarly omit the first value.

In [21]:
third_list[:5]

[1, 2, 3, 4, 5]

So the complete format for taking a slice of a sequence is `[start:stop:step]`.

Slicing a list will always create a new list.
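
A short sketch of what "a new list" means in practice: even a slice that covers everything is a different object from the original. The names `first_five` and `whole_copy` are invented for this example, and `is` checks whether two names refer to the same object.

```python
third_list = [1, 2, 3, 4, 5, 6, 7, 8, 9]

first_five = third_list[0:5]
whole_copy = third_list[:]        # omitting both start and stop copies everything

print(first_five)                 # [1, 2, 3, 4, 5]
print(whole_copy == third_list)   # True: same contents
print(whole_copy is third_list)   # False: a different list object
```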

Here are three more useful list operations used to get the number of elements, their minimum value, and their maximum value.

In [24]:
print(len(third_list))
print(min(third_list))
print(max(third_list))

9
1
9


We can also look through a list to locate an item.  To do this, we would use the `index` method. This will return the index of the first item in the list that matches what we are looking for.

The following is an example of trying to locate the number 3 in a list.

In [25]:
my_list = [1,2,2,3,3,4,3]

In [26]:
my_list.index(3)

3

We can see that the first occurrence of 3 is at index 3.  Once we find one match, we might want to find the next one. We can do this by setting the start and stop points of our index method. Python will then begin searching at the start index and search up to (but not including) the stop index. These are *optional* values so they don't need to be supplied (as in the example above).

With all these values, the index method looks like this: `index(value, start, stop)`.

In [30]:
my_list.index(3,4,6) # starts at index 4 and searches up to, but not including, index 6.

4
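
Building on the start argument, one possible way to collect every position of a value is to keep searching from just past the previous match. The helper `find_all` below is hypothetical (it is not a built-in), and it relies on the fact that `index` raises a `ValueError` when the value is not found.

```python
def find_all(items, value):
    """Return every index at which value appears in items."""
    positions = []
    start = 0
    while True:
        try:
            position = items.index(value, start)
        except ValueError:
            # index raises ValueError when there is no further match
            return positions
        positions.append(position)
        start = position + 1   # resume just past the match we found

my_list = [1, 2, 2, 3, 3, 4, 3]
print(find_all(my_list, 3))    # [3, 4, 6]
print(find_all(my_list, 9))    # []
```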

We can also count how many times a value appears in a list with the `count` method; all we need to do is pass in the value that we want.

In [31]:
my_list

[1, 2, 2, 3, 3, 4, 3]

In [32]:
my_list.count(3)

3

Remember that, with a few exceptions, every operation listed so far works on all sequences in Python. That includes lists, tuples, ranges, and strings, all of which we will discuss in more detail.

In [62]:
"Eggs".count("g")

2
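
As a further sketch, here are several of the same operations applied to a tuple, a range, and a string. The variable names and values are invented for this example.

```python
letters = ("a", "b", "c", "b")    # a tuple
numbers = range(0, 10, 2)         # the values 0, 2, 4, 6, 8
word = "banana"                   # a string

print(letters.index("b"))         # 1
print(letters.count("b"))         # 2

print(len(numbers))               # 5
print(min(numbers), max(numbers)) # 0 8
print(list(numbers[1:3]))         # [2, 4]

print(word[1:4])                  # ana
print("nan" in word)              # True
print(word.count("a"))            # 3
```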

In the next segment, we'll look at some of the ways lists differ from other sequences in Python.