<img src='images/logo_full.png'/>

## <p style="text-align: center;"> Python for Data Science </p>
The course is offered by [Ai Adventures](www.aiadventures.in). The notebooks are created by [Pranav Uikey]() and [Ankur Singh](). This material is subject to the terms and conditions of the [Creative Commons CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/) license. Any use for commercial purpose is strictly prohibited.

# List

One more data type you’ll need to understand before you can begin writing programs is the **list** data type and its cousin, the **tuple**. Lists and tuples can contain multiple values, which makes it easier to write programs that handle large amounts of data. And since lists themselves can contain other lists, you can use them to arrange data into hierarchical structures.

In this notebook, we’ll discuss the basics of lists. Then we’ll briefly cover the list-like tuple and string data types and how they compare to list values.

## The List Data Type

A list is a value that contains multiple values in an **ordered sequence**. A list looks like this: `['cat', 'bat', 'rat', 'elephant']`. Just as string values are typed with quote characters to mark where the string begins and ends, a list begins with an opening square bracket and ends with a closing square bracket, `[]`. Values inside the list are also called **items**. Items are separated with commas (that is, they are comma-delimited). For example:

In [1]:
[1,2,3] #list of integers

[1, 2, 3]

In [2]:
['cat', 'bat', 'rat', 'elephant'] #list of strings

['cat', 'bat', 'rat', 'elephant']

In [3]:
['hello', 3.1415, True, None, 42] #list of mixed data types

['hello', 3.1415, True, None, 42]

In [2]:
spam = ['cat', 'bat', 'rat', 'elephant']
spam

['cat', 'bat', 'rat', 'elephant']

The *spam* variable is assigned ['cat', 'bat', 'rat', 'elephant'].

**Note:** The value `[]` is an empty list that contains no values, similar to `''`, the empty string.

### Getting Individual Values in a List with Indexes

Say you have the list ['cat', 'bat', 'rat', 'elephant'] stored in a variable named *spam*. The Python code `spam[0]` would evaluate to 'cat', and `spam[1]` would evaluate to 'bat', and so on. The integer inside the square brackets that follows the list is called an **index**. The first value in the list is at index 0, the second value is at index 1, the third value is at index 2, and so on.

![](images/000074.png)

In [6]:
spam[0]

'cat'

In [7]:
spam[3]

'elephant'

In [8]:
['cat', 'bat', 'rat', 'elephant'][3]

'elephant'

In [9]:
'Hello ' + spam[0]

'Hello cat'

In [10]:
'The ' + spam[1] + ' ate the ' + spam[0] + '.'

'The bat ate the cat.'

**Note**: the expression `'Hello ' + spam[0]` evaluates to `'Hello ' + 'cat'` because spam[0] evaluates to the string 'cat'. This expression in turn evaluates to the string value `'Hello cat'`.

Python will give you an **IndexError** message if you use an index that is outside the range of valid indexes for the list, for example an index equal to or greater than the number of values in it.

In [11]:
spam[1000]

IndexError: list index out of range

**Indexes can only be integer values, not floats**. Using a float such as `1.0` as an index, as in the following example, will cause a **TypeError**:

In [3]:
spam[1]

'bat'

In [13]:
spam[1.0]

TypeError: list indices must be integers or slices, not float

In [14]:
spam[int(1.0)]

'bat'

##### Lists can also contain other list values. 
The values in these lists of lists can be accessed using multiple indexes, like so:

In [15]:
spam = [['cat', 'bat'], [10, 20, 30, 40, 50]]

In [16]:
spam[0]

['cat', 'bat']

In [17]:
spam[0][1]

'bat'

In [18]:
spam[1][4]

50

The **first index** dictates which list to use, and the **second index** indicates the value within that list. For example, `spam[0][1]` evaluates to 'bat', the second value in the first list. If you only use one index, the expression evaluates to the full list value at that index.

### Negative Indexes

While indexes start at 0 and go up, you can also use negative integers for the index. The integer value -1 refers to the last index in a list, the value -2 refers to the second-to-last index in a list, and so on.

In [20]:
spam =          ['cat', 'bat', 'rat', 'elephant']
#negative index [-4      -3     -2      -1      ]

In [21]:
spam[-1]

'elephant'

In [22]:
spam[-3]

'bat'

In [23]:
'The ' + spam[-1] + ' is afraid of the ' + spam[-3] + '.'

'The elephant is afraid of the bat.'
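
A negative index is, in effect, counted back from the length of the list, so `spam[-1]` is the same item as `spam[len(spam) - 1]`. The short sketch below checks that equivalence (the variable names `last_item` and `message` are only for illustration; `len()` is covered later in this notebook), and also shows that a negative index can still fall outside the list.

```python
spam = ['cat', 'bat', 'rat', 'elephant']

# A negative index -k refers to the same item as index len(spam) - k.
last_item = spam[len(spam) - 1]
print(spam[-1] == last_item)   # True
print(spam[-4] == spam[0])     # True

# A negative index that reaches past the front of the list is still an error.
try:
    spam[-5]
except IndexError as err:
    message = str(err)
    print('IndexError:', message)
```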

### Getting Sublists with Slices

Just as an index can get a single value from a list, a slice can get **several values from a list**, in the form of a new list. A slice is typed between square brackets, like an index, but it has two (or three) integers separated by a **colon**. Notice the difference between indexes and slices.

- **spam[2]** is a list with an index (one integer).

- **spam[1:4]** is a list with a slice (two integers).

- **spam[1:4:2]** is a list with a slice (three integers).

In a slice, the **first integer** is the index where the slice starts. The **second integer** is the index where the slice ends. A slice goes up to, but will not include, the value at the second index.

In the third case, the first **two integers** are the **start** and **stop** indexes, and the **third** is the **step**: the amount by which the index changes between selected items. A negative step walks through the list backward.

In [1]:
spam = ['cat', 'bat', 'rat', 'elephant']

In [2]:
spam[0:4]

['cat', 'bat', 'rat', 'elephant']

In [3]:
spam[1:3]

['bat', 'rat']

In [4]:
spam[0:-1]

['cat', 'bat', 'rat']

As a shortcut, you can leave out one or both of the indexes on either side of the colon in the slice. Leaving out the first index is the same as using 0, or the beginning of the list. Leaving out the second index is the same as using the length of the list, which will slice to the end of the list.

In [5]:
spam[:2]

['cat', 'bat']

In [6]:
spam[1:]

['bat', 'rat', 'elephant']

In [7]:
spam[:]

['cat', 'bat', 'rat', 'elephant']

In [8]:
spam[1:4:2]

['bat', 'elephant']

In [9]:
spam[1::3]

['bat']

In [10]:
spam[:5:3]

['cat', 'elephant']

In [11]:
spam[-1:-5:-1]

['elephant', 'rat', 'bat', 'cat']

In [12]:
spam[-1:-4:-2]

['elephant', 'bat']

In [13]:
spam[::-2]

['elephant', 'bat']
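
Two properties of slices are easy to miss in the examples above, so here is a small sketch of both. First, a slice always produces a new list, so changing the result does not touch the original. Second, slice bounds that run past the end of the list are clipped instead of raising an IndexError. The names `copy_of_spam`, `clipped`, and `empty` are made up for this illustration.

```python
spam = ['cat', 'bat', 'rat', 'elephant']

# A full slice builds a new list containing the same items.
copy_of_spam = spam[:]
copy_of_spam[0] = 'dog'
print(spam)          # ['cat', 'bat', 'rat', 'elephant']
print(copy_of_spam)  # ['dog', 'bat', 'rat', 'elephant']

# Out-of-range slice bounds are clipped rather than raising IndexError.
clipped = spam[2:100]
print(clipped)       # ['rat', 'elephant']
empty = spam[10:20]
print(empty)         # []
```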

### Getting a List’s Length with *len()*

The *len()* function will return the number of values that are in a list value passed to it, just like it can count the number of characters in a string value.

In [30]:
len(spam)

4

### Changing Values in a List with Indexes

Normally a variable name goes on the left side of an assignment statement, like `spam = 42`. However, you can also use an index of a list to change the value at that index. For example, `spam[1] = 'aardvark'` means “Assign the value at index 1 in the list spam to the string 'aardvark'.”

In [31]:
spam = ['cat', 'bat', 'rat', 'elephant']

In [32]:
spam[1] = 'aardvark'

In [33]:
spam

['cat', 'aardvark', 'rat', 'elephant']

In [34]:
spam[2] = spam[1]

In [35]:
spam

['cat', 'aardvark', 'aardvark', 'elephant']

In [36]:
spam[-1] = 12345

In [37]:
spam

['cat', 'aardvark', 'aardvark', 12345]

### List Concatenation and List Replication

The `+` operator can combine two lists to create a new list value in the same way it combines two strings into a new string value.

In [40]:
[1, 2, 3] + ['A', 'B', 'C']

[1, 2, 3, 'A', 'B', 'C']

The `*` operator can also be used with a list and an integer value to replicate the list. 

In [39]:
['X', 'Y', 'Z'] * 3

['X', 'Y', 'Z', 'X', 'Y', 'Z', 'X', 'Y', 'Z']

In [41]:
spam = [1, 2, 3]

In [42]:
spam = spam + ['A', 'B', 'C']

In [43]:
spam

[1, 2, 3, 'A', 'B', 'C']
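
As a small sketch of the same idea: `+` and `*` build a new list and leave their operands unchanged, and `+` only works when both sides are lists. The variable names here (`numbers`, `letters`, `combined`, `zeros`, `error_type`) are illustrative only.

```python
numbers = [1, 2, 3]
letters = ['A', 'B', 'C']

# Concatenation returns a new list; numbers and letters are untouched.
combined = numbers + letters
print(combined)  # [1, 2, 3, 'A', 'B', 'C']
print(numbers)   # [1, 2, 3]

# Replication is a handy way to build a list of repeated values.
zeros = [0] * 5
print(zeros)     # [0, 0, 0, 0, 0]

# A list can only be concatenated with another list.
try:
    numbers + 'ABC'
except TypeError as err:
    error_type = type(err).__name__
    print(error_type)  # TypeError
```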

### Removing Values from Lists with *del* Statements

The *del* statement will delete the value at an index in a list. All of the values in the list after the deleted value will shift one index toward the front of the list.

In [44]:
spam = ['cat', 'bat', 'rat', 'elephant']

In [45]:
del spam[2]

In [46]:
spam

['cat', 'bat', 'elephant']

In [47]:
del spam[2]

In [48]:
spam

['cat', 'bat']

The *del* statement can also be used on a simple variable to delete it, as if it were an “unassignment” statement. If you try to use the variable after deleting it, you will get a *NameError* because the variable no longer exists.

In practice, you almost never need to delete simple variables. The del statement is mostly used to delete values from lists.
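
For completeness, here is a minimal sketch of deleting a whole variable. The `try`/`except` is only there so the cell keeps running after the error, and the flag `name_error_raised` is a made-up name for this illustration.

```python
spam = ['cat', 'bat']
del spam  # the name spam no longer exists after this line

name_error_raised = False
try:
    spam  # using the deleted name raises NameError
except NameError as err:
    name_error_raised = True
    print('NameError:', err)
```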

### Using *for* Loops with Lists

You learned about using *for* loops to execute a block of code a certain number of times. Technically, a *for* loop repeats the code block once for each value in a list or list-like value. For example, consider this code and its output:

In [51]:
for i in range(4):
    print(i)

0
1
2
3


The loop prints 0 through 3 because the return value from `range(4)` is a list-like value that Python considers similar to `[0, 1, 2, 3]`. The following program has the same output as the previous one:

In [52]:
for i in [0, 1, 2, 3]:
    print(i)

0
1
2
3


What the previous *for* loop actually does is loop through its clause with the variable i set to a successive value in the [0, 1, 2, 3] list in each iteration.

A common Python technique is to use `range(len(someList))` with a for loop to iterate over the indexes of a list. 

In [53]:
supplies = ['pens', 'staplers', 'flame-throwers', 'binders']
for i in range(len(supplies)):
    print('Index ' + str(i) + ' in supplies is: ' + supplies[i])

Index 0 in supplies is: pens
Index 1 in supplies is: staplers
Index 2 in supplies is: flame-throwers
Index 3 in supplies is: binders


Using `range(len(supplies))` in the previously shown for loop is handy because the code in the loop can access the index (as the variable i) and the value at that index (as `supplies[i]`). Best of all, `range(len(supplies))` will iterate through all the indexes of supplies, no matter how many items it contains.
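
An alternative that many Python programmers prefer is the built-in `enumerate()` function, which hands the loop both the index and the value on each iteration. It is not covered elsewhere in this notebook, so treat the sketch below as an optional variation on the loop above; the `lines` list is a made-up helper used to collect the output.

```python
supplies = ['pens', 'staplers', 'flame-throwers', 'binders']

lines = []
# enumerate() yields (index, value) pairs, so no supplies[i] lookup is needed.
for i, item in enumerate(supplies):
    lines.append('Index ' + str(i) + ' in supplies is: ' + item)

for line in lines:
    print(line)
```

This prints the same four lines as the `range(len(supplies))` version.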

### The *in* and *not in* Operators

You can determine whether a value is or isn’t in a list with the *in* and *not in* operators. Like other operators, *in* and *not in* are used in expressions and connect two values: a value to look for in a list and the list where it may be found. These expressions will evaluate to a Boolean value.

In [54]:
'howdy' in ['hello', 'hi', 'howdy', 'heyas']

True

In [55]:
spam = ['hello', 'hi', 'howdy', 'heyas']
'cat' in spam

False

In [56]:
'howdy' not in spam

False

In [57]:
'cat' not in spam

True

For example, the following program lets the user type in a pet name and then checks to see whether the name is in a list of pets.

In [None]:
myPets = ['Zophie', 'Pooka', 'Fat-tail']
print('Enter a pet name:')
name = input()
if name not in myPets:
    print('I do not have a pet named ' + name)
else:
    print(name + ' is my pet.')

### The Multiple Assignment Trick

The multiple assignment trick is a shortcut that lets you assign multiple variables with the values in a list in one line of code. So instead of doing this:

![](images/Magic.jpg)

In [58]:
cat = ['fat', 'orange', 'loud']

In [59]:
size = cat[0]

In [60]:
color = cat[1]

In [61]:
disposition = cat[2]

In [62]:
# you could type this line of code:
size, color, disposition = cat

The number of variables and the length of the list must be exactly equal, or Python will give you a ValueError:

In [63]:
size, color, disposition, name = cat

ValueError: not enough values to unpack (expected 4, got 3)

The multiple assignment trick can also be used to swap the values in two variables:

In [64]:
a, b = 'Alice', 'Bob'

In [65]:
a, b = b, a

In [67]:
print(a); print(b)

Bob
Alice
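
The ValueError shown earlier comes from having too many variables. The opposite mismatch, too few variables for the list, fails as well. The sketch below catches the error so the cell keeps running; the exact wording of the message can differ between Python versions, so only its beginning is checked. The names `message` and `starts_as_expected` are illustrative.

```python
cat = ['fat', 'orange', 'loud']

try:
    size, color = cat  # two variables, three values
except ValueError as err:
    message = str(err)
    print('ValueError:', message)

# The message begins the same way across Python 3 versions.
starts_as_expected = message.startswith('too many values to unpack')
print(starts_as_expected)  # True
```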


### Adding Values to Lists with the *append()* and *insert()* Methods

To add new values to a list, use the *append()* and *insert()* methods. Enter the following into the interactive cell to call the *append()* method on a list value stored in the variable spam:

In [68]:
spam = ['cat', 'dog', 'bat']

In [None]:
spam.append('moose')
spam

The previous *append()* method call adds the argument to the end of the list. The *insert()* method can insert a value at any index in the list. The first argument to *insert()* is the index for the new value, and the second argument is the new value to be inserted. 

In [None]:
spam = ['cat', 'dog', 'bat']

In [None]:
spam.insert(1, 'chicken')
spam

Notice that the code is `spam.append('moose')` and `spam.insert(1, 'chicken')`, not `spam = spam.append('moose')` and `spam = spam.insert(1, 'chicken')`. Neither *append()* nor *insert()* gives the new value of spam as its return value. (In fact, the return value of *append()* and *insert()* is *None*, so you definitely wouldn’t want to store this as the new variable value.) Rather, the list is modified in place.
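
A quick sketch makes the point concrete. Here `result` is a made-up variable that captures what *append()* returns.

```python
spam = ['cat', 'dog', 'bat']

# append() changes the list in place and returns None.
result = spam.append('moose')
print(result)  # None
print(spam)    # ['cat', 'dog', 'bat', 'moose']

# insert() also works in place; later items shift one position to the right.
spam.insert(1, 'chicken')
print(spam)    # ['cat', 'chicken', 'dog', 'bat', 'moose']
```

Writing `spam = spam.append('moose')` would therefore replace the list in `spam` with `None`.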

### Removing Values from Lists with *remove()*

The *remove()* method is passed the value to be removed from the list it is called on. 

In [None]:
spam = ['cat', 'bat', 'rat', 'elephant']

In [None]:
spam.remove('bat')
spam

Attempting to remove a value that does not exist in the list will result in a *ValueError*:

In [None]:
spam.remove('chicken')

If the value appears multiple times in the list, only the first instance of the value will be removed:

In [69]:
spam = ['cat', 'bat', 'rat', 'cat', 'hat', 'cat']

In [70]:
spam.remove('cat')
spam

['bat', 'rat', 'cat', 'hat', 'cat']

The *del* statement is good to use when you know the index of the value you want to remove from the list. The *remove()* method is good when you know the value you want to remove from the list.
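
The two approaches can be put side by side, as in the sketch below. It also shows one possible way to avoid the ValueError from *remove()*: check with the *in* operator first. The `found` variable is a hypothetical name for this example.

```python
spam = ['cat', 'bat', 'rat', 'elephant']

del spam[0]         # remove by index: deletes 'cat'
spam.remove('rat')  # remove by value: deletes the first 'rat'
print(spam)         # ['bat', 'elephant']

# Guard remove() with "in" so a missing value does not raise ValueError.
found = 'chicken' in spam
if found:
    spam.remove('chicken')
else:
    print('chicken is not in the list')
```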

### List-like Types: Strings and Tuples

Lists aren’t the only data types that represent ordered sequences of values. For example, strings and lists are actually similar, if you consider a string to be a “list” of single text characters. Many of the things you can do with lists can also be done with strings: **indexing; slicing; and using them with for loops, with len(), and with the in and not in operators.**

In [2]:
name = 'Ai_Adventures'

In [3]:
name[0]

'A'

In [4]:
name[-2]

'e'

In [5]:
name[0:4]

'Ai_A'

In [6]:
'Zo' in name

False

In [7]:
'z' in name

False

In [8]:
'p' not in name

True

In [9]:
for i in name:
    print(i)

A
i
_
A
d
v
e
n
t
u
r
e
s


### Mutable and Immutable Data Types

But lists and strings are different in an important way. A list value is a mutable data type: it can have values added, removed, or changed. However, a string is immutable: it cannot be changed. Trying to reassign a single character in a string results in a *TypeError*, as you can see by entering the following into the cell below:

In [82]:
name = 'Zophie a cat'

In [83]:
name[7] = 'the'

TypeError: 'str' object does not support item assignment

The proper way to “mutate” a string is to use slicing and concatenation to build a new string by copying from parts of the old string:

In [84]:
name = 'Zophie a cat'

In [85]:
newName = name[0:7] + 'the' + name[8:12]

In [86]:
name

'Zophie a cat'

In [87]:
newName

'Zophie the cat'

We used [0:7] and [8:12] to refer to the characters that we don’t wish to replace. Notice that the original 'Zophie a cat' string is not modified because strings are immutable.

Although a list value is mutable, the second line in the following code does not modify the list eggs:

In [88]:
eggs = [1, 2, 3]

In [89]:
eggs = [4, 5, 6]

In [90]:
eggs

[4, 5, 6]

The list value in eggs isn’t being changed here; rather, an entirely new and different list value ([4, 5, 6]) is overwriting the old list value ([1, 2, 3]).

![](images/000076.jpg)

If you wanted to actually modify the original list in eggs to contain [4, 5, 6], you would have to do something like this:

In [91]:
eggs = [1, 2, 3]

In [92]:
del eggs[2]

In [93]:
del eggs[1]

In [94]:
del eggs[0]

In [95]:
eggs.append(4)

In [96]:
eggs.append(5)

In [97]:
eggs.append(6)

In [98]:
eggs

[4, 5, 6]

The *del* statements and *append()* calls above are depicted below.

![](images/000078.jpg)

Changing a value of a mutable data type (like what the *del* statement and *append()* method do in the previous example) changes the value in place, since the variable’s value is not replaced with a new list value.
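
One way to observe the difference between modifying in place and overwriting is the built-in `id()` function, which returns a number identifying the object a variable currently holds. `id()` is not otherwise used in this notebook, so take this as an optional sketch; the names `id_before`, `same_after_mutation`, and `same_after_reassignment` are illustrative.

```python
eggs = [1, 2, 3]
id_before = id(eggs)

# In-place changes keep the same list object.
eggs.append(4)
del eggs[0]
same_after_mutation = id(eggs) == id_before
print(eggs)                     # [2, 3, 4]
print(same_after_mutation)      # True

# Assignment replaces the list with a different object.
eggs = [4, 5, 6]
same_after_reassignment = id(eggs) == id_before
print(same_after_reassignment)  # False
```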

### The Tuple Data Type
The tuple data type is almost identical to the list data type, except in two ways. First, tuples are typed with **parentheses**, `(` and `)`, instead of square brackets, `[` and `]`.

In [99]:
eggs = ('hello', 42, 0.5)

In [100]:
eggs[0]

'hello'

In [101]:
eggs[1:3]

(42, 0.5)

In [102]:
len(eggs)

3

But the main way that tuples are different from lists is that tuples, like strings, are **immutable**. Tuples cannot have their values modified, appended, or removed. 

In [103]:
eggs[1] = 99

TypeError: 'tuple' object does not support item assignment

You can use tuples to convey to anyone reading your code that you don’t intend for that sequence of values to change. If you need an ordered sequence of values that never changes, use a tuple. A second benefit of using tuples instead of lists is that, because they are immutable and their contents don’t change, Python can implement some optimizations that make code using tuples slightly faster than code using lists.
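
One detail of the parentheses syntax is worth a short sketch: a tuple with a single value needs a trailing comma, because parentheses around a lone value are treated as ordinary grouping. The variable names `single` and `not_a_tuple` are made up for this example.

```python
# The trailing comma is what makes this a one-item tuple.
single = ('hello',)
print(type(single))       # <class 'tuple'>
print(len(single))        # 1

# Without the comma, the parentheses just group a string.
not_a_tuple = ('hello')
print(type(not_a_tuple))  # <class 'str'>
```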

### Converting Types with the *list()* and *tuple()* Functions

Just like how `str(42)` will return '42', the string representation of the integer 42, the functions `list()` and `tuple()` will return list and tuple versions of the values passed to them. Enter the following into the cell, and notice that the return value is of a different data type than the value passed:

In [104]:
tuple(['cat', 'dog', 5])

('cat', 'dog', 5)

In [105]:
list(('cat', 'dog', 5))

['cat', 'dog', 5]

In [106]:
list('hello')

['h', 'e', 'l', 'l', 'o']

Converting a tuple to a list is handy if you need a mutable version of a tuple value.
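
A minimal sketch of that round trip follows: convert the tuple to a list, change the list, and convert back if a tuple is still wanted. The names `pets`, `as_list`, and `updated` are illustrative.

```python
pets = ('cat', 'dog', 5)

# Make a mutable copy, change it, then freeze it again as a tuple.
as_list = list(pets)
as_list[2] = 6
as_list.append('bird')
updated = tuple(as_list)

print(pets)     # ('cat', 'dog', 5)
print(updated)  # ('cat', 'dog', 6, 'bird')
```

The original tuple in `pets` is unchanged; `updated` is a new tuple.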

## Summary

Lists are useful data types since they allow you to write code that works on a modifiable number of values in a single variable.

Lists are **mutable**, meaning that their contents can change. Tuples and strings, although list-like in some respects, are immutable and cannot be changed. A variable that contains a tuple or string value can be overwritten with a new tuple or string value, but this is not the same thing as modifying the existing value in place, as the **append()** or **remove()** methods do on lists.