# Containers

Until now we have used variables that store one value at a time. What if we want to store more than one value? Let's look at an example for motivation.

Say you are given the Earth's surface temperature for the past 70 years (crucial for understanding global warming). It would be very inconvenient to store the data as shown below:
```python
temp1952 = 34
temp1953 = 34.8
# ... one variable for every year in between ...
temp2021 = 46
```
While it is hard to store each of these values in separate variables, it is even harder to perform any operations on them. What if we want to calculate the maximum or minimum temperature, or plot a graph to show the longitudinal variations in temperature? **Representing a large set of values in individual variables is practically infeasible**.

We want to have the **group of values stored in a single container, which can then be assigned to a single variable**. 

![containersgroup](images/sampleVariableList.png)

We should also be able to **read, update, and delete individual elements**.

Fortunately, Python provides us with different built-in containers for storing more than one value at a time. We will cover some of the commonly used containers (there are many more which can be installed from external sources).

## List

A list is represented as a **sequence of items (ordered), separated by commas, and enclosed in square brackets**. Let's look at some concrete examples.

In [1]:
fruitBasket = ['apple','orange','guava','cherry']
numbers = [1,56,7,67,76,55,456]
mixedList = [1,2,3,'hi','hello',False,True,34.56] #can have any datatype
emptyList = []
anotherEmptyList = list()

Can you check the type of fruitBasket declared above?

**Lists can store values of virtually any data type, including lists (nested lists) and other containers** (we will cover them later).

```python
aListOfList = [[1,2],[2,4],[4,5]]
```

Previously we have seen that a variable is assigned a location in memory and has a name (identifier) and a value associated with it. **A list will have a single name (identifier) associated with a contiguous block of memory (together in sequence)**.

![list_memory](images/list_memory_view.png)

### Accessing elements in a list using Index values

An index value is a number that refers to a given position in the list. List indexes in Python start at **0 (not 1)** and end at **length of list - 1**.

Let's see some examples

In [2]:
fruitBasket = ['apple','orange','guava','cherry']
print (fruitBasket[0]) #first element
print (fruitBasket[3]) #fourth element

apple
cherry


What do you think will happen in this case?
```python
print (fruitBasket[4])
```

You will get an IndexError because we are asking for the 5th element while the list has only 4 elements. You can also use negative indexes to access the list elements starting from the end.

In [3]:
print (fruitBasket[-1]) #last element
print (fruitBasket[-2]) #second last element

cherry
guava
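
Negative indexes are bounds-checked just like positive ones. The short sketch below is an illustration added here (it is not one of the notebook cells): it shows that stepping past the start of the list also raises an IndexError. The variable name outOfRangeMessage is only used for this example.

```python
fruitBasket = ['apple', 'orange', 'guava', 'cherry']

# For a list of four elements, valid indexes run from -4 to 3
print(fruitBasket[-4])   # same element as fruitBasket[0]

# Anything outside that range raises IndexError
try:
    print(fruitBasket[-5])
except IndexError as error:
    outOfRangeMessage = str(error)
    print('IndexError:', outOfRangeMessage)
```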


You can also access nested lists using nested indexes. Let's see an example.

In [4]:
aListOfList = [[1,2],[2,4],[4,5]]
print (aListOfList[0])
print (aListOfList[2][0])

[1, 2]
4


Now can you retrieve the last element of the last nested list using indexes?

### Accessing a sequence of elements in List using slicing

![slicing](images/slicing.png)
Slicing extracts a portion of a list through indexing. The extracted elements are contiguous unless a step is given.

The general format for slicing is

```python
listVariable[start:stop:step]
```

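where start, stop, and step are all optional. The slice begins at start and ends just before stop (the element at position stop is not included).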

Now let's look at some concrete examples

In [5]:
#our list
monthlyTemperature = [36,39,47,60,71,80,84,82,76,64,51,40]

Here monthlyTemperature is a list holding the average temperature for every month of a year (starting from Jan).

We want to extract the temperature for the first three months (Jan, Feb, Mar). We can do this using slicing.

In [6]:
print (monthlyTemperature[0:3])

[36, 39, 47]


Now we want to extract the temperature from June to October. Again we can use slicing.

In [7]:
print (monthlyTemperature[5:10])

[80, 84, 82, 76, 64]


Now can you try to extract the temperatures for the last three months (Oct, Nov, Dec)?

If you need elements till the end you could also skip the stop value. For example to extract the last 5 values from our list we can use

In [8]:
print (monthlyTemperature[7:])

[82, 76, 64, 51, 40]


And if you want the first five values you can omit the start value

In [9]:
print (monthlyTemperature[:5])

[36, 39, 47, 60, 71]


The step value can be used to do some interesting things. For example, if we want to extract the temperature for every other month (Jan, Mar, May etc.), then we can just do

In [10]:
print (monthlyTemperature[::2])

[36, 47, 71, 84, 76, 51]
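
A few more slicing behaviours are worth seeing side by side. The sketch below is an added illustration using the same monthlyTemperature list; the variable names on the left are made up for this example. It shows a step of 3, a negative step (which walks the list backwards), and the fact that a slice reaching past the end does not raise an error.

```python
monthlyTemperature = [36, 39, 47, 60, 71, 80, 84, 82, 76, 64, 51, 40]

# First month of each quarter (Jan, Apr, Jul, Oct)
quarterStarts = monthlyTemperature[0:12:3]
print(quarterStarts)      # [36, 60, 84, 64]

# A negative step walks backwards, giving a reversed copy
reversedCopy = monthlyTemperature[::-1]
print(reversedCopy)       # [40, 51, 64, 76, 82, 84, 80, 71, 60, 47, 39, 36]

# Unlike a single index, a slice that runs past the end is simply cut short
pastTheEnd = monthlyTemperature[10:100]
print(pastTheEnd)         # [51, 40]

# Slicing builds new lists; the original is unchanged
print(len(monthlyTemperature))   # 12
```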


### Modifying values inside a list

We can use indices to change values in a list. Let's see an example

In [11]:
fruitBasket = ['apple','orange','guava','cherry']

We want to change the first element of the list to 'grape'

In [12]:
fruitBasket[0] = 'grape'
print (fruitBasket)

['grape', 'orange', 'guava', 'cherry']


Now try changing the last element to 'pear'.

We can also slice to change a sequence of values. Let's change the first two elements to 'mango' and 'kiwi'

In [13]:
fruitBasket[0:2] = ['mango','kiwi']
print (fruitBasket)

['mango', 'kiwi', 'guava', 'cherry']


Now we want to change all the values to 'pineapple'

In [14]:
fruitBasket[:] = ['pineapple']*4   #this will create a new list with the element repeated 4 times
print(fruitBasket)

['pineapple', 'pineapple', 'pineapple', 'pineapple']
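
Assigning to fruitBasket[:] changes the existing list, which is different from assigning a new list to the name fruitBasket. The sketch below is an added illustration; sameBasket is a hypothetical second name introduced only to make the difference visible.

```python
fruitBasket = ['apple', 'orange', 'guava', 'cherry']
sameBasket = fruitBasket             # a second name for the same list

fruitBasket[:] = ['pineapple'] * 4   # replaces the contents in place
print(sameBasket)                    # the change is visible through both names

fruitBasket = ['kiwi'] * 4           # binds the name to a brand-new list
print(sameBasket)                    # still the pineapple list
print(fruitBasket)                   # the new kiwi list
```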


### Adding new elements to the list using append() method

The append() method of a list can be used to add new elements to an existing list (or to populate a list). You will use this method heavily when dealing with dynamic data. Let's add new fruits to our fruit basket.

In [15]:
fruitBasket = ['apple','orange','guava','cherry']

Let's add 'tomato' to our fruit basket.

In [16]:
fruitBasket.append('tomato')
print(fruitBasket)

['apple', 'orange', 'guava', 'cherry', 'tomato']


Now try to add the list [7,8] to the list given below.

In [17]:
numbers = [[1,2],[3,4],[5,6]]

### Concatenating two lists using extend() method

Suppose we have two lists and we want to concatenate them, for example [1,2,3] and ['a','b','c'] into [1,2,3,'a','b','c']. We can use the extend method on the first list.

In [18]:
first = [1,2,3]
second = ['a','b','c']
first.extend(second)
print (first)

[1, 2, 3, 'a', 'b', 'c']


So we have modified the first list and added the contents of the second list to it. Another way to do it is the + operator. But with the + operator you have to be careful, as it doesn't modify either list; rather, it creates a new list. Let's check that out.

In [19]:
first = [1,2,3]
second = ['a','b','c']
first+second
print (first)

[1, 2, 3]


Oops! We didn't get what we expected. The trick is to store the result in a new variable. Why don't you give it a try?

What will happen if we use append instead of extend?
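
For comparison, the three ways of combining lists can be run side by side. This is an added sketch rather than one of the notebook cells, and the names combined, appended, and extended are chosen only for this example.

```python
first = [1, 2, 3]
second = ['a', 'b', 'c']

# + builds a new list; first and second stay as they were
combined = first + second
print(combined)        # [1, 2, 3, 'a', 'b', 'c']
print(first)           # [1, 2, 3]

# append() adds its argument as ONE element, so we get a nested list
appended = [1, 2, 3]
appended.append(second)
print(appended)        # [1, 2, 3, ['a', 'b', 'c']]
print(len(appended))   # 4

# extend() adds each element of its argument separately
extended = [1, 2, 3]
extended.extend(second)
print(len(extended))   # 6
```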

### Deleting an element from list using the del statement
You can delete an element from the list using the del statement along with the element to delete (selected by index). Let's remove the first element from our fruit basket.

In [20]:
fruitBasket = ['apple','orange','guava','cherry']
del fruitBasket[0] # delete the first element
print (fruitBasket)

['orange', 'guava', 'cherry']


Now try deleting the last element from the fruit basket.

### Some more useful functions and methods that you can use with lists

#### The len() function to count the total number of elements in a list/sequence

In [21]:
fruitBasket = ['apple','orange','guava','cherry']
print (len(fruitBasket))

4


#### max(), min(), sum() for performing calculations on a list

In [22]:
monthlyTemperature = [36,39,47,60,71,80,84,82,76,64,51,40]
print ('Maximum temperature is',max(monthlyTemperature))
print ('Minimum temperature is',min(monthlyTemperature))
print ('Average temperature is',sum(monthlyTemperature)/len(monthlyTemperature))

Maximum temperature is 84
Minimum temperature is 36
Average temperature is 60.833333333333336


#### in operator and index method to search for elements in a list

We can use the in operator to check whether an element is present in our list. Let's check whether 'apple', 'guava' and 'Apple' are present in our fruit basket.

In [23]:
fruitBasket = ['apple','orange','guava','cherry']
print ('apple' in fruitBasket)
print ('guava' in fruitBasket)
print ('Apple' in fruitBasket)

True
True
False


As you can see 'apple' and 'Apple' are not equal. Python strings are case sensitive.

You can also use the index method of a list, which returns the index of the searched element, or raises a ValueError if the element is not present.

In [24]:
fruitBasket = ['apple','orange','guava','cherry']
print (fruitBasket.index('apple'))
print (fruitBasket.index('guava'))
print (fruitBasket.index('Apple'))

0
2


ValueError: 'Apple' is not in list
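
The functions and methods above combine naturally. The sketch below is an added illustration: it uses max() together with index() to find which month was the hottest, and uses in to avoid the ValueError from index(). The monthNames list and the fallback value -1 are assumptions made for this example.

```python
monthlyTemperature = [36, 39, 47, 60, 71, 80, 84, 82, 76, 64, 51, 40]
# Hypothetical list of labels, in the same order as the temperatures
monthNames = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
              'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']

# Position of the highest temperature, then the matching month name
hottestPosition = monthlyTemperature.index(max(monthlyTemperature))
hottestMonth = monthNames[hottestPosition]
print(hottestMonth)    # Jul

# Check with "in" first so that index() is only called for present elements
fruitBasket = ['apple', 'orange', 'guava', 'cherry']
if 'Apple' in fruitBasket:
    position = fruitBasket.index('Apple')
else:
    position = -1      # our own marker for "not found"
print(position)        # -1
```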

#### Sort a list using the sort() method

Let's sort our temperature and fruit basket lists.

In [25]:
monthlyTemperature = [36,39,47,60,71,80,84,82,76,64,51,40]
monthlyTemperature.sort()
print (monthlyTemperature)
fruitBasket = ['apple','orange','guava','cherry']
fruitBasket.sort()
print (fruitBasket)

[36, 39, 40, 47, 51, 60, 64, 71, 76, 80, 82, 84]
['apple', 'cherry', 'guava', 'orange']
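
Note that sort() rearranges the list itself and returns None. When the original order must be kept, the built-in sorted() function returns a sorted copy instead. The sketch below is an added illustration; the variable names are chosen for this example.

```python
monthlyTemperature = [36, 39, 47, 60, 71, 80, 84, 82, 76, 64, 51, 40]

# sorted() returns a new list and leaves the original untouched
coldestFirst = sorted(monthlyTemperature)
hottestFirst = sorted(monthlyTemperature, reverse=True)
print(coldestFirst[0], hottestFirst[0])   # 36 84
print(monthlyTemperature[0])              # 36 (still January's value)

# sort() works in place and returns None, so do not assign its result
result = monthlyTemperature.sort()
print(result)                             # None
print(monthlyTemperature[-1])             # 84 (the list itself is now sorted)
```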


#### Reverse a list using the reverse() method

In [26]:
fruitBasket = ['apple','orange','guava','cherry']
fruitBasket.reverse()
print (fruitBasket)

['cherry', 'guava', 'orange', 'apple']


The list is one of the most commonly used data structures in Python. While it is great for index-based operations such as retrieving the 50th element or updating the 100th element, it is not particularly efficient for searching without an index. For example, to check whether 'cherry' is present in our fruit basket, the program has to go through each element and perform the check. What if we have a list of 1 billion elements and, unfortunately, the element we are searching for is not present in the list? The program has to go through all 1 billion elements and then return False. We will now look at an efficient data structure called dict, which is specifically designed for searching based on a key.

## Dict (Dictionary)

As the name suggests, they are very **similar to the word dictionaries that we are used to**. You look up a word, using the word as the key, and get its details.

![real_dictionary](images/real_dictionary.jpg)

Another example is the **Yellow Pages**. You look up a business establishment using its **name as key** and get the **phone number as value**.

So we could basically say, **dictionaries are optimized for lookups (key-->value)**. The key thing is that **dictionaries cannot contain duplicate keys** (values, on the other hand, may repeat).

Dictionaries are declared using {} or dict(). Let's create an empty dictionary.

In [27]:
firstDict = {}

Now let's create a dictionary with some data

In [28]:
myDict = {'name':'jay','age':35,'sex':'M','vaccinated':True}

As you can see, the dict has a key-value structure: {key1:value1,key2:value2,key3:value3}
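
Keys in a dictionary are unique, while values may repeat. The sketch below is an added illustration of what happens when the same key is written twice; the dictionary names are made up for this example.

```python
# The key 'name' appears twice: the later value replaces the earlier one
repeatedKey = {'name': 'jay', 'age': 35, 'name': 'sam'}
print(repeatedKey)        # {'name': 'sam', 'age': 35}
print(len(repeatedKey))   # 2

# Repeated values under different keys are perfectly fine
sexByName = {'jay': 'M', 'ben': 'M', 'sam': 'M'}
print(len(sexByName))     # 3
```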

### Searching for a key in a dictionary

We can use the in operator (as with lists) to check whether a key exists in a dictionary. Let's look at an example.

In [29]:
myDict = {'name':'jay','age':35,'sex':'M','vaccinated':True}
print ('name' in myDict)
print ('ethnicity' in myDict)

True
False


### Retrieving the value associated with a key

While we use index in list to retrieve a particular element at a particular location, we can use the key to retrieve a value from the dictionary. 

In [30]:
myDict = {'name':'jay','age':35,'sex':'M','vaccinated':True}
print (myDict['name'])
print (myDict['sex'])

jay
M


Now you try to retrieve the value for 'ethnicity' in our dictionary. 
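
Looking up a key that is not present raises a KeyError. One common way to avoid the error is the get() method of dict, which returns None (or a default you supply) for a missing key. The sketch below is an added illustration; the default text 'unknown' is an arbitrary choice.

```python
myDict = {'name': 'jay', 'age': 35, 'sex': 'M', 'vaccinated': True}

# Square brackets raise KeyError for a missing key
try:
    print(myDict['ethnicity'])
except KeyError as error:
    print('KeyError:', error)             # KeyError: 'ethnicity'

# get() returns None, or the given default, instead of raising
missingValue = myDict.get('ethnicity')
withDefault = myDict.get('ethnicity', 'unknown')
print(missingValue)                       # None
print(withDefault)                        # unknown
print(myDict.get('name'))                 # jay
```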

### Updating the value of an existing key

We can again use the key to update the value in a dictionary

In [31]:
myDict = {'name':'jay','age':35,'sex':'M','vaccinated':True}
myDict['age'] = 36
print (myDict['age'])

36


If you want to increase the value corresponding to the key 'age' by 1 you can do this also

In [32]:
myDict = {'name':'jay','age':35,'sex':'M','vaccinated':True}
myDict['age'] = myDict['age']+1
print (myDict['age'])

36


Can you try changing the value for 'vaccinated' to False?

### Adding a new key,value to the dictionary

Similar to updating the value for an existing key, you can add a new key-value pair to the dictionary with a basic assignment operation. Let's add 'ethnicity' to our dictionary.

In [53]:
myDict = {'name':'jay','age':35,'sex':'M','vaccinated':True}
myDict['ethnicity'] = 'Asian'
print (myDict)

{'name': 'jay', 'age': 35, 'sex': 'M', 'vaccinated': True, 'ethnicity': 'Asian'}


If the key already exists then it will just be an update operation. 

### Deleting a key (and corresponding value) from a dictionary

Similar to lists you can use del statement to delete a key from a dictionary. Let's remove the key 'vaccinated' from our dictionary. 

In [54]:
myDict = {'name':'jay','age':35,'sex':'M','vaccinated':True}
del myDict['vaccinated']
print (myDict)

{'name': 'jay', 'age': 35, 'sex': 'M'}


You can also have any datatype as value for a dictionary. For example the code shown below is perfectly valid.

In [55]:
dictList = {'id':[1,2,3],'names':['jay','ben','sam']}

Now can you retrieve the second element from the list associated with the 'names' key in the dictionary?

You can also use the len() function to get the size of the dictionary (i.e. the number of keys).

In [63]:
myDict = {'name':'jay','age':35,'sex':'M','vaccinated':True}
print (len(myDict))

4


Dicts are blazingly fast for key lookups, but they are not meant for position-based access (the nth element etc.). Just for fun we can do a list vs dict speed test.

### List vs Dict Speed Test for Key based searching

Let us create a list containing 10,000,000 elements (0 to 9,999,999). Don't worry about the for loop. We will cover that in the next chapter.  

In [56]:
aList = [i for i in range(10000000)]

Now let's create a dictionary containing 10,000,000 keys of the form {0:0,1:0,....9999999:0}

In [57]:
aDict = {i:0 for i in range(10000000)}

Now let's check whether the value 10000000 occurs in our list. Since the list only goes up to 9,999,999, the value is absent and the whole list has to be scanned. We will use the Jupyter cell magic %%timeit to measure the speed.

In [58]:
%%timeit
10000000 in aList

105 ms ± 995 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)


So you can see that on our server machine the search takes around 105 milliseconds (0.105 seconds). Now let's search in the dictionary.

In [59]:
%%timeit
10000000 in aDict

41.7 ns ± 0.295 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)


It took only 41.7 nanoseconds, which is around **2.5 million times** faster than the list search. So it's imperative to use the right data structure to maximize efficiency.
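
The %%timeit magic only works inside Jupyter/IPython. As a rough sketch of how a similar comparison could be run in a plain Python script, the standard timeit module can be used instead. The container size here is reduced to 1,000,000 (an assumption made to keep the run short), and the absolute numbers will differ from machine to machine and from the figures quoted above.

```python
import timeit

size = 1_000_000                 # smaller than the 10,000,000 used above
aList = list(range(size))
aDict = {i: 0 for i in range(size)}
missing = size                   # one past the largest stored value

repeats = 10
# timeit.timeit calls the function `repeats` times and returns total seconds
listSeconds = timeit.timeit(lambda: missing in aList, number=repeats)
dictSeconds = timeit.timeit(lambda: missing in aDict, number=repeats)

print('list:', listSeconds / repeats, 'seconds per search')
print('dict:', dictSeconds / repeats, 'seconds per search')
```

The list search has to compare against every element before giving up, while the dict lookup goes almost directly to where the key would be, which is why the gap grows with the size of the container.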

There are some more built-in datastructures which we are not going to cover in detail here. Those who are interested in learning about them can go through the programming notebooks provided here https://mybinder.org/v2/gh/JayakrishnanAjayakumar/Python_Programming_2022.git/HEAD

## Tuple

Tuples are very similar to lists except for the fact that **tuples** are **immutable** (cannot be updated). A tuple is declared using either () or tuple(). Let's look at an example.

In [60]:
myTuple = ('jay','M',34,True)

You can access the elements in a tuple using index (similar to list)

In [61]:
print (myTuple[0])

jay


You cannot update the value in a tuple using index. For example, this will result in an error.

In [62]:
myTuple = ('jay','M',34,True)
myTuple[0] = 'sam'

TypeError: 'tuple' object does not support item assignment
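
Since a tuple cannot be changed in place, the usual workaround is to build a new tuple from pieces of the old one. The sketch below is an added illustration; updatedTuple, singleItem, and notATuple are names made up for this example. It also shows that a one-element tuple needs a trailing comma.

```python
myTuple = ('jay', 'M', 34, True)

# Item assignment is rejected
try:
    myTuple[0] = 'sam'
except TypeError as error:
    print('TypeError:', error)

# Build a new tuple instead: a one-element tuple plus a slice of the old one
updatedTuple = ('sam',) + myTuple[1:]
print(updatedTuple)      # ('sam', 'M', 34, True)
print(myTuple)           # ('jay', 'M', 34, True), unchanged

# Without the trailing comma the parentheses are just grouping
singleItem = ('jay',)
notATuple = ('jay')
print(type(singleItem))  # <class 'tuple'>
print(type(notATuple))   # <class 'str'>
```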

You can calculate the length of a tuple using len() function

In [64]:
myTuple = ('jay','M',34,True)
print (len(myTuple))

4


## String

A string is also (surprisingly!) a container and is very similar to lists and tuples.

### Creating a String

In [66]:
newString = 'this is a string'
#or even from a datatype conversion
ten = str(10)
print (newString)
print (ten)

this is a string
10


### Using indexing in strings

Similar to list we can use index to retrieve characters from a string (you can think of string as a list of characters)

In [68]:
newString = 'this is a string'
print (newString[0])
print (newString[0:4])
print (newString[8:])

t
this
a string


### Modifying a string

Similar to tuples, strings cannot be modified using indexes. For example, this will result in an error.

In [69]:
newString = 'this is a string'
newString[0] = 'a'

TypeError: 'str' object does not support item assignment

However, you can concatenate two strings using the + operator. This creates a new string and does not modify the existing strings.

In [70]:
goodString = 'this is a good string'
badString = ' and this is a bad string'
newString = goodString+badString
print (goodString)
print (badString)
print (newString)

this is a good string
 and this is a bad string
this is a good string and this is a bad string
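
Slicing and concatenation together give a way to get a changed version of a string without modifying the original. The sketch below is an added illustration; changedString is a name chosen for this example.

```python
newString = 'this is a string'

# Take everything after the first character and put a new first character in front
changedString = 'T' + newString[1:]
print(changedString)   # This is a string
print(newString)       # this is a string (the original is untouched)
```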


### Some useful functions and methods of strings

You can find the length of a string using the len() function.

In [71]:
name = 'jay'
print (len(name))

3


You can convert the string to upper and lower case by using the upper() and lower() methods respectively.

In [72]:
name = 'Jay'
print (name.lower())
print (name.upper())

jay
JAY


You can replace the characters in string using the replace method.

In [74]:
name = 'Mat'
newName = name.replace('M','C')
print (newName)

Cat


Check if a substring occurs in a string with the in operator.

In [76]:
comment = 'the patient was tested positive for Covid-19'
print ('Covid-19' in comment)
print ('covid-19' in comment)

True
False
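
Because the check is case sensitive, a common approach is to convert both sides to lower case before comparing. The sketch below is an added illustration combining lower() with in; the variable names are chosen for this example.

```python
comment = 'the patient was tested positive for Covid-19'
searchTerm = 'COVID-19'

# Exact, case-sensitive check
exactMatch = searchTerm in comment
print(exactMatch)             # False

# Case-insensitive check: lower-case both strings first
caseInsensitiveMatch = searchTerm.lower() in comment.lower()
print(caseInsensitiveMatch)   # True
```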


Split a string into a list (this is a very useful operation for reading comma-separated data).

In [77]:
rowStr = 'id,name,age,sex,X,Y'
rowList = rowStr.split(',')
print (rowList)

['id', 'name', 'age', 'sex', 'X', 'Y']


Can you split this string into a list?
```python
rowStr = 'id|name|age|sex|X|Y'
```

Converting a list of strings to a delimited string using join() method

In [79]:
data = ['id', 'name', 'age', 'sex', 'X', 'Y']
dataStr = ",".join(data)
print (dataStr)

id,name,age,sex,X,Y
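
split() and join() are often used together, for example to change the delimiter of a row. One thing to watch for is that join() only accepts strings, so numbers must be converted with str() first. The sketch below is an added illustration; the semicolon delimiter and the sample values are arbitrary choices.

```python
rowStr = 'id,name,age,sex,X,Y'

# Split on commas, then join the pieces with a different delimiter
semicolonRow = ';'.join(rowStr.split(','))
print(semicolonRow)    # id;name;age;sex;X;Y

# join() raises TypeError if any element is not a string
values = [1, 'jay', 35]
try:
    print(','.join(values))
except TypeError as error:
    print('TypeError:', error)

# Convert the numbers with str() before joining
valueStr = ','.join([str(values[0]), values[1], str(values[2])])
print(valueStr)        # 1,jay,35
```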


There are many more methods for strings and you can find the details in https://mybinder.org/v2/gh/JayakrishnanAjayakumar/Python_Programming_2022.git/HEAD

In the next chapter we will look at loops, which are key to manipulating containers.