# Other data structures

## Strings

Lists aren’t the only data types that represent ordered sequences of values. For example, strings and lists are actually similar, if you consider a string to be a "list" of single text characters. Many of the things you can do with lists can also be done with strings: *indexing*; *slicing*; and using them with `for` loops, with `len()`, and with the `in` and `not in` operators.

In [6]:
name = 'aiadventures'

In [7]:
name[0]

'a'

In [8]:
name[-2]

'e'

In [9]:
name[0:4]

'aiad'

In [10]:
'Zo' in name

False

In [11]:
'z' in name

False

In [12]:
'p' not in name

True

In [13]:
for i in name:
    print('* * * ' + i + ' * * *')

* * * a * * *
* * * i * * *
* * * a * * *
* * * d * * *
* * * v * * *
* * * e * * *
* * * n * * *
* * * t * * *
* * * u * * *
* * * r * * *
* * * e * * *
* * * s * * *


But lists and strings are different in an important way. A list value is a **mutable data type**: It can have values added, removed, or changed. However, a string is **immutable**: It cannot be changed. Trying to reassign a single character in a string results in a `TypeError` error, as you can see by entering the following into the cell below:

In [84]:
name = 'Lucy a cat'

In [90]:
name[5] = 'the'

TypeError: 'str' object does not support item assignment

The proper way to “mutate” a string is to use slicing and concatenation to build a new string by copying from parts of the old string:

In [91]:
name = 'Lucy a cat'

In [92]:
newName = name[0:5] + 'the' + name[6:10]

In [93]:
name

'Lucy a cat'

In [94]:
newName

'Lucy the cat'

We used `[0:5]` and `[6:10]` to refer to the characters that we don’t wish to replace. Notice that the original `'Lucy a cat'` string is not modified because strings are immutable.
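The same slice-and-concatenate pattern can be wrapped in a small helper. This is only a sketch: `replace_span` is a hypothetical name used for illustration, not a built-in function.

```python
def replace_span(text, start, end, replacement):
    """Return a new string with text[start:end] swapped for replacement."""
    # Slicing copies the parts we keep; concatenation builds a new string.
    return text[:start] + replacement + text[end:]

name = 'Lucy a cat'
new_name = replace_span(name, 5, 6, 'the')

print(new_name)  # Lucy the cat
print(name)      # Lucy a cat (the original string is untouched)
```

Because the helper returns a new string instead of changing `text`, it works with immutable strings without raising a `TypeError`.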

Although a list value is mutable, the second line in the following code does not modify the list `candies`.

In [95]:
candies = [1, 2, 3]

In [96]:
candies = [4, 5, 6]

In [97]:
candies

[4, 5, 6]

The list value in `candies` isn’t being changed here; rather, an entirely new and different list value `[4, 5, 6]` is overwriting the old list value `[1, 2, 3]`.

![](../images/000076.png)

If you wanted to actually modify the original list in `candies` to contain `[4, 5, 6]`, you would have to do something like this:

In [98]:
candies = [1, 2, 3]

In [99]:
del candies[2]

In [100]:
del candies[1]

In [101]:
del candies[0]

In [102]:
candies.append(4)

In [103]:
candies.append(5)

In [104]:
candies.append(6)

In [105]:
candies

[4, 5, 6]

The `del` statement and the `append()` operations are depicted below.
![](../images/000078.png)

Changing a value of a mutable data type (like what the `del` statement and `append()` method do in the previous example) changes the value in place, since the variable’s value is not replaced with a new list value.
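One way to see the difference between changing a list in place and replacing it is to give the same list a second name. In the sketch below, `alias` is a hypothetical variable added only for illustration.

```python
candies = [1, 2, 3]
alias = candies          # a second name for the same list object

candies.append(4)        # in-place change: visible through both names
print(alias)             # [1, 2, 3, 4]
print(alias is candies)  # True

candies = [4, 5, 6]      # reassignment: candies now names a new list
print(alias)             # [1, 2, 3, 4]
print(alias is candies)  # False
```

After the reassignment, `alias` still refers to the old list, which shows that overwriting a variable does not modify the list it used to hold.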

## Tuples

The tuple data type is almost identical to the list data type, except in two ways. First, tuples are typed with parentheses, `(` and `)`, instead of square brackets, `[` and `]`.

In [106]:
candies = ('hello', 42, 0.5)

In [107]:
candies[0]

'hello'

In [108]:
candies[1:3]

(42, 0.5)

In [109]:
len(candies)

3

But the main way that tuples are different from lists is that **tuples**, like strings, are **immutable**. Tuples cannot have their values modified, appended, or removed. 

In [110]:
candies[1] = 99

TypeError: 'tuple' object does not support item assignment

You can use tuples to convey to anyone reading your code that you don’t intend for that sequence of values to change. If you need an ordered sequence of values that never changes, use a tuple. A second benefit of using tuples instead of lists is that, because they are immutable and their contents don’t change, Python can implement some optimizations that make code using tuples slightly faster than code using lists.

## Using `list()` and `tuple()`

Just like how `str(42)` will return `'42'`, the string representation of the integer `42`, the functions `list()` and `tuple()` will return list and tuple versions of the values passed to them. Try entering the following code into your Jupyter Notebook, and notice that the return value is of a different data type than the value passed.

In [111]:
tuple(['cat', 'dog', 5])

('cat', 'dog', 5)

In [112]:
list(('cat', 'dog', 5))

['cat', 'dog', 5]

In [113]:
list('hello')

['h', 'e', 'l', 'l', 'o']

Converting a tuple to a list is handy if you need a mutable version of a tuple value.
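A minimal sketch of that round trip is shown here. The variable names `pets`, `editable`, and `updated` are made up for the example.

```python
pets = ('cat', 'dog', 5)

editable = list(pets)      # mutable copy of the tuple's items
editable[2] = 6            # allowed: lists support item assignment
editable.append('bird')

updated = tuple(editable)  # convert the result back into a tuple

print(pets)     # ('cat', 'dog', 5)
print(updated)  # ('cat', 'dog', 6, 'bird')
```

The original tuple is left unchanged; `list()` and `tuple()` each build a new value.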

## Dictionaries

A dictionary provides a flexible way to access and organize data. Like a list, a dictionary is a collection of many values. But unlike indexes for lists, indexes for dictionaries can use many different data types, not just integers. Indexes for dictionaries are called *keys*, and a key with its associated value is called a **key-value pair**.

In code, a dictionary is typed with braces, `{ }`.

In [114]:
myphone = {'brand': 'apple', 'color': 'gray', 'model': 'iphoneX'}

This assigns a dictionary to the `myphone` variable. This dictionary’s *keys* are `brand`, `color`, and `model`. The *values* for these keys are `apple`, `gray`, and `iphoneX`, respectively. You can access these values through their keys:

In [115]:
myphone['brand']

'apple'

In [118]:
'My phone is of ' + myphone['color'] + ' color.'

'My phone is of gray color.'

Dictionaries can still use integer values as keys, just like lists use integers for indexes, but they do not have to start at 0 and can be any number.

In [119]:
items = {12345: 'Luggage Combination', 42: 'The Answer'}

### Dictionaries vs Lists

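Unlike lists, items in dictionaries are not accessed by position. The first item in a list named `items` would be `items[0]`. But a dictionary has no positional “first” item: on a dictionary, `items[0]` looks up the key `0`. (Since Python 3.7, a dictionary does remember the order in which its keys were inserted when you loop over it, but that order plays no role in comparisons.) While the order of items matters for determining whether two lists are the same, it does not matter in what order the key-value pairs are typed in a dictionary.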

In [14]:
animals1 = ['cats', 'dogs', 'moose']
animals2 = ['dogs', 'moose', 'cats']
animals1 == animals2

False

In [15]:
mypet1 = {'name': 'badal', 'species': 'dog', 'age': '3'}
mypet2 = {'species': 'dog', 'age': '3', 'name': 'badal'}
mypet1 == mypet2

True

Because dictionary values are looked up by key rather than by position, dictionaries can’t be sliced like lists.
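The short sketch below reuses `mypet1` and `mypet2` to separate two ideas: comparison ignores the order in which pairs were typed, while looping (in Python 3.7 and later) follows insertion order. It also shows that an integer in square brackets is treated as a key, not a position.

```python
mypet1 = {'name': 'badal', 'species': 'dog', 'age': '3'}
mypet2 = {'species': 'dog', 'age': '3', 'name': 'badal'}

print(mypet1 == mypet2)  # True: same key-value pairs
print(list(mypet1))      # ['name', 'species', 'age']
print(list(mypet2))      # ['species', 'age', 'name']

try:
    mypet1[0]            # 0 is looked up as a key, not as a position
except KeyError as err:
    print('KeyError:', err)  # KeyError: 0
```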

Trying to access a key that does not exist in a dictionary raises a `KeyError`, much like indexing a list out of range raises an `IndexError`. Enter the following in the cell below, and notice the error message that shows up because there is no `'color'` key.

In [126]:
new = {'name': 'badal', 'age': 7}

In [127]:
new['color']

KeyError: 'color'

Though dictionaries are not indexed by position, the fact that you can have arbitrary values for the keys allows you to organize your data in powerful ways. Say you wanted your program to store data about your friends’ birthdays. You can use a dictionary with the names as keys and the birthdays as values.
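A birthday lookup along those lines might be sketched as follows. The names and dates are made-up sample data, and `lookup_birthday` is a hypothetical helper written for this example.

```python
# Made-up sample data: names are keys, birthdays are values.
birthdays = {'Asha': 'Apr 1', 'Bhavin': 'Dec 12', 'Chitra': 'Mar 4'}

def lookup_birthday(name):
    # Check for the key first so a missing name does not raise KeyError.
    if name in birthdays:
        return name + "'s birthday is " + birthdays[name]
    return 'No birthday on record for ' + name

print(lookup_birthday('Asha'))  # Asha's birthday is Apr 1
print(lookup_birthday('Dev'))   # No birthday on record for Dev

birthdays['Dev'] = 'Jul 9'      # assigning to a new key adds a pair
print(lookup_birthday('Dev'))   # Dev's birthday is Jul 9
```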

### Dictionary Methods

There are three dictionary methods that will return list-like values of the dictionary’s keys, values, or both keys and values: `keys()`, `values()`, and `items()`. The values returned by these methods are not true lists: They cannot be modified and do not have an `append()` method. But these data types (*dict_keys*, *dict_values*, and *dict_items*, respectively) can be used in `for` loops. Let's see how these methods work.

#### `values()` method

Let's create a dictionary named `newdict` with the following contents.

In [128]:
newdict = {'color': 'red', 'age': 42}

A `for` loop iterates over each of the values in the `newdict` dictionary:

In [129]:
for v in newdict.values():
    print(v)

red
42


#### `keys()` method

A `for` loop can also iterate over the keys.

In [130]:
for k in newdict.keys():
    print(k)

color
age


#### `items()` method

The `items()` method returns each key-value pair of `newdict` as a tuple that contains the key along with its value.

In [131]:
for i in newdict.items():
    print(i)

('color', 'red')
('age', 42)


Using the `keys()`, `values()`, and `items()` methods, a `for` loop can iterate over the keys, values, or key-value pairs in a dictionary, respectively. Notice that the values in the `dict_items` value returned by the `items()` method are tuples of the key and value.

If you want a true list from one of these methods, pass its list-like return value to the `list()` function.

In [17]:
newdict = {'color': 'red', 'age': 42}

In [18]:
newdict.keys()

dict_keys(['color', 'age'])

In [19]:
list(newdict.keys())

['color', 'age']

The `list(newdict.keys())` line takes the *dict_keys* value returned from `keys()` and passes it to `list()`, which then returns a list value of `['color', 'age']`.
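The same conversion works for `values()` and `items()`. The sketch below also checks that the view returned by `keys()` has no `append()` method, while the list built from it does; `keys_view` and `keys_list` are names chosen only for this example.

```python
newdict = {'color': 'red', 'age': 42}

keys_view = newdict.keys()
print(hasattr(keys_view, 'append'))  # False: a view is not a list

keys_list = list(keys_view)          # a real, independent list
keys_list.append('size')
print(keys_list)                     # ['color', 'age', 'size']
print(newdict)                       # {'color': 'red', 'age': 42}

print(list(newdict.values()))        # ['red', 42]
print(list(newdict.items()))         # [('color', 'red'), ('age', 42)]
```

Appending to `keys_list` does not add a key to `newdict`, because `list()` made a separate copy.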

You can also use the multiple assignment trick in a `for` loop to assign the key and value to separate variables.

In [20]:
for k, v in newdict.items():
    print('Key: ' + k + ' Value: ' + str(v))

Key: color Value: red
Key: age Value: 42


### `in` operator

Recall from the previous chapter that the `in` and `not in` operators can check whether a value exists in a list. You can also use these operators to see whether a certain key or value exists in a dictionary.

In [136]:
pet = {'name': 'Zophie', 'age': 7}

In [137]:
'name' in pet.keys()

True

In [138]:
'Zophie' in pet.values()

True

In [139]:
'color' in pet.keys()

False

### `not in` operator
 

In [140]:
'color' not in pet.keys()

True

In [141]:
'color' in pet

False

In the previous example, notice that `'color' in pet` is essentially a shorter version of writing `'color' in pet.keys()`. This is always the case: If you ever want to check whether a value is (or isn’t) a key in the dictionary, you can simply use the `in` (or `not in`) keyword with the dictionary value itself.
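One consequence worth keeping in mind is that `in` applied directly to a dictionary checks keys only. A short illustration using the same `pet` dictionary:

```python
pet = {'name': 'Zophie', 'age': 7}

print('name' in pet)             # True: 'name' is a key
print('Zophie' in pet)           # False: 'Zophie' is a value, not a key
print('Zophie' in pet.values())  # True: search the values explicitly
print('color' not in pet)        # True
```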

### Nested Dictionaries

As you model more complicated things, you may find you need dictionaries and lists that contain other dictionaries and lists. Lists are useful to contain an ordered series of values, and dictionaries are useful for associating keys with values. For example, here’s a program that uses a dictionary that contains other dictionaries in order to see who is bringing what to a picnic. The `totalBrought()` function can read this data structure and calculate the total number of an item being brought by all the guests.

In [5]:
allGuests = {'Ankur': {'samosa': 5, 'jalebi': 12},
             'Vivel': {'chakli': 3, 'samosa': 2},
             'Pranav': {'chiwda': 3, 'oreo': 1}}

In [4]:
def totalBrought(guests, item):
    numBrought = 0
    for k, v in guests.items():
        # v is one guest's dictionary; get() returns 0 if item is missing
        numBrought = numBrought + v.get(item, 0)
    return numBrought

In the `totalBrought()` function defined above, the `for` loop iterates over the key-value pairs in `guests`. Inside the loop, the string of the guest’s name is assigned to `k`, and the dictionary of picnic items they’re bringing is assigned to `v`. If the `item` parameter exists as a key in this dictionary, its value (the quantity) is added to `numBrought`. If it does not exist as a key, the `get()` method returns `0`, which is added to `numBrought` and leaves the total unchanged.
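To see what `get()` does on its own, here is a small sketch using one guest's dictionary. The variable name `ankur` is introduced only for this example.

```python
ankur = {'samosa': 5, 'jalebi': 12}

print(ankur.get('samosa', 0))  # 5: the key exists, so its value is returned
print(ankur.get('cakes', 0))   # 0: the key is missing, so the default is returned
print(ankur.get('cakes'))      # None: the default when no second argument is given
```

Using `get()` with a default of `0` lets the loop add a number for every guest without first checking whether the key is present.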

In [7]:
print('Number of things being brought:')
print(' - Chakli         ' + str(totalBrought(allGuests, 'chakli')))
print(' - Chiwda         ' + str(totalBrought(allGuests, 'chiwda')))
print(' - Cakes          ' + str(totalBrought(allGuests, 'cakes')))
print(' - Jalebi         ' + str(totalBrought(allGuests, 'jalebi')))
print(' - Oreo           ' + str(totalBrought(allGuests, 'oreo')))
print(' - Samosa         ' + str(totalBrought(allGuests, 'samosa')))

Number of things being brought:
 - Chakli         3
 - Chiwda         3
 - Cakes          0
 - Jalebi         12
 - Oreo           1
 - Samosa         7


This may seem like such a simple thing to model that you wouldn’t need to bother with writing a program to do it. But realize that this same `totalBrought()` function could easily handle a dictionary that contains thousands of guests, each bringing thousands of different picnic items. Then having this information in a data structure along with the `totalBrought()` function would save you a lot of time!

You can model things with data structures in whatever way you like, as long as the rest of the code in your program can work with the data model correctly. When you first begin programming, don’t worry so much about the “right” way to model data. As you gain more experience, you may come up with more efficient models, but the important thing is that the data model works for your program’s needs.

## Sets

The next basic collection is the `set`, an unordered collection of *unique items*.
Sets are defined much like lists and tuples, except they use the curly brackets of dictionaries, `{ }`:

In [145]:
primes = {2, 3, 5, 7}
odds = {1, 3, 5, 7, 9}
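The "unique items" property can be checked with a quick sketch. The `animals` list and the name `unique_animals` are sample values chosen for illustration.

```python
# Duplicates written in a set literal are stored only once.
odds = {1, 3, 3, 5, 5, 5}
print(len(odds))     # 3
print(sorted(odds))  # [1, 3, 5]

# set() builds a set from any sequence, dropping repeated items.
animals = ['cats', 'dogs', 'cats', 'moose', 'dogs']
unique_animals = set(animals)
print(len(unique_animals))        # 3
print('moose' in unique_animals)  # True

# Empty braces create a dictionary, so use set() for an empty set.
print(type({}))     # <class 'dict'>
print(type(set()))  # <class 'set'>
```

`sorted()` is used for printing because a set does not promise any particular order.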

If you're familiar with the mathematics of sets, you'll be familiar with operations like the *union, intersection, difference, symmetric difference, and others*.
Python's sets have all of these operations built-in, via methods or operators.
For each, we'll show the two equivalent forms, an operator and a method:

### Union

In [146]:
# union: items appearing in either
primes | odds      # with an operator
primes.union(odds) # equivalently with a method

{1, 2, 3, 5, 7, 9}

### Intersection

In [147]:
# intersection: items appearing in both
primes & odds             # with an operator
primes.intersection(odds) # equivalently with a method

{3, 5, 7}

### Difference

In [148]:
# difference: items in primes but not in odds
primes - odds           # with an operator
primes.difference(odds) # equivalently with a method

{2}

### Symmetric difference

In [149]:
# symmetric difference: items appearing in only one set
primes ^ odds                     # with an operator
primes.symmetric_difference(odds) # equivalently with a method

{1, 2, 9}
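The sketch below confirms that each operator and its method give equal results for the two sets above. It also shows one difference: the methods accept any iterable, such as a list, while the operators expect both sides to be sets. The flag `operator_failed` is a hypothetical name used only to record the outcome.

```python
primes = {2, 3, 5, 7}
odds = {1, 3, 5, 7, 9}

print((primes | odds) == primes.union(odds))                 # True
print((primes & odds) == primes.intersection(odds))          # True
print((primes - odds) == primes.difference(odds))            # True
print((primes ^ odds) == primes.symmetric_difference(odds))  # True

# A method accepts a list argument...
print(primes.union([1, 9]) == {1, 2, 3, 5, 7, 9})            # True

# ...but the operator raises TypeError when one side is a list.
operator_failed = False
try:
    primes | [1, 9]
except TypeError:
    operator_failed = True
print(operator_failed)                                       # True
```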

## More Specialized Data Structures

Python contains several other data structures that you might find useful; these can generally be found in the built-in ``collections`` module.
The `collections` module is fully documented in [Python's online documentation](https://docs.python.org/3/library/collections.html), and you can read more about the various objects available there.

In particular, we've found the following very useful on occasion:

- ``collections.namedtuple``: Like a tuple, but each value has a name
- ``collections.defaultdict``: Like a dictionary, but unspecified keys have a user-specified default value
- ``collections.OrderedDict``: Like a dictionary, but with extra order-aware features such as ``move_to_end()`` (regular dictionaries also keep insertion order since Python 3.7)
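A brief sketch of all three follows. The `Phone` type, the `counts` dictionary, and the keys used with `OrderedDict` are made up for this example; only the `collections` classes themselves come from the standard library.

```python
from collections import namedtuple, defaultdict, OrderedDict

# namedtuple: a tuple whose fields can also be read by name.
Phone = namedtuple('Phone', ['brand', 'color', 'model'])
myphone = Phone(brand='apple', color='gray', model='iphoneX')
print(myphone.brand)  # apple
print(myphone[1])     # gray

# defaultdict: missing keys start from a default (int() gives 0).
counts = defaultdict(int)
for letter in 'aiadventures':
    counts[letter] += 1
print(counts['a'])    # 2
print(counts['z'])    # 0

# OrderedDict: a dictionary with order-aware methods.
ordered = OrderedDict()
ordered['first'] = 1
ordered['second'] = 2
ordered.move_to_end('first')
print(list(ordered))  # ['second', 'first']
```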

## Conclusion

### Questionnaire

1. How would you replace 65 with 92 in the following example? `que_1 = (0,1,2,{'Name':'Suresh', 'Marks':[65,89,72,80,77]})`
2. Which data structures are known as **mutable**?
3. Are keys in a dictionary replaceable?
4. What is the difference between a *dictionary* and a *defaultdict*?
5. What is the output of the following code?

```{code-block} python
que_5 = {'key1': 'value1', 'key2': 'value2', 'key3': 'value3', 'key4': 'value4'}
que_5['key1':'key3']
```

### Further Reading

Many more methods and operations are available for each of the data types discussed here.
Please refer to Python's [online documentation](https://docs.python.org/3/library/stdtypes.html) for a complete reference.

### Exercise

We think these are enough Python concepts to develop a small game named **3 Missionaries & 3 Cannibals**. You will find it in Moodle's Projects section.