# Data Structures

In simple terms, a data structure is a way of organizing a collection of data so that it can be stored and accessed in a particular form.

## Lists

Lists are the most commonly used data structure. Think of a list as a sequence of values enclosed in square brackets and separated by commas. Each value can be accessed by its index.

Lists can be declared in literal form with `[ ]`:

In [12]:
x = ['apple', 'pear', 'orange']

In [13]:
type(x)

list

### Indexing

In Python, indexing starts from 0. Thus the list `x`, which has three elements, has 'apple' at index 0, 'pear' at index 1 and 'orange' at index 2.

In [14]:
x[0]

'apple'

Indexing can also be done in reverse order, so that the last element is accessed first. Here, indexing starts from -1. Thus index -1 is 'orange', index -2 is 'pear' and index -3 is 'apple'.

In [15]:
x[-1]

'orange'

As you might have already guessed, `x[0] == x[-3]`, `x[1] == x[-2]` and `x[2] == x[-1]`. This concept extends to lists with many more elements.
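
The relationship between positive and negative indices can be checked directly. The following is a minimal sketch (the names `fruits` and `pairs` are chosen only for this illustration): for a list of length `n`, index `-k` refers to the same element as index `n - k`.

```python
# A small list to illustrate positive and negative indexing.
fruits = ['apple', 'pear', 'orange']
n = len(fruits)

# Index -k refers to the same element as index n - k.
pairs = [(fruits[-k], fruits[n - k]) for k in range(1, n + 1)]
print(pairs)
```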

In [16]:
y = ['carrot','potato']

Here we have declared two lists, `x` and `y`, each containing its own data. These two lists can in turn be put into another list, say `z`, whose elements are the two lists. A list inside a list is called a nested list, and it is how an array would be declared, as we will see later.

In [17]:
z = [x,y]
z

[['apple', 'pear', 'orange'], ['carrot', 'potato']]

Indexing in nested lists can be quite confusing if you do not understand how indexing works in Python. So let us break it down step by step.

Let us access the data 'apple' in the above nested list.
First, at index 0 there is a list ['apple','pear','orange'] and at index 1 there is another list ['carrot','potato']. Hence `z[0]` should give us the first list, which contains 'apple'.

In [18]:
z1 = z[0]
z1

['apple', 'pear', 'orange']

Now observe that `z1` is not a nested list at all. Thus, to access 'apple', `z1` should be indexed at 0.

In [19]:
z1[0]

'apple'

Instead of doing the above in two steps, in Python you can access 'apple' by writing the index values side by side.

In [20]:
z[0][0]

'apple'

If there were a list inside a list inside a list, you could access the innermost value with three indices, as in `z[ ][ ][ ]`.
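
As a minimal sketch of three levels of nesting (the list `deep` and its contents are hypothetical example data, not part of the lists defined above), each pair of brackets peels off one level:

```python
# A list inside a list inside a list (hypothetical example data).
deep = [[['a', 'b'], ['c', 'd']], [['e', 'f'], ['g', 'h']]]

# Each pair of brackets peels off one level of nesting.
outer = deep[1]        # the second of the two outer elements
middle = deep[1][0]    # the first list inside that element
inner = deep[1][0][1]  # the second item of that innermost list
print(outer, middle, inner)
```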

### Slicing

Indexing is limited to accessing a single element. Slicing, on the other hand, accesses a sequence of elements inside the list. In other words, it "slices" the list.

Slicing is done by giving the index of the first element to include and the index at which to stop. It is written as `parentlist[a:b]`, where `a` and `b` are index values from the parent list; the element at index `a` is included and the element at index `b` is excluded. If `a` is omitted, the slice starts at the beginning of the list; if `b` is omitted, it runs to the end of the list.

In [26]:
num = [0,1,2,3,4,5,6,7,8,9]
num[0:4]

[0, 1, 2, 3]

In [23]:
num[4:]

[4, 5, 6, 7, 8, 9]

You can also slice a parent list with a step length, written as `parentlist[a:b:c]`, which takes every `c`-th element starting from index `a`.

In [25]:
num[:9:3]

[0, 3, 6]
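
The slicing forms can be compared side by side. The sketch below redefines `num` so that it runs on its own; the other variable names are chosen only for this illustration. It also shows two forms not used above, negative indices in a slice and a negative step.

```python
num = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

first_four = num[0:4]      # indices 0, 1, 2, 3; index 4 is excluded
every_other = num[::2]     # step of 2 over the whole list
last_three = num[-3:]      # negative indices work in slices too
reversed_copy = num[::-1]  # a negative step walks backwards
print(first_four, every_other, last_three, reversed_copy)
```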

### Built in List Functions

To find the length of the list or the number of elements in a list, `len( )` is used.

In [27]:
len(num)

10

If the list consists of all integer elements then `min( )` and `max( )` give the minimum and maximum value in the list.

In [28]:
min(num)

0

In [29]:
max(num)

9

Lists can be concatenated by adding them with `+`. The resultant list contains all the elements of the lists that were added, and it is not a nested list.

In [30]:
[1,2,3] + [5,4,7]

[1, 2, 3, 5, 4, 7]

You may need to check whether a particular element is present in a predefined list. Consider the list below.

In [31]:
names = ['Earth','Air','Fire','Water']

Suppose we want to check whether 'Fire' and 'Rajath' are present in the list `names`. A conventional approach would be to use a for loop to iterate over the list with an if condition. But in Python you can use the `a in b` expression, which returns `True` if a is present in b and `False` if not.

In [32]:
'Fire' in names

True

In [33]:
'Rajath' in names

False
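
For comparison, the loop-based approach mentioned above could look like the sketch below. The helper `contains` is a hypothetical function written only for this illustration; in practice the `in` operator is the idiomatic choice.

```python
names = ['Earth', 'Air', 'Fire', 'Water']

def contains(items, target):
    """Loop-based membership check (hypothetical helper, for comparison only)."""
    for item in items:
        if item == target:
            return True
    return False

# Both approaches give the same answer.
print(contains(names, 'Fire'), 'Fire' in names)
print(contains(names, 'Rajath'), 'Rajath' in names)
```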

`max( )` and `min( )` also apply to a list whose elements are strings. `max( )` returns the string that comes last in ASCII (lexical) order and `min( )` the one that comes first. Strings are compared character by character: the first character of each element is compared first, and only if those are the same is the second character considered, and so on.

In [34]:
mlist = ['bzaa','ds','nc','az','z','klm']

In [35]:
max(mlist), min(mlist)

('z', 'az')

Here the first character of each element is considered: 'z' has the highest ASCII value, so 'z' is returned as the maximum, and 'a' has the lowest, so 'az' is returned as the minimum. But what if numbers are declared as strings?

In [36]:
nlist = ['1','94','93','1000']

In [38]:
max(nlist), min(nlist)

('94', '1')

Here, the strings are still compared as strings (that is, lexically) even though they "look like" numbers.

But if you want to find the `max( )` string element based on the length of the string, then another parameter, `key=len`, is passed to the `max( )` and `min( )` functions.

In [39]:
max(names, key=len), min(names, key=len)

('Earth', 'Air')

But 'Water' also has length 5. `max()` and `min()` return the first such element when two or more elements have the same length.

Any other built-in function, or a lambda function (discussed later), can be used in place of `len`.
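
As a minimal sketch of other `key` functions (the lists are redefined so the example runs on its own, and the choice of "last character" as a criterion is just an illustration), `key=int` makes the numeric strings compare by value, and a lambda can express any custom criterion:

```python
nlist = ['1', '94', '93', '1000']
names = ['Earth', 'Air', 'Fire', 'Water']

# key=int compares the strings by their numeric value;
# the original string (not the int) is returned.
numeric_max = max(nlist, key=int)

# A lambda as key: compare names by their last character.
by_last_char = max(names, key=lambda s: s[-1])
print(numeric_max, by_last_char)
```

Note that 'Air' and 'Water' both end in 'r'; as with `key=len`, the first of the tied elements is returned.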

A string can be converted into a list by using the `list()` function.

In [40]:
list('hello')

['h', 'e', 'l', 'l', 'o']

`append( )` is used to add an element at the end of the list.

In [41]:
lst = [1,1,4,8,7]

In [42]:
lst.append(1)
lst

[1, 1, 4, 8, 7, 1]

`count( )` is used to count the number of times a particular element occurs in the list.

In [43]:
lst.count(1)

3

`append( )` can also be used to add an entire list at the end. Observe that the resultant list becomes a nested list.

In [44]:
lst1 = [5,4,2,8]

In [45]:
lst.append(lst1)
lst

[1, 1, 4, 8, 7, 1, [5, 4, 2, 8]]

But if a nested list is not what is desired, then the `extend( )` function can be used.

In [46]:
lst.extend(lst1)
lst

[1, 1, 4, 8, 7, 1, [5, 4, 2, 8], 5, 4, 2, 8]
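
The difference between the two methods is easier to see on fresh lists. In this sketch the names `base`, `extra`, `appended` and `extended` are chosen only for the illustration:

```python
base = [1, 2, 3]
extra = [4, 5]

appended = list(base)     # copy so that base itself is left untouched
appended.append(extra)    # adds the whole list as one element

extended = list(base)
extended.extend(extra)    # adds each element of extra individually

print(appended, len(appended))
print(extended, len(extended))
```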

`index( )` is used to find the index value of a particular element. Note that if there are multiple elements of the same value then the first index value of that element is returned.

In [47]:
lst.index(1)

0

`insert(x,y)` is used to insert an element y at a specified index value x, whereas `append( )` can only add at the end.

In [48]:
lst.insert(5, 'name')
lst

[1, 1, 4, 8, 7, 'name', 1, [5, 4, 2, 8], 5, 4, 2, 8]

`insert(x,y)` inserts but does not replace an element. If you want to replace an element with another element, simply assign the value to that particular index.

In [49]:
lst[5] = 'Python'
lst

[1, 1, 4, 8, 7, 'Python', 1, [5, 4, 2, 8], 5, 4, 2, 8]

`pop( )` removes the last element of the list and returns it. This is similar to the operation of a stack. Hence it wouldn't be wrong to say that lists can be used as a stack.

In [50]:
lst.pop()

8

An index value can be specified to pop the element at that index.

In [51]:
lst.pop(0)

1
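
A minimal sketch of using a list as a stack, with `append( )` as "push" and `pop( )` as "pop" (the names `stack` and `popped` are chosen for this illustration):

```python
stack = []

# Push three items onto the stack.
for item in ['a', 'b', 'c']:
    stack.append(item)

# Pop them off again; the most recently pushed item comes out first.
popped = [stack.pop() for _ in range(3)]
print(popped, stack)
```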

`pop( )` removes an element based on its index value, and the removed element can be assigned to a variable. One can also remove an element by specifying the element itself using the `remove( )` function.

In [52]:
lst.remove('Python')
lst

[1, 4, 8, 7, 1, [5, 4, 2, 8], 5, 4, 2]

An alternative to the `remove( )` function that uses the index value instead is `del`.

In [54]:
del lst[1]
lst

[1, 8, 7, 1, [5, 4, 2, 8], 5, 4, 2]
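
Two details are worth checking on a small example: `remove( )` deletes only the first matching element, and it raises `ValueError` when the element is absent. The list `items` and the other names below are assumptions made for this sketch:

```python
items = [1, 4, 8, 4, 7]

items.remove(4)      # removes the first 4 only
after_remove = list(items)

del items[0]         # removes whatever sits at index 0
after_del = list(items)

try:
    items.remove(99)  # 99 is not in the list
    missing_raised = False
except ValueError:
    missing_raised = True

print(after_remove, after_del, missing_raised)
```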

All the elements in the list can be reversed in place by using the `reverse()` function.

In [55]:
lst.reverse()
lst

[2, 4, 5, [5, 4, 2, 8], 1, 7, 8, 1]

Note that the nested list [5,4,2,8] is treated as a single element of the parent list `lst`. Thus the elements inside the nested list are not reversed.

Python offers the built-in operation `sort( )` to arrange the elements in ascending order.

In [57]:
lst = [9, 6, 3, 7, 2, 1, 42]
lst.sort()
lst

[1, 2, 3, 6, 7, 9, 42]

For descending order, use the `reverse` parameter. By default `reverse` is `False`; changing it to `True` arranges the elements in descending order.

In [59]:
lst.sort(reverse=True)
lst

[42, 9, 7, 6, 3, 2, 1]
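
`sort( )` modifies the list in place. When the original order should be kept, the built-in `sorted( )` function returns a new sorted list instead. A minimal sketch, with variable names chosen for this illustration:

```python
values = [9, 6, 3, 7, 2, 1, 42]

ascending = sorted(values)                 # new list, values is unchanged
descending = sorted(values, reverse=True)  # same keyword as list.sort()
print(values, ascending, descending)
```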

Similarly, for lists containing string elements, `sort( )` sorts the elements based on their ASCII values in ascending order, and in descending order when `reverse=True` is specified.

In [61]:
names.sort()
names

['Air', 'Earth', 'Fire', 'Water']

In [62]:
names.sort(reverse=True)
names

['Water', 'Fire', 'Earth', 'Air']

To sort based on length, `key=len` should be specified as shown.

In [63]:
names.sort(key=len)
names

['Air', 'Fire', 'Water', 'Earth']

In [64]:
names.sort(key=len,reverse=True)
names

['Water', 'Earth', 'Fire', 'Air']

### Copying a list

To copy a list, it's not sufficient to simply assign a new name to it. Consider the following:

In [66]:
lista = [2,1,4,3]

In [67]:
listb = lista
listb

[2, 1, 4, 3]

Here, we have declared a list, `lista = [2,1,4,3]`. This list is assigned to `listb`. While we might expect `lista` and `listb` to be different lists, we can see that changes to `lista` are reflected in `listb`:

In [68]:
lista.pop()
lista.append(9)
lista

[2, 1, 4, 9]

In [70]:
listb

[2, 1, 4, 9]

`listb` has also changed though no operation has been performed on it. This is because `lista` and `listb` refer to the same object. So how do we fix this?

One way to copy a list is to construct a new list around it:

In [73]:
lista = [2,1,4,3]
listb = list(lista)

lista.sort()

lista, listb

([1, 2, 3, 4], [2, 1, 4, 3])

Another approach is to use slicing. Since slicing creates copies, a full slice of a list will effectively create a copy of it:

In [74]:
listb = lista[:]
lista, listb

([1, 2, 3, 4], [1, 2, 3, 4])

These lists are different objects but have identical contents. We can see this by modifying one of them:

In [75]:
lista.pop()
lista, listb

([1, 2, 3], [1, 2, 3, 4])
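
Both `list( )` and a full slice make a shallow copy: the outer list is new, but any nested lists inside it are still shared. The sketch below illustrates this and uses `copy.deepcopy` from the standard library as one way to copy the inner lists as well; the variable names are chosen for this illustration.

```python
import copy

nested = [[1, 2], [3, 4]]

shallow = list(nested)          # new outer list, same inner lists
deep = copy.deepcopy(nested)    # inner lists are copied as well

nested[0].append(99)            # mutate an inner list in place
print(nested, shallow, deep)
```

The change shows up in `shallow` because its first element is the same inner list object, while `deep` is unaffected.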

## Tuples

Tuples are similar to lists, but the one big difference is that the elements of a list can be changed while the elements of a tuple cannot. We can create tuple literals similarly to how we create lists, but using `()` instead:

In [76]:
xyz = (3, 8, 5)
xyz

(3, 8, 5)

We can see that this is a tuple with the `type()` function:

In [78]:
type(xyz)

tuple

To define an empty tuple, a variable is assigned to parentheses `( )` or to `tuple( )`.

In [79]:
tup = ()
tup2 = tuple()

One thing to keep in mind is that if you want to declare a tuple with a single element, you need a trailing comma:

In [81]:
(27,)

(27,)
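
The reason for the trailing comma is that parentheses alone only group an expression. A minimal sketch, with variable names chosen for this illustration:

```python
not_a_tuple = (27)   # parentheses only group the expression: this is an int
one_tuple = (27,)    # the trailing comma makes it a tuple
also_tuple = 27,     # the comma, not the parentheses, creates the tuple
print(type(not_a_tuple), type(one_tuple), type(also_tuple))
```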

You can multiply a tuple by an integer to repeat it that many times:

In [83]:
3 * ('a', 'b', 'c')

('a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c')

Values can be assigned while declaring a tuple. `tuple( )` takes a list as input and converts it into a tuple, or it takes a string and converts it into a tuple of its characters.

In [84]:
tup3 = tuple([1,2,3])
tup3

(1, 2, 3)

In [85]:
tup4 = tuple('Hello')
tup4

('H', 'e', 'l', 'l', 'o')

Tuples follow the same indexing and slicing as lists.

In [86]:
tup3[1]

2

In [87]:
tup5 = tup4[:3]
tup5

('H', 'e', 'l')
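
Indexing and slicing can read from a tuple, but assigning to an index is not allowed. The following sketch (the names `coords` and `new_coords` are assumptions for this example) shows the `TypeError` that results and one way to build a modified tuple instead:

```python
coords = (1, 2, 3)

try:
    coords[0] = 10       # tuples do not support item assignment
    assignment_failed = False
except TypeError:
    assignment_failed = True

# To "change" a tuple, build a new one instead.
new_coords = (10,) + coords[1:]
print(assignment_failed, coords, new_coords)
```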

### Tuple unpacking

You can assign the elements of a tuple to variables by separating the variable names with commas:

In [89]:
a,b,c = ('alpha','beta','gamma')
"The elements are {}, {}, and {}".format(a, b, c)

'The elements are alpha, beta, and gamma'
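
Two common uses of unpacking are sketched below, with names chosen only for this illustration: swapping two variables without a temporary, and collecting the remaining elements with a starred name.

```python
# Swap two variables without a temporary.
a, b = 'alpha', 'beta'
a, b = b, a

# A starred name collects the remaining elements into a list.
first, *rest = (1, 2, 3, 4)
print(a, b, first, rest)
```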

### Built In Tuple functions

`count()` counts the number of times the specified element occurs in the tuple.

In [93]:
d = tuple('alphabetical')
d.count('a')

3

`index()` returns the index of the specified element. If the element occurs more than once, the index of its first occurrence is returned.

In [94]:
d.index('a')

0

## Sets

Sets are unordered collections of unique, immutable objects. They are also used to perform some standard set operations.

Sets are declared as `set()`, which initializes an empty set. `set([sequence])` can also be executed to declare a set with elements.

In [95]:
set1 = set()
type(set1)

set

In [96]:
set0 = set([1,2,2,3,3,4])
set0

{1, 2, 3, 4}

The elements 2 and 3, which are repeated twice, are seen only once. Thus in a set each element is distinct.
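
Sets can also be written as literals with curly braces. One caveat, shown in the sketch below, is that empty braces create a dictionary rather than a set, so `set()` is needed for an empty set. The variable names are chosen for this illustration.

```python
literal = {1, 2, 2, 3}        # set literal; duplicates collapse
empty = {}                    # this is an empty dict, not a set
unique_letters = set('hello') # any iterable works, including a string

# sorted() is used only to print the letters in a predictable order.
print(literal, type(empty), sorted(unique_letters))
```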

### Built-in Functions

In [97]:
set1 = set([1,2,3])

In [98]:
set2 = set([2,3,4,5])

`union( )` function returns a set which contains all the elements of both the sets without repetition.

In [99]:
set1.union(set2)

{1, 2, 3, 4, 5}

`add( )` adds a particular element to the set. Note that sets are unordered, so the newly added element has no index and may be displayed anywhere, not necessarily at the end.

In [100]:
set1.add(0)
set1

{0, 1, 2, 3}

`intersection( )` function outputs a set which contains all the elements that are in both sets.

In [101]:
set1.intersection(set2)

{2, 3}

`difference( )` function outputs a set which contains elements that are in set1 and not in set2.

In [102]:
set1.difference(set2)

{0, 1}

`symmetric_difference( )` function outputs a set which contains the elements that are in exactly one of the two sets.

In [103]:
set2.symmetric_difference(set1)

{0, 1, 4, 5}
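
Each of these four methods also has an operator form when both operands are sets. The sketch below redefines `set1` and `set2` with the values they hold at this point so that it runs on its own:

```python
set1 = {0, 1, 2, 3}
set2 = {2, 3, 4, 5}

print(set1 | set2)   # union
print(set1 & set2)   # intersection
print(set1 - set2)   # difference
print(set1 ^ set2)   # symmetric difference
```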

`issubset( )`, `isdisjoint( )` and `issuperset( )` are used to check whether one set is a subset of, disjoint from, or a superset of another set, respectively.

In [104]:
set1.issubset(set2)

False

In [105]:
set2.isdisjoint(set1)

False

In [106]:
set2.issuperset(set1)

False
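
All three checks above return `False` for `set1` and `set2`. For contrast, here is a minimal sketch with hypothetical sets (`small`, `big` and `other` are names chosen for this example) where each check is true:

```python
small = {2, 3}
big = {1, 2, 3, 4}
other = {7, 8}

is_subset = small.issubset(big)       # every element of small is in big
is_superset = big.issuperset(small)   # the same relation seen from big
is_disjoint = small.isdisjoint(other) # the two sets share no elements
print(is_subset, is_superset, is_disjoint)
```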

`pop( )` is used to remove an arbitrary element from the set and return it.

In [107]:
set1.pop()
set1

{1, 2, 3}

`remove( )` function deletes the specified element from the set.

In [108]:
set1.remove(2)
set1

{1, 3}
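
`remove( )` raises `KeyError` if the element is not in the set. When that case should be ignored silently, the related method `discard( )` can be used instead. A minimal sketch, with the set `s` chosen for this illustration:

```python
s = {1, 2, 3}

s.discard(99)        # 99 is absent: nothing happens
try:
    s.remove(99)     # 99 is absent: remove() raises KeyError
    remove_raised = False
except KeyError:
    remove_raised = True

print(s, remove_raised)
```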

`clear( )` is used to clear all the elements and make that set an empty set.

In [109]:
set1.clear()
set1

set()