# Data Structures

**Data structures** are essential for implementing algorithms efficiently.

A data structure is a logical way to organize data and the operations performed on it.

In Python, we can use classes to implement a data structure. **Queues**, **stacks**, and **hash tables** are common data structures used extensively in algorithms for searching and sorting.

## Queue

The queue is our first encounter with what computer scientists call a data structure. A queue acts like a list but follows one main rule, called *First In/First Out* (FIFO).

This means the following rules apply:

- You can only add data to a queue one object at a time, and it joins at the rear

- If you remove an object, it has to be the one at the front of the queue (the one added earliest), i.e. `.pop(0)` on a list

In [None]:
# A queue is like a line in a store: whoever arrived first is served (dequeued) first.
# The front is the element that will be dequeued next; the rear is where new elements are enqueued.
# First arrived, first served: new items join the queue at the back (rear).

In [1]:
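class Queue:
    def __init__(self):
        self.elements = []  # index 0 is the front, the last index is the rear

    def enqueue(self, data):
        # Add an item at the rear of the queue.
        self.elements.append(data)
        return data

    def dequeue(self):
        # Remove and return the item at the front (raises IndexError if empty).
        return self.elements.pop(0)

    def rear(self):
        # Look at the most recently added item without removing it.
        return self.elements[-1]

    def front(self):
        # Look at the next item to be dequeued without removing it.
        return self.elements[0]

    def is_empty(self):
        return len(self.elements) == 0

    def info(self):
        # Return the underlying list, front first.
        return self.elements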


In [2]:
q = Queue()
print(q.is_empty())
q.enqueue(5)
q.enqueue(2)
q.enqueue(9)
print(q.info())
print(q.front())
print(q.rear())
print(q.is_empty())

True
[5, 2, 9]
5
9
False


In [3]:
q.dequeue()
print(q.info())

[2, 9]


In [6]:
# simple queue
my_queue1=[]
my_queue1.append("a")
my_queue1.append("b")
my_queue1.append("c")
my_queue1

['a', 'b', 'c']

In [7]:
# my_queue1.pop()  # Not what we want: removes the last item (stack behaviour)
my_queue1.pop(0)   # Removes the first item; index 0 must be passed explicitly

'a'

In [9]:
from collections import deque
my_queue2 = deque()  # A deque works as a queue: append() enqueues, popleft() dequeues
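
As a minimal sketch of the `deque` approach, assuming the standard-library `collections.deque`: `append` adds at the rear and `popleft` removes from the front, both in O(1), whereas `list.pop(0)` has to shift every remaining element. The names `line` and `first` are illustrative only.

```python
from collections import deque

line = deque()          # empty queue
line.append("a")        # enqueue at the rear
line.append("b")
line.append("c")

first = line.popleft()  # dequeue from the front (FIFO)
print(first)            # a
print(list(line))       # ['b', 'c']
```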

## Stack

The stack is the inverse of the queue: it follows a *Last In/First Out* (LIFO) model.

When removing from a stack, we remove the most recently added element, i.e. `.pop(-1)` (or simply `.pop()`) rather than `.pop(0)`.
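
A minimal sketch of the Last In/First Out rule, using a plain Python list as the stack. The names `plates` and `top` are illustrative only.

```python
plates = []             # empty stack
plates.append("red")    # push
plates.append("green")
plates.append("blue")

top = plates.pop()      # pop removes the most recently pushed item
print(top)              # blue
print(plates)           # ['red', 'green']
```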



In [5]:
# Stacks are useful for recursive problems
# A stack is like a pile of plates: the last item added is retrieved first
# The pop method removes the last element from the list
# Use dir(deque()) to list the methods a deque provides

from collections import deque
method2 = deque()  # A deque also works as a stack: append() pushes, pop() pops

In [4]:
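class Stack:
    def __init__(self):
        self.elements = []  # the last index is the top of the stack

    def push(self, data):
        # Add an item on top of the stack.
        self.elements.append(data)
        return data

    def pop(self):
        # Remove and return the top item (raises IndexError if empty).
        return self.elements.pop()

    def peek(self):
        # Look at the top item without removing it.
        return self.elements[-1]

    def is_empty(self):
        return len(self.elements) == 0

    def info(self):
        # Return the underlying list, bottom first.
        return self.elements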

Class exercise:

Create a stack and populate it with a few elements.

In [10]:
#Class exercise:

stack = Stack()
stack.is_empty()

True

## Hash Tables - Making search faster O(1)

The hash table data structure stores elements as key-value pairs, where:

**Key:** a unique identifier that is hashed to an integer index used to locate the value.

**Value:** the data associated with the key.

**Hashing (Hash Function)**

**Hashing** is a technique for mapping a large set of arbitrary keys to indexes in a table using a hash function. The hash function computes an index in the hash table from the key, and the value is then stored at that index. [More info here](https://www.cs.hmc.edu/~geoff/classes/hmc.cs070.200101/homework10/hashfuncs.html)

Search algorithms that use **hashing** consist of two separate parts. The first is to compute a hash function that transforms the key into a table index. Ideally, different keys would map to different indexes. This ideal is generally beyond our reach, so we have to face the possibility that two or more different keys may hash to the same index. Thus, the second part of a hashing search is a collision-resolution process that deals with this situation.
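
One common collision-resolution strategy is separate chaining, where each table slot holds a list of entries. The sketch below is only an illustration under that assumption: the class `ChainedHashTable`, its methods `put` and `get`, and the character-sum hash are hypothetical names and choices, not part of any library.

```python
class ChainedHashTable:
    """Hash table sketch that resolves collisions by chaining."""

    def __init__(self, size=10):
        self.size = size
        # One bucket (a list of [key, value] pairs) per slot.
        self.buckets = [[] for _ in range(size)]

    def _index(self, key):
        # Toy hash: sum of character codes, reduced to a slot index.
        return sum(ord(char) for char in key) % self.size

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for pair in bucket:
            if pair[0] == key:       # key already stored: update in place
                pair[1] = value
                return
        bucket.append([key, value])  # new key: add it to the chain

    def get(self, key):
        bucket = self.buckets[self._index(key)]
        for stored_key, value in bucket:
            if stored_key == key:
                return value
        raise KeyError(key)


table = ChainedHashTable()
table.put("ab", 1)
table.put("ba", 2)  # same character sum as "ab", so the same slot
print(table.get("ab"), table.get("ba"))  # 1 2
```

Both keys land in the same bucket, yet each lookup still returns its own value because the stored key is compared before the value is returned.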

On average, insertion, deletion, and search in a hash table run in *O(1)* time, which is much faster than the *O(n)* linear search needed in an unsorted array.

Python data structures such as sets and dictionaries are implemented using hash tables.

The running-time complexities of operations on Python's built-in data structures, including dictionaries, can be found [here](https://wiki.python.org/moin/TimeComplexity).

In [11]:
# A hash table stores each value at an index computed from its key
# ord() converts each character of a string key into its ASCII code (a number)
# A Python dictionary is essentially a hash table

my_dict={'Emp1':'001','Emp2':'002'}
my_dict

{'Emp1': '001', 'Emp2': '002'}

In [12]:
type(my_dict)

dict

In [21]:
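# One simple way to compute an index; many other hash functions exist

def get_hash(key):
    a = 0
    for char in key:
        a = a + ord(char)  # add up the character codes of the key
    return a % 10          # reduce the sum to an index between 0 and 9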

In [22]:
# ord() gives the ASCII code of a single character
ord('p')

112

In [23]:
get_hash('Emp1')

9

In [24]:
get_hash('Emp2')

0
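
Because `get_hash` only adds up character codes, the order of the characters does not matter, so any two anagrams hash to the same index. The short check below restates the function so that it runs on its own; the key `'1pmE'` is a made-up example.

```python
def get_hash(key):
    total = 0
    for char in key:
        total += ord(char)  # order-independent sum of character codes
    return total % 10

print(get_hash('Emp1'))  # 9
print(get_hash('1pmE'))  # 9, same characters in a different order
```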

In [53]:
class Hashtable:
    def __init__(self):
        self.Max = 50
        self.arr = [None for i in range(self.Max)]  # fixed-size array used as storage

    def get_hash(self, key):
        # Sum the character codes of the key and reduce the sum to an index.
        # Note: the modulus is 10, so only slots 0-9 of the 50 are ever used.
        a = 0
        for char in key:
            a = a + ord(char)
        return a % 10

    def add_ele(self, key, value):
        # Store value at the hashed index; a colliding key overwrites the old value.
        a = self.get_hash(key)
        self.arr[a] = value

    def get_ele(self, key):
        # Like __getitem__: return the value stored at the key's hashed index.
        a = self.get_hash(key)
        return self.arr[a]

In [48]:
my_dict=Hashtable()

In [49]:
my_dict.get_hash('Emp1')

9

In [50]:
my_dict.add_ele('Emp1', 100)
my_dict.add_ele('Emp2', 200)
my_dict.add_ele('Emp3', 300)
my_dict.add_ele('Emp4', 400)
my_dict.add_ele('Emp5', 500)

In [52]:
my_dict.get_ele('Emp1')

100

In [51]:
my_dict.arr

[200,
 300,
 400,
 500,
 None,
 None,
 None,
 None,
 None,
 100,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None,
 None]

In [None]:
# Linked list
# A linear data structure whose items are not stored in contiguous memory
# Each item is linked to the next one
# Each node contains a value and a next reference
# next is the reference (pointer) to the next node
# The next reference of the last element (the tail) is null (None in Python)
# A doubly linked list also links each node to the previous node
# A circular list is one where the last node links back to the first element (a loop)
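
The notes above can be sketched as a minimal singly linked list. This is an illustration only: the class names `Node` and `LinkedList` and the methods `append` and `to_list` are assumptions chosen for the example.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None  # reference to the next node; None marks the tail


class LinkedList:
    def __init__(self):
        self.head = None  # an empty list has no head node

    def append(self, value):
        new_node = Node(value)
        if self.head is None:            # empty list: new node becomes the head
            self.head = new_node
            return
        current = self.head
        while current.next is not None:  # walk to the tail
            current = current.next
        current.next = new_node          # link the old tail to the new node

    def to_list(self):
        # Follow the next references from head to tail, collecting values.
        values = []
        current = self.head
        while current is not None:
            values.append(current.value)
            current = current.next
        return values


numbers = LinkedList()
numbers.append(5)
numbers.append(2)
numbers.append(9)
print(numbers.to_list())  # [5, 2, 9]
```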