### Data Structures

Data Structures, also called collections, are objects that can store and process data.

The following data structures are explored below:
   - Arrays
   - Stacks
   - Queues
   - Linked Lists
   - Sets (hash sets)
   - Dictionaries (hash maps)
   - Tables

**Arrays**

An array is a collection of items stored at contiguous memory locations.

In Python, arrays are natively implemented through lists, which can hold heterogeneous data types. Arrays as homogeneous containers can be implemented with the NumPy library.
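As a rough illustration of the heterogeneous/homogeneous distinction, the sketch below uses the standard library's `array` module as a stand-in for a homogeneous container. This is a substitution made only to keep the example dependency-free; it is not NumPy, and the variable names are hypothetical.

```python
from array import array

# A list accepts mixed types.
mixed = [1, "two", 3.0]

# array("i", ...) holds only signed integers (type code "i").
ints = array("i", [1, 2, 3])
ints.append(4)

try:
    ints.append("five")      # wrong type for an integer array
    rejected = False
except TypeError:
    rejected = True

print(mixed)
print(ints.tolist(), rejected)
```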

In [1]:
# list even numbers till 20

l = [i for i in range(20) if i%2==0]
l

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

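Lists are mutable. When a list is passed to a function, the function receives a reference to the same object: in-place changes are visible to the caller, while reassigning the parameter name only rebinds the local variable.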

In [2]:
# In-place mutation inside a function is visible to the caller; rebinding the parameter is not

lx = ['a', 1, True]

# case 1: item assignment mutates the caller's list

def update(l):
    l[0] = 'b'

update(lx)

print('Case 1:',lx)

# case 2: rebinding the local name leaves the caller's list unchanged

lx = ['a', 1, True]

def update2(l):
    l = ['c', 'd', 'e']

update2(lx)

print('Case 2:', lx)


Case 1: ['b', 1, True]
Case 2: ['a', 1, True]


In [32]:
# list methods

l = []

# append(x)
l.append(0)  # adds item to the end of a list
print(l)

# extend(itr)
l.extend((2,3)) # appends iterable to end of list
print(l)

# insert(i,x)
l.insert(1, 4) # insert x at given position i
print(l)

# remove(x)
l.remove(2) # removes first occurrence of x; raises ValueError if x is absent
print(l)

# pop([i])
i = l.pop(1)  # removes the element at index i and returns it; with no argument, removes and returns the last element
print(i, l)

# index(x[,start[,end]])
i = l.index(3)  # returns first index of list where value is x. optionally start and end can be specified
print(i)

# reverse()
l.reverse()  # reverses elements of list in place
print(l)


[0]
[0, 2, 3]
[0, 4, 2, 3]
[0, 4, 3]
4 [0, 3]
1
[3, 0]


In [36]:
# del statement

l = [1,2,3,4,5]
del l[1:]
l

[1]

In [60]:
# Write a function countInRange that accepts a vector of integers along with a min and max integer as parameters, 
# and returns the number of elements within that range inclusive

def countInRange(l, lower, upper):
    l = [i for i in l if i>=lower and i<=upper]
    return len(l)

v = [28, 1, 17, 4, 41, 9, 59, 8, 31, 30, 25]

countInRange(v, 10, 30)

4

In [59]:
# Write a function removeAll that accepts a vector of strings along with an element value string as parameters,
# and modifies the vector to remove all occurrences of that string

def removeAll(l, s):
    for i in range(len(l)-1, -1, -1):              # traverse from the end so removals do not shift the indices still to be visited
        if l[i] == s:
            l.pop(i)
    return l

v = ['a', 'b', 'c', 'b', 'b', 'a', 'b']

removeAll(v, 'b')

['a', 'c', 'a']

Lists may be used as stacks (Last In First Out) with append and pop methods.

Lists may be used as queues (First In First Out) with insert at index 0 and pop. However, lists are not efficient for queue operations because inserting at the front shifts every other element; the deque object from the collections module is recommended instead.
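The two usages described above can be sketched as follows. The variable names are hypothetical; the queue enqueues on the left and dequeues on the right, which is only one of two symmetric conventions.

```python
from collections import deque

# Stack (LIFO) on a plain list: push and pop both work at the end.
stack = []
stack.append(1)
stack.append(2)
stack.append(3)
top = stack.pop()            # last item pushed comes out first

# Queue (FIFO) on a deque: enqueue on the left, dequeue on the right.
queue = deque()
queue.appendleft('a')
queue.appendleft('b')
queue.appendleft('c')
first = queue.pop()          # first item enqueued comes out first

print(top, stack)
print(first, queue)
```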

Tuple is another sequence data structure.

Tuples are immutable and do not support item assignment.

In [3]:
t = (5,6,7)
t

(5, 6, 7)
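A short sketch of what immutability means in practice: item assignment raises a `TypeError`, so a "changed" tuple has to be built as a new object. The names `error` and `t2` are illustrative only.

```python
t = (5, 6, 7)

try:
    t[0] = 1                 # item assignment is not supported on tuples
    error = None
except TypeError as exc:
    error = str(exc)

print(error)

# To "change" a tuple, build a new one from pieces of the old one.
t2 = (1,) + t[1:]
print(t, t2)
```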

In [10]:
# Grid

# 2 dimensional arrays are referred to as grids

grid = [[1,2,3],[4,5,6],[7,8,9]]   # construct
print(grid)

# Access elements

grid[0][1]   # first row second column

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]


2

In [9]:
print(len(grid))  # number of rows

print(len(grid[0]))  # number of columns 

3
3


**Linked List**

A linear collection of data elements built out of many nodes, each of which stores one value and a link to the next node.

The benefit over a regular array is that elements can be inserted and removed without shifting all the other elements. Unlike an array, the memory used to store a linked list does not have to be contiguous.

However, we can't access items by position in constant time (O(1)) as we could with an array. Looking up an item in a linked list has linear time complexity (O(n)).




In [3]:
# python implementation

# class Node

class Node:
    def __init__(self, val):
        self.val = val
        self.next = None
        
    def get_data(self):
        return self.val
    
    def set_data(self, val):
        self.val = val
        
    def get_next(self):
        return self.next
    
    def set_next(self, next):
        self.next = next

In [4]:
j = Node(5)

In [5]:
j.set_data(6)

In [6]:
j.get_data()

6

In [7]:
j.set_next(7)  # only demonstrates the setter; in a real list, next would be another Node

In [8]:
j.get_next()

7

In [64]:
# class LinkedList

class LinkedList:
    
    def __init__(self):
        self.head = None
        self.count = 0
        
    def insertHead(self, val):
        new_node = Node(val)
        new_node.set_next(self.head)
        self.head = new_node
        self.count += 1
    
    def find(self, val):
        item = self.head
        while item != None:
            if item.get_data() == val:
                return item
            item = item.get_next()
        return None
            
    def remove(self, val):
        curr = self.head          # head is an attribute, not a method
        prev = None
        while curr is not None:
            if curr.get_data() == val:
                if prev is None:
                    self.head = curr.get_next()
                else:
                    prev.set_next(curr.get_next())
                self.count -= 1
            else:
                prev = curr       # advance prev only when curr stays in the list
            curr = curr.get_next()
        
    def get_count(self):
        return self.count
    
    def is_empty(self):
        return self.head == None
                
        
# this can be extended to a doubly linked list by also linking each node to its previous node


In [65]:
l = LinkedList()

In [66]:
l.insertHead(7)
l.insertHead(6)
l.insertHead(5)

In [67]:
l.get_count()

3

In [68]:
l.find(7)

<__main__.Node at 0x2110220eca0>

In [69]:
l.is_empty()

False

In [70]:
i = l.head
while i != None:
    print(i.get_data())
    i = i.get_next()

5
6
7
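The doubly linked variant mentioned in the comment above might look like the following minimal sketch. The class and method names (`DNode`, `DoublyLinkedList`, `remove_node`, `to_list`) are hypothetical and not part of the notebook's own implementation; the point is that each node also stores a `prev` link, so a known node can be unlinked without walking the list.

```python
class DNode:
    """Node with links in both directions (hypothetical name)."""
    def __init__(self, val):
        self.val = val
        self.prev = None
        self.next = None


class DoublyLinkedList:
    def __init__(self):
        self.head = None
        self.tail = None
        self.count = 0

    def insert_head(self, val):
        node = DNode(val)
        node.next = self.head
        if self.head is not None:
            self.head.prev = node
        else:
            self.tail = node       # first node is both head and tail
        self.head = node
        self.count += 1

    def remove_node(self, node):
        # Unlink a known node in O(1) by rewiring its neighbours.
        if node.prev is not None:
            node.prev.next = node.next
        else:
            self.head = node.next
        if node.next is not None:
            node.next.prev = node.prev
        else:
            self.tail = node.prev
        self.count -= 1

    def to_list(self, reverse=False):
        # Walk forward from head, or backward from tail.
        out = []
        node = self.tail if reverse else self.head
        while node is not None:
            out.append(node.val)
            node = node.prev if reverse else node.next
        return out


dl = DoublyLinkedList()
for v in (7, 6, 5):
    dl.insert_head(v)
forward = dl.to_list()
backward = dl.to_list(reverse=True)
dl.remove_node(dl.head.next)       # unlink the middle node
print(forward, backward, dl.to_list())
```

The extra `prev` pointer costs one more reference per node, which is the usual trade-off for cheap backward traversal and removal.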


**Stacks**

LIFO: last in, first out.

Stacks are often implemented with arrays.
Stack operations: push, pop, peek.

Stacks in computer science: function calls are placed onto a stack (call = push, return = pop),
and undo operations use a stack.

In [9]:
# python implementation
# A deque (short for 'double-ended queue') from the collections module can be used as a stack

from collections import deque  # time complexity to push and pop is O(1)

j = deque('abef')

j.append('i') # push

j.append('k')

print(j)

j.pop()

deque(['a', 'b', 'e', 'f', 'i', 'k'])


'k'
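The cell above shows push and pop; peek can be expressed by indexing the last element. Below is a minimal sketch that names all three operations on top of a deque. The `Stack` class is a hypothetical wrapper, not something the collections module provides.

```python
from collections import deque

class Stack:
    """Thin wrapper naming the three stack operations (hypothetical class)."""
    def __init__(self):
        self._items = deque()

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()     # raises IndexError when empty

    def peek(self):
        return self._items[-1]       # look at the top without removing it

    def __len__(self):
        return len(self._items)


s = Stack()
for ch in 'abc':
    s.push(ch)
top = s.peek()
popped = s.pop()
print(top, popped, len(s))
```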

In [22]:
# Exercise: write a checkBalance function that accepts a string of source code and checks whether the braces/brackets/parentheses are balanced.
# Return the index at which an imbalance occurs, or -1 if balanced. If an opening bracket is never closed, return the string length.

from collections import deque

def checkBalance(string):
    paren = deque()
    for i in range(len(string)):
        c = string[i]
        if c in ('[', '{', '('):
            paren.append(c)
        elif c in (']', '}', ')'):
            if len(paren) == 0:
                return i
            else:
                p = paren[-1]
                if (c == ')' and p == '(') or (c == '}' and p == '{') or (c == ']' and p == '['):
                    paren.pop()
                else:
                    return i
    if len(paren) > 0:
        return len(string)
    else:
        return -1

In [23]:
# tests

string1 = "if (a(4) > 9) { foo(a(2)); }"  # returns -1
string2 = "for (i=0;i<3(a};i++){foo{);)}" # returns 14
string3 = "while (true) foo(); }{ ()"  # returns 20
string4 = "if (x) {"  # returns 8

print(checkBalance(string1))
print(checkBalance(string2))
print(checkBalance(string3))
print(checkBalance(string4))


-1
14
20
8


**Queues**

FIFO: first in, first out.
Queues are often implemented with a linked list.

Queue operations: enqueue, dequeue, peek.

In Python, a queue can be implemented with deque from the collections module.

Queues in computer science: print jobs, network packets, etc.

In [3]:
from collections import deque

j = deque()

j.appendleft(1)    # enqueue
j.appendleft(2)
j.appendleft(3)
print(j)
j.pop()           # dequeue (FIFO): the first element added is the first removed

deque([3, 2, 1])


1

We often mix stacks and queues to achieve certain effects, e.g. reversing the order of the elements of a queue.
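Reversing a queue with a stack can be sketched as follows. The function name `reverse_queue` is hypothetical, and the sketch assumes the same orientation as the queue cell above: enqueue with `appendleft`, dequeue with `pop`.

```python
from collections import deque

def reverse_queue(q):
    """Reverse q in place using a list as a temporary stack."""
    stack = []
    while q:
        stack.append(q.pop())        # dequeue from the front (right end), push
    while stack:
        q.appendleft(stack.pop())    # pop from the stack, enqueue at the back (left end)
    return q

q = deque([3, 2, 1])                 # 1 is at the front of the queue
result = reverse_queue(q)
print(result)
```

Because the stack returns items in the opposite order to the one in which they were dequeued, the element that was at the front ends up at the back.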

In [10]:
# exercise: write a function 'stutter' where you pass in a queue and it returns a queue with 2 copies of every element

from collections import deque

def stutter(q):
    for i in range(len(q)):
        foo = q.pop()
        q.appendleft(foo)
        q.appendleft(foo)
    return q
    

In [11]:
# test
j = deque('abc')
j
stutter(j)

deque(['a', 'a', 'b', 'b', 'c', 'c'])

In [23]:
# exercise: write a function 'mirror' that accepts a queue of strings and appends the queue's contents to itself reversed

def mirror(q):
    q2 = deque()
    while len(q) > 0:
        foo = q.pop()
        q2.append(foo)
        q2.appendleft(foo)
    return q2

In [24]:
# test
j = deque('abc')
mirror(j)

deque(['a', 'b', 'c', 'c', 'b', 'a'])

**Sets**

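Sets are collections of elements with no duplicates; they are fast for searching.
In a tree-based set (e.g. C++ std::set or Java TreeSet), items are kept in sorted order
because the set is implemented using a balanced binary search tree.

Hash sets are slightly faster than tree-based sets; elements inside a hash set are unordered, and the set is implemented using a hash table. Python's built-in set is a hash set.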

Set operations: add, remove, search(contains)
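The three operations listed above map onto Python's built-in `set` roughly as shown in this small sketch. The values are arbitrary, and `sorted()` is used only to get a deterministic view, since a hash set has no defined order.

```python
s = set()
s.add(3)
s.add(1)
s.add(3)                 # duplicate, ignored
s.add(2)
size = len(s)

found = 2 in s           # search (contains), average O(1)

s.remove(2)              # raises KeyError if the element is absent
s.discard(99)            # like remove, but silent if the element is absent

print(size, found, sorted(s))
```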


**Maps**

Maps are collections of key-value pairs (k:v), where the value can be looked up very quickly for any given key,
e.g. a dictionary or a phonebook.

A tree map is implemented using a linked structure called a binary search tree; keys are stored in sorted order.

Another variant is the hash map, implemented using a special array called a hash table; lookups are very fast, and keys are not kept in sorted order. Python's dict is a hash map; since Python 3.7 it preserves insertion order.
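A brief sketch of map operations using Python's built-in `dict`. The phonebook entries are made-up sample data, and `sorted()` is shown only as a way to get the sorted-key view that a tree-based map would provide directly.

```python
phonebook = {}
phonebook['bob'] = '555-0101'
phonebook['alice'] = '555-0100'
phonebook['alice'] = '555-0199'      # assigning to an existing key overwrites its value

number = phonebook['alice']          # lookup by key
missing = phonebook.get('carol')     # get() returns None instead of raising KeyError
has_bob = 'bob' in phonebook         # membership test on keys

keys_sorted = sorted(phonebook)      # keys in sorted order, as a tree map would keep them

del phonebook['bob']                 # remove a key-value pair

print(number, missing, has_bob, keys_sorted, phonebook)
```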



