## Why use data structures

- A well-chosen structure makes a particular computation more efficient.

## What are data structures

According to the way they are accessed/indexed:
#### Linear Data Structures
- data elements are accessed in sequential order (not necessarily stored in linear sequence)
    - list
    - linked list
    - queue
    - stack
    
#### Non-linear Data Structures
- data elements are accessed in non-sequential order
    - tree
    - heap
    - graph


According to the way they are defined:
#### Primitive Data Type
- fixed implementation
    - int, str, char, float
    - array, list

#### Abstract Data Type
- defined in terms of the operations on it, implementation may vary
    - Linked Lists 
    - Stacks 
    - Queues 
    - Priority Queues
    - Binary Trees
    - Dictionaries
    - Disjoint Sets (Union and Find)
    - Hash Tables
    - Graphs
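
To illustrate "defined in terms of the operations on it, implementation may vary", here is a minimal sketch of one abstract data type (a stack) with two interchangeable implementations. The class names `ArrayStack` and `LinkedStack` and the helper `drain` are hypothetical and are not part of the modules imported later in this notebook.

```python
class ArrayStack:
    """Stack backed by a Python list (the top is the end of the list)."""

    def __init__(self):
        self._items = []

    def push(self, value):
        self._items.append(value)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def is_empty(self):
        return not self._items


class LinkedStack:
    """Stack backed by linked nodes (the top is the head node)."""

    def __init__(self):
        self._head = None  # each node is a (value, next_node) tuple

    def push(self, value):
        self._head = (value, self._head)

    def pop(self):
        if self._head is None:
            raise IndexError("pop from empty stack")
        value, self._head = self._head
        return value

    def is_empty(self):
        return self._head is None


def drain(stack, values):
    """Push all values, then pop until empty; works with either class."""
    for v in values:
        stack.push(v)
    popped = []
    while not stack.is_empty():
        popped.append(stack.pop())
    return popped


print(drain(ArrayStack(), [1, 2, 3]))   # [3, 2, 1]
print(drain(LinkedStack(), [1, 2, 3]))  # [3, 2, 1]
```

The caller only relies on `push`, `pop` and `is_empty`, so either implementation can be swapped in without changing `drain`.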

## Array / List

- data structure used to store homogeneous elements at contiguous memory locations.


Pros:
- Fast Random Access O(1)

Cons:
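- Fixed size, set when the array is created
- Requires one contiguous block of memory
- Expensive to edit: insert and delete shift the following elements, and appending to a full array means reallocating and copying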

Time Complexity:
- index  O(1)
- search O(n)
- insert O(n)
- delete O(n)
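
The complexities above can be seen in a small sketch of a fixed-capacity array. This is an illustration only: `FixedArray` and its method names are hypothetical, the argument order is not meant to match the `Array` module used in the next cell, and a Python list stands in for the contiguous memory block.

```python
class FixedArray:
    """Fixed-capacity array; insert and delete shift elements, so they are O(n)."""

    def __init__(self, capacity):
        self._slots = [None] * capacity  # stands in for a contiguous block
        self._size = 0

    def get(self, index):
        if not 0 <= index < self._size:
            raise IndexError("index out of range")
        return self._slots[index]  # O(1) random access

    def insert(self, index, value):
        if self._size == len(self._slots):
            raise OverflowError("array is full")
        if not 0 <= index <= self._size:
            raise IndexError("index out of range")
        # shift everything at or after index one slot to the right: O(n)
        for i in range(self._size, index, -1):
            self._slots[i] = self._slots[i - 1]
        self._slots[index] = value
        self._size += 1

    def delete(self, value):
        """Linear search for value, then shift left to close the gap: O(n)."""
        for i in range(self._size):
            if self._slots[i] == value:
                for j in range(i, self._size - 1):
                    self._slots[j] = self._slots[j + 1]
                self._size -= 1
                self._slots[self._size] = None
                return i  # position where the value was found
        return -1  # value not present

    def to_list(self):
        return self._slots[:self._size]


a = FixedArray(4)
for v in (1, 2, 3):
    a.insert(len(a.to_list()), v)  # append at the end
a.insert(0, 0)                     # shifts 1, 2, 3 to the right
print(a.to_list())                 # [0, 1, 2, 3]
print(a.delete(2))                 # 2 (found at index 2)
print(a.to_list())                 # [0, 1, 3]
```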

In [1]:
from Array import *
A = array([1,2,3])
A.insert(4,3)
A.printArray()
A.delete(2)
A.printArray()

@insert 3 to position 4
[1, 2, 3, 4]
@find and delete 2 at position 1
[1, 3, 4]


### Singly Linked List
- sequential data arrangement that uses pointers to indicate the order; nodes are not bound to contiguous physical memory locations.

Pros:
- Dynamic size
- Insert/delete without shifting or reallocating other elements
- Can use fragmented (non-contiguous) memory

Cons:
- no random access, only sequential search possible
- spend extra memory on pointers

Complexity:
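* Insertion: O(1) once the position is known
    * Insertion at beginning (or front): O(1)
    * Insertion at end: O(n) when only a head pointer is kept
* Deletion: O(1) once the preceding node is known, O(n) including the search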
* Indexing: O(n)
* Searching: O(n)
* Reverse : O(n)
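
A minimal sketch of a singly linked list that shows where these costs come from. The names `Node`, `SinglyLinkedList` and `to_list` are assumptions for illustration and are not the `singlyLinkedList` module used in the next cell.

```python
class Node:
    def __init__(self, data, nxt=None):
        self.data = data
        self.nxt = nxt  # pointer to the next node, or None at the tail


class SinglyLinkedList:
    def __init__(self):
        self.head = None

    def insert_at_head(self, data):
        # O(1): only the head pointer changes
        self.head = Node(data, self.head)

    def insert_at_tail(self, data):
        node = Node(data)
        if self.head is None:
            self.head = node
            return
        cur = self.head
        while cur.nxt is not None:  # O(n): walk to the last node
            cur = cur.nxt
        cur.nxt = node

    def reverse(self):
        # O(n): flip each pointer while walking the list once
        prev, cur = None, self.head
        while cur is not None:
            following = cur.nxt
            cur.nxt = prev
            prev = cur
            cur = following
        self.head = prev

    def to_list(self):
        out, cur = [], self.head
        while cur is not None:
            out.append(cur.data)
            cur = cur.nxt
        return out


sll = SinglyLinkedList()
sll.insert_at_tail(20)
sll.insert_at_head(10)
sll.insert_at_tail(30)
sll.insert_at_head(0)
print(sll.to_list())  # [0, 10, 20, 30]
sll.reverse()
print(sll.to_list())  # [30, 20, 10, 0]
```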

In [2]:
from singlyLinkedList import *
lls = singlyLinkedList()
lls.insert_at_tail(20)
lls.insert_at_head(10)
lls.insert_at_tail(30)
lls.insert_at_head(0)
lls.printSLL()
lls.reverse()
lls.printSLL()
print()

@insert at tail to empty sll!
@insert at head of sll!
@insert at tail of sll!
@insert at head of sll!
(head)0 -->10 -->20 -->30 -->
@finished printing sll of 4 nodes
@finished reversing sll!
(head)30 -->20 -->10 -->0 -->
@finished printing sll of 4 nodes



### Doubly Linked List

Each node contains data, one pointer to the previous node (pre), and one pointer to the next node (nxt).

class methods:
- print
- length
- insert
- delete
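
As a rough sketch of how the two pointers are maintained, the following code keeps `pre` and `nxt` consistent on insert and delete. `DNode`, `DoublyLinkedList` and `to_list` are hypothetical names and this is not the `doublyLinkedList` module imported below; only the field names `pre` and `nxt` are taken from the description above.

```python
class DNode:
    def __init__(self, data):
        self.data = data
        self.pre = None  # previous node
        self.nxt = None  # next node


class DoublyLinkedList:
    def __init__(self):
        self.head = None
        self.tail = None
        self.length = 0

    def insert_at_head(self, data):
        node = DNode(data)
        if self.head is None:
            self.head = self.tail = node
        else:
            node.nxt = self.head
            self.head.pre = node
            self.head = node
        self.length += 1

    def search(self, data):
        """Return the first node holding data, or None."""
        cur = self.head
        while cur is not None and cur.data != data:
            cur = cur.nxt
        return cur

    def delete(self, data):
        """Unlink the first node holding data; return True if one was removed."""
        node = self.search(data)
        if node is None:
            return False
        if node.pre is None:      # node is the head
            self.head = node.nxt
        else:
            node.pre.nxt = node.nxt
        if node.nxt is None:      # node is the tail
            self.tail = node.pre
        else:
            node.nxt.pre = node.pre
        self.length -= 1
        return True

    def to_list(self):
        out, cur = [], self.head
        while cur is not None:
            out.append(cur.data)
            cur = cur.nxt
        return out


dl = DoublyLinkedList()
for v in (10, 20, 30, 40, 50):
    dl.insert_at_head(v)
print(dl.to_list())   # [50, 40, 30, 20, 10]
for v in (10, 50, 30):
    dl.delete(v)
print(dl.to_list())   # [40, 20]
print(dl.delete(30))  # False, 30 was already removed
```

Because each node knows its predecessor, unlinking a node needs no second traversal once the node has been found.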

In [3]:
from doublyLinkedList import *
dll = doublyLinkedList()

## insert
dll.insert_at_head(10)
dll.insert_at_head(20)
dll.insert_at_head(30)
dll.insert_at_head(40)
dll.insert_at_head(50)

## search
dll.search(10)
dll.search(100)
dll.printDLL()

## delete
dll.delete(10)
dll.delete(50)
dll.delete(30)

dll.printDLL()
dll.delete(30)
dll.delete(20)
dll.delete(40)

dll.printDLL()
dll.delete(30)

print()



@insert node to empty dll!
@insert node at head of dll!
@insert node at head of dll!
@insert node at head of dll!
@insert node at head of dll!
@found node with value 10 in dll!
@no node found with value 100!
@print dll from head!
(root)50 <==>40 <==>30 <==>20 <==>10
@finished printing 5 nodes of dll!
@delete 1 node with value 10 at tail of dll!
@delete 1 node with value 50 at head in dll!
@delete 1 node with value 30 inside of dll!
@print dll from head!
(root)40 <==>20
@finished printing 2 nodes of dll!
@no matched node with value 30 in dll to delete!
@delete 1 node with value 20 at tail of dll!
@delete 1 node with value 40, now dll is empty!
@print dll from head!
@print empty dll!
@finished printing 0 nodes of dll!
@not able to delete node in empty dll!



### Trees
* data organized in **hierarchical** order: one node connects to multiple nodes, and the two ends of each connection are called parent and child.
* **root** node has no parent (None), only one root exists in a tree
* **leaf** nodes have no children
* nodes are linked by **edge**

## Binary Tree
* each node has at most **2** children (an empty tree also counts as a binary tree)
    * **Full binary tree**: every node has either zero or two children
    * **Complete binary tree**: all layers are completely filled except possibly the last, which is filled from the left without gaps
    * **Perfect binary tree**: every internal node has two children and all leaf nodes are in the same layer
    * **Balanced binary tree**: the height of the tree is O(log n), where n is the number of nodes
        * **AVL trees** maintain O(log n) height by making sure that the heights of the left and right subtrees of every node differ by at most 1
        * **Red-Black trees** maintain O(log n) height by making sure that every root-to-leaf path contains the same number of black nodes and that no red node has a red child
    * **Degenerate (or pathological) tree**: every internal node has exactly one child, so the tree behaves like a linked list
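
Some of these shape properties can be checked with short recursive functions. The sketch below is illustrative only; `TreeNode` and the function names are assumptions, and height is counted in nodes (an empty tree has height 0).

```python
class TreeNode:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def height(node):
    """Number of nodes on the longest root-to-leaf path (empty tree = 0)."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


def count(node):
    """Total number of nodes in the tree."""
    if node is None:
        return 0
    return 1 + count(node.left) + count(node.right)


def is_full(node):
    """Full: every node has either zero or two children."""
    if node is None:
        return True
    if (node.left is None) != (node.right is None):
        return False  # exactly one child
    return is_full(node.left) and is_full(node.right)


def is_perfect(node):
    """Perfect: a tree of height h holds exactly 2**h - 1 nodes."""
    return count(node) == 2 ** height(node) - 1


def is_height_balanced(node):
    """AVL-style check: subtree heights differ by at most 1 at every node."""
    def check(n):
        # returns the height of n, or -1 as soon as an imbalance is found
        if n is None:
            return 0
        lh = check(n.left)
        if lh == -1:
            return -1
        rh = check(n.right)
        if rh == -1 or abs(lh - rh) > 1:
            return -1
        return 1 + max(lh, rh)
    return check(node) != -1


perfect = TreeNode(1, TreeNode(2), TreeNode(3))
chain = TreeNode(1, TreeNode(2, TreeNode(3)))  # degenerate: one child per node
print(is_full(perfect), is_perfect(perfect), is_height_balanced(perfect))  # True True True
print(is_full(chain), is_perfect(chain), is_height_balanced(chain))        # False False False
```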
    
For visualizations:
http://www.geeksforgeeks.org/binary-tree-set-3-types-of-binary-tree/

For properties:
http://www.geeksforgeeks.org/binary-tree-set-2-properties/

### Tree traversal

Due to non-linear arrangement, trees can be traversed in different ways:
* **DFS Traversal**:
    * Inorder (left, data, right)
    * Preorder (data, left, right)
    * Postorder (left, right, data)
* **BFS Traversal**
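
A minimal sketch of the four traversal orders, returning the visited keys as lists. `TreeNode` and the function names are hypothetical and independent of the `naiveBinaryTree` module used later.

```python
from collections import deque


class TreeNode:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def inorder(node):
    """left, data, right"""
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)


def preorder(node):
    """data, left, right"""
    if node is None:
        return []
    return [node.key] + preorder(node.left) + preorder(node.right)


def postorder(node):
    """left, right, data"""
    if node is None:
        return []
    return postorder(node.left) + postorder(node.right) + [node.key]


def bfs(root):
    """Visit layer by layer, left to right, using a FIFO queue."""
    order = []
    queue = deque([root] if root is not None else [])
    while queue:
        node = queue.popleft()
        order.append(node.key)
        if node.left is not None:
            queue.append(node.left)
        if node.right is not None:
            queue.append(node.right)
    return order


#        4
#      /   \
#     2     6
#    / \   / \
#   1   3 5   7
root = TreeNode(4,
                TreeNode(2, TreeNode(1), TreeNode(3)),
                TreeNode(6, TreeNode(5), TreeNode(7)))
print(inorder(root))    # [1, 2, 3, 4, 5, 6, 7]
print(preorder(root))   # [4, 2, 1, 3, 6, 5, 7]
print(postorder(root))  # [1, 3, 2, 5, 7, 6, 4]
print(bfs(root))        # [4, 2, 6, 1, 3, 5, 7]
```

The DFS variants differ only in where the node's own key is placed relative to the two recursive calls; BFS replaces the call stack with an explicit queue.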

### Naive Tree
* no ordering constraint between nodes, unlike e.g. a binary search tree
* construct manually with setters
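
Since there is no ordering constraint, one simple way to place new keys is to fill the tree layer by layer. The sketch below assumes that strategy; whether the `insert` method of `naiveBinaryTree` works this way internally is an assumption, and `TreeNode`, `insert_level_order` and `levels` are hypothetical names.

```python
from collections import deque


class TreeNode:
    def __init__(self, key, parent=None):
        self.key = key
        self.parent = parent  # None for the root
        self.left = None
        self.right = None


def insert_level_order(root, key):
    """Attach key at the first free child slot found by a BFS; return the root."""
    if root is None:
        return TreeNode(key)
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if node.left is None:
            node.left = TreeNode(key, parent=node)
            return root
        if node.right is None:
            node.right = TreeNode(key, parent=node)
            return root
        # both slots taken: keep searching one layer down
        queue.append(node.left)
        queue.append(node.right)


def levels(root):
    """Keys grouped by layer, top to bottom."""
    result = []
    current = [root] if root is not None else []
    while current:
        result.append([n.key for n in current])
        current = [c for n in current for c in (n.left, n.right) if c is not None]
    return result


root = None
for key in range(9):
    root = insert_level_order(root, key)
print(levels(root))  # [[0], [1, 2], [3, 4, 5, 6], [7, 8]]
```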

In [4]:
from naiveBinaryTree import *
nbt = naiveBinaryTree()
for ii in range(9):
    nbt.insert(ii)
nbt.traverse_bfs()
nbt.traverse_dfs_preorder(nbt.root)
print()
nbt.traverse_dfs_inorder(nbt.root)
print()
nbt.traverse_dfs_postorder(nbt.root)
print()

@add as root node of the tree!
@add as lf-child of a node!
@add as rt-child of a node
@add as lf-child of a node!
@add as rt-child of a node
@add as lf-child of a node!
@add as rt-child of a node
@add as lf-child of a node!
@add as rt-child of a node
0-
1-2-
3-4-5-6-
7-8-
root node, layer 0 with key 0
left node, layer 1 with key 1
left node, layer 2 with key 3
left node, layer 3 with key 7
right node, layer 3 with key 8
right node, layer 2 with key 4
right node, layer 1 with key 2
left node, layer 2 with key 5
right node, layer 2 with key 6

left node, layer 1 with key 1
left node, layer 2 with key 3
left node, layer 3 with key 7
right node, layer 3 with key 8
right node, layer 2 with key 4
root node, layer 0 with key 0
right node, layer 1 with key 2
left node, layer 2 with key 5
right node, layer 2 with key 6

left node, layer 1 with key 1
left node, layer 2 with key 3
left node, layer 3 with key 7
right node, layer 3 with key 8
right node, layer 2 with key 4
right node, layer 1 with 

#### Another way of building a tree

Build the nodes directly with the treeNode constructor; remember to pass None for the parent pointer.

In [5]:
root = treeNode(1, None, treeNode(4, None, treeNode(14), treeNode(24)), treeNode(2, None, treeNode(12), treeNode(22)))
tree = naiveBinaryTree(root)
tree.traverse_bfs()
tree.traverse_dfs_preorder(tree.root)
tree.traverse_dfs_inorder(tree.root)
tree.traverse_dfs_postorder(tree.root)

1-
4-2-
14-24-12-22-
root node, layer 0 with key 1
left node, layer 1 with key 4
left node, layer 2 with key 14
right node, layer 2 with key 24
right node, layer 1 with key 2
left node, layer 2 with key 12
right node, layer 2 with key 22
left node, layer 1 with key 4
left node, layer 2 with key 14
right node, layer 2 with key 24
root node, layer 0 with key 1
right node, layer 1 with key 2
left node, layer 2 with key 12
right node, layer 2 with key 22
left node, layer 1 with key 4
left node, layer 2 with key 14
right node, layer 2 with key 24
right node, layer 1 with key 2
left node, layer 2 with key 12
right node, layer 2 with key 22
root node, layer 0 with key 1


### Binary Search Tree
Criteria:
* is a binary tree
* every key in the left subtree < current key < every key in the right subtree
* no duplicate keys
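
These criteria can be sketched as an insert that always preserves the ordering, a search that follows it, and a checker that verifies it. The names `BSTNode`, `bst_insert`, `bst_search` and the free function `is_valid_bst` are assumptions for illustration; they are not the `bst` module imported below.

```python
class BSTNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def bst_insert(root, key):
    """Insert key and return the root; duplicate keys are ignored."""
    if root is None:
        return BSTNode(key)
    cur = root
    while True:
        if key == cur.key:
            return root  # no duplicates allowed
        if key < cur.key:
            if cur.left is None:
                cur.left = BSTNode(key)
                return root
            cur = cur.left
        else:
            if cur.right is None:
                cur.right = BSTNode(key)
                return root
            cur = cur.right


def bst_search(root, key):
    """Follow the ordering down from the root; cost is the tree height."""
    cur = root
    while cur is not None and cur.key != key:
        cur = cur.left if key < cur.key else cur.right
    return cur is not None


def is_valid_bst(node, low=None, high=None):
    """Each key must lie strictly between the bounds set by its ancestors."""
    if node is None:
        return True
    if low is not None and node.key <= low:
        return False
    if high is not None and node.key >= high:
        return False
    return (is_valid_bst(node.left, low, node.key)
            and is_valid_bst(node.right, node.key, high))


root = None
for key in (74, 73, 23, 31, 87, 74):  # the second 74 is ignored
    root = bst_insert(root, key)
print(bst_search(root, 31))  # True
print(bst_search(root, 5))   # False
print(is_valid_bst(root))    # True
```

Passing bounds down the recursion matters: comparing each node only with its direct children would accept a tree in which a key deep in the left subtree is larger than the root.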

In [8]:
from bst import *
import random
bstree = bst()
for _ in range(100):
    bstree.insert(random.randint(0,100))

@insert 74 at root!
@insert 73 as left leaf!
@insert 23 as left leaf!
@insert 31 as right leaf!
@insert 39 as right leaf!
@insert 87 as right leaf!
@insert 63 as right leaf!
@insert 85 as left leaf!
@insert 70 as right leaf!
@insert 93 as right leaf!
@cannot insert duplicated keyue to bst!
@insert 37 as left leaf!
@insert 89 as left leaf!
@insert 86 as right leaf!
@insert 47 as left leaf!
@insert 43 as left leaf!
@cannot insert duplicated keyue to bst!
@insert 90 as right leaf!
@insert 100 as right leaf!
@insert 72 as right leaf!
@insert 32 as left leaf!
@insert 42 as left leaf!
@insert 48 as right leaf!
@cannot insert duplicated keyue to bst!
@cannot insert duplicated keyue to bst!
@cannot insert duplicated keyue to bst!
@insert 21 as left leaf!
@insert 10 as left leaf!
@insert 38 as right leaf!
@cannot insert duplicated keyue to bst!
@insert 17 as right leaf!
@insert 54 as right leaf!
@insert 64 as left leaf!
@insert 67 as right leaf!
@insert 69 as right leaf!
@insert 8 as left leaf!

In [10]:
for _ in range(1):
    key = random.randint(0, 100)
    bstree.delete(key)
    if not bstree.is_valid_bst():
        bstree.breadFirst()
        bstree.inorder()
        print('invalid tree')
        break
    else:
        bstree.breadFirst()

@delete failed, no match found!
74-
23-90-
21-31-85-95-
10-27-39-81-86-99-
1-17-36-64-76-
2-15-18-55-
45-
42-


In [11]:
bstree.inorder()

1-2-10-15-17-18-21-23-27-31-36-39-42-45-55-64-74-76-81-85-86-90-95-99-
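
Deletion is the operation most likely to break the BST ordering, which is why the loop above re-validates the tree after each delete. Below is a minimal sketch of one common approach, replacing a node that has two children with its in-order successor. It is an assumption that `bstree.delete` works along these lines; `BSTNode`, `bst_insert`, `bst_delete` and `inorder` are hypothetical names.

```python
class BSTNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def bst_insert(node, key):
    """Recursive insert; returns the subtree root and ignores duplicates."""
    if node is None:
        return BSTNode(key)
    if key < node.key:
        node.left = bst_insert(node.left, key)
    elif key > node.key:
        node.right = bst_insert(node.right, key)
    return node


def bst_delete(node, key):
    """Remove key from the subtree rooted at node; return the new subtree root."""
    if node is None:
        return None  # key not found, nothing changes
    if key < node.key:
        node.left = bst_delete(node.left, key)
    elif key > node.key:
        node.right = bst_delete(node.right, key)
    else:
        # zero or one child: splice the node out
        if node.left is None:
            return node.right
        if node.right is None:
            return node.left
        # two children: copy the in-order successor's key here,
        # then delete that successor from the right subtree
        succ = node.right
        while succ.left is not None:
            succ = succ.left
        node.key = succ.key
        node.right = bst_delete(node.right, succ.key)
    return node


def inorder(node):
    """In-order traversal of a valid BST yields the keys in sorted order."""
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)


root = None
for key in (50, 30, 70, 20, 40, 60, 80):
    root = bst_insert(root, key)
root = bst_delete(root, 20)  # leaf
root = bst_delete(root, 30)  # one child
root = bst_delete(root, 50)  # two children, replaced by successor 60
print(inorder(root))         # [40, 60, 70, 80]
```

Checking that the in-order output is still sorted after each delete is a quick sanity test for the ordering criterion.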