# Data Structures

## 1 Introduction to Data Structures

A data structure is a way of organizing, managing and storing data in a format that enables efficient access to and modification of that data.

More precisely,

* **a data structure is a `collection of data` values, the `relationships` among them, and the `functions or operations` that can be applied to the data**

There are various types of data structures commonly available. It is up to the programmer to choose which data structure to use depending on the data.

The choice of a particular one can be considered based on the following points:

* It must be able to `represent` the `inherent relationship` of the data in the real world.

* It must be able to `process` the data `efficiently` when necessary.


### 1.1 Why study Data Structures?
>“Bad programmers worry about the `code`. 
> 
>  Good programmers worry about `data structures and their relationships`.”
>
>      Linus Torvalds

Time and energy are both required to process any instruction. Every CPU cycle that is saved will have an effect on both the time and energy consumed and can be put to better use in processing other instructions.

A program built on poorly chosen data structures will therefore be inefficient or unnecessarily complex. It is necessary to know the common data structures well and to understand where each one fits best.

The study includes the description, implementation and quantitative performance analysis of the data structure.

### 1.2 Concept of a Data Type

**Primitive Data Types**

A primitive data type is one that is inbuilt into the programming language for defining the most basic types of data. These may be different for the various programming languages available.

For example, the C programming language has inbuilt support for characters (char), integers (int, long) and real numbers (float, double).

**User-Defined Data Type**

A user-defined data type, as the name suggests, is one that the user defines according to the requirements of the data to be stored. Most programming languages provide support for creating user-defined data types.

For example, C provides support through structures (struct), unions (union) and enumerations (enum).

**Abstract Data Type (ADT)**

An Abstract Data Type is defined by its behaviour from the point of view of the user of the data. It is specified in terms of possible values, the operations on the data, and the behaviour of these operations.

For example, the user of the stack data structure only knows about the `push` and `pop` operations. They do not care how the push operation interacts with memory to store the data; they only expect it to store the data in the way specified.

```cpp
The Stack ADT {

Data attributes:
    the linear list of items, the top of the stack, the bottom of the stack
Operations:
    Create():   create an empty stack
    IsEmpty():  return true if the stack is empty, otherwise return false
    push(item): push (add) an element to the stack. The element always gets added to the top of the current stack items.
    pop(item):  pop (remove) an element from the stack. The element always gets popped off from the top of the stack.
}
```
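
One way to express this split between interface and implementation in C++ is an abstract base class. The sketch below is only an illustration: the names `IntStack` and `VectorIntStack` are assumptions that do not appear elsewhere in these notes, and `std::vector` merely stands in for whatever storage an implementation might choose.

```cpp
#include <vector>

// Hypothetical interface: everything a user of the stack ADT may do.
class IntStack {
public:
   virtual ~IntStack() { }
   virtual bool isEmpty() const = 0;
   virtual void push(int item) = 0;    // add item on top
   virtual bool pop(int & item) = 0;   // remove top into item; false if empty
};

// One possible implementation; users of IntStack never see the vector.
class VectorIntStack : public IntStack {
private:
   std::vector<int> items;             // the back of the vector is the top
public:
   bool isEmpty() const { return items.empty(); }
   void push(int item) { items.push_back(item); }
   bool pop(int & item) {
      if (items.empty()) return false;
      item = items.back();
      items.pop_back();
      return true;
   }
};
```

Code written against `IntStack` keeps working if the storage is later swapped for an array or a linked structure.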



### 1.3 Common Operations in a Data Structure

A data structure is only **useful** when you can perform **operations** on it, right? 

These are the basic operations that every data structure should support.

**Access**

* This operation handles how the elements currently stored in the structure can be accessed.

**Search**
* This operation handles finding the location of a given element of a given structure.

**Insertion**
* This operation specifies how new elements are to be added to the structure.

**Deletion**
* This operation specifies how existing elements can be removed from the structure.
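
As a rough illustration, the four operations can be tried out on a `std::vector<int>`. The helper names (`accessAt`, `searchFor`, `insertAt`, `deleteAt`) are made up for this sketch; only the `std::vector` and `std::find` calls come from the standard library.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Access: read the element stored at a given position.
int accessAt(const std::vector<int> & v, std::size_t index) {
   return v.at(index);   // at() throws std::out_of_range on a bad index
}

// Search: return the position of the first match, or -1 if absent.
int searchFor(const std::vector<int> & v, int target) {
   std::vector<int>::const_iterator it = std::find(v.begin(), v.end(), target);
   return it == v.end() ? -1 : static_cast<int>(it - v.begin());
}

// Insertion: place value before the element currently at index
// (index must be <= v.size()).
void insertAt(std::vector<int> & v, std::size_t index, int value) {
   v.insert(v.begin() + index, value);
}

// Deletion: remove the element at index (index must be < v.size()).
void deleteAt(std::vector<int> & v, std::size_t index) {
   v.erase(v.begin() + index);
}
```

Other structures offer the same four operations, but at different costs, which is what the following sections compare.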

### 1.4 Classification of data structure

A data structure can be broadly classified into 2 types:
    
* Linear Data Structures     
* Non-Linear Data Structures     

#### 1.4.1 Linear Data Structures

A linear data structure’s elements form a sequence. Every element except the first has an element before it, and every element except the last has an element after it.

Examples of linear structures are:

**Arrays**
* An array holds a `fixed` number of elements of the same type that are stored under one name.

These elements are stored in `contiguous` memory locations. The elements of an array can be accessed using one `identifier` and an index.

**Linked Lists**
* A linked list is a linear data structure where each element is a `separate` object, known as a `node`.

Each node contains some `data and points` to the next node in the structure, `forming a sequence`.

**Stacks**
* Stacks are a type of linear data structure that stores data in an **order** known as **Last In First Out (LIFO)**.

This property is helpful when the most recently added item must be handled first, for example when keeping track of nested function calls.

**Queues**
* Queues are a type of linear data structure that stores data in an order known as **First In First Out (FIFO)**.

This property is helpful when items must be handled in the order in which they arrived, for example jobs waiting for a printer.

#### 1.4.2 Non-Linear Data Structures

A non-linear data structure’s elements do not form a sequence. An element **may not have a unique element** before and after it.

**Trees**

* A tree is a data structure that simulates a `hierarchical` tree, with a `root` value and the `children` as the `subtrees`, represented by `a set of linked nodes`.

**Heaps**

* A heap is a complete binary tree that satisfies the heap property.
>The heap property says that the value of a parent is either greater than or equal to (in a max heap) or less than or equal to (in a min heap) the value of each of its children.

**Graphs**

* A graph data structure is used to represent `relations` between `pairs` of objects.

It consists of `nodes` (known as vertices) that are `connected` through `links` (known as edges).

The relationship between the nodes can be used to model the relation between the objects in the graph.
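
A minimal sketch of one common representation, the adjacency list, is shown below. The class name `SimpleGraph` and its methods are assumptions made for this illustration, and the graph is taken to be undirected with vertices numbered from 0.

```cpp
#include <cstddef>
#include <vector>

// Undirected graph stored as adjacency lists (illustrative sketch).
// Vertices are numbered 0 .. vertexCount-1.
class SimpleGraph {
private:
   std::vector< std::vector<int> > adjacent;   // adjacent[u] = neighbours of u
public:
   explicit SimpleGraph(std::size_t vertexCount) : adjacent(vertexCount) { }

   // Add an edge between u and v (both must be valid vertex numbers).
   void addEdge(int u, int v) {
      adjacent[u].push_back(v);
      adjacent[v].push_back(u);
   }

   // Is there an edge between u and v?
   bool hasEdge(int u, int v) const {
      for (std::size_t i = 0; i < adjacent[u].size(); ++i) {
         if (adjacent[u][i] == v) return true;
      }
      return false;
   }

   // Number of edges that touch u.
   std::size_t degree(int u) const { return adjacent[u].size(); }
};
```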

**Hash Tables**

* A Hash Table is a data structure where data is stored in an `associative` manner.

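The data is `mapped` to `array positions` by a `hash function` that computes a position from each `key`. Different keys can map to the same position (a collision), so a practical hash table also needs a strategy for handling collisions.

A value stored in a hash table can then be **searched in $O(1)$ time on average** using the same hash function, which generates an address from the key.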
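
The idea can be sketched with a small table that resolves collisions by chaining, which is only one of several possible strategies. Everything here (`TinyHashTable`, `put`, `get`, the modulo hash) is a hypothetical illustration and assumes a bucket count greater than zero.

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Minimal chained hash table from int keys to string values.
class TinyHashTable {
private:
   typedef std::pair<int, std::string> Entry;
   typedef std::list<Entry> Chain;
   std::vector<Chain> buckets;   // one chain of entries per array position

   // Hash function: maps a key to an array position. Two different keys
   // may land on the same position, so each position holds a chain.
   std::size_t indexFor(int key) const {
      return static_cast<std::size_t>(key < 0 ? -(key + 1) : key) % buckets.size();
   }
public:
   explicit TinyHashTable(std::size_t bucketCount = 8) : buckets(bucketCount) { }

   // Store value under key, overwriting any previous value for that key.
   void put(int key, const std::string & value) {
      Chain & chain = buckets[indexFor(key)];
      for (Chain::iterator it = chain.begin(); it != chain.end(); ++it) {
         if (it->first == key) { it->second = value; return; }
      }
      chain.push_back(std::make_pair(key, value));
   }

   // Copy the value stored under key into value; false if key is absent.
   bool get(int key, std::string & value) const {
      const Chain & chain = buckets[indexFor(key)];
      for (Chain::const_iterator it = chain.begin(); it != chain.end(); ++it) {
         if (it->first == key) { value = it->second; return true; }
      }
      return false;
   }
};
```

A lookup only scans one chain, so it stays fast as long as the chains remain short.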

## 2 Linear Data Structures

A linked structure consists of items that are linked to other items. 

Although many links among items are possible, the two simplest linked structures are `the singly linked structure` and `the doubly linked structure.`

### 2.1 Single Linked List

The linked list consists of a series of structures, which are not necessarily adjacent in memory.

Each structure contains the element and a pointer to a structure containing its successor. We call this
the `next` pointer.

The `last` cell's `next` pointer points to `NULL`. This value is defined by C and cannot be confused with another pointer; ANSI C specifies that `NULL` is **zero**.

![](./img/ds/DS_SingleLinkedList.png)

A user of a singly linked structure accesses the first item by following a single external `head` link. 

The user then accesses other items by chaining through `the single links` (represented by arrows in the figure) that emanate from the items. Thus, in a singly linked structure, it is easy to get to the `successor` of an item, but not so easy to get to the `predecessor` of an item.

**Operations in a Linked List**

The basic operations on a linked list include adding, deleting and modifying nodes:

* Create an empty list without any node

* Remove all the dynamically allocated nodes

* Is list empty?

* Push the data at the `front` by dynamically allocating a new node

* Push the data at the `end` by dynamically allocating a new node

* Pop the data at the `end` into `value` and remove the node

* Pop the data at the `front` into `value` and remove the node


**node.h**

In [97]:
%%file ./demo/include/node.h
#ifndef NODE_H
#define NODE_H
 
template <typename T> class List;  // Forward reference
 
template <typename T>
class Node {
private:
   T data;
   Node * nextPtr;
public:
   Node (T d) : data(d), nextPtr(0) { }; // Constructor
   T getData() const { return data; };   // Public getter for data
   Node * getNextPtr() const { return nextPtr; } // Public getter for nextPtr
 
friend class List<T>;  // Make List class a friend to access private data
};
 
#endif

Overwriting ./demo/include/node.h


In [99]:
%%file ./demo/include/list.h
#ifndef LIST_H
#define LIST_H
 
#include <iostream>
#include "node.h"   // must match the file name written above on case-sensitive file systems
 
// Forward Reference
template <typename T>
std::ostream & operator<<(std::ostream & os, const List<T> & lst);
 
template <typename T>
class List {
private:
   Node<T> * frontPtr;  // First node
   Node<T> * backPtr;   // Last node
public:
   List();   // Constructor
   ~List();  // Destructor
   void pushFront(const T & value);
   void pushBack(const T & value);
   bool popFront(T & value);
   bool popBack(T & value);
   bool isEmpty() const;
 
friend std::ostream & operator<< <>(std::ostream & os, const List<T> & lst);
      // Overload the stream insertion operator to print the list
};
 
// Constructor - Create an empty list without any node
template <typename T>
List<T>::List() : frontPtr(0), backPtr(0) { }
 
// Destructor - Remove all the dynamically allocated nodes
template <typename T>
List<T>::~List() {
   while (frontPtr) {
      Node<T> * tempPtr = frontPtr;
      frontPtr = frontPtr->nextPtr;
      delete tempPtr;
   }
   // std::cout << "Destructor completed..." << std::endl;
}
 
// Is list empty? Check if frontPtr is null
template <typename T>
bool List<T>::isEmpty() const { return frontPtr == 0; }
 
// Push the data at the front by dynamically allocating a new node
template <typename T>
void List<T>::pushFront(const T & value) {
   Node<T> * newNodePtr = new Node<T>(value);
   if (isEmpty()) {
      frontPtr = backPtr = newNodePtr;
   } else {
      newNodePtr->nextPtr = frontPtr;
      frontPtr = newNodePtr;
   }
}
 
// Push the data at the end by dynamically allocating a new node
template <typename T>
void List<T>::pushBack(const T & value) {
   Node<T> * newNodePtr = new Node<T>(value);
   if (isEmpty()) {
      frontPtr = backPtr = newNodePtr;
   } else {
      backPtr->nextPtr = newNodePtr;
      backPtr = newNodePtr;
   }
}
 
// Pop: copy the data at the front into value and remove the node
template <typename T>
bool List<T>::popFront(T & value) {
   if (isEmpty()) {
      return false;
   } else if (frontPtr == backPtr) {  // only one node
      value = frontPtr->data;
      delete frontPtr;         // remove node
      frontPtr = backPtr = 0;  // empty
   } else {
      value = frontPtr->data;
      Node<T> * tempPtr = frontPtr;
      frontPtr = frontPtr->nextPtr;
      delete tempPtr;
   }
   return true;
}
 
// Pop: copy the data at the end into value and remove the node
template <typename T>
bool List<T>::popBack(T & value) {
   if (isEmpty()) {
      return false;
   } else if (frontPtr == backPtr) {  // only one node
      value = backPtr->data;
      delete backPtr;          // remove node
      frontPtr = backPtr = 0;  // empty
   } else {
      // Locate second to last node
      Node<T> * currentPtr = frontPtr;
      while (currentPtr->nextPtr != backPtr) {
         currentPtr = currentPtr->nextPtr;
      }
      value = backPtr->data;
      delete backPtr;          // remove last node
      backPtr = currentPtr;
      currentPtr->nextPtr = 0;
   }
   return true;
}
 
// Overload the stream insertion operator to print the list
template <typename T>
std::ostream & operator<< (std::ostream & os, const List<T> & lst) {
   os << '{';
   if (!lst.isEmpty()) {
      Node<T> * currentPtr = lst.frontPtr;
      while (currentPtr) {
         os << currentPtr->getData();
         if (currentPtr != lst.backPtr) os << ',';
         currentPtr = currentPtr->getNextPtr();
      }
   }
   os << '}';
   return os;  // return the stream so that calls can be chained
}
 
#endif

Overwriting ./demo/include/list.h


**TestList.cpp**

In [101]:
%%file ./demo/src/TestList.cpp

/*
Test Driver for List class (TestList.cpp) 
*/
#include <iostream>
#include "list.h"   // must match the file name written above on case-sensitive file systems
using namespace std;
 
int main() {
 
   List<int> lst1;
   cout << lst1 << endl;
   lst1.pushFront(8);
   lst1.pushBack(88);
   lst1.pushFront(9);
   lst1.pushBack(99);
   cout << lst1 << endl;
 
   int result;
   lst1.popBack(result)
      ? cout << "value is " << result << ", list is " << lst1 << endl
      : cout << "empty list" << endl;
   lst1.popBack(result)
      ? cout << "value is " << result << ", list is " << lst1 << endl
      : cout << "empty list" << endl;
   lst1.popFront(result)
      ? cout << "value is " << result << ", list is " << lst1 << endl
      : cout << "empty list" << endl;
   lst1.popFront(result)
      ? cout << "value is " << result << ", list is " << lst1 << endl
      : cout << "empty list" << endl;
   lst1.popBack(result)
      ? cout << "value is " << result << ", list is " << lst1 << endl
      : cout << "empty list" << endl;
}

Overwriting ./demo/src/TestList.cpp


In [103]:
!g++  -w -o ./demo/bin/TestList ./demo/src/TestList.cpp -I./demo/include/

In [105]:
!.\demo\bin\TestList

{}
{9,8,88,99}
value is 99, list is {9,8,88}
value is 88, list is {9,8}
value is 9, list is {8}
value is 8, list is {}
empty list


### 2.2 Double Linked List

Sometimes it is convenient to traverse lists backwards. The standard implementation does not help here, but the solution is simple.

* Merely add an `extra` field to the data structure, containing `a pointer to the previous` cell.

The cost of this is an extra link, which adds to the space requirement and also doubles the cost of insertions and deletions because there are more pointers to fix. 

On the other hand, it simplifies deletion, because you no longer have to refer to a key by using a pointer to the previous cell; this information is now at hand.
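
The point about deletion can be sketched as follows: with a `prev` pointer in every node, a node can be unlinked without first walking the list to find its predecessor. `DNode`, `insertAfter` and `unlinkNode` are names invented for this sketch and are separate from the `DoubleLinkedNode` class used in this section.

```cpp
// Hypothetical node type for this sketch.
struct DNode {
   int data;
   DNode * prev;
   DNode * next;
   explicit DNode(int d) : data(d), prev(0), next(0) { }
};

// Insert newNode right after node (node must not be null).
// Note the four pointer updates, twice as many as in a singly linked list.
void insertAfter(DNode * node, DNode * newNode) {
   newNode->prev = node;
   newNode->next = node->next;
   if (node->next) node->next->prev = newNode;
   node->next = newNode;
}

// Unlink node from its list using only the node itself.
// The caller still owns node and is responsible for freeing it.
void unlinkNode(DNode * node) {
   if (node->prev) node->prev->next = node->next;
   if (node->next) node->next->prev = node->prev;
   node->prev = node->next = 0;
}
```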

![](./img/ds/DS_DoubleLinkedList.png)

**Operations in a Double Linked List**


* Create an empty list without any node

* Remove all the dynamically allocated nodes

* Is list empty?

* Push the data at the `front` by dynamically allocating a new node

* Push the data at the `end` by dynamically allocating a new node

* Pop the data at the `end` into `value` and remove the node

* Pop the data at the `front` into `value` and remove the node


**DoubleLinkedNode.h**

In [107]:
%%file ./demo/include/DoubleLinkedNode.h
/* 
  DoubleLinkedNode template class for double linked list (DoubleLinkedNode.h)
*/
#ifndef DOUBLE_LINKED_NODE_H
#define DOUBLE_LINKED_NODE_H
 
template <typename T> class DoubleLinkedList; // Forward reference
 
template <typename T>
class DoubleLinkedNode {
private:
   T data;
   DoubleLinkedNode * nextPtr;
   DoubleLinkedNode * prevPtr;
public:
   DoubleLinkedNode (T d) : data(d), nextPtr(0), prevPtr(0) { };
   T getData() const { return data; };
   DoubleLinkedNode * getNextPtr() const { return nextPtr; }
   DoubleLinkedNode * getPrevPtr() const { return prevPtr; }
 
friend class DoubleLinkedList<T>;
   // Make DoubleLinkedList class a friend to access private data
};
 
#endif

Overwriting ./demo/include/DoubleLinkedNode.h


**DoubleLinkedList.h**

In [109]:
%%file ./demo/include/DoubleLinkedList.h
/* 
    DoubleLinkedList template class for double linked list
   (DoubleLinkedList.h)
*/
#ifndef DOUBLE_LINKED_LIST_H
#define DOUBLE_LINKED_LIST_H
 
#include <iostream>
#include "DoubleLinkedNode.h"
 
// Forward Reference
template <typename T>
std::ostream & operator<<(std::ostream & os,
      const DoubleLinkedList<T> & lst);
 
template <typename T>
class DoubleLinkedList {
private:
   DoubleLinkedNode<T> * frontPtr;
   DoubleLinkedNode<T> * backPtr;
public:
   DoubleLinkedList();   // Constructor
   ~DoubleLinkedList();  // Destructor
   void pushFront(const T & value);
   void pushBack(const T & value);
   bool popFront(T & value);
   bool popBack(T & value);
   bool isEmpty() const;
 
friend std::ostream & operator<< <>(std::ostream & os,
      const DoubleLinkedList<T> & lst);
      // Overload the stream insertion operator to print the list
};
 
// Constructor - Create an empty list with no node
template <typename T>
DoubleLinkedList<T>::DoubleLinkedList() : frontPtr(0), backPtr(0) { }
 
// Destructor - Remove all the dynamically allocated nodes
template <typename T>
DoubleLinkedList<T>::~DoubleLinkedList() {
   while (frontPtr) {
      DoubleLinkedNode<T> * tempPtr = frontPtr;
      frontPtr = frontPtr->nextPtr;
      delete tempPtr;
   }
   // std::cout << "Destructor completed..." << std::endl;
}
 
// Is list empty? Check if frontPtr is null
template <typename T>
bool DoubleLinkedList<T>::isEmpty() const { return frontPtr == 0; }
 
// Push the data at the front by dynamically allocating a new node
template <typename T>
void DoubleLinkedList<T>::pushFront(const T & value) {
   DoubleLinkedNode<T> * newPtr = new DoubleLinkedNode<T>(value);
   if (isEmpty()) {
      frontPtr = backPtr = newPtr;
   } else {
      frontPtr->prevPtr = newPtr;
      newPtr->nextPtr = frontPtr;
      frontPtr = newPtr;
   }
}
 
// Push the data at the end by dynamically allocating a new node
template <typename T>
void DoubleLinkedList<T>::pushBack(const T & value) {
   DoubleLinkedNode<T> * newPtr = new DoubleLinkedNode<T>(value);
   if (isEmpty()) {
      frontPtr = backPtr = newPtr;
   } else {
      backPtr->nextPtr = newPtr;
      newPtr->prevPtr = backPtr;
      backPtr = newPtr;
   }
}
 
// Pop: copy the data at the front into value and remove the node
template <typename T>
bool DoubleLinkedList<T>::popFront(T & value) {
   if (isEmpty()) {
      return false;
   } else if (frontPtr == backPtr) {  // only one node
      value = frontPtr->data;
      delete frontPtr;         // remove node
      frontPtr = backPtr = 0;  // empty
   } else {
      value = frontPtr->data;
      DoubleLinkedNode<T> * tempPtr = frontPtr;
      frontPtr = frontPtr->nextPtr;
      frontPtr->prevPtr = 0;
      delete tempPtr;
   }
   return true;
}
 
// Pop: copy the data at the end into value and remove the node
template <typename T>
bool DoubleLinkedList<T>::popBack(T & value) {
   if (isEmpty()) {
      return false;
   } else if (frontPtr == backPtr) {  // only one node
      value = backPtr->data;
      delete backPtr;          // remove node
      frontPtr = backPtr = 0;  // empty
   } else {
      value = backPtr->data;
      DoubleLinkedNode<T> * tempPtr = backPtr;
      backPtr = backPtr->prevPtr;  // 2nd last node
      backPtr->nextPtr = 0;
      delete tempPtr;
   }
   return true;
}
 
// Overload the stream insertion operator to print the list
template <typename T>
std::ostream & operator<< (std::ostream & os, const DoubleLinkedList<T> & lst) {
   os << '{';
   if (!lst.isEmpty()) {
      DoubleLinkedNode<T> * currentPtr = lst.frontPtr;
      while (currentPtr) {
         os << currentPtr->getData();
         if (currentPtr != lst.backPtr) os << ',';
         currentPtr = currentPtr->getNextPtr();
      }
   }
   os << '}';
   return os;  // return the stream so that calls can be chained
}
 
#endif

Overwriting ./demo/include/DoubleLinkedList.h


**TestDoubleLinkedList.cpp**

In [111]:
%%file ./demo/src/TestDoubleLinkedList.cpp

/* 
 Test Driver for List class (TestDoubleLinkedList.cpp) 
*/
#include <iostream>
#include "DoubleLinkedList.h"
using namespace std;
 
int main() {
 
   DoubleLinkedList<int> lst1;
   cout << lst1 << endl;
   lst1.pushFront(8);
   lst1.pushBack(88);
   lst1.pushFront(9);
   lst1.pushBack(99);
   cout << lst1 << endl;
 
   int result;
   lst1.popBack(result)
      ? cout << "value is " << result << ", list is " << lst1 << endl
      : cout << "empty list" << endl;
   lst1.popBack(result)
      ? cout << "value is " << result << ", list is " << lst1 << endl
      : cout << "empty list" << endl;
   lst1.popFront(result)
      ? cout << "value is " << result << ", list is " << lst1 << endl
      : cout << "empty list" << endl;
   lst1.popFront(result)
      ? cout << "value is " << result << ", list is " << lst1 << endl
      : cout << "empty list" << endl;
   lst1.popBack(result)
      ? cout << "value is " << result << ", list is " << lst1 << endl
      : cout << "empty list" << endl;
}

Overwriting ./demo/src/TestDoubleLinkedList.cpp


In [113]:
!g++ -w -o ./demo/bin/TestDoubleLinkedList ./demo/src/TestDoubleLinkedList.cpp -I./demo/include/

In [115]:
!.\demo\bin\TestDoubleLinkedList 

{}
{9,8,88,99}
value is 99, list is {9,8,88}
value is 88, list is {9,8}
value is 9, list is {8}
value is 8, list is {}
empty list


### 2.3 Stack (LIFO)

#### 2.3.1 Overview of Stacks

Stacks are linear collections in which access is completely restricted to just one end, called the **top**. 

The classic analogy is the stack of clean trays found in every cafeteria. Whenever a tray is needed, it is removed from the top of the stack, and whenever clean ones come back from the kitchen, they are again placed on the top. No one ever takes some particularly fine tray from the middle of the stack, and trays near the bottom might never be used.

Stacks are said to adhere to a last-in first-out (LIFO) protocol. The last tray brought back from the dishwasher is the first one a customer takes.

The two primary operations for putting items on and removing items from a stack are called **push** and **pop**, respectively. 

1. Push Operation

  * This is used to add (or `push`) an element to the stack. The element always gets added to the `top` of the current stack items.

2. Pop Operation
  * This is used to remove (or `pop`) an element from the stack. The element always gets popped off `from the top` of the stack


The Figure shows a stack as it might appear at various stages. The item at the top of the stack is shaded.

![](./img/ds/stacklife.jpg)

#### 2.3.2 Implementations of Stacks

Because of their simple behavior and linear structure, stacks are implemented easily using arrays or linked structures.
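
The array version is developed below. For comparison, here is a rough sketch of the linked alternative, where each push allocates one cell and the stack never needs to grow an array. `LinkedIntStack` and `Cell` are hypothetical names, and the sketch stores `int` values only to stay short.

```cpp
// Sketch of a stack of ints stored in a singly linked structure.
class LinkedIntStack {
private:
   struct Cell {
      int value;
      Cell * below;   // the cell directly underneath this one
      Cell(int v, Cell * b) : value(v), below(b) { }
   };
   Cell * top;        // null when the stack is empty

   // Copying is disabled to keep the sketch short (declared, not defined).
   LinkedIntStack(const LinkedIntStack &);
   LinkedIntStack & operator=(const LinkedIntStack &);
public:
   LinkedIntStack() : top(0) { }
   ~LinkedIntStack() {
      while (top) {
         Cell * old = top;
         top = top->below;
         delete old;
      }
   }
   bool isEmpty() const { return top == 0; }

   // Push: the new cell becomes the top and points at the old top.
   void push(int value) { top = new Cell(value, top); }

   // Pop: copy the top value out, then remove the top cell.
   bool pop(int & value) {
      if (isEmpty()) return false;
      Cell * old = top;
      value = old->value;
      top = old->below;
      delete old;
      return true;
   }
};
```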



##### Array Implementation of Stacks in CPP

We need to declare:
* an array: `T *data`
* the array size, chosen ahead of time: `int capacity`
* the top of stack, starting at index -1: `int tos`

The two primary operations are push and pop:

* push(const T & value)
* pop(T & value)



**Stack.h**

In [117]:
%%file ./demo/include/Stack.h

#ifndef STACK_H
#define STACK_H
 
#include <iostream>
 
// Forward Reference
template <typename T>
class Stack;
template <typename T>
std::ostream & operator<<(std::ostream & os, const Stack<T> & s);
 
template <typename T>
class Stack {
private:
   T * data;      // Array
   int tos;       // Top of stack, start at index -1
   int capacity;  // capacity of the array
   int increment; // each subsequent increment size
public:
   explicit Stack(int capacity = 10, int increment = 10);
   ~Stack();  // Destructor
   void push(const T & value);
   bool pop(T & value);
   bool isEmpty() const;
 
friend std::ostream & operator<< <>(std::ostream & os, const Stack<T> & s);
      // Overload the stream insertion operator to print the stack
};
 
// Constructor - Allocate the array and start with an empty stack
template <typename T>
Stack<T>::Stack(int cap, int inc) : capacity(cap), increment(inc) {
   data = new T[capacity];
   tos = -1;
}
 
// Destructor - Release the dynamically allocated array
template <typename T>
Stack<T>::~Stack() {
   delete[] data;  // remove the dynamically allocate storage
   // std::cout << "Destructor completed..." << std::endl;
}
 
// Is stack empty? Check if tos is below index 0
template <typename T>
bool Stack<T>::isEmpty() const { return tos < 0; }
 
// Push the data on top of the stack, growing the array first if it is full
template <typename T>
void Stack<T>::push(const T & value) {
   if (tos >= capacity - 1) {
      // No more space. Allocate a bigger array and copy the items over
      T * newDataPtr = new T[capacity + increment];
      for (int i = 0; i <= tos; ++i) {
         newDataPtr[i] = data[i];
      }
      delete[] data;
      data = newDataPtr;
      capacity += increment;  // remember the new size
   }
   data[++tos] = value;       // there is room now, so store the value
}
 
// Pop the data from the TOS
template <typename T>
bool Stack<T>::pop(T & value) {
   if (isEmpty()) {
      return false;
   } else {
      value = data[tos--];
   }
   return true;
}
 
// Overload the stream insertion operator to print the stack, top first
template <typename T>
std::ostream & operator<< (std::ostream & os, const Stack<T> & stack) {
   os << '{';
   for (int i = stack.tos; i >= 0; --i) {
      os << stack.data[i];
      if (i > 0) os << ',';
   }
   os << '}';
   return os;  // return the stream so that calls can be chained
}
 
#endif

Overwriting ./demo/include/Stack.h


In [119]:
%%file ./demo/src/TestStack.cpp
/* 
  Test Driver for Stack class (TestStack.cpp)
    
*/
#include <iostream>
#include "Stack.h"
using namespace std;
 
int main() {
 
   Stack<int> s1;
   cout << s1 << endl;
   s1.push(8);
   s1.push(88);
   cout << s1 << endl;
 
   int result;
   s1.pop(result)
      ? cout << "value is " << result << ", stack is " << s1 << endl
      : cout << "empty stack" << endl;
 
   s1.push(9);
   s1.push(99);
   cout << s1 << endl;
 
   s1.pop(result)
      ? cout << "value is " << result << ", stack is " << s1 << endl
      : cout << "empty stack" << endl;
 
   s1.pop(result)
      ? cout << "value is " << result << ", stack is " << s1 << endl
      : cout << "empty stack" << endl;
   s1.pop(result)
      ? cout << "value is " << result << ", stack is " << s1 << endl
      : cout << "empty stack" << endl;
   s1.pop(result)
      ? cout << "value is " << result << ", stack is " << s1 << endl
      : cout << "empty stack" << endl;
}

Overwriting ./demo/src/TestStack.cpp


In [104]:
!g++ -w -o ./demo/bin/TestStack ./demo/src/TestStack.cpp -I./demo/include/

In [106]:
!.\demo\bin\TestStack 

{}
{88,8}
value is 88, stack is {8}
{99,9,8}
value is 99, stack is {9,8}
value is 9, stack is {8}
value is 8, stack is {}
empty stack


### 2.4 Queue (FIFO)

Like stacks, queues are linear collections. However, with queues, insertions are restricted to one end, called the **rear**, and removals to the other end, called the **front**.

A queue thus supports **a first-in first-out (FIFO)** protocol. 

Queues are omnipresent in everyday life and occur in any situation where people or things are lined up for service or processing on a
first-come, first-served basis. Checkout lines in stores, highway tollbooth lines, and airport baggage check-in lines are familiar examples of queues.


![Queue](./img/ds/queue.png)


**Basic Operation**

Queues have two fundamental operations: 

* add (enqueue), which adds an item to the `rear` of a queue, and

* pop (dequeue), which removes an item from the `front`.


![Enqueue](./img/ds/enqueue.png)


![dequeue](./img/ds/dequeue.png)


**Array Implementation of Queue in CPP**

In the queue data structure, we keep an array `double *values`, and the positions `front` and `rear`, which represent the ends of the queue.

We set the maximum number of elements in the queue, `maxSize`, and keep track of the number of elements that are actually in the queue, `counter`.
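
The implementation that follows treats the array as circular: an index that runs past the last slot wraps around to slot 0. The two helpers below are a hypothetical sketch of that arithmetic (they are not part of `queue.h`). With `maxSize = 5`, `rear` moves from -1 to 0, 1, 2, 3, 4 over five enqueues; after one dequeue frees slot 0, the next enqueue computes `(4 + 1) % 5 = 0` and reuses it.

```cpp
// Advance a queue index by one slot, wrapping to 0 after the last slot.
// Assumes maxSize > 0 and -1 <= index < maxSize.
int nextIndex(int index, int maxSize) {
   return (index + 1) % maxSize;
}

// Array slot that holds the i-th element counted from the front
// (i = 0 is the front element). Assumes 0 <= front < maxSize and i >= 0.
int slotOf(int front, int i, int maxSize) {
   return (front + i) % maxSize;
}
```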

In [108]:
%%file ./demo/include/queue.h
#ifndef QUEUE_H
#define QUEUE_H

// Fixed-capacity circular queue of doubles
class Queue 
{
	public:
		Queue(int size);// constructor
		~Queue();// destructor
		bool IsEmpty(void);
		bool IsFull(void);
		bool Enqueue(double x);
		bool Dequeue(double &x);
		void DisplayQueue(void);
	private:
		int front;// front index
		int rear;// rear index
		int counter;// number of elements
		int maxSize;// size of array queue
		double* values;// element array
};

#endif


Overwriting ./demo/include/queue.h


In [110]:
%%file ./demo/src/queue.cpp
#include <iostream>
#include "queue.h"

using namespace std;

Queue::Queue(int size) 
{
	values = new double[size];
	maxSize = size;
	front = 0;
	rear = -1;
	counter = 0;
}

Queue::~Queue() 
{ 
	delete [] values; 
}

bool Queue::IsEmpty() 
{
	if (counter)
		return false;
	else 
		return true;
}

bool Queue::IsFull() 
{
	if (counter < maxSize)
		return false;
	else 
		return true;
}

bool Queue::Enqueue(double x) 
{
	if (IsFull()) 
	{
		cout<< "Error: the queue is full." << endl;
		return false;
	}
	else 
	{
		// calculate the new rear position (circular)
		rear= (rear + 1) % maxSize; 
		// insert new item
		values[rear]= x;
		// update counter
		counter++;
		return true;
	}
}

bool Queue::Dequeue(double &x) 
{
	if (IsEmpty()) 
	{
		cout<< "Error: the queue is empty." << endl;
		return false;
	}
	else 
	{
		// retrieve the front item
		x= values[front];
		// move front 
		front= (front + 1) % maxSize;
		// update counter
		counter--;
		return true;
	}
}

void Queue::DisplayQueue()
{
	cout<< "front -->";
	for (int i = 0; i < counter; i++) 
	{
		if (i == 0) 
			cout << "\t";
		else 
			cout << "\t\t"; 
		cout<< values[(front + i) % maxSize];
		if (i != counter - 1)
			cout << endl;
		else
			cout << "\t<--rear" << endl;
	}
}

Overwriting ./demo/src/queue.cpp


In [112]:
%%file ./demo/src/TestQueue.cpp
#include <iostream>
#include "queue.h"

using namespace std;

int main(void) 
{
	Queue queue(5);
	cout<< "Enqueue 5 items." << endl;
	for (int x = 0; x < 5; x++)
		queue.Enqueue(x);
	cout<< "Now attempting to enqueue again..." << endl;
	queue.Enqueue(5);
	queue.DisplayQueue();
	double value;
	queue.Dequeue(value);
	cout<< "Retrieved element = " << value << endl;
	queue.DisplayQueue();
	queue.Enqueue(7);
	queue.DisplayQueue();
	return 0;
}

Overwriting ./demo/src/TestQueue.cpp


In [114]:
!g++ -w -o ./demo/bin/TestQueue ./demo/src/TestQueue.cpp  ./demo/src/queue.cpp -I./demo/include/

In [116]:
!.\demo\bin\TestQueue

Enqueue 5 items.
Now attempting to enqueue again...
Error: the queue is full.
front -->	0
		1
		2
		3
		4	<--rear
Retrieved element = 0
front -->	1
		2
		3
		4	<--rear
front -->	1
		2
		3
		4
		7	<--rear


## 3 Non-Linear Data Structures



### 3.1 Tree

In the linear data structures you have studied thus far, all items except for the first have a distinct predecessor, and all items except the last have a distinct successor. 

In a tree, the ideas of predecessor and successor are replaced with those of `parent` and `child`.

Trees have two main characteristics:

* Each item can have multiple children.

* All items, except a privileged item called the root, have exactly one parent. The root has no parent.

#### 3.1.1 Tree Terminology

Tree terminology is a peculiar mix of biological, genealogical, and geometric terms. 

* A tree has a **root** node.

* Each **parent** node can have **child** nodes.

* A node **without children** is called a **leaf** node.

![](./img/ds/tree.jpg)


The table provides a quick summary of these terms.

![](./img/ds/tree-1.jpg)
![](./img/ds/tree-2.jpg)


A tree with `no nodes` is called a `null (empty)` tree.

**Depth or level**: The `depth or level of a node` equals the length of the path connecting it to the root. Thus,

* the `depth or level of the root is 0`.

* Its children are at level 1, and so on.

**Height**: The length of the longest path from the root to a leaf.

* The height of a tree is different from the number of nodes contained in it. The height of a tree containing one node is 0, and, by convention, `the height of an empty tree is −1`.
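
These conventions translate directly into a short recursive function. The sketch below assumes a binary node type (`TreeNode` is a hypothetical name, not the `Node` class defined later) so that it stays self-contained; the same idea extends to nodes with any number of children.

```cpp
#include <algorithm>

// Hypothetical binary tree node used only for this sketch.
struct TreeNode {
   int data;
   TreeNode * left;
   TreeNode * right;
   explicit TreeNode(int d) : data(d), left(0), right(0) { }
};

// Height = number of links on the longest path from the root to a leaf.
// An empty tree (null pointer) has height -1, so a single node gets 0.
int height(const TreeNode * root) {
   if (root == 0) return -1;
   return 1 + std::max(height(root->left), height(root->right));
}
```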

The Figure shows a tree and some of its properties.

![](./img/ds/tree-2.jpg)





#### 3.1.2 Binary Trees

In a binary tree, each node has at **most** two children, referred to as the left child and the right child.

In a binary tree, when a node has only one child, you distinguish it as being either a left child or a right child.



#### 3.1.3 Binary Tree Traversals

There are four standard types of traversals for binary trees: preorder, inorder, postorder, and level order.

Each type of traversal follows a particular path and direction as it visits the nodes in the tree. 

![DS_BinaryTree](./img/ds/DS_BinaryTree.png)

**Depth-First Search (DFS)**

In general, a depth-first-search algorithm begins by choosing one child of the start node. It then chooses one child of that node and so on, going deeper and deeper until it either reaches the goal node or a node with no children. The search then backtracks, returning to the most recent node with children that it has not yet visited.

There are three types of depth-first search:

* **Pre-order**: visit the root, traverse the left subtree, then the right subtree.

> E.g., 6 -> `5 -> 4` -> `10 -> 7 -> 9 -> 15`.

* **In-order**: traverse the left subtree, visit the root, then the right subtree.

> E.g., `4 -> 5` -> 6 -> `7 -> 9 -> 10 -> 15`.

* **Post-order**: traverse the left subtree, the right subtree, then visit the root.

> E.g., `4 -> 5` -> `9 -> 7 -> 15 -> 10` -> 6.

**Pre-**, **in-** and **post-** refer to the `order` of visiting the `root.`
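
The three depth-first orders can be sketched in a few lines of Python. The `Node` class here is a hypothetical stand-in, and the tree shape is an assumption inferred from the examples above: 6 is the root, 5 has left child 4, 10 has children 7 and 15, and 9 is the right child of 7.

```python
class Node:
    """Hypothetical binary tree node used only for this illustration."""
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right

def pre_order(node):
    """Root, then left subtree, then right subtree."""
    if node is None:
        return []
    return [node.data] + pre_order(node.left) + pre_order(node.right)

def in_order(node):
    """Left subtree, then root, then right subtree."""
    if node is None:
        return []
    return in_order(node.left) + [node.data] + in_order(node.right)

def post_order(node):
    """Left subtree, then right subtree, then root."""
    if node is None:
        return []
    return post_order(node.left) + post_order(node.right) + [node.data]

# Assumed shape of the example tree
root = Node(6,
            Node(5, Node(4)),
            Node(10, Node(7, None, Node(9)), Node(15)))

print(pre_order(root))   # [6, 5, 4, 10, 7, 9, 15]
print(in_order(root))    # [4, 5, 6, 7, 9, 10, 15]
print(post_order(root))  # [4, 5, 9, 7, 15, 10, 6]
```

The only difference between the three functions is where `node.data` is placed relative to the two recursive calls.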

**Breadth-First Search(BFS-广度优先搜索)**

Begin at the `root` and visit `all its child` nodes. Then, for each of the child nodes visited, visit their child nodes in turn, continuing level by level until the goal node is found or no nodes remain.

> E.g., 6 -> `5 -> 10` -> `4 -> 7 -> 15` -> 9.
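
A breadth-first traversal is usually driven by a queue rather than by recursion. The following is a minimal sketch using `collections.deque` from the Python standard library; the `Node` class and the tree shape are assumptions chosen to match the example above.

```python
from collections import deque

class Node:
    """Hypothetical binary tree node used only for this illustration."""
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right

def breadth_first(root):
    """Return the node values level by level, left to right."""
    visited = []
    queue = deque()
    if root is not None:
        queue.append(root)
    while queue:                      # stop when no nodes are waiting
        node = queue.popleft()        # oldest node first (FIFO)
        visited.append(node.data)
        if node.left is not None:
            queue.append(node.left)
        if node.right is not None:
            queue.append(node.right)
    return visited

# Assumed shape of the example tree
root = Node(6,
            Node(5, Node(4)),
            Node(10, Node(7, None, Node(9)), Node(15)))

print(breadth_first(root))  # [6, 5, 10, 4, 7, 15, 9]
```

Because children are appended to the back of the queue, every node at one level is visited before any node at the next level.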



#### 3.1.4 Binary Search Tree(二叉查找树)

A binary search tree, **without** duplicate elements, has these properties:

* All values in the `left` subtree are `smaller` than the parent node.

* All values in the `right` subtree are `larger` than the parent node.

The above diagram illustrates a binary search tree.

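An `in-order depth-first traversal` of a binary search tree visits the values in sorted order. To search for a value, compare it with the current node and descend into the left or right subtree accordingly.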

* Take note that the actual shape of the tree depends on the order of insertion.
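
To illustrate how insertion order affects the shape, here is a small sketch in Python. The names `insert`, `contains`, `in_order`, `height` and `build` are hypothetical helpers written for this example, not part of the C++ classes in this section. Both trees hold the same values, but inserting them in sorted order produces a degenerate chain.

```python
class Node:
    """Hypothetical binary search tree node used only for this illustration."""
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

def insert(node, value):
    """Return the subtree root after inserting value (duplicates are ignored)."""
    if node is None:
        return Node(value)
    if value < node.data:
        node.left = insert(node.left, value)
    elif value > node.data:
        node.right = insert(node.right, value)
    return node

def contains(node, value):
    """Search by descending left or right from the root."""
    while node is not None:
        if value == node.data:
            return True
        node = node.left if value < node.data else node.right
    return False

def in_order(node):
    if node is None:
        return []
    return in_order(node.left) + [node.data] + in_order(node.right)

def height(node):
    """Height in edges; -1 for an empty tree."""
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))

def build(values):
    root = None
    for v in values:
        root = insert(root, v)
    return root

mixed = build([6, 10, 5, 15, 7, 4, 9])   # same order as the test driver
chain = build([4, 5, 6, 7, 9, 10, 15])   # already sorted input

print(in_order(mixed))                # [4, 5, 6, 7, 9, 10, 15]
print(in_order(chain))                # [4, 5, 6, 7, 9, 10, 15]
print(height(mixed), height(chain))   # 3 6
print(contains(mixed, 9), contains(mixed, 8))  # True False
```

The sorted output is identical, but a search in the chain may have to walk through every node, while the first tree needs at most four comparisons.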

**Node template class for binary tree**

* Node.h

In [118]:
%%file ./demo/include/Node.h
/* 
   Node template class for binary tree (Node.h)
*/
#ifndef NODE_H
#define NODE_H
 
template <typename T> class BinaryTree; // Forward reference
 
template <typename T>
class Node {
private:
   T data;
   Node * rightPtr;
   Node * leftPtr;
public:
   Node (T d) : data(d), rightPtr(0), leftPtr(0) { };
   T getData() const { return data; };
   Node * getRightPtr() const { return rightPtr; }
   Node * getLeftPtr() const  { return leftPtr;  }
 
friend class BinaryTree<T>;
   // Make BinaryTree class a friend to access private data
};
 
#endif



**BinaryTree template class for binary tree: BinaryTree.h**


In [120]:
%%file ./demo/include/BinaryTree.h
/* 
   BinaryTree template class for binary tree (BinaryTree.h)
*/
#ifndef BINARY_TREE_H
#define BINARY_TREE_H
 
#include <iostream>
#include <queue>
#include "Node.h"
 
// Forward Reference
template <typename T>
std::ostream & operator<<(std::ostream & os, const BinaryTree<T> & lst);
 
template <typename T>
class BinaryTree {
private:
   Node<T> * rootPtr;
 
   // private helper functions
   void insertNode(Node<T> * & ptr, const T & value);
   void preOrderSubTree(const Node<T> * ptr, std::ostream & os = std::cout) const;
   void inOrderSubTree(const Node<T> * ptr, std::ostream & os = std::cout) const;
   void postOrderSubTree(const Node<T> * ptr, std::ostream & os = std::cout) const;
   void removeSubTree(Node<T> * & ptr);
public:
   BinaryTree();   // Constructor
   ~BinaryTree();  // Destructor
   void insert(const T & value);
   bool isEmpty() const;
   void preOrderTraversal(std::ostream & os = std::cout) const;
   void inOrderTraversal(std::ostream & os = std::cout) const;
   void postOrderTraversal(std::ostream & os = std::cout) const;
   void breadthFirstTraversal(std::ostream & os = std::cout) const;
 
friend std::ostream & operator<< <>(std::ostream & os, const BinaryTree<T> & lst);
      // Overload the stream insertion operator to print the tree
};
 
// Constructor - Create an empty tree with no nodes
template <typename T>
BinaryTree<T>::BinaryTree() : rootPtr(0) { }
 
// Destructor - Remove all the dynamically allocated nodes
template <typename T>
BinaryTree<T>::~BinaryTree() {
   removeSubTree(rootPtr);
   // std::cout << "Destructor completed..." << std::endl;
}
 
template <typename T>
void BinaryTree<T>::removeSubTree(Node<T> * & ptr) {
   if (ptr) {
      removeSubTree(ptr->leftPtr);   // remove left subtree
      removeSubTree(ptr->rightPtr);  // remove right subtree
      delete ptr;
   }
}
 
// Is the tree empty? Check if rootPtr is null
template <typename T>
bool BinaryTree<T>::isEmpty() const { return rootPtr == 0; }
 
// Insert a value into the tree, dynamically allocating a new node for it
template <typename T>
void BinaryTree<T>::insert(const T & value) {
   insertNode(rootPtr, value);
}
 
// Need to pass the pointer by reference so as to modify the caller's copy
template <typename T>
void BinaryTree<T>::insertNode(Node<T> * & ptr, const T & value) {
   if (ptr == 0) {
      ptr = new Node<T>(value);
   } else {
      if (value < ptr->data)
         insertNode(ptr->leftPtr, value);
      else if (value > ptr->data)
         insertNode(ptr->rightPtr, value);
      else
         std::cout << "duplicate value" << std::endl;
   }
}
 
template <typename T>
void BinaryTree<T>::preOrderTraversal(std::ostream & os) const {
   os << "{ ";
   preOrderSubTree(rootPtr, os);   // pass the stream on; otherwise it defaults to std::cout
   os << '}' << std::endl;
}
 
template <typename T>
void BinaryTree<T>::preOrderSubTree(const Node<T> * ptr, std::ostream & os) const {
   if (ptr) {
      os << ptr->data << ' ';
      preOrderSubTree(ptr->leftPtr, os);
      preOrderSubTree(ptr->rightPtr, os);
   }
}
 
template <typename T>
void BinaryTree<T>::inOrderTraversal(std::ostream & os) const {
   os << "{ ";
   inOrderSubTree(rootPtr, os);   // pass the stream on; otherwise it defaults to std::cout
   os << '}' << std::endl;
}
 
template <typename T>
void BinaryTree<T>::inOrderSubTree(const Node<T> * ptr, std::ostream & os) const {
   if (ptr) {
      inOrderSubTree(ptr->leftPtr, os);
      os << ptr->data << ' ';
      inOrderSubTree(ptr->rightPtr, os);
   }
}
 
template <typename T>
void BinaryTree<T>::postOrderTraversal(std::ostream & os) const {
   os << "{ ";
   postOrderSubTree(rootPtr, os);   // pass the stream on; otherwise it defaults to std::cout
   os << '}' << std::endl;
}
 
template <typename T>
void BinaryTree<T>::postOrderSubTree(const Node<T> * ptr, std::ostream & os) const {
   if (ptr) {
      postOrderSubTree(ptr->leftPtr, os);
      postOrderSubTree(ptr->rightPtr, os);
      os << ptr->data << ' ';
   }
}
 
// Breadth First Search (BFS)
template <typename T>
void BinaryTree<T>::breadthFirstTraversal(std::ostream & os) const {
   std::queue<Node<T> * > q;
   if (!isEmpty()) q.push(rootPtr);
 
   os << "{ ";
   // Calling front() on an empty queue is undefined, so test empty() first
   while (!q.empty()) {
      Node<T> * currentPtr = q.front();
      q.pop();  // remove this node
      os << currentPtr->data << ' ';  // write to the caller's stream, not std::cout
      if (currentPtr->leftPtr) q.push(currentPtr->leftPtr);
      if (currentPtr->rightPtr) q.push(currentPtr->rightPtr);
   }
   os << '}' << std::endl;
}
 
// Overload the stream insertion operator to print the tree in in-order traversal
template <typename T>
std::ostream & operator<< (std::ostream & os, const BinaryTree<T> & lst) {
   lst.inOrderTraversal(os);
   return os;
}
 
#endif



**Test Driver for BinaryTree class :TestBinaryTree.cpp**


In [80]:
%%file ./demo/src/TestBinaryTree.cpp
/* 

Test Driver for BinaryTree class (TestBinaryTree.cpp) 
    
*/
#include <iostream>
#include "BinaryTree.h"
using namespace std;
 
int main() {
   BinaryTree<int> t1;
   t1.insert(6);
   t1.insert(10);
   t1.insert(5);
   t1.insert(15);
   t1.insert(7);
   t1.insert(4);
   t1.insert(9);
 
   t1.preOrderTraversal();
   t1.inOrderTraversal();
   t1.postOrderTraversal();
   cout << t1;
   t1.breadthFirstTraversal();
}



In [81]:
!g++ -w -o ./demo/bin/TestBinaryTree ./demo/src/TestBinaryTree.cpp -I./demo/include/

In [82]:
!.\demo\bin\TestBinaryTree

{ 6 5 4 10 7 9 15 }
{ 4 5 6 7 9 10 15 }
{ 4 5 9 7 15 10 6 }
{ 4 5 6 7 9 10 15 }
{ 6 5 10 4 7 15 9 }


### 3.2  Heap

A heap is a complete binary tree that satisfies the  heap property.

#### 3.2.1 The heap property

The heap property says that the value of a parent is either `greater than or equal` to (in a `max` heap) or `less than or equal` to (in a `min` heap) the value of each of its children.

**Binary heaps(二叉堆) can be represented using a list or array organized** so that

* the children of element N are at positions 2 * N + 1 and 2 * N + 2 (for zero-based indexes).

This layout makes it possible to rearrange heaps in place, so it is not necessary to reallocate as much memory when adding or removing items.
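
The index arithmetic of this layout can be sketched as follows. The helper names `parent`, `left`, `right` and `is_max_heap` are hypothetical; the parent formula `(i - 1) // 2` is simply the inverse of the two child formulas for zero-based indexes.

```python
def left(i):
    return 2 * i + 1      # index of the left child

def right(i):
    return 2 * i + 2      # index of the right child

def parent(i):
    return (i - 1) // 2   # inverse of left() and right() for i >= 1

def is_max_heap(a):
    """Check that every element is no larger than its parent."""
    return all(a[parent(i)] >= a[i] for i in range(1, len(a)))

heap = [86, 70, 66, 46, 44, 45, 47]

print(left(1), right(1))     # 3 4
print(parent(3), parent(4))  # 1 1
print(is_max_heap(heap))                           # True
print(is_max_heap([47, 70, 86, 46, 44, 45, 66]))   # False
```

No pointers are stored: the tree structure is implied entirely by the positions of the elements in the list.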



#### 3.2.2 Max and Min Heap

There are two types of heaps

* the max heap 
* the min heap.

In a max heap, the key at the root is the largest in the heap, and every value below it is less than or equal to it. The same holds for the root of every subtree.


In a min heap, the key at the root is the smallest in the heap, and every value below it is greater than or equal to it. The same holds for the root of every subtree.




![heap](./img/ds/heap.png)
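
As a quick illustration of both kinds of heap, the sketch below uses Python's standard `heapq` module, which maintains a min heap inside a plain list. `heapq` has no max-heap mode in its classic API, so negating the values is used here as a common workaround; that trick is an assumption of this example, not something the rest of this section relies on.

```python
import heapq

data = [47, 70, 86, 46, 44, 45, 66]

# Min heap: heapify rearranges the list in place so the smallest item is at index 0.
min_heap = list(data)
heapq.heapify(min_heap)
smallest = min_heap[0]
print(smallest)            # 44

# Max heap via negation: the most negative value corresponds to the largest item.
max_heap = [-x for x in data]
heapq.heapify(max_heap)
largest = -max_heap[0]
print(largest)             # 86

# Popping repeatedly from the min heap yields the items in ascending order.
smallest_three = [heapq.heappop(min_heap) for _ in range(3)]
print(smallest_three)      # [44, 45, 46]
```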

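`max_heapify` restores the max-heap property at index `i`, assuming the subtrees below `i` are already max heaps:
```cpp
#include <cstddef>  // std::size_t
#include <utility>  // std::swap
#include <vector>

// Sift A[i] down until the subtree rooted at i satisfies the max-heap property.
// Indexes are zero-based; heapSize is the number of elements that belong to the heap.
void max_heapify(std::vector<int> & A, std::size_t heapSize, std::size_t i) {
    std::size_t l = 2 * i + 1;  // left child
    std::size_t r = 2 * i + 2;  // right child
    std::size_t largest = i;
    if (l < heapSize && A[l] > A[largest]) largest = l;
    if (r < heapSize && A[r] > A[largest]) largest = r;
    if (largest != i) {
        std::swap(A[i], A[largest]);        // exchange A[i] and A[largest]
        max_heapify(A, heapSize, largest);  // continue in the affected subtree
    }
}
```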



In [44]:
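def left(i):
    return 2 * i + 1   # index of the left child (zero-based)

def right(i):
    return 2 * i + 2   # index of the right child (zero-based)

def max_heapify(A: list, i: int, heap_size=None):
    """Sift A[i] down until the subtree rooted at i is a max heap."""
    if heap_size is None:   # 'not heap_size' would also trigger for a heap of size 0
        heap_size = len(A)

    l = left(i)
    r = right(i)
    if l < heap_size and A[l] > A[i]:
        largest = l
    else:
        largest = i
    if r < heap_size and A[r] > A[largest]:
        largest = r

    if largest != i:
        A[i], A[largest] = A[largest], A[i]
        max_heapify(A, largest, heap_size)

def build_max_heap(A, heap_size=None):
    """Heapify every non-leaf node, from the last one up to the root."""
    if heap_size is None:
        heap_size = len(A)
    i = heap_size // 2 - 1   # last node that has at least one child
    while i >= 0:
        max_heapify(A, i, heap_size)
        i -= 1

A = [47, 70, 86, 46, 44, 45, 66]
build_max_heap(A)
print(A)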

[86, 70, 66, 46, 44, 45, 47]


#### 3.2.3 Heap Sort

Heaps can be used to sort an array. In a max heap, the maximum element is always at the root. Heap sort uses this property to sort the array.

Consider an array $Arr$ which is to be sorted using heap sort.

* Initially, build a max heap from the elements in $Arr$.

* The root element, $Arr[0]$, now contains the maximum element of $Arr$. Swap it with the last element of the heap, which puts the maximum in its correct final position. Then decrease the heap size by one and heapify the root so that the remaining elements form a max heap again.

* Repeat the previous step until all the elements are in their correct positions.

**Complexity**:

`max_heapify` has complexity $O(\log N)$, `build_max_heap` has complexity $O(N)$, and heap sort runs `max_heapify` $N-1$ times, so the overall complexity of heap sort is $O(N \log N)$.
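
The $O(N)$ bound for building the heap is tighter than the naive estimate of $N/2$ calls at $O(\log N)$ each. A sketch of the standard argument, assuming that a heap of $N$ nodes has at most $\lceil N / 2^{h+1} \rceil$ nodes of height $h$ and that heapifying a node of height $h$ costs $O(h)$:

$$
\sum_{h=0}^{\lfloor \log_2 N \rfloor} \left\lceil \frac{N}{2^{h+1}} \right\rceil O(h)
= O\left(N \sum_{h=0}^{\infty} \frac{h}{2^{h}}\right)
= O(N),
\qquad \text{since } \sum_{h=0}^{\infty} \frac{h}{2^{h}} = 2.
$$

Intuitively, most nodes sit near the leaves, where heapifying is cheap, and only a few nodes near the root pay the full $O(\log N)$ cost.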

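Heap sort in brief:

* Remove the maximum from the top of the max heap, re-adjust the remaining elements into a max heap, and remove the new maximum. Repeat until only one element remains.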



In [34]:
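def heap_sort(A, heap_size=None):
    """Sort the first heap_size elements of A in place, in ascending order."""
    if heap_size is None:
        heap_size = len(A)
    build_max_heap(A, heap_size)
    start = heap_size - 1
    for i in range(start, 0, -1):
        A[0], A[i] = A[i], A[0]   # move the current maximum to its final position
        heap_size -= 1            # shrink the heap to exclude the sorted tail
        max_heapify(A, 0, heap_size=heap_size)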


In [45]:
A=[47,70,86, 46,44,45,66]
heap_sort(A)
print(A)

[44, 45, 46, 47, 66, 70, 86]


In [49]:
%%file ./demo/src/heapsort.c
#include <stdio.h>
#include <stdlib.h>

void max_heapify(int arr[], int n, int i) 
{ 
    int largest = i; 
    int l = 2*i + 1; // left = 2*i + 1 
    int r = 2*i + 2; // right = 2*i + 2 
  
    if (l < n && arr[l] > arr[largest]) 
        largest = l; 
  
    if (r < n && arr[r] > arr[largest]) 
        largest = r; 
    
    int temp;
    if (largest != i) 
    { 
        temp = arr[i];
        arr[i] = arr[largest];
        arr[largest] = temp; 
        max_heapify(arr, n, largest); 
    } 
} 

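void heapSort(int arr[], int n) 
{ 
    // Build a max heap, starting from the last node that has a child
    for (int i = n / 2 - 1; i >= 0; i--) 
        max_heapify(arr, n, i); 
  
    int temp;
    // Move the current maximum to position i, then restore the heap on arr[0..i-1].
    // The loop stops at i > 0 because a heap of one element is already sorted.
    for (int i = n - 1; i > 0; i--) 
    { 
        temp = arr[0];
        arr[0] = arr[i];
        arr[i] = temp; 
        max_heapify(arr, i, 0); 
    } 
} 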

void print(const int a[], int iLeft, int iRight) {
   printf("{");
   for (int i = iLeft; i <= iRight; ++i) {
      printf("%d", a[i]);
      if (i < iRight) printf(",");
   }
   printf("}\n");
}


int main() {
   const int SIZE = 7;
   int a[] = {47,70,86, 46,44,45,66};
 
   print(a, 0,SIZE-1);
   heapSort(a, SIZE);
   print(a, 0,SIZE-1);
}




In [50]:
!gcc -w -o ./demo/bin/heapsort ./demo/src/heapsort.c

In [51]:
!.\demo\bin\heapsort

{47,70,86,46,44,45,66}
{44,45,46,47,66,70,86}


### 3.3 Graphs


* [Graph Optimization Problems](./UnitDS-6-Graph_Optimization_Problems.ipynb)

### 3.4 Hash Tables

* [Hash Tables](./UnitDS-5-Hash_Tables.ipynb)