Java Data Structures and Algorithms
Java Ruby
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Failed to load latest commit information.


ALGO - Java Data Structures and Algorithms

1. Java 1.6

This project provides solution to various java programming interview questions. There is an interactive applet (com.test.ProblemApplet ) that shows all the questions. Select one of the question in the applet to run it. In order to compile and run the project, use the following commands:

Algo>cd src
src>javac -d ../bin/ com/test/ 
src>javac -d ../bin/ com/questions/*/*.java
src>javac -d ../bin/ com/questions/tree/bst/*.java
src>javac -d ../bin/ com/questions/linkedlist/single/*.java
src>cd ..
Algo>cp -R META-INF bin/
Algo>java -cp bin com.test.ProblemApplet

Questions Remaining
1. Sort a link list using merge sort.
2. Find 3rd largest element in an array 
3. There is a linked list whose node has 3 fields, val, next pointer & random 
pointer points to next node in the list and random pointer can point to any node in the list.
Write an efficient function which takes such list and returns the copy/clone of that list.
4. Find 2nd largest element in a binary search tree
5. For a given BST, create a different linked list for each level
6. Deletion of a node 
7. Morris Traversal
8. Convert BST to Circular Double Linked List
9. Find biggest number that is smaller than the given number
10. find second most biggest number
11. Find the maximum depth of the BST
12. Given a binary search tree, design an algorithm which creates  a linked list of all the nodes at each depth.
(eg. if you have a tree with depth D, you will have D linked list)
13. Given a binary Search tree and a Node, How would you transform the tree to make that Node as root.The resulting tree should be a BST.
14. Given a value and a binary search tree. Print all the paths(if there exists more than one) 
which sum up to that value. It can be any path in the tree. It doesn't have to be from the root (NP-Hard Problem) 
16. Design an algorithm to find path from one node in a binary tree to another
17 Explain Different types of binary tree traversal
   * Depth-First
   ** Inorder -- Left, Node, Rith -- sorting
   ** Preorder -- Node, Left, Right -- Copy BST, Prefix expression
   ** Postorder -- Left, Right, Node -- Postfix
   * Level-Order/Breadth First

Red-Black Trees:
* Insertion Rules:
   1. Every node is either red or black 
   2. The root is always black
   3. If a node is red, its children must be black. Although the converse it not necessarily true
   4. Every path from the root to leaf or null node must have same number of black nodes. 

BST Hints:
  * Depth of Tree = Log(N)/Log(2)  = log(N, base=2)
  * Number of Leaf Nodes = 2^(Height) = 2^(log(N, base=2))
  * Inorder -- Left, Node, Right
  * PreOrder -- Node, Left, Right
  * PostOrder -- Left, Right, Node
  * Number of nodes = 2^(height+1) - 1
Representing Tree as Array
	** Left Child: 2*Index +1
	** Right Child: 2*Index + 2
	** Parent = (index-1)/2
Key Algorithms:
1. Morris Traversal for Binary Search Trees: Convert binary search tree to circular double linked list in place
2. Subset Sum Problem 
3. Sorting Algorithms: Merge sort, Quick Sort, Radix Sort, Bubble Sort, ...
4. Fi


* Explain fail-safe
	Fail-safe doesn't raises ConcurrentModificationException. Read more about Fail Safe/Fail Fast here:

* Difference between HashMap and HashTable 
    * Hashtable is synchronized and hence has slower performance
    * Hashtable iterator is not fail-safe
    * Hashtable doens't allow null for key and value
	* HashMap are not synchronized and therefore has better performance
	* Hashmap iteratores is fail-safe
	* Hashmap allows null for both key and values
	* HashSet does not allow duplicate values
	* HashSet only takes key - it has add 
* Vectors versus ArrayList
	* Vector is synchronized. ArrayList is not
	* Iterators for both are fail-safe. Enumeration returned by vector are not
	* Data growth - Internally, both the ArrayList and Vector hold onto their 
	contents using an Array. When an element is inserted into an ArrayList or a Vector, 
	the object will need to expand its internal array if it runs out of room. A Vector 
	defaults to doubling the size of its array, while the ArrayList increases its array 
	size by 50 percent.  

* Difference between Iterators and Enumerations
	* Enumeration is old. Iterators are new.
	* Enumeration are read only. Iterators has remove() method.
	* Iterators allow removing element will iterating. Enumerations don't. Enumeration acts as 
	Read-only interface, because 
	it has the methods only to traverse and fetch the objects, where as by using Iterator we can 
	manipulate the objects like adding and removing the objects from collection e.g. Arraylist.
	* Also Iterator is more secure and safe as compared to Enumeration because it  does not allow other thread 
	to modify the collection object while some thread is iterating over it and throws

Benefit of linkedlist over array:
  * you don't need continuous space
  * you don't need to predefined space
  * efficiently utilization of space because you don't block any more space than currently required. 

Disadvantages of linked list
  * insertion takes order of N time
  * Difficult to maintain and recover 

	* Usually uses array to store elements
	* usually made twice the initial size when full. Array size should be prime number and hence the new array size will be more than double. 
	* VERY IMPORTANT: table size should be prime number for open addressing 
Hashing Algorithm for Strings:
	* Approach 1:
	     Example: CATS
	     Approach 1: 3*27^0 + 1*27^0 + 20*27^1 + 19*27^0
			-- can't handle strings > 7
	     Approach 2 (Horner's Method): (((3*27+1)*27 + 20)*27 + 19)*27 
			-- can't handle strings > 7
	     Approach 3 
				((((3 % arraysize)*27 + 1) % arraysize)*27 + 20) % arraysize)*27 + 19
				--can be made faster using 32 base and then using bitwise operations for mod
11 % 32 ==> 00001011 & 00011111 = 00001011
33 % 32 ==> 000100001 & 00011111 = 00000001

Ways to handle Collisions in HashTable
	* Open Addressing: Collisions are resolved by looking for an open cell in the hash table. Generally load factor needs to be less than 1 (preferred is about 0.5 to 0.7)
		** Linear Probe
		** Quadratic Probe
		** Double Hashing
	* Separate Chaining: Each index is a linkedlist. Collisions are resolved by adding by simply adding the element to the linked list. Load factor can be 1 or more. Because of this, separate chaining is often preferred over open addressing. Table size doesn't need to be prime number. 
Advantages and Issues of HashTable

	* Issues
		** Based on arrays and hence need a good estimate of size beforehand
		** No easy way to scan the data in certain order
		** Work best when they are no more than 1/2 or 2/3 fill
	* Advantages
		** constant time lookup
Properties of HashFunction
	* Quick Computation
	* Random Keys
	* Non-Random Keys
Benefits/Issues of single linked list and double linked list -- think of applications ?
Properties of Red-Black Nodes:
* Root node is always black
* Red node will always have black children
* Black height/Depth of all paths is same

Height of red-black tree is in between: log_4(n) < h < 1 + log_2(n)
================== HEAP ===========================
* is a kind of tree for both insertion and deletion in O(logN) time
* Use for prioritizing items
* It's a binary tree with these characteristics:
	** Its complete
	** usually implemented as an array
	** Every node's key is larger than or equal to the keys of its children
Tree to Array
parent(i): return i/2
left(i): return 2i ==> i >> 1
right(i) return 2i+1 ==> (i >> 1) << 1

2^0=1-1 = 0
2^1=2-1 = 1
2^2=4-1 = 3
2^3=8-1 = 7


==== HEAPS-SORT ==============
		if index 1,  if index 0
parent: 	i/2,	(i+1)/2 - i 
left: 		2i, 	2*(i+1) - 1
right: 		2i+1, 	2*(i+1)

Heapify(A, i)
	l <- left(i)
	r <- right(i)
	//find largest among, i, l and r
	largest = i
	if(l < size && A[l] > A[i]) largest = l
	if(r < size && A[r] > A[largest]) largest = r
	//swap and heapify
	if(largest != i)
		swap(i, largest)
		heapify(A, largest)

 Heap-Size <- length(A)-1
 for i <- parent(heap-size) to 0:
	heapify(A, i)

		Swap(0, Heap-Size)
		Heap-Size <- Heap-Size - 1
		Heapify(A, 0)
	while Heap-Size > 0		
	Heap-Size <- length(A)-1

* Idea: The head of the heap is always gauranted to be maximum element and hence after each heapify we move the head to the right of the heap and re-run heapify on the remaining non-sorted elements. 
* Example:
Initial: [10, 5, 18, 2, 34, 0]
After Build-Heap: [ 34, 18, 10, 2, 5, 0 ]
HeapSize: 5
Do Loop Begins: 
   HeapSize, ArrayElements
      4,     [ 18, 5, 10, 2, 0 | 34 ]
      3,     [ 10, 5, 0, 2 | 18, 34 ]
      2,     [ 5, 2, 0 | 10, 18, 34 ]
      1,     [ 2, 0 | 5, 10, 18, 34 ]
      0,     [ 0 | 2, 5, 10, 18, 34 ]

Final Array: [ 0, 2, 5, 10, 18, 34 ]

==== QUICK-SORT ==============
QuickSort(A, p, r):
	if p < r
		then q <- PARTITION(A, p, r)
			 QuickSort(A, p, q-1)
			 QuickSort(A, q+1, r)

Partition(A, p, r):
	//pivot value
	x = A[r]
	i = p-1
	for j =  p to r -1
		if A[j] <= x
			swap(i, j)
	swap(i+1, r)
	return i+1

Example: [10, 5, 18, 2, 34, 0]

====== COUNTING SORT =======
* uses an auxillary array to store commulative sum of each value
* Running time: O(n) -- linear times --beats quicksort
* Stable algorithm - order of same value in the input array is changed.  Stability is desired in cases such as satellite data 

* cutrod problem
* uses additional memory to save computation time - saving might be dramatic. An exponential problem might be converted into polynomial time solution
* two ways to implement:
	1. top-down with memoization -- write the procedure recursively in a natural manner but modified to save the result of each subproblem. We say that the recursive procedure has been memoized; it remembers what results it has computed previously.
	2. bottom-up method - we sort subproblems by size and solve them in size order, smallest first . 
Recursive solution for cutting rod problem (extremely inefficient)
p is price vector
n is current length of rod
CUT-ROD(p,n) -- running time = 2^(n-1)
	if n == 0 return 0
	q = -inf
	for i = 1 to n
		q = max(q, p[i] + CUT-ROD(p, n-i))
	return q;

MEMOIZED-CUT-ROD(p,n) --> running time n^2
	//initialize an array with default 
	//negative value
	r = Array(n)
	for i=0 to n
		r[i] = -inf
	return MEMOIZED-CUT-ROD-AUX(p, n, r)

//Same as recursive version
//but we save result in a table	
	if r[n] >= 0
		return r[n]
	q = -inf
	if n == 0
		q == 0
		for i=1 to n
			q = max(q, p[i] + MEMOIZED-CUT-ROD-AUX(p, n-i, r))
	r[n] = q
	return q

BOTTOM-UP-CUT-ROD(p, n)  ---> running time n^2
	r = Array(n)
	r[0] = 0
	for j = 1 to n
		q = -inf
		for i = 1 to j
			q = max(q, p[i] + r[j-i])
		r[j] = q
	return r[n]

	r = Array(n)
	//initialize another array to store sizes
	s = Array(n)
	r[0] = 0
	for j = 1 to n
		q = -inf
		for i = 1 to i
			if q < p[i] + r[j-i]
				q = p[i] + r[j-i]
				s[j] = i
		r[j] = q
	return r and s

====== GRAPH =======
Two ways to represent graphs:
	Adjacency List: Space Efficient but requires time to determine if an edge exists between two vertices. 
	Adjacency Matrix: Allows to determine whether an edge exists in constant time. Requires only bit per edge
	** Uses White/Gray/Black Nodes

==== general java questions ====
1. what is transient variable: variables that do not get serialized 
2. Explain the concept of shallow copy vs deep copy 
	Shallow copy -- cloned object also refers to the same object to which the original object refers. Use clone() to create shallow copy
	Deep copy - clone of the class and all the objects referred by that class. To make deep copy, implement Cloneable interface and call its clone() method. 

3. What do you understand by synchornization ? How do synchronize a method call in java? How do you synchornize a block of code in java
	* Synchronization is a process of controlling the access of shared resources by the multiple threads in such a manner that only one thread can access one resource at a time. 
	* Synchornizing a method: Put keyword synchronized as part of the method declaration
	* Synchronizing a block of code: Put block of code in synchronized(this){ some code}
4. Explain encapsulation, inheritance and polymorphism
	* Encapsulation: binding or wrapping the data nad the codes that operates on the data into a single entity
	* inheritance: extending the class
	* polymorphism: using interface and implementations
5. Similarities and differences between abstract class and interface
	* both can't be instantiated
	* interfaces are limited to public methods and contains no implementation. 
	* A class may implement several interfaces. But an abstract class can only extend one abstract class
	* Interfaces are slow as it requires extra indirection to find corresponding method in the actual class. Abstract classes are fast

6. How to declare class variable in java
	* Use static modifier

7. Design a data structure that offers the following operations in O(1) time: insert, remove, contains, getRandom element
	* Use HashTable and Array. 
		HashTable keys are the elements in the data structure and values are their positions in the array
	* Insert(value): Append the value to array and let i be its index in the array. Set Hash[value] = i
	* Remove(value): Use Hash to find its position in array. Let say its i. Replace Array[i] with the last element of array. Let say its d and decrement array size by 1
			i = Hash[key(value)]
			last = A[array_size]
			A[i] = last
			Hash[key(last)] = i
	* contains(value): return H.contains(value)
	* getRandomeElement(): let r = A[random(array_size)]

8. Difference between process and threads
	* Both processes and threads are independent sequences of execution. 
	* Threads run in a shared memory space, while processes run in separate memory spaces
	* Each process provides the resources needed to execute a program. A program has a vital address space, executable code, open handles to systems object, a unique process identifier, environment variable, priority class, .... Each process is started with a single thread, often called the primary thread but can create additional threads from any of its thread
	* A thread is the entity within a process that can be scheduled for execution. All threads of a process share its virtual address space and system resources. In addition, each thread maintains exception handlers, a scheduling priority, thread local storage, a unique thread identifer, and a set of structures the system will use to save the thread context until it is scheduled. 
===== java interview questions 
1. Generate all possible subsets of an array of elements
	* Number of possible subsets: O(2^N)
	* Bit representation of 0 to 2^N indicates all possible subsets
		subsets = 2^N
		for i <- 0 to subsets-1
			k <- i
			subset = []
			for j <- 0 to A.length-1
				if (k & 1) == 1
					subset << A[j]
				k = k >> 1 
			print subset

2. Find the middle element of a single linked list in a single pass
	* Use two pointers. Initialize both pointing to the root of the list
	* Increment the first pointer by two and the second by one. 
	* When the first pointer reaches the end, the second will reach to the middle
	* Cases:
		Odd number of elements versus even number of elements
		what happens if root is null
		what happens if there is only one element
			A -> B -> C -> D -> E
			initialize i = j = A
			i = C, j = B, odd=False
			i = E, j = C, odd=False
			i = Null, j = C, odd=True
			MiddleElement = j
			A -> B -> C -> D -> E -> F
			initialize i = j = A
			i = C, j = B, odd = False
			i = E, j = C, odd = False
			i = F, j = C, odd = False
		    MiddleElement = C, D
		Only one element
			i = j = A
			i = NULL, j = A, odd = true
			MiddleElement = A
		if root == null
		i <- root
		j <- root
		oddElements = false
		while(i != null)
			i <-
			if(i != null) 
				i <-
				j <-
			else odd = true

3. Reverse a single linked list in place
	* Use three pointers to keep track of prev, current and next
	* Example
		A -> B -> C -> D -> E
		prev = next = Null
		current = root
		prev, current, next, list
		A, C, B, NULL <- A, B -> C -> D -> E
		if root = NIL
		prev = NULL
		current = root
		next = NULL
		while(current != null)
			next = = prev
			prev = current
			current = next
		return parent
4. Determine if a linked list is pallidrome
	* Use two pointers. Increment the first by two and the second by one
	* Until first pointer reaches the end, use stack to store values traversed by first
	* After the first pointer reaches the end, pop values from stack and check if they matches current value of the second pointer
		i, j <- root
		s = stack()
		isOdd = 
		while(j != null)