Algorithms and Data Structures Track

Course 1: Introduction to Algorithms

Search algorithms

  • Linear/sequential/simple search
  • Binary search

Algorithm guidelines

  • Clearly defined problem statement, input and output

    • What is the input and what will the output be?
    • "Does 1–10 include 1 and 10?"
  • Break the problem down into steps that can't be broken down any further

    • No subtasks
    • "Search until you find 10" can be interpreted in many different ways
  • Algorithms must produce a result

  • Should complete and not run endlessly

Before rushing into a solution, first run through the algorithm guidelines

  • Breaking down a problem into a clear set of smaller problems
  • Each of which can be defined in terms of an input and an output

Evaluating Linear Search

  • Correctness

    • Algorithm is correct every time and with every value
    • Mathematical proof through induction
  • Efficiency

    • Balance of two measures: time and space
    • Very fast algorithm might not be helpful if it takes more memory than you have available
  • Time complexity

    • How long it takes to complete a job
    • A measure of the running time of an algorithm
    • Efficiency of an algorithm measured by time it takes in relation to the size of the data set
  • Space complexity

    • Amount of memory taken up
    • A measure of the storage taken up by an algorithm
  • Efficiency can be evaluated by plotting the worst case against the input size

    • Worst-case because the algorithm can never perform worse than that
    • When the search target is the maximum value in the range, the data point reflects the size of the dataset as well as the algorithm's effort
    • Plot number of tries against number of values in the range (shortened to n)

Evaluating Binary Search

  • Requires a sorted, sequential set of data
    • In contrast to linear search

Time Complexity

  • Order of growth: a measure of how much the time it takes to execute an operation increases as the input size increases

Big-O notation

  • O = order of magnitude of complexity
  • (n) = a function of the size
    • measures complexity as the input size grows
  • Upper bounds = Big-O measures how an algorithm performs in the worst-case scenario
  • Linear search: O(n)
  • Binary search: O(log n)

Constant and Logarithmic Time

  • Constant time: O(1)

The runtime for an algorithm is independent of the size of the data set

  • e.g. Reading a value at an index will always take the same amount of time

  • Logarithmic runtime: O(log n)

The runtime for an algorithm increases logarithmically as the size of the data set increases

  • Binary search takes log2(n) + 1 comparisons in the worst case
  • i.e. How many times do you need to divide n by 2 until you get 1? (sketched below)
  • Sometimes written O(ln n) or O(lg n); the base doesn't matter in Big-O notation
  • Can also be referred to as sublinear
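
As a quick sketch of that halving intuition (my own, not from the course; halvingSteps is a made-up name), counting divisions by 2 mirrors the worst case of binary search:

// Counts how many times n can be halved before reaching 1,
// i.e. roughly the worst-case number of comparisons in binary search
function halvingSteps(n) {
  let steps = 0;
  while (n > 1) {
    n = Math.floor(n / 2);
    steps++;
  }
  return steps;
}

console.log(halvingSteps(10)); // 3
console.log(halvingSteps(1000)); // 9
console.log(halvingSteps(1000000)); // 19

Doubling the input only adds one step, which is why logarithmic growth is called sublinear.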

Linear and Quadratic Time

  • Linear time: O(n)

The runtime of the algorithm is directly proportional to the size of the data set

  • Reading every item sequentially

  • Interesting that on unsorted data, linear search will be more performant than binary search up to a certain value of n, because binary search requires the additional runtime of first applying a sorting algorithm

  • Quadratic time: O(n^2)

The runtime of the algorithm increases by a factor of n squared as the size of the data set increases

  • Expensive computationally
  • e.g. making an n x n board grid that provides coordinates on a map: (1,1), (1,2), (1,3), (2,1) ... (sketched below)
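
A minimal sketch of that grid example (my own; makeGrid is a made-up name): two nested loops over n produce n * n pairs, so the work grows quadratically.

// Builds an n x n grid of coordinate pairs: n * n = n^2 operations
function makeGrid(n) {
  const grid = [];
  for (let x = 1; x <= n; x++) {
    for (let y = 1; y <= n; y++) {
      grid.push([x, y]);
    }
  }
  return grid;
}

console.log(makeGrid(2)); // [ [1,1], [1,2], [2,1], [2,2] ]
console.log(makeGrid(3).length); // 9, i.e. 3^2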

Quasilinear time

  • Quasilinear time: O(n log n)

Given a data set of size n, the algorithm executes n operations, where each operation runs in log n (logarithmic) time

  • Basically linear times a bit
  • Common in sorting algorithms (e.g. merge sort)

Exponential time

  • Polynomial runtimes: O(n^k)

    • Worst case is n raised to some fixed power k
    • Considered efficient
  • Exponential runtimes: O(k^n)

    • As the size of the data set (n) increases, the runtime grows as a constant k raised to the power n
    • Considered inefficient
    • Brute force algorithms
  • Combinatorial (or factorial) runtimes: O(n!)

    • n!, i.e. n(n-1)(n-2)...(2)(1)
    • Classic example: brute-force traveling salesman (sketched below)
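
To make the factorial growth concrete, here is a rough sketch of my own (permutations is a made-up helper): brute-force traveling salesman has to consider every ordering of the stops, and the number of orderings is n!.

// Generates every ordering of the items: n! results in total,
// which is why brute-force traveling salesman is O(n!)
function permutations(items) {
  if (items.length <= 1) {
    return [items];
  }
  const results = [];
  items.forEach((item, index) => {
    // All orderings of the remaining items, each prefixed with this item
    const rest = [...items.slice(0, index), ...items.slice(index + 1)];
    permutations(rest).forEach(perm => results.push([item, ...perm]));
  });
  return results;
}

console.log(permutations([1, 2, 3]).length); // 6, i.e. 3!
console.log(permutations([1, 2, 3, 4]).length); // 24, i.e. 4!

Adding one more stop multiplies the count by the new n, which is why combinatorial runtimes become intractable so quickly.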

Determining Complexity

  • Each step in an algorithm has its own efficiency
  • Binary search has both
    • linear efficiency (halving the list, plus accessing/comparing a value)
    • logarithmic efficiency (the number of times it needs to repeat in the worst case)
  • Can measure by average or worst-case performance

Algorithms in Code

Linear search in code

  • Not all code is an algorithm
    • To be a proper algorithm, linear search must:
    • return same value given the same input
    • return a value every time executed
    • complete execution in a finite amount of time
// Returns the index of the first match, or -1 if target isn't found
function linearSearch(array, target) {
    for (let i = 0; i < array.length; i++) {
        if (array[i] === target) {
            return i;
        }
    }
    return -1;
}
  • Treehouse example uses a for-loop
function linearSearch(array, target) {
  let returnValue;
  array.forEach((item, index) => {
    if (item === target) {
      returnValue = index;
    }
  });
  // Compare against undefined: a bare `returnValue ? returnValue : -1`
  // would wrongly return -1 for a match at index 0, since 0 is falsy.
  // Note also that forEach keeps overwriting, so this returns the last
  // matching index rather than the first
  return returnValue !== undefined ? returnValue : -1;
}
  • forEach is a pain to use here because you can't break out of it early
    • The worst-case O(n) bound is the same, but once average performance is taken into account, this version is worse than either the for-loop or findIndex because it always visits every element
function linearSearch(array, target) {
    return array.findIndex(item => item === target);
}

let arrayOfValues = [1, 2, 3];

console.log(linearSearch(arrayOfValues, 2)); // 1
console.log(linearSearch(arrayOfValues, 10)); // -1
  • Simplest via findIndex, because the native method serves exactly this problem's input and output requirements
    • findIndex also returns immediately once the condition is met, avoiding the worst case where possible

Binary search in code

Iterative binary search

function binarySearch(array, key) {
    let left = 0;
    let right = array.length - 1;
    while (left <= right) {
        // Midpoint of the current search range
        const mid = left + Math.floor((right - left) / 2);
        if (array[mid] === key) {
            return mid;
        }
        // Discard the half that can't contain the key
        if (array[mid] < key) {
            left = mid + 1;
        } else {
            right = mid - 1;
        }
    }
    return -1;
}
  • Treehouse algorithm

Recursive binary search

function binarySearch(
  dataArray,
  target,
  startingIndex = 0,
  endingIndex = dataArray.length - 1
) {
  // Base case: an empty search range means the target isn't present.
  // Without this guard, a target that falls between existing values
  // can recurse forever
  if (startingIndex > endingIndex) {
    return -1;
  }

  const centerIndex = startingIndex + Math.floor((endingIndex - startingIndex) / 2);
  const middleValue = dataArray[centerIndex];

  if (middleValue === target) {
    return centerIndex;
  }

  if (middleValue > target) {
    return binarySearch(dataArray, target, startingIndex, centerIndex - 1);
  } else {
    return binarySearch(dataArray, target, centerIndex + 1, endingIndex);
  }
}

let valuesToSearch = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];

console.log(binarySearch(valuesToSearch, 1)); // 0
console.log(binarySearch(valuesToSearch, 3)); // 2
console.log(binarySearch(valuesToSearch, 6)); // 5
console.log(binarySearch(valuesToSearch, 10)); // 9
console.log(binarySearch(valuesToSearch, 11)); // -1
console.log(binarySearch(valuesToSearch, -10)); // -1
  • My solution; Treehouse didn't provide a JS binary search. I think this does the trick

Recursion and Space Complexity

Recursive functions

  • Stopping condition in recursive function is often called a base case
  • Recursive depth: the number of times a function calls itself
  • Python isn't designed for recursion and has a max recursion depth (1,000 by default)
    • So an iterative algorithm is often preferable in Python
  • Swift handles recursion nicely
  • Different languages will lend themselves to different solutions
  • Recursion is often preferred to iterative algorithms in functional programming paradigms
    • I think because iterative loops require value mutation (see the sketch below)
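
A small illustration of that last point, using a simple sum (my own sketch; both function names are made up): the iterative version reassigns total and i on every pass, while the recursive version only binds fresh values per call.

// Iterative: `total` and `i` are mutated on every iteration
function sumIterative(values) {
  let total = 0;
  for (let i = 0; i < values.length; i++) {
    total += values[i];
  }
  return total;
}

// Recursive: no reassignment, each call just binds new arguments
function sumRecursive(values) {
  if (values.length === 0) {
    return 0; // base case / stopping condition
  }
  return values[0] + sumRecursive(values.slice(1));
}

console.log(sumIterative([1, 2, 3])); // 6
console.log(sumRecursive([1, 2, 3])); // 6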

Space complexity

  • Space complexity: A measure of how much working storage, or extra storage, is needed as a particular algorithm grows

    • Not concerned with the initial memory needed (i.e. my input is 10 vs 100)
    • Instead, space complexity measures the additional storage needed as the algorithm runs and tries to find a solution
  • In the iterative Python algorithm, the space complexity is constant (O(1)) because we only use a single list in memory and simply change the start and end values

    • As the algorithm runs it doesn't require additional memory to what it began with
  • The recursive algorithm, on the other hand, creates a new list every time with a subset of values

    • Starting out with memory allocation of n
    • Each time the recursion occurs it will require n/2, n/4, etc. until the recursion hits a stopping condition
    • The recursive algorithm runs in logarithmic space (O(log n))
    • In a language w/o TCO (like Java or Python), the recursive algorithm also takes logarithmic space, because each of the O(log n) recursive calls adds a stack frame
  • In the Python example, both the iterative and recursive algorithms have the same time complexity, so the difference in their space complexity (constant vs. logarithmic) makes the iterative algorithm the better implementation

  • I don't think that my JS recursive binary search takes logarithmic space, because I'm passing through the same single array rather than using slice to make a new array

    • dataArray === valuesToSearch is true within the algorithm, so the reference identity of the array should be consistent
    • Good to think through though
  • Tail call: a call inside a function to the same function that occurs as the last operation

    • I believe my binary search algorithm has tail calls
  • Tail call optimization (TCO): a feature some languages have to save memory when running recursive functions (illustrated below)
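
For illustration, a hedged sketch of my own (factorial and factorialTail are made-up names): in the first version the multiply happens after the recursive call returns, so every frame must be kept alive; in the second the recursive call is the last operation, which is the shape TCO can optimize.

// Not a tail call: the multiplication happens after factorial(n - 1)
// returns, so every stack frame has to stick around
function factorial(n) {
  if (n <= 1) return 1;
  return n * factorial(n - 1);
}

// Tail call: the recursive call is the final operation, so an engine
// with TCO can reuse the current stack frame
function factorialTail(n, accumulator = 1) {
  if (n <= 1) return accumulator;
  return factorialTail(n - 1, n * accumulator);
}

console.log(factorial(5)); // 120
console.log(factorialTail(5)); // 120

Worth noting: ES2015 specifies proper tail calls for JavaScript, but most engines (including V8/Node) never shipped them, so in practice the tail-call form still grows the stack outside Safari's JavaScriptCore.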

Course 2: Introduction to Data Structures

Array basics

A data structure is a collection of values and the format in which they are stored. Data structures model the relationships between the values in the collection, as well as the operations applied to the data stored in the structure.

  • Homogeneous vs. heterogeneous structures

    • Java requires all values stored in an array to be of the same type
    • Python and JS allow any variety of values to be stored in an array
  • Python calls its basic array-like data structure a list

  • Arrays in most languages are contiguous data structures = all values are stored next to one another in memory

    • Easy to retrieve data
    • Runtime more likely to be constant
  • Non-contiguous data structures = the structure stores a value and a reference to where the next value is

    • To retrieve next value, must follow the reference (pointer) to the next block of memory
    • Adds runtime overhead
  • In a heterogeneous data structure (e.g. a Python list), it's hard for the compiler to allocate contiguous memory of the correct size, because different types (e.g. string, number) can require different amounts of space in memory

    • Python gets around this by allocating contiguous memory blocks with pointer references to the values (pointers are always the same size)
  • Only the location of the first memory block needs to be held in memory

    • Because the memory is contiguous, the index of an array can be used to determine which block of memory stores the desired value
    • Only the address of the first block is stored; the other addresses can be calculated at runtime using an offset: starting memory block + index × element size (see the typed-array sketch below)
    • Accessing an index outside the array / range of contiguous memory blocks is an error
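
JavaScript's typed arrays make that offset arithmetic visible; a small sketch of my own (the variable names are made up):

// A typed array is a contiguous block of memory, so the location of
// index i can be computed as: byteOffset + i * BYTES_PER_ELEMENT
const buffer = new ArrayBuffer(32); // one contiguous 32-byte block
const values = new Float64Array(buffer); // 4 elements, 8 bytes each

values[2] = 42;

// Byte position of index 2 within the block: 0 + 2 * 8 = 16
console.log(values.byteOffset + 2 * Float64Array.BYTES_PER_ELEMENT); // 16
console.log(values[2]); // 42

values[4] = 1; // out of range: silently ignored on typed arrays
console.log(values.length); // still 4

Regular JS arrays, by contrast, offer no such layout guarantee, which is the point of the next section.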

JavaScript Arrays

  • JS does not have an explicit array type, but an Array object, which is a list-like object
    • Has special constructor and accessor methods
    • Prototype has traversal and mutation operations
  • Heterogeneous and non-contiguous data structure

Arrays are regular JavaScript objects with syntactic sugar (demonstrated below)
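
That's easy to see in a console (my own quick demo): the indices really are just string keys on an object.

const letters = ['a', 'b', 'c'];

console.log(typeof letters); // 'object'
console.log(Object.keys(letters)); // [ '0', '1', '2' ]
console.log(letters['1'] === letters[1]); // true: indices are string keys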

Array Search, Insert, Delete

  • Search in an array is a linear time operation
    • Most languages will do an iterative search because it's typically less expensive than both a sort and binary search together
  • Insert = linear runtime because every item in the array needs to be shifted one index over
    • e.g. Adding a value to the beginning of an array
  • Appending = constant time because we're just adding a new index to the end
    • Amortized constant space complexity = Python has a growth pattern for its list type (e.g. 4, 8, 16), so resizing the list when adding a single value may actually have greater than constant space complexity
    • But it averages out to constant
  • Delete = upper bound of O(n) or linear runtime because in the worst case, every item in the array needs to be shifted one index over
  • In sum (sketched below):
    • arrays are good at accessing and reading values, which takes place in constant time
    • arrays aren't great at inserting or deleting, which take place in linear time
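
A quick sketch of my own mapping those costs onto the standard JS array methods:

const numbers = [2, 3, 4];

// Access by index: constant time, O(1)
console.log(numbers[1]); // 3

// Append to the end: (amortized) constant time, O(1)
numbers.push(5); // [2, 3, 4, 5]

// Insert at the front: linear time, O(n),
// because every existing element shifts one index over
numbers.unshift(1); // [1, 2, 3, 4, 5]

// Delete from the front: linear time, O(n), for the same reason
numbers.shift(); // [2, 3, 4, 5]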

Building a Linked List

  • Linked lists are better than arrays at insertions and deletions
    • Good for a task that requires a lot of insertion and deletion and not much searching (which they're slow at)
  • Head and tail nodes
    • Memory typically only keeps a reference to the head node
    • Each node has a value and a pointer to the next value
    • Except the tail, which has no next pointer
  • Singly- or doubly-linked lists
    • Reference to only next node or also the one before
  • Linked lists are non-contiguous data structures
    • Each node can be stored at an arbitrary memory location
    • Each node has a reference to the next node's location
class Node {
  constructor(value) {
    this.value = value;
    this.nextNode = null;
  }
}

class SinglyLinkedList {
  constructor(head = null) {
    this.head = head;
    this.size = 0;
  }

  isEmpty() {
    return this.head === null;
  }

  getLength() {
    let current = this.head;
    let count = 0;

    while (current) {
      count++;
      current = current.nextNode;
    }
    return count;
  }

  add(value) {
    // Adds a new node to the head
    // Takes constant time: O(1)
    const nodeToAdd = new Node(value);
    nodeToAdd.nextNode = this.head;
    this.head = nodeToAdd;
  }

  search(target) {
    // Takes linear time: O(n)
    let current = this.head;
    let foundNode = null;

    while (current && foundNode === null) {
      if (current.value === target) {
        foundNode = current;
      }
      current = current.nextNode;
    }
    return foundNode;
  }
}

const myLinkedList = new SinglyLinkedList();
myLinkedList.add(1);
myLinkedList.add(2);
myLinkedList.add(3);

console.log(myLinkedList.getLength()); // 3
console.log(myLinkedList.search(2)); // Node { value: 2, nextNode: Node { value: 1, nextNode: null } }

Adding Nodes to a Linked List

  • A true insert can happen anywhere in the list
    • Focusing for now on just prepend and append
    • To insert at the end of a list, you need a reference to the tail
  insert(value, index) {
    // Worst case is linear time, O(n), to walk to the index,
    // though relinking the nodes themselves is constant time
    if (index === 0) {
      this.add(value);
      return;
    }

    if (index > 0) {
      const nodeToAdd = new Node(value);
      let position = index;
      let current = this.head;

      // Stop while current is still the node just before the insert point
      while (position > 1) {
        position--;
        current = current.nextNode;
      }

      const previous = current;
      const next = current.nextNode;

      previous.nextNode = nodeToAdd;
      nodeToAdd.nextNode = next;
    }
  }
  • position > 1 is the key stopping condition: it leaves current pointing at the node just before the insert point
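
Assuming the insert method above is added to the SinglyLinkedList class from earlier, a quick usage check (my own example):

const list = new SinglyLinkedList();
list.add(1);
list.add(2);
list.add(3); // list is now 3 -> 2 -> 1

list.insert(2.5, 1); // list is now 3 -> 2.5 -> 2 -> 1
console.log(list.getLength()); // 4
console.log(list.head.nextNode.value); // 2.5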

The Merge Sort Algorithm

  • Break array into single-element arrays

  • Sort and merge the arrays at the same time to build back up an array with the original length and members

  • Goes recursively down the left half of the original array, then the right half, finally merging the two sorted halves

  • Overall, this results in fewer sort operations than sorting the list as a whole

  • Note that in the Python example, the slice makes the algorithm much more expensive

    • O(kn log n), where k is the number of items in the lists being split
    • This seems true of JS as well (SO ref)
    • A non-recursive, iterative solution would avoid the k multiplier (see the index-based sketch at the end of this section)
    • The trade-off is the same as in the earlier lesson: mutating variables vs. a more functional execution
// Overall runtime will be split + merge: O(n log n)
function mergeSort(values) {
  // Base case or stopping condition that will end recursion
  if (values.length <= 1) {
    return values;
  }

  const [leftValues, rightValues] = split(values);

  // Recursively break down array
  return merge(mergeSort(leftValues), mergeSort(rightValues));
}

function split(values){
  // The splitting recursion spans O(log n) levels; the slices themselves
  // add linear copying per level (see the note above)
  const middleIndex = Math.floor(values.length / 2);
  const leftValues = values.slice(0, middleIndex);
  const rightValues = values.slice(middleIndex);

  return [leftValues, rightValues];
}

function merge(left, right) {
  // Runs in overall O(n) time
  // Linear space complexity: the merges are not simultaneous,
  // and only the final merge requires memory equal to the initial input

  let resultArray = [];
  let leftIndex = 0;
  let rightIndex = 0;

  while (leftIndex < left.length && rightIndex < right.length) {
    if (left[leftIndex] < right[rightIndex]) {
      resultArray.push(left[leftIndex]);
      leftIndex++;
    } else {
      resultArray.push(right[rightIndex]);
      rightIndex++;
    }
  }

  return resultArray
    .concat(left.slice(leftIndex))
    .concat(right.slice(rightIndex));
}

const myArray = [11, 2, 44, 4, 555, 6, 7];

console.log(mergeSort(myArray)); // [ 2, 4, 6, 7, 11, 44, 555 ]
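
As noted above, an index-based approach avoids the slice copies. A hedged sketch of my own (mergeSortInPlace is a made-up name): passing index bounds instead of slicing means the split step copies nothing, though each merge still uses a temporary buffer, and the input array is mutated — the functional-vs-mutation trade-off mentioned earlier.

// Sorts values[low..high) in place; split is pure index arithmetic,
// so no per-level slice copies are made
function mergeSortInPlace(values, low = 0, high = values.length) {
  // Base case: ranges of 0 or 1 elements are already sorted
  if (high - low <= 1) {
    return values;
  }

  const middle = low + Math.floor((high - low) / 2);
  mergeSortInPlace(values, low, middle);
  mergeSortInPlace(values, middle, high);

  // Merge the two sorted halves using one temporary buffer
  const merged = [];
  let left = low;
  let right = middle;
  while (left < middle && right < high) {
    if (values[left] < values[right]) {
      merged.push(values[left++]);
    } else {
      merged.push(values[right++]);
    }
  }
  while (left < middle) merged.push(values[left++]);
  while (right < high) merged.push(values[right++]);

  // Copy the merged run back into place (this mutates the input array)
  for (let i = 0; i < merged.length; i++) {
    values[low + i] = merged[i];
  }
  return values;
}

console.log(mergeSortInPlace([11, 2, 44, 4, 555, 6, 7])); // [ 2, 4, 6, 7, 11, 44, 555 ]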