Merged
Binary file added docs/assets/images/QuickFind.png
Binary file added docs/assets/images/WeightedUnion.png
Binary file added docs/assets/images/WeightedUnionLeetCode.png
31 changes: 24 additions & 7 deletions src/main/java/dataStructures/disjointSet/README.md
Expand Up @@ -3,11 +3,14 @@
## Background

A disjoint-set structure, also known as a union-find or merge-find set, is a data structure that
keeps track of a partition of a set into disjoint (non-overlapping) subsets. In CS2040s, this
keeps track of a partition of a set into disjoint (non-overlapping) subsets.

In CS2040s, this
is introduced in the context of checking for dynamic connectivity. For instance, Kruskal's algorithm
in graph theory to find minimum spanning tree of the graph utilizes disjoint set to efficiently
query if there exists a path between 2 nodes. <br>
It supports 2 main operations:
in graph theory, which finds the minimum spanning tree of a graph, uses a disjoint set to efficiently
query whether a path already exists between 2 nodes.

Generally, there are 2 main operations:

1. Union: Join two subsets into a single subset
2. Find: Determine which subset a particular element is in. In practice, this is often done to check
Expand All @@ -17,12 +20,26 @@ The Disjoint Set structure is often introduced in 3 parts, with each iteration b
previous either in time or space complexity (or both). More details can be found in the respective folders.
Below is a brief overview:

1. Quick Find - Elements are assigned a component identity.
1. **Quick Find** - Elements are assigned a component identity.
Querying for connectivity and updating are usually tracked with an internal array.

2. Quick Union - Component an element belongs to is now tracked with a tree structure. Nothing to enforce
2. **Quick Union** - The component an element belongs to is now tracked with a tree structure. Nothing enforces
a balanced tree, and hence complexity does not necessarily improve.
   - Note: this is not implemented, but details can be found under the weighted union folder.

3. Weighted Union - Same idea of using a tree, but constructed in a way that the tree is balanced, leading to improved
3. **Weighted Union** - Same idea of using a tree, but constructed in a way that the tree is balanced, leading to improved
complexities. Can be further augmented with path compression.
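
To make the weighted union idea concrete, here is a minimal sketch over integer elements `0..n-1`. It is an illustration only and not the implementation in this repository: the class name `WeightedUnionSketch` and its method names are hypothetical, and it assumes union by size (attach the smaller tree under the larger) together with path compression.

```java
// Hypothetical sketch; names are illustrative and not taken from the repository.
class WeightedUnionSketch {
    private final int[] parent; // parent[i] == i means i is the root of its tree
    private final int[] size;   // size[r] = number of elements in the tree rooted at r

    WeightedUnionSketch(int n) {
        parent = new int[n];
        size = new int[n];
        for (int i = 0; i < n; i++) {
            parent[i] = i; // every element starts as its own component
            size[i] = 1;
        }
    }

    // Returns the root of x's tree and points every node on the path directly at it.
    int root(int x) {
        int r = x;
        while (parent[r] != r) {
            r = parent[r];
        }
        while (parent[x] != r) { // path compression pass
            int next = parent[x];
            parent[x] = r;
            x = next;
        }
        return r;
    }

    boolean connected(int a, int b) {
        return root(a) == root(b);
    }

    // Attaches the smaller tree under the root of the larger one to keep trees shallow.
    void union(int a, int b) {
        int ra = root(a);
        int rb = root(b);
        if (ra == rb) {
            return; // already in the same component
        }
        if (size[ra] < size[rb]) {
            int tmp = ra;
            ra = rb;
            rb = tmp;
        }
        parent[rb] = ra;
        size[ra] += size[rb];
    }
}
```

Keeping the larger tree's root as the new root is what bounds the tree height; path compression then flattens paths further on each lookup.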

## Applications
Because they are efficient and simple to implement, Disjoint Set structures are widely used in practice:
1. As mentioned, it is often used as a helper structure for Kruskal's MST algorithm
2. It can be used in the context of network connectivity
   - Managing a network of computers
   - Analysing social networks, e.g. finding communities and determining if two users are connected through a chain
3. It can be part of clustering algorithms that group data points based on similarity, which is useful for ML
4. It can be used to detect cycles in undirected graphs (directed dependency graphs, e.g. in software dependency management systems, need a direction-aware method such as topological sorting)
5. It can be used for image processing, in labelling different connected components of an image
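
As an illustration of the first application, the sketch below shows how Kruskal's algorithm might lean on a disjoint set to skip edges whose endpoints are already connected. It is a simplified assumption-laden example, not code from this repository: the class `KruskalSketch`, the `{u, v, weight}` edge encoding, and the plain parent array with path halving are all hypothetical choices made for brevity.

```java
// Hypothetical sketch; the edge encoding {u, v, weight} and all names are assumptions.
class KruskalSketch {
    // Follows parent pointers to the representative of x's component,
    // halving the path along the way (a simple form of path compression).
    static int root(int[] parent, int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];
            x = parent[x];
        }
        return x;
    }

    // Returns the total weight of a minimum spanning tree over nodes 0..n-1
    // (or of a minimum spanning forest if the graph is disconnected).
    static long mstWeight(int n, int[][] edges) {
        int[][] sorted = edges.clone(); // avoid reordering the caller's array
        java.util.Arrays.sort(sorted, (x, y) -> Integer.compare(x[2], y[2]));

        int[] parent = new int[n];
        for (int i = 0; i < n; i++) {
            parent[i] = i;
        }

        long total = 0;
        for (int[] e : sorted) {
            int ru = root(parent, e[0]);
            int rv = root(parent, e[1]);
            if (ru == rv) {
                continue; // a path already exists, so this edge would close a cycle
            }
            parent[ru] = rv; // union the two components
            total += e[2];
        }
        return total;
    }
}
```

The `ru == rv` check is the "does a path already exist between 2 nodes" query from the Background section; everything else is sorting and bookkeeping.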

## Notes
Disjoint Set is a data structure designed to keep track of a set of elements partitioned into a number of
non-overlapping subsets. It is not suited for handling duplicates and so our implementation ignores duplicates.
@@ -0,0 +1,99 @@
package dataStructures.disjointSet.quickFind;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Implementation of quick-find structure; turns a list of objects into a data structure that supports union
 * operations.
 *
 * @param <T> generic type of object to be stored
 */
public class DisjointSet<T> {
    // Maps each object to the integer identity of the component it belongs to.
    private final Map<T, Integer> identifier;

    /**
     * Basic constructor to create the Disjoint Set data structure.
     */
    public DisjointSet() {
        identifier = new HashMap<>();
    }

    /**
     * Constructor to initialize Disjoint Set with a known list of objects. Duplicates in the list are ignored.
     *
     * @param objects the objects to store, each starting in its own component
     */
    public DisjointSet(List<T> objects) {
        identifier = new HashMap<>();
        for (T obj : objects) {
            // Internally, component identity is tracked with integers. The current size is a fresh identity.
            // putIfAbsent skips duplicates, which would otherwise be re-assigned an identity that the next
            // new object also receives.
            identifier.putIfAbsent(obj, identifier.size());
        }
    }

    /**
     * Returns the number of objects stored in the structure.
     *
     * @return the number of objects
     */
    public int size() {
        return identifier.size();
    }

    /**
     * Adds an object into the structure as its own component. Does nothing if the object is already present.
     *
     * @param obj the object to add
     */
    public void add(T obj) {
        identifier.putIfAbsent(obj, identifier.size());
    }

    /**
     * Checks if object a and object b are in the same component.
     *
     * @param a the first object
     * @param b the second object
     * @return true if both objects exist and share a component, false otherwise
     */
    public boolean find(T a, T b) {
        if (!identifier.containsKey(a) || !identifier.containsKey(b)) { // key(s) do not exist
            return false;
        }
        return identifier.get(a).equals(identifier.get(b));
    }

    /**
     * Merges the components of object a and object b.
     *
     * @param a the first object
     * @param b the second object
     */
    public void union(T a, T b) {
        if (!identifier.containsKey(a) || !identifier.containsKey(b)) { // key(s) do not exist; do nothing
            return;
        }

        if (identifier.get(a).equals(identifier.get(b))) { // already in the same component; do nothing
            return;
        }

        int compOfA = identifier.get(a);
        int compOfB = identifier.get(b);
        // Re-label every object in a's component with b's component identity.
        for (Map.Entry<T, Integer> entry : identifier.entrySet()) {
            if (entry.getValue().equals(compOfA)) {
                entry.setValue(compOfB);
            }
        }
    }

    /**
     * Retrieves all elements that are in the same component as the specified object. Not a typical operation,
     * but included here to illustrate another use case.
     *
     * @param a the object whose component is retrieved
     * @return a list of objects in the same component as a; empty if a is not present
     */
    public List<T> retrieveFromSameComponent(T a) {
        List<T> ret = new ArrayList<>();
        for (T obj : identifier.keySet()) {
            if (find(a, obj)) {
                ret.add(obj);
            }
        }
        return ret;
    }
}
11 changes: 9 additions & 2 deletions src/main/java/dataStructures/disjointSet/quickFind/README.md
Expand Up @@ -2,11 +2,18 @@

## Background
Every object will be assigned a component identity. The implementation of Quick Find often involves
an underlying array that tracks the component identity of each object.
an underlying array or hash map that tracks the component identity of each object.
Our implementation uses a hash map (to easily handle the case when objects aren't integers).

<div align="center">
<img src="../../../../../../docs/assets/images/QuickFind.png" width="50%">
<br>
Credits: CS2040s Lecture Slides
</div>

### Union
Between the two components, decide on the component d to represent the combined set. Let the other
component's identity be d'. Simply iterate over the component identifier array, and for any element with
component's identity be d'. Simply iterate over the component identifier array / map, and for any element with
identity d', assign it to d.
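
The relabelling step described above can be sketched with a plain integer array. This is a simplified illustration that assumes elements are the integers `0..n-1`; the class name `QuickFindArraySketch` is hypothetical, and the repository's own implementation uses a hash map instead.

```java
// Hypothetical array-based sketch; assumes elements are the integers 0..n-1.
class QuickFindArraySketch {
    private final int[] id; // id[i] = component identity of element i

    QuickFindArraySketch(int n) {
        id = new int[n];
        for (int i = 0; i < n; i++) {
            id[i] = i; // each element starts with its own identity
        }
    }

    // Two elements are connected exactly when their identities match.
    boolean find(int a, int b) {
        return id[a] == id[b];
    }

    // Keeps a's identity d and relabels every element carrying b's identity d'.
    void union(int a, int b) {
        int d = id[a];
        int dPrime = id[b];
        if (d == dPrime) {
            return; // already in the same component
        }
        for (int i = 0; i < id.length; i++) {
            if (id[i] == dPrime) {
                id[i] = d;
            }
        }
    }
}
```

Note that `union` scans the whole array regardless of how small the relabelled component is, which is the cost that the tree-based variants aim to reduce.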

### Find

This file was deleted.

This file was deleted.
