Skip to content

Commit

Permalink
Merge branch 'master' of https://github.com/michael-rapp/Tries
Browse files Browse the repository at this point in the history
# Conflicts:
#	RELEASE_NOTES.md
  • Loading branch information
michael-rapp committed Jan 30, 2018
2 parents d9638ab + e04ed59 commit 8bb9973
Show file tree
Hide file tree
Showing 142 changed files with 34,698 additions and 211 deletions.
153 changes: 153 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,153 @@
# Tries - README

[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0) [![Donate](https://img.shields.io/badge/Donate-PayPal-green.svg)](https://www.paypal.com/cgi-bin/webscr?cmd=_s-xclick&hosted_button_id=X75YSLEJV3DWE)

This is a Java library, which provides different implementations of "Tries" (also referred to as "prefix trees", "digital trees" or "radix trees"). Tries are data structures that can be used to store associative arrays where the keys are sequences (e.g. sequences of characters or digits). Tries allow to efficiently search for sequences (and their associated values) that share a common prefix.

The library currently provides the following features:

* Provides an unsorted trie implementation, which stores successors of nodes in hash maps (`HashTrie`).
* Provides a sorted trie implementation, which stores successors of nodes in sorted lists (`SortedListTrie`).
* Provides a Patricia trie implementation (`PatriciaTrie`).
* For each available trie implementation a dedicated variant for using character sequences as keys is available.
* The utility class `Tries` allows to create empty, singleton and unmodifiable instances of all available trie implementations.

## License Agreement

This project is distributed under the Apache License version 2.0. For further information about this license agreement's content please refer to its full version, which is available at http://www.apache.org/licenses/LICENSE-2.0.txt.

## Download

The latest release of this library can be downloaded as a zip archive from the download section of the project's Github page, which is available [here](https://github.com/michael-rapp/Tries/releases). Furthermore, the library's source code is available as a Git repository, which can be cloned using the URL https://github.com/michael-rapp/Tries.git.

Alternatively, the library can be added to your project as a Gradle dependency by adding the following dependency to the `build.gradle` file:

```groovy
dependencies {
compile 'com.github.michael-rapp:tries:1.0.0'
}
```

When using Maven, the following dependency can be added to the `pom.xml`:

```xml
<dependency>
<groupId>com.github.michael-rapp</groupId>
<artifactId>tries</artifactId>
<version>1.0.0</version>
</dependency>
```

## Tries

The image below illustrates the structure of an (unsorted) trie. The following key-value pairs (where the keys are character sequences and the values are integers) are stored by the trie:

```
A -> 21
to -> 7
tea -> 13
ted -> 8
ten -> 9
in -> 14
inn -> 16
```

![](/doc/images/trie_example.png)

As the given example illustrates, all leaf nodes of a trie are associated with values (highlighted in blue). In addition, some of the inner nodes may also correspond to values. The predecessors of a node specify the sequence it corresponds to. E.g. the node that corresponds to the key "tea" has the predecessors t -> e -> a. The root node corresponds to an empty sequence. Because nodes with a common prefix share the same predecessors, tries provide some kind of compression.

Most importantly, this library provides two interfaces - `Trie` and `SortedTrie`. The first of both interfaces extends the interface `java.util.Map`, whereas the latter extends the interface `java.util.NavigableMap`. Similar to the map implementations the Java SDK provides, key-value pairs can be added to a trie using the `put`-method. For retrieving values by their key, the `get`-method can be used. Whereas the order of keys is not taken into account by unsorted tries, when iterating a sorted trie, the order of the keys preserved. The following table provides an overview of the different implementations of the interfaces `Trie` and `SortedTrie`:

| Interface | Implementations | Description |
|----------------------------------------|-------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `Trie<SequenceType, ValueType>` | `HashTrie<SequenceType, ValueType>` | An unsorted trie, which stores the successors of nodes in hash maps. This enables to lookup keys with linear complexity. |
| `StringTrie<ValueType>` | `HashStringTrie<ValueType>` | The pendant of the class `HashTrie` for using character sequences, i.e. Strings, as keys. |
| `SortedTrie<SequenceType, ValueType>` | `SortedListTrie<SequenceType, ValueType>` | A sorted trie, which stores the successors of nodes in sorted lists. As binary searches are used to search for successors, looking up keys comes at logarithmic costs. |
| | `PatriciaTrie<SequenceType, ValueType>` | A sorted trie similar to a `SortedListTrie`, where edges between nodes do not always correspond to a single element. Subsequent nodes with a single successor are merged to a single node. |
| `SortedStringTrie<ValueType>` | `SortedListStringTrie<ValueType>` | The pendant of the class `SortedListTrie` for using character sequences, i.e. Strings, as keys. |
| | `PatriciaStringTrie<SequenceType, ValueType>` | The pendant of the class `PatriciaTrie` for using character sequences, i.e. Strings, as keys. |

Whereas the values of a trie can be of an arbitrary type (referred to as the generic type `ValueType`), the type of the keys (referred to as `SequenceType`) must implement the interface `Sequence`. The tries for storing character sequences internally use the class `StringSequence`, which implements that interface. The following example illustrates how key-value pairs can be added and looked up using a generic `HashTrie`.

```java
Trie<StringSequence, Integer> trie = new HashTrie<>();
trie.put(new StringSequence("A"), 21);
trie.put(new StringSequence("to"), 7);
// ...
int value = trie.get(new StringSequence("to"));
```

Using a `StringTrie` instead of a `Trie` simplifies the handling of keys:

```java
StringTrie<Integer> trie = new HashStringTrie<>();
trie.put("A", 21);
trie.put("to", 7);
// ...
int value = trie.get("to");
```

The nodes of a trie are implemented as classes implementing the interface `Node`. If necessary, the root node of a trie can be retrieved by using the `getRootNode`-method. Although the returned node may not be modified, this enables to traverse the tree structure of the trie:

```java
Node<StringSequence, Integer> rootNode = trie.getRootNode();

if (rootNode.hasSuccessors()) {
for (StringSequence successorKey : rootNote)
Node<StringSequence, Integer> successor = rootNode.getSuccessor(successorKey);
// ...
}
}
```

The interfaces [`Map`](https://docs.oracle.com/javase/8/docs/api/java/util/Map.html), respectively [`NavigableMap`](https://docs.oracle.com/javase/8/docs/api/java/util/NavigableMap.html), which are extends by the interfaces `Trie` and `SortedTrie` provide various methods for modifying a trie or retrieving its key or values. For a more detailed documentation of these interfaces, please refer to their API documentation. In addition to the methods, which are provides by these interfaces, each trie provides a `subTrie`-method. It enables to create a new trie from an existing one, which will only contain a subset of the original key-value pairs. In the example below, the resulting sub trie only contains keys that start with the element "t".

```java
Trie<StringSequence, Integer> subTrie = trie.subTrie(new StringSequence("t"));
```

## Patricia Tries

Patricia tries use a structure, which is optimized in terms of the required space. Unlike a `HashTrie` or a `SortedListTrie`, the edges between a `PatriciaTrie`'s nodes do not always correspond to a single element. Instead, subsequent nodes that only have a single successor are merged into a single node to reduce space complexity. As this requires to reorganize the tree structure when inserting or removing elements, patricia tries should be preferred over a `SortedListTrie` if elements are only added or removed sporadically and optimizing memory consumption is important.

The image below illustrates the structure of a Patricia trie, which contains the following key value-pairs.

```
roman -> 7
romane -> 13
romanus -> 8
romulus -> 9
ruber -> 17
rubicon -> 14
```

![](/doc/images/patricia_trie_example.png)

## Utility methods

Similar to the Java SDK's class `java.util.Collections`, the class `Tries` provides various static utility methods regarding tries. By using the following methods, empty and unmodifiable trie can be created:

```java
Trie<SequenceType, ValueType> trie = Tries.emptyTrie();
StringTrie<ValueType> stringTrie = Tries.emptyStringTrie();
SortedTrie<SequenceType, ValueType> sortedTrie = Tries.emptySortedTrie();
SortedStringTrie<ValueType> sortedStringTrie = Tries.emptySortedStringTrie();
```

Furthermore, the class `Tries` provides the following methods to create unmodifiable tries that consist of a single entry:

```java
Trie<SequenceType, ValueType> trie = Tries.singletonTrie(key, value);
StringTrie<ValueType> stringTrie = Tries.singletonStringTrie(key, value);
SortedTrie<SequenceType, ValueType> sortedTrie = Tries.singletonSortedTrie(key, value);
SortedStringTrie<ValueType> sortedStringTrie = Tries.singletonSortedStringTrie(key, value);
```

In order to create an unmodifiable instance of an existing trie, the following utility methods can be used. This enables to return a trie from an API, which only provides read-only access to some data.

```java
Trie<SequenceType, ValueType> unmodifiableTrie = Trie.unmodifiableTrie(trie);
StringTrie<ValueType> unmodifiableStringTrie = Trie.unmodifiableStringTrie(stringTrie);
SortedTrie<SequenceType, ValueType> unmodifiableSortedTrie = Trie.unmodifiableSortedTrie(sortedTrie);
SortedStringTrie<ValueType> unmodifiableSortedStringTrie = Trie.unmodifiableSortedStringTrie(sortedStringTrie);
```
Binary file added assets/patricia_trie_example.odg
Binary file not shown.
Binary file added assets/trie_example.odg
Binary file not shown.
Binary file added doc/images/patricia_trie_example.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added doc/images/trie_example.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
66 changes: 66 additions & 0 deletions doc/javadoc/allclasses-frame.html
Original file line number Diff line number Diff line change
@@ -0,0 +1,66 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<!-- NewPage -->
<html lang="en">
<head>
<!-- Generated by javadoc (1.8.0_151) on Tue Jan 23 15:27:49 CET 2018 -->
<title>All Classes</title>
<meta name="date" content="2018-01-23">
<link rel="stylesheet" type="text/css" href="stylesheet.css" title="Style">
<script type="text/javascript" src="script.js"></script>
</head>
<body>
<h1 class="bar">All&nbsp;Classes</h1>
<div class="indexContainer">
<ul>
<li><a href="de/mrapp/tries/datastructure/AbstractEmptyTrie.html" title="class in de.mrapp.tries.datastructure" target="classFrame">AbstractEmptyTrie</a></li>
<li><a href="de/mrapp/tries/datastructure/node/AbstractNode.html" title="class in de.mrapp.tries.datastructure.node" target="classFrame">AbstractNode</a></li>
<li><a href="de/mrapp/tries/datastructure/AbstractSingletonTrie.html" title="class in de.mrapp.tries.datastructure" target="classFrame">AbstractSingletonTrie</a></li>
<li><a href="de/mrapp/tries/datastructure/AbstractSortedTrie.html" title="class in de.mrapp.tries.datastructure" target="classFrame">AbstractSortedTrie</a></li>
<li><a href="de/mrapp/tries/datastructure/AbstractStringTrieWrapper.html" title="class in de.mrapp.tries.datastructure" target="classFrame">AbstractStringTrieWrapper</a></li>
<li><a href="de/mrapp/tries/datastructure/AbstractTrie.html" title="class in de.mrapp.tries.datastructure" target="classFrame">AbstractTrie</a></li>
<li><a href="de/mrapp/tries/datastructure/AbstractTrie.AbstractKeySet.html" title="class in de.mrapp.tries.datastructure" target="classFrame">AbstractTrie.AbstractKeySet</a></li>
<li><a href="de/mrapp/tries/datastructure/AbstractTrie.KeySet.html" title="class in de.mrapp.tries.datastructure" target="classFrame">AbstractTrie.KeySet</a></li>
<li><a href="de/mrapp/tries/datastructure/AbstractUnmodifiableStringTrie.html" title="class in de.mrapp.tries.datastructure" target="classFrame">AbstractUnmodifiableStringTrie</a></li>
<li><a href="de/mrapp/tries/datastructure/AbstractUnmodifiableTrie.html" title="class in de.mrapp.tries.datastructure" target="classFrame">AbstractUnmodifiableTrie</a></li>
<li><a href="de/mrapp/tries/datastructure/EmptySortedTrie.html" title="class in de.mrapp.tries.datastructure" target="classFrame">EmptySortedTrie</a></li>
<li><a href="de/mrapp/tries/datastructure/EmptyTrie.html" title="class in de.mrapp.tries.datastructure" target="classFrame">EmptyTrie</a></li>
<li><a href="de/mrapp/tries/util/EntryUtil.html" title="class in de.mrapp.tries.util" target="classFrame">EntryUtil</a></li>
<li><a href="de/mrapp/tries/datastructure/node/HashNode.html" title="class in de.mrapp.tries.datastructure.node" target="classFrame">HashNode</a></li>
<li><a href="de/mrapp/tries/HashStringTrie.html" title="class in de.mrapp.tries" target="classFrame">HashStringTrie</a></li>
<li><a href="de/mrapp/tries/HashTrie.html" title="class in de.mrapp.tries" target="classFrame">HashTrie</a></li>
<li><a href="de/mrapp/tries/Node.html" title="interface in de.mrapp.tries" target="classFrame"><span class="interfaceName">Node</span></a></li>
<li><a href="de/mrapp/tries/NodeValue.html" title="class in de.mrapp.tries" target="classFrame">NodeValue</a></li>
<li><a href="de/mrapp/tries/PatriciaStringTrie.html" title="class in de.mrapp.tries" target="classFrame">PatriciaStringTrie</a></li>
<li><a href="de/mrapp/tries/structure/PatriciaStructure.html" title="class in de.mrapp.tries.structure" target="classFrame">PatriciaStructure</a></li>
<li><a href="de/mrapp/tries/PatriciaTrie.html" title="class in de.mrapp.tries" target="classFrame">PatriciaTrie</a></li>
<li><a href="de/mrapp/tries/Sequence.html" title="interface in de.mrapp.tries" target="classFrame"><span class="interfaceName">Sequence</span></a></li>
<li><a href="de/mrapp/tries/util/SequenceUtil.html" title="class in de.mrapp.tries.util" target="classFrame">SequenceUtil</a></li>
<li><a href="de/mrapp/tries/datastructure/SingletonSortedTrie.html" title="class in de.mrapp.tries.datastructure" target="classFrame">SingletonSortedTrie</a></li>
<li><a href="de/mrapp/tries/datastructure/SingletonTrie.html" title="class in de.mrapp.tries.datastructure" target="classFrame">SingletonTrie</a></li>
<li><a href="de/mrapp/tries/datastructure/node/SortedListNode.html" title="class in de.mrapp.tries.datastructure.node" target="classFrame">SortedListNode</a></li>
<li><a href="de/mrapp/tries/SortedListStringTrie.html" title="class in de.mrapp.tries" target="classFrame">SortedListStringTrie</a></li>
<li><a href="de/mrapp/tries/SortedListTrie.html" title="class in de.mrapp.tries" target="classFrame">SortedListTrie</a></li>
<li><a href="de/mrapp/tries/SortedStringTrie.html" title="interface in de.mrapp.tries" target="classFrame"><span class="interfaceName">SortedStringTrie</span></a></li>
<li><a href="de/mrapp/tries/datastructure/SortedStringTrieWrapper.html" title="class in de.mrapp.tries.datastructure" target="classFrame">SortedStringTrieWrapper</a></li>
<li><a href="de/mrapp/tries/datastructure/SortedStringTrieWrapper.StringSequenceComparatorWrapper.html" title="class in de.mrapp.tries.datastructure" target="classFrame">SortedStringTrieWrapper.StringSequenceComparatorWrapper</a></li>
<li><a href="de/mrapp/tries/structure/SortedStructure.html" title="interface in de.mrapp.tries.structure" target="classFrame"><span class="interfaceName">SortedStructure</span></a></li>
<li><a href="de/mrapp/tries/SortedTrie.html" title="interface in de.mrapp.tries" target="classFrame"><span class="interfaceName">SortedTrie</span></a></li>
<li><a href="de/mrapp/tries/datastructure/StringNodeWrapper.html" title="class in de.mrapp.tries.datastructure" target="classFrame">StringNodeWrapper</a></li>
<li><a href="de/mrapp/tries/sequence/StringSequence.html" title="class in de.mrapp.tries.sequence" target="classFrame">StringSequence</a></li>
<li><a href="de/mrapp/tries/StringTrie.html" title="interface in de.mrapp.tries" target="classFrame"><span class="interfaceName">StringTrie</span></a></li>
<li><a href="de/mrapp/tries/datastructure/StringTrieWrapper.html" title="class in de.mrapp.tries.datastructure" target="classFrame">StringTrieWrapper</a></li>
<li><a href="de/mrapp/tries/structure/Structure.html" title="interface in de.mrapp.tries.structure" target="classFrame"><span class="interfaceName">Structure</span></a></li>
<li><a href="de/mrapp/tries/structure/Structure.Operation.html" title="enum in de.mrapp.tries.structure" target="classFrame">Structure.Operation</a></li>
<li><a href="de/mrapp/tries/Trie.html" title="interface in de.mrapp.tries" target="classFrame"><span class="interfaceName">Trie</span></a></li>
<li><a href="de/mrapp/tries/Tries.html" title="class in de.mrapp.tries" target="classFrame">Tries</a></li>
<li><a href="de/mrapp/tries/structure/UncompressedSortedStructure.html" title="class in de.mrapp.tries.structure" target="classFrame">UncompressedSortedStructure</a></li>
<li><a href="de/mrapp/tries/structure/UncompressedStructure.html" title="class in de.mrapp.tries.structure" target="classFrame">UncompressedStructure</a></li>
<li><a href="de/mrapp/tries/datastructure/node/UnmodifiableNode.html" title="class in de.mrapp.tries.datastructure.node" target="classFrame">UnmodifiableNode</a></li>
<li><a href="de/mrapp/tries/datastructure/UnmodifiableSortedStringTrie.html" title="class in de.mrapp.tries.datastructure" target="classFrame">UnmodifiableSortedStringTrie</a></li>
<li><a href="de/mrapp/tries/datastructure/UnmodifiableSortedTrie.html" title="class in de.mrapp.tries.datastructure" target="classFrame">UnmodifiableSortedTrie</a></li>
<li><a href="de/mrapp/tries/datastructure/UnmodifiableStringTrie.html" title="class in de.mrapp.tries.datastructure" target="classFrame">UnmodifiableStringTrie</a></li>
<li><a href="de/mrapp/tries/datastructure/UnmodifiableTrie.html" title="class in de.mrapp.tries.datastructure" target="classFrame">UnmodifiableTrie</a></li>
</ul>
</div>
</body>
</html>
Loading

0 comments on commit 8bb9973

Please sign in to comment.