Java path matching lib (HTTP, ...)
Switch branches/tags
Nothing to show
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
script
src
.gitignore
README.md
pom.xml

README.md

Path Travel Agent

A path matcher. Well suited for HTTP, but also works with any request/response system.

Only does paths, not method/verb, content-type, etc.

Made to be embedded by other engines that wants fast and type-safe request/response path matching.

Installing

Install from maven central.

<dependency>
  <groupId>com.augustl</groupId>
  <artifactId>pathtravelagent</artifactId>
  <version>0.1.1</version>
</dependency>

API documentation

The javadoc is hosted here:

http://docs.augustl.com/com.augustl.pathtravelagent/0.1.1/

Examples

There's a number of unit tests that showcases the various ways to build routes and match them.

https://github.com/augustl/path-travel-agent/blob/master/src/test/java/com/augustl/pathtravelagent/PathTravelAgentTest.java

Example of integration

There's a runnable test that demonstrates how to integrate path-travel-agent in a HTTP environment. You can find it here:

https://github.com/augustl/path-travel-agent/blob/master/src/test/java/com/augustl/pathtravelagent/HttpExampleTest.java

Benchmark

Path Travel Agent is very fast.

PathTravelAgentBenchmark.largeRouter: [measured 10 out of 15 rounds, threads: 1 (sequential)]
 round: 0.04 [+- 0.01], round.block: 0.00 [+- 0.00], round.gc: 0.00 [+- 0.00], \
 GC.calls: 2, GC.time: 0.02, time.total: 0.74, time.warmup: 0.37, time.bench: 0.37

PathTravelAgentBenchmark.sparkRouter: [measured 10 out of 15 rounds, threads: 1 (sequential)]
 round: 0.66 [+- 0.03], round.block: 0.00 [+- 0.00], round.gc: 0.00 [+- 0.00], \
 GC.calls: 11, GC.time: 0.02, time.total: 10.30, time.warmup: 3.72, time.bench: 6.58

path-travel-agent spends 0.37 seconds, while spark spends 6.58 seconds.

My benchmark is quite possibly naive, and is deliberatly constructed to make array-based routers look bad. Better benchmarks are welcome :)

How it works

Most routing libraries, such as Spark and Express, is based on an array of regular expressions. This has a performance characteristic O(N), as every regexp in the system has to be tested against the path to be matched.

path-travel-agent uses a bastardized radix tree. The routes are represented as a tree of hash maps. Each node has a handler, a hash map of known sub-nodes, and a single parameterized node.

Let's assume the following route set

  • /
  • /people
  • /people/important
  • /people/:person-id
  • /people/:person-id/friends
  • /people/:person-id/friends/:friend-id
  • /about

path-travel-agent will store these routes as:

{
  handler: IRouteHandler,
  pathSegmentChildNodes: {
    "people": {
        handler: IRouteHandler,
        parametricChild: {
            paramName: "person-id",
            handler: IRouteHandler,
            pathSegmentChildNodes: {
                "friends": {
                    handler: IRouteHandler,
                    parametricChild: {
                        paramName: "friend-id",
                        handler: IRouteHandler
                    }
                }
            }
        },
        pathSegmentChildNodes: {
            "important": {handler: IRouteHandler}
        },
    }
    "about": {
        handler: IRouteHandler
    }
  }
}

When matching, the flow is:

  • The path /people/1/friends/2 will get parsed into ["people", "1", "friends", "2"].
  • We ask the root node to recognize "people".
  • We look up "people" in pathSegmentChildNodes.
  • We find a match.
  • We ask that child node to route "1"
  • We look up "1" in pathSegmentChildNodes
  • Nothing was found. We call the parametricChild
  • We ask that parametric child node (which is just a regular node) to recognize "friends"
  • We find "friends" in pathSegmentChildNodes
  • We ask that child node to recognize "2"
  • It does not find any match in pathSegmentChildNodes, so we ask the parametricChild

One operation per path segment in a tree structure, is much more efficient than looping through an array of tens or hundreds of routes and matching with a regexp.

The reason this is called a "bastardized radix tree" is that it's not really a radix tree, it's just nested hash maps. Also, the parametricChild makes it even less of a radix tree.

License

3-clause BSD License

Copyright

Copyright 2014, August Lilleaas