NodeJS v4.3.0
npm install -g maplesyrup
maplesyrup < input_file > output_file
Given a list of patterns and a list paths, for each path find a pattern from the list of patterns that best describes the path.
To help us think about the problem, let's give a definition to few things:
- Number of paths is N
- Number of patterns is M
- A path delimiter is one of: forward slash, beginning of line, end of line
- A pattern delimiter is one of: comma, beginning of line, end of line
- A path segment is a single piece of path that is contained between two path delimiters
- A pattern segment is a single piece of pattern that is contained between two path delimiters
- A length of pattern or a path is the number of segments they have
- Index of a segment is its position from left to right
- Avarage length of a path is K
- Average length of a pattern is P
Let's also assume that in order for a pattern to match a path:
- They must have the same length
- Each segment of a path must match a segment of a pattern of the same index
We have to check each path against each pattern which gives us O(N * M) or more precisely O(N * K * M).
Using a tree allows us to minimize number of possible matches for each consecutive segment.
- Construct a pattern tree in O(M * log(M * P)).
- Traverse the tree once for each path in O(N * K * log(M))
It might be possible to further optimize the algorithm by comparing the size of N and M. For example, for a really large M and small N it would be faster to skip the tree construction.
-
Referencing children using a hashmap inside the pattern tree plays a big role in performance gains. It allows us to not enumerate through all the patterns and instead only look at the ones with matching segments.
-
To learn about a specific term, open the corresponding file in lib. For example, a
path
is defined in ./lib/path.js -
Next step for this project is to create automated unit tests.