
README.md

The Erlang implementation has three modes: unsafe, binary, and regex.

  1. Unsafe reads the entire dataset into memory and then performs the binary matching described in step 2. It is the fastest mode, but may not work on datasets too large to fit in memory. 😁 (~2.5s)
  2. Binary uses binary pattern matching with file:read_line/1 (~4.4s)
     • Because binary matches are case sensitive (while the solution must be case insensitive), the first step of the mapper algorithm is to generate every upper/lower-case permutation of the word being counted.
  3. Regex uses regular expressions with file:read_line/1 (~6.6s)
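
The case-permutation step and the two binary-matching modes could look roughly like the sketch below. This is not the code from the repository: the module name `case_variants` and all of its function names are hypothetical. It also assumes the word is plain ASCII and that occurrences are counted as substrings, with no word-boundary check. Only `file:read_line/1` is named in the text above; `file:read_file/1`, `binary:compile_pattern/1`, and `binary:matches/2` are standard library calls chosen for illustration.

```erlang
-module(case_variants).
-export([variants/1, count_in_line/2, count_binary/2, count_unsafe/2]).

%% Every upper/lower-case variant of Word, as a sorted list of binaries.
%% A word with N letters yields 2^N variants.
variants(Word) when is_binary(Word) ->
    Lower = string:to_lower(binary_to_list(Word)),
    lists:usort([list_to_binary(S) || S <- variants_str(Lower)]).

variants_str([]) ->
    [[]];
variants_str([C | Rest]) ->
    Tails = variants_str(Rest),
    %% Non-letters have a single "case", so usort/1 collapses the pair.
    Cases = lists:usort([C, string:to_upper(C)]),
    [[X | T] || X <- Cases, T <- Tails].

%% Number of non-overlapping occurrences of any variant in one binary.
count_in_line(Bin, Variants) ->
    length(binary:matches(Bin, Variants)).

%% "Binary" mode: stream the file line by line with file:read_line/1.
count_binary(Path, Word) ->
    Pattern = binary:compile_pattern(variants(Word)),
    {ok, Fd} = file:open(Path, [read, binary, raw, read_ahead]),
    try
        count_lines(Fd, Pattern, 0)
    after
        file:close(Fd)
    end.

count_lines(Fd, Pattern, Acc) ->
    case file:read_line(Fd) of
        {ok, Line} ->
            count_lines(Fd, Pattern, Acc + count_in_line(Line, Pattern));
        eof ->
            Acc
    end.

%% "Unsafe" mode: load the whole file into memory, then match once.
count_unsafe(Path, Word) ->
    {ok, Data} = file:read_file(Path),
    count_in_line(Data, variants(Word)).
```

For example, `case_variants:variants(<<"ab">>)` returns `[<<"AB">>,<<"Ab">>,<<"aB">>,<<"ab">>]`. Compiling the variants once with `binary:compile_pattern/1` avoids rebuilding the search pattern for every line.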

Further discussion can be found in this pull request.
