Popular repositories

  1. MapReduceAlgorithms

    Data-Intensive Text Processing with MapReduce

    JavaScript 542 292

  2. Cloud9

    Cloud9 is a Hadoop toolkit for working with big data

    Java 234 136

  3. twitter-tools

    Twitter Tools

    Java 181 92

  4. Mr.LDA

    Scalable Topic Modeling using Variational Inference in MapReduce

    Java 132 94

  5. warcbase

    Warcbase is an open-source platform for managing and analyzing web archives

    Java 114 42

  6. Ivory

    A Hadoop toolkit for web-scale information retrieval research

    Java 71 38

593 contributions in the last year

Contribution activity

January 2017

Created an issue in lintool/bespin that received 1 comment

Write a Java tokenizer

Write a tokenizer so the definition of "word" is consistent across all examples...