Sep 18, 2011

  1. Hinrik Örn Sigurðsson

    v0.70

        - When using --train-fast, remove the "flushing cache" message when done
    
        - Word tokenizer:
            * Improve tokenization of email addresses
            * Use backspace instead of escape as a magic character when
              capitalizing text in multiple passes, since it's less likely to
              appear in tokens.
            * Preserve casing of words like "ATMs"

    authored September 18, 2011

Aug 23, 2011

  1. Hinrik Örn Sigurðsson

    Preserve casing of words like "ATMs"

    authored August 23, 2011
  2. Hinrik Örn Sigurðsson

    Reformat the latest Changes entries

    authored August 23, 2011
  3. Hinrik Örn Sigurðsson

    Allow double prefixes on IRC nicks

    authored August 23, 2011
  4. Hinrik Örn Sigurðsson

    Add README.pod and cover_db to MANIFEST.SKIP

    authored June 08, 2011

May 14, 2011

  1. Hinrik Örn Sigurðsson

    Mention tweetmix, remove dead HALBot link

    authored May 14, 2011

May 13, 2011

  1. Hinrik Örn Sigurðsson

    Use backspace as the magic string, not escape

    authored May 13, 2011
  2. Hinrik Örn Sigurðsson

    When using --train-fast, remove the "flushing cache" message when done

    authored May 13, 2011
  3. Hinrik Örn Sigurðsson

    Use consistent formatting in Changes

    authored May 13, 2011
  4. Hinrik Örn Sigurðsson

    Fix typo in comment

    authored May 13, 2011

May 09, 2011

  1. Hinrik Örn Sigurðsson

    Word tokenizer: Improve tokenization of email addresses

    authored May 09, 2011

May 07, 2011

  1. Hinrik Örn Sigurðsson

    v0.69

        - Scored engine: Prefer shorter replies, like MegaHAL/cobe do
    
        - Word tokenizer:
            * Improve matching/capitalization of filenames and domain names
            * Match timestamps as single tokens
            * Match IRC nicks (<foobar>, <@foobar>, etc) as single tokens
            * Match IRC channel names (#foo, &bar, +baz)
            * Match various prefixes and postfixes with numbers
            * Match "#1" and "#1234" as single tokens
            * Match </foo> as a single token
    
        - Depend on MouseX::Getopt 0.33 to fix test failures

    authored May 07, 2011
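
The tokenizer entries in v0.69 describe matching IRC nicks, channel names, "#1234"-style numbers and closing tags as single tokens. The following is only a rough sketch of that idea in Python, not Hailo's actual (Perl) tokenizer: the pattern names, the `tokenize` helper, and the simplified nick grammar (one optional prefix character, ASCII nick body) are all assumptions made for illustration.

```python
import re

# Hypothetical, simplified patterns; real IRC nick rules are broader.
NICK = r"<[ @+]?[A-Za-z_][\w-]*>"   # <foobar>, <@foobar>, < literal>
CLOSE_TAG = r"</\w+>"               # </foo>
CHANNEL = r"[#&+]\w+"               # #foo, &bar, +baz, and also #1 / #1234
WORD = r"\w+"                       # ordinary words
PUNCT = r"[^\w\s]"                  # any other single non-space character

# Earlier alternatives win, so the special forms are tried before
# falling back to plain words and punctuation.
TOKEN = re.compile("|".join([NICK, CLOSE_TAG, CHANNEL, WORD, PUNCT]))


def tokenize(text):
    """Split text into tokens, keeping IRC-style forms in one piece."""
    return TOKEN.findall(text)
```

For example, `tokenize("<@foobar> see #foo and #1234 </b>")` keeps `<@foobar>`, `#foo`, `#1234` and `</b>` whole instead of splitting them at the punctuation.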

May 06, 2011

  1. Hinrik Örn Sigurðsson

    Shorten this

    authored May 06, 2011
  2. Hinrik Örn Sigurðsson

    Remove unused variable

    authored May 06, 2011
  3. Hinrik Örn Sigurðsson

    Depend on MouseX::Getopt 0.33 to fix test failures

    authored May 06, 2011
  4. Hinrik Örn Sigurðsson

    Match </foo> as a single token

    authored May 06, 2011
  5. Hinrik Örn Sigurðsson

    Match IRC channel names (#foo, &bar, +baz)

    authored May 06, 2011
  6. Hinrik Örn Sigurðsson

    Forget the tabs matching

    It's not that useful anyway.

    authored May 06, 2011
  7. Hinrik Örn Sigurðsson

    Prevent some cache collisions here

    authored May 06, 2011
  8. Hinrik Örn Sigurðsson

    Silence some Perl syntax warnings

    authored May 06, 2011
  9. Hinrik Örn Sigurðsson

    Prettify the Changes file

    authored May 06, 2011
  10. Hinrik Örn Sigurðsson

    Match "#1" and "#1234" as single tokens

    authored May 06, 2011
  11. Hinrik Örn Sigurðsson

    Allow whitespace in timestamp tokens

    authored May 06, 2011
  12. Hinrik Örn Sigurðsson

    Match various prefixes and postfixes with numbers

    authored May 06, 2011
  13. Hinrik Örn Sigurðsson

    Match timestamps and IRC nicks

    I changed the way input is processed, so that we can match whitespace in
    tokens. This allows matching paths with spaces in them, as well as IRC
    nicks from irssi such as < literal>.

    authored May 05, 2011
  14. Hinrik Örn Sigurðsson

    Match tabs as tokens

    authored May 05, 2011
  15. Hinrik Örn Sigurðsson

    Improve matching/capitalization of filenames/domains

    authored May 05, 2011

May 04, 2011

  1. Hinrik Örn Sigurðsson

    Prefer shorter replies

    authored May 04, 2011
  2. Hinrik Örn Sigurðsson

    Remove dead code

    Due to how the tokenizer works, at least one of the tokens will always
    have normal spacing.

    authored May 04, 2011

May 03, 2011

  1. Hinrik Örn Sigurðsson

    v0.68

        - Speed up the learning of repetitive sentences by caching more
    
        - Added Hailo::Engine::Scored, which generates multiple replies (limited
          by time or number of iterations) and returns the best one. Based on
          code from Peter Teichman's Cobe project.
    
        - Fixed a bug which caused the tokenizer to be very slow at capitalizing
          replies which contain things like "script/osm-to-tilenumbers.pl"
    
        - Speed up learning quite a bit (up to 25%) by using more efficient SQL.
    
        - Add --train-fast to speed up learning by up to an additional 45% on
          large brains by using aggressive caching. This uses a lot of memory.
          Almost 600MB with SQLite on a 64bit machine for a brain which
          eventually takes 134MB on disk (trained from a 350k line IRC log).
    
        - Word tokenizer:
            * Preserve casing of Emacs key sequences like "C-u"
            * Don't capitalize words after ellipses (e.g. "Wait... what?")
            * When adding a full stop to paragraphs which end with a quoted word,
              add it inside the quotes (e.g. "I heard him say 'hello there.'")
            * Make it work correctly when the input has newlines

    authored May 03, 2011
  2. Hinrik Örn Sigurðsson

    Add --train-fast for great justice

    authored May 03, 2011
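
The v0.68 entry above describes Hailo::Engine::Scored as generating multiple replies, limited by time or number of iterations, and returning the best one. A minimal Python sketch of that selection loop might look like the following. It is not the Hailo or Cobe code: `best_reply`, its parameters, and the caller-supplied `generate` and `score` functions are hypothetical names chosen for illustration.

```python
import time


def best_reply(generate, score, max_iterations=None, max_seconds=None):
    """Call generate() until a limit is reached; return the top-scoring reply.

    generate: hypothetical callable returning one candidate reply.
    score:    hypothetical callable mapping a reply to a number (higher wins).
    """
    if max_iterations is None and max_seconds is None:
        raise ValueError("need an iteration limit, a time limit, or both")

    deadline = None if max_seconds is None else time.monotonic() + max_seconds
    best, best_score = None, float("-inf")
    count = 0

    while True:
        if max_iterations is not None and count >= max_iterations:
            break
        # Always try at least one candidate, even with a very short time limit.
        if deadline is not None and count > 0 and time.monotonic() >= deadline:
            break
        candidate = generate()
        count += 1
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score

    return best
```

A scoring function that penalizes length, such as `lambda reply: -len(reply.split())`, would make this loop prefer shorter replies, which is the behaviour the later v0.69 entry mentions.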

May 02, 2011

  1. Hinrik Örn Sigurðsson

    Fix typo in Changes

    authored May 02, 2011
  2. Hinrik Örn Sigurðsson

    Make the Word tokenizer work correctly when the input has newlines

    authored May 02, 2011
  3. Hinrik Örn Sigurðsson

    Separate these from the rest

    authored May 02, 2011
  4. Hinrik Örn Sigurðsson

    Rename this regex for clarity

    authored May 02, 2011