Commits on May 26, 2011
  1. Polished README

    committed May 26, 2011
  2. More docs about re2

    committed May 26, 2011
  3. Added re2 support

    committed May 26, 2011
  4. Small bugfixes with use of cursor

    committed May 26, 2011
Commits on Apr 26, 2011
  1. edu.wiki.modify.IndexPruner is not used in current configuration. Fixed README in this respect, also mentioning the difference between IndexModifier and MemIndexModifier.

    faraday committed Apr 26, 2011
Commits on Sep 9, 2010
  1. a) fixed: WINDOW_THRES should be 0.005 in pruning b) added MemIndexModifier for sorting TF-IDF vectors in memory (needs lots of RAM for recent dumps) c) returning -1 for unknown inputs d) unknown pairs treated as 0 relatedness in TestWordsim353 e) memory cleanups

    Çağatay Çallı committed Sep 9, 2010
Commits on Sep 1, 2010
  1. scanData reads disambig list from Wikiprep

    Çağatay Çallı committed Sep 1, 2010
Commits on Aug 31, 2010
  1. changed to a variable size (limit parameter) for concept vector retrieval

    Çağatay Çallı committed Aug 31, 2010
  2. added a task to retrieve concept vectors to ESA web service

    Çağatay Çallı committed Aug 31, 2010
Commits on Aug 30, 2010
Commits on Aug 26, 2010
  1. converted IndexModifier+IndexPruner to a faster solution using 1) sort utility 2) byte-encoded doc-score vectors in DB

    Çağatay Çallı committed Aug 26, 2010
Commits on Aug 23, 2010
  1. a) don't compute Spearman, just collect ESA scores for Wordsim-353 b) ESASearcher returns 0 for unknown terms instead of -1

    Çağatay Çallı committed Aug 23, 2010
Commits on Aug 22, 2010
  1. a) fixed handling of no results for search query, return null b) added Spearman correlation computation on Wordsim-353

    Çağatay Çallı committed Aug 22, 2010
Commits on Aug 21, 2010
  1. minor fix for tokenization

    faraday committed Aug 21, 2010
  2. actually fixed filter bug about reading term from buffer

    Çağatay Çallı committed Aug 21, 2010
  3. fixed bug in custom filter

    Çağatay Çallı committed Aug 21, 2010
  4. introduced a custom filter to indexing code, filtering mixed alphanumerics etc

    Çağatay Çallı committed Aug 21, 2010
Commits on Aug 20, 2010
  1. added missing UTF-8 option in connection url

    Çağatay Çallı committed Aug 20, 2010
Commits on Aug 19, 2010
  1. fix for avoiding non-existing links of results

    Çağatay Çallı committed Aug 19, 2010
  2. fixed UTF-8 problem in pruning

    Çağatay Çallı committed Aug 19, 2010
  3. added combined feature extraction and related similarity test, and custom tokenizer according to delim list of Gabrilovich et al

    Çağatay Çallı committed Aug 19, 2010
  4. Merge branch 'master' of git@github.com:faraday/wikiprep-esa

    Çağatay Çallı committed Aug 19, 2010
  5. minor tweaks in analyzer: no length filter, 3xPorter stem

    Çağatay Çallı committed Aug 19, 2010
Commits on Aug 18, 2010
  1. tokenizer applies stemming, word filter counts unique tokens, includes regex of Gabrilovich et al

    faraday committed Aug 18, 2010
  2. Merge branch 'master' of git@github.com:faraday/wikiprep-esa

    Çağatay Çallı committed Aug 18, 2010
  3. modified tokenizer delimiter characters

    Çağatay Çallı committed Aug 18, 2010
Commits on Aug 17, 2010
  1. updated README

    faraday committed Aug 17, 2010
  2. part of previous commit

    faraday committed Aug 17, 2010
Commits on Aug 16, 2010