branch: develop
Commits on Dec 2, 2011
  1. Radim Řehůřek

    up version (0.8.3)

    piskvorky authored
  2. Radim Řehůřek

    doc update

    piskvorky authored
  3. Radim Řehůřek
Commits on Dec 1, 2011
  1. Radim Řehůřek

    reworked LDA state handling

    piskvorky authored
    * clean up of program flow
    * as a by-product, LDA now consumes less memory
  2. Radim Řehůřek

    deleted setup.cfg

    piskvorky authored
    * not sure why it was originally added, but it's empty now...
Commits on Nov 30, 2011
  1. Radim Řehůřek
  2. Radim Řehůřek

    added extra asserts to LDA

    piskvorky authored
Commits on Nov 26, 2011
  1. Radim Řehůřek

    optimize shard re-open

    piskvorky authored
Commits on Nov 23, 2011
  1. Radim Řehůřek

    Merge pull request #66 from piskvorky/issue65

    piskvorky authored
    re #65: fixed the sharding bug
Commits on Nov 16, 2011
  1. Radim Řehůřek

    re #65: fixed the sharding bug

    piskvorky authored
    * also made similarity unit tests more extensive
Commits on Nov 3, 2011
  1. Radim Řehůřek

    Merge pull request #62 from Dieterbe/develop

    piskvorky authored
    Remove stale comment
  2. Radim Řehůřek

    Merge pull request #63 from Dieterbe/clarify-num_best-output

    piskvorky authored
    clarify the output format of index[query]
  3. clarify the output format of index[query], wrt special cases such as 'empty vectors'

    Dieter Plaetinck authored
  4. Remove stale comment

    Dieter Plaetinck authored
    reusing a variable set in a loop is not a hack, merely good usage of
    python scope semantics.
Commits on Oct 31, 2011
  1. Radim Řehůřek

    make jquery load from google CDN

    piskvorky authored
    * instead of local _static/jquery.js file
  2. Radim Řehůřek

    added css button gradient

    piskvorky authored
Commits on Oct 30, 2011
  1. Radim Řehůřek

    up version: 0.8.2

    piskvorky authored
  2. Radim Řehůřek

    addthis.js widget now loads asynchronously (was: damn slow)

    piskvorky authored
    * by using the same javascript trick google analytics uses
    * also cleaned up html templates a bit
    * fixed a bug that made gensim site render weirdly under Internet Explorer
  3. Radim Řehůřek

    updated gensim dependencies

    piskvorky authored
    * numpy < 1.3 no longer works, because gensim uses `mmap_mode` in numpy.load()
    * all tox tests pass now
    * closes #61
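The `mmap_mode` argument mentioned above lets `numpy.load()` map an array file into memory instead of reading it whole. The following is a minimal sketch of that call, not gensim's own code; the helper name `save_and_mmap` and the file name `index.npy` are made up for illustration.

```python
import os
import tempfile

import numpy as np


def save_and_mmap(array, path):
    """Write `array` in .npy format, then reopen it as a read-only memory map."""
    np.save(path, array)
    # mmap_mode='r' returns a numpy.memmap: data is paged in from disk on access.
    return np.load(path, mmap_mode='r')


# Hypothetical file location, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "index.npy")
original = np.arange(12, dtype=np.float32).reshape(3, 4)
mapped = save_and_mmap(original, path)

print(type(mapped).__name__, mapped.shape)
```

Because the mapped array is opened read-only, attempts to assign into it fail; a writable mapping would need a different `mmap_mode` value such as `'r+'`.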
  4. Radim Řehůřek

    re #61: using `tox` for automated testing

    piskvorky authored
    * use one recent environment (py2.7 + latest numpy and scipy)
    * plus one oldest supported env (py2.5 + numpy 1.3 + scipy 0.7)
    * tox doesn't support scipy as dependency, so the deps are pre-installed in their respective site-packages
Commits on Oct 29, 2011
  1. Radim Řehůřek

    renamed README.rst (was: README.txt)

    piskvorky authored
    * closes #60
  2. Radim Řehůřek

    removed simserver from the sources

    piskvorky authored
    * will appear as a separate project
    * see http://permalink.gmane.org/gmane.comp.ai.gensim/645
  3. Radim Řehůřek

    doc update

    piskvorky authored
    * added `about` section
  4. Radim Řehůřek
  5. Radim Řehůřek
  6. Radim Řehůřek

    re #59: improved SVD accuracy

    piskvorky authored
    * the fix was actually simple: I just missed algo 4.4 in Halko's paper, which addresses exactly this issue
    * thanks to Mark Tygert for pointing me to the fix
    * closes #59
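For readers unfamiliar with the reference: to my understanding, algorithm 4.4 in the paper by Halko, Martinsson and Tropp is randomized subspace iteration, which re-orthonormalizes the basis between power iterations to limit the loss of accuracy from rounding errors. The sketch below illustrates that idea under this assumption; it is not gensim's implementation, and the function name and parameters are hypothetical.

```python
import numpy as np


def randomized_range_finder(A, num_vectors, power_iters=2, seed=0):
    """Return an orthonormal basis Q approximating the range of A.

    Illustrative sketch of randomized subspace iteration: the basis is
    re-orthonormalized with QR after every multiplication by A or A.T.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    omega = rng.standard_normal((n, num_vectors))  # random test matrix
    Q, _ = np.linalg.qr(A @ omega)
    for _ in range(power_iters):
        Q, _ = np.linalg.qr(A.T @ Q)  # orthonormalize in the row space
        Q, _ = np.linalg.qr(A @ Q)    # and again in the column space
    return Q


# Small demo on an exactly rank-5 matrix (synthetic data).
rng = np.random.default_rng(1)
A = rng.standard_normal((50, 5)) @ rng.standard_normal((5, 30))
Q = randomized_range_finder(A, num_vectors=8)
residual = np.linalg.norm(A - Q @ (Q.T @ A)) / np.linalg.norm(A)
print("relative residual:", residual)
```

An approximate SVD can then be obtained from the much smaller matrix `Q.T @ A`; that step is omitted here.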
Commits on Oct 27, 2011
  1. Radim Řehůřek

    LDA now interprets topics differently

    piskvorky authored
    * a big conceptual change of (the interpretation of) LDA results -- although only 2 lines of code + 1 line in unittest
    * LDA model now interprets `lambda` directly as the desired word-topic distributions
    * and similarly, `gamma` are now directly topic-document distributions
    * was: `lambda` and `gamma` were only parameters to dirichlet, so that the desired topics were e^{E[log lambda]}
    * the previous interpretation underrepresented components with a lot of variance (e^{E[log x]} <= E[x], by Jensen's inequality). Under both interpretations, the resulting vector is normalized to sum to 1, so although the results differ numerically, both make sense and neither is "wrong"
    * this change was initiated by seeing Matt Hoffman's response on the topic-models mailing list: https://lists.cs.princeton.edu/pipermail/topic-models/2011-October/001600.html
    * closes #57
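The difference between the two interpretations can be checked numerically. For a Dirichlet with parameters `lambda`, E[x_i] = lambda_i / sum(lambda) and E[log x_i] = psi(lambda_i) - psi(sum(lambda)), where psi is the digamma function. The sketch below uses only the standard library, with a hand-rolled digamma approximation and made-up parameter values; it is an illustration, not gensim's code.

```python
import math


def digamma(x):
    """Approximate the digamma function psi(x) for x > 0."""
    result = 0.0
    # The recurrence psi(x) = psi(x + 1) - 1/x shifts x into the asymptotic range.
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # Asymptotic series: ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    result += math.log(x) - 0.5 / x - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result


def expected_value(params):
    """E[x] under Dirichlet(params): the new interpretation."""
    total = sum(params)
    return [p / total for p in params]


def exp_expected_log(params):
    """e^{E[log x]} under Dirichlet(params): the old interpretation, unnormalized."""
    psi_total = digamma(sum(params))
    return [math.exp(digamma(p) - psi_total) for p in params]


def normalize(vec):
    total = sum(vec)
    return [v / total for v in vec]


lam = [0.1, 5.0, 50.0]  # hypothetical parameters; the first has high relative variance
new_topic = expected_value(lam)
old_topic = normalize(exp_expected_log(lam))
print("new:", new_topic)
print("old:", old_topic)
```

The component with the smallest parameter receives far less weight under the old interpretation than under the new one, which is the underrepresentation the commit message describes.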
  2. Radim Řehůřek

    updated documentation

    piskvorky authored
    * change urls to radimrehurek.com
    * fixed LsiModel.print_topics, which ignored its num_topics parameter
    * fixed one forgotten SessionServer import
Commits on Oct 10, 2011
  1. Radim Řehůřek
  2. Radim Řehůřek

    make simserver tests conditional

    piskvorky authored
    * no unittesting if importing SessionServer fails
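One way to make tests conditional on an optional import is sketched below. It uses `unittest.skipUnless`, which was not available on the oldest Python versions gensim supported at the time, so the actual commit likely did this differently; the test class and its single check are hypothetical.

```python
import unittest

try:
    import simserver  # optional package; may not be installed
    HAVE_SIMSERVER = True
except ImportError:
    HAVE_SIMSERVER = False


@unittest.skipUnless(HAVE_SIMSERVER, "simserver is not installed")
class SessionServerTest(unittest.TestCase):
    """Runs only when the optional import above succeeded."""

    def test_class_is_exposed(self):
        self.assertTrue(hasattr(simserver, "SessionServer"))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(SessionServerTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

When the import fails, the whole class is reported as skipped rather than as an error, so the rest of the test suite still passes.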
  3. Radim Řehůřek

    removed automatic simserver import from `similarities/__init__.py`

    piskvorky authored
    * because simserver imports sqlitedict, an optional dependency
    * `import gensim` imported simserver via `__init__`, which made sqlitedict non-optional (a bug)
    * github issue #55
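The general pattern behind this fix is to defer the import of an optional dependency until the feature that needs it is used, so that importing the top-level package never fails. A minimal sketch follows; `load_optional` and `open_session_store` are hypothetical helpers, not gensim functions.

```python
import importlib


def load_optional(module_name):
    """Import a module on demand; return None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def open_session_store(path):
    """Hypothetical feature that needs sqlitedict only when actually called."""
    sqlitedict = load_optional("sqlitedict")
    if sqlitedict is None:
        raise RuntimeError("this feature requires the optional 'sqlitedict' package")
    return sqlitedict.SqliteDict(path)


# Importing this module succeeds whether or not sqlitedict is present.
print(load_optional("json") is not None)
```

With this layout, users who never call the optional feature never need the extra package.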
Commits on Oct 7, 2011
  1. Radim Řehůřek
  2. Radim Řehůřek
Commits on Oct 5, 2011
  1. Radim Řehůřek

    up version: 0.8.1

    piskvorky authored
  2. Radim Řehůřek

    fixed logger object names

    piskvorky authored