branch: develop
Commits on Aug 20, 2011
  1. @piskvorky

    added utils.upload_chunked fnc

    piskvorky authored
    * uploads a corpus to SimServer in smaller chunks
  2. @piskvorky
  3. @piskvorky

    speed up of lsi[corpus]

    piskvorky authored
    * uses sparse * dense multiplication (was: dense * dense)
    * about 5x speed-up :-)
    * lowering lsi.num_topics (= slicing projection.u) is now less efficient, because the array order is always wrong. No big deal, few people manually decrease num_topics anyway. Could probably be improved by clever use of sparsetools.matvecs...
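The sparse * dense idea from the `lsi[corpus]` speed-up above can be sketched in plain Python. This is only an illustration under stated assumptions, not gensim's implementation: the helper names `project_dense` and `project_sparse` and the tiny matrix `u` are hypothetical, with `u` standing in for `projection.u` as a terms x topics matrix.

```python
def project_dense(u, doc_dense):
    """Dense * dense: every term is visited, including the zero entries."""
    num_terms, num_topics = len(u), len(u[0])
    return [sum(u[t][k] * doc_dense[t] for t in range(num_terms))
            for k in range(num_topics)]


def project_sparse(u, doc_sparse):
    """Sparse * dense: only the non-zero (term_id, weight) pairs are visited."""
    num_topics = len(u[0])
    result = [0.0] * num_topics
    for term_id, weight in doc_sparse:
        row = u[term_id]
        for k in range(num_topics):
            result[k] += row[k] * weight
    return result


# Hypothetical 4-term x 2-topic matrix, a stand-in for projection.u.
u = [[1.0, 0.0],
     [0.0, 1.0],
     [0.5, 0.5],
     [2.0, -1.0]]

doc_sparse = [(0, 2.0), (3, 1.0)]   # bag-of-words style: (term_id, weight)
doc_dense = [2.0, 0.0, 0.0, 1.0]    # the same document with zeros spelled out

topics_sparse = project_sparse(u, doc_sparse)
topics_dense = project_dense(u, doc_dense)
```

Both functions return the same topic vector; the sparse version simply skips the terms that do not occur in the document, which is where the saving comes from when documents are short relative to the vocabulary.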
Commits on Aug 3, 2011
  1. @piskvorky
  2. @piskvorky

    distributed code now uses new Pyro4 (was: Pyro 4.1)

    piskvorky authored
    * plus documentation update
Commits on Jul 16, 2011
  1. @piskvorky

    added option of query = a specific index doc

    piskvorky authored
    * to find how similar document #123 in the index is to every other document in the index, call `index.similarity_by_id(123)`
    * the query is then just a number, 0 <= query < len(index), not a full document/vector as in the standard `index[query]`
    * implemented in the Similarity class
  2. @piskvorky
  3. @piskvorky
  4. @piskvorky
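The "query = a specific index doc" option from the first Jul 16 commit can be illustrated with a toy index. This is a minimal sketch, assuming cosine similarity over documents stored as `{term_id: weight}` dicts. `TinyIndex` is a hypothetical stand-in, not gensim's `Similarity` class; only the names `similarity_by_id` and `index[query]` come from the commit message.

```python
import math


class TinyIndex:
    """Toy similarity index; a hypothetical stand-in, NOT gensim's Similarity."""

    def __init__(self, docs):
        # Each document is a dict mapping term_id -> weight.
        self.docs = list(docs)

    def __len__(self):
        return len(self.docs)

    @staticmethod
    def _cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm = (math.sqrt(sum(w * w for w in a.values()))
                * math.sqrt(sum(w * w for w in b.values())))
        return dot / norm if norm else 0.0

    def __getitem__(self, query):
        """Standard query: a full document vector."""
        return [self._cosine(query, doc) for doc in self.docs]

    def similarity_by_id(self, docpos):
        """Query by position: 0 <= docpos < len(index)."""
        if not 0 <= docpos < len(self):
            raise IndexError("document position out of range: %r" % (docpos,))
        return self[self.docs[docpos]]


index = TinyIndex([{0: 1.0}, {0: 1.0, 1: 1.0}, {1: 2.0}])
sims = index.similarity_by_id(0)  # similarity of document #0 to every indexed document
```

In this sketch, querying by position simply reuses the stored vector, so `index.similarity_by_id(1)` and `index[{0: 1.0, 1: 1.0}]` give the same result.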
Commits on Jul 7, 2011
  1. @piskvorky
Commits on Jul 6, 2011
  1. @piskvorky
Commits on Jun 28, 2011
  1. @piskvorky
  2. @piskvorky
  3. @piskvorky
  4. @piskvorky

    work around strange Pyro packaging (version numbers)

    piskvorky authored
    * to be removed once the new Pyro (>=4.4) is integrated
Commits on Jun 27, 2011
  1. @piskvorky

    added alias any2utf8 for to_utf8

    piskvorky authored
    * and any2unicode for to_unicode
  2. @piskvorky

    Merge pull request #44 from dedan/develop

    piskvorky authored
    fix the module import when linking to the git root instead of module
  3. @dedan

    fix the module import when linking to the git root instead of module

    dedan authored
    For some applications I need to link to the gensim folder, which is also the root of the repository. This script helps Python find the actual source code of the module; it had to be changed because Radim moved the source within the repo.
Commits on Jun 25, 2011
  1. @piskvorky

    fixed one PEP8 orphan

    piskvorky authored
Commits on Jun 22, 2011
  1. @piskvorky
  2. @piskvorky
  3. @piskvorky

    Merge branch 'develop' of github.com:piskvorky/gensim into develop

    piskvorky authored
    Conflicts:
    	gensim/test/test_models.py
  4. @piskvorky

    Merge pull request #40 from Dieterbe/develop

    piskvorky authored
    Rename variable "chunks" to more sensible "chunksize"
  5. Rename variable "chunks" to more sensible "chunksize"

    Dieter Plaetinck authored
  6. @piskvorky

    removed print_debug calls from the LSI unittest

    piskvorky authored
    * was causing `invalid value in divide` warnings in numpy
    * see http://groups.google.com/group/gensim/browse_thread/thread/45c1c9efe91ce8d0
Commits on Jun 19, 2011
  1. @piskvorky
  2. @piskvorky

    up version: 0.8.0rc1

    piskvorky authored
  3. @piskvorky
  4. @piskvorky
Commits on Jun 18, 2011
  1. @piskvorky

    improved doc strings

    piskvorky authored
Commits on Jun 16, 2011
  1. @piskvorky
  2. @piskvorky

    Added chunking for lsi[corpus] transformation (about 3x faster)

    piskvorky authored
    * before, lsi[corpus] was just syntactic sugar for (lsi[doc] for doc in corpus)
    * now, lsi[corpus] proceeds in chunks of documents (256 by default) and transforms each entire chunk at once
    * the reason: transforming a chunk (one matrix * matrix multiply) is faster than 256 single-document transforms (256 matrix * vector multiplies), because of caching and related effects
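The chunked `lsi[corpus]` scheme described above can be sketched as a generator that groups a streamed corpus into fixed-size chunks and hands each chunk to a batch transform. This is an assumption-laden illustration, not the gensim code: `grouper`, `transform_corpus` and `double_chunk` are hypothetical names, and `double_chunk` is a placeholder for the real per-chunk matrix * matrix multiply.

```python
from itertools import islice


def grouper(corpus, chunksize=256):
    """Yield successive lists of at most `chunksize` documents."""
    it = iter(corpus)
    while True:
        chunk = list(islice(it, chunksize))
        if not chunk:
            return
        yield chunk


def transform_corpus(corpus, transform_chunk, chunksize=256):
    """Transform a corpus chunk by chunk, yielding one result per document."""
    for chunk in grouper(corpus, chunksize):
        # One call per chunk; in the real setting this would be a single
        # matrix * matrix multiply instead of len(chunk) matrix * vector ones.
        for result in transform_chunk(chunk):
            yield result


chunk_sizes = []


def double_chunk(chunk):
    """Placeholder batch transform: records the chunk size, doubles each value."""
    chunk_sizes.append(len(chunk))
    return [2 * doc for doc in chunk]


# 600 dummy "documents" (plain integers) streamed through in chunks of 256.
results = list(transform_corpus(range(600), double_chunk))
```

The caller still sees one transformed document at a time, so the chunking stays an internal optimisation and the corpus never has to fit in memory as a whole.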
Commits on Jun 15, 2011
  1. @piskvorky
  2. @piskvorky
  3. @piskvorky