Merge pull request #1 from piskvorky/batch_sentences

Simplify job loop + merge latest gensim
piskvorky · Oct 20, 2015 · 14888eb
2 parents 9f95ff2 + e35e2ee
commit 14888eb

Showing 32 changed files with 3,194 additions and 3,305 deletions.
diff --git a/CHANGELOG.txt b/CHANGELOG.txt
@@ -1,12 +1,26 @@
 Changes
 =======
 
+NEXT
 
-0.12.2
+* Make show_topics return value consistent across models (Christopher Corley, #448)
+  - All models with the `show_topics` method should return a list of
+    `(topic_number, topic)` tuples, where `topic` is a list of
+    `(word, probability)` tuples.
+  - This is a breaking change that affects users of the `LsiModel`, `LdaModel`,
+  and `LdaMulticore` that may be reliant on the old tuple layout of
+  `(probability, word)`.
 
+0.12.2, 19/09/2015
+
+* tutorial on text summarization (Ólavur Mortensen, #436)
+* more flexible vocabulary construction in word2vec & doc2vec (Philipp Dowling, #434)
 * added support for sliced TransformedCorpus objects, so that after applying (for instance) TfidfModel the returned corpus remains randomly indexable. (Matti Lyra, #425)
 * changed the LdaModel.save so that a custom `ignore` list can be passed in (Matti Lyra, #331)
 * added support for NumPy style fancy indexing to corpus objects (Matti Lyra, #414)
+* py3k fix in distributed LSI (spacecowboy, #433)
+* Windows fix for setup.py (#428)
+* fix compatibility for scipy 0.16.0 (#415)
 
 0.12.1, 20/07/2015
 

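Note on the `show_topics` entry above: the new layout is a list of `(topic_number, topic)` tuples, with each `topic` a list of `(word, probability)` tuples. The sketch below shows one way downstream code might adapt data stored in the old `(probability, word)` order. It is plain Python and does not call gensim. The helper name `to_new_layout` is hypothetical, and it assumes the old output was a plain list of topics and that topic numbers can be taken from list position, which may not hold if only a subset of topics was requested.

```python
def to_new_layout(old_topics):
    """Convert old-style topics to the layout described in the changelog.

    Assumed old layout: [[(probability, word), ...], ...]
    New layout:         [(topic_number, [(word, probability), ...]), ...]
    Topic numbers are taken from list position (an assumption).
    """
    return [
        (topic_number, [(word, probability) for probability, word in topic])
        for topic_number, topic in enumerate(old_topics)
    ]


# Made-up values for illustration only, not real model output.
old_style = [
    [(0.5, "graph"), (0.25, "trees")],
    [(0.4, "user"), (0.3, "system")],
]
new_style = to_new_layout(old_style)

for topic_number, topic in new_style:
    for word, probability in topic:
        print(topic_number, word, probability)
```

Code that unpacked each pair as `probability, word` would need the same swap when reading output from `LsiModel`, `LdaModel`, or `LdaMulticore` after this change.
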
diff --git a/README.rst b/README.rst
@@ -19,7 +19,7 @@ Target audience is the *natural language processing* (NLP) and *information retr
 Features
 ---------
 
-* All algorithms are **memory-independent** w.r.t. the corpus size (can process input larger than RAM),
+* All algorithms are **memory-independent** w.r.t. the corpus size (can process input larger than RAM, streamed, out-of-core),
 * **Intuitive interfaces**
 
   * easy to plug in your own input corpus/datastream (trivial streaming API)
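
Note on the "streamed, out-of-core" wording above: the idea is that a corpus is any iterable that yields one document at a time, so the whole input never has to sit in RAM. A minimal sketch of that pattern follows. The class name `StreamedCorpus` and the one-document-per-line file format are assumptions made for illustration, not part of the library.

```python
import os
import tempfile


class StreamedCorpus:
    """Hypothetical corpus that yields one tokenized document per line of a file."""

    def __init__(self, path):
        self.path = path

    def __iter__(self):
        # Reopen the file on every pass so the corpus can be iterated repeatedly;
        # only the current line is held in memory.
        with open(self.path, encoding="utf-8") as handle:
            for line in handle:
                yield line.lower().split()


# Build a tiny throwaway file to iterate over.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, encoding="utf-8") as tmp:
    tmp.write("Human machine interface\nGraph minors survey\n")
    corpus_path = tmp.name

corpus = StreamedCorpus(corpus_path)
first_pass = list(corpus)
second_pass = list(corpus)  # a second full pass works because __iter__ reopens the file
os.remove(corpus_path)
```

Using a class with `__iter__` instead of a bare generator matters when an algorithm needs more than one pass over the data, since a generator is exhausted after the first pass.
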