Releases: paulhoule/infovore

Stable release for Weekly Freebase processing

10 Nov 22:41

This is a stable release that can be used for processing Freebase for a few weeks while I do more speculative development work. The only changes are #64 (Spin Off Centipede) and #74 (Delete Chopper Project); these are documented here.

Stop speculative execution for pse3

03 Nov 21:35

This release prevents sporadic failures that occur when real multiple output paths are used together with speculative execution. Changes are documented in milestone t20131103.

Fix $XXXX escapes in IRIs

28 Oct 19:19

This release fixes problems #71 (unescape $XXXX escapes in IRIs) and #72 (really turn off speculative execution in sieve3). The latter fix prevents sporadic crashes when running on big clusters.
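As a rough illustration of what unescaping $XXXX sequences can look like, here is a minimal sketch. It assumes that each escape is a `$` followed by exactly four hexadecimal digits naming one UTF-16 code unit; the class and method names are hypothetical and are not taken from the infovore code base.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of infovore.
final class DollarEscapes {
    // Assumption: an escape is '$' followed by exactly four hex digits.
    private static final Pattern ESCAPE = Pattern.compile("\\$([0-9A-Fa-f]{4})");

    // Replaces every $XXXX escape with the character it encodes and
    // copies all other text through unchanged.
    static String unescape(String iri) {
        Matcher m = ESCAPE.matcher(iri);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(iri, last, m.start());
            out.append((char) Integer.parseInt(m.group(1), 16));
            last = m.end();
        }
        out.append(iri, last, iri.length());
        return out.toString();
    }
}
```

A `$` that is not followed by four hex digits is left alone. A production version would also need to decide what to do when the decoded character is not legal inside an IRI; this sketch does not attempt that.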

Local variables in flows and filter out $[0-9]{4} escapes

21 Oct 20:17

Major changes in this release are the introduction of local variables in flows (#55) and strict filtering of the invalid Unicode escape sequences endemic to Freebase (#61), as well as reliability and quality improvements such as #66 and #62. Although it is not reflected in the open source code, progress has been made on :BaseKB Now productization (#59): there is now an RSS feed.

See the matching milestone for a complete list of changes.

Now publishing to maven central

15 Oct 20:32

This release contains configuration changes so that we can push releases to Maven Central.

Multi-Step Flows

14 Oct 14:16

Issues fixed in this release are written up here.

The major new feature is #32, which lets a number of Hadoop jobs be grouped into a "flow", named after the "job flow" concept in the Amazon EMR API. In the case of :BaseKB Now production, all of the steps are submitted as a unit to Amazon EMR so that a single cluster does all the work, rather than starting a new cluster for each step. This improves speed, reliability, and cost.

Not using EMR? No problem. Haruhi will submit the jobs sequentially to your cluster.
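The sequential case can be pictured with a small sketch. This is not Haruhi's actual code: the `SequentialFlow` class is hypothetical, each step is modelled as an `IntSupplier` that returns an exit code, and stopping at the first non-zero exit code is an assumption made for the example.

```java
import java.util.List;
import java.util.function.IntSupplier;

// Hypothetical illustration of running the steps of a flow one after another.
final class SequentialFlow {
    // Runs the steps in order. Returns the index of the first step whose
    // exit code is non-zero, or -1 if every step succeeded.
    // Assumption: later steps are skipped once a step fails.
    static int run(List<IntSupplier> steps) {
        for (int i = 0; i < steps.size(); i++) {
            int exitCode = steps.get(i).getAsInt();
            if (exitCode != 0) {
                return i;
            }
        }
        return -1;
    }
}
```

Each `IntSupplier` would wrap the submission of one Hadoop job and block until it finishes, so a later step only starts once the output of the earlier one exists.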

This release has some minor bug fixes and also marks increasing process maturity: the maven-release plugin is now integrated (#44) and we are now using Travis CI to monitor build quality (#45).

Major fix -- fix issue #46 causing data loss in pse3

06 Oct 21:55

This release fixes a bug (issue #46) that was caused by an incorrectly implemented comparator function.
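The release note does not say how the comparator was wrong, so the following is only a generic illustration of how a faulty comparator can lose data. It assumes a hypothetical `Triple` class and uses a sorted set in place of Hadoop's sort and group phase: a comparator that reports distinct records as equal makes them collapse into one.

```java
import java.util.Comparator;
import java.util.TreeSet;

// Hypothetical illustration; none of these names come from infovore.
final class TripleOrder {
    static final class Triple {
        final String subject, predicate, object;
        Triple(String s, String p, String o) { subject = s; predicate = p; object = o; }
    }

    // Flawed: two triples that share a subject compare as equal.
    static final Comparator<Triple> SUBJECT_ONLY =
        Comparator.comparing((Triple t) -> t.subject);

    // Consistent: only triples that agree on all three fields compare as equal.
    static final Comparator<Triple> FULL =
        Comparator.comparing((Triple t) -> t.subject)
            .thenComparing((Triple t) -> t.predicate)
            .thenComparing((Triple t) -> t.object);

    // Counts how many triples survive when a sorted set is keyed by the comparator.
    static int countDistinct(Comparator<Triple> order, Triple... triples) {
        TreeSet<Triple> set = new TreeSet<>(order);
        for (Triple t : triples) {
            set.add(t);
        }
        return set.size();
    }
}
```

With two triples that differ only in predicate, the subject-only ordering keeps one of them and the full ordering keeps both.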

t20130927 -- spring workaround for sieve3

27 Sep 16:18

This fixes some problems that have plagued recent releases, most importantly an incompatibility between Spring and our JAR packaging.

t20130920: add ranSampler app, make fields of PrimitiveTriple non-public, add …

20 Sep 21:19

Adds the ranSampler app to bakemono and makes all fields of PrimitiveTriple private.

sieve3 Horizontal decomposition of Freebase

17 Sep 20:01

This version adds the sieve3 tool that partitions RDF data, such as Freebase, into mutually exclusive subsets.
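One simple way to get mutually exclusive subsets is to test each triple against an ordered list of rules and send it to the first rule that matches. The sketch below shows that idea only; it is not sieve3's actual rule set or API, and the class name, rule names, and the choice to classify on the predicate IRI are all assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical first-match partitioner; not the sieve3 implementation.
final class FirstMatchSieve {
    // LinkedHashMap preserves insertion order, so rules are tried in the
    // order in which they were registered.
    private final Map<String, Predicate<String>> rules = new LinkedHashMap<>();

    FirstMatchSieve rule(String name, Predicate<String> test) {
        rules.put(name, test);
        return this;
    }

    // Returns the name of the first matching rule, or "other" if none match.
    // Every input lands in exactly one subset, so the subsets cannot overlap.
    String classify(String predicateIri) {
        for (Map.Entry<String, Predicate<String>> e : rules.entrySet()) {
            if (e.getValue().test(predicateIri)) {
                return e.getKey();
            }
        }
        return "other";
    }
}
```

Because classification stops at the first match, a triple that would satisfy several rules still goes to one output only, and the catch-all "other" subset ensures nothing is dropped.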
