Releases: damballa/parkour

0.6.2

05 Feb 19:00
Overview

A mixed bag of bug fixes, internal refactorings, and new features.

User-visible changes
  • Support passing classes as parameters to task functions and similar entry points.
  • Interface for defining input formats from Clojure.
  • Added range dseq for jobs over ranges of integers.
  • Added dval dseq for jobs over the content of a dval.
  • Correct local processing of zero-split dseqs.
  • Ensure output paths exist on job success.

0.6.1

23 Nov 05:30
Overview

A minor release primarily fixing a handful of bugs.

User-visible changes
  • Ignore Errors during namespace loading.
  • NLineInputFormat dseq default source shape is :vals.
  • Retry failing job-status polls with exponential backoff.
  • Correctly reduce-by empty collections.
  • Add and use reducers version of concat.
  • Coerce dseq/move! dst-path to a Hadoop Path.
  • Fix busted pr/distinct-by.
  • Fix with-meta for dcpaths.
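
The exponential-backoff retry mentioned in the list above can be illustrated with a minimal sketch. This is not Parkour's implementation: the function name `retry-with-backoff`, its option keys, and the default values are all hypothetical and chosen only for illustration, using nothing beyond clojure.core.

```clojure
;; Hypothetical helper, not part of Parkour's API.
(defn retry-with-backoff
  "Call f (a zero-argument function) until it returns without throwing.
  The delay between attempts starts at base-ms and doubles after each
  failure, capped at max-ms. Once max-attempts calls have failed, the
  last exception is rethrown."
  [f {:keys [max-attempts base-ms max-ms]
      :or {max-attempts 5, base-ms 100, max-ms 5000}}]
  (loop [attempt 1, delay-ms base-ms]
    ;; recur is not allowed inside try, so capture the outcome as data
    ;; and decide what to do outside the try form.
    (let [result (try
                   {:value (f)}
                   (catch Exception e
                     (if (< attempt max-attempts)
                       {:error e}
                       (throw e))))]
      (if (contains? result :value)
        (:value result)
        (do
          (Thread/sleep (long delay-ms))
          (recur (inc attempt) (min max-ms (* 2 delay-ms))))))))

;; Usage: a stand-in poll that fails twice, then succeeds.
(def calls (atom 0))

(defn flaky-poll []
  (if (< (swap! calls inc) 3)
    (throw (ex-info "poll failed" {}))
    :succeeded))

(retry-with-backoff flaky-poll {:base-ms 1, :max-ms 10})
```

Capping the delay keeps a long outage from producing unbounded waits, while the doubling avoids hammering a service that is already struggling.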

0.6.0

23 Nov 05:29
Overview

A significant release with a few breaking changes and some powerful new features. The most important new feature is dvals, a value-oriented mechanism for delivering data via the distributed cache.

Breaking changes
  • Deprecate direct invocation of source-shaping functions.
  • Normalize shuffle & sink type/schema arguments to vectors of such.
  • TextInputFormat dseq defaults to :vals source shape.
  • AvroKeyInputFormat dseq defaults to :keys source shape.
  • AvroKeyOutputFormat dsink defaults to :keys sink shape.
Other changes
  • Allow shorthand partition shuffle to specify only key class.
  • Add dseq/input-paths for determining dseq input paths.
  • Support direct Avro input via Hadoop filesystem paths.
  • Add cser namespace; de/serialize vars as task arguments.
  • Add distributed values (dvals) and documentation.
  • Modify file dsinks to allow implicit transient output paths.
  • Allow csteps to specify default source/sink shapes.
  • Allow in-memory dseqs to specify default source shape.
  • Wait for Hadoop 1.x FS cleanup hook to complete on exit.
  • Add fexecute function to job graph API.
  • Use combiner as reducer when reducer not later specified.
  • Extend reducers namespace of reducer-based helpers.
  • Add toolbox namespace of common task functions.
  • Make tuple sources r/fold-able via map-combine.
  • Allow pg/input to handle a vector of :input nodes.
  • Load task-side the same namespaces loaded locally.

0.5.4

08 Feb 19:48
Overview

This release is all about REPL-support features. Parkour now supports connecting to a live cluster, then running local-mode jobs, mixed-mode jobs, and remote jobs, all from the same REPL process. See the new docs/repl.md documentation for details.

User-visible changes
  • Ensure job-failure clean-up runs only once.
  • Only set job JAR in basic cstep when still unset.
  • Working local-mode tests under an active cluster configuration.
  • Build job JARs and launch remote jobs from the REPL.
  • Support experimental collfn ::{source,sink}-as metadata.
  • Round-trip fragment-less distcache URIs through fs/distcache!.
  • Added sampling dseq.
  • Include any local task exception in job failure cause chain.

0.5.3

03 Feb 18:10
User-visible changes
  • Add fs/path-exists? function.
  • Stop deleting job output paths when they already exist.
  • Drop support for Hadoop version 0.20.205.
  • Properly close dux record writers when leaving task scope.
  • Add explicit dux/{map,combine}-output sink functions.
  • Delete output paths for in-progress jobs when interrupted.

0.5.2

03 Feb 18:10
User-visible changes
  • Restore parallel execution for cluster jobs.
  • Fix NPE caused by nil-dseq dsinks.

0.5.1

03 Feb 18:09
User-visible changes
  • Fix broken ability to specify Avro grouping schema via shuffle config step.
  • Run local jobs in serial to work around MAPREDUCE-5367.
  • Add work-around allowing multiple Avro files per named output.
  • Expose extended version of configuration test-helper.

0.5.0

17 Nov 19:59
Breaking changes
  • The default map/reduce task function interface uses the new collfn adapter. The previous interface may be specified via the contextfn adapter.
  • Local reduction of dseqs yields unwrapped values. Raw values may be accessed via source-for with the :raw? option.
Other changes
  • Allow seqs to be used as tuple sources.
  • Allow chaining of sink-as results to source-shaping functions.
  • Allow var entry points to directly specify the adapter function used to transform their values to the type-specific Parkour base interface.

0.4.1

15 Nov 13:07
User-visible changes
  • Fixed bug in reduced handling of job tuple-source reductions.
  • Make tuple-source re-shape results seqable, allowing tasks to be written in terms of lazy sequences.

0.4.0

15 Nov 10:28
User-visible changes
  • Initial public release!