
Features

n/a

Improvements

Bugs

n/a


Features

  • introduce aliases: explicit associations between locations and their content (see #135). Aliases can be URLs (e.g., https://example.org/data.zip) or URNs (e.g., urn:uuid:a8037839-0b09-440e-bd9f-d1e34d7770b5).
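
As a rough illustration of what an alias is (not Preston's implementation), the sketch below keeps a table that maps each alias to a content identifier. The helper names `content_id` and `add_alias` and the in-memory dictionary are assumptions made for this example; only the `hash://sha256/...` identifier form and the two example aliases come from these notes.

```python
import hashlib

# Hypothetical in-memory alias table: alias (URL or URN) -> content identifier.
aliases = {}


def content_id(data: bytes) -> str:
    """Build a hash://sha256/... identifier from the bytes themselves."""
    return "hash://sha256/" + hashlib.sha256(data).hexdigest()


def add_alias(alias: str, data: bytes) -> str:
    """Record that `alias` refers to `data` and return the content identifier."""
    aliases[alias] = content_id(data)
    return aliases[alias]


payload = b"example content"
add_alias("https://example.org/data.zip", payload)
add_alias("urn:uuid:a8037839-0b09-440e-bd9f-d1e34d7770b5", payload)
```

Both aliases end up pointing at the same identifier, because the identifier depends only on the content and not on where it was found.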

Improvements

Bugs

  • append slashes when building Software Heritage API paths; fixes #141

Features

n/a

Improvements

  • reorganized software modules to enable re-use of specific parts of preston (e.g., content-addressed storage, hexastore modules)

Bugs

  • change bloom:gz: to gz:bloom: as discussed in #131 by @mielliott
  • fix jekyll templates with archive preston metadata (#133)

Features

  • add support for line-based matching/addressing, for example: line:hash://sha256/abc...!/L2 to point to line 2 in content with signature hash://sha256/abc... (#109)
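
To make the addressing scheme concrete, here is a minimal sketch that splits a line address into its content identifier and line number, then picks that line out of some text. It assumes only the single-line `L<n>` form shown above with 1-based numbering; `parse_line_address` and `select_line` are hypothetical names, not part of Preston.

```python
import re

# Assumed shape: line:<content id>!/L<n>, with n counted from 1.
LINE_ADDRESS = re.compile(r"^line:(?P<content>.+)!/L(?P<line>[1-9][0-9]*)$")


def parse_line_address(address: str):
    """Split a line address into (content identifier, line number)."""
    match = LINE_ADDRESS.match(address)
    if match is None:
        raise ValueError(f"not a line address: {address!r}")
    return match.group("content"), int(match.group("line"))


def select_line(text: str, line_number: int) -> str:
    """Return the 1-based line `line_number` of `text`."""
    lines = text.splitlines()
    if line_number > len(lines):
        raise IndexError(f"content has only {len(lines)} lines")
    return lines[line_number - 1]


content, line_number = parse_line_address("line:hash://sha256/abc...!/L2")
second_line = select_line("first\nsecond\nthird\n", line_number)
```

The content identifier is kept as an opaque string here; resolving it to actual bytes is left out of the sketch.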

Improvements

  • stop processing when there's a printing error (#89)
  • add max-per-content option to preston grep (#89)
  • update support for configuring the target path pattern of preston cp (07f503a)
  • refactor preston into preston-cas (content-addressed storage) and preston-cli (command line interface) to facilitate library re-use (#127)

Bugs

n/a


Features

n/a

Improvements

  • add sketch generation operations for bloom filters and theta sketches (#113)
  • add support for sketch intersect/union operation (#113)
  • additional configuration options for jekyll site content generation (#107, #110)
  • bump maven-s3-wagon 0.0.3 -> 0.0.4 for improved upload performance of maven artifacts
  • add support for crawling GBIF indexed datasets in/with biocase format/metadata
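
For readers unfamiliar with the sketch operations mentioned above (#113), the following is a minimal Bloom filter with union and intersect, written from the general technique rather than from Preston's code. The class name, the default sizes, and the use of SHA-256 to derive bit positions are all assumptions of this example.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter; a Python int serves as the bit array."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3, bits: int = 0):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bits

    def _positions(self, item: str):
        # Derive num_hashes bit positions by salting the item with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: str) -> None:
        for position in self._positions(item):
            self.bits |= 1 << position

    def __contains__(self, item: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all((self.bits >> p) & 1 for p in self._positions(item))

    def _check_compatible(self, other: "BloomFilter") -> None:
        if (self.size_bits, self.num_hashes) != (other.size_bits, other.num_hashes):
            raise ValueError("filters must share size and hash count")

    def union(self, other: "BloomFilter") -> "BloomFilter":
        self._check_compatible(other)
        return BloomFilter(self.size_bits, self.num_hashes, self.bits | other.bits)

    def intersect(self, other: "BloomFilter") -> "BloomFilter":
        self._check_compatible(other)
        return BloomFilter(self.size_bits, self.num_hashes, self.bits & other.bits)


left, right = BloomFilter(), BloomFilter()
for name in ("x", "y"):
    left.add(name)
for name in ("y", "z"):
    right.add(name)
either = left.union(right)
both = left.intersect(right)
```

Union (bitwise OR) reports every item added to either filter, and intersect (bitwise AND) reports every item added to both. In each case false positives remain possible, so a membership answer is only ever "possibly present".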

Bugs

n/a


Features

n/a

Improvements

  • enhance image corpus building: add support for indexing images via GBIF occurrence API (#104)
  • enhance jekyll site content generation to support GBIF indexed records

Bugs

n/a


Features

n/a

Improvements

  • use slf4j.org to make preston more log framework agnostic
  • various refactorings to improve code maintainability
  • switch to maven-s3-wagon v0.0.3 on maven central to enable non-AWS s3 deployments.
  • add "grep" as alias for "match" (e.g., preston grep "[some regex]")

Bugs

  • allow preston get to retrieve non-text data (#100)

Improvements

  • preston-generated UUIDs are now prefixed with "urn:uuid:"
  • UUIDs in old provenance logs are prefixed with "urn:uuid:" during preston ls
  • preston get can accept content-based locations, e.g., preston get cut:zip:hash://abc...!/eml.xml!/b155-296
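
The structure of such a content-based location can be sketched as follows. This only splits the string into its parts and does not resolve anything; `parse_location` is a hypothetical helper, and the selectors after each `!/` are left uninterpreted because their exact semantics are not spelled out in these notes.

```python
def parse_location(location: str):
    """Split a location into (operator prefixes, hash identifier, selectors)."""
    start = location.find("hash://")
    if start < 0:
        raise ValueError(f"no hash:// identifier in {location!r}")
    # Everything before hash:// is a chain of colon-terminated operators.
    operators = [op for op in location[:start].split(":") if op]
    # Everything after is the identifier followed by "!/"-separated selectors.
    base, *selectors = location[start:].split("!/")
    return operators, base, selectors


operators, base, selectors = parse_location(
    "cut:zip:hash://abc...!/eml.xml!/b155-296"
)
```

For the example above this yields the operators `cut` and `zip`, the identifier `hash://abc...`, and the selectors `eml.xml` and `b155-296`.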

Bugs

  • preston match reports the correct type of archival/compression used, e.g. tar:gz:hash://abc...

Improvements

  • preston match reports submatches, as well as the name of the regex group associated with submatches when specified (related to #86)
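
What "submatches with group names" means can be shown with a small, generic sketch. It uses Python's `re` module, whose named-group syntax is `(?P<name>...)` (Java regular expressions use `(?<name>...)` instead); the function `report_submatches`, its tuple layout, and the sample pattern are assumptions for illustration and do not reflect the output format of preston match.

```python
import re


def report_submatches(pattern: str, text: str):
    """List (group index, group name, matched text, span) for every match."""
    regex = re.compile(pattern)
    # Invert groupindex so a group number can be mapped back to its name.
    names = {index: name for name, index in regex.groupindex.items()}
    rows = []
    for match in regex.finditer(text):
        # Group 0 is the whole match and has no name.
        rows.append((0, None, match.group(0), match.span(0)))
        for index in range(1, regex.groups + 1):
            if match.group(index) is not None:
                rows.append(
                    (index, names.get(index), match.group(index), match.span(index))
                )
    return rows


rows = report_submatches(r"(?P<genus>[A-Z][a-z]+) (?P<species>[a-z]+)", "Homo sapiens")
```

Unnamed groups are reported with `None` in the name position, and groups that did not take part in a match are skipped.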

Bugs

  • escape invalid characters when preston match reports file URIs (related to #88)