Version 0.7.0 release candidate (1)

Pre-release
@pkiraly released this 22 May 08:26
· 502 commits to main since this release

The current release has the following major features:

  • for union catalogues, the results of the main analyses (validation, completeness) and the data are displayed both for the catalogue as a whole and by individual library
  • extended support for PICA records
  • improved command line interface
  • a new beta feature: validation against SHACL-like problem patterns

Group values by library

  • #199: Group results in completeness
  • #200: Group results in issues
  • #246: Filter results in data tab
  • #254: Fixing performance issue for grouping validation
  • #253: Creation of id-groupid.csv required for validation

PICA changes

  • #163: PICA: general changes
  • #190: Extend PICA subject fields
  • #215: Completeness: check occurrence numbers
  • #232: Adding XML serialization for PICA
  • #234: Making occurrence a first class citizen of PICA data fields
  • #247: Uniqueness of PICA field ranges reported wrongly
  • #251: PICA: fixing reading of gzipped files
  • #250: Copy Avram schema to output directory
  • Adjust K10plus Avram schema

Shacl4bib

  • #209: adding Shacl4bib
  • #217: create a stub class

Command line interface

  • common-script: die if input files don't exist
  • common-script: disable colors if not run via terminal
  • common-script: emit DONE only for processing steps
  • common-script: show UPDATE on config
  • Add default settings to setdir.sh
  • Add configuration variable UPDATE and summarize configuration
  • Add configuration variable ANALYSES for all-analyses
  • Refactor common-script
  • Allow globs in MASK
  • Fixing parameter removal from catalogue specific params
  • Ignore default input/output also when they are symlinks
  • Improve downloaders
  • Improve KB downloader
  • Update ONB downloader
  • Improve output of common-script
  • Add input directory to ONB downloader
  • #223: Create a configuration file for Zentralbibliothek Zürich
  • masking ZB
  • #265: 'all' command should run only the selected tasks if schema is PICA
  • Update catalogue scripts
  • Update catalogues
  • Make common-script more robust
  • Make setdir.sh optional
  • Make sqlite more robust
  • Remove unnecessary ; chars
  • Simplify bash scripts
  • Simplify catalogues/k10plus_*.sh
  • Remove duplicated DONE in catalog scripts
  • Remove unused parts
  • Support setting MASK in setdir.sh (k10plus_pica only)

Documentation

  • README.md: Adjust path to run helper script
  • Create CONTRIBUTING
  • Better definition of the tool in the README
  • Adding sponsors section
  • Adding Binghamton University Libraries to the list of users
  • Add SonarCloud badge
  • #196: update README
  • #244: Document dependencies
  • Rename CONTRIBUTING to CONTRIBUTING.md
  • Update test schema README file

CSV generation

  • #216: Completeness: use proper CSV library to generate .csv
  • #242: Validation: use proper CSV library to generate .csv

Other

  • #227: Data fields without subfields are categorized as "unknown origin" in marc-elements.csv

Dependency updates

  • upgrade com.fasterxml.jackson.core from 2.13.4 to 2.15.0
  • upgrade org.apache.logging.log4j from 2.19.0 to 2.20.0
  • upgrade org.apache.solr from 9.1.0 to 9.2.0
  • upgrade org.apache.spark from 3.3.1 to 3.3.2
  • upgrade org.mongodb:bson from 4.7.2 to 4.9.1
  • upgrade org.mongodb:mongo-java-driver from 3.12.11 to 3.12.13
  • upgrade org.xerial:sqlite-jdbc from 3.39.3.0 to 3.41.2.1

Debugging, refactoring, performance improvements

  • Implement Sonar suggestions
  • #269: Build failure: testing
  • Add coveralls report integration
  • Improve performance of classification analysis
  • Improve test coverage
  • Improving performance
  • Fix a missing character in the Docker description