Version 0.7.0 release candidate (1)
Pre-release
The current release has the following major features:
- for union catalogues, the main analyses (validation, completeness) and the data are displayed both as a whole and by individual library
- extended support for PICA records
- improved command line interface
- a new beta feature: validation against SHACL-like problem patterns
Group values by library
- #199: Group results in completeness
- #200: Group results in issues
- #246: Filter results in data tab
- #254: Fixing performance issue for grouping validation
- #253: Creation of id-groupid.csv required for validation
PICA changes
- #163: PICA: general changes
- #190: Extend PICA subject fields
- #215: Completeness: check occurrence numbers
- #232: Adding XML serialization for PICA
- #234: Making occurrence a first class citizen of PICA data fields
- #247: Uniqueness of PICA field ranges reported wrongly
- #251: PICA: fixing reading of gzipped files
- #250: Copy Avram schema to output directory
- Adjust K10plus Avram schema
Shacl4bib
Command line interface
- common-script: die if input files don't exist
- common-script: disable colors if not run via terminal
- common-script: emit DONE only for processing steps
- common-script: show UPDATE on config
- Add default settings to setdir.sh
- Add configuration variable UPDATE and summarize configuration
- Add configuration variable ANALYSES for all-analyses
- Refactor common-script
- Allow globs in MASK
- Fixing parameter removal from catalogue specific params
- Ignore default input/output also when they are symlinks
- Improve downloaders
- Improve KB downloader
- Update ONB downloader
- Improve output of common-script
- Add input directory to ONB downloader
- #223: Create a configuration file for Zentralbibliothek Zürich
- masking ZB
- #265: 'all' command should run only the selected tasks if schema is PICA
- Update catalogue scripts
- Update catalogues
- Make common-script more robust
- Make setdir.sh optional
- Make sqlite more robust
- Remove unnecessary ; chars
- Simplify bash scripts
- Simplify catalogues/k10plus_*.sh
- Remove duplicated DONE in catalog scripts
- Remove unused parts
- Support setting MASK in setdir.sh (k10plus_pica only)
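
The "disable colors if not run via terminal" item above describes a common pattern: emit ANSI color codes only when the output stream is an interactive terminal, so that redirected logs stay free of escape sequences. The project's scripts are written in bash, where this is commonly checked with `[ -t 1 ]`; the sketch below shows the same idea in Python for illustration only. The names `colorize` and `log` are hypothetical and do not come from the project.

```python
import io
import sys

GREEN = "\033[32m"
RESET = "\033[0m"


def colorize(text, stream=sys.stdout, color=GREEN):
    """Wrap text in ANSI color codes only when stream is an interactive terminal."""
    if stream.isatty():
        return f"{color}{text}{RESET}"
    return text


def log(text, stream=sys.stdout):
    """Write one status line to stream, colored only if stream is a terminal."""
    print(colorize(text, stream), file=stream)


# An in-memory buffer stands in for a redirected log file: it is not a
# terminal, so the message is written without escape sequences.
log_file = io.StringIO()
log("DONE", stream=log_file)
```

Passing the stream explicitly keeps the check tied to where the text is actually written, rather than to standard output in general.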
Documentation
- README.md: Adjust path to run helper script
- Create CONTRIBUTING
- Better definition of the tool in the README
- Adding sponsors section
- Adding Binghamton University Libraries to the list of users
- Add SonarCloud badge
- #196: Update README
- #244: Document dependencies
- Rename CONTRIBUTING to CONTRIBUTING.md
- Update test schema README file
CSV generation
- #216: Completeness: use proper CSV library to generate .csv
- #242: Validation: use proper CSV library to generate .csv
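
The two items above replace hand-assembled CSV lines with a CSV library. The project itself is written in Java, and these notes do not say which library it uses, so the following is only an illustrative Python sketch of why the change matters: a field that contains a comma or a double quote breaks naive string joining, whereas a CSV writer quotes and escapes it. The sample row and the `to_csv` helper are hypothetical.

```python
import csv
import io


def to_csv(rows):
    """Serialize rows with the standard csv module, which quotes fields as needed."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical report rows: the message contains a comma and double quotes.
rows = [
    ["path", "message", "count"],
    ["245$a", 'Invalid value, expected "a" or "b"', 3],
]

# Naive approach: join the values with commas and hope for the best.
naive = "\n".join(",".join(str(value) for value in row) for row in rows) + "\n"

# Library approach: the writer quotes the message and doubles the inner quotes.
proper = to_csv(rows)

# Reading the library output back recovers the original three columns,
# while the naive output splits the message at its embedded comma.
parsed_proper = list(csv.reader(io.StringIO(proper)))
parsed_naive = list(csv.reader(io.StringIO(naive)))
```

Here `parsed_proper[1]` has three fields, matching the header, while `parsed_naive[1]` has four, which is the kind of column shift a downstream consumer of the `.csv` files would otherwise run into.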
Other
- #227: Data fields without subfields are categorized as "unknown origin" in marc-elements.csv
Dependency updates
- upgrade com.fasterxml.jackson.core from 2.13.4 to 2.15.0
- upgrade org.apache.logging.log4j from 2.19.0 to 2.20.0
- upgrade org.apache.solr from 9.1.0 to 9.2.0
- upgrade org.apache.spark from 3.3.1 to 3.3.2
- upgrade org.mongodb:bson from 4.7.2 to 4.9.1
- upgrade org.mongodb:mongo-java-driver from 3.12.11 to 3.12.13
- upgrade org.xerial:sqlite-jdbc from 3.39.3.0 to 3.41.2.1
Debugging, refactoring, performance improvement
- Implement Sonar suggestions.
- #269: Build failure: testing
- Add coveralls report integration
- Improve performance of classification analysis
- Improve test coverage
- Improving performance
- Fix a missing character in the Docker description