Change Log

v1.7.10 (2018-09-19)

Added

  • print configuration defaults with sf -version

Changed

  • update PRONOM to v94

Fixed

  • LOC identifier fixed after regression in v1.7.9
  • remove skeleton-suite files triggering malware warnings by adding to .gitignore; reported by Dave Rice
  • release built with Go 1.11, which includes a fix for a CIFS error that caused files to be skipped during file walk; reported by Maarten Savels

v1.7.9 (2018-08-30)

Added

  • save defaults in a configuration file: use the -setconf flag to record any other flags used into a config file. These defaults will be loaded each time you run sf. E.g. sf -multi 16 -setconf then sf DIR (loads the new multi default)
  • use -conf filename to save or load from a named config file. E.g. sf -multi 16 -serve :5138 -conf srv.conf -setconf and then sf -conf srv.conf
  • added -yaml flag so, if you set json/csv in default config :(, you can override with YAML instead. Choose the YAML!

Changed

  • the roy compare -join options that join on filepath now work better when comparing results with mixed Windows and Unix paths
  • exported decompress package to give more functionality for users of the golang API; requested by Byron Ruth
  • update LOC signatures to 2018-06-14
  • update freedesktop.org signatures to v1.10
  • update tika-mimetype signatures to v1.18

Fixed

  • misidentifications of some files e.g. ODF presentation due to sf quitting early on strong matches. Have adjusted this algorithm to make sf wait longer if there is evidence (e.g. from filename) that the file might be something else. Reported by Jean-Séverin Lair
  • read and other file errors caused sf to hang; reports by Greg Lepore and Andy Foster; fix contributed by Ross Spencer
  • bug reading streams where EOF was returned for reads exactly adjacent to the end of the file
  • bug in mscfb library (race condition for concurrent access to a global variable)
  • some matches result in extremely verbose basis fields; reported by Nick Krabbenhoeft. Partly fixed: basis field now reports a single basis for a match but work remains to speed up matching for these cases.

v1.7.8 (2017-12-02)

Changed

  • update LOC signatures to 2017-09-28
  • update PRONOM signatures to v93

v1.7.7 (2017-11-30)

Added

  • version information for MIME-info signatures (freedesktop.org and tika-mimetypes) now recorded in mime-info.json file and presented in results
  • new sets file for PRONOM extensions. This creates sets like @.doc and @.txt (i.e. all PUIDs with those extensions). Allows you to do commands like roy build -limit @.doc,@.docx, roy inspect @.txt and sf -log @.pdf,o DIR
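
As a rough illustration of how extension sets like @.doc and @.txt could be derived, the Go sketch below groups PUIDs by file extension. It is not roy's implementation: the format type, the extensionSets function, the lower-casing of extensions and the sample PUID/extension pairs are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// format pairs a PUID with its file extensions (hypothetical type).
// The sample data in main is illustrative, not a PRONOM export.
type format struct {
	PUID string
	Exts []string
}

// extensionSets groups PUIDs under set names such as "@.doc".
// Extensions are lower-cased and any leading dot is dropped first.
func extensionSets(formats []format) map[string][]string {
	sets := make(map[string][]string)
	for _, f := range formats {
		for _, ext := range f.Exts {
			name := "@." + strings.ToLower(strings.TrimPrefix(ext, "."))
			sets[name] = append(sets[name], f.PUID)
		}
	}
	// Sort each set so the output is stable.
	for _, puids := range sets {
		sort.Strings(puids)
	}
	return sets
}

func main() {
	formats := []format{
		{PUID: "fmt/40", Exts: []string{"doc"}},
		{PUID: "fmt/39", Exts: []string{"doc"}},
		{PUID: "x-fmt/111", Exts: []string{"txt"}},
	}
	sets := extensionSets(formats)
	fmt.Println(sets["@.doc"])
	fmt.Println(sets["@.txt"])
}
```

A set built this way could then be used wherever a set name is accepted, as in the roy build -limit @.doc,@.docx example above.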

Changed

  • update freedesktop.org signatures to v1.9

Fixed

  • out of memory error when using sf -z on compressed files that contain very large files; reported by Terry Jolliffe
  • report errors that occur during file decompression. Previously, only fatal errors encountered when a compressed file is first opened were reported. Now errors that are encountered while attempting to walk the contents of a compressed file are also reported.
  • report errors for 'roy inspect' when roy can't find anything to inspect; reported by Ross Spencer

v1.7.6 (2017-10-04)

Added

  • continue on error flag (-coe) can now be used to continue scans despite fatal file errors that would normally cause scanning to halt. This may be useful e.g. for big directory scans over unreliable networks. Usage: sf -coe DIR

Changed

  • update PRONOM signatures to v92

Fixed

  • file scanning is now restricted to regular files (i.e. not symlinks, sockets, devices etc.). Reported by Henk Vanstappen.
  • windows longpath fix now works for paths that appear short
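
The restriction to regular files can be sketched in Go with the standard library's file mode checks. This is a minimal illustration, not sf's actual walk code; isScannable and regularFiles are hypothetical names, and the sketch assumes Go 1.16 or later for filepath.WalkDir.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// isScannable reports whether an entry with the given type bits is a
// regular file (not a directory, symlink, socket, device or named pipe).
func isScannable(mode fs.FileMode) bool {
	return mode.IsRegular()
}

// regularFiles walks root and returns the paths of regular files only.
// WalkDir does not follow symlinks, so they are reported as symlinks
// and skipped by isScannable.
func regularFiles(root string) ([]string, error) {
	var paths []string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if isScannable(d.Type()) {
			paths = append(paths, path)
		}
		return nil
	})
	return paths, err
}

func main() {
	paths, err := regularFiles(".")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, p := range paths {
		fmt.Println(p)
	}
}
```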

v1.7.5 (2017-08-12)

Added

  • sf -update flag can now be used to download/update non-PRONOM signatures. Options are "loc", "tika", "freedesktop", "pronom-tika-loc", "deluxe" and "archivematica". To update a non-PRONOM signature, include the signature name as an argument after the flags e.g. sf -update freedesktop. This command will overwrite 'default.sig' (the default signature file that sf loads). You can preserve your default signature file by providing an alternative -sig target e.g. sf -sig notdefault.sig -update loc. If you use one of the signature options as a filename (with or without a .sig extension), you can omit the signature argument i.e. sf -update -sig loc.sig is equivalent to sf -sig loc.sig -update loc. Feature requested by Ross Spencer.
  • sf -update now does SHA-256 hash verification of updates and communication with the update server is via HTTPS.
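
The hash verification step can be illustrated with a short Go sketch that compares a payload against an expected hex-encoded SHA-256 digest. How sf obtains the expected digest from the update server is not described here, so verifySHA256 and the sample payload are assumptions, and no network access is shown.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// verifySHA256 reports whether data hashes to the expected
// lowercase hex-encoded SHA-256 digest (hypothetical helper).
func verifySHA256(data []byte, wantHex string) bool {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:]) == wantHex
}

func main() {
	// Stand-in for a downloaded signature file.
	payload := []byte("hypothetical signature file contents")

	// Stand-in for the digest published alongside the download.
	sum := sha256.Sum256(payload)
	want := hex.EncodeToString(sum[:])

	fmt.Println(verifySHA256(payload, want))              // untampered payload
	fmt.Println(verifySHA256(append(payload, '!'), want)) // altered payload
}
```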

Changed

  • update PRONOM signatures to v91

Fixed

  • fixes to config package where global variables were polluted by subsequent calls to the Add(Identifier) function
  • fix to reader package where panic triggered by illegal slice access in some cases

v1.7.4 (2017-07-14)

Added

  • roy build and roy add now take a -nobyte flag to omit byte signatures from the identifier; requested by Nick Krabbenhoeft

Changed

  • update Tika MIMEInfo signatures to 1.16
  • update LOC to 2017-06-10

v1.7.3-(x) (2017-05-30)

Fixed

  • no changes since v1.7.3, repairing Travis-CI auto-deploy of Debian packages

v1.7.3 (2017-05-20)

Added

  • sf now accepts multiple files or directories as input e.g. sf myfile1.doc mydir myfile3.txt
  • LOC signature update

Changed

  • code re-organisation to export reader and writer packages
  • sf -replay can now take lists of results files with -f flag e.g. sf -replay -f list-of-results.txt

Fixed

  • the command sf -replay - now works on Windows as expected e.g. sf myfiles | sf -replay -json -
  • text matcher not allocating hits to correct identifiers; fixes #101
  • unescaped YAML field contains quote; reported by Ross Spencer

v1.7.2 (2017-04-04)

Added

  • PRONOM v90 update

Fixed

  • the -home flag was being overridden for roy subcommands due to interaction with other flags

v1.7.1 (2017-03-12)

Added

  • signature updates for PRONOM, LOC and tika-mimetypes

Changed

  • roy inspect accepts space as well as comma-separated lists of formats e.g. roy inspect fmt/1 fmt/2

v1.7.0 (2017-02-17)

Added

  • log files that match particular formats with -log fmt/1,@set2 (comma separated list of format IDs/format sets). These can be mixed with regular log options e.g. -log unknown,fmt/1,chart
  • generate a summary view of formats matched during a scan with -log chart (or just -log c)
  • replay scans from results files with sf -replay: load one or more results files to replay logging or to convert to a different output format e.g. sf -replay -csv results.yaml or sf -replay -log unknown,chart,stdout results1.yaml results2.csv
  • compare results with roy compare subcommand: view the difference between two or more results e.g. roy compare results1.yaml results2.csv droid.csv ...
  • roy sets subcommand: roy sets creates pronom-all.json, pronom-families.json, and pronom-types.json sets files; roy sets -changes creates a pronom-changes.json sets file from a PRONOM release-notes.xml file; roy sets -list @set1,@set2 lists contents of a comma-separated list of format sets
  • roy inspect releases provides a summary view of a PRONOM release-notes.xml file

Changed

  • the sf - command now scans stdin e.g. cat mypdf.pdf | sf -. You can pass a filename in to supplement the analysis with the -name flag. E.g. cat myfile.pdf | sf -name myfile.pdf -. In previous versions of sf, the dash argument signified treating stdin as a newline separated list of filenames for scanning. Use the new -f flag for this e.g. sf -f myfiles.txt or cat myfiles.txt | sf -f -; change requested by pm64

Fixed

  • some files cause endless scanning due to large numbers of signature hits; reported by workflowsguy
  • null bytes can be written to output due to bad zip filename decoding; reported by Tim Walsh

v1.6.7 (2016-11-23)

Added

  • enable -hash, -z, and -log flags for -serve and -multi modes
  • new hash, z, and sig params for -serve mode (to control per-request)
  • enable droid output in -serve mode
  • GET requests in -serve mode now just percent encoded (with base64 option as a param)
  • -serve mode landing page now includes example forms

Changed

  • code re-organisation using /internal directory to hide internal packages
  • Identify method now returns a slice rather than channel of IDs (siegfried pkg change)

v1.6.6 (2016-10-25)

Added

  • graph implicit and missing priorities with roy inspect implicit-priorities and roy inspect missing-priorities

Fixed

  • error parsing mimeinfo signatures with double backslashes (e.g. rtf signatures)

v1.6.5 (2016-09-28)

Added

  • new sets files (pronom-families.json and pronom-types.json) automatically created from PRONOM classifications. Removed redundant sets (database, audio, etc.).

Fixed

  • debbuilder.sh fix: debian packages were copying roy data to wrong directory

Changed

  • roy inspect priorities command now includes "orphan" fmts in graphs
  • update PRONOM urls from apps. to www.

v1.6.4 (2016-09-05)

Added

  • roy inspect FMT command now inspects sets e.g. roy inspect @pdfa
  • roy inspect priorities command generates graphs of priority relations

Fixed

Changed

  • use fwac rather than wac package for performance
  • roy inspect FMT command speed up by building without reports and without the doubles filter
  • -reports flag removed for roy harvest and roy build commands
  • -reports flag changed for roy inspect command, now a boolean that, if set, will cause the signature(s) to be built from the PRONOM report(s), rather than the DROID XML file. This is slower but can be a more accurate representation.

v1.6.3 (2016-08-18)

Added

Fixed

v1.6.2 (2016-08-08)

Fixed

v1.6.1 (2016-07-06)

Added

  • Travis and Appveyor CI automated deployment to Github releases and Bintray
  • PRONOM v85 signatures
  • LICENSE.txt, CHANGELOG.md
  • Go Report Card

Fixed

  • golang.org/x/image/riff bug (reported here)
  • misspellings reported by Go Report Card
  • ineffectual assignments reported by Go Report Card

v1.6.0 (2016-06-26)

Added

  • implement Library of Congress FDD signatures (beta)
  • implement RIFF matcher
  • -multi flag replaces -nopriority; based on report by Ross Spencer

Changed

  • change to -z output: use hash as filepath separator (and unix slash for webarchives); requested by Ross Spencer

Fixed

v1.5.0 (2016-03-14)

Added

  • implement freedesktop.org MIME-info signatures (and the Apache Tika variant)
  • implement XML matcher
  • file name matcher now supports glob patterns as well as file extensions

Changed

  • default signature file now "default.sig" (was "pronom.sig")
  • changes to YAML and JSON output: "ns" (for namespace) replaces "id", and "id" replaces "puid"
  • changes to CSV output: multi-identifiers now displayed in extra columns, not extra rows

v1.4.5 (2016-02-06)

Added

Fixed

v1.4.4 (2016-01-09)

Changed

  • code quality: refactor textmatcher package
  • code quality: refactor siegreader package
  • code quality: documentation

Fixed

  • speed regression introduced by the TIFF mis-identification patch in the last release

v1.4.3 (2015-12-19)

Added

  • measure time elapsed with -log time

Fixed

v1.4.2 (2015-11-27)

Added

Changed

Fixed

v1.4.1 (2015-11-06)

Changed

  • -log replaces -debug, -slow, -unknown and -known flags (see usage above)
  • highlight empty file/stream with error and warning
  • negative text match overrides extension-only plain text match

v1.4.0 (2015-10-31)

Added

  • new MIME matcher; requested by Dragan Espenschied
  • support warc continuations
  • add all.json and tiff.json sets

Changed

  • minor speed-up
  • report less redundant basis information
  • report error on empty file/stream

v1.3.0 (2015-09-27)

Added

  • scan within warc and arc files with -z flag; requested by Dragan Espenschied
  • sf -slow FILE | DIR reports slow signatures
  • sf -version describes signature file; requested by Michelle Lindlar

Changed

  • quit scanning earlier on known unknowns
  • don't include byte signatures where formats have container signatures (unless -doubleup flag is given); fixes a mis-identification reported by Ross Spencer
  • sf -debug output simplified
  • roy -limit and -exclude now operate on text and default zip matches
  • roy -nopriority re-configured to return more results

Fixed

  • upgraded versions of sf panic when attempting to read old signature files; reported by Stefan
  • panic mmap'ing files over 1GB on Win32; reported by Duncan
  • reporting extensions for folders with "."; reported by Ross Spencer

v1.2.2 (2015-08-15)

Added

  • -noext flag to roy to suppress extension matching; requested by Greg Lepore
  • -known and -unknown flags for sf to output lists of recognised and unknown files respectively; requested by Greg Lepore

v1.2.1 (2015-08-11)

Added

  • support annotation of sets.json files; requested by Greg Lepore
  • add warning when -extendc is used without -extend

Fixed

  • report container extensions in details; reported by Ross Spencer

v1.2.0 (2015-07-31)

Added

  • text matcher (i.e. sf README will now report a 'Plain Text File' result)
  • -notext flag to suppress text matcher (roy build -notext)
  • all outputs now include file last modified time
  • -hash flag with choice of md5, sha1, sha256, sha512, crc (e.g. sf -hash md5 FILE)
  • -droid flag to mimic droid output (sf -droid FILE)
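
To show what choosing a hash by name might involve, the Go sketch below maps the names listed for -hash onto standard library hash constructors. It is an illustration rather than sf's code: newHash and hexDigest are hypothetical names, and treating "crc" as CRC-32 (IEEE) is an assumption.

```go
package main

import (
	"crypto/md5"
	"crypto/sha1"
	"crypto/sha256"
	"crypto/sha512"
	"encoding/hex"
	"fmt"
	"hash"
	"hash/crc32"
)

// newHash maps a name like those accepted by -hash to a hash.Hash.
// Treating "crc" as CRC-32 (IEEE) is an assumption of this sketch.
func newHash(name string) (hash.Hash, error) {
	switch name {
	case "md5":
		return md5.New(), nil
	case "sha1":
		return sha1.New(), nil
	case "sha256":
		return sha256.New(), nil
	case "sha512":
		return sha512.New(), nil
	case "crc":
		return crc32.NewIEEE(), nil
	}
	return nil, fmt.Errorf("unknown hash %q", name)
}

// hexDigest hashes data with the named algorithm and returns lowercase hex.
func hexDigest(name string, data []byte) (string, error) {
	h, err := newHash(name)
	if err != nil {
		return "", err
	}
	h.Write(data) // writes to a hash.Hash never return an error
	return hex.EncodeToString(h.Sum(nil)), nil
}

func main() {
	for _, name := range []string{"md5", "sha1", "sha256", "sha512", "crc"} {
		digest, err := hexDigest(name, []byte("hello"))
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("%-6s %s\n", name, digest)
	}
}
```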

Fixed

v1.1.0 (2015-05-17)

Added

  • scan within archive formats (zip, tar, gzip) with -z flag
  • format sets (e.g. roy build -exclude @pdfa)
  • support bitmask patterns

Changed

  • leaner, faster signature format
  • mirror bof patterns as eof patterns where both roy -bof and -eof limits set

Fixed

v1.0.0 (2015-03-22)

Changed

v0.8.2 (2015-02-22)

Added

  • json output
  • server mode

v0.8.1 (2015-02-01)

Fixed

  • single quote YAML output

v0.8.0 (2015-01-26)

Changed

  • optimisations (mmap, multithread, etc.)

v0.7.1 (2014-12-09)

Added

  • csv output

Changed

  • periodic priority checking to stop searches earlier

Fixed

  • range/distance/choices bugfix

v0.7.0 (2014-11-24)

Changed

  • change to signature file format

v0.6.1 (2014-11-21)

Added

  • roy (r2d2 rename) signature customisation
  • parse Droid signature (not just PRONOM reports)
  • support extension signatures

v0.6.0 (2014-11-11)

Added

  • support multiple identifiers
  • config package

Changed

  • license info in srcs (no change to license; this allows for attributing authorship for non-Richard contribs)
  • default home change to "$HOME/siegfried" (no longer ".siegfried")

Fixed

  • mscfb bugfixes

v0.5.0 (2014-10-01)

Added

  • container matching

v0.4.3 (2014-09-23)

Fixed

  • cross-compile was broken (because of use of os/user). Now doing native builds on the three platforms so the download binaries should all work now.

v0.4.2 (2014-09-16)

Fixed

  • bug in processing code caused really bad matching profile for MP3 sigs. No need to update the tool for this, but please do a sieg -update to get the latest signature file.

v0.4.1 (2014-09-14)

Added

  • sf command line: descriptive output in YAML, including basis for matches

Changed

  • optimisations inc. initial BOF loop before main matching loop

v0.4.0 (2014-08-24)

Added

v0.3.0 (2014-08-19)

Changed

  • replaced ac matcher with wac matcher
  • re-write of bytematcher code
  • some benchmarks slower but fewer really poor edge cases (see cmd/sieg/testdata/bench_results.txt)... so a win!
  • but still too slow!

v0.2.0 (2014-03-26)

Added

  • an Identifier type that controls the matching process and stops on best possible match (i.e. no longer require a full file scan for all files)
  • name/extension matching
  • a custom reader (pkg/core/siegreader)

Changed

  • benchmarks (cmd/sieg/testdata)
  • simplifications to the sieg command and signature file
  • optimisations that have boosted performance (see cmd/sieg/testdata/bench_results.txt). But still too slow!

v0.1.0 (2014-02-28)

Added

  • First release. Parses PRONOM signatures and performs byte matching. Bare bones CLI. Glacially slow!