Commit dfb579b
committed
Implement a Wikidata identifier for Siegfried
Wikidata contains a host of file format information. Wikidata records
provide information about format names, identifiers (PUID, LoC FDD),
file-extensions, MIMEtypes, and format identification patterns. This
means it might be quite handy as an extension to Siegfried's
capabilties.
This is the first-cut of that work. We are able to harvest
information from the Wikidata Query Service. Create an identifier.
And consume the information in the identifier to match file-formats
using format-identification patterns taken directly from the
service.
We attempt to return new information not otherwise present in other
identifiers yet, for example, signature provenance. Provenance is
recorded in Wikidata as well as the date a format-identification
pattern was added. We try and replay this to the user to enrich
what is available to them.
This work is graciously supported by Yale University Library
(Euan Cochran, Kat Thornton) and Richard Lehane. It has been fun
trying to put this out there for folk.1 parent 8a95ae4 commit dfb579b
41 files changed
Lines changed: 112196 additions & 73 deletions
File tree
- cmd
- roy
- data/wikidata
- sf
- testdata/wikidata
- pro
- wd
- internal/identifier
- pkg
- config
- internal/wikidatasparql
- core
- pronom
- wikidata
- internal
- converter
- mappings
Some content is hidden
Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.
Large diffs are not rendered by default.
Large diffs are not rendered by default.
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
| 12 | + | |
| 13 | + | |
| 14 | + | |
| 15 | + | |
| 16 | + | |
| 17 | + | |
| 18 | + | |
| 19 | + | |
| 20 | + | |
| 21 | + | |
| 22 | + | |
| 23 | + | |
| 24 | + | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| 31 | + | |
| 32 | + | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
| 38 | + | |
| 39 | + | |
| 40 | + | |
| 41 | + | |
| 42 | + | |
| 43 | + | |
| 44 | + | |
| 45 | + | |
| 46 | + | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
| 55 | + | |
| 56 | + | |
| 57 | + | |
| 58 | + | |
| 59 | + | |
| 60 | + | |
| 61 | + | |
| 62 | + | |
Large diffs are not rendered by default.
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
10 | 10 | | |
11 | 11 | | |
12 | 12 | | |
| 13 | + | |
13 | 14 | | |
14 | 15 | | |
15 | 16 | | |
| |||
66 | 67 | | |
67 | 68 | | |
68 | 69 | | |
| 70 | + | |
| 71 | + | |
| 72 | + | |
| 73 | + | |
| 74 | + | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
| 80 | + | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
69 | 84 | | |
70 | 85 | | |
71 | 86 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
64 | 64 | | |
65 | 65 | | |
66 | 66 | | |
| 67 | + | |
67 | 68 | | |
68 | 69 | | |
69 | 70 | | |
| |||
377 | 378 | | |
378 | 379 | | |
379 | 380 | | |
| 381 | + | |
| 382 | + | |
| 383 | + | |
| 384 | + | |
| 385 | + | |
380 | 386 | | |
381 | 387 | | |
382 | 388 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
126 | 126 | | |
127 | 127 | | |
128 | 128 | | |
| 129 | + | |
| 130 | + | |
| 131 | + | |
| 132 | + | |
129 | 133 | | |
130 | 134 | | |
131 | 135 | | |
| |||
Loading
Binary file not shown.
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
0 commit comments