preston update throws an error when trying to dereference content-based URIs:
$ preston update 'hash://sha256/97cbeae429fbc95d1859f7afa28b33f08ac64125ba72511c49c4b77ca66d2d66'
<https://preston.guoda.bio> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/prov#SoftwareAgent> <urn:uuid:c716b299-09e1-401d-8b7d-f5fd948694f6> .
<https://preston.guoda.bio> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/prov#Agent> <urn:uuid:c716b299-09e1-401d-8b7d-f5fd948694f6> .
<https://preston.guoda.bio> <http://purl.org/dc/terms/description> "Preston is a software program that finds, archives and provides access to biodiversity datasets."@en <urn:uuid:c716b299-09e1-401d-8b7d-f5fd948694f6> .
<urn:uuid:c716b299-09e1-401d-8b7d-f5fd948694f6> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/prov#Activity> <urn:uuid:c716b299-09e1-401d-8b7d-f5fd948694f6> .
<urn:uuid:c716b299-09e1-401d-8b7d-f5fd948694f6> <http://purl.org/dc/terms/description> "A crawl event that discovers biodiversity archives."@en <urn:uuid:c716b299-09e1-401d-8b7d-f5fd948694f6> .
<urn:uuid:c716b299-09e1-401d-8b7d-f5fd948694f6> <http://www.w3.org/ns/prov#startedAtTime> "2021-06-08T14:36:24.593Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> <urn:uuid:c716b299-09e1-401d-8b7d-f5fd948694f6> .
<urn:uuid:c716b299-09e1-401d-8b7d-f5fd948694f6> <http://www.w3.org/ns/prov#wasStartedBy> <https://preston.guoda.bio> <urn:uuid:c716b299-09e1-401d-8b7d-f5fd948694f6> .
<https://doi.org/10.5281/zenodo.1410543> <http://www.w3.org/ns/prov#usedBy> <urn:uuid:c716b299-09e1-401d-8b7d-f5fd948694f6> <urn:uuid:c716b299-09e1-401d-8b7d-f5fd948694f6> .
<https://doi.org/10.5281/zenodo.1410543> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/dc/dcmitype/Software> <urn:uuid:c716b299-09e1-401d-8b7d-f5fd948694f6> .
<https://doi.org/10.5281/zenodo.1410543> <http://purl.org/dc/terms/bibliographicCitation> "Jorrit Poelen, Icaro Alzuru, & Michael Elliott. 2019. Preston: a biodiversity dataset tracker (Version 0.0.1-SNAPSHOT) [Software]. Zenodo. http://doi.org/10.5281/zenodo.1410543"@en <urn:uuid:c716b299-09e1-401d-8b7d-f5fd948694f6> .
<urn:uuid:0659a54f-b713-4f86-a917-5be166a14110> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/prov#Entity> <urn:uuid:c716b299-09e1-401d-8b7d-f5fd948694f6> .
<urn:uuid:0659a54f-b713-4f86-a917-5be166a14110> <http://purl.org/dc/terms/description> "A biodiversity dataset graph archive."@en <urn:uuid:c716b299-09e1-401d-8b7d-f5fd948694f6> .
<hash://sha256/6d924b3cc007cdb2fd78eab535dd9102563ebdddf4e0e30b00b50bde555f5e68> <http://www.w3.org/ns/prov#usedBy> <urn:uuid:c716b299-09e1-401d-8b7d-f5fd948694f6> <urn:uuid:c716b299-09e1-401d-8b7d-f5fd948694f6> .
[main] WARN bio.guoda.preston.store.Archiver - failed to dereference [<hash://sha256/97cbeae429fbc95d1859f7afa28b33f08ac64125ba72511c49c4b77ca66d2d66>]
org.apache.http.client.ClientProtocolException
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:187)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
at bio.guoda.preston.ResourcesHTTP.asInputStream(ResourcesHTTP.java:75)
at bio.guoda.preston.ResourcesHTTP.asInputStream(ResourcesHTTP.java:55)
at bio.guoda.preston.ResourcesHTTP.asInputStream(ResourcesHTTP.java:59)
at bio.guoda.preston.store.DereferencerContentAddressed.dereference(DereferencerContentAddressed.java:19)
at bio.guoda.preston.store.DereferencerContentAddressed.dereference(DereferencerContentAddressed.java:8)
at bio.guoda.preston.store.Archiver.handleBlankVersion(Archiver.java:49)
at bio.guoda.preston.store.VersionProcessor.on(VersionProcessor.java:28)
at bio.guoda.preston.process.StatementsListenerEmitterAdapter.on(StatementsListenerEmitterAdapter.java:12)
at bio.guoda.preston.cmd.CmdUpdate.processQueue(CmdUpdate.java:61)
at bio.guoda.preston.cmd.CmdActivity.run(CmdActivity.java:124)
at bio.guoda.preston.cmd.CmdActivity.run(CmdActivity.java:77)
at bio.guoda.preston.cmd.CmdLine.run(CmdLine.java:18)
at bio.guoda.preston.cmd.CmdLine.run(CmdLine.java:26)
at bio.guoda.preston.Preston.main(Preston.java:19)
Caused by: org.apache.http.HttpException: hash protocol is not supported
at org.apache.http.impl.conn.DefaultRoutePlanner.determineRoute(DefaultRoutePlanner.java:89)
at org.apache.http.impl.client.InternalHttpClient.determineRoute(InternalHttpClient.java:125)
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
... 16 more
<https://deeplinker.bio/.well-known/genid/aa0abea3-9689-3cc9-a467-ccfde0d5544f> <http://www.w3.org/ns/prov#wasGeneratedBy> <urn:uuid:a2a5c80e-32fc-490d-9615-316e7f2e24bd> <urn:uuid:a2a5c80e-32fc-490d-9615-316e7f2e24bd> .
<https://deeplinker.bio/.well-known/genid/aa0abea3-9689-3cc9-a467-ccfde0d5544f> <http://www.w3.org/ns/prov#qualifiedGeneration> <urn:uuid:a2a5c80e-32fc-490d-9615-316e7f2e24bd> <urn:uuid:a2a5c80e-32fc-490d-9615-316e7f2e24bd> .
<urn:uuid:a2a5c80e-32fc-490d-9615-316e7f2e24bd> <http://www.w3.org/ns/prov#generatedAtTime> "2021-06-08T14:36:25.373Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> <urn:uuid:a2a5c80e-32fc-490d-9615-316e7f2e24bd> .
<urn:uuid:a2a5c80e-32fc-490d-9615-316e7f2e24bd> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/prov#Generation> <urn:uuid:a2a5c80e-32fc-490d-9615-316e7f2e24bd> .
<urn:uuid:a2a5c80e-32fc-490d-9615-316e7f2e24bd> <http://www.w3.org/ns/prov#wasInformedBy> <urn:uuid:c716b299-09e1-401d-8b7d-f5fd948694f6> <urn:uuid:a2a5c80e-32fc-490d-9615-316e7f2e24bd> .
<urn:uuid:a2a5c80e-32fc-490d-9615-316e7f2e24bd> <http://www.w3.org/ns/prov#used> <hash://sha256/97cbeae429fbc95d1859f7afa28b33f08ac64125ba72511c49c4b77ca66d2d66> <urn:uuid:a2a5c80e-32fc-490d-9615-316e7f2e24bd> .
<hash://sha256/97cbeae429fbc95d1859f7afa28b33f08ac64125ba72511c49c4b77ca66d2d66> <http://purl.org/pav/hasVersion> <https://deeplinker.bio/.well-known/genid/aa0abea3-9689-3cc9-a467-ccfde0d5544f> <urn:uuid:a2a5c80e-32fc-490d-9615-316e7f2e24bd> .
Use case: after finding records (lines) in datasets using preston grep, I thought it would be fun to hash each of the lines so that I could see which ones have identical content. preston update seemed like a convenient way to do this.
This ties into an old idea of verifying the reliability (as defined in Elliott et al. 2020) of Preston datasets by running preston update on an existing Preston-generated provenance log. We imagined that this should find exactly the same content, since it would dereference content-based identifiers instead of location-based ones.
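The "hash protocol is not supported" cause in the trace above suggests that the hash:// URI was passed to an HTTP client, which only knows location-based schemes. A minimal sketch of the alternative, assuming content-based identifiers are looked up in a local store and re-hashed before being returned, might look like the following. The class `HashUriResolver`, its in-memory map, and all method names are hypothetical and are not part of Preston's code base.

```java
import java.net.URI;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: resolves hash://sha256/... URIs from an in-memory store
// (a stand-in for a real content store) instead of sending them over HTTP.
class HashUriResolver {
    private final Map<String, byte[]> store = new HashMap<>();

    // Builds the content-based identifier for the given bytes.
    static String sha256Uri(byte[] content) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256").digest(content);
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return "hash://sha256/" + hex;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Stores the content under its own hash URI and returns that URI.
    String put(byte[] content) {
        String id = sha256Uri(content);
        store.put(id, content.clone());
        return id;
    }

    // True when the URI uses the "hash" scheme, i.e. is content-based.
    static boolean isContentBased(String uri) {
        return "hash".equals(URI.create(uri).getScheme());
    }

    // Returns the content only if it is known and still hashes to the requested URI.
    Optional<byte[]> dereference(String uri) {
        if (!isContentBased(uri)) {
            return Optional.empty(); // location-based URIs are out of scope for this sketch
        }
        byte[] content = store.get(uri);
        if (content == null || !sha256Uri(content).equals(uri)) {
            return Optional.empty();
        }
        return Optional.of(content);
    }
}
```

Because the identifier is derived from the content, a successful lookup doubles as the reliability check: the returned bytes are guaranteed to match the hash in the provenance log.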
I realize that finding "identical records" could be done using the text values associated with each line, as output by preston grep, but being able to find the hash of each line opens up all sorts of possibilities (e.g. preston grep, sketch, similar), and the hash URI tends to be much more concise than the whole text of the record.
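To illustrate the idea outside of Preston, here is a rough sketch that hashes each line and groups lines sharing the same hash URI. It assumes lines are encoded as UTF-8 and hashed with SHA-256 without a trailing newline. The class `LineHasher` and its methods are made up for this example, and the sample lines are placeholders rather than real preston grep output.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: derive a hash URI per line and group identical lines.
class LineHasher {
    // Returns hash://sha256/<hex digest> for the UTF-8 bytes of the line.
    static String hashUri(String line) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256")
                    .digest(line.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return "hash://sha256/" + hex;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Maps each hash URI to the lines that produced it, in first-seen order.
    static Map<String, List<String>> groupByHash(List<String> lines) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String line : lines) {
            groups.computeIfAbsent(hashUri(line), k -> new ArrayList<>()).add(line);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Placeholder records; any hash URI listed with a count above 1 marks identical content.
        List<String> lines = Arrays.asList("record a", "record b", "record a");
        groupByHash(lines).forEach((uri, same) ->
                System.out.println(uri + " " + same.size()));
    }
}
```

Comparing the short hash URIs is then enough to spot identical records, without carrying the full record text around.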