
experiment with figures

jhpoelen committed Feb 8, 2019
1 parent 0afaf8e commit e17dcea25fcfa615a1f09329d23491ceb3f8e216
Showing with 35 additions and 20 deletions.
  1. +1 −1 README.md
  2. +11 −0 history.dot
  3. BIN history.png
  4. +23 −19 process.dot
  5. BIN process.png
@@ -12,7 +12,7 @@ Preston uses the [PROV](https://www.w3.org/TR/prov-o/) and [PAV](https://pav-ont
The process diagram below shows how Preston starts crawls to download copies of biodiversity registries and their datasets. A detailed log of the crawl activities is recorded, describing what data was discovered and how. This activity log is referred to as the history of a biodiversity dataset graph. The numbers indicate the sequence of events. Click on the image to enlarge.
-<img src="https://raw.githubusercontent.com/bio-guoda/preston/master/process.png" width="50%">
+<img src="https://raw.githubusercontent.com/bio-guoda/preston/master/history.png" width="50%">
The figure above shows how Preston starts (1) a crawl activity. This crawl activity then accesses (2) a registry to save (3, 4) a snapshot (or version) of it. Next, the datasets listed (5) in this registry version are accessed, downloaded, and saved (6, 7, 8). Finally, the crawl activity saves the log of its activities (1-8) as a version of a biodiversity dataset graph (9, 10). This log can be used to retrace the steps of the crawl activity and to reconstruct the relationships between the registries and datasets, as well as their respective content signatures, or content hashes. Actual crawl activities involve multiple registries (e.g., GBIF, iDigBio) and potentially thousands of datasets.
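
As a rough illustration of how such a log could be retraced, the sketch below encodes the numbered relationships from the process diagram as plain (subject, relation, object) tuples and follows them from a registry to its datasets and their versions. The identifiers (`registry`, `registry-version`, `dataset`, `dataset-version`, `crawl`) and the helpers `objects` and `retrace` are hypothetical placeholders chosen for this example; Preston's actual log format and API are not shown here.

```python
# Hypothetical, simplified stand-in for a crawl log: each tuple mirrors one
# numbered edge of the process diagram. All names are placeholders, not real
# Preston identifiers or content hashes.
LOG = [
    ("crawl", "startedBy", "preston"),               # (1)
    ("registry", "usedBy", "crawl"),                 # (2)
    ("registry-version", "generatedBy", "crawl"),    # (3)
    ("registry", "hasVersion", "registry-version"),  # (4)
    ("registry-version", "hadMember", "dataset"),    # (5)
    ("dataset", "usedBy", "crawl"),                  # (6)
    ("dataset-version", "generatedBy", "crawl"),     # (7)
    ("dataset", "hasVersion", "dataset-version"),    # (8)
]


def objects(log, subject, relation):
    """Return every object linked to `subject` by `relation`, in log order."""
    return [o for s, r, o in log if s == subject and r == relation]


def retrace(log, registry):
    """Map each version of `registry` to its datasets and their versions."""
    result = {}
    for registry_version in objects(log, registry, "hasVersion"):
        datasets = objects(log, registry_version, "hadMember")
        result[registry_version] = {
            dataset: objects(log, dataset, "hasVersion") for dataset in datasets
        }
    return result


if __name__ == "__main__":
    print(retrace(LOG, "registry"))
```

The nested dictionary keeps the registry version, its member datasets, and each dataset's versions together, which is the relationship the numbered steps describe. A real log would carry URLs and `hash://sha256/...` identifiers in place of the placeholder names.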
@@ -0,0 +1,11 @@
+digraph biodiversity_graph {
+rankdir=RL
+a [shape="box", label="history of biodiversity\ndataset graphs"];
+//v1 [shape="box", label="a biodiversity\ndataset graph\nhash://sha256/..."];
+v1 [shape="box", image="process.png"];
+v2 [shape="box", image="process.png"];
+
+v2 -> v1 [label="hasPreviousVersion"]
+v1 -> a [label="versionOf"]
+}
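
The version links in this graph can be walked programmatically. The following is a minimal sketch, assuming the two edges above are available as (subject, relation, object) tuples; the names `v1`, `v2`, and `history` and the helper `version_chain` are placeholders for illustration, not part of Preston.

```python
# Hypothetical edges mirroring history.dot; "v1", "v2" and "history" are
# placeholder names, not real content hashes.
EDGES = [
    ("v2", "hasPreviousVersion", "v1"),
    ("v1", "versionOf", "history"),
]


def version_chain(edges, latest):
    """Follow hasPreviousVersion links from `latest` back to the oldest version."""
    previous = {s: o for s, r, o in edges if r == "hasPreviousVersion"}
    chain = [latest]
    while chain[-1] in previous:
        older = previous[chain[-1]]
        if older in chain:  # guard against accidental cycles in the input
            break
        chain.append(older)
    return chain


if __name__ == "__main__":
    print(version_chain(EDGES, "v2"))
```

Starting from the most recent version and following `hasPreviousVersion` yields the versions from newest to oldest; the `versionOf` edge is ignored here because it points at the history node rather than at another version.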

BIN +111 KB history.png
Binary file not shown.
@@ -1,28 +1,32 @@
-digraph test123 {
-r [shape="box", label="a registry\nhttps://..."];
-b [shape="box", label="history of\nbiodiversity\ndatasets"];
-preston [shape="box", label="Preston"];
+digraph biodiversity_graph {
+labelloc="t";
+label="a biodiversity dataset graph\nhash://sha256/...";
+
+r [shape="box", label="a registry\nhttps://..."];
+preston [shape="box", label="Preston"];

-a1 [label="a crawl\nactivity"];
-a1 -> preston [label="(1) startedBy"];
+a1 [label="a crawl\nactivity"];
+a1 -> preston [label="(1) startedBy"];

-r -> a1 [label="(2) usedBy"];
+r -> a1 [label="(2) usedBy"];

-rv0 [shape="box", label="a registry\nversion\nhash://sha256/..."];
-r -> rv0 [label="(4) hasVersion"];
-rv0 -> a1 [label="(3) generatedBy"];
+rv0 [shape="box", label="a registry\nversion\nhash://sha256/..."];
+r -> rv0 [label="(4) hasVersion"];
+rv0 -> a1 [label="(3) generatedBy"];

-d [shape="box", label="a dataset\nhttps://..."];
-rv0 -> d [label="(5) hadMember"];
-d -> a1 [label="(6) usedBy"];
+d [shape="box", label="a dataset\nhttps://..."];
+rv0 -> d [label="(5) hadMember"];
+d -> a1 [label="(6) usedBy"];

-dv0 [shape="box", label="a dataset\nversion\nhash://sha256/..."];
-dv0 -> a1 [label="(7) generatedBy"];
+dv0 [shape="box", label="a dataset\nversion\nhash://sha256/..."];
+dv0 -> a1 [label="(7) generatedBy"];

-d -> dv0 [label="(8) hasVersion"];
+d -> dv0 [label="(8) hasVersion"];

-x1 [shape="box", label="a biodiversity\ndataset\ngraph\n\nhash://sha256/..."];
-x1 -> a1 [label="(9) generatedBy"];
-b -> x1 [shape="box", label="(10) hasVersion"];
+
+
+//cluster_0 -> a1 [label="(9) generatedBy"];
+//b -> cluster_0 [shape="box", label="(10) hasVersion"];
+//b [shape="box", label="history of\nbiodiversity\ndatasets"];
}

BIN -1.93 KB (97%) process.png
Binary file not shown.
