add option to [grep]/[match] to select by line #109

jhpoelen · 2021-03-10T18:52:24Z

Currently, you can select parts of content that match a specific pattern:

e.g.,

$ preston ls | preston match [some pattern]
... cut:zip:hash://sha256/97cbeae429fbc95d1859f7afa28b33f08ac64125ba72511c49c4b77ca66d2d66!/measurementorfact.txt!/b39887-40069

We'd like to add an option that reports only the line on which the pattern was found:

e.g.,

$ preston ls | preston match --line [some pattern]
... line:zip:hash://sha256/97cbeae429fbc95d1859f7afa28b33f08ac64125ba72511c49c4b77ca66d2d66!/measurementorfact.txt!/L23

where the example above expresses that a match was found on line 23.

jhpoelen · 2021-03-10T19:04:11Z

When option "-o" is combined with --line, you also get the cut notation:

preston match [pattern]

line:zip:hash://sha256/97cbeae429fbc95d1859f7afa28b33f08ac64125ba72511c49c4b77ca66d2d66!/measurementorfact.txt!/L123

and

preston match -o [pattern]

cut:line:zip:hash://sha256/97cbeae429fbc95d1859f7afa28b33f08ac64125ba72511c49c4b77ca66d2d66!/measurementorfact.txt!/L123!/b12-23

and

preston match --no-line [pattern]

cut:zip:hash://sha256/97cbeae429fbc95d1859f7afa28b33f08ac64125ba72511c49c4b77ca66d2d66!/measurementorfact.txt!/b12-23
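
To make the three notations above concrete, here is a minimal Python sketch that takes such an identifier apart. It is not Preston's implementation: the names `Selector` and `parse_selector` are hypothetical, and the grammar is only what can be read off the examples in this thread (an optional `cut:` prefix paired with a trailing `!/b<start>-<end>`, and an optional `line:` prefix paired with a trailing `!/L<n>`).

```python
import re
from typing import NamedTuple, Optional, Tuple


class Selector(NamedTuple):
    """Hypothetical container for the parts of a line:/cut: identifier."""
    content_id: str                         # what remains after the prefixes are peeled off
    line: Optional[int]                     # n from a trailing "!/L<n>", if present
    byte_range: Optional[Tuple[int, int]]   # (start, end) from a trailing "!/b<start>-<end>"


def parse_selector(iri: str) -> Selector:
    """Peel an outer cut: and then an outer line: off an identifier.

    Assumption: the grammar is inferred from the examples in this issue,
    not taken from Preston's source code.
    """
    rest = iri
    line = None
    byte_range = None

    if rest.startswith("cut:"):
        m = re.fullmatch(r"cut:(.*)!/b(\d+)-(\d+)", rest)
        if m is None:
            raise ValueError(f"malformed cut selector: {iri}")
        rest = m.group(1)
        byte_range = (int(m.group(2)), int(m.group(3)))

    if rest.startswith("line:"):
        m = re.fullmatch(r"line:(.*)!/L(\d+)", rest)
        if m is None:
            raise ValueError(f"malformed line selector: {iri}")
        rest = m.group(1)
        line = int(m.group(2))

    return Selector(rest, line, byte_range)
```

For instance, `parse_selector("cut:line:hash://sha256/abc!/L123!/b12-23")` (with a shortened, made-up hash) yields the inner content id, line 123, and the byte range (12, 23).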

PS compare with grep -o

man grep
... 
-o, --only-matching
              Print only the matched (non-empty) parts of a matching line, with each  such  part  on  a  separate
              output line.
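
The difference between reporting whole lines and reporting only the matched parts can be sketched in a few lines of Python. This is an illustration of the grep semantics quoted above, not of how preston match is implemented; the function names `match_lines` and `match_only` are made up for this example.

```python
import re
from typing import Iterator, Tuple


def match_lines(text: str, pattern: str) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, full_line) for each line that contains a match."""
    regex = re.compile(pattern)
    for number, line in enumerate(text.splitlines(), start=1):
        if regex.search(line):
            yield number, line


def match_only(text: str, pattern: str) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, matched_part) for every non-empty match,
    one result per match, in the spirit of grep -o."""
    regex = re.compile(pattern)
    for number, line in enumerate(text.splitlines(), start=1):
        for m in regex.finditer(line):
            if m.group(0):  # skip empty matches, as grep -o does
                yield number, m.group(0)
```

A line with two matches produces one result from `match_lines` and two from `match_only`, which is why the -o form needs a finer-grained locator than a line number alone.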

…ns to revert original behavior #109

mielliott · 2021-06-04T19:31:11Z

Using the preston-amazon dataset hash://sha256/1aa34112ade084ccc8707388fbc329dcb8fae5f895cb266e3ad943f7495740b3

$ preston history | tail -n1
<hash://sha256/1aa34112ade084ccc8707388fbc329dcb8fae5f895cb266e3ad943f7495740b3> <http://purl.org/pav/previousVersion> <hash://sha256/d7b73e3472d5a1989598f2a46116a4fc11dfb9ceacdf0a2b2f7f69737883c951> .

Default to reporting full lines that contain matches:

$ preston ls | preston match | head -n4
<urn:uuid:0a460ac1-2e23-4a71-96a0-39448b404ea4> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/prov#Activity> <urn:uuid:0a460ac1-2e23-4a71-96a0-39448b404ea4> .
<urn:uuid:0a460ac1-2e23-4a71-96a0-39448b404ea4> <http://www.w3.org/ns/prov#used> <hash://sha256/6d86c332b045e74fe4410f79655a1f47596808c057f30779b9584dba38fa25d5> <urn:uuid:0a460ac1-2e23-4a71-96a0-39448b404ea4> .
<urn:uuid:0a460ac1-2e23-4a71-96a0-39448b404ea4> <http://purl.org/dc/terms/description> "An activity that finds the locations of text matching the regular expression '(?:https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]' inside any encountered content (e.g., hash://sha256/... identifiers)."@en <urn:uuid:0a460ac1-2e23-4a71-96a0-39448b404ea4> .
<line:hash://sha256/6d86c332b045e74fe4410f79655a1f47596808c057f30779b9584dba38fa25d5!/L1> <http://www.w3.org/ns/prov#value> "[{\"key\":\"82ceb6ba-f762-11e1-a439-00145eb45e9a\",\"title\":\"Andes to Amazon Biodiversity Program\",\"description\":\"The Andes to Amazon Program is an international, multidisciplinary team of scientists, students, and Peruvian locals working between the Botanical Research Institute of Texas (BRIT) and selected field and museum sites in Peru.\",\"type\":\"OCCURRENCE\"},{\"key\":\"58414378-4fb2-47e0-8dd5-8b55d5c77117\",\"title\":\"Bolivian Amazon lowland fish metacommunity data\",\"description\":\"<p>This dataset represents data from the paper Yukoni, T. and Torres L. V. (2016) Fish metacommunity dynamics in the patchy heterogeneous habitats of varzea lakes, turbid river channels and transparent clear and black water bodies in the Amazonian Lowlands of Bolivia. Environmental Biology of Fishes.</p>\\n<p>This study documents the spatial dynamic of fish metacommunity based on the date sets of 65 sites, covering two geographic patches of transparent water valleys; Manuripi and Itenez Rivers, separated by turbid water valleys originate in the Andes and the Savanna.</p>\\n<p>See http://www.freshwaterbiodiversity.eu/metadb/bf_mdb_view.php?entryID=BFE_105 for additional metadata.</p>\",\"type\":\"OCCURRENCE\"},{\"key\":\"5a607ce6-eaaf-4420-a302-54ddc767130c\",\"title\":\"Earthworms (Oligochaeta: Glossoscolecidae) of the Amazon region of Colombia\",\"description\":\"This dataset contains the digitized treatments in Plazi based on the original journal article M, Alexander Feijoo, Celis, Liliana V. (2011): Earthworms (Oligochaeta: Glossoscolecidae) of the Amazon region of Colombia. 
Zootaxa 3201: 27-44, DOI: 10.5281/zenodo.202378\",\"type\":\"CHECKLIST\"},{\"key\":\"4716951d-11f5-4f31-bb1d-d40c95268ad3\",\"title\":\"Notes on the Agrilus fauna of the Colombian Amazon (Coleoptera, Buprestidae)\",\"description\":\"This dataset contains the digitized treatments in Plazi based on the original journal article Curletti, Gianfranco, Dutto, Angelo (2017): Notes on the Agrilus fauna of the Colombian Amazon (Coleoptera, Buprestidae). Zootaxa 4243 (2): 373-376, DOI: https://doi.org/10.11646/zootaxa.4243.2.7\",\"type\":\"CHECKLIST\"},{\"key\":\"10b6b053-2ccf-4ca0-8922-486ad098fc56\",\"title\":\"Stink bugs (Hemiptera: Pentatomidae) from Brazilian Amazon: checklist and new records\",\"description\":\"This dataset contains the digitized treatments in Plazi based on the original journal article Silva, Valeria Juliete Da, Santos, Cleverson Rannieri Meira Dos, Fernandes, Jose Antonio Marin (2018): Stink bugs (Hemiptera: Pentatomidae) from Brazilian Amazon: checklist and new records. Zootaxa 4425 (3): 401-455, DOI: https://doi.org/10.11646/zootaxa.4425.3.1\",\"type\":\"CHECKLIST\"},{\"key\":\"6b43aef0-e62d-478f-9a88-b692578f0a73\",\"title\":\"New species and notes on Apenesia (Hymenoptera, Bethylidae) from the Brazilian Amazon\",\"description\":\"This dataset contains the digitized treatments in Plazi based on the original journal article Lanes, G. O., Azevedo, C. O. (2004): New species and notes on Apenesia (Hymenoptera, Bethylidae) from the Brazilian Amazon. 
Zootaxa 679: 1-16, DOI: 10.5281/zenodo.158458\",\"type\":\"CHECKLIST\"},{\"key\":\"9f2ce723-8926-4f4c-b9f6-a4b596c8c1a9\",\"title\":\"Revision of the Brazilian Amazon Basin species of Porphyrochroa Melander (Diptera: Empididae)\",\"description\":\"This dataset contains the digitized treatments in Plazi based on the original journal article Mendonça, Mirian Nascimento, Rafael, José Albertino, Ale-Rocha, Rosaly (2008): Revision of the Brazilian Amazon Basin species of Porphyrochroa Melander (Diptera: Empididae). Zootaxa 1859: 1-39, DOI: 10.5281/zenodo.183631\",\"type\":\"CHECKLIST\"},{\"key\":\"f0096116-c79f-41cd-8376-35587cbe9fcd\",\"title\":\"New species of earthworms (Oligochaeta: Glossoscolecidae) in the Amazon region of Colombia\",\"description\":\"This dataset contains the digitized treatments in Plazi based on the original journal article M, Alexander Feijoo, Celis, Liliana V. (2012): New species of earthworms (Oligochaeta: Glossoscolecidae) in the Amazon region of Colombia. Zootaxa 3458: 103-119, DOI: 10.5281/zenodo.214602\",\"type\":\"CHECKLIST\"},{\"key\":\"0d875ec4-2366-4592-803d-cf6c61de8df4\",\"title\":\"A new species of Rhinella (Anura: Bufonidae) from Brazilian Amazon\",\"description\":\"This dataset contains the digitized treatments in Plazi based on the original journal article Lima, Albertina P., Menin, Marcelo, Araújo, Maria Carmozina De (2007): A new species of Rhinella (Anura: Bufonidae) from Brazilian Amazon. Zootaxa 1663: 1-15, DOI: 10.5281/zenodo.179996\",\"type\":\"CHECKLIST\"},{\"key\":\"92b19f16-a3d8-4924-89ff-1800c1d048d0\",\"title\":\"Amazoonops, a new genus of goblin spiders (Araneae: Oonopidae) from the Brazilian Amazon\",\"description\":\"This dataset contains the digitized treatments in Plazi based on the original journal article Ott, Ricardo, Ruiz, Gustavo R. S., Brescovit, Antonio D., Bonaldo, Alexandre B. (2017): Amazoonops, a new genus of goblin spiders (Araneae: Oonopidae) from the Brazilian Amazon. 
Zootaxa 4236 (2): 244-268, DOI: https://doi.org/10.11646/zootaxa.4236.2.2\",\"type\":\"CHECKLIST\"},{\"key\":\"e24ad3dc-e2fe-44a9-983c-d7fb90147e9f\",\"title\":\"Three new species of Leptohyphidae (Insecta: Ephemeroptera) from Central Amazon, Brazil\",\"description\":\"This dataset contains the digitized treatments in Plazi based on the original journal article Belmont, Enide Luciana L., Salles, Frederico F., Hamada, Neusa (2011): Three new species of Leptohyphidae (Insecta: Ephemeroptera) from Central Amazon, Brazil. Zootaxa 3047: 43-53, DOI: 10.5281/zenodo.201430\",\"type\":\"CHECKLIST\"},{\"key\":\"9d215804-dcb2-49ba-82d5-cc927abfb384\",\"title\":\"Two New Species of Halictophagidae (Insecta: Strepsiptera) from the Brazilian Amazon Basin\",\"description\":\"This dataset contains the digitized treatments in Plazi based on the original journal article Kogan, Marcos (2012): Two New Species of Halictophagidae (Insecta: Strepsiptera) from the Brazilian Amazon Basin. Zootaxa 3517: 79-87, DOI: 10.5281/zenodo.282617\",\"type\":\"CHECKLIST\"},{\"key\":\"b4683510-1fed-4ad4-bef6-64606e847fe9\",\"title\":\"A new Argyrodiaptomus (Copepoda: Calanoida: Diaptomidae) from the southwestern Brazilian Amazon\",\"description\":\"This dataset contains the digitized treatments in Plazi based on the original journal article Previattelli, Daniel, Santos-Silva, Edinaldo Nelson Dos (2007): A new Argyrodiaptomus (Copepoda: Calanoida: Diaptomidae) from the southwestern Brazilian Amazon. Zootaxa 1518: 1-29, DOI: 10.5281/zenodo.177358\",\"type\":\"CHECKLIST\"},{\"key\":\"d0e838e4-82b0-440b-b876-b62afe416245\",\"title\":\"A new species of Elachistocleis (Anura; Microhylidae) from the Brazilian Amazon\",\"description\":\"This dataset contains the digitized treatments in Plazi based on the original journal article Toledo, Luís Felipe (2010): A new species of Elachistocleis (Anura; Microhylidae) from the Brazilian Amazon. 
Zootaxa 2496: 63-68, DOI: 10.5281/zenodo.195714\",\"type\":\"CHECKLIST\"},{\"key\":\"b3262114-8580-4828-973d-b743d1034d00\",\"title\":\"A new species of Furciseta (Diptera, Ctenostylidae) from the Brazilian Amazon\",\"description\":\"This dataset contains the digitized treatments in Plazi based on the original journal article Câmara, J. T., Rafael, J. A. (2013): A new species of Furciseta (Diptera, Ctenostylidae) from the Brazilian Amazon. Zootaxa 3669 (2): 147-152, DOI: http://dx.doi.org/10.11646/zootaxa.3669.2.5\",\"type\":\"CHECKLIST\"},{\"key\":\"a1d56c0b-d41f-42c1-91d2-417ab340969f\",\"title\":\"A new species of Besleria (Gesneriaceae) from the western Amazon rainforest\",\"description\":\"This dataset contains the digitized treatments in Plazi based on the original journal article Gabriel Emiliano Ferreira, Andréa Onofre De Araújo, Michael John Gilbert Hopkins, Alain Chautems (2017): A new species of Besleria (Gesneriaceae) from the western Amazon rainforest. Brittonia 69 (2): 241-245, DOI: 10.1007/s12228-017-9464-6\",\"type\":\"CHECKLIST\"},{\"key\":\"1edd9658-1b17-4794-bbf0-e2eccb017fd3\",\"title\":\"Leporinus amazonicus, a new anostomid species from the Amazon lowlands, Brazil (Osteichthyes: Characiformes)\",\"description\":\"This dataset contains the digitized treatments in Plazi based on the original journal article Santos, Geraldo Mendes Dos, Zuanon, Jansen (2008): Leporinus amazonicus, a new anostomid species from the Amazon lowlands, Brazil (Osteichthyes: Characiformes). 
Zootaxa 1815: 35-42, DOI: 10.5281/zenodo.182896\",\"type\":\"CHECKLIST\"},{\"key\":\"bbe844d6-89b6-4c9b-9cd1-4b7a8c332dc5\",\"title\":\"A new species of Paraonis and an annotated checklist of polychaetes from mangroves of the Brazilian Amazon Coast (Annelida, Paraonidae)\",\"description\":\"This dataset contains the digitized treatments in Plazi based on the original journal article Ribeiro, Rannyele Passos, Alves, Paulo Ricardo, Almeida, Zafira da Silva de, Ruta, Christine (2018): A new species of Paraonis and an annotated checklist of polychaetes from mangroves of the Brazilian Amazon Coast (Annelida, Paraonidae). ZooKeys 740: 1-34, DOI: http://dx.doi.org/10.3897/zookeys.740.14640, URL: http://dx.doi.org/10.3897/zookeys.740.14640\",\"type\":\"CHECKLIST\"},{\"key\":\"6e68c0bd-4d98-4a19-a066-e610c60b9478\",\"title\":\"Amazoniaseius imparisetosus n. sp., n. g.: an unusual new phytoseiid mite (Acari: Phytoseiidae) from the Amazon forest\",\"description\":\"This dataset contains the digitized treatments in Plazi based on the original journal article Demite, Peterson R., Cruz, Wilton P., Moraes, Gilberto J. (2017): Amazoniaseius imparisetosus n. sp., n. g.: an unusual new phytoseiid mite (Acari: Phytoseiidae) from the Amazon forest. Zootaxa 4236 (2): 302-310, DOI: https://doi.org/10.11646/zootaxa.4236.2.5\",\"type\":\"CHECKLIST\"},{\"key\":\"663199f1-3528-4289-8069-d27552f62f10\",\"title\":\"A survey of Dryinidae (Hymenoptera, Chrysidoidea) from Caxiuanã, Amazon Basin, with three new taxa and keys to genera and species\",\"description\":\"This dataset contains the digitized treatments in Plazi based on the original journal article Coelho, Beatriz W., Aguiar, Alexandre P., Engel, Michael S. (2011): A survey of Dryinidae (Hymenoptera, Chrysidoidea) from Caxiuanã, Amazon Basin, with three new taxa and keys to genera and species. Zootaxa 2907: 1-21, DOI: 10.5281/zenodo.201416\",\"type\":\"CHECKLIST\"}]" <urn:uuid:0a460ac1-2e23-4a71-96a0-39448b404ea4> .

With -o:

$ preston ls | preston match -o | head -n4
<urn:uuid:268b5c87-c4b2-4f87-8c6d-acb394f6fc5b> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/prov#Activity> <urn:uuid:268b5c87-c4b2-4f87-8c6d-acb394f6fc5b> .
<urn:uuid:268b5c87-c4b2-4f87-8c6d-acb394f6fc5b> <http://www.w3.org/ns/prov#used> <hash://sha256/6d86c332b045e74fe4410f79655a1f47596808c057f30779b9584dba38fa25d5> <urn:uuid:268b5c87-c4b2-4f87-8c6d-acb394f6fc5b> .
<urn:uuid:268b5c87-c4b2-4f87-8c6d-acb394f6fc5b> <http://purl.org/dc/terms/description> "An activity that finds the locations of text matching the regular expression '(?:https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]' inside any encountered content (e.g., hash://sha256/... identifiers)."@en <urn:uuid:268b5c87-c4b2-4f87-8c6d-acb394f6fc5b> .
<cut:line:hash://sha256/6d86c332b045e74fe4410f79655a1f47596808c057f30779b9584dba38fa25d5!/L1!/b1063-1137> <http://www.w3.org/ns/prov#value> "http://www.freshwaterbiodiversity.eu/metadb/bf_mdb_view.php?entryID=BFE_105" <urn:uuid:268b5c87-c4b2-4f87-8c6d-acb394f6fc5b> .

With -o --no-line to use the original behavior:

$ preston ls | preston match -o --no-line | head -n4
<urn:uuid:cc37819d-0521-496f-8919-689aa5453f29> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/prov#Activity> <urn:uuid:cc37819d-0521-496f-8919-689aa5453f29> .
<urn:uuid:cc37819d-0521-496f-8919-689aa5453f29> <http://www.w3.org/ns/prov#used> <hash://sha256/6d86c332b045e74fe4410f79655a1f47596808c057f30779b9584dba38fa25d5> <urn:uuid:cc37819d-0521-496f-8919-689aa5453f29> .
<urn:uuid:cc37819d-0521-496f-8919-689aa5453f29> <http://purl.org/dc/terms/description> "An activity that finds the locations of text matching the regular expression '(?:https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]' inside any encountered content (e.g., hash://sha256/... identifiers)."@en <urn:uuid:cc37819d-0521-496f-8919-689aa5453f29> .
<cut:hash://sha256/6d86c332b045e74fe4410f79655a1f47596808c057f30779b9584dba38fa25d5!/b1063-1137> <http://www.w3.org/ns/prov#value> "http://www.freshwaterbiodiversity.eu/metadb/bf_mdb_view.php?entryID=BFE_105" <urn:uuid:cc37819d-0521-496f-8919-689aa5453f29> .

@jhpoelen

jhpoelen · 2021-06-04T22:09:56Z

@mielliott I was able to reproduce your newly added -o feature using:

$ preston ls | preston match | head -n4 
<urn:uuid:e7a921d4-6f1f-44b6-bf7d-964614e6d233> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/prov#Activity> <urn:uuid:e7a921d4-6f1f-44b6-bf7d-964614e6d233> .
<urn:uuid:e7a921d4-6f1f-44b6-bf7d-964614e6d233> <http://www.w3.org/ns/prov#used> <hash://sha256/6d86c332b045e74fe4410f79655a1f47596808c057f30779b9584dba38fa25d5> <urn:uuid:e7a921d4-6f1f-44b6-bf7d-964614e6d233> .
<urn:uuid:e7a921d4-6f1f-44b6-bf7d-964614e6d233> <http://purl.org/dc/terms/description> "An activity that finds the locations of text matching the regular expression '(?:https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]' inside any encountered content (e.g., hash://sha256/... identifiers)."@en <urn:uuid:e7a921d4-6f1f-44b6-bf7d-964614e6d233> .
<line:hash://sha256/6d86c332b045e74fe4410f79655a1f47596808c057f30779b9584dba38fa25d5!/L1> <http://www.w3.org/ns/prov#value> "[{\"key\":\"82ceb6ba-f762-11e1-a439-00145eb45e9a\",\"title\":\"Andes to Amazon Biodiversity Program\",\"description\":\"The Andes to Amazon Program is an international, multidisciplinary team of scientists, students, and Peruvian locals working between the Botanical Research Institute of Texas (BRIT) and selected field and museum sites in Peru.\",\"type\":\"OCCURRENCE\"},{\"key\":\"58414378-4fb2-47e0-8dd5-8b55d5c77117\",\"title\":\"Bolivian Amazon lowland fish metacommunity data\",\"description\":\"<p>This dataset represents data from the paper Yukoni, T. and Torres L. V. (2016) Fish metacommunity dynamics in the patchy heterogeneous habitats of varzea lakes, turbid river channels and transparent clear and black water bodies in the Amazonian Lowlands of Bolivia. Environmental Biology of Fishes.</p>\\n<p>This study documents the spatial dynamic of fish metacommunity based on the date sets of 65 sites, covering two geographic patches of transparent water valleys; Manuripi and Itenez Rivers, separated by turbid water valleys originate in the Andes and the Savanna.</p>\\n<p>See http://www.freshwaterbiodiversity.eu/metadb/bf_mdb_view.php?entryID=BFE_105 for additional metadata.</p>\",\"type\":\"OCCURRENCE\"},{\"key\":\"5a607ce6-eaaf-4420-a302-54ddc767130c\",\"title\":\"Earthworms (Oligochaeta: Glossoscolecidae) of the Amazon region of Colombia\",\"description\":\"This dataset contains the digitized treatments in Plazi based on the original journal article M, Alexander Feijoo, Celis, Liliana V. (2011): Earthworms (Oligochaeta: Glossoscolecidae) of the Amazon region of Colombia. 
Zootaxa 3201: 27-44, DOI: 10.5281/zenodo.202378\",\"type\":\"CHECKLIST\"},{\"key\":\"4716951d-11f5-4f31-bb1d-d40c95268ad3\",\"title\":\"Notes on the Agrilus fauna of the Colombian Amazon (Coleoptera, Buprestidae)\",\"description\":\"This dataset contains the digitized treatments in Plazi based on the original journal article Curletti, Gianfranco, Dutto, Angelo (2017): Notes on the Agrilus fauna of the Colombian Amazon (Coleoptera, Buprestidae). Zootaxa 4243 (2): 373-376, DOI: https://doi.org/10.11646/zootaxa.4243.2.7\",\"type\":\"CHECKLIST\"},{\"key\":\"10b6b053-2ccf-4ca0-8922-486ad098fc56\",\"title\":\"Stink bugs (Hemiptera: Pentatomidae) from Brazilian Amazon: checklist and new records\",\"description\":\"This dataset contains the digitized treatments in Plazi based on the original journal article Silva, Valeria Juliete Da, Santos, Cleverson Rannieri Meira Dos, Fernandes, Jose Antonio Marin (2018): Stink bugs (Hemiptera: Pentatomidae) from Brazilian Amazon: checklist and new records. Zootaxa 4425 (3): 401-455, DOI: https://doi.org/10.11646/zootaxa.4425.3.1\",\"type\":\"CHECKLIST\"},{\"key\":\"6b43aef0-e62d-478f-9a88-b692578f0a73\",\"title\":\"New species and notes on Apenesia (Hymenoptera, Bethylidae) from the Brazilian Amazon\",\"description\":\"This dataset contains the digitized treatments in Plazi based on the original journal article Lanes, G. O., Azevedo, C. O. (2004): New species and notes on Apenesia (Hymenoptera, Bethylidae) from the Brazilian Amazon. 
Zootaxa 679: 1-16, DOI: 10.5281/zenodo.158458\",\"type\":\"CHECKLIST\"},{\"key\":\"9f2ce723-8926-4f4c-b9f6-a4b596c8c1a9\",\"title\":\"Revision of the Brazilian Amazon Basin species of Porphyrochroa Melander (Diptera: Empididae)\",\"description\":\"This dataset contains the digitized treatments in Plazi based on the original journal article Mendonça, Mirian Nascimento, Rafael, José Albertino, Ale-Rocha, Rosaly (2008): Revision of the Brazilian Amazon Basin species of Porphyrochroa Melander (Diptera: Empididae). Zootaxa 1859: 1-39, DOI: 10.5281/zenodo.183631\",\"type\":\"CHECKLIST\"},{\"key\":\"f0096116-c79f-41cd-8376-35587cbe9fcd\",\"title\":\"New species of earthworms (Oligochaeta: Glossoscolecidae) in the Amazon region of Colombia\",\"description\":\"This dataset contains the digitized treatments in Plazi based on the original journal article M, Alexander Feijoo, Celis, Liliana V. (2012): New species of earthworms (Oligochaeta: Glossoscolecidae) in the Amazon region of Colombia. Zootaxa 3458: 103-119, DOI: 10.5281/zenodo.214602\",\"type\":\"CHECKLIST\"},{\"key\":\"0d875ec4-2366-4592-803d-cf6c61de8df4\",\"title\":\"A new species of Rhinella (Anura: Bufonidae) from Brazilian Amazon\",\"description\":\"This dataset contains the digitized treatments in Plazi based on the original journal article Lima, Albertina P., Menin, Marcelo, Araújo, Maria Carmozina De (2007): A new species of Rhinella (Anura: Bufonidae) from Brazilian Amazon. Zootaxa 1663: 1-15, DOI: 10.5281/zenodo.179996\",\"type\":\"CHECKLIST\"},{\"key\":\"92b19f16-a3d8-4924-89ff-1800c1d048d0\",\"title\":\"Amazoonops, a new genus of goblin spiders (Araneae: Oonopidae) from the Brazilian Amazon\",\"description\":\"This dataset contains the digitized treatments in Plazi based on the original journal article Ott, Ricardo, Ruiz, Gustavo R. S., Brescovit, Antonio D., Bonaldo, Alexandre B. (2017): Amazoonops, a new genus of goblin spiders (Araneae: Oonopidae) from the Brazilian Amazon. 
Zootaxa 4236 (2): 244-268, DOI: https://doi.org/10.11646/zootaxa.4236.2.2\",\"type\":\"CHECKLIST\"},{\"key\":\"e24ad3dc-e2fe-44a9-983c-d7fb90147e9f\",\"title\":\"Three new species of Leptohyphidae (Insecta: Ephemeroptera) from Central Amazon, Brazil\",\"description\":\"This dataset contains the digitized treatments in Plazi based on the original journal article Belmont, Enide Luciana L., Salles, Frederico F., Hamada, Neusa (2011): Three new species of Leptohyphidae (Insecta: Ephemeroptera) from Central Amazon, Brazil. Zootaxa 3047: 43-53, DOI: 10.5281/zenodo.201430\",\"type\":\"CHECKLIST\"},{\"key\":\"9d215804-dcb2-49ba-82d5-cc927abfb384\",\"title\":\"Two New Species of Halictophagidae (Insecta: Strepsiptera) from the Brazilian Amazon Basin\",\"description\":\"This dataset contains the digitized treatments in Plazi based on the original journal article Kogan, Marcos (2012): Two New Species of Halictophagidae (Insecta: Strepsiptera) from the Brazilian Amazon Basin. Zootaxa 3517: 79-87, DOI: 10.5281/zenodo.282617\",\"type\":\"CHECKLIST\"},{\"key\":\"b4683510-1fed-4ad4-bef6-64606e847fe9\",\"title\":\"A new Argyrodiaptomus (Copepoda: Calanoida: Diaptomidae) from the southwestern Brazilian Amazon\",\"description\":\"This dataset contains the digitized treatments in Plazi based on the original journal article Previattelli, Daniel, Santos-Silva, Edinaldo Nelson Dos (2007): A new Argyrodiaptomus (Copepoda: Calanoida: Diaptomidae) from the southwestern Brazilian Amazon. Zootaxa 1518: 1-29, DOI: 10.5281/zenodo.177358\",\"type\":\"CHECKLIST\"},{\"key\":\"d0e838e4-82b0-440b-b876-b62afe416245\",\"title\":\"A new species of Elachistocleis (Anura; Microhylidae) from the Brazilian Amazon\",\"description\":\"This dataset contains the digitized treatments in Plazi based on the original journal article Toledo, Luís Felipe (2010): A new species of Elachistocleis (Anura; Microhylidae) from the Brazilian Amazon. 
Zootaxa 2496: 63-68, DOI: 10.5281/zenodo.195714\",\"type\":\"CHECKLIST\"},{\"key\":\"b3262114-8580-4828-973d-b743d1034d00\",\"title\":\"A new species of Furciseta (Diptera, Ctenostylidae) from the Brazilian Amazon\",\"description\":\"This dataset contains the digitized treatments in Plazi based on the original journal article Câmara, J. T., Rafael, J. A. (2013): A new species of Furciseta (Diptera, Ctenostylidae) from the Brazilian Amazon. Zootaxa 3669 (2): 147-152, DOI: http://dx.doi.org/10.11646/zootaxa.3669.2.5\",\"type\":\"CHECKLIST\"},{\"key\":\"a1d56c0b-d41f-42c1-91d2-417ab340969f\",\"title\":\"A new species of Besleria (Gesneriaceae) from the western Amazon rainforest\",\"description\":\"This dataset contains the digitized treatments in Plazi based on the original journal article Gabriel Emiliano Ferreira, Andréa Onofre De Araújo, Michael John Gilbert Hopkins, Alain Chautems (2017): A new species of Besleria (Gesneriaceae) from the western Amazon rainforest. Brittonia 69 (2): 241-245, DOI: 10.1007/s12228-017-9464-6\",\"type\":\"CHECKLIST\"},{\"key\":\"1edd9658-1b17-4794-bbf0-e2eccb017fd3\",\"title\":\"Leporinus amazonicus, a new anostomid species from the Amazon lowlands, Brazil (Osteichthyes: Characiformes)\",\"description\":\"This dataset contains the digitized treatments in Plazi based on the original journal article Santos, Geraldo Mendes Dos, Zuanon, Jansen (2008): Leporinus amazonicus, a new anostomid species from the Amazon lowlands, Brazil (Osteichthyes: Characiformes). 
Zootaxa 1815: 35-42, DOI: 10.5281/zenodo.182896\",\"type\":\"CHECKLIST\"},{\"key\":\"bbe844d6-89b6-4c9b-9cd1-4b7a8c332dc5\",\"title\":\"A new species of Paraonis and an annotated checklist of polychaetes from mangroves of the Brazilian Amazon Coast (Annelida, Paraonidae)\",\"description\":\"This dataset contains the digitized treatments in Plazi based on the original journal article Ribeiro, Rannyele Passos, Alves, Paulo Ricardo, Almeida, Zafira da Silva de, Ruta, Christine (2018): A new species of Paraonis and an annotated checklist of polychaetes from mangroves of the Brazilian Amazon Coast (Annelida, Paraonidae). ZooKeys 740: 1-34, DOI: http://dx.doi.org/10.3897/zookeys.740.14640, URL: http://dx.doi.org/10.3897/zookeys.740.14640\",\"type\":\"CHECKLIST\"},{\"key\":\"6e68c0bd-4d98-4a19-a066-e610c60b9478\",\"title\":\"Amazoniaseius imparisetosus n. sp., n. g.: an unusual new phytoseiid mite (Acari: Phytoseiidae) from the Amazon forest\",\"description\":\"This dataset contains the digitized treatments in Plazi based on the original journal article Demite, Peterson R., Cruz, Wilton P., Moraes, Gilberto J. (2017): Amazoniaseius imparisetosus n. sp., n. g.: an unusual new phytoseiid mite (Acari: Phytoseiidae) from the Amazon forest. Zootaxa 4236 (2): 302-310, DOI: https://doi.org/10.11646/zootaxa.4236.2.5\",\"type\":\"CHECKLIST\"},{\"key\":\"663199f1-3528-4289-8069-d27552f62f10\",\"title\":\"A survey of Dryinidae (Hymenoptera, Chrysidoidea) from Caxiuanã, Amazon Basin, with three new taxa and keys to genera and species\",\"description\":\"This dataset contains the digitized treatments in Plazi based on the original journal article Coelho, Beatriz W., Aguiar, Alexandre P., Engel, Michael S. (2011): A survey of Dryinidae (Hymenoptera, Chrysidoidea) from Caxiuanã, Amazon Basin, with three new taxa and keys to genera and species. Zootaxa 2907: 1-21, DOI: 10.5281/zenodo.201416\",\"type\":\"CHECKLIST\"}]" <urn:uuid:e7a921d4-6f1f-44b6-bf7d-964614e6d233> .

and with -o

$ preston ls | preston match -o | head -n4
<urn:uuid:d35c7bbb-2e7e-4002-8cee-a73bb387828d> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/prov#Activity> <urn:uuid:d35c7bbb-2e7e-4002-8cee-a73bb387828d> .
<urn:uuid:d35c7bbb-2e7e-4002-8cee-a73bb387828d> <http://www.w3.org/ns/prov#used> <hash://sha256/6d86c332b045e74fe4410f79655a1f47596808c057f30779b9584dba38fa25d5> <urn:uuid:d35c7bbb-2e7e-4002-8cee-a73bb387828d> .
<urn:uuid:d35c7bbb-2e7e-4002-8cee-a73bb387828d> <http://purl.org/dc/terms/description> "An activity that finds the locations of text matching the regular expression '(?:https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]' inside any encountered content (e.g., hash://sha256/... identifiers)."@en <urn:uuid:d35c7bbb-2e7e-4002-8cee-a73bb387828d> .
<cut:line:hash://sha256/6d86c332b045e74fe4410f79655a1f47596808c057f30779b9584dba38fa25d5!/L1!/b1063-1137> <http://www.w3.org/ns/prov#value> "http://www.freshwaterbiodiversity.eu/metadb/bf_mdb_view.php?entryID=BFE_105" <urn:uuid:d35c7bbb-2e7e-4002-8cee-a73bb387828d> .

and with -o --no-line

$ preston ls | preston match -o --no-line | head -n4
<urn:uuid:f8c5fcb3-acbd-4233-aa66-40cada5cd3c5> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/ns/prov#Activity> <urn:uuid:f8c5fcb3-acbd-4233-aa66-40cada5cd3c5> .
<urn:uuid:f8c5fcb3-acbd-4233-aa66-40cada5cd3c5> <http://www.w3.org/ns/prov#used> <hash://sha256/6d86c332b045e74fe4410f79655a1f47596808c057f30779b9584dba38fa25d5> <urn:uuid:f8c5fcb3-acbd-4233-aa66-40cada5cd3c5> .
<urn:uuid:f8c5fcb3-acbd-4233-aa66-40cada5cd3c5> <http://purl.org/dc/terms/description> "An activity that finds the locations of text matching the regular expression '(?:https?|ftp|file)://[-a-zA-Z0-9+&@#/%?=~_|!:,.;]*[-a-zA-Z0-9+&@#/%=~_|]' inside any encountered content (e.g., hash://sha256/... identifiers)."@en <urn:uuid:f8c5fcb3-acbd-4233-aa66-40cada5cd3c5> .
<cut:hash://sha256/6d86c332b045e74fe4410f79655a1f47596808c057f30779b9584dba38fa25d5!/b1063-1137> <http://www.w3.org/ns/prov#value> "http://www.freshwaterbiodiversity.eu/metadb/bf_mdb_view.php?entryID=BFE_105" <urn:uuid:f8c5fcb3-acbd-4233-aa66-40cada5cd3c5> .

jhpoelen · 2021-06-04T22:20:36Z

I am pretty excited about your new feature, and about having a way to point to specific lines in an archive / file.

I was wondering about two things:

I was unable to resist the urge to type:

preston cat 'line:hash://sha256/6d86c332b045e74fe4410f79655a1f47596808c057f30779b9584dba38fa25d5!/L1'

and found preston complaining about the following:

java.io.IOException: problem retrieving [line:hash://sha256/6d86c332b045e74fe4410f79655a1f47596808c057f30779b9584dba38fa25d5!/L1]
	at bio.guoda.preston.cmd.CmdGet.handleContentQuery(CmdGet.java:61)
	at bio.guoda.preston.cmd.CmdGet.run(CmdGet.java:45)
	at bio.guoda.preston.cmd.CmdGet.run(CmdGet.java:32)
	at bio.guoda.preston.cmd.CmdLine.run(CmdLine.java:18)
	at bio.guoda.preston.cmd.CmdLine.run(CmdLine.java:26)
	at bio.guoda.preston.Preston.main(Preston.java:19)
Caused by: bio.guoda.preston.store.DereferenceException: failed to dereference [line:hash://sha256/6d86c332b045e74fe4410f79655a1f47596808c057f30779b9584dba38fa25d5!/L1]
	at bio.guoda.preston.store.ContentHashDereferencer.dereference(ContentHashDereferencer.java:26)
	at bio.guoda.preston.cmd.CmdGet.handleContentQuery(CmdGet.java:58)
	... 5 more
Caused by: java.io.IOException: failed to find content identified by [<line:hash://sha256/6d86c332b045e74fe4410f79655a1f47596808c057f30779b9584dba38fa25d5!/L1>]
	at bio.guoda.preston.stream.ContentStreamFactory.create(ContentStreamFactory.java:26)
	at bio.guoda.preston.store.ContentHashDereferencer.dereference(ContentHashDereferencer.java:24)
	... 6 more

Also, at first glance I interpreted the cut:line: notation:

cut:line:hash://sha256/6d86c332b045e74fe4410f79655a1f47596808c057f30779b9584dba38fa25d5!/L1!/b1063-1137

as: select the characters in range 1063-1137 on line 1 of hash://sha256/6d86c332b045e74fe4410f79655a1f47596808c057f30779b9584dba38fa25d5 ?

But then I noticed that the --no-line output had the same byte range, which makes sense considering that the match is on the first line.

However, when running:

$ preston ls | preston match -o | head | tail -n1
<cut:line:hash://sha256/6d86c332b045e74fe4410f79655a1f47596808c057f30779b9584dba38fa25d5!/L1!/b8910-8952> <http://www.w3.org/ns/prov#value> "http://dx.doi.org/10.3897/zookeys.740.14640" <urn:uuid:24d5a193-1bdb-415f-a826-96f382a8691a> .

the same byte range was produced using

$ preston ls | preston match -o --no-line | head | tail -n1
<cut:hash://sha256/6d86c332b045e74fe4410f79655a1f47596808c057f30779b9584dba38fa25d5!/b8910-8952> <http://www.w3.org/ns/prov#value> "http://dx.doi.org/10.3897/zookeys.740.14640" <urn:uuid:89589bd2-7322-4234-8574-fbd40be1c944> .

This seems a bit counterintuitive, because I was expecting the byte offset with the line selection to be counted from the start of the selected line.

Curious to hear your comments on the above!

mielliott · 2021-06-04T22:29:13Z

Aw rats, that’s some awful stuff! Thanks for trying it out though - I’ll have a look at it later tonight

jhpoelen · 2021-06-04T22:31:27Z

The feature is pretty awesome . . . my notes are just details I am curious to hear your thoughts on.

mielliott · 2021-06-05T01:00:05Z

preston cat 'line:hash://sha256/6d86c332b045e74fe4410f79655a1f47596808c057f30779b9584dba38fa25d5!/L1'

This should now work:

$ preston cat 'line:hash://sha256/6d86c332b045e74fe4410f79655a1f47596808c057f30779b9584dba38fa25d5!/L1'
[{"key":"82ceb6ba-f762-11e1-a439-00145eb45e9a","title":"Andes to Amazon Biodiversity Program","description":"The Andes to Amazon Program is an international, multidisciplinary team of scientists, students, and Peruvian locals working between the Botanical Research Institute of Texas (BRIT) and selected field and museum sites in Peru.","type":"OCCURRENCE"},{"key":"58414378-4fb2-47e0-8dd5-8b55d5c77117","title":"Bolivian Amazon lowland fish metacommunity data","description":"<p>This dataset represents data from the paper Yukoni, T. and Torres L. V. (2016) Fish metacommunity dynamics in the patchy heterogeneous habitats of varzea lakes, turbid river channels and transparent clear and black water bodies in the Amazonian Lowlands of Bolivia. Environmental Biology of Fishes.</p>\n<p>This study documents the spatial dynamic of fish metacommunity based on the date sets of 65 sites, covering two geographic patches of transparent water valleys; Manuripi and Itenez Rivers, separated by turbid water valleys originate in the Andes and the Savanna.</p>\n<p>See http://www.freshwaterbiodiversity.eu/metadb/bf_mdb_view.php?entryID=BFE_105 for additional metadata.</p>","type":"OCCURRENCE"},{"key":"5a607ce6-eaaf-4420-a302-54ddc767130c","title":"Earthworms (Oligochaeta: Glossoscolecidae) of the Amazon region of Colombia","description":"This dataset contains the digitized treatments in Plazi based on the original journal article M, Alexander Feijoo, Celis, Liliana V. (2011): Earthworms (Oligochaeta: Glossoscolecidae) of the Amazon region of Colombia. Zootaxa 3201: 27-44, DOI: 10.5281/zenodo.202378","type":"CHECKLIST"},{"key":"4716951d-11f5-4f31-bb1d-d40c95268ad3","title":"Notes on the Agrilus fauna of the Colombian Amazon (Coleoptera, Buprestidae)","description":"This dataset contains the digitized treatments in Plazi based on the original journal article Curletti, Gianfranco, Dutto, Angelo (2017): Notes on the Agrilus fauna of the Colombian Amazon (Coleoptera, Buprestidae). 
Zootaxa 4243 (2): 373-376, DOI: https://doi.org/10.11646/zootaxa.4243.2.7","type":"CHECKLIST"},{"key":"10b6b053-2ccf-4ca0-8922-486ad098fc56","title":"Stink bugs (Hemiptera: Pentatomidae) from Brazilian Amazon: checklist and new records","description":"This dataset contains the digitized treatments in Plazi based on the original journal article Silva, Valeria Juliete Da, Santos, Cleverson Rannieri Meira Dos, Fernandes, Jose Antonio Marin (2018): Stink bugs (Hemiptera: Pentatomidae) from Brazilian Amazon: checklist and new records. Zootaxa 4425 (3): 401-455, DOI: https://doi.org/10.11646/zootaxa.4425.3.1","type":"CHECKLIST"},{"key":"6b43aef0-e62d-478f-9a88-b692578f0a73","title":"New species and notes on Apenesia (Hymenoptera, Bethylidae) from the Brazilian Amazon","description":"This dataset contains the digitized treatments in Plazi based on the original journal article Lanes, G. O., Azevedo, C. O. (2004): New species and notes on Apenesia (Hymenoptera, Bethylidae) from the Brazilian Amazon. Zootaxa 679: 1-16, DOI: 10.5281/zenodo.158458","type":"CHECKLIST"},{"key":"9f2ce723-8926-4f4c-b9f6-a4b596c8c1a9","title":"Revision of the Brazilian Amazon Basin species of Porphyrochroa Melander (Diptera: Empididae)","description":"This dataset contains the digitized treatments in Plazi based on the original journal article Mendonça, Mirian Nascimento, Rafael, José Albertino, Ale-Rocha, Rosaly (2008): Revision of the Brazilian Amazon Basin species of Porphyrochroa Melander (Diptera: Empididae). Zootaxa 1859: 1-39, DOI: 10.5281/zenodo.183631","type":"CHECKLIST"},{"key":"f0096116-c79f-41cd-8376-35587cbe9fcd","title":"New species of earthworms (Oligochaeta: Glossoscolecidae) in the Amazon region of Colombia","description":"This dataset contains the digitized treatments in Plazi based on the original journal article M, Alexander Feijoo, Celis, Liliana V. (2012): New species of earthworms (Oligochaeta: Glossoscolecidae) in the Amazon region of Colombia. 
Zootaxa 3458: 103-119, DOI: 10.5281/zenodo.214602","type":"CHECKLIST"},{"key":"0d875ec4-2366-4592-803d-cf6c61de8df4","title":"A new species of Rhinella (Anura: Bufonidae) from Brazilian Amazon","description":"This dataset contains the digitized treatments in Plazi based on the original journal article Lima, Albertina P., Menin, Marcelo, Araújo, Maria Carmozina De (2007): A new species of Rhinella (Anura: Bufonidae) from Brazilian Amazon. Zootaxa 1663: 1-15, DOI: 10.5281/zenodo.179996","type":"CHECKLIST"},{"key":"92b19f16-a3d8-4924-89ff-1800c1d048d0","title":"Amazoonops, a new genus of goblin spiders (Araneae: Oonopidae) from the Brazilian Amazon","description":"This dataset contains the digitized treatments in Plazi based on the original journal article Ott, Ricardo, Ruiz, Gustavo R. S., Brescovit, Antonio D., Bonaldo, Alexandre B. (2017): Amazoonops, a new genus of goblin spiders (Araneae: Oonopidae) from the Brazilian Amazon. Zootaxa 4236 (2): 244-268, DOI: https://doi.org/10.11646/zootaxa.4236.2.2","type":"CHECKLIST"},{"key":"e24ad3dc-e2fe-44a9-983c-d7fb90147e9f","title":"Three new species of Leptohyphidae (Insecta: Ephemeroptera) from Central Amazon, Brazil","description":"This dataset contains the digitized treatments in Plazi based on the original journal article Belmont, Enide Luciana L., Salles, Frederico F., Hamada, Neusa (2011): Three new species of Leptohyphidae (Insecta: Ephemeroptera) from Central Amazon, Brazil. Zootaxa 3047: 43-53, DOI: 10.5281/zenodo.201430","type":"CHECKLIST"},{"key":"9d215804-dcb2-49ba-82d5-cc927abfb384","title":"Two New Species of Halictophagidae (Insecta: Strepsiptera) from the Brazilian Amazon Basin","description":"This dataset contains the digitized treatments in Plazi based on the original journal article Kogan, Marcos (2012): Two New Species of Halictophagidae (Insecta: Strepsiptera) from the Brazilian Amazon Basin. 
Zootaxa 3517: 79-87, DOI: 10.5281/zenodo.282617","type":"CHECKLIST"},{"key":"b4683510-1fed-4ad4-bef6-64606e847fe9","title":"A new Argyrodiaptomus (Copepoda: Calanoida: Diaptomidae) from the southwestern Brazilian Amazon","description":"This dataset contains the digitized treatments in Plazi based on the original journal article Previattelli, Daniel, Santos-Silva, Edinaldo Nelson Dos (2007): A new Argyrodiaptomus (Copepoda: Calanoida: Diaptomidae) from the southwestern Brazilian Amazon. Zootaxa 1518: 1-29, DOI: 10.5281/zenodo.177358","type":"CHECKLIST"},{"key":"d0e838e4-82b0-440b-b876-b62afe416245","title":"A new species of Elachistocleis (Anura; Microhylidae) from the Brazilian Amazon","description":"This dataset contains the digitized treatments in Plazi based on the original journal article Toledo, Luís Felipe (2010): A new species of Elachistocleis (Anura; Microhylidae) from the Brazilian Amazon. Zootaxa 2496: 63-68, DOI: 10.5281/zenodo.195714","type":"CHECKLIST"},{"key":"b3262114-8580-4828-973d-b743d1034d00","title":"A new species of Furciseta (Diptera, Ctenostylidae) from the Brazilian Amazon","description":"This dataset contains the digitized treatments in Plazi based on the original journal article Câmara, J. T., Rafael, J. A. (2013): A new species of Furciseta (Diptera, Ctenostylidae) from the Brazilian Amazon. Zootaxa 3669 (2): 147-152, DOI: http://dx.doi.org/10.11646/zootaxa.3669.2.5","type":"CHECKLIST"},{"key":"a1d56c0b-d41f-42c1-91d2-417ab340969f","title":"A new species of Besleria (Gesneriaceae) from the western Amazon rainforest","description":"This dataset contains the digitized treatments in Plazi based on the original journal article Gabriel Emiliano Ferreira, Andréa Onofre De Araújo, Michael John Gilbert Hopkins, Alain Chautems (2017): A new species of Besleria (Gesneriaceae) from the western Amazon rainforest. 
Brittonia 69 (2): 241-245, DOI: 10.1007/s12228-017-9464-6","type":"CHECKLIST"},{"key":"1edd9658-1b17-4794-bbf0-e2eccb017fd3","title":"Leporinus amazonicus, a new anostomid species from the Amazon lowlands, Brazil (Osteichthyes: Characiformes)","description":"This dataset contains the digitized treatments in Plazi based on the original journal article Santos, Geraldo Mendes Dos, Zuanon, Jansen (2008): Leporinus amazonicus, a new anostomid species from the Amazon lowlands, Brazil (Osteichthyes: Characiformes). Zootaxa 1815: 35-42, DOI: 10.5281/zenodo.182896","type":"CHECKLIST"},{"key":"bbe844d6-89b6-4c9b-9cd1-4b7a8c332dc5","title":"A new species of Paraonis and an annotated checklist of polychaetes from mangroves of the Brazilian Amazon Coast (Annelida, Paraonidae)","description":"This dataset contains the digitized treatments in Plazi based on the original journal article Ribeiro, Rannyele Passos, Alves, Paulo Ricardo, Almeida, Zafira da Silva de, Ruta, Christine (2018): A new species of Paraonis and an annotated checklist of polychaetes from mangroves of the Brazilian Amazon Coast (Annelida, Paraonidae). ZooKeys 740: 1-34, DOI: http://dx.doi.org/10.3897/zookeys.740.14640, URL: http://dx.doi.org/10.3897/zookeys.740.14640","type":"CHECKLIST"},{"key":"6e68c0bd-4d98-4a19-a066-e610c60b9478","title":"Amazoniaseius imparisetosus n. sp., n. g.: an unusual new phytoseiid mite (Acari: Phytoseiidae) from the Amazon forest","description":"This dataset contains the digitized treatments in Plazi based on the original journal article Demite, Peterson R., Cruz, Wilton P., Moraes, Gilberto J. (2017): Amazoniaseius imparisetosus n. sp., n. g.: an unusual new phytoseiid mite (Acari: Phytoseiidae) from the Amazon forest. 
Zootaxa 4236 (2): 302-310, DOI: https://doi.org/10.11646/zootaxa.4236.2.5","type":"CHECKLIST"},{"key":"663199f1-3528-4289-8069-d27552f62f10","title":"A survey of Dryinidae (Hymenoptera, Chrysidoidea) from Caxiuanã, Amazon Basin, with three new taxa and keys to genera and species","description":"This dataset contains the digitized treatments in Plazi based on the original journal article Coelho, Beatriz W., Aguiar, Alexandre P., Engel, Michael S. (2011): A survey of Dryinidae (Hymenoptera, Chrysidoidea) from Caxiuanã, Amazon Basin, with three new taxa and keys to genera and species. Zootaxa 2907: 1-21, DOI: 10.5281/zenodo.201416","type":"CHECKLIST"}]

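Notice that the file is actually just one big line, with no line breaks. So the good news is that the line numbers and byte ranges are actually working: because line 1 spans the whole file, offsets counted from the start of line 1 are the same as offsets counted from the start of the content. Looking at another result from matching on the amazon dataset: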

$ preston ls | preston match -o | head -n60 | tail -n1
<cut:line:hash://sha256/7d73d2374efed4a5144a0051b457d98279b29453bb81b5a5b87da2ccc12391bc!/L1!/b1666-1681> <http://www.w3.org/ns/prov#value> "http://plazi.org" <urn:uuid:37c45040-20b3-4166-9c1b-71dca9e03421> .

Then cat it back (fixed in 89ee74a):

$ preston cat 'cut:line:hash://sha256/7d73d2374efed4a5144a0051b457d98279b29453bb81b5a5b87da2ccc12391bc!/L1!/b1666-1681'
http://plazi.org

Voila!

Edit: oops, the new example I dug up was also using line 1. Hold on a sec

mielliott · 2021-06-05T01:06:08Z

Attempt number 2:

$ preston ls | preston match -o | head -n238 | tail -n1
<cut:line:zip:hash://sha256/97cbeae429fbc95d1859f7afa28b33f08ac64125ba72511c49c4b77ca66d2d66!/occurrence.txt!/L5!/b188-218> <http://www.w3.org/ns/prov#value> "http://www.canadensys.net/norms" <urn:uuid:6162de6e-c0e9-48a1-9bc9-cbf19d9195b7> .

$ preston cat 'cut:line:zip:hash://sha256/97cbeae429fbc95d1859f7afa28b33f08ac64125ba72511c49c4b77ca66d2d66!/occurrence.txt!/L5!/b188-218'
http://www.canadensys.net/norms
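
For readers new to the notation, here is a rough Python sketch of how an identifier like the one above can be read. It is inferred only from the examples in this thread and is not Preston's actual parser; the function name parse_identifier is made up for illustration. The assumption is that each prefix (cut, line, zip) pairs with one "!/" selector, with the innermost prefix (closest to the hash) applied first.

```python
from typing import List, Tuple


def parse_identifier(identifier: str) -> Tuple[str, List[Tuple[str, str]]]:
    """Split a nested identifier into its content hash and selection steps.

    Hypothetical helper: the pairing of prefixes and selectors is an
    assumption based on the examples in this thread.
    """
    head, *selectors = identifier.split("!/")
    prefixes = []
    # Peel prefixes such as "cut", "line", "zip" off the front until the
    # content hash is reached.
    while not head.startswith("hash://"):
        if ":" not in head:
            raise ValueError("no hash:// found in " + identifier)
        prefix, _, head = head.partition(":")
        prefixes.append(prefix)
    if len(prefixes) != len(selectors):
        raise ValueError("prefix and selector counts differ in " + identifier)
    # The prefix nearest the hash pairs with the first selector.
    steps = list(zip(reversed(prefixes), selectors))
    return head, steps


content_hash, steps = parse_identifier(
    "cut:line:zip:hash://sha256/"
    "97cbeae429fbc95d1859f7afa28b33f08ac64125ba72511c49c4b77ca66d2d66"
    "!/occurrence.txt!/L5!/b188-218"
)
for prefix, selector in steps:
    print(prefix, selector)
```

Read this way, the identifier says: open occurrence.txt inside the zip, take line 5, then cut bytes 188-218 of that line.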

jhpoelen · 2021-06-05T03:10:53Z

Wow! I was just able to do:

$ preston cat 'line:hash://sha256/6d86c332b045e74fe4410f79655a1f47596808c057f30779b9584dba38fa25d5!/L1'
[{"key":"82ceb6ba-f762-11e1-a439-00145eb45e9a","title":"Andes to Amazon Biodiversity Program","description":"The Andes to Amazon Program is an international, multidisciplinary team of scientists, students, and Peruvian locals working between the Botanical Research Institute of Texas (BRIT) and selected field and museum sites in Peru.","type":"OCCURRENCE"},{"key":"58414378-4fb2-47e0-8dd5-8b55d5c77117","title":"Bolivian Amazon lowland fish metacommunity data","description":"<p>This dataset represents data from the paper Yukoni, T. and Torres L. V. (2016) Fish metacommunity dynamics in the patchy heterogeneous habitats of varzea lakes, turbid river channels and transparent clear and black water bodies in the Amazonian Lowlands of Bolivia. Environmental Biology of Fishes.</p>\n<p>This study documents the spatial dynamic of fish metacommunity based on the date sets of 65 sites, covering two geographic patches of transparent water valleys; Manuripi and Itenez Rivers, separated by turbid water valleys originate in the Andes and the Savanna.</p>\n<p>See http://www.freshwaterbiodiversity.eu/metadb/bf_mdb_view.php?entryID=BFE_105 for additional metadata.</p>","type":"OCCURRENCE"},{"key":"5a607ce6-eaaf-4420-a302-54ddc767130c","title":"Earthworms (Oligochaeta: Glossoscolecidae) of the Amazon region of Colombia","description":"This dataset contains the digitized treatments in Plazi based on the original journal article M, Alexander Feijoo, Celis, Liliana V. (2011): Earthworms (Oligochaeta: Glossoscolecidae) of the Amazon region of Colombia. Zootaxa 3201: 27-44, DOI: 10.5281/zenodo.202378","type":"CHECKLIST"},{"key":"4716951d-11f5-4f31-bb1d-d40c95268ad3","title":"Notes on the Agrilus fauna of the Colombian Amazon (Coleoptera, Buprestidae)","description":"This dataset contains the digitized treatments in Plazi based on the original journal article Curletti, Gianfranco, Dutto, Angelo (2017): Notes on the Agrilus fauna of the Colombian Amazon (Coleoptera, Buprestidae). 
Zootaxa 4243 (2): 373-376, DOI: https://doi.org/10.11646/zootaxa.4243.2.7","type":"CHECKLIST"},{"key":"10b6b053-2ccf-4ca0-8922-486ad098fc56","title":"Stink bugs (Hemiptera: Pentatomidae) from Brazilian Amazon: checklist and new records","description":"This dataset contains the digitized treatments in Plazi based on the original journal article Silva, Valeria Juliete Da, Santos, Cleverson Rannieri Meira Dos, Fernandes, Jose Antonio Marin (2018): Stink bugs (Hemiptera: Pentatomidae) from Brazilian Amazon: checklist and new records. Zootaxa 4425 (3): 401-455, DOI: https://doi.org/10.11646/zootaxa.4425.3.1","type":"CHECKLIST"},{"key":"6b43aef0-e62d-478f-9a88-b692578f0a73","title":"New species and notes on Apenesia (Hymenoptera, Bethylidae) from the Brazilian Amazon","description":"This dataset contains the digitized treatments in Plazi based on the original journal article Lanes, G. O., Azevedo, C. O. (2004): New species and notes on Apenesia (Hymenoptera, Bethylidae) from the Brazilian Amazon. Zootaxa 679: 1-16, DOI: 10.5281/zenodo.158458","type":"CHECKLIST"},{"key":"9f2ce723-8926-4f4c-b9f6-a4b596c8c1a9","title":"Revision of the Brazilian Amazon Basin species of Porphyrochroa Melander (Diptera: Empididae)","description":"This dataset contains the digitized treatments in Plazi based on the original journal article Mendonça, Mirian Nascimento, Rafael, José Albertino, Ale-Rocha, Rosaly (2008): Revision of the Brazilian Amazon Basin species of Porphyrochroa Melander (Diptera: Empididae). Zootaxa 1859: 1-39, DOI: 10.5281/zenodo.183631","type":"CHECKLIST"},{"key":"f0096116-c79f-41cd-8376-35587cbe9fcd","title":"New species of earthworms (Oligochaeta: Glossoscolecidae) in the Amazon region of Colombia","description":"This dataset contains the digitized treatments in Plazi based on the original journal article M, Alexander Feijoo, Celis, Liliana V. (2012): New species of earthworms (Oligochaeta: Glossoscolecidae) in the Amazon region of Colombia. 
Zootaxa 3458: 103-119, DOI: 10.5281/zenodo.214602","type":"CHECKLIST"},{"key":"0d875ec4-2366-4592-803d-cf6c61de8df4","title":"A new species of Rhinella (Anura: Bufonidae) from Brazilian Amazon","description":"This dataset contains the digitized treatments in Plazi based on the original journal article Lima, Albertina P., Menin, Marcelo, Araújo, Maria Carmozina De (2007): A new species of Rhinella (Anura: Bufonidae) from Brazilian Amazon. Zootaxa 1663: 1-15, DOI: 10.5281/zenodo.179996","type":"CHECKLIST"},{"key":"92b19f16-a3d8-4924-89ff-1800c1d048d0","title":"Amazoonops, a new genus of goblin spiders (Araneae: Oonopidae) from the Brazilian Amazon","description":"This dataset contains the digitized treatments in Plazi based on the original journal article Ott, Ricardo, Ruiz, Gustavo R. S., Brescovit, Antonio D., Bonaldo, Alexandre B. (2017): Amazoonops, a new genus of goblin spiders (Araneae: Oonopidae) from the Brazilian Amazon. Zootaxa 4236 (2): 244-268, DOI: https://doi.org/10.11646/zootaxa.4236.2.2","type":"CHECKLIST"},{"key":"e24ad3dc-e2fe-44a9-983c-d7fb90147e9f","title":"Three new species of Leptohyphidae (Insecta: Ephemeroptera) from Central Amazon, Brazil","description":"This dataset contains the digitized treatments in Plazi based on the original journal article Belmont, Enide Luciana L., Salles, Frederico F., Hamada, Neusa (2011): Three new species of Leptohyphidae (Insecta: Ephemeroptera) from Central Amazon, Brazil. Zootaxa 3047: 43-53, DOI: 10.5281/zenodo.201430","type":"CHECKLIST"},{"key":"9d215804-dcb2-49ba-82d5-cc927abfb384","title":"Two New Species of Halictophagidae (Insecta: Strepsiptera) from the Brazilian Amazon Basin","description":"This dataset contains the digitized treatments in Plazi based on the original journal article Kogan, Marcos (2012): Two New Species of Halictophagidae (Insecta: Strepsiptera) from the Brazilian Amazon Basin. 
Zootaxa 3517: 79-87, DOI: 10.5281/zenodo.282617","type":"CHECKLIST"},{"key":"b4683510-1fed-4ad4-bef6-64606e847fe9","title":"A new Argyrodiaptomus (Copepoda: Calanoida: Diaptomidae) from the southwestern Brazilian Amazon","description":"This dataset contains the digitized treatments in Plazi based on the original journal article Previattelli, Daniel, Santos-Silva, Edinaldo Nelson Dos (2007): A new Argyrodiaptomus (Copepoda: Calanoida: Diaptomidae) from the southwestern Brazilian Amazon. Zootaxa 1518: 1-29, DOI: 10.5281/zenodo.177358","type":"CHECKLIST"},{"key":"d0e838e4-82b0-440b-b876-b62afe416245","title":"A new species of Elachistocleis (Anura; Microhylidae) from the Brazilian Amazon","description":"This dataset contains the digitized treatments in Plazi based on the original journal article Toledo, Luís Felipe (2010): A new species of Elachistocleis (Anura; Microhylidae) from the Brazilian Amazon. Zootaxa 2496: 63-68, DOI: 10.5281/zenodo.195714","type":"CHECKLIST"},{"key":"b3262114-8580-4828-973d-b743d1034d00","title":"A new species of Furciseta (Diptera, Ctenostylidae) from the Brazilian Amazon","description":"This dataset contains the digitized treatments in Plazi based on the original journal article Câmara, J. T., Rafael, J. A. (2013): A new species of Furciseta (Diptera, Ctenostylidae) from the Brazilian Amazon. Zootaxa 3669 (2): 147-152, DOI: http://dx.doi.org/10.11646/zootaxa.3669.2.5","type":"CHECKLIST"},{"key":"a1d56c0b-d41f-42c1-91d2-417ab340969f","title":"A new species of Besleria (Gesneriaceae) from the western Amazon rainforest","description":"This dataset contains the digitized treatments in Plazi based on the original journal article Gabriel Emiliano Ferreira, Andréa Onofre De Araújo, Michael John Gilbert Hopkins, Alain Chautems (2017): A new species of Besleria (Gesneriaceae) from the western Amazon rainforest. 
Brittonia 69 (2): 241-245, DOI: 10.1007/s12228-017-9464-6","type":"CHECKLIST"},{"key":"1edd9658-1b17-4794-bbf0-e2eccb017fd3","title":"Leporinus amazonicus, a new anostomid species from the Amazon lowlands, Brazil (Osteichthyes: Characiformes)","description":"This dataset contains the digitized treatments in Plazi based on the original journal article Santos, Geraldo Mendes Dos, Zuanon, Jansen (2008): Leporinus amazonicus, a new anostomid species from the Amazon lowlands, Brazil (Osteichthyes: Characiformes). Zootaxa 1815: 35-42, DOI: 10.5281/zenodo.182896","type":"CHECKLIST"},{"key":"bbe844d6-89b6-4c9b-9cd1-4b7a8c332dc5","title":"A new species of Paraonis and an annotated checklist of polychaetes from mangroves of the Brazilian Amazon Coast (Annelida, Paraonidae)","description":"This dataset contains the digitized treatments in Plazi based on the original journal article Ribeiro, Rannyele Passos, Alves, Paulo Ricardo, Almeida, Zafira da Silva de, Ruta, Christine (2018): A new species of Paraonis and an annotated checklist of polychaetes from mangroves of the Brazilian Amazon Coast (Annelida, Paraonidae). ZooKeys 740: 1-34, DOI: http://dx.doi.org/10.3897/zookeys.740.14640, URL: http://dx.doi.org/10.3897/zookeys.740.14640","type":"CHECKLIST"},{"key":"6e68c0bd-4d98-4a19-a066-e610c60b9478","title":"Amazoniaseius imparisetosus n. sp., n. g.: an unusual new phytoseiid mite (Acari: Phytoseiidae) from the Amazon forest","description":"This dataset contains the digitized treatments in Plazi based on the original journal article Demite, Peterson R., Cruz, Wilton P., Moraes, Gilberto J. (2017): Amazoniaseius imparisetosus n. sp., n. g.: an unusual new phytoseiid mite (Acari: Phytoseiidae) from the Amazon forest. 
Zootaxa 4236 (2): 302-310, DOI: https://doi.org/10.11646/zootaxa.4236.2.5","type":"CHECKLIST"},{"key":"663199f1-3528-4289-8069-d27552f62f10","title":"A survey of Dryinidae (Hymenoptera, Chrysidoidea) from Caxiuanã, Amazon Basin, with three new taxa and keys to genera and species","description":"This dataset contains the digitized treatments in Plazi based on the original journal article Coelho, Beatriz W., Aguiar, Alexandre P., Engel, Michael S. (2011): A survey of Dryinidae (Hymenoptera, Chrysidoidea) from Caxiuanã, Amazon Basin, with three new taxa and keys to genera and species. Zootaxa 2907: 1-21, DOI: 10.5281/zenodo.201416","type":"CHECKLIST"}]

Also, I was able to reproduce:

$ preston ls | preston match -o | head -n238 | tail -n1
<cut:line:zip:hash://sha256/97cbeae429fbc95d1859f7afa28b33f08ac64125ba72511c49c4b77ca66d2d66!/occurrence.txt!/L5!/b188-218> <http://www.w3.org/ns/prov#value> "http://www.canadensys.net/norms" <urn:uuid:8c68ff06-e7ed-44d1-8b13-dc747886007a> .

with inverse lookup:

$ preston cat 'cut:line:zip:hash://sha256/97cbeae429fbc95d1859f7afa28b33f08ac64125ba72511c49c4b77ca66d2d66!/occurrence.txt!/L5!/b188-218'
http://www.canadensys.net/norms

also, without lines:

$ preston ls | preston match -o --no-lines | head -n238 | tail -n1
<cut:zip:hash://sha256/97cbeae429fbc95d1859f7afa28b33f08ac64125ba72511c49c4b77ca66d2d66!/occurrence.txt!/b4094-4124> <http://www.w3.org/ns/prov#value> "http://www.canadensys.net/norms" <urn:uuid:bc71ed2e-a964-4e2d-bf37-8b2778e60429> .

where the byte range is counted from the start of the content (i.e., cut:[...]!/b4094-4124) instead of from the beginning of line 5 (i.e., cut:line:[...]!/L5!/b188-218), and both notations yield the same result (i.e., http://www.canadensys.net/norms).
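
The relationship between the two notations can be checked with a small Python sketch. It assumes byte offsets are 1-based and inclusive; the inclusive part matches the thread, since b188-218 covers the 31 bytes of the URL. It uses synthetic content rather than the real occurrence.txt, and the helper names cut, line and line_offset are made up for illustration and are not Preston's API. If the two ranges above address the same bytes, lines 1-4 of occurrence.txt occupy 4094 - 188 = 3906 bytes, newlines included.

```python
URL = b"http://www.canadensys.net/norms"


def cut(content: bytes, first: int, last: int) -> bytes:
    """Select bytes first..last (assumed 1-based, inclusive)."""
    return content[first - 1:last]


def line(content: bytes, number: int) -> bytes:
    """Return line `number` (1-based) without its trailing newline."""
    return content.split(b"\n")[number - 1]


def line_offset(content: bytes, number: int) -> int:
    """Count the bytes (newlines included) that precede line `number`."""
    preceding = content.split(b"\n")[:number - 1]
    return sum(len(text) + 1 for text in preceding)


# Synthetic stand-in: four lines totalling 3906 bytes, then a fifth line
# with the URL placed at line bytes 188-218.
content = b"a" * 3899 + b"\n" + b"b\n" * 3 + b"c" * 187 + URL + b"\ttail\n"

offset = line_offset(content, 5)
via_line = cut(line(content, 5), 188, 218)                # cut:line:...!/L5!/b188-218
via_content = cut(content, offset + 188, offset + 218)    # cut:...!/b4094-4124

print(offset, offset + 188, offset + 218)
print(via_line == via_content == URL)
```

In other words, a line-relative range becomes a content-relative range by adding the number of bytes that precede the line.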

Very cool way to express coordinates in a predictable biodiversity data universe!

Because, no matter where you are or what you do, the following always holds:

$ preston cat 'cut:zip:hash://sha256/97cbeae429fbc95d1859f7afa28b33f08ac64125ba72511c49c4b77ca66d2d66!/occurrence.txt!/b4094-4124'
http://www.canadensys.net/norms

fyi @cboettig @seltmann @mjy @dshorthouse

jhpoelen · 2021-06-05T03:17:26Z

Thanks for making this happen @mielliott !

jhpoelen · 2021-06-05T03:29:47Z

fyi @zedomel

jhpoelen · 2021-06-18T23:22:44Z

@mielliott Just installed preston 0.3.0 and found that

https://deeplinker.bio/cat/line:zip:hash://sha256/29d30b566f924355a383b13cd48c3aa239d42cba0a55f4ccfc2930289b88b43c!/occurrence.txt!/L1

works like a charm (see attached screenshot). Note that the hash identifies the (huge) eBird dataset.

I had the urge to use a line range, e.g., L1-2. Is that something you had in mind too?

mielliott · 2021-06-21T16:38:47Z

Sweet! Opening the URL is surprisingly speedy too!

> had the urge to use a line range e.g., L1-2 . Is that something you had in mind too?

Definitely; I didn't expect preston grep to report multi-line matches though, so catting line ranges didn't get implemented.
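
Purely to illustrate the L1-2 idea raised above, here is a Python sketch of how a line range selector could be read as "first to last line, inclusive". This is not how Preston implements it, and the thread does not settle the syntax; the function name select_lines and the choice to keep line terminators are assumptions.

```python
import re


def select_lines(content: bytes, selector: str) -> bytes:
    """Return the lines named by a selector such as "L5" or "L1-2".

    Hypothetical helper: line numbers are assumed to be 1-based and
    ranges inclusive, with line terminators kept in the result.
    """
    match = re.fullmatch(r"L(\d+)(?:-(\d+))?", selector)
    if match is None:
        raise ValueError("not a line selector: " + selector)
    first = int(match.group(1))
    last = int(match.group(2) or first)
    if first < 1 or last < first:
        raise ValueError("invalid line range: " + selector)
    lines = content.splitlines(keepends=True)
    return b"".join(lines[first - 1:last])


# Synthetic tab-separated content, not taken from any dataset in this thread.
sample = b"id\tname\n1\tPuma concolor\n2\tLynx rufus\n"
print(select_lines(sample, "L1-2"))
print(select_lines(sample, "L3"))
```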

mielliott self-assigned this Jun 1, 2021

mielliott added a commit that referenced this issue Jun 4, 2021

add text line stream handler #109

9d54012

mielliott added a commit that referenced this issue Jun 4, 2021

add test chaining a line extractor and a text matcher #109

29bbae7

mielliott added a commit that referenced this issue Jun 4, 2021

add option to report whole lines that contain text matches #109

922513f

mielliott added a commit that referenced this issue Jun 4, 2021

default to reporting entire line matches; add -o and --no-lines options to revert original behavior #109

3188822

mielliott added a commit that referenced this issue Jun 5, 2021

add line handler to preston get #109

89ee74a

jhpoelen closed this as completed Jun 5, 2021

mielliott mentioned this issue Jun 21, 2021

let preston cat dereference content line ranges #128

Closed
