Harvest geospatial metadata when ojsGeo plugin is used #3

Open
nuest opened this issue Oct 23, 2023 · 1 comment

nuest commented Oct 23, 2023

I'm working on an OJS Plugin that collects and exposes geospatial metadata for articles, see https://github.com/TIBHannover/ojsGeo/

I would like to document the idea (e.g., in an issue) of extending ojsr with a feature to harvest that metadata as well and, ideally, provide it as ready-to-use sf objects, so you can show OJS articles on a map with just a few lines of R code. The dependency on sf should be flexible, i.e., its availability is checked only when harvest_geo = TRUE.

Full disclosure: no production instance of OJS is using the plugin yet, but I'll add updates here once interesting data becomes available.


nuest commented Oct 24, 2023

For some more background, here are a few links from our demo/test server:

This is how the geometadata is shown for an issue of a journal: https://service.tib.eu/optimeta/index.php/optimeta/issue/view/5

For the harvesting, the article landing pages are more interesting:

This is an excerpt of the HTML source code of the last one, more specifically the header:

<!DOCTYPE html>
<html lang="de-DE" xml:lang="de-DE">
<head>
	<meta charset="utf-8">
	<meta name="viewport" content="width=device-width, initial-scale=1.0">
	<title>
		Using textual volunteered geographic information to model nature-based activities: A case study from Aotearoa New Zealand
							| OPTIMETA
			</title>

	
<meta name="generator" content="Open Journal Systems 3.3.0.14">
<meta name="gs_meta_revision" content="1.1"/>
<meta name="citation_journal_title" content="OPTIMETA"/>
<meta name="citation_journal_abbrev" content="OPTIMETA"/>
<meta name="citation_author" content="Optimeta Admin"/>
<meta name="citation_author" content="Ekaterina Egorova"/>
<meta name="citation_title" content="Using textual volunteered geographic information to model nature-based activities: A case study from Aotearoa New Zealand"/>
<meta name="citation_language" content="de"/>
<meta name="citation_date" content="2022/07/21"/>
<meta name="citation_volume" content="6"/>
<meta name="citation_issue" content="1"/>
<meta name="citation_doi" content="10.82270/optimeta.v6i1.8"/>
<meta name="citation_abstract_html_url" content="https://service.tib.eu/optimeta/index.php/optimeta/article/view/8"/>
<meta name="citation_keywords" xml:lang="de" content="nature-based recreation"/>
<meta name="citation_keywords" xml:lang="de" content="cultural ecosystem services"/>
<meta name="citation_keywords" xml:lang="de" content="volunteered geographic information"/>
<meta name="citation_keywords" xml:lang="de" content="social media"/>
<meta name="citation_keywords" xml:lang="de" content="digital geography"/>
<meta name="citation_reference" content="Björk BC, Shen C, Laakso M (2016) A longitudinal study of independent scholar-published open access journals. PeerJ 4 https://doi.org/10.7717/peerj.1990"/>
<meta name="citation_reference" content="Brown J (2019) Crossref grant IDs: a global, open database of funding information and identifiers. Autumn 2019 euroCRIS Strategic Membership Meeting. Strategic Membership Meeting 2019 – Autumn, Münster, Nov 18-20, 2019. euroCRIS, 33 pp. URL: http://hdl.handle.net/11366/1249"/>
<meta name="citation_reference" content="Conlon M, et al. (2019) VIVO: a system for research discovery. The Journal of Open Source Software 4 (39). https://doi.org/10.21105/joss.01182"/>
<meta name="citation_reference" content="Fenner M, Aryani A (2019) Introducing the PID Graph. (Version 1.0). https://doi.org/10.5438/JWVF-8A66"/>
<meta name="citation_reference" content="Gil Y, David CH, Demir I, et al. (2016) Toward the Geoscience Paper of the Future: Best practices for documenting and sharing research from data to software to provenance. Earth and Space Science 3 (1): 388-415. https://doi.org/10.1002/2015EA000136"/>
<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" />
<meta name="DC.Coverage" xml:lang="de" content="[{&quot;name&quot;:&quot;Earth&quot;,&quot;geonameId&quot;:6295630,&quot;bbox&quot;:&quot;not available&quot;,&quot;administrativeUnitSuborder&quot;:[&quot;Earth&quot;],&quot;provenance&quot;:{&quot;description&quot;:&quot;administrative unit created by user (acceppting the suggestion of the geonames API , which was created on basis of a geometric shape input)&quot;,&quot;id&quot;:23}}]"/>
<meta name="DC.Creator.PersonalName" content="Ekaterina Egorova"/>
<meta name="DC.Date.created" scheme="ISO8601" content="2022-07-21"/>
<meta name="DC.Date.dateSubmitted" scheme="ISO8601" content="2022-05-18"/>
<meta name="DC.Date.issued" scheme="ISO8601" content="2022-07-08"/>
<meta name="DC.Date.modified" scheme="ISO8601" content="2022-07-21"/>
<meta name="DC.Description" xml:lang="de" content="A boom in volunteered geographic information has led to extensive data-driven exploration and modeling of places. While many studies have used such data to explore human-environment interaction in urban settings, few have investigated natural, non-urban settings. To address this gap, this study systematically explores the content of online reviews of nature-based recreation activities, and develops a fine-grained hierarchical model that includes 28 aspects grouped into three main domains: activity, settings, and emotions/cognition. It further demonstrates how the model can be used to explore the variation in recreation experiences across activities, setting the stage for the analysis of the spatio-temporal variations in recreation experiences in the future. Importantly, the study provides an annotated corpus that can be used as a training dataset for developing methods to automatically capture aspects of recreation experiences in texts."/>
<meta name="DC.Identifier" content="8"/>
<meta name="DC.Identifier.DOI" content="10.82270/optimeta.v6i1.8"/>
<meta name="DC.Identifier.URI" content="https://service.tib.eu/optimeta/index.php/optimeta/article/view/8"/>
<meta name="DC.Language" scheme="ISO639-1" content="de"/>
<meta name="DC.Rights" content="Copyright (c) 2022 OPTIMETA"/>
<meta name="DC.Rights" content=""/>
<meta name="DC.Source" content="OPTIMETA"/>
<meta name="DC.Source.Issue" content="1"/>
<meta name="DC.Source.Volume" content="6"/>
<meta name="DC.Source.URI" content="https://service.tib.eu/optimeta/index.php/optimeta"/>
<meta name="DC.Subject" xml:lang="de" content="nature-based recreation"/>
<meta name="DC.Subject" xml:lang="de" content="cultural ecosystem services"/>
<meta name="DC.Subject" xml:lang="de" content="volunteered geographic information"/>
<meta name="DC.Subject" xml:lang="de" content="social media"/>
<meta name="DC.Subject" xml:lang="de" content="digital geography"/>
<meta name="DC.Title" content="Using textual volunteered geographic information to model nature-based activities: A case study from Aotearoa New Zealand"/>
<meta name="DC.Type" content="Text.Serial.Journal"/>
<meta name="DC.Type.articleType" content="Artikel"/>
<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" />
<meta name="DC.SpatialCoverage" scheme="GeoJSON" content="{&quot;type&quot;:&quot;FeatureCollection&quot;,&quot;features&quot;:[{&quot;type&quot;:&quot;Feature&quot;,&quot;properties&quot;:{&quot;provenance&quot;:{&quot;description&quot;:&quot;geometric shape created by user (drawing)&quot;,&quot;id&quot;:11}},&quot;geometry&quot;:{&quot;type&quot;:&quot;Polygon&quot;,&quot;coordinates&quot;:[[[172.32421919703486,-33.48032820798308],[171.44531294703486,-34.21028940511315],[173.73046919703486,-37.41671041179496],[173.29101607203484,-38.593977484432955],[172.85156294703484,-39.68464569666023],[170.39062544703484,-42.076481829833355],[165.64453169703486,-45.43957916133509],[165.64453169703486,-46.84015585733286],[166.08398482203484,-47.970311874725496],[166.96289107203486,-48.72962563238336],[168.28125044703484,-47.970311874725496],[170.47851607203484,-46.71978059098357],[172.23632857203486,-45.13038945024757],[172.67578169703486,-44.12965788602876],[174.34570357203484,-43.6864130605097],[175.31250044703484,-42.271890692692985],[176.80664107203484,-41.618181466359324],[178.12500044703484,-40.75835499236763],[178.91601607203486,-39.0732236224912],[179.26757857203484,-37.34687186193451],[178.82812544703486,-36.22064301814946],[177.50976607203486,-36.78582645869149],[175.57617232203484,-35.57989644550358],[174.08203169703484,-34.06479660536844],[171.35742232203486,-34.06479660536844],[172.32421919703486,-33.48032820798308]]]}}],&quot;administrativeUnits&quot;:[{&quot;name&quot;:&quot;Earth&quot;,&quot;geonameId&quot;:6295630,&quot;bbox&quot;:&quot;not available&quot;,&quot;administrativeUnitSuborder&quot;:[&quot;Earth&quot;],&quot;provenance&quot;:{&quot;description&quot;:&quot;administrative unit created by user (acceppting the suggestion of the geonames API , which was created on basis of a geometric shape input)&quot;,&quot;id&quot;:23}}],&quot;temporalProperties&quot;:{&quot;unixDateRange&quot;:&quot;[1167609600000,1599523199000]&quot;,&quot;provenance&quot;:{&quot;description&quot;:&quot;temporal properties created by user&quot;,&quot;id&quot;:31}}}" />
	<link rel="stylesheet" href="https://service.tib.eu/optimeta/index.php/optimeta/$$$call$$$/page/page/css?name=stylesheet" type="text/css" /><link rel="stylesheet" href="https://service.tib.eu/optimeta/index.php/optimeta/$$$call$$$/page/page/css?name=font" type="text/css" /><link rel="stylesheet" href="https://service.tib.eu/optimeta/lib/pkp/styles/fontawesome/fontawesome.css?v=3.3.0.14" type="text/css" /><link rel="stylesheet" href="https://service.tib.eu/optimeta/plugins/generic/optimetaGeo/js/lib/leaflet/dist/leaflet.css?v=3.3.0.14" type="text/css" /><link rel="stylesheet" href="https://service.tib.eu/optimeta/plugins/generic/optimetaGeo/js/lib/leaflet-draw/dist/leaflet.draw.css?v=3.3.0.14" type="text/css" /><link rel="stylesheet" href="https://service.tib.eu/optimeta/plugins/generic/optimetaGeo/js/lib/daterangepicker/daterangepicker.css?v=3.3.0.14" type="text/css" /><link rel="stylesheet" href="https://service.tib.eu/optimeta/plugins/generic/optimetaGeo/js/lib/leaflet-control-geocoder/dist/Control.Geocoder.css?v=3.3.0.14" type="text/css" />
</head>
<body class="pkp_page_article pkp_op_view has_site_logo" dir="ltr">

	<div class="pkp_structure_page">

[...]

The geospatial and temporal information is included in several formats. The most detailed one is encoded as GeoJSON in the DC.SpatialCoverage field. The content looks garbled because it is HTML-entity-encoded (quotes appear as &quot;). If I pass it into https://codebeautify.org/html-decode-string and then copy and paste the output into http://geojson.io/, I can explore the data a little.
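
The manual decode-and-paste step can also be scripted. Below is a minimal sketch in Python, using only the standard library, of how a harvester might pull the DC.SpatialCoverage field out of an article landing page and turn it into a parsed GeoJSON structure. The names `SpatialCoverageParser` and `harvest_spatial_coverage` are hypothetical and not part of ojsr or ojsGeo, and the sample markup is a shortened stand-in for the real header above.

```python
import json
from html.parser import HTMLParser


class SpatialCoverageParser(HTMLParser):
    """Collects the content of <meta name="DC.SpatialCoverage"> tags."""

    def __init__(self):
        super().__init__()
        self.contents = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser has already replaced entities such as &quot;
        # in attribute values, so no separate decoding step is needed.
        attributes = dict(attrs)
        if tag == "meta" and attributes.get("name") == "DC.SpatialCoverage":
            self.contents.append(attributes.get("content") or "")


def harvest_spatial_coverage(page_html):
    """Return one parsed GeoJSON object per DC.SpatialCoverage tag (hypothetical helper)."""
    parser = SpatialCoverageParser()
    parser.feed(page_html)
    parser.close()
    return [json.loads(content) for content in parser.contents]


# Shortened sample header; a real page would be fetched over HTTP.
sample_head = (
    '<head><meta name="DC.SpatialCoverage" scheme="GeoJSON" content="'
    '{&quot;type&quot;:&quot;FeatureCollection&quot;,&quot;features&quot;:'
    '[{&quot;type&quot;:&quot;Feature&quot;,&quot;properties&quot;:{},'
    '&quot;geometry&quot;:{&quot;type&quot;:&quot;Point&quot;,'
    '&quot;coordinates&quot;:[172.3,-33.5]}}]}" /></head>'
)

collections = harvest_spatial_coverage(sample_head)
print(collections[0]["features"][0]["geometry"]["type"])
```

Pages without the plugin simply yield an empty list, so a harvester could call this on every article and skip those without geometadata.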

With an (optional) dependency on geojsonsf, the decoded string can be converted to a meaningful R object (see https://cran.r-project.org/web/packages/geojsonsf/vignettes/geojson-sf-conversions.html) and then plotted for inspection/analysis (https://cran.r-project.org/web/packages/sf/vignettes/sf5.html).
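
Independent of the R packages mentioned, a quick sanity check of a harvested record can be sketched with the Python standard library alone. The example below computes a bounding box for a Polygon geometry and converts the `temporalProperties.unixDateRange` value to dates. It rests on two assumptions: that the timestamps are milliseconds since the Unix epoch (their magnitude suggests so, but the source does not say), and that the geometry is a Polygon. The helper names `bounding_box` and `temporal_range` are hypothetical, and the sample record keeps only a few vertices from the excerpt above.

```python
import json
from datetime import datetime, timezone


def bounding_box(geometry):
    """Return (min_lon, min_lat, max_lon, max_lat) for a GeoJSON Polygon."""
    points = [point for ring in geometry["coordinates"] for point in ring]
    lons = [point[0] for point in points]
    lats = [point[1] for point in points]
    return (min(lons), min(lats), max(lons), max(lats))


def temporal_range(collection):
    """Return (start, end) dates, assuming unixDateRange holds milliseconds."""
    # The range is itself a JSON array serialized as a string.
    raw_range = collection["temporalProperties"]["unixDateRange"]
    start_ms, end_ms = json.loads(raw_range)

    def to_date(milliseconds):
        return datetime.fromtimestamp(milliseconds / 1000, tz=timezone.utc).date()

    return to_date(start_ms), to_date(end_ms)


# Abbreviated record: a few vertices from the excerpt, ring closed on the first point.
record = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "properties": {},
        "geometry": {
            "type": "Polygon",
            "coordinates": [[
                [172.32421919703486, -33.48032820798308],
                [165.64453169703486, -45.43957916133509],
                [179.26757857203484, -37.34687186193451],
                [172.32421919703486, -33.48032820798308],
            ]],
        },
    }],
    "temporalProperties": {"unixDateRange": "[1167609600000,1599523199000]"},
}

bbox = bounding_box(record["features"][0]["geometry"])
start, end = temporal_range(record)
print(bbox)
print(start.isoformat(), end.isoformat())
```

Under the millisecond assumption, the range in the excerpt runs from 2007-01-01 to 2020-09-07 (UTC).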
