Chronostratigraphic periods #52

Closed
rybesh opened this issue Jun 19, 2018 · 4 comments

rybesh commented Jun 19, 2018

(via @atomrab)

Would you mind taking a look at the rdf and/or ttl files in this folder: https://utexas.box.com/s/wtzn309lqo1aosp84nylndn0zumft3ro and letting me know if we can ingest the 2014 version programmatically, so that I don't have to add all of these by hand? I feel like this shouldn't be too hard to line up with our model, at least for someone who can actually write scripts, and it would save a tremendous amount of time. The folder also includes half a dozen older versions of the chronostratigraphic chart, which could be really interesting to visualize (but for the moment, I'd settle for having the current version).

In case these aren't already obvious, here are some observations about the rdf and ttl files:

  1. The URIs, which do resolve properly, are in the form http://resource.geosciml.org/classifier/ics/ischart/Aeronian (though they resolve as eg http://vocabs.ands.org.au/repository/api/lda/csiro/international-chronostratigraphic-chart-2016/2016-12-v3/resource.html?uri=http://resource.geosciml.org/classifier/ics/ischart/Aeronian). These URIs, as far as I can tell, appear in the rdf representation but not in the ttl one (??).

  2. The date-range is expressed in rdfs:comment as "older bound-" (="start") and "younger bound-" (="stop"), with a +/- that can be incorporated into four-part dates. All these dates are in Ma (=megayear=one million Julian years=million years ago, usually with "present" as 1950; the date notation doesn't appear in the rdf/ttl, but it does in the pages that the URIs resolve to). So

    <rdfs:comment xml:lang="en">older bound-439 +/-1.8</rdfs:comment><rdfs:comment xml:lang="en">younger bound-436 +/-1.9</rdfs:comment>

    should be parsed as earliestStart:-440798050 (that is, 439 Ma plus 1.8 Ma before 1950) and latestStart:-437198050 (439 Ma minus 1.8 Ma before 1950).

  3. The alternate languages are expressed with two-character language codes, without script codes, but we could probably identify these manually for the non-Latin scripts (I know the Bulgarian is Cyrillic, but I can't identify the Chinese or Japanese character set off the top of my head).

  4. I think we can use "World" as spatial coverage, at least for a start -- I have a query in with Denné about this.

  5. There are sameAs relations with dbpedia entries here -- should we try to capture those, and if so, how? Although the concepts are the same, the dates are sometimes different (eg http://dbpedia.org/resource/Aptian has 113 +/-1 Ma as the end date, but the corresponding entry in the dataset has 112 +/-1 Ma).

@rybesh rybesh self-assigned this Jun 19, 2018

rybesh commented Jun 19, 2018

> 1. The URIs, which do resolve properly, are in the form http://resource.geosciml.org/classifier/ics/ischart/Aeronian (though they resolve as eg http://vocabs.ands.org.au/repository/api/lda/csiro/international-chronostratigraphic-chart-2016/2016-12-v3/resource.html?uri=http://resource.geosciml.org/classifier/ics/ischart/Aeronian). These URIs, as far as I can tell, appear in the rdf representation but not in the ttl one (??).

They do appear in the Turtle: the prefix "isc" is defined as http://resource.geosciml.org/classifier/ics/ischart/, so "isc:Aeronian" is the URI in your example.
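
To make the prefix expansion concrete, here is a minimal sketch in plain Python (no RDF library). Only the "isc" binding comes from the Turtle file as described above; the `PREFIXES` table and the `expand_curie` helper are hypothetical names used for illustration.

```python
# Hypothetical prefix table; only the "isc" binding is taken from the thread.
PREFIXES = {
    "isc": "http://resource.geosciml.org/classifier/ics/ischart/",
}


def expand_curie(curie, prefixes=PREFIXES):
    """Expand a prefixed name such as 'isc:Aeronian' into a full URI."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return prefixes[prefix] + local


print(expand_curie("isc:Aeronian"))
# prints http://resource.geosciml.org/classifier/ics/ischart/Aeronian
```

A real Turtle parser does this expansion automatically; the sketch only shows why the same URI looks different in the two serializations.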

> 2. The date-range is expressed in rdfs:comment as "older bound-" (="start") and "younger bound-" (="stop"), with a +/- that can be incorporated into four-part dates. All these dates are in Ma (=megayear=one million Julian years=million years ago, usually with "present" as 1950; the date notation doesn't appear in the rdf/ttl, but it does in the pages that the URIs resolve to). So <rdfs:comment xml:lang="en">older bound-439 +/-1.8</rdfs:comment><rdfs:comment xml:lang="en">younger bound-436 +/-1.9</rdfs:comment> should be parsed as earliestStart:-440798050 (that is, 439ma plus 1.8ma before 1950), latestStart:-437198050.

OK, that seems reasonable enough.
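
A rough sketch of that conversion, assuming every comment follows the pattern of the single example quoted above ("older bound-439 +/-1.8") and that "present" is 1950. The names `parse_bound`, `ma_to_year`, `earliestStop`, and `latestStop` are assumptions made by analogy with `earliestStart` and `latestStart`; the real files may contain variants this regex does not cover.

```python
import re
from decimal import Decimal

PRESENT = 1950  # assumed reference year for "present"

# Pattern inferred from the one example in the thread; the margin is optional.
BOUND_RE = re.compile(
    r"^(older|younger) bound-(\d+(?:\.\d+)?)(?: \+/-(\d+(?:\.\d+)?))?$"
)


def ma_to_year(ma):
    """Convert an age in Ma (millions of years before PRESENT) to a year."""
    return PRESENT - int(ma * 1_000_000)


def parse_bound(comment):
    """Turn one rdfs:comment bound into an earliest/latest pair of years."""
    match = BOUND_RE.match(comment.strip())
    if match is None:
        raise ValueError(f"unrecognized bound comment: {comment!r}")
    which = match.group(1)
    value = Decimal(match.group(2))
    margin = Decimal(match.group(3) or "0")
    # "older" is the start of the period, "younger" is the stop.
    side = "Start" if which == "older" else "Stop"
    return {
        f"earliest{side}": ma_to_year(value + margin),
        f"latest{side}": ma_to_year(value - margin),
    }


print(parse_bound("older bound-439 +/-1.8"))
# prints {'earliestStart': -440798050, 'latestStart': -437198050}
```

`Decimal` is used rather than `float` so that values like 440.8 Ma convert to whole years without rounding surprises.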

> 3. The alternate languages are expressed with two-character language codes, without script codes, but we could probably identify these manually for the non-Latin scripts (I know the Bulgarian is Cyrillic, but I can't identify the Chinese or Japanese character set off the top of my head).

I'm fairly certain all of these have default scripts, so we don't need
to (and shouldn't, according to the spec) put in script codes anyway.

> 5. There are sameAs relations with dbpedia entries here -- should we try to capture those, and if so, how? Although the concepts are the same, the dates are sometimes different (eg http://dbpedia.org/resource/Aptian has 113 +/-1 Ma as the end date, but the corresponding entry in the dataset has 112 +/-1 Ma).

I would argue that this is a misuse of sameAs (closeMatch would be better), so I'd prefer to leave those statements out of our dataset.
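
Leaving those statements out could look roughly like the sketch below, which treats triples as plain (subject, predicate, object) tuples instead of using an RDF library. The `drop_same_as` helper and the sample triples are illustrative assumptions: the isc:Aptian URI is built from the pattern shown earlier in the thread, and the comment value is made up to match the 112 +/-1 Ma figure mentioned above.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
RDFS_COMMENT = "http://www.w3.org/2000/01/rdf-schema#comment"


def drop_same_as(triples):
    """Return the triples whose predicate is not owl:sameAs."""
    return [t for t in triples if t[1] != OWL_SAME_AS]


# Illustrative sample data, not copied from the actual files.
APTIAN = "http://resource.geosciml.org/classifier/ics/ischart/Aptian"
sample = [
    (APTIAN, OWL_SAME_AS, "http://dbpedia.org/resource/Aptian"),
    (APTIAN, RDFS_COMMENT, "younger bound-112 +/-1"),
]

for triple in drop_same_as(sample):
    print(triple)
```

Only the rdfs:comment triple survives the filter, so the dbpedia link never reaches the ingested dataset.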


rybesh commented Jun 19, 2018

atomrab commented Jun 19, 2018

@rybesh yes, let's use 2017. Since they're storing all the old ones too, we should figure out something very clever to do with them eventually, but let's take the most current for the moment.

Leaving out the closeMatches is fine with me.

@rybesh rybesh added the data label Sep 26, 2019
@rybesh rybesh added new authority and removed data labels Dec 22, 2019

rybesh commented Jan 4, 2021

@rybesh rybesh closed this as completed Jan 4, 2021