Generate report comparing launch dates in GCIS vs dbpedia for same platforms #117
Comments
Query to get launch dates for platforms from GCIS. Next I will update the query to compare with launch dates from dbpedia.
|
Great, thanks, I'm adding this to the (newly created) gcis-sparql repo:
I'll send an email (outside this ticket) about this repo. Brian |
I have updated my query to select the dbpedia URI for the matching instance. I will then be able to use a federated query to retrieve the launch date for the platform from dbpedia.
I have run into an issue where I am unable to retrieve the value of see http://data.globalchange.gov/lexicon/dbpedia.thtml for example in REST API. This statement is generated by the representation.ttl.tut template. @bduggan is it possible RDF from this template is not being included in the triplestore load? |
I have updated my query to use the a
without a limit on the result set size the query times out. The launch dates in dbpedia seem to generally be missing the year component. Looking at example RDF on dbpedia the results: |
On Tuesday, August 25, Stephan Zednik wrote:
Yes, fixed. I'm re-running the import, should be updated in 30 minutes or so. Brian |
A nice refinement would be to only show entries for which the date differs. e.g. why is the GOES-2 launch listed as 1984 in one system and 1977 in another? |
@bduggan that might be a bit hard to do in the query, since we would potentially (but not always) have to combine and reformat the ?launchDate_dbpedia and ?cospar_dbpedia variables into a date. I think it would probably be easier to do that analysis in a spreadsheet, where you can use some simple parsing logic to attempt to process dbpedia's inconsistent dates. Also, I am attempting to update the gcis-sparql files for this report, but the federated query is frequently timing out. |
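For anyone who prefers a script to a spreadsheet, here is a minimal sketch of the "only show entries for which the date differs" idea, applied to the exported CSV rather than inside the query. The column names `platform`, `launchDate_gcis`, and `launchDate_dbpedia` are assumptions for illustration and may not match the real query output. It compares launch years only, and skips rows where either side has no recognisable four-digit year.

```python
import csv
import io
import re

# A four-digit year starting with 19 or 20, standing on its own.
YEAR = re.compile(r"\b(?:19|20)\d{2}\b")


def year_of(value):
    """Return the first plausible year in a date string, or None."""
    match = YEAR.search(value or "")
    return int(match.group(0)) if match else None


def differing_rows(csv_text):
    """Return (platform, gcis_year, dbpedia_year) for rows whose years differ.

    Column names are hypothetical; adjust them to the actual CSV header.
    """
    mismatches = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        gcis_year = year_of(row.get("launchDate_gcis"))
        dbpedia_year = year_of(row.get("launchDate_dbpedia"))
        # Skip rows where either year is unknown; they need manual review.
        if gcis_year is None or dbpedia_year is None:
            continue
        if gcis_year != dbpedia_year:
            mismatches.append((row["platform"], gcis_year, dbpedia_year))
    return mismatches
```

Rows where dbpedia only supplies a month and day are dropped by this filter, so they would still need a separate pass.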
@justgo129 @bduggan query added to gcis-sparql. Is this ticket ready to be closed? |
I just took a look. Is there a way to standardize the date formatting in the output? |
It would be far easier to apply some post-processing to the query results to fix the dates than to add that logic to the query. The dbpedia RDF uses inconsistent literal types for the launch date values, and updating the query to standardize the formatting would make the query much more complicated and probably make the timeout issue worse. Additionally, dbpedia frequently splits the year out of the launch date and encodes the month and day of the launch using xsd:gMonthDay (which I have never seen used in RDF before). To standardize the dates, the query would have to extract the appropriate date components from ?launchDate_dbpedia and ?cospar_dbpedia (with checks because of the data inconsistency) and build a new date serialization using string concatenation. This could perhaps be done in the query, but it would make it very ugly and probably slower. It would be much easier to do this as post-processing on the CSV using Perl or Python. |
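A rough Python sketch of the post-processing described above follows. It assumes the dbpedia launch date arrives either as a full `YYYY-MM-DD` value or as an xsd:gMonthDay lexical form (`--MM-DD`), and that the COSPAR value, when well formed, starts with the launch year (`YYYY-NNNA`). The function name and its two arguments are hypothetical; anything that does not parse is returned as `None` rather than guessed at.

```python
import re
from datetime import date

FULL_DATE = re.compile(r"^(\d{4})-(\d{2})-(\d{2})")   # e.g. 1977-06-16
GMONTHDAY = re.compile(r"^--(\d{2})-(\d{2})")         # xsd:gMonthDay, e.g. --06-16
COSPAR_YEAR = re.compile(r"^(\d{4})-\d{3}[A-Z]")      # e.g. 1977-048A


def _iso_or_none(year, month, day):
    """Build an ISO date string, or None if the components are not a real date."""
    try:
        return date(year, month, day).isoformat()
    except ValueError:
        return None


def normalize_launch_date(launch_value, cospar_value):
    """Combine ?launchDate_dbpedia and ?cospar_dbpedia into YYYY-MM-DD if possible."""
    launch_value = (launch_value or "").strip()
    cospar_value = (cospar_value or "").strip()

    # Case 1: the launch date already carries a year.
    full = FULL_DATE.match(launch_value)
    if full:
        return _iso_or_none(*map(int, full.groups()))

    # Case 2: month and day only; take the year from the COSPAR ID if it has one.
    month_day = GMONTHDAY.match(launch_value)
    cospar = COSPAR_YEAR.match(cospar_value)
    if month_day and cospar:
        month, day = map(int, month_day.groups())
        return _iso_or_none(int(cospar.group(1)), month, day)

    # Anything else (e.g. a COSPAR field holding a name) is left for manual review.
    return None
```

The original ?cospar_dbpedia value is only read, never modified, so the COSPAR ID stays intact in the CSV and the normalized date can be written to a new column.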
Works for me. Could we at least get rid of the "Cospar_dbpedia" entries beginning with "SWARM A", or would that also be a post-processing candidate? I'd think we could at least strip out the text within the SPARQL query. After that, feel free to repush and close. |
I am not sure we should strip out values from ?cospar_dbpedia. That property is not explicitly for the year of the launch but for the COSPAR ID. It seems that the year is often (but not always) part of the COSPAR ID. I would keep the COSPAR ID intact and leave the logic of parsing it and extracting relevant year information (if any) to post-processing. |
@justgo129 - yes, I think that post-processing is the best approach. It On Wed, Sep 2, 2015 at 9:38 PM, justgo129 notifications@github.com wrote:
Robert Wolfe, NASA GSFC @ USGCRP, o: 202-419-3470, m: 301-257-6966 |
Compare the launch dates of platforms in GCIS (i.e. from CEOS) to launch dates from dbpedia.