From e1f58600bf195b76124a8b33a044120bb9226d64 Mon Sep 17 00:00:00 2001
From: Kitty Currier
Date: Thu, 27 Jun 2024 11:36:51 +0800
Subject: [PATCH 1/6] Fix typos in ontology.md

---
 ontology.md | 40 ++++++++++++++++++++--------------------
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/ontology.md b/ontology.md
index 8bfee4f..4197838 100644
--- a/ontology.md
+++ b/ontology.md
@@ -18,14 +18,14 @@ KnowWhereGraph uses [OWL time](https://www.w3.org/TR/owl-time/) for temporal inf
 #### Time Instants
-The most basic node that represents time are nodes of type `time:Instant`. These nodes typically have four properties:
+The most basic nodes that represent time are nodes of type `time:Instant`. These nodes typically have four properties:
 1. A label with the full datetime
 2. An XSD:Date representation
 3. An XSD:DateTime representation
 4. The year
-An example is shown below where the full datetime and year are retrieved from a node of type, `time:Instant`.
+An example is shown below where the full datetime and year are retrieved from a node of type `time:Instant`.
 ```SPARQL
 PREFIX rdfs:
@@ -41,24 +41,24 @@ select ?time_label ?datetime ?year where {
 #### Time Intervals
-Time intervals are nodes that represent a period of time. It has two properties of interest that point to nodes time instants (see above):
+Time intervals are nodes that represent a period of time. They have two properties of interest that point to time instants (see above):
 1. `time:hasBeginning` (when the period of time starts)
 2. `time:hasEnd` (when the period of time ends)
-The convenient bit about nodes of this type are that they reference `time:Instant` nodes through the two relations above.
+The convenient bit about nodes of this type is that they reference `time:Instant` nodes through the two relations above.
-An example of pulling out the start and end datetimes of an interval is given below. Note how the pattern of querying `time:Instant` is used.
*All we've done is start at a `tine:Interval` and grabbed the values from the attached `time:Instant`*.
+An example of pulling out the start and end datetimes of an interval is given below. Note how the pattern of querying `time:Instant` is used. *All we've done is start at a `time:Interval` and grabbed the values from the attached `time:Instant`*.
 ```SPARQL
 PREFIX rdfs:
 PREFIX rdf:
 PREFIX time:
 select ?time_label?datetime_begin ?datetime_end where {
-    ?time_onterval rdf:type time:Interval .
-    ?time_onterval rdfs:label ?time_label .
-    ?time_onterval time:hasBeginning ?time_begin .
-    ?time_onterval time:hasEnd ?time_end .
+    ?time_interval rdf:type time:Interval .
+    ?time_interval rdfs:label ?time_label .
+    ?time_interval time:hasBeginning ?time_begin .
+    ?time_interval time:hasEnd ?time_end .
     ?time_begin time:inXSDDateTime ?datetime_begin .
     ?time_end time:inXSDDateTime ?datetime_end .
 } limit 1
@@ -66,11 +66,11 @@ select ?time_label?datetime_begin ?datetime_end where {
 #### Connecting Things to Time
-When connecting events and data to time KnowWhereGraph uses its own term, `kwg-ont:hasTemporalScope`. You will *always* use this relation to obtain information temporal information about an event.
+When connecting events and data to time, KnowWhereGraph uses its own term, `kwg-ont:hasTemporalScope`. You will *always* use this relation to obtain temporal information about an event.
-One important note to take is that data of the same class can have either `time:Instant` *or* `time:Interval` data. This means that if you're querying for data of type `kwg-ont:Hazard` and asking for temporal information - you need to look for both `time:Instant` *and* `time:Interval` connections.
+One important note to take is that data of the same class can have either `time:Instant` *or* `time:Interval` data.
This means that if you're querying for data of type `kwg-ont:Hazard` and asking for temporal information --- you need to look for both `time:Instant` *and* `time:Interval` connections.
-An example is shown below where we count the number of `kwg-ont:Hazard`s that are connected to `time:Instant`. At the time of writing this, there are 1,248,050 hazards with time instant data.
+An example is shown below where we count the number of `kwg-ont:Hazard`s that are connected to `time:Instant`. At the time of this writing, there are 1,248,050 hazards with time instant data.
 ```SPARQL
 PREFIX rdfs:
@@ -121,7 +121,7 @@ select ?datetime ?datetime_begin ?datetime_end where {
         ?hazard_time rdf:type time:Instant .
         ?hazard_time time:inXSDDateTime ?datetime .
     }
-} LIMIT 10
+} LIMIT 16
 ```
 The query results are shown below. The important thing to note here is that we have datetimes for instants and the full range for time intervals.
@@ -165,16 +165,16 @@ Data is encoded using the [SOSA](https://www.w3.org/TR/vocab-ssn/) ontology. Loo
 1. `sosa:Observation`: Contains numeric or categorical data about an observation which is stored as literal values, accessible through `sosa:hasSimpleResult`.
 2. `sosa:ObservationCollection`: Nodes of this type point to one or more `sosa:Observation` nodes through the relation `sosa:hasMember`.
-Rather than using the classnames above, KnowWhereGraph subclasses them into more specific instances. For example, thunderstorm events are connected to `kwg-ont:ImpactObservationCollection`, which is a subclass of `sosa:ObservationCollection`.
**Nodes can be connected to multiple observation collections, of different types**
+Rather than using the class names above, KnowWhereGraph subclasses them into more specific instances. For example, thunderstorm events are connected to `kwg-ont:ImpactObservationCollection`, which is a subclass of `sosa:ObservationCollection`. **Nodes can be connected to multiple observation collections, of different types.**
-This won't make a huge impact on the way you query data, but allows you to be more specific in the kinds of data that you want.
+This won't make a huge impact on the way you query data, but it allows you to be more specific in the kinds of data that you want.
 When reading and using the SOSA ontology, it's important to buy into the concept of properties and observations:
 1. Properties: Things being observed. This could be the air temperature, number of deaths, obesity rates, number of beds that a hospital has, etc.
 2. Observation: The act that resulted in a property being observed. It provides the context *around* the property, such as the value of the property or the name of the observation. This would be the literal *number* of hospital beds.
-For example,the following query retrieves *all* the observation collections for a particular hazard.
+For example, the following query retrieves *all* the observation collections for a particular hazard instance.
 ```SPARQL
 PREFIX sosa:
@@ -213,11 +213,11 @@ KnowWhereGraph provides an array of different kinds of data. To make it easy to
 3. Injuries caused directly by the incident
 4. Injuries caused indirectly by the incident
-Each of these nodes is a `sosa:Observation` and can be queried the same way. For a complete list of types of data, refer to the ontology. In addition to the observation having a type - it *also* has an important field `sosa:observedProperty`. This is also used to filter queries to the desired type. Again, the ontology lists the various properties that have been observed.
+Each of these nodes is a `sosa:Observation` and can be queried the same way. For a complete list of types of data, refer to the ontology. In addition to the observation having a type --- it *also* has an important field `sosa:observedProperty`. This is also used to filter queries to the desired type.
Again, the ontology lists the various properties that have been observed.
-drawing
+drawing
-A complete example of retrieving the number of direct deaths for several hazards is shown below. Note the connections between the `kwg-ont:Hazard`, the `sosa:ObservationCollection` (technically KnowWhereGraph's subclass of it), and the portion of the query obtaining the literal values.
+A complete example of retrieving the number of direct deaths for several hazards is shown below. Note the connections between the `kwg-ont:Hazard`, the `sosa:ObservationCollection` (technically, KnowWhereGraph's subclass of it), and the portion of the query obtaining the literal values.
 ```SPARQL
 PREFIX sosa:
@@ -236,7 +236,7 @@ select ?hazard_name ?direct_deaths where {
 } LIMIT 10
 ```
-The results are shown below (luckily these events had 0 associated deaths)
+The results are shown below (luckily these events had 0 associated deaths).
 drawing
From f8b6e96c3365ec40ea46ab7644912ef522aa5ecf Mon Sep 17 00:00:00 2001
From: Kitty Currier
Date: Fri, 28 Jun 2024 15:24:03 +0800
Subject: [PATCH 2/6] Make capitalization consistent in README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 7b481d6..7b21857 100644
--- a/README.md
+++ b/README.md
@@ -10,6 +10,6 @@ Before you can use KnowWhereGraph's data, you must first understand part of the
 Obtaining data can be a difficult process - if you're using SPAQRL (which you will be), it's even more. Learn a few tips and tricks to getting the data you need from the database [Here](./sparql-download.md).
-## Running your own
+## Running Your Own
 It's possible to run your own instance of KnowWhereGraph! Do you *need* to? Do you want to? Read on [Here](./self-hosted.md) to make that call.
From 6625e8e22a8a110342ce17da041eeea3284ec305 Mon Sep 17 00:00:00 2001
From: Kitty Currier
Date: Fri, 28 Jun 2024 15:36:31 +0800
Subject: [PATCH 3/6] Minor edits/fix typos in sparql-download.md

---
 sparql-download.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sparql-download.md b/sparql-download.md
index f611c6c..d90d61c 100644
--- a/sparql-download.md
+++ b/sparql-download.md
@@ -48,7 +48,7 @@ print(query_result)
 ## Going Further
-Chances are, you'll want to do than just select one node from the graph. To successfully obtain the data that you need you'll need to first understand the structure of the ontology, which is admittingly not a small task. Rather than becoming familiar with the *entire* ontology - it's recommended that you understand the local area that you're interested in. For example, viewing the ontology documentation for nodes of type `Hazard` and their connections to numerical data.
+Chances are, you'll want to do more than just select one node from the graph. To successfully obtain the data that you need you'll need to first understand the structure of the ontology, which is admittedly not a small task. Rather than becoming familiar with the *entire* ontology — it's recommended that you understand the local area that you're interested in. For example, view the ontology documentation for nodes of type `Hazard` and their connections to numerical data.
 For a refresher on key points of KWG's ontology, jump [Here](/./ontology.md). Reading the ontology, writing a SPARQL query, and referring back to the ontology will undoubtedly be a staple of your data acquisition workflow.
From bdeb31da0c607cf1346b6a141b1cf03ce7b5f858 Mon Sep 17 00:00:00 2001
From: Kitty Currier
Date: Fri, 28 Jun 2024 15:43:13 +0800
Subject: [PATCH 4/6] Minor edits in ontology.md

---
 ontology.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/ontology.md b/ontology.md
index 4197838..aa2d83b 100644
--- a/ontology.md
+++ b/ontology.md
@@ -68,7 +68,7 @@ select ?time_label?datetime_begin ?datetime_end where {
 When connecting events and data to time, KnowWhereGraph uses its own term, `kwg-ont:hasTemporalScope`. You will *always* use this relation to obtain temporal information about an event.
-One important note to take is that data of the same class can have either `time:Instant` *or* `time:Interval` data. This means that if you're querying for data of type `kwg-ont:Hazard` and asking for temporal information --- you need to look for both `time:Instant` *and* `time:Interval` connections.
+One important note to take is that data of the same class can have either `time:Instant` *or* `time:Interval` data. This means that if you're querying for data of type `kwg-ont:Hazard` and asking for temporal information — you need to look for both `time:Instant` *and* `time:Interval` connections.
 An example is shown below where we count the number of `kwg-ont:Hazard`s that are connected to `time:Instant`. At the time of this writing, there are 1,248,050 hazards with time instant data.
@@ -213,7 +213,7 @@ KnowWhereGraph provides an array of different kinds of data. To make it easy to
 3. Injuries caused directly by the incident
 4. Injuries caused indirectly by the incident
-Each of these nodes is a `sosa:Observation` and can be queried the same way. For a complete list of types of data, refer to the ontology. In addition to the observation having a type --- it *also* has an important field `sosa:observedProperty`. This is also used to filter queries to the desired type. Again, the ontology lists the various properties that have been observed.
+Each of these nodes is a `sosa:Observation` and can be queried the same way. For a complete list of types of data, refer to the ontology. In addition to the observation having a type — it *also* has an important field `sosa:observedProperty`. This is also used to filter queries to the desired type. Again, the ontology lists the various properties that have been observed.
 drawing
From 1d3036701a1d591dc72d264c908e70e98c72c041 Mon Sep 17 00:00:00 2001
From: Kitty Currier
Date: Fri, 28 Jun 2024 17:42:58 +0800
Subject: [PATCH 5/6] Minor typo/other edits to data-download-walkthrough.md

---
 data-download-walkthrough.md | 62 ++++++++++++++++++------------------
 1 file changed, 31 insertions(+), 31 deletions(-)

diff --git a/data-download-walkthrough.md b/data-download-walkthrough.md
index 459f36d..0d3b2eb 100644
--- a/data-download-walkthrough.md
+++ b/data-download-walkthrough.md
@@ -12,41 +12,41 @@ I want to download the following information for all wildfire events in Santa Ba
 ### Step 1: Your First Query
-When navigating the graph for the data that you want, you need your first query to land you *somewhere* close to what you're after. Because we've already browsed through the [ontology](./ontology.md) (you've done this, right?) at this point and confirmed we have this data for wildfires... We know that there's a type called `kwg-ont:Wildfire`, so let's assume the nodes we're after are going to be that type. Let's take a look at a single wildfire and see how the data connects to it.
+When navigating the graph for the data that you want, you need your first query to land you *somewhere* close to what you're after. Because we've already browsed through the [ontology](./ontology.md) (you've done this, right?) at this point and confirmed we have this data for wildfires... We know that there's a type called `kwg-ont:Wildfire`, so let's assume the nodes we're after are going to be of that type. Let's take a look at a single wildfire and see how the data connects to it.
 In the image below, we did a simple query for a *single* wildfire.
-drawing
+drawing
 ### Step 2: Exploring the Node
 Click on the link for [kwgr:hazard.1180930.5434012](https://stko-kwg.geog.ucsb.edu/graphdb/resource?uri=http:%2F%2Fstko-kwg.geog.ucsb.edu%2Flod%2Fresource%2Fhazard.1180930.5434012&role=subject) in the query results table to bring up the following page.
-drawing
+drawing
 On this page, we see all the predicates that this wildfire is connected to. **This is incredibly powerful when figuring out what is connected to what, and how to write your SPARQL**. The predicates that should jump out are
-1. kwg-ont:hasTemporalScope: This is the relation that connects the fire to temporal information
-2. kwg-ont:sfWithin: This is the relation that connects the fire to spatial information
-3. sosa:isFeatureofInterestOf: This is the relation that connects to fire to numeric and categorical data
+1. `kwg-ont:hasTemporalScope`: This is the relation that connects the fire to temporal information.
+2. `kwg-ont:sfWithin`: This is the relation that connects the fire to spatial information.
+3. `sosa:isFeatureOfInterestOf`: This is the relation that connects the fire to numeric and categorical data.
 These predicates show up time and time again in KnowWhereGraph and are important to recognize.
 ### Step 3: Getting Numeric & Categorical Data
-When thinking about data values, the SOSA ontology should be at the forefront of your mind. If you haven't already, check out the [ontology](./ontology.md) page for the gist on this ontology.
+When thinking about data values, the SOSA ontology should be at the forefront of your mind. If you haven't already, check out the [ontology](./ontology.md) page for the gist of this ontology.
 From our main Wildfire node view, navigate to the node that's in the range of sosa:isFeatureOfInterestOf ([kwgr:impact.1180930.5434012](https://stko-kwg.geog.ucsb.edu/graphdb/resource?uri=http:%2F%2Fstko-kwg.geog.ucsb.edu%2Flod%2Fresource%2Fimpact.1180930.5434012&role=subject)). On this page (shown below), we see that familiar structure of an observation collection.
-drawing
+drawing
 Looking at the different types of observations in this collection, the one named `kwgr:deathDirectObs.1180930.5434012` should stick out. Sort of sounds like this node might be related to deaths from the wildfire, right? Let's click on that observation to see what data lies inside.
-drawing
+drawing
 Bingo. We see that there's a numeric value that represents the number of people that have died from this wildfire, completing our path from a node of type `kwg-ont:Wildfire` to the actual data value.
@@ -68,7 +68,7 @@ SELECT ?fire ?direct_deaths WHERE {
 ### Step 4: Handling Spatial Data
-In this step, we have to find the node that represents Santa Barbara County. Using SPARQL. From the [ontology](./ontology.md), we know that counties are going to be kwg-ont:AdministrativeRegion_ .
+In this step, we have to find the node that represents Santa Barbara County. Using SPARQL. From the [ontology](./ontology.md), we know that counties are going to be `kwg-ont:AdministrativeRegion_3`.
 Building our query, we can start with
@@ -78,7 +78,7 @@ SELECT * WHERE {
 }
 ```
-We also know that the words "Santa Barbara" should be in the `rdfs:label` - so let's add some REGEX to our query.
+We also know that the words "Santa Barbara" should be in the `rdfs:label` — so let's add some REGEX to our query.
 ```SPARQL
 SELECT * WHERE {
@@ -88,7 +88,7 @@ SELECT * WHERE {
 }
 ```
-drawing
+drawing
 From the results, we see that there are *several* counties whose names include "Santa Barbara".
 By process of eliminiation, we can make a good assumption that the node we're after is the first one, [Earth.North_America.United_States.USA.5.42_1](https://stko-kwg.geog.ucsb.edu/graphdb/resource?uri=http:%2F%2Fstko-kwg.geog.ucsb.edu%2Flod%2Fresource%2FEarth.North_America.United_States.USA.5.42_1&role=subject).
@@ -107,31 +107,31 @@ SELECT * WHERE {
 Running this yields the following results
-drawing
+drawing
 ### Step 5: Getting Temporal Data
-Looking back at our initial [Hazard node](https://stko-kwg.geog.ucsb.edu/graphdb/resource?uri=http:%2F%2Fstko-kwg.geog.ucsb.edu%2Flod%2Fresource%2Fhazard.1180930.5434012&role=subject) from Step 2, we can see that there's temporal information attached to it. We can come to that conclusion by realizing that there's a relation (kwg-ont:hasTemporalScope) that has words related to time in it. Now this isn't a technical approach to navigating the graph - but it's practical. If you have the ontology memorized (which honestly no one does), then you'd know that `kwg-ont:hasTemporalScope` links nodes to temporal data.
+Looking back at our initial [Hazard node](https://stko-kwg.geog.ucsb.edu/graphdb/resource?uri=http:%2F%2Fstko-kwg.geog.ucsb.edu%2Flod%2Fresource%2Fhazard.1180930.5434012&role=subject) from Step 2, we can see that there's temporal information attached to it. We can come to that conclusion by realizing that there's a relation (`kwg-ont:hasTemporalScope`) that has words related to time in it. Now this isn't a technical approach to navigating the graph — but it's practical. If you had the ontology memorized (which honestly no one does), then you'd know that `kwg-ont:hasTemporalScope` links nodes to temporal data.
-Based on the name `interval` - we can probably guess that this node is going to have some sort of start and end date. Let's take a look.
+Based on the name `interval` — we can probably guess that this node is going to have some sort of start and end date. Let's take a look.
-Clicking on the [kwgr:interval.200504221530_200504221930EST](https://stko-kwg.geog.ucsb.edu/graphdb/resource?uri=http:%2F%2Fstko-kwg.geog.ucsb.edu%2Flod%2Fresource%2Finterval.200504221530_200504221930EST&role=subject) object - we're brought to the node that holds the temporal relations for this Wildfire.
+Clicking on the [kwgr:interval.200504221530_200504221930EST](https://stko-kwg.geog.ucsb.edu/graphdb/resource?uri=http:%2F%2Fstko-kwg.geog.ucsb.edu%2Flod%2Fresource%2Finterval.200504221530_200504221930EST&role=subject) object — we're brought to the node that holds the temporal relations for this Wildfire.
-drawing
+drawing
-Again, just by looking at the relations we can mostly tell what we're looking at. We see two relations of interest - one that represents the beginning of the wildfire, and one that represents the end. Let's click on the node that represents the beginning.
+Again, just by looking at the relations we can mostly tell what we're looking at. We see two relations of interest — one that represents the beginning of the wildfire, and one that represents the end. Let's click on the node that represents the beginning.
-drawing
+drawing
-Recall that we want the year that each wildfire happened. Without referencing the ontology, we should be able to tell that `time:inXSDgYear` is the predicate that we want.
+Recall that we want the year that each wildfire began. Without referencing the ontology, we should be able to tell that `time:inXSDgYear` is the predicate that we want.
-Using the path that we just followed, we can write a small proof of concept query that gets the years of wildfires.
+Using the path that we just followed, we can write a small proof-of-concept query that gets the years of wildfires.
-```
+```SPARQL
 PREFIX kwg-ont:
 PREFIX time:
-select ?year where {
-    ?wildfire a kwg-ont:Wildfire .
+select ?year where {
+    ?wildfire a kwg-ont:Wildfire .
     ?wildfire kwg-ont:hasTemporalScope ?temporal_scope .
     ?temporal_scope time:hasBeginning ?wilfire_start .
     ?wildfire_start time:inXSDgYear ?year .
@@ -145,7 +145,7 @@ Now that we have separate queries to get
 1. Wildfires
 2. Wildfires within Santa Barbara County
 3. Number of deaths per wildfire
-4. They year of the wildfire
+4. Year each wildfire began
 We'll combine them together to form our final query to get the data we want.
@@ -160,7 +160,7 @@ SELECT ?fire_name ?direct_deaths ?year WHERE {
     ?fire a kwg-ont:Wildfire .
     ?fire rdfs:label ?fire_name .
     ?fire kwg-ont:sfWithin kwgr:Earth.North_America.United_States.USA.5.42_1 .
-    ?wildfire kwg-ont:hasTemporalScope ?temporal_scope .
+    ?fire kwg-ont:hasTemporalScope ?temporal_scope .
     ?temporal_scope time:hasBeginning ?wilfire_start .
     ?wildfire_start time:inXSDgYear ?year .
     ?fire sosa:isFeatureOfInterestOf ?observation_collection .
@@ -174,8 +174,8 @@ SELECT ?fire_name ?direct_deaths ?year WHERE {
 To summarize the steps and key points above...
-1. Start small, build big
-    1. Build paths to each thing that you want
-    2. Combine the smaller queries and logic together to form a larger query
-2. Use the SPARQL editor to find relevant nodes and explore them in GraphDB's interface
-3. The temporal, spatial, and data representations are similar throughout the database. Learn each and be able use the same pattern everywhere
+1. Start small, build big.
+    1. Build paths to each thing that you want.
+    2. Combine the smaller queries and logic together to form a larger query.
+2. Use the SPARQL editor to find relevant nodes and explore them in GraphDB's interface.
+3. The temporal, spatial, and data representations are similar throughout the database. Learn each and be able to use the same pattern everywhere.
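A note on the two temporal queries that appear as context in the patch above: `time:hasBeginning` binds `?wilfire_start`, while the following triple reads `?wildfire_start`. With two different names, the last pattern shares no variable with the rest of the query, so `?year` would match any node carrying `time:inXSDgYear` rather than the beginning of the fire in question. A minimal sketch of the proof-of-concept query with one consistent variable name is below. The `kwg-ont:` IRI is an assumption (it is not shown anywhere in this series); `time:` is the standard OWL-Time namespace.

```SPARQL
# Assumed namespace for kwg-ont; adjust to the IRI used by your endpoint.
PREFIX kwg-ont: <http://stko-kwg.geog.ucsb.edu/lod/ontology/>
PREFIX time: <http://www.w3.org/2006/time#>

SELECT ?wildfire ?year WHERE {
    ?wildfire a kwg-ont:Wildfire .
    ?wildfire kwg-ont:hasTemporalScope ?temporal_scope .
    # The same variable is bound here and read on the next line,
    # which is what joins the year to this wildfire's start.
    ?temporal_scope time:hasBeginning ?wildfire_start .
    ?wildfire_start time:inXSDgYear ?year .
} LIMIT 10
```

The same rename would apply to the combined query, where the start instant hangs off `?fire` instead of `?wildfire`.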
From daacca6359678f37e7bd2f1df58615f11064f396 Mon Sep 17 00:00:00 2001
From: Kitty Currier
Date: Sat, 6 Jul 2024 11:42:38 +0800
Subject: [PATCH 6/6] Edit very minor wording in self-hosted.md

---
 self-hosted.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/self-hosted.md b/self-hosted.md
index 389f3e1..e463187 100644
--- a/self-hosted.md
+++ b/self-hosted.md
@@ -1,6 +1,6 @@
 # Self Hosted Instances
-With a bit of work, it's possible to deploy KnowWhereGraph stack on your own machine.
+With a bit of work, it's possible to deploy the KnowWhereGraph stack on your own machine.
 ## Should You Do This?
@@ -26,7 +26,7 @@ Doing the above is most likely more than you need, and can be complicated to dep
 ## Prerequisites
 1. Compute resources (requirements are listed in the deployment repository)
-2. Raw triplified data (data in the form of .ttl, .trig, etc). **We do not provide this and cannot be generated by you**
+2. Raw triplified data (data in the form of .ttl, .trig, etc). **We do not provide this and it cannot be generated by you**
 3. Fortitude and willpower to deploy the stack
 ## Deployment