<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>A Quick Introduction to RDF and SPARQL</title>
<meta name="author" content="SMART Platforms">
<meta name="viewport" content="width=device-width,initial-scale=1.0">
<!-- Le HTML5 shim, for IE6-8 support of HTML elements -->
<!--[if lt IE 9]>
<script src="http://html5shim.googlecode.com/svn/trunk/html5.js"></script>
<![endif]-->
<link href="/assets/themes/twitter-2.0/css/pygments.css" rel="stylesheet">
<link href="/assets/themes/twitter-2.0/css/bootstrap.min.css" rel="stylesheet">
<link href="/assets/themes/twitter-2.0/css/bootstrap-responsive.min.css" rel="stylesheet">
<link href="/assets/themes/twitter-2.0/css/style.css" rel="stylesheet">
</head>
<body>
<div class="container">
<div class="content">
<section id="jekyll-page">
<div class="row">
<div class="span4">
<a href="/">
<img id='smart_top_logo' src="/images/smart.png"/>
</a>
<div id="left_nav">
<ul class="nav nav-list">
<li class="nav-header">Tutorials</li>
<li><a href="/howto/build_a_smart_app">Build a SMART App</a></li>
<li><a href="/howto/build_a_rest_app">Build a SMART REST App</a></li>
<li><a href="/howto/background_and_helper_apps">Background + Helper Apps</a></li>
<li><a href="/howto/howto_build_smart_frame_ui_apps">Frame UI Apps</a></li>
<li><a href="/howto/got_statins">Got Statins? App</a></li>
<li><a href="/howto/rx_reminder">RxReminder App</a></li>
<li class="nav-header">Data Model + Querying</li>
<li><a href="/datamodel/intro_to_rdf">Intro to RDF and SPARQL</a></li>
<li><a href="/datamodel/sparql_examples">SPARQL Examples for SMART</a></li>
<li><a href="/datamodel/smart_data">SMART Data: Best Practices</a></li>
<li><a href="/datamodel/deferred">$.Deferred for Parallel Queries</a></li>
<li class="nav-header">Reference</li>
<li><a href="/reference/data_model">Data Model</a></li>
<li><a href="/reference/rest_api">REST API</a></li>
<li><a href="/reference/app_manifest">App Manifest</a></li>
<li><a href="/reference/change_log">Changelog</a></li>
<li class="nav-header">Client Libraries</li>
<li><a href="/libraries/javascript">Javascript (SMART Connect)</a></li>
<li><a href="/libraries/python">Python</a></li>
<li><a href="/libraries/java">Java</a></li>
<li><a href="/libraries/dotnet">.NET</a></li>
<li><a href="/libraries/container_javascript">Container-side Javascript</a></li>
<li class="nav-header">Reference EMR Installation</li>
<li><a href="/install/linux">Ubuntu Linux</a></li>
<li><a href="/install/os_x">OS X</a></li>
<li class="nav-header">SMART 0.4 Update Guides</li>
<li><a href="/updates/smart_0_4/app/">For Apps</a></li>
<li><a href="/updates/smart_0_4/container/">For Containers</a></li>
<li class="nav-header">Presentation (2010-08-26)</li>
<li><a href="http://www.slideshare.net/jmandel/2010-0826smartarchitecture">Architecture</a></li>
<li><a href="http://www.slideshare.net/jmandel/2010-08-26-smart-governance">Governance</a></li>
<li><a href="http://media.smartplatforms.org/smart-screencast.mp4">Demo</a></li>
<li class="nav-header">Downloads</li>
<li><a href="/downloads/">Download Source + VM</a></li>
</ul>
</div>
</div>
<div class="span8" id="jekyll-page-content">
<div class="page-header">
<h1>A Quick Introduction to RDF and SPARQL <small></small></h1>
</div>
<div id="toc"> </div>
<p>The SMART API supplies patient record data in the form of an RDF graph. If
you've never used (or even heard of!) RDF, this document should help you get up
to speed. So let's jump right in!</p>
<h2>What is RDF, anyway?</h2>
<p>RDF, the Resource Description Framework, is a web standard "for representing
information about resources" (this according to the <a href="http://www.w3.org/TR/2004/REC-rdf-primer-20040210/">W3C's RDF
Primer</a>). In brief, it's a
flexible way to represent data in the form of sentences or "triples" that link a
subject, a predicate, and an object. For example, let's say we want to represent
the idea that "Mr. Smith takes atorvastatin". We might create the following
triple:</p>
<ul>
<li>subject: Mr. Smith</li>
<li>predicate: takes</li>
<li>object: atorvastatin</li>
</ul>
<p>There are two key ideas here:</p>
<ol><li>Everything (almost) is a resource.</li>
<li>Resources are related by triples.</li>
</ol>
<p>Let's explore each in more depth.</p>
<h2>Everything (almost) is a resource</h2>
<p>In RDF, every triple has a resource as its subject. In our example, we call Mr.
Smith a "resource" because he is a particular guy out there in the world. He is
not just the string of letters "M-r-.-S-m-i-t-h." Importantly, if I know Mr.
Alex Smith and you know Mr. Bob Smith, we are not talking about the same
resource! To prevent these kinds of mix-ups, resources in RDF aren't just
identified by strings like "Mr. Smith." Instead, they're represented by Uniform
Resource Identifiers (URIs): basically URLs that provide a built-in namespace. For
example, let's say my Mr. Smith maintains a web site at
<a href="http://alexsmith.somedomain.com">http://alexsmith.somedomain.com</a>. I might
refer to him by the URL
<a href="http://alexsmith.somedomain.com/me">http://alexsmith.somedomain.com/me</a>. Now
you certainly wouldn't confuse my Mr. Smith for yours! (Note that there doesn't have
to be an actual web page served at the address of a URI. The important thing is
that the URI identifies a resource. Uniformly.)</p>
<p>What about the predicate in our example, the word "takes"? Predicates in RDF are
resources, too. If we just used the string "takes" as our predicate, again we
might mean different things: I might mean "consumes a drug, as part of a daily
regimen", and you (cynic!) might mean "steals from his wife's pillbox on
Thursday mornings." To resolve this ambiguity, I could represent 'takes' as
<a href="http://joshuamandel.com/my_drug_vocabulary/takes">http://joshuamandel.com/my_drug_vocabulary/takes</a>. Over time, I could build up a rich vocabulary with all
kinds of terms, and use these as predicates in my RDF triples. In general,
things work best when people can agree on the meanings of terms and use a shared
vocabulary. So folks build up publicly defined vocabularies such as
<a href="http://xmlns.com/foaf/spec/">FOAF</a> (used to describe the elements of social
networks, like friends, names, and birthdays) or <a href="http://purl.org/dc/elements/1.1/">Dublin
Core</a> (used to describe metadata like the
Titles, Creators, and Publishers of resources). These shared vocabularies become
the basis for rich representation (and interpretation) of information about
resources.</p>
<p>And finally, what about "atorvastatin"? Again, the best way to represent a
concept like atorvastatin is as a URI that everyone can agree on. One
possibility is to use the drug's RxNorm Concept ID (in this case, 83367) as part
of the URI. For example, SMART uses the URI
<a href="http://link.informatics.stonybrook.edu/rxnorm/RXCUI/83367">http://link.informatics.stonybrook.edu/rxnorm/RXCUI/83367</a>, sharing a vocabulary with Stony Brook.
But recall we said almost everything is a resource. If we want, RDF lets us use
a simple string as the object of a triple. So, for example, consider this
representation of a haiku:</p>
<ul>
<li>subject: <a href="http://dilute.net/poems/25">http://dilute.net/poems/25</a></li>
<li>predicate: dcterms:title (Dublin Core Terms vocabulary's 'title' predicate)</li>
<li>object: "Haiku entitled / Substitutability: / the SMART way to go."</li>
</ul>
<p>In this case, I don't need to point to a resource as the title of my haiku. The
title is really just a string, after all -- so I can just represent it as such. </p>
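<p>As a rough illustration of the distinction between resources and plain strings,
here is a minimal Python sketch. The class names URI, Literal, and Triple are
assumptions made up for this example; they are not part of the SMART API or of
any RDF library.</p>

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class URI:
    """A resource, identified by its URI."""
    value: str


@dataclass(frozen=True)
class Literal:
    """A plain string value, usable only as the object of a triple."""
    value: str


@dataclass(frozen=True)
class Triple:
    """One subject-predicate-object statement (hypothetical helper type)."""
    subject: URI
    predicate: URI
    object: Union[URI, Literal]


# All three parts of this triple are resources.
smith_takes_statin = Triple(
    URI("http://alexsmith.somedomain.com/me"),
    URI("http://joshuamandel.com/my_drug_vocabulary/takes"),
    URI("http://link.informatics.stonybrook.edu/rxnorm/RXCUI/83367"),
)

# Here the object is just a string, so it is stored as a Literal.
haiku_title = Triple(
    URI("http://dilute.net/poems/25"),
    URI("http://purl.org/dc/terms/title"),
    Literal("Haiku entitled / Substitutability: / the SMART way to go."),
)

for triple in (smith_takes_statin, haiku_title):
    kind = "literal" if isinstance(triple.object, Literal) else "resource"
    print(triple.predicate.value, "has a", kind, "as its object")
```

<p>Keeping URI and Literal as separate types means a resource and a string with
the same text never compare as equal, which mirrors the "Mr. Smith" point made
earlier.</p>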
<h2>Resources are related by triples</h2>
<p>In RDF, the only way to represent relations among resources is by creating
triples. If graph theory is your thing, you can think of each triple as an arc in a
directed graph, running from subject to object and labeled with the predicate. The same resource can be the
subject (or object) of multiple triples. For example, consider my SMART haiku.
In addition to the triple above, I could add some more triples:</p>
<ul>
<li>subject: <a href="http://dilute.net/poems/25">http://dilute.net/poems/25</a></li>
<li>predicate: dc:creator (Dublin Core vocabulary's 'creator' predicate)</li>
<li>object: <a href="http://joshuamandel.com/me">http://joshuamandel.com/me</a></li>
</ul>
<ul>
<li>subject: <a href="http://joshuamandel.com/me">http://joshuamandel.com/me</a></li>
<li>predicate: foaf:name (FOAF vocabulary's 'name' predicate)</li>
<li>object: "Josh Mandel"</li>
</ul>
<p>Note that I am the object of one triple (as the creator of the haiku) and the
subject of another (as a person with a name)!</p>
<p>What about more complex relationships? For example, what if I want to represent
the fact that my breakfast this morning consisted of Joe's O's, milk, and
coffee? This is an open-ended data-modeling exercise, but I'll just point out
one approach, which involves creating a resource for "the stuff I had for
breakfast this morning" and adding relations to that. So then (in sketch form)
we'd have:</p>
<ul>
<li>subject: <a href="http://joshuamandel.com/me">http://joshuamandel.com/me</a></li>
<li>predicate: <a href="http://joshuamandel.com/my_food_vocabulary/ate">http://joshuamandel.com/my_food_vocabulary/ate</a></li>
<li>object: _:stuff_I_ate_this_morning</li>
</ul>
<ul>
<li>subject: _:stuff_I_ate_this_morning</li>
<li>predicate: rdf:li (RDF vocabulary's 'list item' predicate)</li>
<li>object: "Joe's O's"</li>
</ul>
<ul>
<li>subject: _:stuff_I_ate_this_morning</li>
<li>predicate: rdf:li</li>
<li>object: "milk"</li>
</ul>
<ul>
<li>subject: _:stuff_I_ate_this_morning</li>
<li>predicate: rdf:li</li>
<li>object: "coffee"</li>
</ul>
<p>Notice that I've loosely referred to a resource here as the "bunch of stuff I
ate this morning". I didn't give it a formal URI, because it doesn't exist
outside of the context of this particular RDF graph, and it's entirely defined
by its relations above. For cases like this, RDF provides anonymous or blank
nodes whose identifiers have meaning only within the context of a particular
graph.</p>
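<p>The breakfast graph can be sketched in a few lines of Python. Tagging each
term with its kind ("uri", "bnode", or "literal") is a convention invented for
this sketch, and the helper names are hypothetical; a real application would
normally rely on an RDF library instead.</p>

```python
# Hypothetical helpers: each term is a (kind, value) pair.
def uri(value):
    return ("uri", value)


def bnode(label):
    # The label only has meaning inside this one graph.
    return ("bnode", label)


def literal(text):
    return ("literal", text)


ME = uri("http://joshuamandel.com/me")
ATE = uri("http://joshuamandel.com/my_food_vocabulary/ate")
RDF_LI = uri("http://www.w3.org/1999/02/22-rdf-syntax-ns#li")
BREAKFAST = bnode("stuff_I_ate_this_morning")

# A graph is just a set of (subject, predicate, object) triples.
graph = {
    (ME, ATE, BREAKFAST),
    (BREAKFAST, RDF_LI, literal("Joe's O's")),
    (BREAKFAST, RDF_LI, literal("milk")),
    (BREAKFAST, RDF_LI, literal("coffee")),
}


def objects(graph, subject, predicate):
    """Return the objects of every triple with this subject and predicate."""
    return [o for (s, p, o) in graph if s == subject and p == predicate]


# The blank node is the object of one triple and the subject of three others.
meal = objects(graph, ME, ATE)[0]
foods = sorted(value for (_kind, value) in objects(graph, meal, RDF_LI))
print(meal[0], foods)
```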
<h2>Representing RDF Graphs</h2>
<p>So far, we've been talking about RDF graphs as theoretical sets of triples. How
do we write down or "serialize" an RDF graph in a way that lets us share it with
others? There are in fact several standard notations for representing an RDF
graph. The simplest representation is to write triples out, one per line, with a
period at the end of each line. URIs are enclosed in angle brackets (e.g.
<<a href="http://my_uri">http://my_uri</a>>); blank nodes are prefaced with the _: prefix
(e.g. _:my-blank-node); and strings are enclosed in quotes (e.g. "my string
value"):</p>
<pre><code><http://dilute.net/poems/25> <http://purl.org/dc/terms/title> "Haiku entitled / Substitutability: / the SMART way to go." .
</code></pre>
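<p>To make the line-per-triple notation concrete, here is a small Python sketch
that writes one triple in that style. The (kind, value) term convention and the
function names are assumptions of this example, blank nodes are written with
the _: prefix, and only backslashes, double quotes, and newlines are escaped in
strings, so treat it as a sketch and not as a complete serializer.</p>

```python
def format_term(term):
    """Render one (kind, value) term in the line-per-triple notation."""
    kind, value = term
    if kind == "uri":
        return "<" + value + ">"
    if kind == "bnode":
        return "_:" + value
    if kind == "literal":
        # Escape the characters that would otherwise break the quoted string.
        escaped = (
            value.replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
        )
        return '"' + escaped + '"'
    raise ValueError("unknown term kind: " + repr(kind))


def format_triple(triple):
    """Render a (subject, predicate, object) triple as one line ending in ' .'"""
    return " ".join(format_term(term) for term in triple) + " ."


haiku_title = (
    ("uri", "http://dilute.net/poems/25"),
    ("uri", "http://purl.org/dc/terms/title"),
    ("literal", "Haiku entitled / Substitutability: / the SMART way to go."),
)

print(format_triple(haiku_title))
```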
<p>An XML-based representation known as RDF/XML serializes the same triple more
verbosely:</p>
<div class="highlight"><pre><code class="html"><span class="cp"><?xml version="1.0"?></span>
<span class="nt"><rdf:RDF</span> <span class="na">xmlns:rdf=</span><span class="s">"http://www.w3.org/1999/02/22-rdf-syntax-ns#"</span>
<span class="na">xmlns:terms=</span><span class="s">"http://purl.org/dc/terms/"</span><span class="nt">></span>
<span class="nt"><rdf:Description</span> <span class="na">rdf:about=</span><span class="s">"http://dilute.net/poems/25"</span><span class="nt">></span>
<span class="nt"><terms:title></span>Haiku entitled / Substitutability: / the SMART way to
go.<span class="nt"></terms:title></span>
<span class="nt"></rdf:Description></span>
<span class="nt"></rdf:RDF></span>
</code></pre>
</div>
<h1>And what about SPARQL?</h1>
<p>SPARQL is a query language for interacting with RDF graphs. The syntax is
designed to look a bit like SQL, the structured query language used with
relational databases. The W3C maintains an <a href="http://www.w3.org/TR/rdf-sparql-query/">extremely
readable</a> standard that's peppered with
examples. Here, we'll barely skim the surface.</p>
<h2>A simple SPARQL query</h2>
<p>Given our breakfast graph above, let's write a query to find all the things I
ate! Here's a first attempt (not quite perfect):</p>
<pre><code>PREFIX food: <http://joshuamandel.com/my_food_vocabulary/>
SELECT ?f WHERE
{
<http://joshuamandel.com/me> food:ate ?f.
}
</code></pre>
<p>A bit of syntax: I've defined a prefix called "food" which I'll use to refer to
my personal food vocabulary. This is just for readability; it lets me later
write food:ate instead of the more verbose
<a href="http://joshuamandel.com/my_food_vocabulary/ate">http://joshuamandel.com/my_food_vocabulary/ate</a>.</p>
<p>Now here's what the query does: it looks for triples that match the pattern
inside the WHERE clause. In this case, that means triples whose subject is me, whose
predicate is food:ate, and whose object can be anything (indicated by the
question mark in ?f). My decision to use ?f as a variable name was completely
discretionary. I could have called it ?nourishment or ?xyzzy. The name only
matters within the context of my query.</p>
<p>But this query has a problem: it returns the blank node
_:stuff_I_ate_this_morning, and not the actual foods! Let's fix it by
adding to our WHERE clause:</p>
<pre><code>PREFIX food: <http://joshuamandel.com/my_food_vocabulary/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?individual_food WHERE
{
<http://joshuamandel.com/me> food:ate ?bunch_of_food.
?bunch_of_food rdf:li ?individual_food.
}
</code></pre>
<p>Now our WHERE clause includes two statements: we're looking for individual foods
that are items in the list of foods eaten by me. In other words, now we're
drilling down into the bunch of food to pull out individual items! This returns
a list of three bindings for ?individual_food: "coffee", "milk", and "Joe's
O's".</p>
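<p>To see what matching a WHERE clause involves, here is a naive Python sketch
that evaluates both queries by hand against the breakfast graph. It is not how
a real SPARQL engine works internally. The function names, the plain-string
terms, and the blank node label _:b0 are all assumptions made for this
illustration.</p>

```python
ME = "http://joshuamandel.com/me"
ATE = "http://joshuamandel.com/my_food_vocabulary/ate"
RDF_LI = "http://www.w3.org/1999/02/22-rdf-syntax-ns#li"

# The breakfast graph; "_:b0" is an arbitrary label for the blank node.
GRAPH = [
    (ME, ATE, "_:b0"),
    ("_:b0", RDF_LI, "Joe's O's"),
    ("_:b0", RDF_LI, "milk"),
    ("_:b0", RDF_LI, "coffee"),
]


def is_variable(term):
    return term.startswith("?")


def match_pattern(pattern, triple, bindings):
    """Extend `bindings` so that `pattern` matches `triple`, or return None."""
    result = dict(bindings)
    for pattern_term, triple_term in zip(pattern, triple):
        if is_variable(pattern_term):
            # A variable must keep the same value everywhere it appears.
            if result.setdefault(pattern_term, triple_term) != triple_term:
                return None
        elif pattern_term != triple_term:
            return None
    return result


def query(graph, patterns):
    """Return every set of variable bindings that satisfies all patterns."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            extended
            for bindings in solutions
            for triple in graph
            for extended in [match_pattern(pattern, triple, bindings)]
            if extended is not None
        ]
    return solutions


# First attempt: one pattern, which only finds the blank node.
first = query(GRAPH, [(ME, ATE, "?f")])

# Second attempt: two patterns joined on ?bunch_of_food.
second = query(GRAPH, [
    (ME, ATE, "?bunch_of_food"),
    ("?bunch_of_food", RDF_LI, "?individual_food"),
])

print([solution["?f"] for solution in first])
print(sorted(solution["?individual_food"] for solution in second))
```

<p>Because ?bunch_of_food appears in both patterns, only list items hanging off
the node that I "ate" survive the second step; that shared variable is what
joins the two statements.</p>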
<p>This was just the briefest introduction to the anatomy of a SPARQL query. For
lots more specific examples, try <a href="/datamodel/sparql_examples">SPARQL Examples for SMART</a>.</p>
</div>
</div>
</section>
</div>
<footer>
<div>
<a href='http://smartplatforms.org'>SMART Platforms</a> © 2012
</div>
</footer>
</div>
<script src="//ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js"></script>
<script>window.jQuery || document.write('<script src="/assets/themes/twitter-2.0/js/jquery.min.js"><\/script>')</script>
<script src="/assets/themes/twitter-2.0/js/bootstrap_aks.js"></script>
<script type="text/javascript">
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-33617191-1']);
_gaq.push(['_trackPageview']);
(function() {
var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
})();
</script>
<script src="/assets/app.js?v=0.1"></script>
<script>Toc.init($("#jekyll-page-content"), $("#toc"));</script>
</body>
</html>