<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no">
<title>Open Standards-Based Development for Discovery and Interoperability - OR2023 Presentation</title>
<link rel="stylesheet" href="dist/reset.css">
<link rel="stylesheet" href="dist/reveal.css">
<link rel="stylesheet" href="dist/theme/solarized.css">
<link rel="stylesheet" href="dist/custom.css">
<link rel="stylesheet" href="dist/timeline.css">
<!-- Theme used for syntax highlighted code -->
<link rel="stylesheet" href="plugin/highlight/monokai.css">
</head>
<body>
<div class="reveal">
<div class="slides">
<section>
<h2>Open Standards-Based Development for Discovery & Interoperability</h2>
<span>Tiffany Chan</span>
<a href="https://github.com/UVicLibrary" target="_blank">
<img class="inline-icon" src="images/github-mark.png">
</a>
<a href="https://twitter.com/TiffChan29" target="_blank">
<img class="inline-icon" src="images/icons8-twitter-50.png">
</a>
 | 
<span>tjychan@uvic.ca</span>
<br/>
<span>Senior Developer | University of Victoria Libraries</span> <img class="inline-icon" style="margin-bottom: -10px" src="images/canada.png"/><br/>
<span>15 June 2023 | Open Repositories 2023</span><br/>
<span>Link to slides: <a href="https://uviclibrary.github.io/OR2023Pres/" target="_blank">uviclibrary.github.io/OR2023Pres</a></span>
</section>
<section data-background="images/UVicEdge-Zoom-background-library.jpg">
<div class="text-background slide-link">
<span>Link to slides: <a href="https://uviclibrary.github.io/OR2023Pres/" target="_blank">uviclibrary.github.io/OR2023Pres</a></span>
</div>
<div class="text-background">
<h2>University of Victoria ("UVic") Libraries</h2>
<ul>
<li>~20,000 students, 900 full-time faculty, 150 library staff</li>
<li class="fragment">Multiple digital platforms for our collections</li>
</ul>
</div>
<aside class="notes">
<ul>
<li>We are a mid-size university by Canadian standards</li>
<li>We use several platforms to deposit and disseminate our stuff, including DSpace, Internet Archive, Dataverse and others</li>
<li>One of those platforms, which I'm going to describe in detail, is called Vault</li>
</ul>
</aside>
</section>
<section data-background="images/all_collections.jpg">
<div class="text-background">
<h2>Vault</h2>
<ul>
<li>Holds digitized materials from Special Collections & Archives, faculty/community partnerships</li>
<li class="fragment">Customized instance of <a href="https://hyku.samvera.org/" target="_blank">Hyku</a> (<a href="https://samvera.org/what-is-samvera/samvera-open-source-repository-framework" target="_blank">Samvera</a>), migrated from ContentDM starting in 2018</li>
<li class="fragment">90+ collections, 12000+ works, 98000+ files</li>
<li class="fragment">2 developers, 2-3 metadata staff working directly in/with <a href="https://vault.library.uvic.ca" target="_blank">Vault</a> (among other projects)</li>
</ul>
</div>
<aside class="notes">
<ul>
<li>We call it Vault because it mostly holds digitized objects from our Special Collections and Archives (typically our most unique and rare materials)</li>
<li>Also houses materials from faculty and community partnerships, such as audio recordings or oral histories</li>
<li>Vault is built on open-source software called Hyku, which is maintained by the Samvera community</li>
<li>At present, we have 94 collections with over 12000 works in them. Those works comprise over 98000 files; about 70% of works were previously in ContentDM</li>
<li>To give you an idea of staffing levels, we have 2 developers who spend ~80% of their time on Vault and 2-3 metadata staff who spend 5-30% of their time creating or editing metadata</li>
</ul>
</aside>
</section>
<section data-background="images/all_collections.jpg">
<div class="text-background">
<h2>Example Collections</h2>
</div>
</section>
<section data-background="images/orpen.jpg">
<div style="position: absolute; width: 45%; top: -330px; right: -150px; box-shadow: 0 1px 4px rgba(0,0,0,0.5), 0 5px 25px rgba(0,0,0,0.2); background-color: rgba(0, 0, 0, 0.9); padding: 20px; font-size: 20px; text-align: left;">
<h2 class="collection-title"><a href="https://vault.library.uvic.ca/collections/61da6cbd-02e8-4dd3-8e9d-ccbd2947ef4f" target="_blank">Sir William Orpen Illustrated Letters</a></h2>
</div>
<aside class="notes">
<ul>
<li>We have several collections of correspondence (letters) in Vault, many from the early 1900s that have been scanned and uploaded as high-resolution images</li>
<li>IIIF viewer pictured here</li>
</ul>
</aside>
</section>
<section data-background-video="video/mathew_ko_shortened.mp4" >
<div style="position: absolute; width: 45%; left: -50px; top: 300px; box-shadow: 0 1px 4px rgba(0,0,0,0.5), 0 5px 25px rgba(0,0,0,0.2); background-color: rgba(0, 0, 0, 0.9); padding: 20px; font-size: 20px; text-align: left;">
<h2 class="collection-title"><a href="https://vault.library.uvic.ca/collections/ed61bace-5c1c-4e77-975e-0f6f943ae92d" target="_blank">Mathew Ko Colour Films:</a><br/> Victoria's Chinatown and Region</h2>
</div>
<aside class="notes">
<ul>
<li>Mathew Ko Colour Films collection is a collection of home movies from the early 1950s</li>
<li>Many of our collections, like this one, depict local places or events</li>
<li>Examples of things we do NOT put in Vault: theses & dissertations (they go in DSpace), datasets (Dataverse)</li>
</ul>
</aside>
</section>
<section data-background="images/whales.jpg">
<div class="text-background">
<h2 class="no-wrap">Why Migrate? <br/>Support for Open Standards</h2>
<ul>
<li><a href="https://iiif.io/" target="_blank">IIIF</a>: automatic manifest generation, image server (<a href="https://github.com/sul-dlss/riiif" target="_blank">Riiif</a>), and viewer (<a href="https://universalviewer.io/" target="_blank">Universal Viewer</a>)</li>
<li class="fragment">Linked Data Platform with <a href="https://fedora.lyrasis.org/" target="_blank">Fedora</a></li>
</ul>
</div>
<aside class="notes">
<ul>
<li>Why did we choose Samvera for these types of collections? We wanted a platform that supported openness and interoperability</li>
<li>For example, Hyku uses IIIF, which is a set of web standards and APIs for disseminating & accessing images</li>
<li>Hyku creates IIIF manifests automatically when ingesting images and provides a IIIF server as well as image viewer</li>
<li>Hyku also uses Fedora as its underlying repository system, which has built-in support for Linked Data, an important factor for our institution</li>
</ul>
</aside>
</section>
<section>
<h2>Metadata:<br/> Linked Data and URI Fields</h2>
<table>
<thead>
<tr>
<th>Field Name(s)</th>
<th class="no-wrap">Authority or URI</th>
</tr>
</thead>
<tbody>
<tr>
<td>Provider, Creator, Contributor, Subject, Geographic Coverage, Physical Repository</td>
<td><a href="https://www.oclc.org/research/areas/data-science/fast.html" target="_blank">(OCLC) FAST</a></td>
</tr>
<tr>
<td>Genre</td>
<td><a href="https://www.getty.edu/research/tools/vocabularies/aat/about.html" target="_blank">Getty Art and Architecture Thesaurus</a></td>
</tr>
<tr>
<td>Rights Statement</td>
<td><a href="https://rightsstatements.org/page/1.0/" target="_blank">rightsstatements.org</a>, <br/><a href="https://creativecommons.org/about/cclicenses/" target="_blank">Creative Commons</a></td>
</tr>
</tbody>
</table>
<p class="fragment" style="font-size: 24px">* See Figure 3 in <a href="https://www.tandfonline.com/doi/full/10.1080/01639374.2023.2204309" target="_blank">our recent article</a> for a more detailed table and discussion of metadata</p>
<aside class="notes">
<ul>
<li>Hyku also lets us use different controlled vocabularies to make our data more interoperable (i.e. able to be understood by other systems that use or are aware of those vocabularies)</li>
<li>This is a table showing all our metadata fields that use controlled vocabularies, along with the specific vocabulary that we use</li>
<li>As you can see, most of our fields use OCLC FAST, except for our rights and genre fields</li>
<li>One of our biggest, most resource-intensive challenges during migration was not just dumping data out of ContentDM, but also reconciling that data with a new Vault data model</li>
<li>This often meant converting textual values into unique identifiers (URIs)</li>
<li>* Advance slide *</li>
<li>For more details on how we implemented and migrated to FAST, you can read the article that my colleagues Dean Seeman and Karen Dykes and I wrote. I've linked to it on this slide.</li>
</ul>
</aside>
</section>
<section>
<h2>URIs and Labels on the Backend</h2>
<pre>
<code class="json">
"creator":["http://id.worldcat.org/fast/78887"],
"creator_label":["Orpen, William, Sir, 1878-1931"],
"contributor":["http://id.worldcat.org/fast/2012667"],
"contributor_label":["Glenavy, Beatrice Moss Campbell,
Baroness, 1883-1970"],
"subject":["http://id.worldcat.org/fast/1050538",
"http://id.worldcat.org/fast/78887",
"http://id.worldcat.org/fast/2012667",
"http://id.worldcat.org/fast/1050534"],
"subject_label":["Painters--Correspondence",
"Orpen, William, Sir, 1878-1931",
"Glenavy, Beatrice Moss Campbell, Baroness, 1883-1970",
"Painters--Biography"]</code>
</pre>
<aside class="notes">
<ul>
<li>This is just an example of what the metadata looks like as indexed in Vault</li>
<li>For every controlled vocabulary field, we have both the URI and the human-readable label saved in a corresponding "label field"</li>
</ul>
</aside>
</section>
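<section data-markdown>
<textarea data-template>
## Pairing URIs with Labels

A minimal sketch, written in Python rather than Vault's own Ruby, of how a consumer of the indexed metadata could match each URI to its label. It assumes that a field and its `_label` counterpart hold their values in the same order, as they do in the indexed example; the `pair_uris_with_labels` helper and the trimmed-down `doc` dictionary are illustrative and not part of Vault.

```python
def pair_uris_with_labels(doc, field):
    """Return (uri, label) tuples for one controlled-vocabulary field.

    Assumption: doc[field] and doc[field + "_label"] are parallel lists.
    """
    uris = doc.get(field, [])
    labels = doc.get(field + "_label", [])
    if len(uris) != len(labels):
        raise ValueError(f"{field}: {len(uris)} URIs but {len(labels)} labels")
    return list(zip(uris, labels))


# Values copied from the indexed example (subject list shortened).
doc = {
    "creator": ["http://id.worldcat.org/fast/78887"],
    "creator_label": ["Orpen, William, Sir, 1878-1931"],
    "subject": [
        "http://id.worldcat.org/fast/1050538",
        "http://id.worldcat.org/fast/78887",
    ],
    "subject_label": [
        "Painters--Correspondence",
        "Orpen, William, Sir, 1878-1931",
    ],
}

for uri, label in pair_uris_with_labels(doc, "subject"):
    print(f"{label}: {uri}")
```

Raising on a length mismatch is a design choice for this sketch: silently pairing misaligned lists would attach the wrong label to a URI.
</textarea>
</section>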
<section>
<h2>Labels in the Interface</h2>
<img src="images/orpen_metadata.jpg"/>
<aside class="notes">
<p>On the actual website (web interface), we only display the human-readable labels.</p>
</aside>
</section>
<section>
<h3>Accessing Linked Data via the IIIF Manifest or RDF/TTL</h3>
<p>Go to the page for any work and add /manifest.json or .ttl to the end of the URL</p>
<div class="fragment">
<p><a href="https://vault.library.uvic.ca/concern/generic_works/1a597e67-17a8-4c97-9c63-c835a41f5474?locale=en" target="_blank">Example:</a></p>
<span class="long-url">vault.library.uvic.ca/concern/generic_works/1a597e67-17a8-4c97-9c63-c835a41f5474</span>
<span class="fragment fade-out long-url" style="color:#e06666">?locale=en</span><span class="fragment long-url fade-in-then-out" style="color:#02951b">/manifest.json</span><span class="fragment long-url" style="color:#4f46e5">.ttl</span>
</div>
<aside class="notes">
<p>How can people or machines access our linked data?</p>
<ol>
<li>Go to the page for any work</li>
<li>If there is a question mark in the URL, delete that and everything after it</li>
<li>Add /manifest.json to the end to see the IIIF manifest, or add .ttl to the end to see Turtle (RDF)</li>
</ol>
</aside>
</section>
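<section data-markdown>
<textarea data-template>
## Building the Linked Data URLs in Code

The same steps can be scripted. This is a minimal sketch using only Python's standard library; the `linked_data_urls` function name is an assumption made for illustration, and the URL is the example work from the previous slide.

```python
from urllib.parse import urlsplit, urlunsplit


def linked_data_urls(work_url):
    """Return the IIIF manifest and Turtle URLs for a work page URL."""
    parts = urlsplit(work_url)
    # Drop the query string (e.g. "?locale=en") and any fragment.
    path = parts.path.rstrip("/")
    base = urlunsplit((parts.scheme, parts.netloc, path, "", ""))
    return {"manifest": base + "/manifest.json", "turtle": base + ".ttl"}


urls = linked_data_urls(
    "https://vault.library.uvic.ca/concern/generic_works/"
    "1a597e67-17a8-4c97-9c63-c835a41f5474?locale=en"
)
print(urls["manifest"])
print(urls["turtle"])
```

The function only builds the two URLs; fetching and parsing the responses is left to whichever HTTP and RDF tools you already use.
</textarea>
</section>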
<section data-background="images/open_source.jpg">
<div class="text-background">
<h2 class="no-wrap">Why Migrate to <br/>Open-Source Software</h2>
<ul>
<li>More flexibility and control over how we describe, organize, and present our objects and collections</li>
<li>Customized features and interface tools</li>
</ul>
</div>
<aside class="notes">
<ul>
<li>The other important consideration is that Vault/Hyku is open-source software, meaning anyone can use, examine, alter and redistribute its source code (definition from <a href="https://www.ibm.com/topics/open-source/topics/open-source" target="_blank">ibm.com</a>)</li>
<li>This lets us build custom features that we use in production</li>
</ul>
</aside>
</section>
<section data-background="images/time.jpg">
<div class="text-background">
<h2 class="no-wrap">Extended Date/Time Format</h2>
<ul>
<li><a href="https://www.loc.gov/standards/datetime/">EDTF</a>: a way to express nuanced dates (using symbols) that basic ISO formats cannot</li>
<li class="fragment">Examples: 1912~ means "approximately 1912"</li>
<li class="fragment">../1000 means "Before 1000 (AD)"</li>
</ul>
</div>
<aside class="notes">
<ul>
<li>One of these custom features is support for EDTF, or Extended Date/Time Format, which was a request from our metadata department</li>
<li>This Hyku implementation was developed by Braydon Justice and me with the help of a couple of Ruby gems</li>
<li>EDTF uses special symbols to express complexity in dates that you couldn't get with basic ISO formats</li>
<li><em>* Advance slides *</em></li>
<li>The goal was to save and index EDTF date strings and then display humanized versions on the front end</li>
</ul>
</aside>
</section>
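<section data-markdown>
<textarea data-template>
## Reading EDTF Symbols: A Sketch

To make the notation concrete, here is a minimal Python sketch that turns a few EDTF strings into words. It covers only three forms (a plain year, an approximate year such as `1912~`, and an open-start interval such as `../1000`), and its wording follows the examples on the previous slide. The `humanize_edtf` function is hypothetical; it is not the code Vault uses and does not reproduce the output of any EDTF library.

```python
import re


def humanize_edtf(value):
    """Describe a small subset of EDTF strings in words.

    Supported: "YYYY", "YYYY~" (approximate) and "../YYYY" (open start).
    Anything else is returned unchanged.
    """
    if re.fullmatch(r"\d{4}", value):
        return value
    match = re.fullmatch(r"(\d{4})~", value)
    if match:
        return f"approximately {match.group(1)}"
    match = re.fullmatch(r"\.\./(\d{4})", value)
    if match:
        # int() drops leading zeros, so "../0950" reads as "before 950".
        return f"before {int(match.group(1))}"
    return value


for edtf in ["1912~", "../1000", "1950"]:
    print(edtf, "->", humanize_edtf(edtf))
```

Returning unsupported values unchanged keeps the sketch safe to run over mixed data, at the cost of hiding strings it cannot interpret.
</textarea>
</section>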
<section>
<h2>EDTF Date Indexing</h2>
<div class="r-stack">
<div>
<p style="margin-bottom: 3rem;">Dates are entered using EDTF notation...</p>
<a href="https://vault.library.uvic.ca/concern/generic_works/41991908-2b23-41bd-a682-eaacb211ede3" target="_blank"><img src="images/date_created_form.jpg"/></a>
</div>
<div class="fragment">
<p style="margin-bottom: 3rem;">And indexed in multiple formats</p>
<pre>
<code class="json">
"date_created":["1476/1500~"],
"year_sort":"1476-01-01T00:00:00Z",
"year_range":[1476, 1477, 1478, 1479 ... 1500]
</code>
</pre>
</div>
</div>
<aside class="notes">
<ul>
<li>To give an example, here is what the data looks like when our metadata staff enters it into Vault...</li>
<li><em>* Advance slide *</em></li>
<li>...and here is what the date looks like in our indexing software. We use the <a href="https://github.com/inukshuk/edtf-ruby" target="_blank">edtf-ruby gem</a>, developed by Sylvester Keil and others, to parse the EDTF date string</li>
<li>Once parsed, the date is saved in 2 additional formats alongside the original EDTF string: the earliest possible date, which is used for sorting items, and a list of integers that correspond to each possible year.</li>
</ul>
</aside>
</section>
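<section data-markdown>
<textarea data-template>
## Deriving the Index Fields: A Sketch

The indexing step can be illustrated in a few lines of Python. This is a simplified stand-in, not Vault's Ruby code and not the edtf-ruby parser: it understands only `YYYY` and `YYYY/YYYY` values with optional `~`, `?` or `%` qualifiers, and the `index_edtf_interval` function name is an assumption. The field names match the indexed example on the previous slide.

```python
import re

QUALIFIERS = "~?%"  # EDTF markers for approximate and/or uncertain dates


def index_edtf_interval(value):
    """Build sort and range fields for "YYYY" or "YYYY/YYYY" strings.

    Qualifiers are ignored for indexing purposes. Other EDTF forms
    (seasons, open intervals, sets) are not handled in this sketch.
    """
    parts = value.split("/")
    if len(parts) > 2:
        raise ValueError(f"unsupported EDTF value: {value!r}")
    years = []
    for part in parts:
        cleaned = part.strip(QUALIFIERS)
        if not re.fullmatch(r"\d{4}", cleaned):
            raise ValueError(f"unsupported EDTF value: {value!r}")
        years.append(int(cleaned))
    start, end = years[0], years[-1]
    if end < start:
        raise ValueError(f"interval ends before it starts: {value!r}")
    return {
        "date_created": [value],
        # Earliest possible date, used for sorting.
        "year_sort": f"{start:04d}-01-01T00:00:00Z",
        # Every year the work could have been created in.
        "year_range": list(range(start, end + 1)),
    }


indexed = index_edtf_interval("1476/1500~")
print(indexed["year_sort"])
print(indexed["year_range"][0], "...", indexed["year_range"][-1])
```

Storing every candidate year as an integer is what allows a range facet to match a work whose date is only known approximately.
</textarea>
</section>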
<section>
<section data-background-iframe="https://vault.library.uvic.ca/concern/generic_works/41991908-2b23-41bd-a682-eaacb211ede3?locale=en" data-background-interactive>
<div class="iframe-descr" style="right: 0;">
<h2>EDTF in the Interface</h2>
<p>Dates are humanized for display. The links in the Date Created field link to faceted searches for items in the same date range.</p>
</div>
<aside class="notes">
<ul>
<li><em>* Scroll to Date Created field and highlight text *</em></li>
<li>Here's what EDTF looks like in the interface, where we display human-readable dates generated by the <a href="https://github.com/corylown/edtf-humanize" target="_blank">edtf-humanize</a> gem by Duke Libraries.</li>
<li>The date created links to a search for all items created in the same year or year range using the <a href="https://github.com/projectblacklight/blacklight_range_limit" target="_blank">Blacklight Range Limit</a> gem.</li>
<li><em>* Click link *</em></li>
<li>This draws from the list of years generated from the EDTF date.</li>
</ul>
</aside>
</section>
<section data-background-video="video/edtf.mp4"></section>
</section>
<section>
<section data-background-iframe="https://vault.library.uvic.ca" data-background-interactive>
<div class="iframe-descr" style="top: 275px; right: -50px;">
<h2>Homepage Facets</h2>
<p>Allows users to filter using broad categories (vault.library.uvic.ca). Users can also browse collections and works visually.</p>
</div>
<aside class="notes">
<ul>
<li>We've also heavily customized our home page. Here, users can quickly run faceted searches by genre, time period (again, EDTF dates), popular subjects, or (geographic) place.</li>
<li>We also have the ability to upload collection thumbnails or select them from works in the collection </li>
<li>Collections are also listed alphabetically under "All Collections"</li>
<li>We also display recently created works or collections (this is mostly for staff)</li>
</ul>
</aside>
</section>
<section data-background="images/homepage_wide.jpg"></section>
</section>
<section>
<section data-background-iframe="https://vault.library.uvic.ca/concern/generic_works/496f803b-24bb-415d-a7fc-8e9af1b38741?locale=en" data-background-interactive>
<div class="iframe-descr" style="left: 0; top: 275px;">
<h2>Interactive Video Transcripts</h2>
<p>Able Player uses .vtt files (WebVTT format) to create interactive transcripts for videos when available.</p>
</div>
<aside class="notes">
<ul>
<li>Rather than use the default video viewer that comes with Hyrax/Hyku, we use an accessible audio/video player called <a href="https://ableplayer.github.io/ableplayer/" target="_blank">AblePlayer</a>, created by Terrill Thompson.</li>
<li>AblePlayer can automatically display subtitles and transcripts from a specific file format called WebVTT (Web Video Text Tracks).</li>
<li>With this player, users can click on specific points in the transcript to skip to a specific timestamp (*demo*).</li>
</ul>
</aside>
</section>
<section data-background-video="video/transcripts.mp4"></section>
</section>
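<section data-markdown>
<textarea data-template>
## What a Player Reads from a .vtt File

To show why WebVTT makes click-to-seek transcripts possible, this minimal Python sketch extracts the start time, end time and text of each cue. It is not Able Player's implementation (Able Player is JavaScript); the `to_seconds` and `parse_cues` helpers are hypothetical, the sample transcript text is invented, and only basic cues are handled.

```python
import re

TIMESTAMP = re.compile(r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})")


def to_seconds(stamp):
    """Convert "hh:mm:ss.ttt" or "mm:ss.ttt" to seconds."""
    match = TIMESTAMP.fullmatch(stamp)
    if match is None:
        raise ValueError(f"bad timestamp: {stamp!r}")
    hours, minutes, seconds, millis = match.groups()
    whole = int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds)
    return whole + int(millis) / 1000


def parse_cues(vtt_text):
    """Return a list of (start_seconds, end_seconds, text) tuples."""
    cues = []
    for block in vtt_text.strip().split("\n\n"):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            # The timing line may follow an optional cue identifier.
            if "-->" in line:
                start, end = [p.strip().split(" ")[0] for p in line.split("-->")]
                text = " ".join(lines[i + 1:])
                cues.append((to_seconds(start), to_seconds(end), text))
                break
    return cues


# Invented sample data for illustration only.
sample = """WEBVTT

00:00:01.000 --> 00:00:04.500
Welcome to the collection.

00:01:05.250 --> 00:01:09.000
This film was shot in the early 1950s.
"""

cues = parse_cues(sample)
for start, end, text in cues:
    print(f"{start:8.3f}s  {text}")
```

Each cue's start time is the point a player would seek to when a user clicks that line of the transcript.
</textarea>
</section>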
<section data-background="images/code.jpg">
<div class="text-background">
<h2>Running the Code</h2>
<ul>
<li>Code: <a href="https://github.com/UVicLibrary/Vault" target="_blank">github.com/UVicLibrary/Vault</a><br/></li>
<li>Switch to the <code class="inline-code">docker_multitenant</code> branch</li>
<li><a href="https://github.com/UVicLibrary/Vault/blob/docker_multitenant/documentation/Developing_with_Docker.md">Instructions</a>: run Vault locally with Docker</li>
</ul>
</div>
<aside class="notes">
<ul>
<li>Vault's code is open-source and is available on GitHub</li>
<li>For local testing and development, we use software called <a href="https://www.docker.com/products/docker-desktop/" target="_blank">Docker Desktop</a> (we don't run Docker in production, although other institutions do)</li>
<li>We have step-by-step instructions for how to set up a local instance of Vault with Docker</li>
</ul>
</aside>
</section>
<section data-background="images/open.jpg">
<div class="text-background">
<h2>On locally developed, open-source software</h2>
<ul>
<li class="fragment">Not free ($) but freedom: labour vs. product</li>
<li class="fragment">More responsive, shorter development cycles</li>
<li class="fragment no-wrap"><a href="https://hyku.samvera.org/2015/12/16/community_input.html" target="_blank">Hannah Frost: tension between customization/sustainability</a></li>
</ul>
</div>
<aside class="notes">
<ul>
<li>None of the software customizations I've just shown you would be possible with proprietary software like ContentDM.</li>
<li>However, there are some things to weigh before switching to open-source software like Hyku</li>
<li><em>* Advance slide *</em></li>
<li>Although the application itself is free to use, maintaining and customizing it is not; it's a question of where you want to spend the money (product vs. people/labour)</li>
<li>The major advantage of open-source is the freedom to modify (creative control)</li>
<li><em>* Advance slide *</em></li>
<li>Local open-source development also allows developers like me to be more responsive to requests or bugs; priorities are determined locally by people doing the work rather than a vendor</li>
<li>This encourages more communication between units</li>
<li><em>* Advance slide *</em></li>
<li>Hannah Frost points out that code customizations inevitably create "complications in the upgrade path." Any major upgrade to the underlying code base requires a lot of work to sort through and resolve conflicts</li>
</ul>
</aside>
</section>
<section>
<h2>Thanks!</h2>
<span>Tiffany Chan</span>
<a href="https://github.com/UVicLibrary" target="_blank">
<img class="inline-icon" src="images/github-mark.png">
</a>
<a href="https://twitter.com/TiffChan29" target="_blank">
<img class="inline-icon" src="images/icons8-twitter-50.png">
</a>
 | 
<span>tjychan@uvic.ca</span>
<br/>
<span>Vault: <a href="https://vault.library.uvic.ca" target="_blank">vault.library.uvic.ca</a></span>
<br/>
<span>Link to slides: <a href="https://uviclibrary.github.io/OR2023Pres/" target="_blank">uviclibrary.github.io/OR2023Pres</a></span>
<br/>
<span>Slides made with <a href="https://revealjs.com/" target="_blank">reveal.js</a></span>
</section>
</div>
</div>
<div>
</div>
<script src="dist/reveal.js"></script>
<script src="plugin/notes/notes.js"></script>
<script src="plugin/markdown/markdown.js"></script>
<script src="plugin/highlight/highlight.js"></script>
<script src="plugin/zoom/zoom.js"></script>
<script>
// More info about initialization & config:
// - https://revealjs.com/initialization/
// - https://revealjs.com/config/
Reveal.initialize({
hash: true,
preloadIframe: true,
// showNotes: true,
// Learn about plugins: https://revealjs.com/plugins/
plugins: [ RevealZoom, RevealMarkdown, RevealHighlight, RevealNotes ]
});
</script>
</body>
</html>