From f34b88f942123b51988be8cc3d94f89d4c0b78ce Mon Sep 17 00:00:00 2001
From: Raymond Yee
Date: Wed, 8 Apr 2026 16:47:35 -0700
Subject: [PATCH] Enhance tutorials landing page and vocabularies page

Tutorials landing (tutorials/index.qmd):
- Add "What's in the Data?" table with source breakdown
- Add data files table with links to data.isamples.org
- Expand "Why Browser-Based?" with more detail
- Add "For Developers" section with GitHub/Zenodo/query docs links

Vocabularies page (models/index.qmd):
- Add ARDC link as vocabulary source of truth
- Add back-links to Architecture, Requirements, and Metadata Model
- Wrap core vocabularies in collapsible callout
- Clean up taxonomy descriptions

Addresses #87 (tutorials landing), plus metadata back-links and ARDC link from wireframe delta items.

Co-Authored-By: Claude Opus 4.6
---
 models/index.qmd    | 33 ++++++++++++++++-----------
 tutorials/index.qmd | 54 +++++++++++++++++++++++++++++++--------------
 2 files changed, 58 insertions(+), 29 deletions(-)

diff --git a/models/index.qmd b/models/index.qmd
index dcbae0d..0bc313b 100644
--- a/models/index.qmd
+++ b/models/index.qmd
@@ -21,26 +21,33 @@ listing:
 number-sections: false
 ---
 
-see [description of model](https://isamplesorg.github.io/metadata/) at https://isamplesorg.github.io/metadata/
+See the [iSamples Metadata Model](https://isamplesorg.github.io/metadata/) for the full schema documentation.
 
-## Taxonomies
+::: {.callout-tip}
+### Vocabulary Source of Truth
+The authoritative versions of iSamples vocabularies are maintained as RDF/SKOS files in the [iSamples GitHub repositories](https://github.com/isamplesorg/). Vocabulary terms are also registered with the [Australian Research Data Commons (ARDC) Research Vocabularies](https://vocabs.ardc.edu.au/).
+:::
 
-One of the foundations for interoperability of iSamples material sample descriptions is the definition of vocabularies for the categorization of sample type. There are three core vocabularies for different aspects of sample type: material sample type, material type, and sampled feature type. Each vocabulary is maintained as an RDF file using the SKOS vocabulary, with hierarchical relationships using [`SKOS:broader`](https://www.w3.org/2009/08/skos-reference/skos.html#broader). In order to be domain agnostic, these core taxonomies cover a small set of top level terms. The taxonomies may be extended as necessary to support more specialized domains by relating additional terms using `SKOS:broader` and `SKOS:narrower`.
+## Taxonomies {.unnumbered}
 
-The iSamples core taxonomies are controlled vocabularies with terms related by [`SKOS:broader`](https://www.w3.org/2009/08/skos-reference/skos.html#broader) and [`SKOS:narrower`](https://www.w3.org/2009/08/skos-reference/skos.html#narrower). In order to be domain agnostic, the core taxonomies cover a small set of top level terms. The taxonomies may be extended as necessary to support more specialized domains by relating additional terms using `SKOS:broader` and `SKOS:narrower`.
+One of the foundations for interoperability of iSamples material sample descriptions is the definition of vocabularies for the categorization of sample type. There are three core vocabularies for different aspects of sample type: material sample type, material type, and sampled feature type. Each vocabulary is maintained as an RDF file using the SKOS vocabulary, with hierarchical relationships using [`SKOS:broader`](https://www.w3.org/2009/08/skos-reference/skos.html#broader). In order to be domain agnostic, these core taxonomies cover a small set of top level terms. The taxonomies may be extended as necessary to support more specialized domains by relating additional terms using `SKOS:broader` and `SKOS:narrower`.
 
 The iSamples taxonomies are used to characterize three fundamental concepts pertaining to physical samples:
 
-1. The "iSamples Materials vocabulary" is a taxonomy of terms used to categorize the composition of a physical sample, that is "What material is the sample composed of?"
-2. The "Sampled Feature Type Vocabulary" is a taxonomy of terms used to indicate what the sample is representative of.
-3. The "iSamples Specimen Type Vocabulary" is a taxonomy of broad categories that classify what type of spcimen the physical sample record represents.
+1. The **Materials Vocabulary** categorizes the composition of a physical sample ("What material is the sample composed of?")
+2. The **Sampled Feature Type Vocabulary** indicates what the sample is representative of
+3. The **Specimen Type Vocabulary** classifies what type of specimen the physical sample record represents
 
-Three taxonomies are currently defined :
+::: {.callout-note collapse="true"}
+## Core Vocabularies
 
-[Material Sample (specimen) Type Vocabulary](generated/vocabularies/material_sample_object_type.html)
-
-[Materials Vocabulary](generated/vocabularies/material_type.html)
-
-[Sampled Feature (context) Type vocabulary](generated/vocabularies/sampled_feature_type.html)
+- [Material Sample (specimen) Type Vocabulary](generated/vocabularies/material_sample_object_type.html)
+- [Materials Vocabulary](generated/vocabularies/material_type.html)
+- [Sampled Feature (context) Type vocabulary](generated/vocabularies/sampled_feature_type.html)
+:::
 
+## Related Pages {.unnumbered}
 
+- [Architecture Overview](../design/index.qmd) — system principles and architecture
+- [Requirements](../design/requirements.html) — 18 use cases and requirements
+- [Metadata Model](https://isamplesorg.github.io/metadata/) — schema and data model documentation
diff --git a/tutorials/index.qmd b/tutorials/index.qmd
index 43d0274..862661c 100644
--- a/tutorials/index.qmd
+++ b/tutorials/index.qmd
@@ -1,30 +1,52 @@
 ---
 title: "Tutorials"
+subtitle: "Learn to explore 6.7 million physical samples from scientific collections worldwide using modern browser-based tools."
+number-sections: false
 ---
 
-Learn to explore **6.7 million physical samples** from scientific collections worldwide using modern browser-based tools.
-
-## Start Here
+## Start Here {.unnumbered}
 
 | Tutorial | What You'll Learn |
 |----------|-------------------|
-| [**Interactive Explorer**](isamples_explorer.qmd) | Search and filter samples with faceted search, view on 3D globe |
-| [**Deep-Dive Analysis**](zenodo_isamples_analysis.qmd) | Comprehensive DuckDB-WASM analysis with Observable JS |
-| [**3D Globe Visualization**](parquet_cesium_isamples_wide.qmd) | Cesium-based visualization of all iSamples data |
-| [**Technical: Narrow vs Wide**](narrow_vs_wide_performance.qmd) | Schema comparison and performance benchmarks |
+| [**Interactive Explorer**](isamples_explorer.qmd) | Search and filter samples with faceted search, view results on a 3D globe |
+| [**Deep-Dive Analysis**](zenodo_isamples_analysis.qmd) | Comprehensive DuckDB-WASM analysis with Observable JS — charts, maps, statistics |
+| [**3D Globe Visualization**](parquet_cesium_isamples_wide.qmd) | Cesium-based progressive visualization with H3 spatial clustering |
+| [**Technical: Narrow vs Wide**](narrow_vs_wide_performance.qmd) | Schema comparison and performance benchmarks for the PQG data formats |
+
+## What's in the Data? {.unnumbered}
+
+| Source | Samples | Focus |
+|--------|---------|-------|
+| **SESAR** | 4.6M | Earth science — rocks, minerals, sediments, soils |
+| **OpenContext** | 1M | Archaeology — artifacts, excavation materials |
+| **GEOME** | 605K | Biology — genomic and tissue specimens |
+| **Smithsonian** | 322K | Natural history — museum collections |
 
-## Data Sources
+## Data Files {.unnumbered}
 
-All tutorials use **geoparquet files** - no server required:
+All data is hosted on [`data.isamples.org`](https://data.isamples.org) with HTTP range request support — DuckDB-WASM only downloads the bytes it needs.
 
-- **iSamples Full Dataset**: ~280 MB wide format, 6.7M samples from SESAR, OpenContext, GEOME, Smithsonian
-- **Available via**: Cloudflare R2 with HTTP range requests
+| File | Size | Description |
+|------|------|-------------|
+| [Wide format](https://data.isamples.org/isamples_202601_wide.parquet) | 278 MB | One row per entity, all sources — primary file for tutorials |
+| [Wide + H3](https://data.isamples.org/isamples_202601_wide_h3.parquet) | 292 MB | Wide format with H3 spatial indices for globe visualizations |
+| [Facet summaries](https://data.isamples.org/isamples_202601_facet_summaries.parquet) | 2 KB | Pre-computed filter counts — loads instantly |
+| [H3 clusters (res4)](https://data.isamples.org/isamples_202601_h3_summary_res4.parquet) | 0.6 MB | Zoomed-out globe view |
 
-## Why Browser-Based?
+## Why Browser-Based? {.unnumbered}
 
 Our approach using **geoparquet + DuckDB-WASM** provides:
 
-- ✅ **Universal access** - No installation, works in any browser
-- ✅ **Fast analysis** - 5-10x faster than downloading full datasets
-- ✅ **Memory efficient** - Analyze 300MB using <100MB browser memory
-- ✅ **Minimal transfer** - Only download the columns/rows you need
+- **Universal access** — No installation, works in Chrome, Firefox, Edge, Safari, and Brave
+- **Fast analysis** — 5-10x faster than downloading full datasets
+- **Memory efficient** — Analyze 300MB datasets using <100MB browser memory
+- **Minimal transfer** — HTTP range requests download only the columns and rows you need (typically <1 MB to start)
+- **Reproducible** — All code is visible and foldable on tutorial pages
+
+## For Developers {.unnumbered}
+
+All tutorial source code is on [GitHub](https://github.com/isamplesorg/isamplesorg.github.io/tree/main/tutorials). Want to build your own analysis? Fork the repo, modify a `.qmd` file, and run `quarto preview`.
+
+- [GitHub repositories](https://github.com/isamplesorg/) — all source code and data pipelines
+- [Zenodo community](https://zenodo.org/communities/isamples) — archived datasets for reproducible research
+- [Query architecture](https://github.com/isamplesorg/isamplesorg.github.io/issues/82) — how the Explorer queries work under the hood