Loading Sample Data

michel-heon edited this page Mar 11, 2026 · 7 revisions

🇫🇷 Cette page est également disponible en français : fr_Loading-Sample-Data

The VIVO sample data provides a fictional university (people, departments, publications, grants, memberships) to explore VIVO features on a freshly deployed instance.

Two loading scenarios are covered below:

  • Without i18n — single language (English), one file to load
  • With i18n — multilingual (English + French Canadian), three files to load in sequence

Load sample data only on a fresh or dedicated instance. Do not mix it with real institutional data.


Prerequisites

  • VIVO is running and accessible at https://<public-ip>/
  • You are logged in as the site administrator (the email and password set at deployment)
  • Navigate to: Site Admin → Add/Remove RDF Data

Scenario 1 — Without i18n (English only)

Load a single N3 file. Labels are embedded in English with no language tags.

Step 1 — Load the sample data

In Add/Remove RDF Data:

| Field | Value |
| --- | --- |
| URL | https://raw.githubusercontent.com/vivo-project/sample-data/master/sample-data.n3 |
| Action | Add instance data |
| File type | N3 |

Click Submit.

Step 2 — Verify

Navigate to the VIVO home page. You should see research areas, people, and organizations populated. Try searching for Chemistry, Roberts, or Pringle.

Capability Map: If the Capability Map at /vis/capabilitymap shows no nodes despite data being present, see Troubleshooting#capability-map-is-empty--jsonp-blocked-by-mime-type.


Scenario 2 — With i18n (English + French Canadian)

The i18n/ directory of the sample data repository distributes labels by locale:

| File | Content |
| --- | --- |
| sample-data.ttl | Base data: structure and relationships, no language-tagged labels |
| sample-data-en_US.ttl | English (en_US) labels for all resources |
| sample-data-fr_CA.ttl | French Canadian (fr_CA) labels for all resources |

Step 1 — Enable multi-locale mode

SSH into the VM and configure runtime.properties for multi-locale:

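# Comment out languages.forceLocale, enable the en_US and fr_CA locales,
# and turn on language filtering in the RDF service.
sudo sed -i \
  -e 's|^languages\.forceLocale|#languages.forceLocale|' \
  -e 's|^#\?languages\.selectableLocales.*|languages.selectableLocales = en_US, fr_CA|' \
  -e 's|^RDFService\.languageFilter.*|RDFService.languageFilter = true|' \
  /data/vivo/home/config/runtime.properties
# Restart Tomcat so VIVO re-reads runtime.properties.
sudo systemctl restart tomcat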

Wait for Tomcat to restart (~30 seconds), then verify:

grep -E "languageFilter|forceLocale|selectableLocales" /data/vivo/home/config/runtime.properties
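The grep output still has to be read by eye. As a minimal sketch, the same check can be scripted, assuming the file uses plain `key = value` lines and the three property names shown above. The function name `check_multilocale` is hypothetical and not part of VIVO.

```shell
#!/usr/bin/env bash
# Hypothetical helper: verify the multi-locale settings in runtime.properties.
# Usage: check_multilocale /data/vivo/home/config/runtime.properties
check_multilocale() {
  local props="$1" status=0 locales

  # languages.forceLocale must not be active (commented out or absent).
  if grep -Eq '^[[:space:]]*languages\.forceLocale' "$props"; then
    echo "FAIL: languages.forceLocale is still active"
    status=1
  fi

  # languages.selectableLocales must be active and list both locales.
  locales=$(grep -E '^[[:space:]]*languages\.selectableLocales' "$props" | tail -n 1 || true)
  case "$locales" in
    *en_US*fr_CA*|*fr_CA*en_US*) ;;
    *) echo "FAIL: selectableLocales does not list en_US and fr_CA"; status=1 ;;
  esac

  # RDFService.languageFilter must be set to true.
  if ! grep -Eq '^[[:space:]]*RDFService\.languageFilter[[:space:]]*=[[:space:]]*true[[:space:]]*$' "$props"; then
    echo "FAIL: RDFService.languageFilter is not true"
    status=1
  fi

  [ "$status" -eq 0 ] && echo "OK: multi-locale settings look correct"
  return "$status"
}
```

Calling `check_multilocale /data/vivo/home/config/runtime.properties` prints one line per failed setting and returns a non-zero status if any setting is wrong.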

Step 2 — Load base data

In Site Admin → Add/Remove RDF Data:

| Field | Value |
| --- | --- |
| URL | https://raw.githubusercontent.com/vivo-project/sample-data/master/i18n/sample-data.ttl |
| Action | Add instance data |
| File type | Turtle |

Click Submit.

Step 3 — Load English labels

| Field | Value |
| --- | --- |
| URL | https://raw.githubusercontent.com/vivo-project/sample-data/master/i18n/sample-data-en_US.ttl |
| Action | Add instance data |
| File type | Turtle |

Click Submit.

Step 4 — Load French Canadian labels

| Field | Value |
| --- | --- |
| URL | https://raw.githubusercontent.com/vivo-project/sample-data/master/i18n/sample-data-fr_CA.ttl |
| Action | Add instance data |
| File type | Turtle |

Click Submit.
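Steps 2 to 4 load three files in a fixed order: base data first, then one label file per locale. The sketch below prints that order and can optionally confirm that each URL is reachable before you start. The URLs are the ones from the steps above. The function names `i18n_load_order` and `check_i18n_urls` are hypothetical, and the reachability check needs network access.

```shell
#!/usr/bin/env bash
# Base URL of the i18n directory, taken from the steps above.
I18N_BASE="https://raw.githubusercontent.com/vivo-project/sample-data/master/i18n"

# Hypothetical helper: print the three files in the order they are loaded.
i18n_load_order() {
  printf '%s\n' \
    "$I18N_BASE/sample-data.ttl" \
    "$I18N_BASE/sample-data-en_US.ttl" \
    "$I18N_BASE/sample-data-fr_CA.ttl"
}

# Hypothetical helper: report whether each URL can be fetched.
# curl -f turns HTTP error responses into a non-zero exit status.
check_i18n_urls() {
  local url
  i18n_load_order | while IFS= read -r url; do
    if curl -fsSL -o /dev/null "$url"; then
      echo "reachable: $url"
    else
      echo "NOT reachable: $url"
    fi
  done
}
```

Running `check_i18n_urls` from any machine with internet access lists each file with its status. It does not load anything into VIVO.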

Step 5 — Verify

Navigate to the VIVO home page. A language selector should appear in the header. Switch between English and Français — labels on organizations, people, and research areas should change accordingly.

Capability Map: If the Capability Map at /vis/capabilitymap shows no nodes despite data being present, see Troubleshooting#capability-map-is-empty--jsonp-blocked-by-mime-type.


Post-load verification

After loading either scenario, confirm that data is fully indexed and the Capability Map is functional.

Check Solr index count

After the base sample data is loaded (without the i18n locale files), the Solr index should contain approximately 436 documents. Run this check on the VM, because the URL targets Solr on localhost.

curl -s "http://localhost:8983/solr/vivocore/select?q=*:*&rows=0" | python3 -c \
  "import sys, json; d=json.load(sys.stdin); print('Docs indexed:', d['response']['numFound'])"

Expected output after sample data load:

Docs indexed: 436

A lower count (a clean boot typically reports about 394) means that Solr indexing is still in progress or that the load did not complete successfully.
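The count check can also be written without python3 and turned into a pass/fail result. This is a minimal sketch, assuming the Solr JSON response contains a single `"numFound"` field as in the output above. The helper names are hypothetical, and the default threshold of 436 is the approximate figure quoted in this section, not an exact guarantee.

```shell
#!/usr/bin/env bash
# Extract numFound from a Solr JSON response read on stdin.
solr_num_found() {
  sed -n 's/.*"numFound"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Hypothetical helper: compare a document count with the expected total.
# The default of 436 is the approximate figure quoted in this section.
index_status() {
  local count="$1" expected="${2:-436}"
  if [ "$count" -ge "$expected" ]; then
    echo "complete ($count docs)"
  else
    echo "incomplete ($count of $expected docs)"
  fi
}

# Example (run on the VM):
#   count=$(curl -s "http://localhost:8983/solr/vivocore/select?q=*:*&rows=0" | solr_num_found)
#   index_status "$count"
```

Because the documented total is approximate, a count slightly below the threshold after indexing has finished may still be acceptable. Adjust the second argument of `index_status` if needed.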

Check Capability Map data endpoint

Verify the visualization data API returns results before testing the browser UI:

# List all indexed research area concepts
curl -sk "https://<fqdn>/visualizationAjax?vis=capabilitymap&query=all&data=concepts" | head -c 500

# Retrieve researchers linked to a specific concept
curl -sk "https://<fqdn>/visualizationAjax?vis=capabilitymap&query=Rhetoric&callback=ipretResults&noCacheIE=1" | head -c 500

Expected: A JavaScript JSONP callback containing JSON data (ipretResults({...})).

If the response is HTML (a VIVO error page), the data is missing. If the browser blocks the request, see Troubleshooting#capability-map-is-empty--jsonp-blocked-by-mime-type.
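The two outcomes described above (a JSONP callback or an HTML error page) can be told apart from the first bytes of the response. The sketch below does this, assuming the callback name `ipretResults` used in the example request. The function name `classify_capmap_response` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify a response body read on stdin.
# Prints "jsonp", "html", or "unknown".
# Optional argument: the JSONP callback name (defaults to ipretResults).
classify_capmap_response() {
  local callback="${1:-ipretResults}" head_bytes
  # The first bytes are enough to tell the cases apart.
  head_bytes=$(head -c 200 | tr -d '\r\n' | sed 's/^[[:space:]]*//')
  case "$head_bytes" in
    "$callback("*) echo "jsonp" ;;
    "<!DOCTYPE"*|"<!doctype"*|"<html"*|"<HTML"*) echo "html" ;;
    *) echo "unknown" ;;
  esac
}

# Example:
#   curl -sk "https://<fqdn>/visualizationAjax?vis=capabilitymap&query=Rhetoric&callback=ipretResults&noCacheIE=1" \
#     | classify_capmap_response
```

A result of `unknown` is not necessarily an error. A request sent without a `callback` parameter, for instance, may return a body that matches neither pattern.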

Check Content-Type header

Browsers enforce strict MIME checking. Confirm nginx returns the correct type:

curl -sk -I "https://<fqdn>/visualizationAjax?vis=capabilitymap&query=Rhetoric&callback=ipretResults&noCacheIE=1" \
  | grep -i content-type

Expected:

content-type: application/javascript; charset=UTF-8

If it returns text/html, apply the nginx fix described in Troubleshooting#capability-map-is-empty--jsonp-blocked-by-mime-type.
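To use the header check in a script rather than reading it by eye, the media type can be extracted and compared. This is a minimal sketch with hypothetical helper names. It ignores parameters such as `charset` and, if several header blocks are present, uses the last `Content-Type` line.

```shell
#!/usr/bin/env bash
# Extract the media type (without parameters) from HTTP headers read on stdin.
content_type_of() {
  tr -d '\r' \
    | grep -i '^content-type:' \
    | tail -n 1 \
    | sed -e 's/^[^:]*:[[:space:]]*//' -e 's/;.*$//' -e 's/[[:space:]]*$//' \
    | tr '[:upper:]' '[:lower:]'
}

# Hypothetical helper: succeed only for the JavaScript type expected above.
is_jsonp_type() {
  [ "$(content_type_of)" = "application/javascript" ]
}

# Example:
#   curl -sk -I "https://<fqdn>/visualizationAjax?vis=capabilitymap&query=Rhetoric&callback=ipretResults&noCacheIE=1" \
#     | is_jsonp_type && echo "Content-Type OK" || echo "Content-Type needs the nginx fix"
```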


Namespace note

The VIVO sample data from vivo-project/sample-data uses the default namespace http://vivo.mydomain.edu/individual/. When your VIVO instance is configured with a custom namespace (e.g. http://vivo.myinstitution.edu/individual/), the Capability Map and search features will work correctly, but direct URI navigation to individual profile pages may not resolve.

This is expected when the upstream sample data is used as-is on a customized deployment. The data serves its purpose for exploring VIVO features; it is not intended for production use.
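To find out whether this note applies to a given instance, the configured namespace can be compared with the one used by the sample data. The sketch below assumes that the namespace is set through the `Vitro.defaultNamespace` property in runtime.properties, the file edited for multi-locale mode. The helper names are hypothetical.

```shell
#!/usr/bin/env bash
# Namespace used by the sample data, as stated above.
SAMPLE_NS="http://vivo.mydomain.edu/individual/"

# Print the active Vitro.defaultNamespace value from a properties file.
configured_namespace() {
  grep -E '^[[:space:]]*Vitro\.defaultNamespace[[:space:]]*=' "$1" \
    | tail -n 1 \
    | sed -e 's/^[^=]*=[[:space:]]*//' -e 's/[[:space:]]*$//'
}

# Hypothetical helper: report whether the instance namespace differs
# from the sample data namespace.
namespace_note_applies() {
  local ns
  ns=$(configured_namespace "$1")
  if [ "$ns" = "$SAMPLE_NS" ]; then
    echo "match: instance uses the sample data namespace"
  else
    echo "mismatch: instance uses '$ns'; sample profile URIs may not resolve"
  fi
}

# Example:
#   namespace_note_applies /data/vivo/home/config/runtime.properties
```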


Resetting to a clean state

To remove all sample data and return to an empty VIVO instance:

  1. SSH into the VM
  2. Stop all services:
    sudo systemctl stop tomcat solr
  3. Clear the TDB1 triplestore and Solr index:
    sudo rm -rf /data/vivo/home/tdbContentModels
    sudo rm -rf /data/vivo/home/tdb
    sudo rm -rf /data/solr/data/vivocore/data
  4. Restart services:
    sudo systemctl start solr
    sleep 15
    sudo systemctl start tomcat

VIVO will reload its built-in ontology data on first start (~2–3 minutes). When complete, the home page will show an empty instance.
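Rather than waiting a fixed time, the restart can be polled. The sketch below is a generic retry loop. The function name `wait_until` is hypothetical, and the example assumes that the home page returns an HTTP success status only once startup has finished, which this page does not confirm.

```shell
#!/usr/bin/env bash
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# Usage: wait_until <max_attempts> <delay_seconds> <command> [args...]
wait_until() {
  local max="$1" delay="$2" attempt=1
  shift 2
  while [ "$attempt" -le "$max" ]; do
    if "$@" >/dev/null 2>&1; then
      echo "ready after $attempt attempt(s)"
      return 0
    fi
    # Sleep between attempts, but not after the last one.
    [ "$attempt" -lt "$max" ] && sleep "$delay"
    attempt=$((attempt + 1))
  done
  echo "not ready after $max attempt(s)"
  return 1
}

# Example: poll the home page every 10 seconds for up to 5 minutes.
#   wait_until 30 10 curl -fsk -o /dev/null "https://<public-ip>/"
```

The same helper can replace the fixed `sleep 15` between starting Solr and Tomcat, for example by polling the Solr select URL used in the index check.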

