updated docco

CSIRO-enviro-informatics · Apr 13, 2019 · 1487cc4
1 parent 8759a8b
commit 1487cc4

Showing 3 changed files with 50 additions and 33 deletions.
diff --git a/DATA_SOURCES.md b/DATA_SOURCES.md
@@ -3,43 +3,44 @@
 VocPrez is designed to be able to read vocabulary information from a number of sources. Currently, the following sources have been configured:
 
 * [Research Vocabularies Australia](http://vocabs.ands.org.au) (RVA)
-* local RDF files
 * [VocBench3](http://vocbench.uniroma2.it/)
+* generic SPARQL endpoint
+* local RDF files
 
-VocPrez requires a static index of vocabularies you want it to display to be created in your installation's config file stored at `_config/__init__.py`. An example list looks like this: 
+VocPrez generates a cached index of the vocabularies you want it to display. It gets the vocabulary information from the `VOCAB_SOURCES` variable in the [_config/__init__.py](config/) file you set up. An example list of two sources, RVA and SPARQL, is given in the [template config file](_config/template.py) and copied below.
 
 ```
-VOCABS = {
-    'rva-50': {
-        'source': VocabSource.RVA,
-        'title': 'Geologic Unit Type'
+VOCAB_SOURCES = {
+    # an example of a SPARQL endpoint - here supplied by an instance of GraphDB
+    'gsq-graphdb': {
+        'source': VocabSource.SPARQL,
+        'sparql_endpoint': 'http://graphdb.gsq.digital:7200/repositories/GSQ_Vocabularies_core'
     },
-    'rva-52': {
+    # an example of querying the ARDC RVA vocab system (https://vocabs.ands.org.au)
+    'rva': {
         'source': VocabSource.RVA,
-        'title': 'Contact Type'
-    },
-    
-    ...
-
-    'tenement_type': {
-        'source': VocabSource.FILE,
-        'title': 'Tenement Type'
-    },
-    'Test_Rock_Types_Vocabulary': {
-        'source': VocabSource.VOCBENCH,
-        'title': 'Test Rock Types'
+        'api_endpoint': 'https://vocabs.ands.org.au/registry/api/resource/vocabularies/{}?includeAccessPoints=true',
+        'vocabs': [
+            {
+                'ardc_id': 50,
+                'uri': 'http://resource.geosciml.org/classifierscheme/cgi/2016.01/geologicunittype',
+            },
+            {
+                'ardc_id': 52,
+                'uri': 'http://resource.geosciml.org/classifierscheme/cgi/2016.01/contacttype',
+            },
+            {
+                'ardc_id': 57,
+                'uri': 'http://resource.geosciml.org/classifierscheme/cgi/2016.01/stratigraphicrank',
+            }
+        ]
     }
 }
 ```
 
-Here you see vocabularies with IDs 'rva-50', 'rva-52', 'tenement_type' & 'Test_Rock_Types_Vocabulary' with both titles and sources for them given. The first two are drawn from RVA, the 3rd, a file and the last from a VocBench installation. 
-
-The controlled list of source types (`VocabSource.FILE`, `VocabSource.VOCBENCH` etc.) are handled by dedicated *source* Python code classes that present a standard set of methods for each type. The files currently implemented, all in the `data/` folder, are:
-
-* `RVA.py` - RVA
-* `FILE.py` - FILE
-* `VOCBENCH.py` - VOCBENCH
+Here you see the first source is a SPARQL endpoint. All that's needed here, as specified in the `data/source/SPARQL.py` file, is a source type (`VocabSource.SPARQL`) and a SPARQL endpoint (`sparql_endpoint`).
 
-Additional source files for other vocabulary data sources can be made by creating new `source_*.py` files inheriting from `source.py`.
+Next is the "RVA" source, for which an API endpoint is needed along with a list of vocab IDs and the vocabs' URIs. These are needed by `data/source/RVA.py` to get all the information it needs about vocabularies from RVA.
 
-The specific requirements for each source are contained within their particular files but, summarising the requirements for the sources already catered for, Vocabularies from RVA need to have endpoints specified in the vocab source file `data/RVA.py` so VocPrez knows where to get info from. RDF files in `data/` will automatically be picked up by VocPrez so don;t need any more config than a title, provided the ID matched the file name, minus file extension. Vocabs from VocBench require that a `VB_ENDPOINT`, `VB_USER` & `VB_PASSWORD` are all given in the config file.
+### New Sources
+Additional source files for other vocabulary data sources can be made by creating new `source_*.py` files inheriting from `source.py`. You will need to supply a static `collect()` method that gets all the vocabs and their metadata from the source for the cached vocab index, and either make do with or overload the functions in Source.py (such as `get_vocabulary()`) to supply all the other required forms of access to your source's vocabularies.
diff --git a/README.md b/README.md
@@ -27,8 +27,8 @@ Standard templates for `ConceptScheme`, `Collection`, `Concept` & `Register` are
 * follow the instructions as per pyLDAPI (see [its documentation](https://pyldapi.readthedocs.io))
 * ensure your config file is correct
     * you need to copy the file `_config/template.py` to `_config/__init__.py` and configure variables within it. See the `template.py` file for examples
-* configure your data source
-    * you will need to supply this tool with SKOS data from any sort of data source: a triplestore, a relational database or even a test file
+* configure your data source(s)
+    * you will need to supply this tool with SKOS data from any sort of data source: a triplestore, a relational database or even a local file
     * see the [DATA_SOURCES.md](https://github.com/CSIRO-enviro-informatics/VocPrez/blob/master/DATA_SOURCES.md) file for examples
 
 

diff --git a/_config/template.py b/_config/template.py
@@ -25,12 +25,28 @@ class VocabSource:
 
 
 VOCAB_SOURCES = {
-    'graphdb': {
+    # an example of a SPARQL endpoint - here supplied by an instance of GraphDB
+    'gsq-graphdb': {
         'source': VocabSource.SPARQL,
-        'endpoint': 'http://graphdb.gsq.digital:7200/repositories/GSQ_Vocabularies_core'
+        'sparql_endpoint': 'http://graphdb.gsq.digital:7200/repositories/GSQ_Vocabularies_core'
     },
+    # an example of querying the ARDC RVA vocab system (https://vocabs.ands.org.au)
     'rva': {
         'source': VocabSource.RVA,
-        'vocab_ids': [50, 52, 57]
+        'api_endpoint': 'https://vocabs.ands.org.au/registry/api/resource/vocabularies/{}?includeAccessPoints=true',
+        'vocabs': [
+            {
+                'ardc_id': 50,
+                'uri': 'http://resource.geosciml.org/classifierscheme/cgi/2016.01/geologicunittype',
+            },
+            {
+                'ardc_id': 52,
+                'uri': 'http://resource.geosciml.org/classifierscheme/cgi/2016.01/contacttype',
+            },
+            {
+                'ardc_id': 57,
+                'uri': 'http://resource.geosciml.org/classifierscheme/cgi/2016.01/stratigraphicrank',
+            }
+        ]
     }
 }
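
Note on the new config shape: this commit renames the SPARQL key from `endpoint` to `sparql_endpoint` and replaces the RVA `vocab_ids` list with an `api_endpoint` plus a `vocabs` list, so an existing `_config/__init__.py` copied from the old template will be missing keys. Below is a minimal sketch of a check for that, not code from this repository. `check_sources`, `rva_request_urls` and `REQUIRED_KEYS` are hypothetical names, the `VocabSource` stand-in uses assumed string values, and filling the `{}` placeholder in `api_endpoint` with each `ardc_id` is an assumption about how `data/source/RVA.py` uses it.

```python
class VocabSource:
    # Stand-in for the constants class in _config/template.py (values assumed)
    SPARQL = "SPARQL"
    RVA = "RVA"


# Keys each source type needs, as described in DATA_SOURCES.md in this commit
REQUIRED_KEYS = {
    VocabSource.SPARQL: {"source", "sparql_endpoint"},
    VocabSource.RVA: {"source", "api_endpoint", "vocabs"},
}


def check_sources(vocab_sources):
    """Return a list of problems found; an empty list means nothing is missing."""
    problems = []
    for name, details in vocab_sources.items():
        source_type = details.get("source")
        required = REQUIRED_KEYS.get(source_type)
        if required is None:
            problems.append(f"{name}: unknown source type {source_type!r}")
            continue
        missing = sorted(required - set(details))
        if missing:
            problems.append(f"{name}: missing {', '.join(missing)}")
        if source_type == VocabSource.RVA:
            # Each RVA vocab entry needs both its ARDC ID and its URI
            for i, vocab in enumerate(details.get("vocabs", [])):
                gap = sorted({"ardc_id", "uri"} - set(vocab))
                if gap:
                    problems.append(f"{name}: vocabs[{i}] missing {', '.join(gap)}")
    return problems


def rva_request_urls(details):
    """Fill the '{}' placeholder in api_endpoint with each ardc_id (assumed usage)."""
    return [details["api_endpoint"].format(v["ardc_id"]) for v in details["vocabs"]]
```

Run against the pre-commit template entry (`'graphdb'` with only `'endpoint'`), `check_sources` reports `graphdb: missing sparql_endpoint`; run against the `VOCAB_SOURCES` dict added in this commit, it returns an empty list.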