# Documents and database

This demo introduces how to use **ndi_document** objects and the **ndi_database** to store and access data or analyses in an experiment.

**Scenario**: An experimenter analyzes some data and wants to store the values of parameters of the animal and the results of some computation in the database for later retrieval.

In **NDI**, an **ndi_document** is a class for platform-independent storage and retrieval of database data. Every element in the **ndi_database** is a member of class **ndi_document**.

The form of each type of **ndi_document** object is specified in a JSON file in `ndi_common/database_documents` or one of its subdirectories.

Let's create the most basic type of **ndi_document**.

In [1]:
mydoc = ndi_document()


mydoc = 

  ndi_document with properties:

    document_properties: [1x1 struct]



We just made a "vanilla" ndi_document. Let's examine the fields of this document:

In [3]:
mydoc.document_properties
mydoc.document_properties.ndi_document


ans = 

  struct with fields:

    document_class: [1x1 struct]
      ndi_document: [1x1 struct]


ans = 

  struct with fields:

       experiment_id: ''
                  id: '4126841bbce38274_3fea1237688aba7b'
                name: ''
                type: ''
           datestamp: '2020-01-16T01:51:15.937Z'
    database_version: 1



The fields of `document_properties.ndi_document` describe several features. The document has an `id` that is unique to every entry in the database. The `experiment_id` is also stored in each ndi_document so that the document can be placed in a database that combines multiple experiments. There is a `datestamp` in UTC, and there are fields for a user-defined `name` and `type`.

In [4]:
mydoc.document_properties.document_class


ans = 

  struct with fields:

       definition: '$NDIDOCUMENTPATH/ndi_document.json'
       validation: '$NDISCHEMAPATH/ndi_document_schema.json'
       class_name: 'ndi_document'
    class_version: 1
     superclasses: []



The `document_properties.document_class` fields describe the definition of the document. This particular document class has no superclasses; it is the most basic class. We will see how these superclasses work later.

The fields of an ndi_document are in a JSON file in `ndi_common/database_documents`. Let's examine the ndi_document.json file:

In [5]:
type ndi_document.json


{
	"document_class": {
		"definition":						"$NDIDOCUMENTPATH\/ndi_document.json",
		"validation":						"$NDISCHEMAPATH\/ndi_document_schema.json",
		"class_name":						"ndi_document",
		"class_version":					1,
		"superclasses": [ ]
	},
	"ndi_document": {
		"experiment_id":				"",
		"id":                           "",
		"name":							"",
		"type":							"",
		"datestamp":						"2018-12-05T18:36:47.241Z",
		"database_version":					1
	}
}



At present, each ndi_document has a `document_class` field that specifies the `class_name`, how the class was defined, how it should be validated (validation is not yet implemented or used, and may change), a class version, and so on. You can also see the fields of `ndi_document`.
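To make the link between the JSON definition and the struct fields concrete, here is a minimal sketch that decodes a shortened copy of the definition with MATLAB's built-in `jsondecode`. The JSON text is inlined so the example does not depend on the NDI path; how NDI itself loads these files is not shown in this demo, so treat this only as an illustration of the file format.

```matlab
% Minimal sketch: decode a shortened, inlined copy of the JSON definition
% and inspect the resulting struct. This is not NDI's own loading code.
json_text = ['{ "document_class": { ' ...
    '"definition": "$NDIDOCUMENTPATH\/ndi_document.json", ' ...
    '"class_name": "ndi_document", ' ...
    '"class_version": 1, ' ...
    '"superclasses": [ ] }, ' ...
    '"ndi_document": { "experiment_id": "", "id": "", "name": "", ' ...
    '"type": "", "database_version": 1 } }'];

definition = jsondecode(json_text);   % JSON objects become MATLAB structs

disp(fieldnames(definition));                           % top-level sections
disp(definition.document_class.class_name);             % the class name
disp(isempty(definition.document_class.superclasses));  % empty list of superclasses
```

Note that the escaped `\/` in the JSON text decodes to a plain `/`, which is why the path appears without backslashes in the MATLAB output shown earlier.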

Now let's open an experiment and make a different type of document.

In [8]:
ndi_globals;
dirname = [ndiexampleexperpath filesep 'exp1_eg']
E = ndi_experiment_dir(dirname)


dirname =

    '/Users/vanhoosr/Documents/MATLAB/tools/NDI-matlab/ndi_common/example_experiments/exp1_eg'


E = 

  ndi_experiment_dir with properties:

                path: '/Users/vanhoosr/Documents/MATLAB/tools/NDI-matlab/ndi_common/example_experiments/exp1_eg'
           reference: 'exp1'
    unique_reference: '412682d5b11e4ba9_3fe62b9185af29f4'
           daqsystem: [1x1 ndi_dbleaf_branch]
           syncgraph: [1x1 ndi_syncgraph]
               cache: [1x1 ndi_cache]



Now let's call the newdocument() method of ndi_experiment to build a new document: 


In [11]:
doc = E.newdocument('ndi_document_subjectmeasurement',...
        'ndi_document.name','Animal statistics',...
        'subject.id','vhlab12345', ...
        'subject.species','Mus musculus',...
        'subjectmeasurement.measurement','age',...
        'subjectmeasurement.value',30,...
        'subjectmeasurement.datestamp','2017-03-17T19:53:57.066Z'...
        )


doc = 

  ndi_document with properties:

    document_properties: [1x1 struct]



Let's examine this document:

In [14]:
doc.document_properties
doc.document_properties.ndi_document
doc.document_properties.subjectmeasurement
doc.document_properties.subject
doc.document_properties.document_class


ans = 

  struct with fields:

        document_class: [1x1 struct]
          ndi_document: [1x1 struct]
               subject: [1x1 struct]
    subjectmeasurement: [1x1 struct]


ans = 

  struct with fields:

       experiment_id: 'exp1_412682d5b11e4ba9_3fe62b9185af29f4'
                  id: '4126841bc038507d_3feea3e55ffc605b'
                name: 'Animal statistics'
                type: ''
           datestamp: '2020-01-16T02:00:37.124Z'
    database_version: 1


ans = 

  struct with fields:

    measurement: 'age'
          value: 30
      datestamp: '2017-03-17T19:53:57.066Z'


ans = 

  struct with fields:

    reference: ''
      species: 'Mus musculus'
       strain: ''
      variant: ''
           id: 'vhlab12345'


ans = 

  struct with fields:

       definition: '$NDIDOCUMENTPATH/ndi_document_subjectmeasurement.json'
       validation: '$NDISCHEMAPATH/ndi_document_subjectmeasurement_schema.json'
       class_name: 'ndi_document_subjectmeasurement'
    class_version: 1

In [15]:
type ndi_document_subjectmeasurement.json


{
	"document_class": {
		"definition":						"$NDIDOCUMENTPATH\/ndi_document_subjectmeasurement.json",
		"validation":						"$NDISCHEMAPATH\/ndi_document_subjectmeasurement_schema.json",
		"class_name":						"ndi_document_subjectmeasurement",
		"class_version":					1,
		"superclasses": [
			{ "definition":					"$NDIDOCUMENTPATH\/ndi_document.json" },
			{ "definition":					"$NDIDOCUMENTPATH\/ndi_document_subject.json" }
		]
        },
	"subjectmeasurement": {
		"measurement":						"",
		"value":						"",
		"datestamp":						""
	}
}



If we examine the JSON definition, we can see that the class `ndi_document_subjectmeasurement` has `ndi_document` and `ndi_document_subject` as superclasses. The created object has all of the fields of the class itself (`subjectmeasurement`) and of its superclasses (`ndi_document` and `subject`), as we saw above.
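One way to picture how superclass fields end up in the final document is as a merge of the sections that each definition contributes. The sketch below is an assumption about the general idea, not NDI's actual implementation; the three structs are hypothetical, simplified stand-ins that reuse field names from the output above.

```matlab
% Hypothetical, simplified definitions: each struct holds the section that
% one JSON definition file contributes (field names copied from the demo).
base_def    = struct('ndi_document', struct('id','', 'name','', 'type',''));
subject_def = struct('subject', struct('species','', 'strain','', 'id',''));
own_def     = struct('subjectmeasurement', ...
    struct('measurement','', 'value','', 'datestamp',''));

% Merge the superclass sections first, then the class's own section.
merged = struct();
parts = {base_def, subject_def, own_def};
for i = 1:numel(parts)
    names = fieldnames(parts{i});
    for j = 1:numel(names)
        merged.(names{j}) = parts{i}.(names{j});
    end
end

disp(fieldnames(merged));  % ndi_document, subject, subjectmeasurement
```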

We can now add our ndi_document object to the experiment's database with `ndi_experiment/database_add()`:

In [16]:
E.database_add(doc)


ans = 

  ndi_experiment_dir with properties:

                path: '/Users/vanhoosr/Documents/MATLAB/tools/NDI-matlab/ndi_common/example_experiments/exp1_eg'
           reference: 'exp1'
    unique_reference: '412682d5b11e4ba9_3fe62b9185af29f4'
           daqsystem: [1x1 ndi_dbleaf_branch]
           syncgraph: [1x1 ndi_syncgraph]
               cache: [1x1 ndi_cache]



We can search for this document using a variety of methods. The simplest (but deprecated) is to use a `name`, `regexp` pair that matches a parameter name against a regular expression. **ndi_document** objects are always returned from `ndi_experiment/database_search()` as a cell array that can be accessed with curly braces:

In [19]:
finddoc = E.database_search({'subject.id','vhlab12345'})
finddoc{1}
finddoc{1}.document_properties
finddoc{1}.document_properties.subject


finddoc =

  1x1 cell array

    {1x1 ndi_document}


ans = 

  ndi_document with properties:

    document_properties: [1x1 struct]


ans = 

  struct with fields:

        document_class: [1x1 struct]
          ndi_document: [1x1 struct]
               subject: [1x1 struct]
    subjectmeasurement: [1x1 struct]


ans = 

  struct with fields:

    reference: ''
      species: 'Mus musculus'
       strain: ''
      variant: ''
           id: 'vhlab12345'
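
The `name`, `regexp` pair used in this search can be pictured as a dotted field path plus a regular expression that is tested against the value at that path. The following sketch uses only built-in MATLAB functions on a plain struct; it is an illustration of the matching idea under that assumption, not the code that `database_search()` runs.

```matlab
% Stand-in for document_properties, using values from the demo above.
props = struct('subject', struct('species','Mus musculus', 'id','vhlab12345'));

% A {name, regexp} pair: a dotted field path and a regular expression.
pair = {'subject.id', 'vhlab12345'};

path_parts = strsplit(pair{1}, '.');       % {'subject','id'}
value = getfield(props, path_parts{:});    % same as props.subject.id
is_match = ~isempty(regexp(value, pair{2}, 'once'));

disp(is_match);
```

Because the second element is a regular expression, a pattern such as `'^vhlab'` would match this document as well.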



The better way to search the database is to use **ndi_query** objects. These objects allow for a variety of match definitions. For example:

In [22]:
q_exp = ndi_query('ndi_document.experiment_id','exact_string',E.id(),''); % search for this experiment
q_species = ndi_query('subject.species','exact_string','Mus musculus','');
q_doctype = ndi_query('','isa','ndi_document_subjectmeasurement','');
finddoc = E.database_search(q_exp & q_species & q_doctype)
finddoc{1}
finddoc{1}.document_properties
finddoc{1}.document_properties.subject


finddoc =

  1x1 cell array

    {1x1 ndi_document}


ans = 

  ndi_document with properties:

    document_properties: [1x1 struct]


ans = 

  struct with fields:

        document_class: [1x1 struct]
          ndi_document: [1x1 struct]
               subject: [1x1 struct]
    subjectmeasurement: [1x1 struct]


ans = 

  struct with fields:

    reference: ''
      species: 'Mus musculus'
       strain: ''
      variant: ''
           id: 'vhlab12345'



In [23]:
help ndi_query % see more methods

  NDI_QUERY - create a query object for searching the database
 
  Creates an NDI_QUERY object, which has a single property
  SEARCH that is a structure array of search structures
  appropriate for use with FIELDSEARCH.
 
  Tha is, SEARCH has the fields:
  Field:                   | Description
  ---------------------------------------------------------------------------
  field                      | A character string of the field of A to examine
  operation                  | The operation to perform. This operation determines 
                             |   values of fields 'param1' and 'param2'.
      |----------------------|
      |   'regexp'             - are there any regular expression matches between 
      |                          the field value and 'param1'?
      |   'exact_string'       - is the field value an exact string match for 'param1'?
      |   'contains_string'    - is the field value a char array that contains 'param1'?
      |   'exact_number'       - is 
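
The help listing above describes each search condition by a field, an operation, and parameters. As a rough sketch of how such conditions could be evaluated and combined with AND (as `&` does for **ndi_query** objects), the code below checks two conditions against a plain struct. The struct layout and the loop are assumptions made for illustration; only the operation names `exact_string` and `contains_string` come from the help text.

```matlab
% Stand-in for document_properties, with values from the demo above.
props = struct('subject', struct('species','Mus musculus', 'id','vhlab12345'));

% Two search conditions to be combined with AND. The layout mirrors the
% field / operation / param1 columns of the help text but is an assumption.
search = struct( ...
    'field',     {'subject.species', 'subject.id'}, ...
    'operation', {'exact_string',    'contains_string'}, ...
    'param1',    {'Mus musculus',    'vhlab'});

matches_all = true;
for i = 1:numel(search)
    parts = strsplit(search(i).field, '.');
    value = getfield(props, parts{:});          % value at the dotted path
    switch search(i).operation
        case 'exact_string'
            ok = strcmp(value, search(i).param1);
        case 'contains_string'
            ok = ~isempty(strfind(value, search(i).param1));
        otherwise
            error('Operation not covered by this sketch: %s', search(i).operation);
    end
    matches_all = matches_all && ok;            % AND: every condition must hold
end

disp(matches_all);
```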

Every **ndi_document** also contains a binary "fork" that can be used to store and retrieve binary data. One can retrieve the binary object with `ndi_experiment/database_openbinarydoc`. While the binary portion is open, it is locked so that other processes cannot write to it. Therefore, it is necessary to close the binary doc when one is finished.

In [24]:
binarydoc = E.database_openbinarydoc(doc);
disp(['Storing ' mat2str(0:9) '...'])
binarydoc.fwrite(char([0:9]),'char');
binarydoc = E.database_closebinarydoc(binarydoc);


Storing [0 1 2 3 4 5 6 7 8 9]...


Let's search for the doc and read its contents:

In [25]:
finddoc = E.database_search(q_exp & q_species & q_doctype);
findbinary = E.database_openbinarydoc(finddoc{1});
disp('About to read stored data: ');
data = double(findbinary.fread(10,'char'))',
findbinary = E.database_closebinarydoc(findbinary);


About to read stored data: 

data =

     0     1     2     3     4     5     6     7     8     9
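
The write, close, reopen, read, close pattern used for the binary portion is the same pattern one uses with an ordinary file. The sketch below reproduces the round trip with MATLAB's built-in file functions and a temporary file standing in for the document's binary portion; it does not involve NDI or its locking.

```matlab
% Sketch with a temporary file standing in for the document's binary portion.
fname = [tempname '.bin'];

fid = fopen(fname, 'w');                  % open for writing
fwrite(fid, char(0:9), 'char');           % store ten small values as characters
fclose(fid);                              % close so the data can be read back

fid = fopen(fname, 'r');                  % reopen for reading
data = double(fread(fid, 10, 'char'))';   % read ten values as a row vector
fclose(fid);
delete(fname);                            % clean up the temporary file

disp(data);
```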



Now let's reset the demo by removing our document.

In [27]:
doc = E.database_search({'subject.id','vhlab12345'});
E.database_rm(doc{1}.id()); % E.database_rm(doc) works, too

We've now covered the basics of documents and the database, with one exception. Many of our documents **depend on** the content of other documents in some important way, such that a given document isn't interpretable on its own. For example, let's look at the `stimulus_response.json` document type:

In [28]:
type stimulus_response.json


{
	"document_class": {
		"definition":						"$NDIDOCUMENTPATH\/stimulus\/stimulus_response.json",
		"validation":						"$NDISCHEMAPATH\/stimulus\/stimulus_response_schema.json",
		"class_name":						"ndi_document_stimulus_stimulus_response",
		"class_version":					1,
		"superclasses": [
			{ "definition":                                 "$NDIDOCUMENTPATH\/ndi_document.json"}
		]
	},
	"depends_on": [
		{	"name": "thing_id",
			"value": []
		},
		{	"name": "stimulator_id",
			"value": []
		}, 
		{	"name": "stimulus_presentation_id",
			"value": []
		},
		{	"name": "stimulus_control_id",
			"value": []
		}
	],
	"stimulus_response": {
		"stimulator_epochid":						[],
		"thing_epochid":						[]
	}
}



As you can see, this document type depends on a number of other documents, including the id of the stimulator, a document that describes the stimulus_presentations, a document that describes which stimuli serve as controls for each stimulus presentation, and the thing (the item that is giving a response to the stimulus). 

We won't go into detail here, but we can create search queries to find a document that depends on another document:

In [29]:
q_depends = ndi_query('','depends_on','thing_id','12345')


q_depends = 

  ndi_query with properties:

    searchstructure: [1x1 struct]
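
Conceptually, a `depends_on` query like the one above asks whether a document's `depends_on` list contains an entry with a given name and value. The sketch below checks this on a plain struct array; the second id value is made up for illustration, and the code is an assumption about the idea rather than how **ndi_query** is implemented.

```matlab
% Stand-in for the depends_on section of a stimulus_response document.
% The id values are made up for illustration.
depends_on = struct( ...
    'name',  {'thing_id', 'stimulator_id'}, ...
    'value', {'12345',    '67890'});

% Does this document depend on the thing with id '12345'?
name_hit  = strcmp({depends_on.name},  'thing_id');
value_hit = strcmp({depends_on.value}, '12345');
depends_on_thing = any(name_hit & value_hit);

disp(depends_on_thing);
```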



We also have special functions that can search across the whole database for the dependencies of a given object, or find objects that have missing dependencies:

In [31]:
help ndi_findalldependencies

  NDI_FINDALLDEPENDENCIES- find documents that have dependencies on documents that do not exist
 
  [D] = NDI_FINDALLDEPENDENCIES(E, VISITED, DOC1, DOC2, ...)
 
  Searches the database of experiment E and returns all documents that have a 
  dependency ('depends_on') field for which the 'value' field corresponds to the
  id of DOC1 or DOC2, etc. If any DOCS do not need to be searched, provide them in VISITED.
  Otherwise, provide empty for VISITED.
 
  D is always a cell array of NDI_DOCUMENTS (perhaps empty, {}).



In [32]:
help ndi_finddocs_missing_dependencies

  NDI_FINDDOCS_MISSING_DEPENDENCIES - find documents that have dependencies on documents that do not exist
 
  D = NDI_FINDDOCS_MISSING_DEPENDENCIES(E)
 
  Searches the database of experiment E and returns all documents that have a 
  dependency ('depends_on') field for which the 'value' field does not 
  correspond to an existing document.
 
  The following form:
 
  D = NDI_FINDDOCS_MISSING_DEPENDENCIES(E, NAME1, NAME2, ...)
   
  works similarly except that it only examines variables with depends_on
  fields with names NAME1, NAME2, etc.



Right now, the function `ndi_experiment/database_rm()` will remove all of the dependencies of any ndi_document that is deleted from the database. In the future, we might want to move this behavior into the database itself and have it require permissions. The current database runs as though everyone has 'root' permissions, but we plan to add permissions later.
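
As a rough illustration of cascading removal, the sketch below uses a toy in-memory list of documents. It assumes that "dependencies" here means the documents whose `depends_on` value points at the deleted document, directly or through a chain; that reading, the single `depends_on` field per entry, and all ids are assumptions made for this example, not a description of what `database_rm()` actually does.

```matlab
% Toy in-memory "database": each entry has an id and the id it depends on
% ('' means no dependency). All ids are made up for illustration.
docs = struct( ...
    'id',         {'A', 'B', 'C', 'D'}, ...
    'depends_on', {'',  'A', 'B', ''});

to_remove = {'A'};          % start with the document being deleted
grew = true;
while grew                  % repeat until no new dependents are found
    grew = false;
    for i = 1:numel(docs)
        depends_on_removed = any(strcmp(docs(i).depends_on, to_remove));
        already_listed     = any(strcmp(docs(i).id, to_remove));
        if depends_on_removed && ~already_listed
            to_remove{end+1} = docs(i).id;   % this document must go as well
            grew = true;
        end
    end
end

docs = docs(~ismember({docs.id}, to_remove));   % drop A and its dependents
disp({docs.id});
```

In this toy example, deleting `A` also removes `B` (which depends on `A`) and `C` (which depends on `B`), leaving only `D`.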