A Node.js/Redis implementation of XRI Data Interchange (XDI) specifications

README

 
Basic XDI operations:
      create_context(XDI_Endpoint, Context_XRI) : Promise(Ctx)
      destroy_context(XDI_Endpoint, Context_XRI) : Promise
      connect(Ctx, I_Name, Method, Credentials) : Promise ( Default_Link_Contract_XRI | Reason)
      disconnect(Ctx) : Promise
      establish(Ctx, Link_Contract_Template_XRI)  : Promise( Contract_XRI | Reason[, Counter_Contract])  
      dissolve(Ctx, Contract_XRI) : Promise ( () | Reason)
      create(Ctx, XRI_Instantiated) : Promise
      destroy(Ctx, XRI_Destroyed) : Promise
      add(Ctx, XRI_Container, XRI_Member) : Promise
      remove(Ctx, XRI_Container, XRI_Member) : Promise
      members(Ctx, XRI_Container, Handler_FN, [Filter_FN]*)
      and(Ctx, XRI_Container, XRI_Container, Intersection_Member_Handler_FN)
      or(Ctx, XRI_Container, XRI_Container, Union_Member_Handler_FN)
      
Helper functions:
       testForLiteralXri(XRI) : Promise 
       xriFromLiteral(Value) : Promise
       literalFromXri(Value) : Promise
       in(Ctx, XRI_Container) : Promise
       in(Ctx, XRI_Container, XRI_Member) : Promise

{} - Empty XDI Graph

"There is =Bill, i.e. Bill exists"
=Bill <=> { "=Bill" : {} } <=>  {"=" : [ {"*Bill" : {}} ]} <=>  {"=" : [ {"Bill" : {}} ]}

== Untyped string literal ==

"Bill has a property (that is a data type property) whose value is 25 (we don't know if 25 is a number or a string, we treat it as a string)"
=Bill/+property/(data:text/plain,25) <=>
	    {"=Bill" : [
	    	       {"+p" : [ "data:text/plain,25"] }
	    	     ]}

"Bill has a property (that is a data type property) whose numeric value is 25"
=Bill/+property/'25'
Note that the ' character is allowed by the general URI syntax (see http://www.ietf.org/rfc/rfc3986.txt), so I am using it to denote an untyped literal value.
In JSON this could be either
	    {"=Bill" : [
	    	       {"+p" : "data:text/plain,25"}
	    	     ]}
or
	    {"=Bill" : [
	    	       {"+p" :  {"+value": "25"} }
	    	     ]}
but not 
	    {"=Bill" : [
	    	       {"+p" : "25"} 
	    	     ]}
because that represents =Bill/+p/25, which is the address of an XDI object. That object could be, and probably is, a literal value, but we can't know that from the semantics.
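
The distinction above can be sketched in JavaScript. This is a minimal illustration that assumes only the JSON forms shown in this section; the function name classifyValue and its return labels are hypothetical and are not part of this library. It does not percent-decode the data URI payload.

```javascript
// Hypothetical helper: decide whether a JSON predicate value is an
// untyped literal or the address of an XDI object, per the forms above.
function classifyValue(value) {
  // Long form: {"+value": "25"} is always a literal.
  if (value !== null && typeof value === "object" && !Array.isArray(value) &&
      typeof value["+value"] === "string") {
    return { kind: "literal", text: value["+value"] };
  }
  if (typeof value === "string") {
    // Data URI form: "data:text/plain,25" carries the literal after the comma.
    const prefix = "data:text/plain,";
    if (value.startsWith(prefix)) {
      return { kind: "literal", text: value.slice(prefix.length) };
    }
    // A bare string such as "25" is an address (=Bill/+p/25), not a literal.
    return { kind: "address", xri: value };
  }
  return { kind: "unknown" };
}

console.log(classifyValue("data:text/plain,25"));
console.log(classifyValue({ "+value": "25" }));
console.log(classifyValue("25"));
```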


== Typed numeric literal ==

Note that 'data:text/plain,25', '25', and '$25' may be intended by an XRI creator to mean the same thing, but they are different. The first two are untyped literals; the third can be either a predicate specifying ordinality or a typed numeric literal. If the number following a $ is not a positive whole number then it can only represent a typed numeric literal.  The type is by default the type with the lowest space complexity that can represent the given number accurately (a whole number with a decimal .0 suffix is treated as a double). This can be modified by delegation. For example, to specify 25 as a double you could use $25.0, or you could use $25+xsd+double, or you could use +xsd+double$25.   You could also represent it as a native javascript value 25.

You may be asking "Why can I use delegation either way?". The reason is that every XRI is a set. So if I say $25 I am saying the set of all things 25, including 25 as a float, 25th, and 25 as an int, perhaps even 25 in some other context. Delegation is set intersection. So $25+xsd limits what I am talking about to the members of the $25 set that are also in the set +xsd, which is the set of typed values defined by the XML Schema Spec. By saying $25+xsd*double we further restrict it to members that are in $25, in +xsd, and also in +xsd*double (a subset of +xsd).
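
To make the "delegation is set intersection" idea concrete, here is a small JavaScript sketch. The member names in each set are toy data invented for illustration, and the delegate function is hypothetical; the point is only that intersecting the sets gives the same result in either order.

```javascript
// Model each XRI as the set of things it names (toy data, assumed).
const sets = {
  "$25": new Set(["25 as int", "25 as double", "25th"]),
  "+xsd": new Set(["25 as int", "25 as double", "true as boolean"]),
  "+xsd*double": new Set(["25 as double"]),
};

// Delegation narrows by intersecting the named sets, so the order in
// which the names are written does not change the result.
function delegate(...names) {
  return names
    .map((name) => sets[name])
    .reduce((acc, s) => new Set([...acc].filter((x) => s.has(x))));
}

console.log([...delegate("$25", "+xsd")]);
console.log([...delegate("+xsd*double", "$25")]);
```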


== String literal with language specified ==

To explicitly specify that a literal is a string (remember untyped literals are treated as strings) we can use delegation with the $xsd schema space, use delegation with the $lang space to signify a language (which means it must be a string), or do both using $and (very convoluted and unnecessary; I include it only to be logically consistent).

Examples:
	'25th'$xsd+string
	$xsd+string'25th'
	'25th'$lang$en
	$lang$en'25th'
	'25th'$and($xsd+string)($lang$en)


== Sidebar : Specifying a restricted type ==

Let's say I wanted to specify the type of an int, as per the XML Schema specification, that represents a day in March (1..31). I can use the following comparator specifiers in conjunction with $and:
     $gt = Greater than
     $gteq = Greater than or equal to
     $lt = Less than
     $lteq = Less than or equal to
     $eq = equal to
     
Example:  '25'$and($xsd+int)($gteq$1)($lteq$31)
     
I don't restrict the use of comparators to numeric types; they can be used on anything for which an order can be applied to the set of things of that type.
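
As a rough sketch of how the comparator specifiers above might be evaluated, the following JavaScript checks a value against a list of constraints such as $gteq$1 and $lteq$31. The satisfies function is hypothetical, it does not parse full XRIs, and it handles only numeric operands even though the comparators themselves are not limited to numbers.

```javascript
// Comparator specifiers from the list above, mapped to JavaScript tests.
const comparators = {
  "$gt": (a, b) => a > b,
  "$gteq": (a, b) => a >= b,
  "$lt": (a, b) => a < b,
  "$lteq": (a, b) => a <= b,
  "$eq": (a, b) => a === b,
};

// Hypothetical checker: each constraint is written like "$gteq$1".
function satisfies(value, constraints) {
  return constraints.every((constraint) => {
    const match = /^(\$[a-z]+)\$(-?\d+(?:\.\d+)?)$/.exec(constraint);
    if (!match || !(match[1] in comparators)) {
      throw new Error("Unsupported constraint: " + constraint);
    }
    return comparators[match[1]](value, Number(match[2]));
  });
}

// A day in March, as in '25'$and($xsd+int)($gteq$1)($lteq$31).
const dayInMarch = ["$gteq$1", "$lteq$31"];
console.log(satisfies(25, dayInMarch));
console.log(satisfies(32, dayInMarch));
```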


== Boolean literals ==

Boolean literals are $true, $false, 'true'+xsd*boolean, +xsd*boolean'true',  or javascript value true or false.


== What about other typed values? ==

To specify a typed value in an XRI I can use delegation as with the $25 example above, except the value is contained within single quotes. Note that the single quote character is allowed in both URI and URN syntax. For example, to specify $true using the XML Schema specification I'd use either 'true'+xsd*boolean or +xsd*boolean'true'. I'm adopting the convention that '' delimit the start and end of a literal in much the same way ( and ) do for an xref.


== What if the literal value is too large to fit into a URI? ==

For this case we can either store the literal at a resolvable URI and use an XREF, or we can inline that literal by making it a sequence of smaller parts. For example, let's say a literal "abcdefghi" needed to be inlined for XML signature purposes and only a 3 letter grouping could fit into a URI. We could inline it like so:
    {"+value" :  {
			"$is$a" : "$and($xsd+hexBinary)($set)",
			"$1" : "abc",
			"$2" : "def",
			"$3" : "ghi"
			}
    }
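
A minimal JavaScript sketch of the chunking shown above, assuming a fixed chunk size. The names splitLiteral and joinLiteral are hypothetical and are not taken from this library.

```javascript
// Hypothetical helper: split a literal into ordinal-keyed chunks ($1, $2, ...).
// typeXri is the type to combine with $set, e.g. "$xsd+hexBinary" as above.
function splitLiteral(literal, chunkSize, typeXri) {
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const parts = { "$is$a": "$and(" + typeXri + ")($set)" };
  for (let i = 0, n = 1; i < literal.length; i += chunkSize, n += 1) {
    parts["$" + n] = literal.slice(i, i + chunkSize);
  }
  return { "+value": parts };
}

// Reverse operation: concatenate the chunks in ordinal order.
function joinLiteral(obj) {
  const parts = obj["+value"];
  let out = "";
  for (let n = 1; ("$" + n) in parts; n += 1) {
    out += parts["$" + n];
  }
  return out;
}

const split = splitLiteral("abcdefghi", 3, "$xsd+hexBinary");
console.log(JSON.stringify(split, null, 2));
console.log(joinLiteral(split));
```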


== Multi-valued properties ==

Since every datum in an XDI graph must be XRI addressable, each value must have its own XRI address. That means that every predicate can have one and only one value. I can hear the cries of "But real-world properties with multiple values are commonplace!", and you're right! The solution is to have that single value be an XDI object that is a collection. Ordered collections (+seq) have ordinal predicates of the form $1, $2, $3, etc. Unordered collections (+set) have the same format but order is not guaranteed to be preserved. A collection that is a map is really nothing more than a standard XDI object whose keys are its predicates.

As an example let's say we want to specify the sentence "Bill's team colors are Blue and Gold". This would be the XRIs:
=Bill/$has/+team
=Bill+team/+color/+set$1
+set$1/$1/+Blue
+set$1/$2/+Gold

To specify it in one XRI using a sub-context (the notation used here is still very much under discussion by the TC):
=Bill+team/+color//$and($is$a/+set)($1/+Blue)($2/+Gold)

Note: Are these XRIs correct? Or would they be:
=Bill/$has/+team
+Bill+team+color/$1/+Blue
+Bill+team+color/$2/+Gold

In which case the unified XRI would be:
=Bill+team/+color/$and($is$a/+set)($1/+Blue)($2/+Gold)

We can simplify this a little by noting that the default type of a collection is a +set since every XRI is a set and since a sequence is a subset of a set that has order imposed on it. This leaves us with:
=Bill+team/+color//$and($1/+Blue)($2/+Gold)
or
=Bill+team/+color/$and($1/+Blue)($2/+Gold)

If we further say that $and and $or are instances of an unordered set and that for convenience the ordinal specifier is implied by the order within the XRI we can further simplify this to:
=Bill+team/+color//$and(+Blue)(+Gold)
or
=Bill+team/+color/$and(+Blue)(+Gold)

This of course is simplified further by our convention that any Xref consisting of a GCS and a single segment can be abbreviated without the parentheses, to get:
=Bill+team/+color//$and+Blue+Gold
or
=Bill+team/+color/$and+Blue+Gold

So could we say that $and and $or imply a sub-context? Seems so, but it's something that I'd like to discuss with the group.


== Non-literal object values ==

There are 3 types of non-literal object values: references, links, and sub-contexts. 

A reference is where the XDI object value is within the current graph context but is separately addressable from the subject of the statement in which the reference is the object. This is expressed as an XREF.

A link is an XRI property value and includes references.

A sub-context is where the XDI object value does not have meaning without explicitly being addressed as a sub-set of the subject and predicate. For example "Blue and gold" is not a sentence, but is either a noun phrase (an XDI subject) or a direct object (an XDI object). 

For example the XRI  =drummond/+friend//=markus/+friend//=paul.trevithick is roughly  equivalent to the English statement, “Drummond has a friend Markus 
who has a friend Paul Trevithick”. Note the double forward slashes that separate each 
context; they delimit the start of a new XRI addressing space. This means that there are two sub-contexts.
Each sub-context is represented as a JSON object. A sub-context consisting of just a subject X is represented as { "X": {} } 

I represent this in JSON as:
{"=drummond" :
	     {
		"+friend" :
			  {
				"=markus" :
					  {
						"+friend" :
							  {
								"=paul.trevithick" : {}
							  }
					  }
			  }
	     }
}
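
The nesting above can be derived mechanically from the XRI. The following JavaScript is a sketch under simplifying assumptions: each context has at most a subject and a predicate, and neither contains "/" inside an xref. The function name xriToNestedJson is hypothetical.

```javascript
// Hypothetical converter: each "//" starts a sub-context, and within a
// context "/" separates the subject from the predicate.
function xriToNestedJson(xri) {
  const contexts = xri.split("//");
  // Build from the innermost context outward.
  let inner = {};
  for (let i = contexts.length - 1; i >= 0; i -= 1) {
    const [subject, predicate] = contexts[i].split("/");
    inner = predicate === undefined
      ? { [subject]: inner }
      : { [subject]: { [predicate]: inner } };
  }
  return inner;
}

const nested = xriToNestedJson("=drummond/+friend//=markus/+friend//=paul.trevithick");
console.log(JSON.stringify(nested, null, 2));
```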


__TBD : Tie $and into sub-contexts better, with more and/or improved examples__

For instance the JSON representation of =Bill+team/+color//$and+Blue+Gold could be:
{ 
  "=Bill+team" : {
  	  "+color" : {
	  	   "$is$a" : "$and+Blue+Gold"
		   }
          }
 }

Given what we know about $and, that it is an instance of $set+ordered, I can say:
{ 
  "=Bill+team" : {
  	  "+color" : {
	  	   "$is$a" : "$and",
		   "$1" : "+Blue",
		   "$2" : "+Gold"
		   }
          }
 }

Since multi-valued properties with XRI values are so common in XDI I adopted the notation [], so the above could also be expressed as
{ 
  "=Bill+team" : {
  	  "+color" : ["+Blue", "+Gold"]
         }
 }
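
The two forms above carry the same information, so converting between them is straightforward. Here is a hedged JavaScript sketch; expandShorthand and collapseToShorthand are hypothetical names, and the sketch assumes ordinals start at $1 with no gaps.

```javascript
// Hypothetical helper: ["+Blue", "+Gold"] -> {"$is$a": "$and", "$1": ..., "$2": ...}
function expandShorthand(values) {
  const long = { "$is$a": "$and" };
  values.forEach((value, index) => {
    long["$" + (index + 1)] = value;
  });
  return long;
}

// Hypothetical inverse: read $1, $2, ... back into an array.
function collapseToShorthand(long) {
  const values = [];
  for (let n = 1; ("$" + n) in long; n += 1) {
    values.push(long["$" + n]);
  }
  return values;
}

const longForm = expandShorthand(["+Blue", "+Gold"]);
console.log(JSON.stringify(longForm));
console.log(JSON.stringify(collapseToShorthand(longForm)));
```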

== Addition path notation ==

== Summary ==

So to summarize...

I translate subject and predicates to JSON using the following rules:
1. An XDI graph is a JSON object with subjects as keys 
2. Each subject key has a value that is a JSON object with the predicates for that subject as keys
3. Each predicate key has a value according to the object value rules below. These rules are separated into rules for literals and rules for non-literals.

I translate literal object values to JSON using the following rules:
1. If it's an untyped literal "xyz" then the value is either {"+value" : "xyz"}, or  "data:text/plain,xyz", or {"+xri": "data:text/plain,xyz"}. Note that this also implies that any XDI object of the form {"+value" : "..."} is a literal, as well as that all literals are XRIs.
2. If it's a typed literal X of a numeric type then I can represent the value as one of:
   1. The javascript representation of the value (e.g. 25) (note: this trades some type info loss for convenience)
   2. $X (e.g. $25.0)
   3. $X$xsd+Y , where Y is one of the numeric types defined in the XML Schema specification (e.g $25$xsd+double)
   4. $xsd+Y$X (e.g. $xsd+double$25)
3. If it's a typed literal X of a boolean type then I represent the value as one of:
   1. The javascript boolean value true, or false
   2. $true or $false
   3. 'true'+xsd*boolean or 'false'+xsd*boolean
   4. +xsd*boolean'true' or +xsd*boolean'false'
4.  If it's a typed literal X of a String type with no language specified (default is $lang$en) then I represent the value as one of
    1. 'X'$xsd+string
    2. $xsd+string'X'
5.  If it's a typed literal X of a String type with language Y then I represent the value as one of
    1. 'X'$lang$Y
    2. $lang$Y'X'
    3. 'X'$and($xsd+string)($lang$Y) (Included only for logical consistency, one of the other forms should be used)
6.  If it's a typed literal X of some other type Y defined in schema space Z then I represent the value as one of:
    1. 'X'$Z+Y
    2. $Z+Y'X'
7. If the literal of type Y defined in schema space Z is too large to fit in an XRI then it must be broken into pieces or stored elsewhere and linked to. 
   To break it into pieces I use the long form of JSON for literals, which is {"+value" : {...} }, combined with
   the $set notation, like so:
       	    	   {"+value" : {
   			"$is$a" : "$and($Z+Y)($set)",
			"$1" : "...",
			"$2" : "...",
			"$3" : " ...", 
			....
			}
		    }
8. For multi-valued properties I use a similar notation but use $and, a narrowing of $set that means an unordered set that is combined, i.e. intersection.
       {"+value" : {
       		 "$is$a" : "$and",
		 "$1" : "1st value here, per rules above",
		 "$2" : "2nd value here, per rules above",
		 "$3" : "3rd value here, per rules above",
		 ...}
       }
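
A few of the literal rules above can be sketched as a single JavaScript function. This is only an illustration: toX3JLiteral and its options are hypothetical, it picks one of the allowed forms per rule, and it does not escape quote characters inside strings.

```javascript
// Hypothetical encoder for some of the literal rules above.
function toX3JLiteral(value, options = {}) {
  if (typeof value === "boolean") {
    return value ? "$true" : "$false"; // rule 3, form 2
  }
  if (typeof value === "number") {
    if (!Number.isFinite(value)) {
      throw new RangeError("Only finite numbers are supported");
    }
    // Rule 2, form 2. JavaScript cannot tell 25 from 25.0, which is the
    // type information loss that rule 2, form 1 mentions.
    return "$" + String(value);
  }
  if (typeof value === "string") {
    if (options.lang) {
      return "'" + value + "'$lang$" + options.lang; // rule 5, form 1
    }
    if (options.typed) {
      return "'" + value + "'$xsd+string"; // rule 4, form 1
    }
    return { "+value": value }; // rule 1, untyped literal
  }
  throw new TypeError("Unsupported literal type: " + typeof value);
}

console.log(toX3JLiteral(true));
console.log(toX3JLiteral(25));
console.log(toX3JLiteral("25th", { lang: "en" }));
console.log(JSON.stringify(toX3JLiteral("xyz")));
```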

I translate non-literal object values to JSON using the following rules:
1. Non-literals are always either XRIs in some form (though they may be a URI wrapped in an Xref) or sub-contexts
2. If the XRI  (X) is an Xref then I express the value in one of the following ways:
   1.  {"+xri$ref" :  "X" }
   2. as an XRI using the rules that follow
3. I can express an XRI X using a long form of { "+xri" : "X"} , or
4. I can express it using the shortcut notation [X]
5. For multi-valued non-literal properties I can either use the $and notation in rule for literals #8 above, or the shortcut notation. For example
if I had a property whose values were +blue and +gold I could express that as
{ 
   "$is$a" : "$and",
   "$1" : "+blue",
   "$2" : "+gold"
}
or as ["+blue", "+gold"]
6. If the value is a sub-context then it's represented as a JSON object. For example the XRI 
=drummond/+friend//=markus/+friend//=paul.trevithick is roughly  equivalent to the English statement, “Drummond has a friend Markus 
who has a friend Paul Trevithick”. Note the double forward slashes that separate each 
context; they delimit the start of a new XRI addressing space. This means that there are two sub-contexts.
Each sub-context is represented as a JSON object. A sub-context consisting of just a subject X is represented as { "X": {} } 

I represent this in JSON as:
{"=drummond" :
	     {
		"+friend" :
			  {
				"=markus" :
					  {
						"+friend" :
							  {
								"=paul.trevithick" : {}
							  }
					  }
			  }
	     }
}
7. Note that $and and $or imply a sub-context, so $and(+blue)(+gold) can also be expressed as
{ "$and$1" : 
  {
	"$1" : "+blue",
	"$2" : "+gold"
  }
} 
Note the use of $and$1 rather than $and. That means the first instance of an $and object in this context. If I had used $and I would have been talking
about the operator itself rather than an instance of its use.

More generally I take any instance of +set to imply a sub-context in the same way, as does any multi-valued XRI property.

The rules above taken together define the X3J XDI serialization format. X3J is an acronym for X3 for JSON.


== Comments ==
For security reasons I consider out-of-band comments that are not part of the graph model undesirable. Being free text they cannot be validated by a JSON schema checker.
I also consider them undesirable for extensibility reasons. I want to be able to describe semantically any extensions and comments have proven with XML to
be a method for some developers to add extensions that are only understandable and detectable by special parsing software.

If the information that needs to be represented cannot be represented with normal properties then I think it is much better to adopt the XML schema convention of xsd:annotation in the form of $xsd$annotation. This property always has a value that is a sub-context. That sub-context can include a property "+role" that has one or more XRI values. The sub-context must include either a +appinfo property or a +documentation property, but must not include both.  The value of the +appinfo property is a sub-context. The value of the +documentation property is a typed string literal.
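
The constraints on $xsd$annotation described above could be checked along these lines. This JavaScript sketch is an assumption-laden illustration: isValidAnnotation is a hypothetical name, +role is assumed to use the [] shorthand, the +note role is invented for the example, and the typed string literal is only checked to be a JSON string.

```javascript
function isPlainObject(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

// Hypothetical validator for the value of a $xsd$annotation property.
function isValidAnnotation(annotation) {
  if (!isPlainObject(annotation)) return false; // must be a sub-context
  const hasAppinfo = "+appinfo" in annotation;
  const hasDocumentation = "+documentation" in annotation;
  // Exactly one of +appinfo and +documentation must be present.
  if (hasAppinfo === hasDocumentation) return false;
  if (hasAppinfo && !isPlainObject(annotation["+appinfo"])) return false;
  if (hasDocumentation && typeof annotation["+documentation"] !== "string") {
    return false;
  }
  if ("+role" in annotation) {
    const roles = annotation["+role"];
    // Assumed shape: a non-empty array of XRI strings.
    if (!Array.isArray(roles) || roles.length === 0 ||
        !roles.every((role) => typeof role === "string")) {
      return false;
    }
  }
  return true;
}

const example = {
  "$xsd$annotation": {
    "+role": ["+note"],
    "+documentation": "'Preferred contact address'$xsd+string",
  },
};
console.log(isValidAnnotation(example["$xsd$annotation"]));
```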

== An algorithm to translate X3 Standard to X3J ==
The ABNF of X3 Standard is
    X3   = *( "[" sub *( "[" pred *( "[" obj "]" ) "]" ) "]" ) 
    sub  = [ comment ] xri-reference [ comment ] 
    pred  = [ comment ] xri [ comment ] 
    obj  = [ comment ] ( xri-reference / literal / X3 ) [ comment ] 
    literal = """ *char """ 
    comment = "<--" *c-char "-->" 

This ABNF is in progress and so does not include the following:
1. Formal spec of xri-reference, char, and c-char
2. Typed literals
3. Language for String literals

For the purposes of this algorithm I'll adopt the convention that there is another definition, xri-char, that defines the characters that can be in an XRI (from the XRI syntax ABNF), that
xri-reference = *xri-char, and that all literals from X3 Standard are untyped.

The result is initially {} and is held in State.result (the Result variable).  The algorithm is presented as an event-based algorithm, with each event handler being passed the current State and the Value
of that token. The State is initially { "result" : {}, "sub" : undefined, "pred": undefined }. Error handling is left out for readability. The ++ operator is used to denote appending to an array. 
Parsing X3 is done by calling parse(X3), which returns the JSON object representing the X3.  Note that several optimizations are possible that would change the algorithm; it is presented only as an example of one way to do the conversion.

__ TBD : Pseudo code below needs clean up __

on sub(State, XriReferenceString):
  if (State.result[XriReferenceString] == undefined)
     State.result[XriReferenceString] = {}
  State.sub = XriReferenceString

on sub(State, end):
  State.sub = undefined

on pred(State, end):
  State.pred = undefined

on pred(State, Xri):
  State.pred = Xri

// There are several ways the following could be represented; as
// elsewhere in this algorithm I used what I thought was the simplest
// approach
on obj(State, XriReferenceString):
  Current = State.result[State.sub][State.pred]
  if (Current == undefined)
     Next = [XriReferenceString]
  else
     Next = Current ++ XriReferenceString
  State.result[State.sub][State.pred] = Next

on obj(State, X3):
  Val = parse(X3)
  Current = State.result[State.sub][State.pred]
  if (Current == undefined)
     Next = Val
  else if (Current["+value"] != undefined && Current["+value"]["$is$a"] == "$and")
     Next = addToMultiValue(State, Val, Current)
  else
     Next = createMultiValue(State, Val, Current)
  State.result[State.sub][State.pred] = Next
   
// This uses the test functions not defined here that return true if the
// untyped literal is of a certain type:
// 	   isNum - If the string parses as a number
//	   
on obj(State, LiteralString):
  if (LiteralString == "true")
     Val = "$true"
  else if (LiteralString == "false")
     Val = "$false"
  else if (isNum(LiteralString))
     Val = "$" + LiteralString
  else if (length(LiteralString) > MAX_XRI_LENGTH)
     Val = splitLongLiteralToSet(LiteralString)
  else
     Val = LiteralString
  Current = State.result[State.sub][State.pred]
  if (Current == undefined)
     Next = Val
  else if (Current["+value"] != undefined && Current["+value"]["$is$a"] == "$and")
     Next = addToMultiValue(State, Val, Current)
  else
     Next = createMultiValue(State, Val, Current)
  State.result[State.sub][State.pred] = Next

splitLongLiteralToSet(LiteralString):
...

addToMultiValue(State, Val, Current):
...

createMultiValue(State, Val, Current):
...
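
For readers who want something executable, here is one possible JavaScript rendering of the event handlers above. It is a sketch, not the reference algorithm: the tokenizer, nested X3 objects, and long-literal splitting are left out, isNum is an assumed stand-in that accepts plain decimal numbers only, and mergeValue is a hypothetical replacement for createMultiValue and addToMultiValue that follows literal rule 8.

```javascript
function newState() {
  return { result: {}, sub: undefined, pred: undefined };
}

// Assumed stand-in for isNum: accepts plain decimal numbers only.
function isNum(text) {
  return /^-?\d+(\.\d+)?$/.test(text);
}

// True when a value already has the multi-value form from rule 8.
function isMultiValue(value) {
  return value !== null && typeof value === "object" &&
    value["+value"] !== null && typeof value["+value"] === "object" &&
    value["+value"]["$is$a"] === "$and";
}

// Hypothetical merge step standing in for createMultiValue/addToMultiValue.
function mergeValue(current, val) {
  if (current === undefined) return val;
  if (isMultiValue(current)) {
    const members = current["+value"];
    // Keys are "$is$a" plus n members, so the key count is the next ordinal.
    members["$" + Object.keys(members).length] = val;
    return current;
  }
  return { "+value": { "$is$a": "$and", "$1": current, "$2": val } };
}

const handlers = {
  sub(state, xri) {
    if (state.result[xri] === undefined) state.result[xri] = {};
    state.sub = xri;
  },
  pred(state, xri) {
    state.pred = xri;
  },
  objXri(state, xri) {
    const node = state.result[state.sub];
    const current = node[state.pred];
    if (current === undefined) node[state.pred] = [xri];
    else if (Array.isArray(current)) node[state.pred] = [...current, xri];
    else node[state.pred] = mergeValue(current, xri);
  },
  objLiteral(state, text) {
    let val = text;
    if (text === "true") val = "$true";
    else if (text === "false") val = "$false";
    else if (isNum(text)) val = "$" + text;
    const node = state.result[state.sub];
    node[state.pred] = mergeValue(node[state.pred], val);
  },
};

// Drive the handlers by hand, as a tokenizer would.
const state = newState();
handlers.sub(state, "=Bill");
handlers.pred(state, "+age");
handlers.objLiteral(state, "25");
handlers.sub(state, "=Bill+team");
handlers.pred(state, "+color");
handlers.objXri(state, "+Blue");
handlers.objXri(state, "+Gold");
console.log(JSON.stringify(state.result, null, 2));
```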

**We need to edit: '25'$xsd$int looks good but *can't be* used as it's not a valid XRI. Only '25', $xsd$int'25', and $xsd$int*25 are valid.

***********
LEFT OFF HERE*******************
*********
------
A difference from another implementation is how sub-context links are handled.  The other implementation  uses the following (irrelevant bits removed):
{"=drummond": {
	      "+email":
		{
			"=drummond+home": {
				"+email": null
			},
			"=drummond+work": {
				"+email": null
			}
		}
	},
	"=drummond+home": {
		"+email": "drummond@example.com"
	},
	"=drummond+work": {
		"+email":
		{
			"=drummond@cordance": {
				"+email": null
			},
			"=drummond@parity": {
				"+email": null
			}
		}
	}
}

To me this scans as: =drummond has an +email that is =drummond+home, which has an +email. Also, the use of null works but feels problematic. I would instead represent the links like this:
{"=drummond": {
	      "+email": [{"=drummond+home/+email" : {}}, {"=drummond+work/+email" : {}}]
	},
"=drummond+home": {
		"+email": "drummond@example.com"
	},
"=drummond+work": {
		"+email":
		{
			"=drummond@cordance": {
				"+email": null
			},
			"=drummond@parity": {
				"+email": null
			}
		}
	}
}

This feels cleaner to me and also has the side benefit of allowing statements about the sub-context. Let's say I wanted to say that I use
Drummond's work email between 9 and 5 EST and his home email at other times. I'll use a predicate my app knows about, $applies, which
allows the app to make a node invisible to access if the node does not pass the applied filter:
{"=drummond": {
	      "+email": [
				{"=drummond+work/+email" : {
							 "$applies" : ["$when/$gteq/$EST$9", "$when/$lteq/$EST$5"]
				}}, 
				{"=drummond+home/+email" : {}}
			]
	},
"=drummond+home": {
		"+email": "drummond@example.com"
	},
"=drummond+work": {
		"+email":
		{
			"=drummond@cordance": {
				"+email": null
			},
			"=drummond@parity": {
				"+email": null
			}
		}
	}
}
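
One way an application might follow the link keys used above is sketched here in JavaScript. The resolveLink and resolveAll functions are hypothetical, the sketch assumes every link has the form subject/predicate with a single "/", and it ignores $applies filters entirely.

```javascript
// Hypothetical resolver for link keys such as "=drummond+home/+email".
function resolveLink(graph, link) {
  const slash = link.indexOf("/");
  if (slash === -1) return undefined;
  const subject = link.slice(0, slash);
  const predicate = link.slice(slash + 1);
  const node = graph[subject];
  return node === undefined ? undefined : node[predicate];
}

// Resolve every link listed under graph[subject][predicate].
function resolveAll(graph, subject, predicate) {
  return graph[subject][predicate].map((entry) => {
    const link = Object.keys(entry)[0]; // each entry has one link key
    return { link: link, value: resolveLink(graph, link) };
  });
}

const graph = {
  "=drummond": {
    "+email": [{ "=drummond+home/+email": {} }, { "=drummond+work/+email": {} }],
  },
  "=drummond+home": { "+email": "drummond@example.com" },
  "=drummond+work": {
    "+email": {
      "=drummond@cordance": { "+email": null },
      "=drummond@parity": { "+email": null },
    },
  },
};

console.log(JSON.stringify(resolveAll(graph, "=drummond", "+email"), null, 2));
```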