Skip to content
/ CBOR Public

The most comprehensive CBOR module in the Lua universe.

License

Notifications You must be signed in to change notification settings

spc476/CBOR

Repository files navigation

             The Concise Binary Object Represenation Lua Modules

The major module in this collection is 'org.conman.cbor', the most
comprehensive CBOR module in the Lua universe, as it supports everything
mentioned in RFC-8949 and the extensions published by the IANA.  It's easy
to use, but also allows enough control to get exactly what you want when
encoding to CBOR.  For example:

	cbor = require "org.conman.cbor"

	data =
	{
	  name   = "Sean Conner",
	  userid = "spc476",
	  login  =
	  {
	    active     = true,
	    last_login = os.time(),
	  }
	}

	enc = cbor.encode(data)

It's that easy.  To decode is just as simple:

	data = cbor.decode(enc)

For a bit more control over the encoding though, say, to include some
semantic tagging, requires a bit more work, but is still quite doable.  For
example, if we change things slghtly:

	mt =
	{
	  __tocbor = function(self)
	    return cbor.TAG._epoch(os.time(self))
	  end
	}
	
	function date()
	  local now = os.date("*t")
	  setmetatable(now,mt)
	  return now
	end
	
	data = 
	{
	  name = "Sean Conner",
	  userid = "spc476",
	  login = 
	  {
	    active = true,
	    last_login = date()
	  }
	}
	
	enc = cbor.encode(data)

You can supply the '__tocbor' metamethod to a metatable to have the item
encode itself as this example shows.  This is one approach to have userdata
encoded into CBOR.

Decoding tagged items is also quite easy:

	tagged_items =
	{
	  _epoch = function(value)
	    local t = os.date("*t",value)
	    setmetatable(t,mt)
	    return t
	  end
	}

	-- first parameter is the data we're decoding
	-- second paramter is the offset to start with (optional)
	-- third parameter is a table of routines to process
	-- tagged data (optional)

	data = cbor.decode(enc,1,tagged_items)

The _epoch tagged type is passed to our function where we return the value
from os.date().  The resulting data looks like:

	data =
	{
	  name   = "Sean Conner",
	  userid = "spc476",
	  login  =
	  {
	    active     = true,
	    last_login = 
	    {
	      year  = 2016,
	      month = 4,
	      day   = 4,
	      hour  = 1,
	      min   = 42,
	      sec   = 53,
	      wday  = 2,
	      yday  = 95,
	      isdst = true,
	    } -- and we have a metatable that can encode this
	  }
	}

Of course, you could also do it like:

	enc = cbor.TYPE.MAP(3)
	   .. cbor.encode "name"   .. cbor.encode(data.name)
	   .. cbor.encode "userid" .. cbor.encode(data.userid)
	   .. cbor.encode "login"  .. cbor.TYPE.MAP(2)
	   	.. cbor.encode "active" 
	   	.. cbor.encode(data.login.active)
	   	.. cbor.encode "last_login" 
	   	.. cbor.TYPE._epoch(os.time(data.login.last_login))

if you really want to get down into the details of encoding CBOR.

Tagged types are not the only thing that can be post-processed though.  If
we wanted to ensure that all the field names were lower case, we can do that
as well:

	tagged_items =
	{
	  TEXT = function(value,iskey)
	    if iskey then 
	      value = value:lower()
	    end
	    return value
	  end,
	  
	  _epoch = function(value)
	    local t = os.date("*t",value)
	    setmetatable(t,mt)
	    return t
	  end,
	}
	
	data = cbor.decode(enc,1,tagged_items)
	
The second parameter indicates if the given value is a key in a MAP.

This module can also deal with circular references with an additional
parameter when encoding:

	x = {}
	x.x = x
	
	enc = cbor.encode(x,{})
	
The addition of the empty table as the second parameter let's the module
know that MAPs and ARRAYs might be referenced and to keep track as it's
processing.  The reason it's not done by default is that each MAP and ARRAY
is tagged as "shareable", which adds some overhead to the output.  If there
are no circular references, such additions are just wasted.  Just something
to keep in mind. 

But the above does work and can be passed to cbor.decode() without any
additional parameters (decoding will deal with such references
automatically).

Additionally, if the data you are sending might use repetative strings, say:

	data =
	{
	  { filename = "cbor_c.so"    , package = "org.conman" },
	  { filename = "cbor.lua"     , package = "org.conman" },
	  { filename = "cbor_s.lua"   , package = "org.conman" },
	  { filename = "cbormisc.lua" , package = "org.conman" },
	}
	
You can save some space by using string references:

	enc = cbor.encode(data,nil,{}) -- third parameter to use string refs
	
Normally, this would take 160 bytes to encode, but with string references,
it saves 54 bytes.  Not much in this example, but it could add up
significantly.

And yes, you can request both shared references and string references in the
same call.  

The CBOR values null and undefined map to Lua nil, and all the baggage that
comes with that.  If you really need to track null and/or undefined values,
you can define the following with a unique value:

	cbor.null
	cbor.undefined

And the value of those two fields will be checked for when encoding and the
appropriate CBOR value used.  Also, when decoding, the those values will be
used when the CBOR null and undefined values are decoded.  The safest value
to use is an empty table for each one:

	cbor.null      = {}
	cbor.undefined = {}

That will ensure both have a unique value.

If the 'org.conman.cbor' module is too heavy for your application, you can
try the 'org.conman.cbor_s' module.  It's similar to the full blown
'org.conman.cbor' module, but without the full support of the various tagged
data items.  And it too, supports custom null and undefined values---just
set

	cbor_s.null
	cbor_s.undefined

Our original example is the same:

	cbor_s = require "org.conman.cbor_s"

	data =
	{
	  name   = "Sean Conner",
	  userid = "spc476",
	  login  =
	  {
	    active     = true,
	    last_login = os.time(),
	  }
	}

	enc   = cbor_s.encode(data)
	data2 = cbor_s.decode(enc)

But the tagged version is slightly different:

	mt =
	{
	  __tocbor = function(self)
	  
	    -- ---------------------------------------------------
	    -- because of limited support, we need to specify the
	    -- actual number for the _epoch tag.
	    -- ---------------------------------------------------
	    
	    return cbor.encode(os.time(self),1)
	  end
	}
	
	function date()
	  local now = os.date("*t")
	  setmetatable(now,mt)
	  return now
	end
	
	data = 
	{
	  name = "Sean Conner",
	  userid = "spc476",
	  login = 
	  {
	    active = true,
	    last_login = date()
	  }
	}
	
	enc = cbor.encode(data)
	
	tagged_items = 
	{
	  [1] = function(value)	    
	    local t = os.date("*t",value)
	    setmetatable(t,mt)
	    return t
	  end,
	}
	
	data2 = cbor.decode(enc,1,tagged_items)

The first major difference is the lack of tag names (you have to use the
actual values).  The second difference is the lack of support for the
reference tags (you can still catch them, but due to a lack of support in
the underlying engine you can't really do anything with them at this time;
you the full blown 'org.conman.cbor' module if you really need circular
references) and string references (same issue).

Dropping down yet another level is the 'org.conman.cbor_c' module.  This is
the basic core of the two previous modules and only supplies two functions,
cbor_c.encode() and cbor_c.decode().  The modules 'org.conman.cbor_s' and
'org.conman.cbormisc' were developed to showcase the low level usage of the
'org.conman.cbor_c' module and can be the basis for a module for even more
constrained devices.

**************************************************************************
*
*                                  CAVEATS
*
**************************************************************************

Encoding a mixed Lua table (e.g.  { 1 , 2 , 3 , foo = 1 , bar = 2 }) may not
encode as expected.  For best results, ensure that all Lua tables are either
pure sequences (arrays, e.g. { 1 , 2 , 3 , 4 }) or are non-sequential in
nature.

NOTE:  The following array is problematic:

	{ foo = 1 , bar = 2 , [1] = 1 }

because it is both a Lua sequence and contains non-sequence indecies, and
will thus be encoded as an ARRAY.  The following array:

	{ foo = 1 , bar = 2 , [100] = 1 }

is NOT considered a sequence and will be encoded as a MAP.

**************************************************************************
*
*                         QUICK FUNCTION REFERENCE
*
* See http://cbor.io/spec.html for more information on CBOR
*
*	org.conman.cbor
*
*************************************************************

Full blown CBOR support.  All of RFC-8949 as well as the currently defined
IANA extensions are supported with this module.  This module also includes
enhanced error detection and type checking.  As a result, it's quite a bit
larger than the org.conman.cbor_s module.  This module also contains a large
number of functions.

==============================================================

Usage:	bool = cbor.isnumber(ctype)
Desc:	returns true of the given CBOR type is a number
Input:	type (enum/cbor) CBOR type
Return:	bool (boolean) true if number, false otherwise

==============================================================

Usage:	bool = cbor.isinteger(ctype)   
Desc:	returns true if the given CBOR type is an integer
Input:	ctype (enum/cbor) CBOR type
Return:	bool (boolean) true if number, false othersise

==============================================================

Usage:	bool = cbor.isfloat(ctype)
Desc: 	returns true if the given CBOR type is a float
Input:	ctype (enum/cbor) CBOR type
Return:	bool (boolean) true if number, false otherwise

==============================================================

Usage:	value,pos2,ctype = cbor.decode(packet[,pos][,conv][,ref][,iskey])
Desc:	Decode CBOR encoded data
Input:	packet (binary) CBOR binary blob
	pos (integer/optional) starting point for decoding
	conv (table/optional) table of conversion routines
	ref (table/optional) reference table (see notes)
	iskey (boolean/optional) is a key in a MAP (see notes)
Return:	value (any) the decoded CBOR data
	pos2 (integer) offset past decoded data
	ctype (enum/cbor) CBOR type of value
	
Note:	The conversion table should be constructed as:
	{
	  UINT      = function(v) return munge(v) end,
	  _datetime = function(v) return munge(v) end,
	  _url      = function(v) return munge(v) end,,
	}
	
	The keys are CBOR types (listed above).  These functions are
	expected to convert the decoded CBOR type into a more appropriate
	type for your code.  For instance, an _epoch can be converted into a
	table.
	
	Users of this function *should not* pass a reference table into this
	routine---this is used internally to handle references.  You need to
	know what you are doing to use this parameter.  You have been
	warned.
	
	The iskey is true if the value is being used as a key in a map, and
	is passed to the conversion routine; this too, is an internal use
	only variable and you need to know what you are doing to use this. 
	You have been warned.
	
	This function can throw an error.  The returned error object MAY BE
	a table, in which case it has the format:
	
	{
	  msg = "Error text",
	  pos = 13 -- position in binary object of error
	}
	
	This function can throw errors

==============================================================

Usage:	value,pos2,ctype[,err] = cbor.pdecode(packet[,pos][,conv][,ref])
Desc:	Protected call to cbor.decode(), which will return an error
Input:	packet (binary) CBOR binary blob
	pos (integer/optional) starting point for decoding
	conv (table/optional) table of conversion routines (see cbor.decode())
	ref (table/optional) reference table (see cbor.decode())
Return:	value (any) the decoded CBOR data, nil on error
	pos2 (integer) offset past decoded data; if error, position of error
	ctype (enum/cbor) CBOR type
	err (string/optional) error message (if any)

==============================================================

Usage:	blob = cbor.encode(value[,sref][,stref])
Desc:	Encode a Lua type into a CBOR type
Input:	value (any)
	sref (table/optional) shared reference table
	stref (table/optional) shared string reference table
Return:	blob (binary) CBOR encoded value
	
Note:	This function can throw errors

==============================================================

Usage:	blob[,err] = cbor.pencode(value[,sref][,stref])
Desc:	Protected call to encode a CBOR type
Input:	value (any)
	sref (table/optional) shared reference table
	stref (table/optional) shared string reference table
Return:	blob (binary) CBOR encoded value, nil on error
	err (string/optional) error message

==============================================================

	cbor.__ENCODE_MAP

A table of functions to map Lua values to CBOR encoded values.  nil,
boolean, number and string are handled directly (if a Lua string is valid
UTF8, then it's encoded as a CBOR TEXT.

For the other four types, only tables are directly supported without
metatable support.  If a metatable does exist, if the method '__tocbor' is
defined, that function is called and the results returned.  If '__len' is
defined, then it is mapped as a CBOR ARRAY.  For Lua 5.2, if '__ipairs' is
defined, then it too, is mapped as a CBOR ARRAY.  If Lua 5.2 or higher and
'__pairs' is defined, then it's mapped as a CBOR MAP.

Otherwise, an error is thrown.

	--------------------------------------------------------------

Usage:	blob = cbor.__ENCODE_MAP[luatype](value,sref,stref)
Desc:	Encode a Lua type into a CBOR type
Input:	value (any) a Lua value who's type matches luatype.
	sref (table/optional) shared reference table
	stref (table/optional) shared string reference table
Return:	blob (binary) CBOR encoded data

==============================================================

	cbor.TYPE

Both encoding and decoding functions for CBOR base types are in this table. 
The functions 'UINT' (unsigned integer), 'NINT' (negative integer), 'BIN'
(binary string), 'TEXT' (UTF-8 encoded text), 'ARRAY' (a Lua array or
sequence) and 'MAP' are used to encode the given type.  The numeric entries
are for decoding CBOR data.

	--------------------------------------------------------------

Usage:	blob = cbor.TYPE['name'](n,sref,stref)
Desc:	Encode a CBOR base type
Input:	n (integer string table) Lua type (see notes)
	sref (table/optional) shared reference table
	stref (table/optional) shared string reference table
Return:	blob (binary) CBOR encoded value
	
Note:	UINT and NINT take an integer.
	
	BIN and TEXT take a string.  TEXT will check to see if
	the text is well formed UTF8 and throw an error if the
	text is not valid UTF8.
	
	ARRAY and MAP take a table of an appropriate type. No
	checking is done of the passed in table, so a table
	of just name/value pairs passed in to ARRAY will return
	an empty CBOR encoded array.
	
	TAG and SIMPLE encoding are handled elsewhere.
	
	If the data you have includes tables with cycles, you will need to
	pass in an empty table for the sref parameter, otherwise, you run
	the risk of blowing out the stack.
	
	The parameter stref is used to pass strings as references in the 
	resulting CBOR.  This will make the resulting CBOR data smaller, but
	make sure the receiving end can deal with string references.
	
	ARRAY and MAP references are separate from string referenes.  It is
	also safe to use both with this module.

	--------------------------------------------------------------

Usage:	value2,pos2,ctype = cbor.TYPE[n](packet,pos,info,value,conv,ref)
Desc:	Decode a CBOR base type
Input:	packet (binary) binary blob of CBOR data
	pos (integer) byte offset in packet to start parsing from
	info (integer) CBOR info (0 .. 31)
	value (integer) CBOR decoded value
	conv (table) conversion table (passed to decode())
	ref (table) used to generate references (TAG types only)
Return:	value2 (any) decoded CBOR value
	pos2 (integer) byte offset just past parsed data
	ctype (enum/cbor) CBOR deocded type
	
Note:	tag_* is returned for any non-supported TAG types.  The
	actual format is 'tag_' <integer value>---for example,
	'tag_1234567890'.  Supported TAG types will return the
	appropriate type name.
	
	simple is returned for any non-supported SIMPLE types. 
	Supported simple types will return the appropriate type
	name.
	
	The ref parameter is not marked optional here---it's used internally
	to handle ARRAY/MAP and string references.  Make sure you always
	pass in the same table (it should be empty initally) here for
	consistent results.

==============================================================

		cbor.TAG

Encoding and decoding of CBOR TAG types are here.  Like cbor.TYPE, the named
entries are for encoding and the numbered entries are for decoding.  Named
types are:

		* _datetime	datetime (TEXT)
		* _epoch	see cbor.isnumber()
		* _pbignum	positive bignum (BIN)
		* _nbignum	negative bignum (BIN)
		* _decimalfraction ARRAY(integer exp, integer mantissa)
		* _bigfloat	ARRAY(float exp,integer mantissa)
		* _tobase64url	should be base64url encoded (BIN)
		* _tobase64	should be base64 encoded (BIN)
		* _tobase16	should be base16 encoded (BIN)
		* _cbor		CBOR encoded data (BIN)
		* _url		URL (TEXT)
		* _base64url	base64url encoded data (TEXT)
		* _base64	base64 encoded data (TEXT)
		* _regex	regex (TEXT)
		* _mime		MIME encoded messsage (TEXT)
		* _magic_cbor	itself (no data, used to self-describe CBOR data)
		
		* _nthstring	shared string
		* _perlobj	Perl serialized object
		* _serialobj	Generic serialized object
		* _shareable	sharable resource (ARRAY or MAP)
		* _sharedref	reference (UINT)
		* _rational	Rational number
		* _uuid		UUID value (BIN)
		* _language	Language-tagged string
		* _id		Identifier
		* _stringref	string reference
		* _bmime	Binary MIME message
		* _decimalfractionexp like _decimalfraction, non-int exponent
		* _bigfloatexp	like _bigfloat, non-int exponent
		* _indirection	Indirection
		* _rains	RAINS based message

	--------------------------------------------------------------

Usage:	blob = cbor.TAG['name'](value,sref,stref)
Desc:	Encode a CBOR tagged value
Input:	value (any) any Lua type
	sref (table/optional) shared reference table
	stref (table/optional) shared string reference table
Return:	blob (binary) CBOR encoded tagged value
	
Note:	Some tags only support a subset of Lua types.

	--------------------------------------------------------------

Usage:	value,pos2,ctype = cbor.TAG[n](packet,pos,conv,ref)
Desc:	Decode a CBOR tagged value
Input:	packet (binary) binary blob of CBOR tagged data
	pos (integer) byte offset into packet
	conv (table) conversion routines (passed to decode())
	ref (table) reference table
Return:	value (any) decoded CBOR tagged value
	pos2 (integer) byte offset just past parsed data
	ctype (enum/cbor) CBOR type of value

==============================================================

		cbor.SIMPLE

Encoding and decoding of CBOR simple types are here.  These are:

		* false		false value	(Lua false)
		* true		true value	(Lua true)
		* null		NULL value	(Lua nil)
		* undefined	undefined value	(Lua nil)
		* half		half precicion   IEEE 754 float
		* single	single precision IEEE 754 float
		* double	double precision IEEE 754 float
		* __break	SEE NOTES

	--------------------------------------------------------------

Usage:	blob = cbor.SIMPLE['name'](n)
Desc:	Encode a CBOR simple type
Input:	n (number/optional) floating point number to encode (see notes)
Return:	blob (binary) CBOR encoded simple type
	
Note:	Some functions ignore the passed in parameter.  
	
	WARNING! The functions that do not ignore the parameter may
	throw an error if floating point precision will be lost
	during the encoding.  Please be aware of what you are doing
	when calling SIMPLE.half(), SIMPLE.float() or
	SIMPLE.double().

	--------------------------------------------------------------

Usage:	value2,pos,ctype = cbor.SIMPLE[n](pos,value)
Desc:	Decode a CBOR simple type
Input:	pos (integer) byte offset in packet
	value (number/optional) floating point number
Return:	value2 (any) decoded value as Lua value
	pos (integer) original pos passed in (see notes)
	ctype (enum/cbor) CBOR type of value
	
Note:	The pos parameter is passed in to avoid special cases in
	the code and to conform to all other decoding routines.

*************************************************************
*
*	org.conman.cbor_s
*
*************************************************************

This module provides a simple (or small, take your pick) implementation of
CBOR, which should be fine for most uses, as long as such uses don't really
require the need of TAGS (although there is some minimal support for such)
and as long as you stick to simple Lua types like nil, booleans, numbers,
strings and tables of only simple Lua types and that have no cycles.  This
function defines four functions.  No type checking is done when encoding
tagged values.

==============================================================

Usage:		blob = cbor.encode(value[,tag])
Desc:		Encode a Lua type into a CBOR type
Input:		value (any)
		tag (number/optional) CBOR tag value
Return:		blob (binary) CBOR encoded value
		
Note:		This function can throw errors

==============================================================

Usage:		blob[,err] = cbor_s.pencode(value[,tag])
Desc:		Protected call to encode into CBOR
Input:		value (any)
		tag (number/optional) CBOR tag value
Return:		blob (binary) CBOR encoded value
		err (string/optional) error message

==============================================================

Usage:		value,pos2,ctype = cbor.decode(packet[,pos][,conv])
Desc:		Decode CBOR encoded data
Input:		packet (binary) CBOR binary blob
		pos (integer/optional) starting point for decoding
		conv (table/optional) table of tagged conversion routines
Return:		value (any) the decoded CBOR data
		pos2 (integer) offset past decoded data
		ctype (enum/cbor) CBOR type of value
		
Note:		The conversion table should be constructed as:
		
		{
		  [ 0] = function(v) return munge(v) end,
		  [32] = function(v) return munge(v) end,,
		}
		
		The keys are CBOR types (as integers).  These functions are
		expected to convert the decoded CBOR type into a more
		appropriate type for your code.  For instance, [1] (epoch)
		can be converted into a table.
		
		This function can throw errors.

==============================================================

Usage:		value,pos2,ctype[,err] = cbor.pdecode(packet[,pos][,conv])
Desc:		Protected call to decode CBOR data
Input:		packet (binary) CBOR binary blob
		pos (integer/optional) starting point for decoding
		conv (table/optional) table of tagged conversion routines
Return:		value (any) the decoded CBOR data, nil on error
		pos2 (integer) offset past decoded data, 0 on error
		ctype (enum/cbor) CBOR type of value
		err (string/optional) error message, if any

*************************************************************
*
*	org.conman.cbor_c
*
*************************************************************

This module provides the foundation of the CBOR modules and is written in C.
This module deals with the lowest level details of encoding and decoding
CBOR data.  It will encode data with the minimal encoding size [1].  This
module provides just two functions, and it helps to be familiar with
RFC-8949 to use this module properly.

NOTE:  Both functions can throw an error.

[1]	Floating point values will by default be encoded with the minimal
	encoding size without losing precision.  It is possible to use a
	larger encoding if wanted.

==============================================================

Usage:		blob = cbor_c.encode(type,value[,value2])
Desc:		Encode a CBOR value
Input:		type (integer) CBOR type
		value (number) value to encode (see note)
		value (number/optional) float to encode (see note)
Return:		blob (binary) CBOR encoded value
		
Note:		value is optional for type of 0xE0.
		value2 is optional for type of 0xE0; otherwise it's ignored.
		
		To encode a break value: 
			blob = cbor_c.encode(0xE0)
			
		To encode a floating point value with a minimal encoding:
			blob = cbor_c.encode(0xE0,nil,1.23)
			
		To force a particular encoding of a float value:
			blob = cbor_c.encode(0xE0,27,math.huge)
			
		To encode an array of indeterminate length:
			blob = cbor_c.encode(0xA0)
				-- encode entries
			blob = blob .. cbor_c.encode(0xE0)

==============================================================

Usage:		ctype,info,value,pos2 = cbor_c.decode(blob,pos)
Desc:		Decode a CBOR-encoded value
Input:		blob (binary) binary CBOR sludge
		pos (integer) position to start decoding from
Return:		ctype (integer) CBOR major type
		info (integer) sub-major type information
		value (integer number) decoded value
		pos2 (integer) position past decoded data

Note:		Throws in invalid parameter

*************************************************************
*
*	org.conman.cbormisc
*
*************************************************************

This module contains miscellaneous routines related to CBOR.  Currently, two
functions are defined, and these are not required for normal CBOR usage.

==============================================================

Usage:		diag = cbormisc.diagnostic(packet[,pos])
Desc:		Output CBOR encoded data in the CBOR diagnostic output format
Input:		packet (binary) CBOR encoded data
		pos (integer/optional) starting point for decoding
Return:		diag (string) CBOR data in CBOR diagnostic format

Note:		This function can throw errors

==============================================================

Usage:		diag[,err] = cbormisc.pdiagnostic(packet[,pos])
Desc:		Protected call to cbormisc.diagnostic
Input:		packet (binary) CBOR encoded data
		pos (integer/optional) starting point for decoding
Return:		diag (string) CBOR data in CBOR diagnostic format
		err (string/optional) error message if any