Skip to content

Document Path Basic Usage [Deprecated]

Manuel edited this page May 30, 2018 · 1 revision

FB DocStore is deprecated, and users will need to find alternative solutions.

Document path is implemented like a virtual column. We use dot-separated key paths as an intuitive way to refer objects in JSON documents. DocStore supports document paths in SQL queries as general expression to refer JSON objects.

The syntax is as follows:

<full_document_path> ::= <database_name> '.' <table_name> '.' <document_path>
                 | <table_name> '.' <document_path>
                 | <document_path>

<document_path> ::= <document_column_name> '.' <dot_separated_json_key_path>

<dot_separated_json_key_path> ::= <json_key_name_or_array_index>
                  | <json_key_name_or_array_index> '.' <dot_separated_json_key_path>

In general we can use document paths in a SQL query where a column or an expression can be used. Below are some examples of document path usage in queries.

  • Document path in projection list (with alias) and where condition. Note we use backquote for the key that contains a space.
select doc.address.`street name` 
from t1 where doc.children.2 like "Gary";

select doc.address.`street name` as StreeName 
from t1 where cast(doc.address.zipcode as unsigned) = 94403;
  • Document path in a function
select id from t1 where substring(test.t1.doc.name, 1, 4) like "Jack";
  • Document path in order-by. By default order-by will treat document path as string values. You can also add type hints (int, double, and string) to the order-by list.
select id, doc.address.zipcode
from t1 order by doc.address.zipcode;

-- use alias and type hint
select id, doc.address.houseNumber as houseNumber
from t1 order by houseNumber as int;

-- multiple document paths
select doc.address.state as state, doc.address.zipcode as zipcode
from t1 order by state, zipcode;
  • Document path in aggregate queries
select count(*) as cnt, doc.address.state as state
from t1 group by state
having cnt >= 1;

-- use cast() in group-by
select count(*), test.t1.doc.address.zipcode as zipcode
from t1 group by cast(zipcode as unsigned)
order by zipcode as int desc;
  • Document path in sub-queries and UNION queries
select count(doc.address.state) from (
  select doc.address
  from t1
) as tmp;

select doc.address.state, addr.state from (
  select doc.address.state, doc.address as addr
  from t1
) as tmp1 where tmp1.addr.state in ('CA', 'WA');

(select doc.address from t1) 
union
(select '{"zipcode":98766}');

-- union in subquery
select doc.address.state from (
  (select doc.address from t1)
  union
  (select doc.address from t2)
) as tmp;
  • Document path in JOIN queries
select addr.zipcode from (
  select t2.id, t1.doc.address as addr, t1.doc.name as name1, t2.doc.name as name2
  from t1 join t2
  on t2.doc.address.zipcode = t1.doc.address.zipcode
) as tmp;
  • Document path in a view
create algorithm = merge view v1 as
select doc.address as addr from t1;

select addr from v1;
select addr.zipcode from v1;

Additional properties of document path

  • Any document key that is not in in [ 0-9,a-z,A-Z,$_ ] needs to be in backquotes, e.g. doc.`street name`, just as database identifiers.
  • A number, e.g. 3, could be either a key name, or an array index, depending on the object pointed to by the document path.
  • We use the following order to disambiguate dot-separated identifiers during parsing:
    • db_name.table_name.col_name.[document path]
    • table_name.col_name.[document path]
    • col_name.[document path]
Clone this wiki locally