Creates a cartesian product of all properties identified using value-pointers.
/**
 * @param {Object} tree A hierarchical data structure.
 * @returns {Array} A cartesian product of all values identified using value-pointers in the input object.
 */
unnest(tree);
In the context of Unnest, a value-pointer is an object property whose key begins with @, e.g.
{
  '@foo': {
    name: 'bar'
  }
}
Here, @foo is a value-pointer.
Value-pointers identify the members that are used to create the cartesian product.
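The @ prefix rule can be expressed in a few lines. The following is a minimal sketch; `isValuePointer` and `partitionKeys` are hypothetical helper names used for illustration and are not part of the Unnest API.

```javascript
// Hypothetical helper (not part of the Unnest API): a key is treated as a
// value-pointer when it starts with "@".
const isValuePointer = (key) => key.startsWith('@');

// Splits an object's own keys into value-pointers and all other keys.
const partitionKeys = (node) => {
  const pointers = [];
  const others = [];

  for (const key of Object.keys(node)) {
    (isValuePointer(key) ? pointers : others).push(key);
  }

  return {pointers, others};
};

const example = {
  '@foo': {name: 'bar'},
  children: []
};

// "@foo" lands in `pointers`; "children" lands in `others`.
console.log(partitionKeys(example));
```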
Unnest solves the problem of translating a hierarchical dataset into a collection of atomic data records. This is a common requirement when extracting information from a hierarchical document (e.g. HTML) with the intent to store it.
To illustrate a use case, consider an HTML document that describes event locations, dates and times:
<ul>
  <li class='location'>
    <h1>foo0</h1>
    <ol>
      <li class='date'>
        <h2>bar0</h2>
        <ol>
          <li class='time'>baz0</li>
          <li class='time'>baz1</li>
        </ol>
      </li>
      <li class='date'>
        <h2>bar1</h2>
        <ol>
          <li class='time'>baz2</li>
          <li class='time'>baz3</li>
        </ol>
      </li>
    </ol>
  </li>
</ul>
We want to extract location, date and time information into a collection of objects that each describe all attributes of an event, i.e. the desired result is a cartesian product of all three variables (location, date and time):
[
  {
    "date": "bar0",
    "location": "foo0",
    "time": "baz0"
  },
  {
    "date": "bar0",
    "location": "foo0",
    "time": "baz1"
  },
  {
    "date": "bar1",
    "location": "foo0",
    "time": "baz2"
  },
  {
    "date": "bar1",
    "location": "foo0",
    "time": "baz3"
  }
]
We can extract the subject data from the document using Surgeon. Surgeon uses declarative instructions to extract information from an HTML document, e.g.
- sm .location
- '@location': so h1 | rdtc
  children:
  - sm .date
  - '@date': so h2 | rdtc
    children:
    - '@time': sm .time | rdtc
The result is a hierarchical object describing the relevant variables contained in the HTML document:
[
  {
    "@location": "foo0",
    "children": [
      {
        "@date": "bar0",
        "children": [
          {
            "@time": "baz0"
          },
          {
            "@time": "baz1"
          }
        ]
      },
      {
        "@date": "bar1",
        "children": [
          {
            "@time": "baz2"
          },
          {
            "@time": "baz3"
          }
        ]
      }
    ]
  }
]
To get a cartesian product of all the variables, we need to iterate over the tree data structure:
// `input` is the hierarchical object produced by Surgeon.
const locations = input;
// The result must be an array so that records can be pushed onto it.
const result = [];

for (const locationDatum of locations) {
  for (const dateDatum of locationDatum.children) {
    for (const timeDatum of dateDatum.children) {
      result.push({
        date: dateDatum['@date'],
        location: locationDatum['@location'],
        time: timeDatum['@time']
      });
    }
  }
}
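The loops above hard-code three levels of nesting. As an illustration of how the same traversal could be made independent of depth, here is a minimal recursive sketch. `flattenTree` is a hypothetical name, and the handling of empty arrays and non-object leaves is an assumption made for this sketch; it is not Unnest's actual implementation.

```javascript
// Hypothetical generic traversal; a sketch, not Unnest's implementation.
const flattenTree = (node, inherited = {}) => {
  // An array contributes the records produced by each of its members.
  if (Array.isArray(node)) {
    return node.flatMap((member) => flattenTree(member, inherited));
  }

  // Assumption: a non-object leaf adds nothing to the inherited record.
  if (node === null || typeof node !== 'object') {
    return [inherited];
  }

  // Collect this level's value-pointers on top of the inherited ones and
  // remember every non-pointer property that can be descended into.
  const record = {...inherited};
  const nested = [];

  for (const [key, value] of Object.entries(node)) {
    if (key.startsWith('@')) {
      record[key] = value;
    } else if (value !== null && typeof value === 'object') {
      nested.push(value);
    }
  }

  // Each nested branch extends every record produced so far. With no nested
  // branches, the node yields exactly one record.
  return nested.reduce(
    (records, branch) => records.flatMap((partial) => flattenTree(branch, partial)),
    [record]
  );
};

const sample = [
  {
    '@location': 'foo0',
    children: [
      {'@date': 'bar0', children: [{'@time': 'baz0'}, {'@time': 'baz1'}]},
      {'@date': 'bar1', children: [{'@time': 'baz2'}, {'@time': 'baz3'}]}
    ]
  }
];

// One record per path from a location to a time.
console.log(flattenTree(sample));
```

Note that in this sketch an empty `children` array produces no records for its parent; a different policy may be preferable depending on the data.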
Unnest replaces the last step:
import unnest from 'unnest';
unnest(input);
// [
//   {
//     "@date": "bar0",
//     "@location": "foo0",
//     "@time": "baz0"
//   },
//   {
//     "@date": "bar0",
//     "@location": "foo0",
//     "@time": "baz1"
//   },
//   {
//     "@date": "bar1",
//     "@location": "foo0",
//     "@time": "baz2"
//   },
//   {
//     "@date": "bar1",
//     "@location": "foo0",
//     "@time": "baz3"
//   }
// ]
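The records above keep the @ prefix on their keys, whereas the desired result shown earlier uses plain `date`, `location` and `time` keys. One way to bridge the gap is to rename the keys after unnesting. This is a minimal sketch; `stripPointerPrefix` is a hypothetical helper and not part of Unnest.

```javascript
// Hypothetical helper (not part of Unnest): removes a leading "@" from
// every key of a record and leaves the values untouched.
const stripPointerPrefix = (record) =>
  Object.fromEntries(
    Object.entries(record).map(([key, value]) => [key.replace(/^@/, ''), value])
  );

// Records shaped like the output shown above.
const records = [
  {'@date': 'bar0', '@location': 'foo0', '@time': 'baz0'},
  {'@date': 'bar0', '@location': 'foo0', '@time': 'baz1'}
];

// Each resulting record has the keys "date", "location" and "time".
console.log(records.map(stripPointerPrefix));
```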