Dataset
A dataset is a structure of values in tabular format, ultimately represented by an array of JavaScript objects, for which SQL-like querying operations are available.
Below, the fluent-data library is loaded into the variable $$ and three datasets are constructed with array arguments. These datasets are used by the examples on this page.
let $$ = require('./dist/fluent-data.server.js');
let students = $$([
  { id: 'a', name: 'Andrea', topic: 'Abelard', bias: 'analytic' },
  { id: 'b', name: 'Brielle', topic: 'Bentham', bias: 'buddhist' }
]);
let teachers = $$([
  { id: 'b', name: 'Brielle', topic: 'bijection', school: 'Berkley' },
  { id: 'c', name: 'Chloe', topic: 'change', school: 'Cambridge' }
]);
let purchases = $$([
  { customerId: 'b', books: 4, time: 16.68, price: 560, rating: 73 },
  { customerId: 'a', books: 1, time: 11.50, price: 80, rating: 95 },
  { customerId: 'a', books: 1, time: 12.03, price: 150, rating: 92 },
  { customerId: 'b', books: 2, time: 14.88, price: 220, rating: 88 },
  { customerId: 'a', books: 3, time: 13.75, price: 340, rating: 90 },
  { customerId: 'b', books: 4, time: 18.11, price: 330, rating: 66 },
  { customerId: 'a', books: 5, time: 21.09, price: 401, rating: 54 },
  { customerId: 'b', books: 5, time: 23.77, price: 589, rating: 31 }
]);
using fluent-data
A dataset will be produced by executing fluent-data as a function.
Parameters:
- data: An iterable object (such as an array) having complex objects as properties.
Example:
The example below creates a dataset by executing the fluent-data function. It uses the log method defined on datasets.
$$(purchases).log(null, '$$(purchases):');
$$(purchases):
┌────────────┬───────┬───────┬───────┬────────┐
│ customerId │ books │ time  │ price │ rating │
├────────────┼───────┼───────┼───────┼────────┤
│ b          │ 4     │ 16.68 │ 560   │ 73     │
│ a          │ 1     │ 11.5  │ 80    │ 95     │
│ a          │ 1     │ 12.03 │ 150   │ 92     │
│ b          │ 2     │ 14.88 │ 220   │ 88     │
│ a          │ 3     │ 13.75 │ 340   │ 90     │
│ b          │ 4     │ 18.11 │ 330   │ 66     │
│ a          │ 5     │ 21.09 │ 401   │ 54     │
│ b          │ 5     │ 23.77 │ 589   │ 31     │
└────────────┴───────┴───────┴───────┴────────┘
using fluent-data.dataset
A dataset will be produced by referencing fluent-data.dataset.
Parameters:
- data: An iterable object (such as an array) having complex objects as properties.
Example:
The example below creates a dataset by referencing fluent-data.dataset. It uses the log method defined on datasets.
new $$.dataset(purchases).log(null, '$$.dataset(purchases):');
$$.dataset(purchases):
┌────────────┬───────┬───────┬───────┬────────┐
│ customerId │ books │ time  │ price │ rating │
├────────────┼───────┼───────┼───────┼────────┤
│ b          │ 4     │ 16.68 │ 560   │ 73     │
│ a          │ 1     │ 11.5  │ 80    │ 95     │
│ a          │ 1     │ 12.03 │ 150   │ 92     │
│ b          │ 2     │ 14.88 │ 220   │ 88     │
│ a          │ 3     │ 13.75 │ 340   │ 90     │
│ b          │ 4     │ 18.11 │ 330   │ 66     │
│ a          │ 5     │ 21.09 │ 401   │ 54     │
│ b          │ 5     │ 23.77 │ 589   │ 31     │
└────────────┴───────┴───────┴───────┴────────┘
distinct
Eliminates duplicate rows in a dataset.
The following parameters are available:
- func: Optional. A function that takes a dataset row as input and produces a value used to determine equality between rows. If omitted, the full row is considered.
- sorter: Optional. When two distinct rows are equal under the func equality comparer, this parameter defines an order so that the first one can be chosen.
Below is an example of distinct used without a parameter. The resulting dataset has fewer records because duplicates are removed.
purchases
  .map(p => ({
    customerId: p.customerId,
    books: p.books
  }))
  .distinct()
  .log();
┌────────────┬───────┐
│ customerId │ books │
├────────────┼───────┤
│ b          │ 4     │
│ a          │ 1     │
│ b          │ 2     │
│ a          │ 3     │
│ a          │ 5     │
│ b          │ 5     │
└────────────┴───────┘
And this is an example of distinct used with both the optional func and sorter parameters. Just as before, the number of rows is reduced and duplicates are removed. But this time the definition of what constitutes a duplicate does not involve the whole row. So for the row representing each distinct group, all columns (even ones not involved in distinct equality) survive.
The sorter helps control which row is selected to represent each distinct group.
purchases
  .distinct(
    p => p.customerId,
    p => [p.customerId, -p.rating]
  )
  .log();
┌────────────┬───────┬───────┬───────┬────────┐
│ customerId │ books │ time  │ price │ rating │
├────────────┼───────┼───────┼───────┼────────┤
│ b          │ 2     │ 14.88 │ 220   │ 88     │
│ a          │ 1     │ 11.5  │ 80    │ 95     │
└────────────┴───────┴───────┴───────┴────────┘
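To make the roles of func and sorter concrete, here is a minimal sketch of the same idea on a plain array. It does not use fluent-data and is not a description of how the library implements distinct. The helper names distinctBy and compareKeys are hypothetical, and the order of the output groups may differ from the library's.

```javascript
// Plain-array sketch: distinct by key, with a sorter choosing the representative row.
// distinctBy and compareKeys are illustrative names, not fluent-data functions.
const rows = [
  { customerId: 'b', rating: 73 },
  { customerId: 'a', rating: 95 },
  { customerId: 'a', rating: 92 },
  { customerId: 'b', rating: 88 }
];

// Compare two arrays element by element (tiered comparison).
function compareKeys(a, b) {
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    if (a[i] < b[i]) return -1;
    if (a[i] > b[i]) return 1;
  }
  return a.length - b.length;
}

function distinctBy(data, keyFunc, sorter) {
  // Sort a copy first so the preferred row of each group is encountered first.
  const sorted = [...data].sort((x, y) => compareKeys(sorter(x), sorter(y)));
  const seen = new Map();
  for (const row of sorted) {
    const key = JSON.stringify(keyFunc(row));
    if (!seen.has(key)) seen.set(key, row); // keep only the first row per key
  }
  return [...seen.values()];
}

const best = distinctBy(rows, p => p.customerId, p => [p.customerId, -p.rating]);
console.log(best);
```

Negating the rating in the sorter puts the highest-rated row first within each customer, which is why that row is the one that survives.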
filter
Returns elements of a dataset that pass a boolean test.
The parameter func should be a function with a dataset row as input and a boolean value as output. A true value includes the row in the final result set; a false value excludes it.
The example below filters the purchases dataset to only include purchases from customer 'a'.
purchases
  .filter(p => p.customerId == 'a')
  .log();
┌────────────┬───────┬───────┬───────┬────────┐
│ customerId │ books │ time  │ price │ rating │
├────────────┼───────┼───────┼───────┼────────┤
│ a          │ 1     │ 11.5  │ 80    │ 95     │
│ a          │ 1     │ 12.03 │ 150   │ 92     │
│ a          │ 3     │ 13.75 │ 340   │ 90     │
│ a          │ 5     │ 21.09 │ 401   │ 54     │
└────────────┴───────┴───────┴───────┴────────┘
map
Creates a new dataset produced by calling a function on every element of the input dataset.
The sole parameter expects a one-parameter function with a dataset row as input and a reshaped row as output.
The example below shows a complex use of the map function that demonstrates many of its flexible features.
purchases
  .map(p => ({
    ...p, // return all properties of 'p'
    speed: p.time, // but also add a 'speed' property that copies 'time'
    time: undefined, // then delete 'time' [.get() will omit undefined props]
    rating: undefined, // and delete 'rating',
    perBook: $$.round( // and create a new property
      p.price / p.books,
      1e-2
    )
  }))
  .log();
The use of the spread operator, the fact that repeated properties return the value listed last, and the fact that get() does not output undefined properties all combine to produce a mapping that outputs all properties, but with 'time' renamed to 'speed', 'rating' deleted, and 'perBook' added.
┌────────────┬───────┬───────┬───────┬─────────┐
│ customerId │ books │ price │ speed │ perBook │
├────────────┼───────┼───────┼───────┼─────────┤
│ b          │ 4     │ 560   │ 16.68 │ 140     │
│ a          │ 1     │ 80    │ 11.5  │ 80      │
│ a          │ 1     │ 150   │ 12.03 │ 150     │
│ b          │ 2     │ 220   │ 14.88 │ 110     │
│ a          │ 3     │ 340   │ 13.75 │ 113.33  │
│ b          │ 4     │ 330   │ 18.11 │ 82.5    │
│ a          │ 5     │ 401   │ 21.09 │ 80.2    │
│ b          │ 5     │ 589   │ 23.77 │ 117.8   │
└────────────┴───────┴───────┴───────┴─────────┘
These tactics aren't always warranted for small numbers of properties, but for more complex rows, they can result in a syntax more pleasing than SQL.
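The rename-and-delete pattern can be tried on a single plain object without the library. The sketch below assumes only standard JavaScript; stripUndefined is a hypothetical helper that imitates the documented omission of undefined properties, and Math.round stands in for $$.round.

```javascript
// One purchase row, reshaped with the same spread / override / undefined pattern.
const purchase = { customerId: 'a', books: 3, time: 13.75, price: 340, rating: 90 };

const reshaped = {
  ...purchase,            // start from every property of the row
  speed: purchase.time,   // add 'speed' as a copy of 'time'
  time: undefined,        // later keys override earlier ones, so 'time' is blanked
  rating: undefined,      // same for 'rating'
  perBook: Math.round((purchase.price / purchase.books) * 100) / 100
};

// stripUndefined is illustrative only: it drops properties whose value is undefined.
function stripUndefined(row) {
  return Object.fromEntries(
    Object.entries(row).filter(([, value]) => value !== undefined)
  );
}

const cleaned = stripUndefined(reshaped);
console.log(cleaned);
```

Because an overridden key keeps its original position in the object, the surviving properties come out in the same column order as the table above.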
matrix
Converts a dataset to a matrix.
Parameters:
- selector: A comma-separated string of property names, or a function with a dataset row as input and an array of numbers as output.
- rowNames: A string naming a property of the dataset rows, or a function with a dataset row as input that returns the name of the given row as a string.
Example:
let fromCsv =
  purchases
    .matrix('books, price, time', 'customerId')
    .log();
let fromFuncs =
  purchases
    .matrix(p => [p.books, p.price, p.time], p => p.customerId)
    .log();
┌───┬───────┬───────┬───────┐
│   │ books │ price │ time  │
├───┼───────┼───────┼───────┤
│ b │ 4     │ 560   │ 16.68 │
│ a │ 1     │ 80    │ 11.5  │
│ a │ 1     │ 150   │ 12.03 │
│ b │ 2     │ 220   │ 14.88 │
│ a │ 3     │ 340   │ 13.75 │
│ b │ 4     │ 330   │ 18.11 │
│ a │ 5     │ 401   │ 21.09 │
│ b │ 5     │ 589   │ 23.77 │
└───┴───────┴───────┴───────┘
┌───┬────┬─────┬───────┐
│   │ c0 │ c1  │ c2    │
├───┼────┼─────┼───────┤
│ b │ 4  │ 560 │ 16.68 │
│ a │ 1  │ 80  │ 11.5  │
│ a │ 1  │ 150 │ 12.03 │
│ b │ 2  │ 220 │ 14.88 │
│ a │ 3  │ 340 │ 13.75 │
│ b │ 4  │ 330 │ 18.11 │
│ a │ 5  │ 401 │ 21.09 │
│ b │ 5  │ 589 │ 23.77 │
└───┴────┴─────┴───────┘
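The two selector forms can be pictured with a small plain-array sketch. This is only an illustration of how a string selector and a function selector can reach the same numbers; toMatrix and the shape of its return value are assumptions made for this example, not the library's matrix type.

```javascript
const rows = [
  { customerId: 'b', books: 4, price: 560, time: 16.68 },
  { customerId: 'a', books: 1, price: 80, time: 11.5 }
];

// toMatrix is a hypothetical helper, not the library's matrix method.
function toMatrix(data, selector, rowNames) {
  // A string selector becomes a function that picks the named properties in order.
  let select = selector;
  if (typeof selector === 'string') {
    const names = selector.split(',').map(name => name.trim());
    select = row => names.map(name => row[name]);
  }
  const nameOf = typeof rowNames === 'string' ? row => row[rowNames] : rowNames;
  return { rowNames: data.map(nameOf), values: data.map(select) };
}

const fromCsv = toMatrix(rows, 'books, price, time', 'customerId');
const fromFuncs = toMatrix(rows, p => [p.books, p.price, p.time], p => p.customerId);
console.log(fromCsv.values, fromFuncs.values);
```

Only the string form carries property names along, which is consistent with the second table above falling back to generic column labels.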
sort
Orders the rows of a dataset.
The sorter function establishes the criteria by which to sort the rows. If sorter is a one-parameter function, then its result directly serves as the ordering criteria. If it is a two-parameter function, then it should return an integer. Assume the two parameters, in order, are a and b. If sorter returns a negative number, then effectively 'a comes before b'. If it returns 0, then 'a equals b', at least in terms of ordering. If it returns a positive number, then 'b comes before a'.
The example below sorts purchases the 'hard way'. In other words, it uses the two-parameter syntax that returns an integer.
purchases.sort((p,p2) =>
    p.customerId > p2.customerId ? 1
  : p.customerId < p2.customerId ? -1
  : p.rating > p2.rating ? -1
  : p.rating < p2.rating ? 1
  : 0
)
.log();
┌────────────┬───────┬───────┬───────┬────────┐
│ customerId │ books │ time  │ price │ rating │
├────────────┼───────┼───────┼───────┼────────┤
│ a          │ 1     │ 11.5  │ 80    │ 95     │
│ a          │ 1     │ 12.03 │ 150   │ 92     │
│ a          │ 3     │ 13.75 │ 340   │ 90     │
│ a          │ 5     │ 21.09 │ 401   │ 54     │
│ b          │ 2     │ 14.88 │ 220   │ 88     │
│ b          │ 4     │ 16.68 │ 560   │ 73     │
│ b          │ 4     │ 18.11 │ 330   │ 66     │
│ b          │ 5     │ 23.77 │ 589   │ 31     │
└────────────┴───────┴───────┴───────┴────────┘
But because sort works with arrays, sorting in tiers based on the element positions, the same ordering can be expressed with a one-parameter function that returns an array. Below, the elements are first sorted by customer in ascending order, and secondarily by rating in descending order. It is equivalent to the snippet above.
purchases
  .sort(p => [p.customerId, -p.rating])
  .log();
┌────────────┬───────┬───────┬───────┬────────┐
│ customerId │ books │ time  │ price │ rating │
├────────────┼───────┼───────┼───────┼────────┤
│ a          │ 1     │ 11.5  │ 80    │ 95     │
│ a          │ 1     │ 12.03 │ 150   │ 92     │
│ a          │ 3     │ 13.75 │ 340   │ 90     │
│ a          │ 5     │ 21.09 │ 401   │ 54     │
│ b          │ 2     │ 14.88 │ 220   │ 88     │
│ b          │ 4     │ 16.68 │ 560   │ 73     │
│ b          │ 4     │ 18.11 │ 330   │ 66     │
│ b          │ 5     │ 23.77 │ 589   │ 31     │
└────────────┴───────┴───────┴───────┴────────┘
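The equivalence between the two forms can be checked on a plain array. The sketch below is only an illustration using the standard Array sort; compareTiers and byKey are hypothetical helpers, not part of fluent-data.

```javascript
const rows = [
  { customerId: 'b', rating: 73 },
  { customerId: 'a', rating: 92 },
  { customerId: 'b', rating: 88 },
  { customerId: 'a', rating: 95 }
];

// Compare two arrays position by position; the first differing position decides.
function compareTiers(a, b) {
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    if (a[i] < b[i]) return -1;
    if (a[i] > b[i]) return 1;
  }
  return 0;
}

// Turn a one-parameter key function into a two-parameter comparator.
const byKey = keyFunc => (x, y) => compareTiers(keyFunc(x), keyFunc(y));

const tiered = [...rows].sort(byKey(p => [p.customerId, -p.rating]));

// The same ordering written out as an explicit comparator.
const explicit = [...rows].sort((p, p2) =>
    p.customerId > p2.customerId ? 1
  : p.customerId < p2.customerId ? -1
  : p2.rating - p.rating
);

console.log(tiered.map(p => `${p.customerId}:${p.rating}`).join(' '));
```

Negating a numeric element is what flips that tier to descending order; the trick does not apply to string tiers.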
A dataset can have a nested structure. Although most methods are built around this feature of a dataset, the methods in this section particularly highlight the nested structure.
apply
Applies a function to every base-level grouping of a dataset.
The sole parameter, tableLevelFunc, is a function that expects a dataset-like object as input (i.e. an iterable that produces a dataset row on each iteration). It returns a dataset-like object as output. Alternatively, it can be made async and yield objects as rows.
The apply() method is not recommended for direct use. However, most methods defined on dataset use it under the hood. If it is ever used directly, it is likely in order to extend dataset and write your own method that can operate on grouped data.
This example extends dataset and uses apply in order to create a custom method that converts every row's value to its type description.
class myDataset extends $$.dataset {
  typeOfs () {
    function* tableLevelFunc (data) {
      for(let row of data) {
        for(let key of Object.keys(row))
          row[key] = typeof row[key];
        yield row;
      }
    };
    this.apply(tableLevelFunc);
    return this;
  }
}
new myDataset(purchases)
  .group(p => p.customerId)
  .typeOfs()
  .log();
┌────────────────────────────────────────────────────┐
│ key: "b"                                           │
│ ┌────────────┬────────┬────────┬────────┬────────┐ │
│ │ customerId │ books  │ time   │ price  │ rating │ │
│ ├────────────┼────────┼────────┼────────┼────────┤ │
│ │ string     │ number │ number │ number │ number │ │
│ │ string     │ number │ number │ number │ number │ │
│ │ string     │ number │ number │ number │ number │ │
│ │ string     │ number │ number │ number │ number │ │
│ └────────────┴────────┴────────┴────────┴────────┘ │
│ key: "a"                                           │
│ ┌────────────┬────────┬────────┬────────┬────────┐ │
│ │ customerId │ books  │ time   │ price  │ rating │ │
│ ├────────────┼────────┼────────┼────────┼────────┤ │
│ │ string     │ number │ number │ number │ number │ │
│ │ string     │ number │ number │ number │ number │ │
│ │ string     │ number │ number │ number │ number │ │
│ │ string     │ number │ number │ number │ number │ │
│ └────────────┴────────┴────────┴────────┴────────┘ │
└────────────────────────────────────────────────────┘
group
Gathers the rows of a dataset into separate nestings based on a criterion.
The sole parameter expects a function that has a dataset row as input and returns a value to be used as the criteria on which to group the rows. The value can be complex, such as an array or object.
An important feature of a dataset is that most methods applied to grouped datasets operate inside of each grouping.
The example below creates a special 'flag' property out of certain thresholds stemming from 'rating'. The mapped rows are then grouped by customerId and by the flag. Finally, the rows in each group are filtered to output only rows with a rating greater than 50.
purchases
  .map(p => ({
    ...p,
    flag: p.rating < 60 ? 'bad' : p.rating < 90 ? 'okay' : 'good'
  }))
  .group(p => [p.customerId, p.flag])
  .filter(p => p.rating > 50)
  .log();
┌────────────────────────────────────────────────────────┐
│ key: ["b","okay"]                                      │
│ ┌────────────┬───────┬───────┬───────┬────────┬──────┐ │
│ │ customerId │ books │ time  │ price │ rating │ flag │ │
│ ├────────────┼───────┼───────┼───────┼────────┼──────┤ │
│ │ b          │ 4     │ 16.68 │ 560   │ 73     │ okay │ │
│ │ b          │ 2     │ 14.88 │ 220   │ 88     │ okay │ │
│ │ b          │ 4     │ 18.11 │ 330   │ 66     │ okay │ │
│ └────────────┴───────┴───────┴───────┴────────┴──────┘ │
│ key: ["a","good"]                                      │
│ ┌────────────┬───────┬───────┬───────┬────────┬──────┐ │
│ │ customerId │ books │ time  │ price │ rating │ flag │ │
│ ├────────────┼───────┼───────┼───────┼────────┼──────┤ │
│ │ a          │ 1     │ 11.5  │ 80    │ 95     │ good │ │
│ │ a          │ 1     │ 12.03 │ 150   │ 92     │ good │ │
│ │ a          │ 3     │ 13.75 │ 340   │ 90     │ good │ │
│ └────────────┴───────┴───────┴───────┴────────┴──────┘ │
│ key: ["a","bad"]                                       │
│ ┌────────────┬───────┬───────┬───────┬────────┬──────┐ │
│ │ customerId │ books │ time  │ price │ rating │ flag │ │
│ ├────────────┼───────┼───────┼───────┼────────┼──────┤ │
│ │ a          │ 5     │ 21.09 │ 401   │ 54     │ bad  │ │
│ └────────────┴───────┴───────┴───────┴────────┴──────┘ │
│ key: ["b","bad"]                                       │
│ ┌──┐                                                   │
│ │  │                                                   │
│ └──┘                                                   │
└────────────────────────────────────────────────────────┘
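The behaviour shown above, where a filter runs inside each group and can leave a group empty, can be sketched on plain arrays. This is a minimal illustration, not the library's implementation; groupBy is a hypothetical helper, and serializing the key with JSON.stringify is an assumption made here so that array keys with equal contents compare as equal.

```javascript
const rows = [
  { customerId: 'b', rating: 73 },
  { customerId: 'a', rating: 95 },
  { customerId: 'a', rating: 54 },
  { customerId: 'b', rating: 31 }
];

const flagOf = p => p.rating < 60 ? 'bad' : p.rating < 90 ? 'okay' : 'good';

// groupBy is illustrative only: it buckets rows under a serialized key.
function groupBy(data, keyFunc) {
  const groups = new Map();
  for (const row of data) {
    const key = JSON.stringify(keyFunc(row));
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(row);
  }
  return groups;
}

const grouped = groupBy(rows, p => [p.customerId, flagOf(p)]);

// Filtering runs inside each group, so a group can end up empty but still exist.
const filtered = new Map(
  [...grouped].map(([key, groupRows]) => [key, groupRows.filter(p => p.rating > 50)])
);

for (const [key, groupRows] of filtered) console.log(key, groupRows);
```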
ungroup
Rolls back the lowest level of grouping and flattens rows in sibling groups into one set of data.
The optional mapper parameter allows a user to pass a mapping function to be applied to each row simultaneously with the ungrouping process.
If grouping is already at the top level, it is possible to apply ungroup and output a naked object, provided that the original dataset only had one row.
In the code below, purchases are grouped, then filtered, then ungrouped.
purchases
  .group(p => p.customerId)
  .filter(p => p.rating >= 55)
  .ungroup()
  .log();
┌────────────┬───────┬───────┬───────┬────────┐
│ customerId │ books │ time  │ price │ rating │
├────────────┼───────┼───────┼───────┼────────┤
│ a          │ 1     │ 11.5  │ 80    │ 95     │
│ a          │ 1     │ 12.03 │ 150   │ 92     │
│ a          │ 3     │ 13.75 │ 340   │ 90     │
│ b          │ 4     │ 16.68 │ 560   │ 73     │
│ b          │ 2     │ 14.88 │ 220   │ 88     │
│ b          │ 4     │ 18.11 │ 330   │ 66     │
└────────────┴───────┴───────┴───────┴────────┘
Methods in this section ultimately wrap the merge() method. This section is where the various 'join' and 'exists' methods reside.
exists
A row in a 'left' dataset is output only if it has a 'match' in the 'right' dataset.
Parameters:
- rightData: Dataset, required. The dataset to combine with the 'leftData' (the dataset from which the method is called).
- matcher: Function, required. The logic on which to determine equality, indicating whether two records are a 'match' or not. Can also be the string '=', which
- mergeOptions: Optional, object. Allows other configurations of the merge process. For a list of recognized properties that can be passed, see the description of the 'options' parameter under the merge method.
Example:
students
  .exists(teachers, (s,t) => s.id == t.id)
  .log();
┌────┬─────────┬─────────┬──────────┐
│ id │ name    │ topic   │ bias     │
├────┼─────────┼─────────┼──────────┤
│ b  │ Brielle │ Bentham │ buddhist │
└────┴─────────┴─────────┴──────────┘
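Conceptually this is a semi-join: the right dataset is only consulted, and none of its properties appear in the output. A minimal plain-array sketch of that idea follows. existsIn is a hypothetical helper and says nothing about how the library performs the match internally.

```javascript
const students = [
  { id: 'a', name: 'Andrea', topic: 'Abelard', bias: 'analytic' },
  { id: 'b', name: 'Brielle', topic: 'Bentham', bias: 'buddhist' }
];
const teachers = [
  { id: 'b', name: 'Brielle', topic: 'bijection', school: 'Berkley' },
  { id: 'c', name: 'Chloe', topic: 'change', school: 'Cambridge' }
];

// existsIn is illustrative only: keep a left row if any right row matches it.
function existsIn(left, right, matcher) {
  return left.filter(l => right.some(r => matcher(l, r)));
}

const matched = existsIn(students, teachers, (s, t) => s.id == t.id);
console.log(matched);
```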
join
Merges two datasets such that when rows between them 'match', a new row is output
that combines both their properties (on name collision the right property wins out).
Unmatched rows are omitted from the results.
Parameters:
- rightData: Dataset, required. The dataset to combine with the 'leftData' (the dataset from which the method is called).
- matcher: Function, required. The logic on which to determine equality, indicating whether two records are a 'match' or not. Can also be the string '=', which
- mergeOptions: Optional, object. Allows other configurations of the merge process. For a list of recognized properties that can be passed, see the description of the 'options' parameter under the merge method.
Example:
students
  .join(teachers, (s,t) => s.id == t.id)
  .log();
┌────┬─────────┬───────────┬──────────┬─────────┐
│ id │ name    │ topic     │ bias     │ school  │
├────┼─────────┼───────────┼──────────┼─────────┤
│ b  │ Brielle │ bijection │ buddhist │ Berkley │
└────┴─────────┴───────────┴──────────┴─────────┘
joinLeft
Merges two datasets such that when rows between them 'match', a new row is output
that combines both their properties (on name collision the right property wins out).
Unmatched rows from the left dataset are output as-is, and unmatched rows from the right dataset are omitted from the results.
Parameters:
- rightData: Dataset, required. The dataset to combine with the 'leftData' (the dataset from which the method is called).
- matcher: Function, required. The logic on which to determine equality, indicating whether two records are a 'match' or not. Can also be the string '=', which
- mergeOptions: Optional, object. Allows other configurations of the merge process. For a list of recognized properties that can be passed, see the description of the 'options' parameter under the merge method.
Example:
students
  .joinLeft(teachers, (s,t) => s.id == t.id)
  .log();
┌────┬─────────┬───────────┬──────────┬─────────┐
│ id │ name    │ topic     │ bias     │ school  │
├────┼─────────┼───────────┼──────────┼─────────┤
│ a  │ Andrea  │ Abelard   │ analytic │         │
│ b  │ Brielle │ bijection │ buddhist │ Berkley │
└────┴─────────┴───────────┴──────────┴─────────┘
joinRight
Merges two datasets such that when rows between them 'match', a new row is output
that combines both their properties (on name collision the right property wins out).
Unmatched rows from the left dataset are omitted from the results, and unmatched rows from the right dataset are output as-is.
Parameters:
- rightData: Dataset, required. The dataset to combine with the 'leftData' (the dataset from which the method is called).
- matcher: Function, required. The logic on which to determine equality, indicating whether two records are a 'match' or not. Can also be the string '=', which
- mergeOptions: Optional, object. Allows other configurations of the merge process. For a list of recognized properties that can be passed, see the description of the 'options' parameter under the merge method.
Example:
students
  .joinRight(teachers, (s,t) => s.id == t.id)
  .log();
┌────┬─────────┬───────────┬──────────┬───────────┐
│ id │ name    │ topic     │ bias     │ school    │
├────┼─────────┼───────────┼──────────┼───────────┤
│ b  │ Brielle │ bijection │ buddhist │ Berkley   │
│ c  │ Chloe   │ change    │          │ Cambridge │
└────┴─────────┴───────────┴──────────┴───────────┘
joinFull
Merges two datasets such that when rows between them 'match', a new row is output
that combines both their properties (on name collision the right property wins out).
Unmatched rows from either dataset are output as-is.
Parameters:
- rightData: Dataset, required. The dataset to combine with the 'leftData' (the dataset from which the method is called).
- matcher: Function, required. The logic on which to determine equality, indicating whether two records are a 'match' or not. Can also be the string '=', which
- mergeOptions: Optional, object. Allows other configurations of the merge process. For a list of recognized properties that can be passed, see the description of the 'options' parameter under the merge method.
Example:
students
  .joinFull(teachers, (s,t) => s.id == t.id)
  .log();
┌────┬─────────┬───────────┬──────────┬───────────┐
│ id │ name    │ topic     │ bias     │ school    │
├────┼─────────┼───────────┼──────────┼───────────┤
│ a  │ Andrea  │ Abelard   │ analytic │           │
│ b  │ Brielle │ bijection │ buddhist │ Berkley   │
│ c  │ Chloe   │ change    │          │ Cambridge │
└────┴─────────┴───────────┴──────────┴───────────┘
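The four join variants differ only in what happens to unmatched rows. The plain-array sketch below makes that explicit with two flags. It is a nested-loop illustration under the assumptions stated in the comments; joinRows, keepLeft and keepRight are hypothetical names and are not fluent-data options.

```javascript
const students = [
  { id: 'a', name: 'Andrea', topic: 'Abelard', bias: 'analytic' },
  { id: 'b', name: 'Brielle', topic: 'Bentham', bias: 'buddhist' }
];
const teachers = [
  { id: 'b', name: 'Brielle', topic: 'bijection', school: 'Berkley' },
  { id: 'c', name: 'Chloe', topic: 'change', school: 'Cambridge' }
];

// joinRows is illustrative only, not the library's implementation.
function joinRows(left, right, matcher, { keepLeft = false, keepRight = false } = {}) {
  const output = [];
  const matchedRight = new Set();
  for (const l of left) {
    let found = false;
    for (const r of right) {
      if (!matcher(l, r)) continue;
      found = true;
      matchedRight.add(r);
      output.push({ ...l, ...r }); // on name collision the right property wins
    }
    if (!found && keepLeft) output.push(l); // unmatched left row, output as-is
  }
  if (keepRight) {
    for (const r of right) if (!matchedRight.has(r)) output.push(r); // unmatched right rows
  }
  return output;
}

const byId = (s, t) => s.id == t.id;
const inner = joinRows(students, teachers, byId);
const leftJoin = joinRows(students, teachers, byId, { keepLeft: true });
const rightJoin = joinRows(students, teachers, byId, { keepRight: true });
const fullJoin = joinRows(students, teachers, byId, { keepLeft: true, keepRight: true });
console.log(inner.length, leftJoin.length, rightJoin.length, fullJoin.length);
```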
merge
Combines the results of two datasets into a single dataset.
This method expects the following parameters:
- rightData: Dataset, required. The dataset to combine with the 'leftData' (the dataset from which the method is called).
- matcher: Function, required. The logic on which to determine equality, indicating whether two records are a 'match' or not. Can also be the string '=', which
- mapper: Function, required. The logic of what to output when there is a match vs not. Expects two parameters, the 'left row' and the 'right row'. If there is a match, then both are defined. If a match is not found, then the left or the right parameter will be populated, depending on which dataset is sending the row.
- options: Optional, object. Allows other configurations of the merge process.
If the 'options' parameter is set, then the following properties are recognized:
- singular: Boolean. Sets leftSingular and rightSingular to this value, assuming they're not set already.
- leftSingular: Boolean. If true, only distinct rows from the left dataset will be considered. Equality to determine distinction is what is passed to matcher. So it will not be full object equality (unless that is what's passed to matcher).
- rightSingular: Boolean. The right-dataset counterpart to leftSingular.
- hasher: Function. If set, leftHasher and rightHasher are also set to this value, assuming they're not set already.
- leftHasher: Function. If set, the output of this function is compared with the outputs of rightHasher to determine equality between left and right dataset rows using a hashing algorithm. It can work without matcher to be the final word on equality, or it can work together with a matcher to first narrow down nearly equal objects into buckets, and then to make the final determination with the matcher.
- rightHasher: Function. The right-dataset counterpart to leftHasher.
- algo: String. The algorithm to use. Can be 'hash' or 'loop'.
All methods in this section wrap this method. In other words, this is the core implementation of all merges. But it can be fairly complex to use directly, hence the existence of the other methods.
Below, two merges are performed. The purpose of the first is to effectively 'stack' the students and teachers to create a dataset of all people. If a person is both a student and a teacher, then the mapper logic chooses the student record to survive. If a person is only one but not the other, it will output the record regardless. The purpose of the second merge is to effectively 'left join' people to purchases. The final output is mapped in order to select a subset of columns, simply for cleaner visualisation.
students
  .merge(
    teachers,
    (s,t) => s.id == t.id, // seek to match records based on the 'id' property
    (s,t) =>
      (s&&t) // check if s and t both exist (if a match was found for the rows)
      ? s // if so, ignore t, just output s
      : (s||t) // if not, output whichever exists
  )
  .merge(
    purchases,
    null,
    (st,p) =>
      (st&&p) // check if st and p both exist
      ? { ...st, ...p } // if so, output an object combining the properties of both
      : st, // if not, only output st, ignore unmatched p's.
    {
      leftHasher: st => st.id, // seek to match based on the 'id' property ...
      rightHasher: p => p.customerId // .. against the customerId property
    }
  )
  .map(stp => ({
    name: stp.name,
    topic: stp.topic,
    school: stp.school,
    books: stp.books,
    price: stp.price
  }))
  .log();
┌─────────┬─────────┬───────┬───────┬───────────┐
│ name    │ topic   │ books │ price │ school    │
├─────────┼─────────┼───────┼───────┼───────────┤
│ Andrea  │ Abelard │ 1     │ 80    │           │
│ Andrea  │ Abelard │ 1     │ 150   │           │
│ Andrea  │ Abelard │ 3     │ 340   │           │
│ Andrea  │ Abelard │ 5     │ 401   │           │
│ Brielle │ Bentham │ 4     │ 560   │           │
│ Brielle │ Bentham │ 2     │ 220   │           │
│ Brielle │ Bentham │ 4     │ 330   │           │
│ Brielle │ Bentham │ 5     │ 589   │           │
│ Chloe   │ change  │       │       │ Cambridge │
└─────────┴─────────┴───────┴───────┴───────────┘
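The hasher-based matching used in the second merge can be pictured as bucketing one side by its hash value and probing with the other. The sketch below is a plain-array illustration of that general technique, not the library's code. hashMerge is a hypothetical helper, and dropping undefined mapper results is an assumption made here to mirror how unmatched purchases disappear in the example above.

```javascript
const people = [
  { id: 'a', name: 'Andrea' },
  { id: 'c', name: 'Chloe' }
];
const purchaseRows = [
  { customerId: 'b', books: 4, price: 560 },
  { customerId: 'a', books: 1, price: 80 },
  { customerId: 'a', books: 1, price: 150 }
];

// hashMerge is illustrative only.
function hashMerge(left, right, mapper, leftHasher, rightHasher) {
  // Bucket the right rows by their hash value.
  const buckets = new Map();
  for (const r of right) {
    const key = rightHasher(r);
    if (!buckets.has(key)) buckets.set(key, []);
    buckets.get(key).push(r);
  }
  const output = [];
  const usedKeys = new Set();
  // Assumption: a mapper result of undefined means "emit nothing".
  const emit = row => { if (row !== undefined) output.push(row); };
  for (const l of left) {
    const key = leftHasher(l);
    const matches = buckets.get(key);
    if (matches) {
      usedKeys.add(key);
      for (const r of matches) emit(mapper(l, r));
    } else {
      emit(mapper(l, undefined)); // unmatched left row
    }
  }
  // Offer unmatched right rows to the mapper with no left row.
  for (const [key, rightRows] of buckets) {
    if (!usedKeys.has(key)) for (const r of rightRows) emit(mapper(undefined, r));
  }
  return output;
}

const merged = hashMerge(
  people,
  purchaseRows,
  (st, p) => (st && p) ? { ...st, ...p } : st,
  st => st.id,
  p => p.customerId
);
console.log(merged);
```

Bucketing avoids comparing every left row with every right row, which is the usual motivation for a hash strategy over a nested loop.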
notExists
A row in a 'left' dataset is output only if it has no 'match' in the 'right' dataset.
Parameters:
- rightData: Dataset, required. The dataset to combine with the 'leftData' (the dataset from which the method is called).
- matcher: Function, required. The logic on which to determine equality, indicating whether two records are a 'match' or not. Can also be the string '=', which
- mergeOptions: Optional, object. Allows other configurations of the merge process. For a list of recognized properties that can be passed, see the description of the 'options' parameter under the merge method.
Example:
students
  .notExists(teachers, (s,t) => s.id == t.id)
  .log();
┌────┬────────┬─────────┬──────────┐
│ id │ name   │ topic   │ bias     │
├────┼────────┼─────────┼──────────┤
│ a  │ Andrea │ Abelard │ analytic │
└────┴────────┴─────────┴──────────┘
notExistsFull
Returns all 'unmatched' rows between two datasets, regardless of which dataset the row comes from.
Parameters:
- rightData: Dataset, required. The dataset to combine with the 'leftData' (the dataset from which the method is called).
- matcher: Function, required. The logic on which to determine equality, indicating whether two records are a 'match' or not. Can also be the string '=', which
- mergeOptions: Optional, object. Allows other configurations of the merge process. For a list of recognized properties that can be passed, see the description of the 'options' parameter under the merge method.
Example:
students
  .notExistsFull(teachers, (s,t) => s.id == t.id)
  .log();
┌────┬────────┬─────────┬──────────┬───────────┐
│ id │ name   │ topic   │ bias     │ school    │
├────┼────────┼─────────┼──────────┼───────────┤
│ a  │ Andrea │ Abelard │ analytic │           │
│ c  │ Chloe  │ change  │          │ Cambridge │
└────┴────────┴─────────┴──────────┴───────────┘
reduce
reduce accumulates dataset rows to return an aggregated result. It expects an object as a parameter. This object should have properties with reducers as values.
A reducer is an aggregating function. If a reducer has only one parameter, then that parameter is treated as an array of objects when aggregating (it can be iterated, for instance). If a reducer has two parameters, then it works similarly to Array.reduce in that the first parameter is the accumulator and the second parameter is a row. If the two-parameter function passed has a 'seed' property defined, then its value is used as the seed (otherwise the seed is 0). Conveniently, you can set the seed by defining a string property of the same name but with '.seed' suffixed.
Examples of built-in reducers are first(), avg(), and cor(). See the resources and examples below for their usage.
Related resources:
- For a list of built-in reducers, see Built-In Reducers.
- For a guide on how to build custom reducers, see Custom Reducers.
Reducing grouped data:
The example below groups purchases by customer and then applies the first, avg, and cor reducers. It also demonstrates the use of a seeded two-parameter reducer with the logic in-line, as well as a one-parameter in-line reducer.
purchases
  .group(p => p.customerId)
  .reduce(({
    customer: $$.first(p => p.customerId),
    time: $$.avg(p => p.time),
    rating: $$.avg(p => p.rating),
    correlation: $$.cor(p => [p.time, p.rating]),
    timeSum: (acc,next) => acc + next.time,
    ['timeSum.seed']: -10, // eliminate some common time
    timeMin: (data) => Math.min(...data.map(row => row.time))
  }))
  .log(null,null,1e-8);
This results in an ungrouped array with one row of aggregated results per customer.
┌──────────┬─────────┬────────┬─────────────┬─────────┬─────────┐
│ customer │ time    │ rating │ correlation │ timeSum │ timeMin │
├──────────┼─────────┼────────┼─────────────┼─────────┼─────────┤
│ a        │ 14.5925 │ 82.75  │ -0.99187759 │ 48.37   │ 11.5    │
│ b        │ 18.36   │ 64.5   │ -0.99795664 │ 63.44   │ 14.88   │
└──────────┴─────────┴────────┴─────────────┴─────────┴─────────┘
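The difference between one-parameter and two-parameter reducers, and the effect of a seed, can be sketched on a plain array. The helper below is only an illustration of the rules described in this section. runReducer is a hypothetical name, and dispatching on the function's declared parameter count is an assumption about one way such a rule could be applied.

```javascript
// Customer 'a' rows from the purchases data, reduced to the property used here.
const rows = [
  { customerId: 'a', time: 11.5 },
  { customerId: 'a', time: 12.03 },
  { customerId: 'a', time: 13.75 },
  { customerId: 'a', time: 21.09 }
];

// runReducer is illustrative only; it is not how fluent-data is implemented.
function runReducer(data, reducer) {
  // Function.length reports how many parameters the reducer declares.
  if (reducer.length === 1) return reducer(data);
  const seed = reducer.seed === undefined ? 0 : reducer.seed;
  return data.reduce(reducer, seed);
}

// Two-parameter form: accumulator first, then the current row.
const timeSum = (acc, next) => acc + next.time;
timeSum.seed = -10; // start the accumulation at -10 instead of 0

// One-parameter form: receives all rows at once.
const timeMin = data => Math.min(...data.map(row => row.time));

console.log(runReducer(rows, timeSum));
console.log(runReducer(rows, timeMin));
```

With the seed of -10, the sum of customer a's times lands at roughly 48.37, in line with the timeSum column above.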
Reducing ungrouped data:
The following example aggregates purchases. This time, it does not first group by customer.
purchases
  .reduce({
    time: $$.avg(p => p.time),
    rating: $$.avg(p => p.rating),
    correlation: $$.cor(p => [p.time, p.rating])
  })
  .log();
The result becomes an object, as opposed to an array:
{
  time: 16.47625,
  rating: 73.625,
  correlation: -0.9821574166144001
}
Keeping the Group Level the Same:
In general, one level of grouping is lost on each use of reduce. If this is not desired, reduce has a second parameter: ungroup. This defaults to true, but can be set to false.
The example below is the same as the last, except that it prevents ungrouping:
purchases
  .reduce({
    time: $$.avg(p => p.time),
    rating: $$.avg(p => p.rating),
    correlation: $$.cor(p => [p.time, p.rating])
  }, false)
  .log();
This time, the result is still a one-item array:
┌──────────┬────────┬─────────────────────┐
│ time     │ rating │ correlation         │
├──────────┼────────┼─────────────────────┤
│ 16.47625 │ 73.625 │ -0.9821574166144001 │
└──────────┴────────┴─────────────────────┘
Reducing to Primitive Values:
The return value of a reduction need not be a dataset of objects having properties.
It can instead return a dataset of primitive values, or a single primitive value.
To do this, return a single reducer not wrapped in an object.
The example below averages the time property for all purchases.
purchases
  .reduce($$.avg(p => p.time))
  .log();
The result is a number:
16.47625
Reduce Syntax Limitation:
Reduce will fail if any property of the object passed to it has a value that is not itself a reducer. In the example below, either of the properties would cause a failure, because each is an arithmetic expression built from reducers rather than a reducer.
purchases
  .reduce({
    speed: $$.avg(p => p.speed) + 10,
    rating: $$.sum(p => p.rating) / $$.count(p.rating)
  })
  .get();
// This would fail
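One way to express this kind of arithmetic is to do the whole computation inside a single function of the rows, in the style of the one-parameter in-line reducer shown earlier in this section. The sketch below defines two such functions and calls them directly on a plain array. Whether they behave identically when passed to reduce is an assumption based on that earlier description; the names timePlus10 and meanRating are hypothetical.

```javascript
const rows = [
  { time: 11.5, rating: 95 },
  { time: 12.03, rating: 92 }
];

// Each function takes the whole array of rows and returns one value,
// so any arithmetic happens inside the function rather than on its result.
const timePlus10 = data =>
  data.reduce((sum, p) => sum + p.time, 0) / data.length + 10;

const meanRating = data =>
  data.reduce((sum, p) => sum + p.rating, 0) / data.length;

console.log(timePlus10(rows), meanRating(rows));
```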
window
For each row of a dataset, identifies a subdataset of relative rows, and appends columns representing the subdataset aggregations.
The parameter is an object with one or more of the following properties:
- reduce: An object with reducers as properties (see 'reduce'). This is the only required property.
- group: A function expecting a dataset row as a parameter. The output determines how rows are to be grouped to identify subdataset boundaries.
- sort: A function expecting a dataset row as a parameter. The output determines how subdataset rows are sorted.
- filter: A function expecting a dataset row as its first parameter, and returning a boolean value.
- scroll: A function expecting two integer parameters, and returning a boolean value. The first input parameter (currentIx) is the index of the current row processed in a loop. The second input parameter (compareIx) is the index of a comparison row processed in a nested loop of the subdataset. Using this parameter will impact performance due to the nested looping.
Note that use of window does not preserve the ordering of the result set.
The example below appends columns to the dataset which represent a row count (n), a time summation (timeSum), a leading rating (rating0), a running row count (nRun), and a running time summation (tRun). The totals are grouped by customerId, meaning they pertain to that given customer and no other. They are based on a sorting by time, and their analysis excludes any purchase with a price below 100. The scroll property in the second call ensures that, for each row, a subdataset of rows preceding the current row (inclusive) is considered for aggregation. This is what produces the 'run' in the 'running' totals.
purchases
.window({ // the 'standard' windowed totals
group: p => p.customerId,
sort: p => p.time,
filter: p => p.price >= 100,
reduce: {
n: $$.count(p => p),
timeSum: (accum,p) => accum + p.time,
rating0: $$.first(p => p.rating)
}
})
.window({ // the 'running' totals
group: p => p.customerId,
sort: p => p.time,
filter: p => p.price >= 100,
scroll: (currentIx,compareIx) => currentIx >= compareIx,
reduce: {
nRun: $$.count(p => p),
tRun: (accum,p) => accum + p.time
}
})
.log(null, null, 1e-8);
┌────────────┬───────┬───────┬───────┬────────┬───┬─────────┬─────────┬──────┬───────┐
│ customerId │ books │ time  │ price │ rating │ n │ timeSum │ rating0 │ nRun │ tRun  │
├────────────┼───────┼───────┼───────┼────────┼───┼─────────┼─────────┼──────┼───────┤
│ b          │ 2     │ 14.88 │ 220   │ 88     │ 4 │ 73.44   │ 88      │ 1    │ 14.88 │
│ b          │ 4     │ 16.68 │ 560   │ 73     │ 4 │ 73.44   │ 88      │ 2    │ 31.56 │
│ b          │ 4     │ 18.11 │ 330   │ 66     │ 4 │ 73.44   │ 88      │ 3    │ 49.67 │
│ b          │ 5     │ 23.77 │ 589   │ 31     │ 4 │ 73.44   │ 88      │ 4    │ 73.44 │
│ a          │ 1     │ 12.03 │ 150   │ 92     │ 3 │ 46.87   │ 92      │ 1    │ 12.03 │
│ a          │ 3     │ 13.75 │ 340   │ 90     │ 3 │ 46.87   │ 92      │ 2    │ 25.78 │
│ a          │ 5     │ 21.09 │ 401   │ 54     │ 3 │ 46.87   │ 92      │ 3    │ 46.87 │
│ a          │ 1     │ 11.5  │ 80    │ 95     │   │         │         │      │       │
└────────────┴───────┴───────┴───────┴────────┴───┴─────────┴─────────┴──────┴───────┘
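The running totals can also be traced by hand. The following is a plain JavaScript sketch, not the library's implementation: runningTotals is a hypothetical helper that filters, groups, sorts, and then uses scroll to pick each row's subdataset. Unlike window, it simply drops rows that fail the filter instead of keeping them with empty columns.

```javascript
// Plain-JavaScript sketch of running totals; it does not call fluent-data.
const rows = [
    { customerId: 'b', time: 16.68, price: 560 },
    { customerId: 'a', time: 11.50, price: 80 },
    { customerId: 'a', time: 12.03, price: 150 },
    { customerId: 'b', time: 14.88, price: 220 },
    { customerId: 'a', time: 13.75, price: 340 },
    { customerId: 'b', time: 18.11, price: 330 },
    { customerId: 'a', time: 21.09, price: 401 },
    { customerId: 'b', time: 23.77, price: 589 }
];

// Hypothetical helper mirroring the filter, group, sort and scroll steps.
function runningTotals(data, { group, sort, filter, scroll }) {
    // 1. Filter, then bucket the surviving rows by their group key.
    const groups = new Map();
    for (const row of data.filter(filter)) {
        const key = group(row);
        if (!groups.has(key)) groups.set(key, []);
        groups.get(key).push(row);
    }
    const output = [];
    for (const members of groups.values()) {
        // 2. Sort each group by the (numeric) sort key.
        members.sort((a, b) => sort(a) - sort(b));
        // 3. For every row, select its subdataset with scroll and aggregate it.
        //    This nested pass over the group is where the extra cost comes from.
        members.forEach((row, currentIx) => {
            const sub = members.filter((_, compareIx) => scroll(currentIx, compareIx));
            output.push({
                ...row,
                nRun: sub.length,
                tRun: sub.reduce((sum, r) => sum + r.time, 0)
            });
        });
    }
    return output;
}

const totals = runningTotals(rows, {
    group: p => p.customerId,
    sort: p => p.time,
    filter: p => p.price >= 100,
    scroll: (currentIx, compareIx) => currentIx >= compareIx
});

const lastA = totals.filter(r => r.customerId === 'a').pop();
console.log(lastA.nRun, lastA.tRun.toFixed(2)); // 3 46.87
```

The last row for customer 'a' agrees with the nRun and tRun columns in the table above. Replacing the scroll function with one that always returns true would give every row the full group total instead.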
dimReduce
Takes a number of numeric fields and reduces them to a smaller number of dimensions.
At a future date, this method may toggle between various implementations of principal components or factor analysis. The term 'dimReduce' was chosen as an abstract term that can encompass either of these areas. Also, please note that the testing related to this method differs from the tests of other methods. It tests broader properties of the result, as opposed to exact figures. This is because the algorithms involved are sensitive, and their results can easily shift with small changes to the implementation. This is true not just for this method in particular but for this area of mathematics in general. That's not to say there isn't room for improvement here, of course. That will come with time.
Parameters:
- csvSelector: A string of the field names that the user desires to reduce in dimension, separated by commas. Note that the goal for the future is that the user can alternatively pass in a function returning an object.
- options: Optional. An object with properties that aid in the configuration of the analysis.
'Options' has the following properties available:
- eigenArgs: Arguments passed to the eigen method when producing the eigenvalues. Default is { valueThreshold: 1e-12, vectorThreshold: 1e-4, testThreshold: 1e-3 }.
- maxDims: The maximum number of dimensions to extract. Default is null.
- minEigenVal: The minimum level an eigenvalue must be to be extracted. Default is 1.
- rotationMaxIterations: The maximum number of times to iterate when rotating. Default is 1000.
- rotationAngleThreshold: The change in angle below which rotation is considered complete. Default is 1e-8.
- attachData: A boolean indicating whether to return data and append factor scores to that data. Default is false. If true, a new property, 'data' is included in the output.
Return object:
- correlations: The correlation matrix of the original dimensions.
- eigenValues: An array of the eigenvalues produced from the correlation matrix.
- unrotated: An object with properties of the unrotated dimensions.
- rotated: An object with properties of the rotated dimensions.
- data: If attachData = true, then the original data, with the new dimension scores appended. Dimension scores are named 'dim#' where # is the dimension number of the score. Existing columns of the same name will be overwritten. If attachData = false, then this is undefined.
- log: A method that outputs the analysis details in friendly format. There are three parameters that work the same way as dataset.log() and matrix.log().
'Unrotated' and 'Rotated' outputs have the following structure:
- loadings: A matrix of the extracted dimension loadings on each original dimension
- communalities: A matrix of the extracted dimension communalities for each original dimension
- sums: A matrix of any relevant sums. Presently has the sum of the communalities.
- sumSqs: A matrix of the sum of squared loadings for each extracted dimension
- props: A matrix of the proportion of explained variance for each extracted dimension
- log: A method that outputs the loading details in friendly format
The example below demonstrates a dimension reduction of the numeric properties of the
purchases dataset. The 'minEigenVal' was set to a low value simply to produce two
dimensions in the output. Likely you will not choose such a low value in real life.
purchases.dimReduce(
'books, time, price, rating',
{ minEigenVal: 0.25 } // just for the sake of presenting more than one dimension
)
.log(null, null, 1e-4);
For guidance on how to query dimResults, call "dimResults.help" on the fluent-data object, or see the github wiki for this project
rotated:
╔══════════╤═════════╤═════════╦═════════════╤═════════════╗
║          │ dim0    │ dim1    ║ communality │ specificVar ║
╠══════════╪═════════╪═════════╬═════════════╪═════════════╣
║ books    │ 0.9716  │ 0.0243  ║ 0.9445      │ 0.0555      ║
║ time     │ 0.9584  │ -0.2758 ║ 0.9946      │ 0.0054      ║
║ price    │ 0.9359  │ 0.3281  ║ 0.9835      │ 0.0165      ║
║ rating   │ -0.9396 │ 0.3146  ║ 0.9818      │ 0.0182      ║
╠══════════╪═════════╪═════════╬═════════════╪═════════════╣
║ sums     │         │         ║ 3.9044      │ 0.0956      ║
║ sumSqs   │ 3.6212  │ 0.2832  ║             │             ║
║ propVars │ 0.9275  │ 0.0725  ║             │             ║
╚══════════╧═════════╧═════════╩═════════════╧═════════════╝
unrotated:
╔══════════╤═════════╤═════════╦═════════════╤═════════════╗
║          │ dim0    │ dim1    ║ communality │ specificVar ║
╠══════════╪═════════╪═════════╬═════════════╪═════════════╣
║ books    │ 0.9677  │ 0.0905  ║ 0.9445      │ 0.0555      ║
║ time     │ 0.975   │ -0.2098 ║ 0.9946      │ 0.0054      ║
║ price    │ 0.9113  │ 0.3911  ║ 0.9835      │ 0.0165      ║
║ rating   │ -0.9588 │ 0.2498  ║ 0.9818      │ 0.0182      ║
╠══════════╪═════════╪═════════╬═════════════╪═════════════╣
║ sums     │         │         ║ 3.9044      │ 0.0956      ║
║ sumSqs   │ 3.6369  │ 0.2676  ║             │             ║
║ propVars │ 0.9315  │ 0.0685  ║             │             ║
╚══════════╧═════════╧═════════╩═════════════╧═════════════╝
eigenValues: [ 3.63687607982, 0.26756319121, 0.08629955021, 0.00926117876 ]
correlations:
┌────────┬────────┬─────────┬─────────┬─────────┐
│        │ books  │ time    │ price   │ rating  │
├────────┼────────┼─────────┼─────────┼─────────┤
│ books  │ 1      │ 0.9245  │ 0.887   │ -0.878  │
│ time   │ 0.9245 │ 1       │ 0.8061  │ -0.9822 │
│ price  │ 0.887  │ 0.8061  │ 1       │ -0.7913 │
│ rating │ -0.878 │ -0.9822 │ -0.7913 │ 1       │
└────────┴────────┴─────────┴─────────┴─────────┘
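Some of the figures in this output can be cross-checked with a few lines of plain JavaScript. The sketch below is only an illustration of the arithmetic, not the library's internals: it assumes that extraction keeps eigenvalues at or above minEigenVal, and the helper name extract is made up for the example. The proportions it computes happen to agree with the unrotated propVars row, and the communality with the 'books' row, up to rounding of the printed loadings.

```javascript
// Eigenvalues copied from the example output above.
const eigenValues = [3.63687607982, 0.26756319121, 0.08629955021, 0.00926117876];

// Hypothetical helper: keep only eigenvalues at or above a threshold
// (assumed to be how minEigenVal is applied).
const extract = (values, minEigenVal) => values.filter(v => v >= minEigenVal);

const kept = extract(eigenValues, 0.25);    // the setting used in the example
const keptDefault = extract(eigenValues, 1); // the documented default

// Share of the extracted variance carried by each kept dimension.
const keptSum = kept.reduce((a, b) => a + b, 0);
const props = kept.map(v => v / keptSum);

// Communality of a variable: the sum of its squared loadings.
const booksLoadings = [0.9677, 0.0905]; // 'books' row of the unrotated table
const communality = booksLoadings.reduce((sum, l) => sum + l * l, 0);

console.log(kept.length, keptDefault.length); // 2 1
console.log(props.map(p => p.toFixed(4)));    // [ '0.9315', '0.0685' ]
console.log(communality.toFixed(3));          // 0.945
```

With the default threshold of 1, only a single dimension would have been extracted, which is why the example lowers minEigenVal.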
regress
Returns an analysis of a multiple regression of one dependent variable on one or more independent variables.
Parameters:
- ivSelector: Determines which columns are the independent variables. Use a comma-separated string of property names.
- dvSelector: Determines which column is the dependent variable. Can be the name of the column.
- options: An object with properties that configure the analysis and output.
The options parameter recognizes the following properties:
- attachData: boolean, default = false. If true, the 'data' property returned in the output will include actual, estimated, and residual values when applying the regression model to each row.
- ci: number, default = undefined. This parameter, if set, should be the quantile desired for the confidence interval around each regression coefficient. It will return a two-item array representing the lower and upper bounds, respectively. If not set, then the ci property for each coefficient will instead be a function that expects a quantile as input and will output such an array.
Return Object:
regress returns an object with the following properties:
- coefficients: A dataset containing properties of the regression coefficients. There is one row for each coefficient. Each row is an object with the following properties: name, value, stdErr, t, df, pVal, ci.
- model: an object with the following properties: rSquared, rSquaredAdj, F, pVal. If attachData = true, then also breuchPagan and breuchPaganPval.
- data: if attachData = true, then this is the original data with estimates appended. The estimate properties are 'estimate', 'actual', and 'residual'. Fields in the original data with these names will be overwritten. If attachData = false, this is undefined.
- log: a method to display the output described above in friendly form. There are three parameters that work the same way as dataset.log() and matrix.log().
Example:
This example runs a regression on the 'purchases' dataset.
let regression =
purchases.regress(
'books, time',
'rating',
{ci: 0.95, maxDigits: 4, attachData: true }
);
regression.log(null, 'Regression Objects:', 1e-6);
regression.data.log(null, '\r\nRegression data:', 1e-6);
-----------------------------------
Regression Objects:
For guidance on how to query regress, call "regress.help" on the fluent-data object, or see the github wiki for this project
coefficients:
┌───────────┬────────────┬──────────┬───────────┬────┬──────────┬───────────────────────┐
│ name      │ value      │ stdErr   │ t         │ df │ pVal     │ ci                    │
├───────────┼────────────┼──────────┼───────────┼────┼──────────┼───────────────────────┤
│ intercept │ 164.851565 │ 9.869433 │ 16.703246 │ 5  │ 0.000997 │ 139.481381,190.221749 │
│ books     │ 2.821533   │ 2.740061 │ 1.029733  │ 5  │ 0.483385 │ -4.222019,9.865085    │
│ time      │ -6.072004  │ 1.037281 │ -5.85377  │ 5  │ 0.020076 │ -8.73842,-3.405588    │
└───────────┴────────────┴──────────┴───────────┴────┴──────────┴───────────────────────┘
model: {
rSquared: 0.970821,
rSquaredAdj: 0.95915,
F: 83.17851,
pVal: 0.000145,
breuchPagan: 2.12636,
breuchPaganPval: 0.345356
}
Note: Data has been output with estimates attached. Query "data" on the return object to get it.
-----------------------------------
Regression data:
┌────────────┬───────┬───────┬───────┬────────┬───────────┬────────┬───────────┐
│ customerId │ books │ time  │ price │ rating │ estimate  │ actual │ residual  │
├────────────┼───────┼───────┼───────┼────────┼───────────┼────────┼───────────┤
│ b          │ 4     │ 16.68 │ 560   │ 73     │ 74.85667  │ 73     │ -1.85667  │
│ a          │ 1     │ 11.5  │ 80    │ 95     │ 97.845052 │ 95     │ -2.845052 │
│ a          │ 1     │ 12.03 │ 150   │ 92     │ 94.62689  │ 92     │ -2.62689  │
│ b          │ 2     │ 14.88 │ 220   │ 88     │ 80.143212 │ 88     │ 7.856788  │
│ a          │ 3     │ 13.75 │ 340   │ 90     │ 89.826109 │ 90     │ 0.173891  │
│ b          │ 4     │ 18.11 │ 330   │ 66     │ 66.173705 │ 66     │ -0.173705 │
│ a          │ 5     │ 21.09 │ 401   │ 54     │ 50.900666 │ 54     │ 3.099334  │
│ b          │ 5     │ 23.77 │ 589   │ 31     │ 34.627695 │ 31     │ -3.627695 │
└────────────┴───────┴───────┴───────┴────────┴───────────┴────────┴───────────┘
The breuchPagan properties report a Breusch-Pagan test, which checks whether the variance of the residuals is uniform (homoscedastic).
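The relationship between the coefficients table and the attached data can be checked by hand. The sketch below is plain JavaScript that reuses the printed figures; it does not call the library. The critical t value is hard-coded (the two-sided 95% value for 5 degrees of freedom) rather than computed, and the variable names are only for illustration.

```javascript
// Coefficients copied from the example output above.
const coef = { intercept: 164.851565, books: 2.821533, time: -6.072004 };

// Apply the fitted model to one row.
const predict = row => coef.intercept + coef.books * row.books + coef.time * row.time;

const row = { books: 4, time: 16.68, rating: 73 }; // first row of purchases
const estimate = predict(row);
const residual = row.rating - estimate; // actual minus estimate

// Confidence interval for a coefficient: value +/- tCrit * stdErr.
// tCrit is the two-sided 95% critical value of Student's t with df = 5,
// hard-coded here instead of being derived from a distribution function.
const tCrit = 2.570582;
const interceptStdErr = 9.869433;
const interceptCi = [
    coef.intercept - tCrit * interceptStdErr,
    coef.intercept + tCrit * interceptStdErr
];

// Adjusted R-squared for n = 8 rows and k = 2 independent variables.
const rSquaredAdj = 1 - (1 - 0.970821) * (8 - 1) / (8 - 2 - 1);

console.log(estimate.toFixed(4));                // 74.8567
console.log(residual.toFixed(4));                // -1.8567
console.log(interceptCi.map(v => v.toFixed(2))); // [ '139.48', '190.22' ]
console.log(rSquaredAdj.toFixed(4));             // 0.9591
```

Each figure agrees with the corresponding value in the output above, to the precision of the printed inputs.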
standardize
Converts values into standard (z) scores.
Parameters:
- obj: An object of functions that accept a dataset row as a parameter and output a number. This parameter is required.
- isSample: A boolean indicating whether the sample or population standard deviation should be used to calculate the z-scores. This parameter is optional; the default is 'false'.
The example below takes the purchases dataset, removes some fields, and calculates standard scores for the time and rating fields.
purchases
.map(p => ({ ...p, books: undefined, price: undefined}))
.standardize({
zTime: p => p.time,
zRating: p => p.rating
})
.log();
┌────────────┬───────┬────────┬──────────────────────┬───────────────────────┐
│ customerId │ time  │ rating │ zTime                │ zRating               │
├────────────┼───────┼────────┼──────────────────────┼───────────────────────┤
│ b          │ 16.68 │ 73     │ 0.050215204428030694 │ -0.029753999243460543 │
│ a          │ 11.5  │ 95     │ -1.2264216492514772  │ 1.0175867741263505    │
│ a          │ 12.03 │ 92     │ -1.0958005039908327  │ 0.87476757775774      │
│ b          │ 14.88 │ 88     │ -0.39340377947604516 │ 0.6843419825995924    │
│ a          │ 13.75 │ 90     │ -0.671897919371382   │ 0.7795547801786662    │
│ b          │ 18.11 │ 66     │ 0.4026458416407133   │ -0.3629987907702186   │
│ a          │ 21.09 │ 54     │ 1.1370817149930172   │ -0.9342755762446611   │
│ b          │ 23.77 │ 31     │ 1.7975810910279748   │ -2.029222748404009    │
└────────────┴───────┴────────┴──────────────────────┴───────────────────────┘
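For reference, a z-score is the value minus the mean, divided by the standard deviation. The following plain JavaScript sketch (zScores is a hypothetical helper, not part of the library) reproduces the zTime column above using the population standard deviation, which corresponds to isSample being false.

```javascript
// The 'time' values of the purchases dataset, in their original order.
const times = [16.68, 11.5, 12.03, 14.88, 13.75, 18.11, 21.09, 23.77];

// Hypothetical helper: convert values to z-scores.
// isSample = true divides the squared deviations by n - 1 instead of n.
function zScores(values, isSample = false) {
    const n = values.length;
    const mean = values.reduce((a, b) => a + b, 0) / n;
    const sumSq = values.reduce((sum, v) => sum + (v - mean) ** 2, 0);
    const sd = Math.sqrt(sumSq / (isSample ? n - 1 : n));
    return values.map(v => (v - mean) / sd);
}

const zTime = zScores(times);
const zTimeSample = zScores(times, true);

console.log(zTime[0].toFixed(4)); // 0.0502
console.log(zTime[7].toFixed(4)); // 1.7976
```

Passing true as the second argument uses the larger sample standard deviation, so every score moves slightly closer to zero.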
A dataset itself is more like a processor than a final set of data. But ultimately you will want to print out the data or read out its real state into a variable. Methods in this section describe how to do that.
get
get processes the stored commands of a dataset and returns an array of objects. The get()
function will omit any records or properties that are undefined.
Accepts an optional one-parameter function that invokes a mapping before producing the array. If omitted, the full results of the dataset will be returned.
The example below converts numeric rating values to descriptive labels, then outputs the state of data as an array.
let result =
purchases
.get(p => ({
customerId: p.customerId,
rating: p.rating,
flag: p.rating < 60 ? 'bad' : p.rating < 90 ? 'okay' : 'good'
}));
console.log(result);
[
{ customerId: 'b', rating: 73, flag: 'okay' },
{ customerId: 'a', rating: 95, flag: 'good' },
{ customerId: 'a', rating: 92, flag: 'good' },
{ customerId: 'b', rating: 88, flag: 'okay' },
{ customerId: 'a', rating: 90, flag: 'good' },
{ customerId: 'b', rating: 66, flag: 'okay' },
{ customerId: 'a', rating: 54, flag: 'bad' },
{ customerId: 'b', rating: 31, flag: 'bad' },
key: 'null'
]
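The omission of undefined records and properties can be pictured with a small plain JavaScript sketch. dropUndefined is a hypothetical helper written for illustration; how get() does this internally is not shown on this page.

```javascript
// Hypothetical helper, not the library's implementation:
// drop undefined rows, then drop undefined properties from each remaining row.
function dropUndefined(rows) {
    return rows
        .filter(row => row !== undefined)
        .map(row => Object.fromEntries(
            Object.entries(row).filter(([, value]) => value !== undefined)
        ));
}

const cleaned = dropUndefined([
    { customerId: 'b', books: undefined, rating: 73 },
    undefined,
    { customerId: 'a', books: undefined, rating: 95 }
]);

console.log(JSON.stringify(cleaned));
// [{"customerId":"b","rating":73},{"customerId":"a","rating":95}]
```

This is the same effect relied upon in the standardize example, where books and price are mapped to undefined in order to remove them from the output.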
log
Outputs data to the console in a friendly table form that converts row property names to headers. After logging, the method returns the dataset that called it so that the fluent syntax chain can continue.
Parameters:
- element: An html element to print to. Use the same selector syntax as you would use in document.querySelector(). Be sure the element is one to which it makes sense to append a div. If null (default), prints to the console.
- caption: A string representing a title for the output. If null (default), does not print out a caption.
- mapper: A function giving a final mapping of the rows before output. If null (default), x => x is ultimately passed. Alternatively, a number representing the multiple to which output should round numbers.
- limit: An integer (default = 50). The maximum number of rows to be printed.
purchases
.log(null, 'pre-mapped')
.map(p => ({
customerId: p.customerId,
time: p.time,
rating: p.rating,
flag: p.rating < 60 ? 'bad' : p.rating < 90 ? 'okay' : 'good'
}))
.log(null, 'post-mapped', 1); // '1' indicates rounding to multiple of 1 (integer)
pre-mapped
┌────────────┬───────┬───────┬───────┬────────┐
│ customerId │ books │ time  │ price │ rating │
├────────────┼───────┼───────┼───────┼────────┤
│ b          │ 4     │ 16.68 │ 560   │ 73     │
│ a          │ 1     │ 11.5  │ 80    │ 95     │
│ a          │ 1     │ 12.03 │ 150   │ 92     │
│ b          │ 2     │ 14.88 │ 220   │ 88     │
│ a          │ 3     │ 13.75 │ 340   │ 90     │
│ b          │ 4     │ 18.11 │ 330   │ 66     │
│ a          │ 5     │ 21.09 │ 401   │ 54     │
│ b          │ 5     │ 23.77 │ 589   │ 31     │
└────────────┴───────┴───────┴───────┴────────┘
post-mapped
┌────────────┬──────┬────────┬──────┐
│ customerId │ time │ rating │ flag │
├────────────┼──────┼────────┼──────┤
│ b          │ 17   │ 73     │ okay │
│ a          │ 12   │ 95     │ good │
│ a          │ 12   │ 92     │ good │
│ b          │ 15   │ 88     │ okay │
│ a          │ 14   │ 90     │ good │
│ b          │ 18   │ 66     │ okay │
│ a          │ 21   │ 54     │ bad  │
│ b          │ 24   │ 31     │ bad  │
└────────────┴──────┴────────┴──────┘
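Rounding to a multiple, as the numeric form of the mapper parameter does, can be sketched in one line of plain JavaScript. roundToMultiple is a hypothetical helper; the library's own rounding code is not shown on this page and may differ in its details.

```javascript
// Hypothetical helper: round a value to the nearest multiple.
const roundToMultiple = (value, multiple) => Math.round(value / multiple) * multiple;

console.log(roundToMultiple(16.68, 1)); // 17
console.log(roundToMultiple(11.5, 1));  // 12
console.log(roundToMultiple(23.77, 5)); // 25
```

The first two results match the time column of the post-mapped table. With fractional multiples such as 1e-4, a naive version like this one can leave floating-point noise in the last digits, so a real implementation would likely need to trim the result when printing.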
toJsonString
Converts a dataset to a JSON string.
The method allows two optional parameters, replacer and space, which are passed
to the underlying JSON.stringify function and so work the same way.
This method is designed to be invertible by using fromJson(). If the structure of
a dataset changes in the future, such as by adding another property, your code can
break if you're not accounting for it. Using toJsonString() and fromJson() keeps
your code more stable.
The example below manipulates students and then converts it to JSON, with some prettification.
let json =
$$(students)
.map(c => ({...c, initial: c.name.substring(0,1)}))
.toJsonString(null, 4);
console.log(json);
[
{
"id": "a",
"name": "Andrea",
"topic": "Abelard",
"bias": "analytic",
"initial": "A"
},
{
"id": "b",
"name": "Brielle",
"topic": "Bentham",
"bias": "buddhist",
"initial": "B"
}
]
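Because replacer and space are handed to JSON.stringify, their behavior can be explored with JSON.stringify alone. The sketch below uses a plain array in place of a dataset, so it shows only the standard JavaScript behavior, not anything specific to toJsonString.

```javascript
// A plain array standing in for the rows of the students dataset.
const students = [
    { id: 'a', name: 'Andrea', topic: 'Abelard', bias: 'analytic' },
    { id: 'b', name: 'Brielle', topic: 'Bentham', bias: 'buddhist' }
];

// space = 4 indents each nesting level by four spaces.
const pretty = JSON.stringify(students, null, 4);

// An array replacer keeps only the listed property names.
const idsAndNames = JSON.stringify(students, ['id', 'name']);

// A function replacer can transform values; returning undefined drops a property.
const noBias = JSON.stringify(students, (key, value) => key === 'bias' ? undefined : value);

console.log(idsAndNames);
// [{"id":"a","name":"Andrea"},{"id":"b","name":"Brielle"}]
```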
fromJson
Creates a dataset from a JSON object or from a JSON string.
This method is designed to be used in tandem with toJsonString.
This example takes a JSON string, instantiates a dataset with it, then manipulates the data a bit.
let json = `[
{
"id": "a",
"name": "Andrea",
"topic": "Abelard",
"bias": "analytic",
"initial": "A"
},
{
"id": "b",
"name": "Brielle",
"topic": "Bentham",
"bias": "buddhist",
"initial": "B"
}
]`;
$$.dataset.fromJson(json)
.map(s => ({ ...s, initial: s.initial.toLowerCase() }))
.log();
┌────┬─────────┬─────────┬──────────┬─────────┐
│ id │ name    │ topic   │ bias     │ initial │
├────┼─────────┼─────────┼──────────┼─────────┤
│ a  │ Andrea  │ Abelard │ analytic │ a       │
│ b  │ Brielle │ Bentham │ buddhist │ b       │
└────┴─────────┴─────────┴──────────┴─────────┘
However, keep in mind that you will rarely want to instantiate from a hand-written
JSON string. Rather, you'll want to work with a string created by
dataset.toJsonString(), likely because the data had to be transferred through
a layer.
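The round trip through a string can be pictured with the standard JSON functions alone. This sketch uses plain arrays rather than datasets, so it only illustrates what a JSON layer preserves and what it loses; the variable names are made up for the example.

```javascript
const original = [
    { id: 'a', name: 'Andrea', initial: 'A' },
    { id: 'b', name: 'Brielle', initial: 'B' }
];

const wire = JSON.stringify(original); // the string that crosses the layer
const rebuilt = JSON.parse(wire);      // what the other side reconstructs

console.log(rebuilt[1].name);      // Brielle
console.log(rebuilt === original); // false, the result is a copy

// Properties holding undefined or functions do not survive the trip.
const lossy = JSON.parse(JSON.stringify({ kept: 1, dropped: undefined, fn: () => 1 }));
console.log(Object.keys(lossy)); // [ 'kept' ]
```

Anything a consumer needs on the far side therefore has to be representable as JSON data: strings, numbers, booleans, null, arrays, and plain objects.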
Guide: Server and Client Communication
To get dataset instances from the client to the server and back, use toJsonString() and fromJson().
Imagine a JSON sender:
// _jsonSender.r.js
async function serve(req, res) {
let data = await sample(); // gets some sample datasets.
let json = $$(data.customers).toJsonString();
res.writeHead(200, 'ok');
res.end(json);
}
It can be fetched and rebuilt on the client quite easily:
let ds = await
fetch('./_jsonSender.r.js')
.then(resp => $$.dataset.fromJson(resp));
Imagine a JSON receiver:
// _jsonReciever.r.js
async function serve(req, res) {
let json = '';
req.on('data', chunk => json += chunk);
req.on('end', () => {
let ds = $$.dataset.fromJson(json);
res.writeHead(200, 'ok');
res.end(
(ds.get().length > 0)
? 'got it'
: 'nope'
);
});
}
It can be posted to quite easily:
return await
fetch('./_jsonReciever.r.js', { body: json, method: 'post' })
.then(resp => resp.text())
.then(result => {
if (result !== 'got it')
throw `${prefix}: test did not pass on server.`;
});