Reserve JavaScript objects and object arrays for R data.frame objects (#401)

* Make RList.isDataFrame a method

* Create RClass type for additional helper constructors

This commit generalises the type union `RType | 'object'` used when
constructing new R objects. We create a new `RClass` type for `object`,
(corresponding to the generic `RObject` helper constructor).

This will become useful in a moment when we add a new `data.frame` helper
constructor. Rather than an ever growing type union of R class names,
we'll have a neater `RClass` type.

* Add RDataFrame helper class constructor

* Add tests for RDataFrame construction

* Update documentation for data.frame construction

* Update NEWS.md

* Fix newRClassProxy documentation

* Define names using second arg in RList constructor
georgestagg authored Apr 4, 2024
1 parent a06b0bb commit 442e5ac
Showing 10 changed files with 236 additions and 120 deletions.
6 changes: 6 additions & 0 deletions NEWS.md
@@ -4,8 +4,14 @@

* The `captureGraphics` option in `EvalROptions` now allows the caller to set the arguments to be passed to the capturing `webr::canvas()` device.

* A subclass `RDataFrame` is now available for explicit construction of an R object with class `data.frame`. `RDataFrame` extends the `RList` class, and construction must be with data that can be coerced into an R `data.frame`, otherwise an error is thrown.

* The `RList` constructor now takes an optional second argument to define names when constructing a list. The argument should be an array of strings, or `null` for an unnamed list (the default).
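The names rule for the `RList` constructor can be sketched in plain TypeScript. This is a minimal illustration, not webR's implementation: `pairWithNames` and `NamedEntry` are hypothetical names, and only the error message prefix is taken from the tests in this commit. It assumes, as those tests suggest, that `null` names give an unnamed list, duplicate names are accepted, and a names array of the wrong length is rejected.

```typescript
// Hypothetical helper: pairs list values with optional names, modelling the
// documented behaviour of the second argument to the `RList` constructor.
type NamedEntry = { name: string | null; value: unknown };

function pairWithNames(values: unknown[], names: string[] | null = null): NamedEntry[] {
  if (names !== null && names.length !== values.length) {
    // Only the prefix of this message comes from the commit's tests;
    // the remainder is an assumption made for this sketch.
    throw new Error("Can't construct named `RList`: names and values differ in length.");
  }
  return values.map((value, i) => ({ name: names === null ? null : names[i], value }));
}

// Unnamed list (the default), named list, and a list with duplicate names.
const unnamed = pairWithNames([[1, 2, 3], ['x', 'y', 'z']]);
const named = pairWithNames([[1, 2, 3], ['x', 'y', 'z']], ['a', 'b']);
const duplicated = pairWithNames([[1, 2, 3], ['x', 'y', 'z'], 7], ['foo', 'foo', 'bar']);
```

With webR itself, the corresponding calls in the tests are `new webR.RList(jsArray)` and `new webR.RList(jsArray, names)`.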

## Breaking changes

* When using the generic `RObject` constructor, JavaScript objects and object arrays are now reserved for constructing an R `data.frame`. To create a standard R list, use the `RList` constructor directly.
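Because plain JavaScript objects passed to the generic constructor are now treated as `data.frame` input, callers may want to check the shape of their data first. The sketch below is a hypothetical pre-flight check written for illustration; `assertDataFrameShape` and its error messages are assumptions, not part of webR, and it ignores `RObject` references and complex values, which webR also accepts.

```typescript
// Hypothetical pre-flight check, not part of webR: verifies that every
// property is an array of atomic-compatible values and that all columns
// have the same length.
type AtomicValue = null | boolean | number | string;
type ColumnTable = { [column: string]: AtomicValue[] };

function isAtomicValue(v: unknown): v is AtomicValue {
  return v === null || typeof v === 'boolean' || typeof v === 'number' || typeof v === 'string';
}

function assertDataFrameShape(obj: { [column: string]: unknown }): ColumnTable {
  let rowCount = -1;
  for (const col of Object.keys(obj)) {
    const values = obj[col];
    if (!Array.isArray(values) || !values.every(isAtomicValue)) {
      throw new Error(`Column "${col}" is not an array of atomic values.`);
    }
    if (rowCount === -1) rowCount = values.length; // first column sets the row count
    if (values.length !== rowCount) {
      throw new Error(`Column "${col}" has ${values.length} rows, expected ${rowCount}.`);
    }
  }
  return obj as ColumnTable;
}

// Passes: three columns of equal length.
const eligible = assertDataFrameShape({ a: [1, 2, 3], b: [3, 4, 5], c: ['x', 'y', 'z'] });
```

Data that fails a check like this is a candidate for `new webR.RList(...)` instead, as the entry above recommends for standard R lists.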

## Bug Fixes

* When capturing graphics with `captureR()`, clean-up now occurs even when the evaluated R code throws an error. This avoids leaking graphics devices on the device stack.
46 changes: 23 additions & 23 deletions src/docs/convert-js-to-r.qmd
@@ -8,7 +8,7 @@ Once webR has been loaded into a web page, new R objects can be created from the

## Creating new R objects

New R objects can be created from the main JavaScript thread by using the `new` operator with proxy classes on an initialised instance of the [`WebR`](api/js/classes/WebR.WebR.md) class.
New R objects can be created from the main JavaScript thread by using the `new` operator with proxy classes on an initialised instance of the [`WebR`](api/js/classes/WebR.WebR.md) class. The `RObject` proxy class can be used as a general constructor for creating new R objects from a given JavaScript argument.

When an R object is instantiated in this way, webR communicates with the worker thread to orchestrate object creation in WebAssembly memory. As such, new R objects can only be created once communication with the worker thread has been established and the promise returned by [`WebR.init()`](api/js/classes/WebR.WebR.md#init) has resolved.

@@ -47,38 +47,36 @@ Sometimes webR constructs new R objects implicitly behind the scenes. For instan

### Constructing R objects from JavaScript objects

The [`WebR` proxy classes](api/js/classes/WebR.WebR.md#properties) take a single JavaScript argument in their constructor functions which will be used for the content of the new R object. JavaScript objects that are able to be converted for use as R objects have type [`WebRData`](api/js/modules/RObject.md#webrdata).

The resulting R object type is chosen based on the contents of the JavaScript argument provided. When there is ambiguity, the following conversion rules are used,

| Constructor Argument | R Type |
| ------------------------------------------------------------------------ | --------------------------------------------------------------------------- |
| `null` | Logical `NA` |
| `boolean` | Logical atomic vector |
| `number` | Double atomic vector |
| `{ re: 1, im: 2 }` | Complex atomic vector |
| `string` | Character atomic vector |
| `TypedArray`, `ArrayBuffer`, `ArrayBufferView` | Raw atomic vector |
| `Array` | A vector or list of type following the coercion rules of R's `c()` function |
| `RObject` | Given by the type of the referenced R object |
| `{a: [...], b: [...], ...}` | R list object, possibly in the form of a `data.frame` |
| `[{a: 0, b: 'x'}, {a: 1, b: 'y'}, ...]` | R list object in the form of a `data.frame` |
| [`WebRDataJs`](convert-r-to-js.qmd#serialising-r-objects) | Given by the `type` property in the provided object |
| Other JavaScript object | Reserved for future use |
When using the generic `RObject` constructor the resulting R object type is chosen based on the contents of the JavaScript argument provided. Where there is ambiguity, the following conversion rules are used,

| Constructor Argument | R Type |
|-----------------------------------------------------------|-------------------------------------------------------------|
| `null` | Logical `NA` |
| `boolean` | Logical atomic vector |
| `number` | Double atomic vector |
| `{ re: 1, im: 2 }` | Complex atomic vector |
| `string` | Character atomic vector |
| `TypedArray`, `ArrayBuffer`, `ArrayBufferView` | Raw atomic vector |
| `Array` | A vector following the coercion rules of R's `c()` function |
| `{a: [0, 1], b: ['x', 'y']}` | Data frame |
| `[{a: 0, b: 'x'}, {a: 1, b: 'y'}]` | Data frame |
| `RObject` | Given by the type of the referenced R object |
| [`WebRDataJs`](convert-r-to-js.qmd#serialising-r-objects) | Given by the `type` property in the provided object |
| Other JavaScript type | Reserved for future use |
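The rules in the table can be read as a dispatch on the JavaScript value. The following sketch models that dispatch in plain TypeScript for illustration only: `describeRType` is a hypothetical function that returns a description string, it is not how webR implements conversion, and it leaves out `RObject` references and `WebRDataJs` objects. A `data.frame` result here means construction would be attempted; ineligible data is rejected by webR with an error.

```typescript
// Hypothetical classifier modelling the conversion table. It is a reading
// aid, not webR's implementation.
type Described =
  | 'logical NA' | 'logical' | 'double' | 'complex' | 'character'
  | 'raw' | 'vector (c() coercion)' | 'data.frame' | 'reserved';

function isPlainRecord(v: unknown): v is { [key: string]: unknown } {
  return typeof v === 'object' && v !== null && !Array.isArray(v)
    && !(v instanceof ArrayBuffer) && !ArrayBuffer.isView(v);
}

// True for objects shaped exactly like { re: number, im: number }.
function isComplexLike(v: unknown): boolean {
  if (!isPlainRecord(v)) return false;
  const keys = Object.keys(v).sort();
  return keys.length === 2 && keys[0] === 'im' && keys[1] === 're'
    && typeof v['im'] === 'number' && typeof v['re'] === 'number';
}

function describeRType(v: unknown): Described {
  if (v === null) return 'logical NA';
  if (typeof v === 'boolean') return 'logical';
  if (typeof v === 'number') return 'double';
  if (typeof v === 'string') return 'character';
  if (v instanceof ArrayBuffer || ArrayBuffer.isView(v)) return 'raw';
  if (isComplexLike(v)) return 'complex';
  if (Array.isArray(v)) {
    // An array made up only of row-like objects is treated as data.frame input.
    const rows = v.length > 0 && v.every((x) => isPlainRecord(x) && !isComplexLike(x));
    return rows ? 'data.frame' : 'vector (c() coercion)';
  }
  if (isPlainRecord(v)) return 'data.frame';
  return 'reserved';
}
```

The order of the checks matters in this sketch: the complex-number shape is tested before the general object case, so `{ re: 1, im: 2 }` is not mistaken for a one-row table.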

#### Further details

For JavaScript objects with a collection of properties, the above rules will be applied recursively to construct an R list with named components corresponding to each property.
JavaScript objects (in "long" form) and JavaScript object arrays (in "wide" or "D3" form), as shown in the table above, are reserved for constructing R data frames. The object properties should correspond to data with columns of equal length and consistent data type, and values should be compatible with R atomic vectors (or `null`, indicating a missing value).
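The two input forms carry the same information, which a small conversion makes concrete. This is an illustrative sketch under stated assumptions: `rowsToColumns` is a hypothetical helper, not a webR function, and it requires every row to have exactly the same set of properties, which is stricter than anything the documentation above states.

```typescript
// Hypothetical helper, not part of webR: converts an object array (the
// "wide" or "D3" form) into a single object of column arrays (the "long" form).
type Cell = null | boolean | number | string;
type RowRecord = { [column: string]: Cell };
type ColumnRecord = { [column: string]: Cell[] };

function rowsToColumns(rows: RowRecord[]): ColumnRecord {
  const out: ColumnRecord = {};
  if (rows.length === 0) return out;
  const columns = Object.keys(rows[0]);
  // Every row must carry the same properties, otherwise columns would be ragged.
  for (const row of rows) {
    const keys = Object.keys(row);
    const sameKeys = keys.length === columns.length
      && columns.every((c) => keys.indexOf(c) !== -1);
    if (!sameKeys) throw new Error('Rows do not share the same set of properties.');
  }
  for (const col of columns) {
    out[col] = rows.map((row) => row[col]);
  }
  return out;
}

const wideForm = [{ a: 0, b: 'x' }, { a: 1, b: 'y' }];
const longForm = rowsToColumns(wideForm);
// longForm is { a: [0, 1], b: ['x', 'y'] }
```

Both `wideForm` and `longForm` match the two data frame rows of the conversion table, so either could be passed to the generic `RObject` constructor.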

If each property of the JavaScript object is an `Array`, all of equal length, all containing values compatible with R atomic vectors (or `null`, indicating a missing value), the resulting R object will be automatically^[This coercion may be avoided by constructing an `RList` object directly.] coerced into an R [`data.frame`](https://stat.ethz.ch/R-manual/R-devel/library/base/html/data.frame.html).
The creation of a `data.frame` can be avoided by explicitly constructing an `RList` object, as shown in the next section.

When `RObject` references are used for constructing new R objects, no underlying copy is made. The resulting R object reference will point to the same memory location.

### Creating an R object with specific type

As an alternative to constructing an [`WebRDataJs`](api/js/modules/RObject.md#webrdatajs) with an explicit `type` property, several class proxies of different type are available on the [`WebR`](api/js/classes/WebR.WebR.md) instance. For example, the [`WebR.RList`](api/js/classes/WebR.WebR.md#rlist) class proxy can be used to specifically construct an R list object, rather than an atomic vector, using a JavaScript array of values.
Class constructors for various types of R object are also available on the [`WebR`](api/js/classes/WebR.WebR.md) instance. For example, the [`WebR.RList`](api/js/classes/WebR.WebR.md#rlist) constructor can be used to specifically create an R list object from a given JavaScript argument, rather than the atomic vector or `data.frame` that would be created when using the generic `RObject` constructor.

To see how this can be useful, consider the difference in structure between the following two R object construction examples, using the same JavaScript object as the constructor argument,
To see how this can be useful, consider the difference in structure between the following two examples, using the same JavaScript object as the constructor argument,

``` javascript
let foo = await new webR.RObject([123, 'abc']);
@@ -101,6 +99,8 @@ await foo.toJs();
]
}

It is recommended that specific R object constructors are used, rather than relying on the conversion rules of the generic `RObject` constructor, when webR is used non-interactively or in production.

### Creating objects using `RObject` references

An [`RObject`](api/js/classes/RWorker.RObject.md) can be used as part of R object construction, either on its own, included in a JavaScript array, or as the values in an [`WebRDataJs`](api/js/modules/RObject.md#webrdatajs).
85 changes: 80 additions & 5 deletions src/tests/webR/webr-main.test.ts
@@ -609,7 +609,7 @@ describe('Create R objects from JS objects using proxy constructors', () => {
describe('Create R lists from JS objects', () => {
test('Create an R list from basic JS object', async () => {
const jsObj = { a: [1, 2], b: [3, 4, 5], c: ['x', 'y', 'z'] };
const rObj = await new webR.RObject(jsObj) as RList;
const rObj = await new webR.RList(jsObj);
expect(await rObj.type()).toEqual('list');
expect(await rObj.names()).toEqual(['a', 'b', 'c']);
const a = await rObj.get('a') as RDouble;
@@ -620,9 +620,47 @@ describe('Create R lists from JS objects', () => {
expect(await c.toArray()).toEqual(jsObj.c);
});

test('Create an unnamed R list from JS array', async () => {
const jsArray = [[1, 2, 3], ['x', 'y', 'z']];
const rObj = await new webR.RList(jsArray);
expect(await rObj.type()).toEqual('list');
expect(await rObj.names()).toEqual(null);
const foo = await rObj.get(1) as RDouble;
const bar = await rObj.get(2) as RCharacter;
expect(await foo.toArray()).toEqual(jsArray[0]);
expect(await bar.toArray()).toEqual(jsArray[1]);
});

test('Create a named R list from values and names arrays', async () => {
const jsArray = [[1, 2, 3], ['x', 'y', 'z']];
const names = ["a", "b"];
const rObj = await new webR.RList(jsArray, names);
expect(await rObj.type()).toEqual('list');
expect(await rObj.names()).toEqual(names);
const foo = await rObj.get(1) as RDouble;
const bar = await rObj.get(2) as RCharacter;
expect(await foo.toArray()).toEqual(jsArray[0]);
expect(await bar.toArray()).toEqual(jsArray[1]);
});

test('Create a named R list with duplicate names', async () => {
const jsArray = [[1, 2, 3], ['x', 'y', 'z'], 7];
const names = ["foo", "foo", "bar"];
const rObj = await new webR.RList(jsArray, names);
expect(await rObj.type()).toEqual('list');
expect(await rObj.names()).toEqual(names);
});

test('Reject a named R list with inconsistent names length', async () => {
const jsArray = [[1, 2, 3], ['x', 'y', 'z']];
const names = ["a"];
const rObj = new webR.RList(jsArray, names);
await expect(rObj).rejects.toThrow("Can't construct named `RList`");
});

test('Create an R list from JS object with coercion and missing values', async () => {
const jsObj = { a: [0, true], b: [null, 4, '5'], c: [null] };
const rObj = await new webR.RObject(jsObj) as RList;
const rObj = await new webR.RList(jsObj);
expect(await rObj.type()).toEqual('list');
expect(await rObj.names()).toEqual(['a', 'b', 'c']);
const a = await rObj.get('a') as RDouble;
@@ -640,7 +678,7 @@ describe('Create R lists from JS objects', () => {

test('Create an R list from JS object with R TypedArray', async () => {
const jsObj = { a: [1, 2], b: new Uint8Array([3, 4, 5]), c: new Uint8Array([6, 7, 8]).buffer };
const rObj = await new webR.RObject(jsObj) as RList;
const rObj = await new webR.RList(jsObj);
expect(await rObj.type()).toEqual('list');
expect(await rObj.names()).toEqual(['a', 'b', 'c']);
const a = await rObj.get('a') as RDouble;
@@ -653,7 +691,7 @@ describe('Create R lists from JS objects', () => {

test('Create an R list from JS object with R object references', async () => {
const jsObj = { a: webR.objs.true, b: [1, webR.objs.na, 3], c: webR.objs.globalEnv };
const rObj = await new webR.RObject(jsObj) as RList;
const rObj = await new webR.RList(jsObj);
expect(await rObj.type()).toEqual('list');
expect(await rObj.names()).toEqual(['a', 'b', 'c']);
const a = await rObj.get('a') as RLogical;
@@ -672,7 +710,7 @@ describe('Create R lists from JS objects', () => {
describe('Create R data.frame from JS objects', () => {
test('Create an R data.frame from basic JS object', async () => {
const jsObj = { a: [1, 2, 3], b: [3, 4, 5], c: ['x', 'y', 'z'] };
const rObj = await new webR.RObject(jsObj) as RList;
const rObj = await new webR.RObject(jsObj);
expect(await rObj.type()).toEqual('list');
expect(await rObj.names()).toEqual(['a', 'b', 'c']);
const attrs = await rObj.attrs() as RPairlist;
@@ -687,6 +725,43 @@ describe('Create R data.frame from JS objects', () => {
expect(await c.toArray()).toEqual(jsObj.c);
});

test('Create an R data.frame using explicit constructor', async () => {
const jsObj = { a: [1, 2, 3], b: [3, 4, 5], c: ['x', 'y', 'z'] };
const rObj = await new webR.RDataFrame(jsObj);
const attrs = await rObj.attrs() as RPairlist;
const classes = await attrs.get('class') as RCharacter;
expect(await classes.toArray()).toContain('data.frame');
});

test('Create an R data.frame by wrapping an R list object', async () => {
const rList = await webR.evalR('data.frame(a = c(1,2,3), b = c(4,5,6))') as RList;
const rDataFrame = await new webR.RDataFrame(rList);
const attrs = await rDataFrame.attrs() as RPairlist;
const classes = await attrs.get('class') as RCharacter;
expect(await classes.toArray()).toContain('data.frame');

const a = await rDataFrame.get('a') as RDouble;
const b = await rDataFrame.get('b') as RDouble;
expect(await a.toArray()).toEqual([1, 2, 3]);
expect(await b.toArray()).toEqual([4, 5, 6]);
});

test('Reject constructing R data.frame from ineligible JS object', async () => {
const jsObj = { a: [1, 2, 3], b: [3, 4, 5], c: ['x', 'y'] };
const rObj = new webR.RObject(jsObj);
await expect(rObj).rejects.toThrow("Can't construct `data.frame`.");
});

test('Reject constructing R data.frame from ineligible D3 JS object', async () => {
const d3Obj = [
{ a: true, b: 3, c: 'u' },
{ a: webR.objs.false, b: 4 },
{ z: 123 },
];
const rObj = new webR.RObject(d3Obj);
await expect(rObj).rejects.toThrow("Can't construct `data.frame`.");
});

test('Create an R data.frame from JS object with coercion and missing values', async () => {
const jsObj = { a: [0, 1, true], b: [null, 4, '5'], c: [null, null, null] };
const rObj = await new webR.RObject(jsObj) as RList;
16 changes: 8 additions & 8 deletions src/webR/proxy.ts
@@ -6,7 +6,7 @@
import { ChannelMain } from './chan/channel';
import { replaceInObject } from './utils';
import { isWebRPayloadPtr, WebRPayloadPtr, WebRPayload } from './payload';
import { RType, WebRData, WebRDataRaw } from './robj';
import { RType, RCtor, WebRData, WebRDataRaw } from './robj';
import { isRObject, RObject, isRFunction } from './robj-main';
import * as RWorker from './robj-worker';
import { ShelterID, CallRObjectMethodMessage, NewRObjectMessage } from './webr-chan';
@@ -184,15 +184,15 @@ export function targetMethod(chan: ChannelMain, prop: string, payload?: WebRPayl
*/
async function newRObject(
chan: ChannelMain,
objType: RType | 'object',
objType: RType | RCtor,
shelter: ShelterID,
value: WebRData
...args: WebRData[]
) {
const msg: NewRObjectMessage = {
type: 'newRObject',
data: {
objType,
obj: replaceInObject(value, isRObject, (obj: RObject) => obj._payload),
args: replaceInObject<WebRData[]>(args, isRObject, (obj: RObject) => obj._payload),
shelter: shelter,
},
};
@@ -243,8 +243,8 @@ export function newRProxy(chan: ChannelMain, payload: WebRPayloadPtr): RProxy<RW
* Proxy an {@link RWorker.RObject} class.
* @param {ChannelMain} chan The current main thread communication channel.
* @param {ShelterID} shelter The shelter ID to protect returned objects with.
* @param {(RType | 'object')} objType The R object type, or `'object'` for the
* generic {@link RWorker.RObject} class.
* @param {(RType | RCtor)} objType The R object type or class, `'object'` for
* the generic {@link RWorker.RObject} class.
* @returns {ProxyConstructor} A proxy to the R object subclass corresponding to
* the given value of the `objType` argument.
* @typeParam T The type of the {@link RWorker.RObject} class to be proxied.
@@ -253,10 +253,10 @@ export function newRProxy(chan: ChannelMain, payload: WebRPayloadPtr): RProxy<RW
export function newRClassProxy<T, R>(
chan: ChannelMain,
shelter: ShelterID,
objType: RType | 'object'
objType: RType | RCtor
) {
return new Proxy(RWorker.RObject, {
construct: (_, args: [WebRData]) => newRObject(chan, objType, shelter, ...args),
construct: (_, args: WebRData[]) => newRObject(chan, objType, shelter, ...args),
get: (_, prop: string | number | symbol) => {
return targetMethod(chan, prop.toString());
},
1 change: 1 addition & 0 deletions src/webR/robj-main.ts
@@ -24,6 +24,7 @@ export type RDouble = RProxy<RWorker.RDouble>;
export type RComplex = RProxy<RWorker.RComplex>;
export type RCharacter = RProxy<RWorker.RCharacter>;
export type RList = RProxy<RWorker.RList>;
export type RDataFrame = RProxy<RWorker.RDataFrame>;
export type RRaw = RProxy<RWorker.RRaw>;
export type RCall = RProxy<RWorker.RCall>;
// RFunction proxies are callable