Add support for encoding TypedArrays as primitive objects for serialization #2911

Closed
wants to merge 11 commits into from

@jonmmease jonmmease commented Aug 16, 2018

Overview

This PR implements a proposed approach to encoding TypedArrays as primitive representation objects. See some related discussion in #1784.

Background

Plotly.js gained native support for typed arrays in #2388. This provides significantly improved performance when working with large arrays. plotly.py version 3 takes advantage of typed array support by converting numpy arrays into binary buffers on the Python side, and then converting these buffers directly into TypedArrays on the JavaScript side (See #2388 (comment) for more info).

One downside of working with TypedArrays is that there isn't a standard way (at least that I've been able to find) to serialize them to JSON. This PR aims to provide a few options of remedies to this problem.

Use cases

There are at least 5 use cases directly relevant to plotly.py where a serialized representation of TypedArrays will be very useful.

  1. The reason I'm working on this today is because the lack of serialization support for TypedArrays is the reason that FigureWidget instances containing numpy arrays cannot be rendered statically using nbconvert, nbviewer, and by extension Plotly Cloud. With these changes I could update the JavaScript model for FigureWidget to not include TypedArrays, but instead primitive representation objects that can be serialized. (This PR is no longer needed for this use case)

  2. With these changes I could update the plotly.py JSON serializer to encode numpy arrays as base64 strings, which can be up to 10 times faster than the current method of first converting them to lists.

  3. This JSON representation can be written to disk and then opened in the JupyterLab Chart Editor more efficiently.

  4. This JSON representation should make the plotly.py orca integration more responsive for figures with large numpy arrays.

  5. This JSON representation should make Dash more responsive when working with figures with large numpy arrays.

Typed Array Representation

This PR introduces the concept of an encoded TypedArray. An encoded TypedArray is a vanilla JavaScript object that contains dtype and value properties.

  • The dtype property is a string indicating the data type of the TypedArray ('int8', 'float32', 'uint16', etc.).
  • The value property is a primitive JavaScript object that stores the typed array data. It can be one of the following:
    i. Standard JavaScript Array
    ii. A base64 encoded string
    iii. An ArrayBuffer object
    iv. A DataView object

Encodings (i) and (ii) can be directly serialized to a string representation. Encodings (iii) and (iv) are useful when working with frameworks that already have support for serializing these more primitive binary representations.
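
As an illustration of the four encodings listed above, here is a minimal sketch that writes the same three float32 values in each form. The variable names are made up for this example, and the base64 string assumes little-endian byte order.

```javascript
// The same three float32 values [3, 2, 1] in each of the four encodings.
var typed = new Float32Array([3, 2, 1]);

// (i) Standard JavaScript Array
var asArray = {dtype: 'float32', value: [3, 2, 1]};

// (ii) base64 string of the raw bytes (little-endian byte order assumed)
var asBase64 = {dtype: 'float32', value: 'AABAQAAAAEAAAIA/'};

// (iii) ArrayBuffer holding the raw bytes
var asBuffer = {dtype: 'float32', value: typed.buffer};

// (iv) DataView over the same bytes
var asView = {dtype: 'float32', value: new DataView(typed.buffer)};

// Encodings (i) and (ii) survive a JSON round trip unchanged.
var roundTripped = JSON.parse(JSON.stringify(asBase64));

// An ArrayBuffer has no enumerable properties, so JSON.stringify loses its bytes.
var bufferAsJson = JSON.stringify(asBuffer.value);
```

Encodings (iii) and (iv) therefore only make sense when the surrounding framework has its own binary transport.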

Decoding and Encoding

  • A new top-level Plotly.decode function is introduced. This function inputs a JavaScript value, and returns a copy where all encoded TypedArray instances have been decoded into proper TypedArrays.

  • A new top-level Plotly.encode function is introduced. This function inputs a JavaScript value and returns a copy where all TypedArray instances are encoded as base64-encoded typed array representations.
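
A rough sketch of how such an encode/decode pair could walk a figure follows. The functions `encode` and `decode` below are hypothetical stand-ins, not the PR's implementation: they handle only three dtypes and only the base64 encoding, and they assume a little-endian platform.

```javascript
// Hypothetical subset of supported dtypes (the PR covers more).
var CONSTRUCTORS = {float32: Float32Array, float64: Float64Array, int32: Int32Array};

function dtypeOf(v) {
    for(var name in CONSTRUCTORS) {
        if(v instanceof CONSTRUCTORS[name]) return name;
    }
    return null;
}

function bytesToBase64(bytes) {
    if(typeof Buffer !== 'undefined') return Buffer.from(bytes).toString('base64');
    var s = '';
    for(var i = 0; i < bytes.length; i++) s += String.fromCharCode(bytes[i]);
    return btoa(s);
}

function base64ToBytes(str) {
    // Copy into a fresh Uint8Array so the bytes start at offset 0
    if(typeof Buffer !== 'undefined') return new Uint8Array(Buffer.from(str, 'base64'));
    var s = atob(str);
    var bytes = new Uint8Array(s.length);
    for(var i = 0; i < s.length; i++) bytes[i] = s.charCodeAt(i);
    return bytes;
}

// Return a copy of v where supported typed arrays become {dtype, value} objects.
function encode(v) {
    var dtype = dtypeOf(v);
    if(dtype) {
        var bytes = new Uint8Array(v.buffer, v.byteOffset, v.byteLength);
        return {dtype: dtype, value: bytesToBase64(bytes)};
    }
    if(ArrayBuffer.isView(v)) return v; // unsupported typed array: left as is
    if(Array.isArray(v)) return v.map(encode);
    if(v && typeof v === 'object') {
        var out = {};
        Object.keys(v).forEach(function(k) { out[k] = encode(v[k]); });
        return out;
    }
    return v;
}

// Return a copy of v where base64 {dtype, value} objects become typed arrays.
function decode(v) {
    if(ArrayBuffer.isView(v)) return v; // already a typed array
    if(Array.isArray(v)) return v.map(decode);
    if(v && typeof v === 'object') {
        var known = Object.prototype.hasOwnProperty.call(CONSTRUCTORS, v.dtype);
        if(known && typeof v.value === 'string') {
            return new CONSTRUCTORS[v.dtype](base64ToBytes(v.value).buffer);
        }
        var out = {};
        Object.keys(v).forEach(function(k) { out[k] = decode(v[k]); });
        return out;
    }
    return v;
}

var fig = {data: [{type: 'scatter', y: new Float32Array([3, 2, 1])}]};
var encoded = encode(fig);
var decoded = decode(JSON.parse(JSON.stringify(encoded)));
```

Both functions return copies, so the input figure is never mutated.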

Future

  • To support multi-dimensional arrays, an encoded typed array representation object could optionally include a shape parameter, indicating the size of each dimension. Plotly.js does not currently support a homogenous multi-dimensional array type, so initially these would be decoded into nested primitive arrays.

  • Would it be possible to encode datetime arrays more efficiently with a base64 buffer?

…tation objects)

A TypedArray representation object has two properties: `dtype` and `data`.

`dtype` is a string indicating the type of the typed array (`'int8'`, `'float32'`, `'uint16'`, etc.)

`data` is a primitive JavaScript object that stores the typed array data. It can be one of:
 - Standard JavaScript Array
 - ArrayBuffer
 - DataView
 - A base64 encoded string

The representation objects may stand in for TypedArrays in `data_array` properties and in properties with `arrayOk: true`.

The representation object is stored in `data`/`layout`, while the converted TypedArray is stored in `_fullData`/`_fullLayout`
@chriddyp
Member

Really appreciate this thorough write-up @jonmmease! Technologically, I'll leave it up to the rest, but I do really like the sound of the use cases that you've enumerated 👍

With these changes I could update the plotly.py JSON serializer to encode numpy arrays as base64 strings, which can be up to 10 times faster than the current method of first converting them to lists.

Dash directly uses this JSON serializer (plotly.utils.PlotlyJSONEncoder), so this would be a nice win.

When you say 10 times faster, in which ways is it faster? I'm assuming there are 4:
1 - Speed to serialization (json.dumps)
2 - Size for network transfer (this is orders of magnitude more significant than 1, especially when Dash apps are deployed on remote servers. In Dash, we also gzip all of the requests and responses, so I'm not sure what impact this would have.)
3 - Speed of de-serialization (JSON.parse)
4 - Speed of charting

@jonmmease
Contributor Author

@chriddyp The 10X was my off-the-cuff test of encoding a 1 million element numpy array of random float64 values into a Python string.

Something like json.dumps(arr.tolist()) vs base64.b64encode(memoryview(arr)).decode()

Comparisons of (2) (3) and (4) would definitely be interesting as well!
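
For point (2), the uncompressed payload size is easy to estimate on the JavaScript side. The sketch below is only an illustration with made-up variable names: it compares the JSON text of 1000 random float64 values with the base64 text of the same bytes. It says nothing about timing or about how the two compare after gzip.

```javascript
// Build 1000 random float64 values.
var n = 1000;
var arr = new Float64Array(n);
for(var i = 0; i < n; i++) arr[i] = Math.random();

// JSON text: every value is written out as a decimal string
var jsonText = JSON.stringify(Array.from(arr));

// base64 text: 4 characters for every 3 bytes, regardless of the values
var bytes = new Uint8Array(arr.buffer);
var base64Text = (typeof Buffer !== 'undefined') ?
    Buffer.from(bytes).toString('base64') :
    btoa(String.fromCharCode.apply(null, bytes));

// The base64 length depends only on the byte count
var expectedBase64Length = 4 * Math.ceil(arr.byteLength / 3);

console.log('JSON chars:', jsonText.length, 'base64 chars:', base64Text.length);
```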

Contributor

@etpinard etpinard left a comment

Thanks for the PR! Looking forward to JSON-serializable typed arrays!

{
"data": [{
"type": "scatter",
"x": {"dtype": "float64", "data": [3, 2, 1]},
Contributor

Not sure about using "data", "data" has a pretty important meaning already for plotly.js. I'd vote for values.

Contributor

or maybe just "v" as calling a base64 string a set of values sounds wrong.

Contributor Author

I renamed data to value (singular) in 5030d3a. Does that work for you? v felt too short 🙂

@@ -521,3 +539,48 @@ function validate(value, opts) {
return out !== failed;
}
exports.validate = validate;

var dtypeStringToTypedarrayType = {
Contributor

I think that's all of them, except for Uint8ClampedArray:

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Typed_arrays#Typed_array_views

Might as well add it here for completeness.

Contributor Author

Added in 5030d3a

*
* @returns {TypedArray}
*/
function primitiveTypedArrayReprToTypedArray(v) {
Contributor

I'm curious. Have you benchmarked this routine here for base64 strings corresponding to 1e4, 1e5, and 1e6 pts?

else if(dflt !== undefined) propOut.set(dflt);
if(isArrayOrTypedArray(v)) {
propOut.set(v);
} else if(isPrimitiveTypedArrayRepr(v)) {
Contributor

Ideally, the {dtype: '', data: ''} -> typed array conversion should happen during the calc step. More precisely somewhere here:

ax.makeCalcdata = function(trace, coord) {
var arrayIn = trace[coord];
var len = trace._length;
var arrayOut, i;
var _d2c = function(v) { return ax.d2c(v, trace.thetaunit); };
if(arrayIn) {
if(Lib.isTypedArray(arrayIn) && axType === 'linear') {
if(len === arrayIn.length) {
return arrayIn;
} else if(arrayIn.subarray) {
return arrayIn.subarray(0, len);
}
}
arrayOut = new Array(len);
for(i = 0; i < len; i++) {
arrayOut[i] = _d2c(arrayIn[i]);
}
} else {
var coord0 = coord + '0';
var dcoord = 'd' + coord;
var v0 = (coord0 in trace) ? _d2c(trace[coord0]) : 0;
var dv = (trace[dcoord]) ? _d2c(trace[dcoord]) : (ax.period || 2 * Math.PI) / len;
arrayOut = new Array(len);
for(i = 0; i < len; i++) {
arrayOut[i] = v0 + i * dv;
}
}
return arrayOut;
};

Depending on how slow this conversion can be, moving it to the calc step will help ensure faster interactions (note that the calc is skipped on e.g. zoom and pan).

This might be a fairly big job though, you'll have to replace all the isArrayOrTypedArray calls upstream of the calc step with something like isArrayOrTypedArrayOrIsPrimitiveTypedArrayRepr (or something less verbose 😄 ).

So I guess we should first benchmark primitiveTypedArrayReprToTypedArray to see what kind of potential perf gain we could get.

Contributor Author

@etpinard Maybe I'm misunderstanding something. As it is now, I thought these conversions would be happening in the supplyDefaults logic for the various traces. Does that logic run on events like zoom/pan?

Either way, I'll get some performance numbers for primitiveTypedArrayReprToTypedArray. I was hoping I wouldn't need to retrace all of the steps you went through to add the original TypedArray support!

@alexcjohnson
Contributor

Thanks for bringing this issue to the fore @jonmmease - your format proposal looks great, and I'd be very happy to have this format locked in as an official part of plotly.js. My main concern is where it fits in the pipeline. Leaving the representation in the figure and making conversion part of coerce means:

  • It will be repeated on every supplyDefaults - ie on every change to the plot, which is a lot of overhead
  • Potentially a problem for Plotly.react as it would create arrays with new identity each time, making it look like the data changed when it didn't.
  • We don't have an actual array in the figure object, so calls like this (which you likely don't use, but they are supported) won't work:
Plotly.newPlot(gd,[{y:new Int8Array([1,2,3,4,5])}]);
Plotly.restyle(gd, 'y[2]', 6);

I guess the first two could potentially be fixed by @etpinard's suggestion of moving the conversion to calc, but it wouldn't affect the third one. I guess it would be OK in principle to just say we don't support this kind of mutation on the new data type, but doesn't seem ideal.
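
The identity concern in the second bullet can be illustrated without Plotly.js at all. In this sketch, `decodeRepr` is a hypothetical helper standing in for a conversion performed on every `supplyDefaults` pass; it handles only the plain-Array encoding and two dtypes.

```javascript
// Hypothetical helper: convert a {dtype, value} object (plain Array
// encoding only) into a typed array.
function decodeRepr(repr) {
    var ctors = {float32: Float32Array, float64: Float64Array};
    return new ctors[repr.dtype](repr.value);
}

var repr = {dtype: 'float64', value: [3, 2, 1]};

// Two passes over the same, unchanged representation object
var first = decodeRepr(repr);
var second = decodeRepr(repr);

// The contents match, but the results are two distinct objects, so an
// identity-based check (old === new) would report a change even though
// the input never changed.
var sameIdentity = first === second;
var sameContents = first.length === second.length &&
    Array.prototype.every.call(first, function(x, i) { return x === second[i]; });
```

Decoding once, before the figure is handed to Plotly.js, avoids the problem because the same typed array instance is then reused across updates.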

Alternatively, would it be reasonable to have official serialization/deserialization routes, that can be used both on a complete figure and on arguments to restyle et al, so we can keep gd.data in native format? Does that cause a problem for any of the use cases you enumerated above?

@jonmmease
Contributor Author

I'll look it over some more tonight, but one quick thought. For plotly.py all I really need is a way to check equality between my data model (representation array), and whatever Plotly.js stores in data. If it would be better on the Plotly.js side to store the converted TypedArray in data, then I could perform the conversion from representation to TypedArray in my equality checks. If the isPrimitiveTypedArrayRepr and primitiveTypedArrayReprToTypedArray were public so that I could import and use them from FigureWidget then I think that's all I would need.

If we go this route, could the supplyDefaults logic modify the input container (from representation to TypedArray) as well as the output? Or is that frowned upon 🙂 Is there currently a better place to do that?

@alexcjohnson
Contributor

If we go this route, could the supplyDefaults logic modify the input container (from representation to TypedArray) as well as the output? Or is that frowned upon 🙂 Is there currently a better place to do that?

Yeah, we do modify the input in a few places, but we're trying to break that habit. This is why I'm angling for an explicit deserialize step when new data comes in (and an explicit serialize step when saving). And as the serial format is really all about interfacing with the world outside of javascript, it seems like it should be kept separate from the regular pipeline, to be invoked by whatever application it is that's doing that out-of-js interfacing.

If the isPrimitiveTypedArrayRepr and primitiveTypedArrayReprToTypedArray were public so that I could import and use them from FigureWidget then I think that's all I would need.

OK great, let's see how far we can get that way!

@jonmmease
Contributor Author

Quick update. It turns out that I was able to solve the FigureWidget serialization problem by customizing the ipywidgets serialization logic. So my use-case (1) is no longer dependent on any changes here. And there's no need to make the methods discussed above public.

Given that, I do think it makes sense to work towards some form of dedicated serialization pathway.

Maybe something like...

// inVal is something from the outside with typed array representation objects
var inVal = {...} 

// Plotly.import converts these to TypedArrays
Plotly.newPlot(gd, Plotly.import(inVal))

// Do stuff

// outVal has TypedArrays encoded as base64 representation objects
var outVal = Plotly.export(gd, {typedArrayRepr: 'base64'}) 

what do you think?

 - Use `Lib.isPlainObject`
 - Renamed `data` -> `value`
 - Added `Uint8ClampedArray`
 - Committed updated package-lock.json

No changes yet to the logical structure of where conversion happens
@jonmmease
Contributor Author

Latest push reverts all coerce.js changes and moves the typed array conversion logic to a new Plotly.import function.

@@ -5,6 +5,7 @@
* This source code is licensed under the MIT license found in the
* LICENSE file in the root directory of this source tree.
*/
var Lib = require('../lib');
Contributor

You'll need to require './is_plain_object.js' to avoid a circular dependency pattern

lib/index -> lib/is_plain_object -> lib/index -> lib/is_array -> lib/index -> lib/is_plain_object

@etpinard
Contributor

@jonmmease your

// inVal is something from the outside with typed array representation objects
var inVal = {...} 

// Plotly.import converts these to TypedArrays
Plotly.newPlot(gd, Plotly.import(inVal))

// Do stuff

// outVal has TypedArrays encoded as base64 representation objects
var outVal = Plotly.export(gd, {typedArrayRepr: 'base64'}) 

sounds solid. 🥇

Personally, I find Plotly.import a little confusing. At first order, I think of import as importing a module, not a dataset. Perhaps Plotly.decodeTypedArrays would be better?

*
* @returns {TypedArray}
*/
var dtypeStringToTypedarrayType = {
Contributor

We'll need to append to this list:

plotly.js/.eslintrc

Lines 12 to 21 in a8c6217

"globals": {
"Promise": true,
"Float32Array": true,
"Float64Array": true,
"Uint8Array": true,
"Int16Array": true,
"Int32Array": true,
"ArrayBuffer": true,
"DataView": true,
"SVGElement": false

to make npm run lint pass.

Contributor

and we'll need to add fallbacks so that browsers w/o typed array support don't break.

We have a "test" for that:

npm run test-jasmine -- --bundleTest=ie9_test.js

@jonmmease
Contributor Author

Relevant question from the plotly.py forums: https://community.plot.ly/t/offline-plot-to-div-encode-numpy-data-as-binary-blob/12965

Is there any way to use plotly.offline to store (numpy) data in a div or HTML as a binary blob then have plotly js render it properly?

The HTML file has bits that look like…
“z”: [[0.2866461675771665, 0.36531080671829425, -0.2904632669675007, -0.36149370732795966, -0.049367492181059396, -0.08192231566757988, 0.6849181745604624, 1.268579555423972, 0.2944241042103686, -0.2851544876150079, -0.9762164627387184, 0.1527361407626408] …

I’m trying to reduce the size of my HTML files that have several image plots each with 10^5 - 10^6 data points.

I’ve taken to stripping some of the precision so the json (string)-encoded data have fewer digits, but this seems silly.

This reminds me that the encoding should be able to handle the 2-dimensional use case as well.

I think I'll add an optional shape field to the encoding. If shape is absent or a scalar array then the representation will convert into a TypedArray. But if the shape is an array with more than one element, then the representation will be converted into a nested set of arrays.
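
A minimal sketch of that shape rule, assuming row-major (C) order as numpy uses by default. The helper names `applyShape` and `nest` are hypothetical and not part of the PR.

```javascript
// Recursively slice the flat data into nested plain arrays.
function nest(flat, shape, offset) {
    if(shape.length === 1) {
        // Innermost dimension: copy the values into a plain Array
        return Array.prototype.slice.call(flat, offset, offset + shape[0]);
    }
    // Number of flat elements covered by one entry along the first dimension
    var stride = 1;
    for(var i = 1; i < shape.length; i++) stride *= shape[i];

    var out = [];
    for(var j = 0; j < shape[0]; j++) {
        out.push(nest(flat, shape.slice(1), offset + j * stride));
    }
    return out;
}

// No shape, or a shape with a single element: keep the typed array.
// More than one dimension: convert to nested plain arrays.
function applyShape(flat, shape) {
    if(!shape || shape.length <= 1) return flat;
    return nest(flat, shape, 0);
}

var flat = new Float64Array([1, 2, 3, 4, 5, 6]);
var z = applyShape(flat, [2, 3]);
var oneDim = applyShape(flat, [6]);
```

Here `z` is a 2-by-3 nested array, while `oneDim` is the original typed array, untouched.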

@jonmmease
Contributor Author

Thanks for the feedback @etpinard , I'm going to circle back around to this in a week or two, after the plotly.py 3.2 release.

In terms of naming. I was thinking of keeping "TypedArray" out of the name so that we could eventually add other encodings if useful. How about Plotly.import -> Plotly.decode and Plotly.export -> Plotly.encode?

Plotly.decode would be generous about accepting the various encodings of TypedArrays (and maybe eventually various encoding of other types like images). Plotly.encode would always produce a JSON-serializable representation of the object.

I was also wondering if it would make sense to allow Plotly.decode to accept compressed representations. It would be nice, from the Python side, to be able to encode arrays as base64 and then gzip up the whole figure bundle before transporting to Plotly.js. I think Dash already does this, but it would be nice if it were baked in to plotly.py and Plotly.js.

@chriddyp would compression at the figure level be useful to Dash or do you already compress at a higher level? Also, what do you use for decompression on the JavaScript side?

@etpinard
Contributor

How about Plotly.import -> Plotly.decode and Plotly.export -> Plotly.encode?

👌

I was also wondering if it would make sense to allow Plotly.decode to accept compressed representations.
to be able to encode arrays as base64 and then gzip up the whole figure bundle before transporting to Plotly.js.

Interesting, but perhaps this is out of the scope of plotly.js? How big are common front-end unzip libraries? If we want to have all decode/encode/compress/decompress logic in one place, maybe we should explore placing these Plotly.decode / Plotly.encode methods in a separate npm package.

@jonmmease
Contributor Author

Good point regarding scope. The typed array stuff is pretty tied to Plotly.js, but the compression can happen wherever.

I was picturing using the decompression from orca eventually, but it probably makes more sense to just add this as an option to orca down the road (if it proves helpful), rather than Plotly.js.

In this case the decoded value is undefined, but an error won't be
thrown.

This can't be a mock anymore because it is not valid as input to
Plotly.plot without first passing through Plotly.decode

This function inputs a Plotly object and outputs a copy where all
TypedArray instances have been replaced with JSON serializable
representation objects.  This function is the inverse of Plotly.decode
@jonmmease
Contributor Author

Ok, I believe I have finished the implementation and testing of the new Plotly.decode and Plotly.encode functions for 1-dimensional TypedArray encoding.

Handling multi-dimensional arrays will be a bit more complicated, and a bit less useful since Plotly.js doesn't (yet?) use a homogenous multi-dimensional array type internally, so I'd like to put this off for a future PR.

Might it be feasible to get this into 1.41? My hope was to use this in plotly.py 3.3, where I'm going to introduce new plotly.io.write_figure and plotly.io.read_figure functions for reading and writing figures to/from disk. Then, we'll update the JupyterLab chart editor to read and write these same files from the file system (plotly/jupyterlab-chart-editor#20). With compression on top, I think we'll have a really efficient storage format for figures involving large arrays!

@etpinard
Contributor

etpinard commented Sep 11, 2018

@jonmmease after a long talk with @alexcjohnson, we came to a few conclusions:

  • a JSON-to-typed-array encode/decode util will be very useful not just for plotly.py but also for plotly cloud and the react-chart-editor
  • decoding works best when called before Plotly.newPlot, mainly when considering applications like the react-chart-editor that would need to decode before populating its table columns.
  • when called before Plotly.newPlot the decoding util doesn't need to know anything about plotly.js, except that it uses JSON-compatible objects as arguments.
  • the only short-term consumer of these encode/decode utils will be plotly.py. Adding typed-array support on plotly cloud will be a big project. Adding it to the react-chart-editor doesn't appear to be a priority.
  • and finally, we doubt that Plotly.encode and Plotly.decode are valid additions to plotly.js (not plotly in general 😄 ) as we try to keep plotly.js' scope only as wide as it needs to be.

So, I'm proposing two solutions:

  • make an npm package called @plotly/typed-json and add it as a dependency to plotly.py. I would be happy to dedicate some time to this later this week after 1.41.0 is released
  • or, move the new encode/decode logic to the plotly.py front-end for now, until we have a better idea of where this encode/decode util fits in the plotly realm.

@jonmmease
Contributor Author

Thanks for taking the time to think through this in detail @etpinard and @alexcjohnson. The approach of eventually creating a separate @plotly/typed-json npm package makes fine sense to me.

plotly.py doesn't really need this new npm package for itself (it already has a version of the decode function in the implementation of the FigureWidget JavaScript library). The benefit for plotly.py users will be when other Plotly.js-based endpoints are able to accept (and return) encoded typed arrays. The ones in the front of my mind are orca, the JupyterLab chart editor, the dash figure component, and cloud.

So I'm happy to plan on pursuing the npm package approach in the not-too-distant future, but I don't think it needs to be this week. It's not blocking anything, just the next step in efficiency for large datasets. Maybe after 1.42? Thanks!

@jonmmease jonmmease closed this Sep 11, 2018
@etpinard
Contributor

Thanks very much for your understanding and your hard work @jonmmease

but I don't think it needs to be this week. It's not blocking anything, just the next step in efficiency for large datasets.

Thanks for the info!

['int32', new Int32Array([-2147483648, -123, 345, 32767, 2147483647])],
['uint32', new Uint32Array([0, 345, 32767, 4294967295])],
['float32', new Float32Array([1.2E-38, -2345.25, 2.7182818, 3.1415926, 2, 3.4E38])],
['float64', new Float64Array([5.0E-324, 2.718281828459045, 3.141592653589793, 1.8E308])]
Contributor

It might be of interest to add BigInts to the list.

It seems that this does not yet support longs. How is plotly dealing with BigInt currently? Is there any support planned?

return 'float32';
} else if(typeof Float64Array !== 'undefined' && v instanceof Float64Array) {
return 'float64';
}
Contributor

It might be of interest to support BigInts as well.

Contributor

Big integers could be useful for storing time in milliseconds.
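
As a sketch of what this might look like, the snippet below round-trips a `BigInt64Array` of example millisecond timestamps through base64. The `'int64'` dtype name is an assumption (the PR does not define one), and a little-endian platform is assumed.

```javascript
// Example millisecond timestamps held as signed 64-bit integers.
var times = new BigInt64Array([1534377600000n, 1534377600001n]);

// JSON has no BigInt type, so stringifying the values directly throws.
var jsonThrows = false;
try {
    JSON.stringify(Array.from(times));
} catch(e) {
    jsonThrows = e instanceof TypeError;
}

// The raw bytes, however, can be base64-encoded like any other typed array.
// 'int64' is an assumed dtype name for illustration only.
var bytes = new Uint8Array(times.buffer);
var base64 = (typeof Buffer !== 'undefined') ?
    Buffer.from(bytes).toString('base64') :
    btoa(String.fromCharCode.apply(null, bytes));
var repr = {dtype: 'int64', value: base64};

// Decoding reverses the steps: base64 -> bytes -> BigInt64Array.
var decodedBytes;
if(typeof Buffer !== 'undefined') {
    decodedBytes = new Uint8Array(Buffer.from(repr.value, 'base64'));
} else {
    var s = atob(repr.value);
    decodedBytes = new Uint8Array(s.length);
    for(var i = 0; i < s.length; i++) decodedBytes[i] = s.charCodeAt(i);
}
var decoded = new BigInt64Array(decodedBytes.buffer);
```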

@@ -16,6 +16,8 @@ exports.restyle = main.restyle;
exports.relayout = main.relayout;
exports.redraw = main.redraw;
exports.update = main.update;
exports.decode = main.decode;
exports.encode = main.encode;
Contributor

We may start with underscore (i.e. kind of private) methods here.
For example:

exports._decode = main.decode;
exports._encode = main.encode;

If we want to expose this functionality, then using a more specific name could be considered.
For example:

exports.decodeArray = main.decode;
exports.encodeArray = main.encode;

} else if(Lib.isPlainObject(v)) {
var result = {};
for(var k in v) {
if(v.hasOwnProperty(k)) {
Contributor

Here we may list the keys using Object.getOwnPropertyNames and loop through them.
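
A small sketch of that suggestion, with a hypothetical helper name `mapOwnValues`. Note that `Object.getOwnPropertyNames` also returns non-enumerable own properties, so `Object.keys` may be the closer match to a `for...in` loop with a `hasOwnProperty` check.

```javascript
// Copy the own properties of an object, applying fn to each value.
function mapOwnValues(obj, fn) {
    var result = {};
    var keys = Object.getOwnPropertyNames(obj);
    for(var i = 0; i < keys.length; i++) {
        result[keys[i]] = fn(obj[keys[i]]);
    }
    return result;
}

// Inherited properties are skipped without an explicit hasOwnProperty check.
var proto = {inherited: 1};
var obj = Object.create(proto);
obj.own = 2;

var doubled = mapOwnValues(obj, function(x) { return x * 2; });
```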

@archmoj
Contributor

archmoj commented Aug 5, 2020

One downside of working with TypedArrays is that there isn't a standard way (at least that I've been able to find) to serialize them to JSON. This PR aims to provide a few options of remedies to this problem.

What's the status of this problem in 2020?

Besides, there is the BSON file format: https://en.wikipedia.org/wiki/BSON

@archmoj
Contributor

archmoj commented Aug 5, 2020

Plotly.js does not currently support a homogenous multi-dimensional array type, so initially these would be decoded into nested primitive arrays.

ndarray is applied in the surface trace code:

var ndarray = require('ndarray');
var ndarrayInterp2d = require('ndarray-linear-interpolate').d2;

} else if(dtype === 'float64' && typeof Float64Array !== 'undefined') {
return Float64Array;
}
}
Contributor

You may consider rewriting this:

function getArrayType(t) {
    return (typeof t !== 'undefined') ? t : undefined;
}
var validInt8Array = getArrayType(Int8Array);
var validUint8Array = getArrayType(Uint8Array);
...

/**
 * Get TypedArray type for a given dtype string
 * @param {String} dtype: Data type string
 * @returns {TypedArray}
 */
function getTypedArrayTypeForDtypeString(dtype) {
    switch(dtype) {
        case 'int8':
            return validInt8Array;
        case 'uint8':
            return validUint8Array;
        ...
    }
}
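
One caveat with feature detection of this kind: `typeof` has to be applied to the bare global name, because passing an undeclared global as a function argument throws a ReferenceError before the function body runs. The sketch below is one possible table-based variant under that constraint; the table name `typedArrayTypes` is hypothetical and only a subset of dtypes is listed.

```javascript
// Each entry applies typeof to the bare global name, which is safe even
// in environments where the global is not defined.
var typedArrayTypes = {
    int8: (typeof Int8Array !== 'undefined') ? Int8Array : undefined,
    uint8: (typeof Uint8Array !== 'undefined') ? Uint8Array : undefined,
    float32: (typeof Float32Array !== 'undefined') ? Float32Array : undefined,
    float64: (typeof Float64Array !== 'undefined') ? Float64Array : undefined
};

/**
 * Get TypedArray type for a given dtype string
 * @param {String} dtype: Data type string
 * @returns {Function|undefined} constructor, or undefined if unknown or unsupported
 */
function getTypedArrayTypeForDtypeString(dtype) {
    // The own-property check keeps names like 'constructor' from matching
    // properties inherited from Object.prototype.
    if(Object.prototype.hasOwnProperty.call(typedArrayTypes, dtype)) {
        return typedArrayTypes[dtype];
    }
    return undefined;
}

var Float64Type = getTypedArrayTypeForDtypeString('float64');
var unknownType = getTypedArrayTypeForDtypeString('complex128');
```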

return result;
} else {
return v;
}
Contributor

You can drop the last else statement and simply return v;

@archmoj
Contributor

archmoj commented Aug 5, 2020

Right now, one could call:

Plotly.newPlot(gd, Plotly.decode({
    'data': [{
        'type': 'scatter',
        'x': {'dtype': 'float64', 'value': 'AAAAAAAACEAAAAAAAAAAQAAAAAAAAPA/'},
        'y': {'dtype': 'float32', 'value': 'AABAQAAAAEAAAIA/'},
        'marker': {
            'color': {
                'dtype': 'uint16',
                'value': 'AwACAAEA',
            },
        }
    }]
}));

But wondering if we could/should expand data_array valType to support a call like this instead? i.e. without exposing decode function.

Plotly.newPlot(gd, {
    'data': [{
        'type': 'scatter',
        'x': {'dtype': 'float64', 'value': 'AAAAAAAACEAAAAAAAAAAQAAAAAAAAPA/'},
        'y': {'dtype': 'float32', 'value': 'AABAQAAAAEAAAIA/'},
        'marker': {
            'color': {
                'dtype': 'uint16',
                'value': 'AwACAAEA',
            },
        }
    }]
});
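
For readers who want to check the base64 strings used in these calls, here is a minimal standalone sketch that decodes them. The helper `decodeBase64` is hypothetical and is not the PR's `Plotly.decode`; it assumes little-endian byte order for both the strings and the platform.

```javascript
// Decode a base64 string into a typed array built by the given constructor.
function decodeBase64(Ctor, str) {
    var bytes;
    if(typeof Buffer !== 'undefined') {
        // Copy into a fresh Uint8Array so the bytes start at offset 0
        bytes = new Uint8Array(Buffer.from(str, 'base64'));
    } else {
        var s = atob(str);
        bytes = new Uint8Array(s.length);
        for(var i = 0; i < s.length; i++) bytes[i] = s.charCodeAt(i);
    }
    return new Ctor(bytes.buffer);
}

var x = decodeBase64(Float64Array, 'AAAAAAAACEAAAAAAAAAAQAAAAAAAAPA/');
var y = decodeBase64(Float32Array, 'AABAQAAAAEAAAIA/');
var color = decodeBase64(Uint16Array, 'AwACAAEA');

// On a little-endian platform, each array holds the values 3, 2, 1.
console.log(Array.from(x), Array.from(y), Array.from(color));
```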

@jonmmease
Contributor Author

That was my original design. I don't remember all of the details, but as I recall it got a little messy to work through what should get stored in data and fullData (whether to store the input data structure, or the converted Typed Array) to handle restyle/react updates.

I don't have a strong preference if there's a clean way to handle it all internally.

@archmoj
Contributor

archmoj commented Aug 5, 2020

That was my original design. I don't remember all of the details, but as I recall it got a little messy to work through what should get stored in data and fullData (whether to store the input data structure, or the converted Typed Array) to handle restyle/react updates.

I don't have a strong preference if there's a clean way to handle it all internally.

We actually have this comment here:

// TODO maybe `v: {type: 'float32', vals: [/* ... */]}` also

And a few more were added in bc32981cdb

@astroboylrx

astroboylrx commented Apr 30, 2023

How may one use this feature when exporting an interactive plot to HTML from Python? The HTML file I got still stores data in ASCII format. I couldn't find proper documentation for using this.

I tried to manually replace the data in the HTML file with base64 string following the format in the given example, which doesn't seem to work...
