[BUG] Can't plot INT64 values #13573

Closed
jmakov opened this issue Dec 3, 2023 · 8 comments

Comments

@jmakov

jmakov commented Dec 3, 2023

Software versions

Python version : 3.10.13 | packaged by conda-forge | (main, Oct 26 2023, 18:07:37) [GCC 12.3.0]
IPython version : 8.18.1
Tornado version : 6.3.3
Bokeh version : 3.3.1
BokehJS static path : /home/jernej_m/mambaforge-pypy3/envs/TEST/lib/python3.10/site-packages/bokeh/server/static
node.js version : v20.9.0
npm version : 10.1.0
jupyter_bokeh version : 2.0.4
Operating system : Linux-6.1.64-1-MANJARO-x86_64-with-glibc2.38

Browser name and version

Chromium

Jupyter notebook / Jupyter Lab version

4.0.9

Expected behavior

Plot without warnings

Observed behavior

BokehUserWarning: out of range integer may result in loss of precision

Example code

import hvplot.polars
import polars

df = polars.DataFrame({"col1": [9223372036854775807 for i in range(10)], "col2": [2*i * 10**15 for i in range(10)], "col3": [i * 10**8 for i in range(10)]})
df.hvplot(y=["col1", "col2"], x="col3")

# This works. It also looks like only the "x" axis needs to be cast to Float64, for some reason.
df.cast(polars.Float64).hvplot(y=["col1", "col2"], x="col3")

Stack traceback or browser console output

No response

Screenshots

No response

@jmakov added the TRIAGE label Dec 3, 2023
@bryevdv
Member

bryevdv commented Dec 3, 2023

@jmakov please provide a pure-Bokeh Minimal Reproducible example.

@mattpap
Contributor

mattpap commented Dec 3, 2023

The largest integer bokeh will accept is 2**53-1 (JS' Number.MAX_SAFE_INTEGER), because beyond that point a float64 can no longer represent every integer exactly. If your integer data is larger than that, you will get a warning and bokeh will convert it to float64 with a possible loss of precision. If you don't mind working with float64 and you don't want to be warned by bokeh, then either perform the conversion yourself or use floating-point values from the beginning, so that the type system of polars, and thus bokeh, knows what your intentions are.
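
To make the precision loss concrete, here is a minimal sketch in plain Python (no Bokeh involved). The helper names `is_safe_integer` and `roundtrip` are hypothetical and only illustrate what happens when an integer passes through a float64; this is not Bokeh's actual conversion code.

```python
# Every integer in [-(2**53 - 1), 2**53 - 1] is exactly representable as a
# float64; this is the same value as JavaScript's Number.MAX_SAFE_INTEGER.
MAX_SAFE_INTEGER = 2**53 - 1


def is_safe_integer(value: int) -> bool:
    """Return True if value lies inside the float64-safe integer range."""
    return abs(value) <= MAX_SAFE_INTEGER


def roundtrip(value: int) -> int:
    """Convert an int to float64 and back, as a float-based pipeline would."""
    return int(float(value))


int64_max = 9223372036854775807  # the "col1" value from the example code

print(is_safe_integer(MAX_SAFE_INTEGER))  # True
print(is_safe_integer(int64_max))         # False
print(roundtrip(2**53 + 1))               # 9007199254740992 (off by one)
print(roundtrip(int64_max))               # 9223372036854775808 (2**63)
```

Python floats are IEEE 754 float64 values, so the round trip here shows the same rounding a float64 conversion would apply to the data.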

@jmakov
Author

jmakov commented Dec 4, 2023

I'm working with nanosecond timestamps, so it's a bit inconvenient that they need to be converted just to get a plot. But if that's intended, then this issue can be closed, since it's not a bug but a design decision.
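
One possible workaround for nanosecond timestamps, sketched here as an assumption rather than anything recommended in this thread: subtract a common origin while the values are still integers, then convert only the small offsets to float. The function name `to_relative_ns` and the sample data are hypothetical, and the input is assumed to be non-empty.

```python
MAX_SAFE_INTEGER = 2**53 - 1


def to_relative_ns(timestamps_ns: list[int]) -> tuple[int, list[float]]:
    """Return (origin, offsets): the first timestamp and float offsets from it.

    The subtraction is done on Python ints, so it is exact; only the
    (much smaller) offsets are converted to float.
    """
    origin = timestamps_ns[0]
    offsets = []
    for ts in timestamps_ns:
        delta = ts - origin
        if abs(delta) > MAX_SAFE_INTEGER:
            raise ValueError(f"offset {delta} exceeds the float64-safe range")
        offsets.append(float(delta))
    return origin, offsets


# Hypothetical sample data: three timestamps 1 ns apart, on Dec 4, 2023 (UTC).
stamps = [1_701_648_000_000_000_000 + i for i in range(3)]

origin, offsets = to_relative_ns(stamps)
print(offsets)                              # [0.0, 1.0, 2.0]

# Converting the raw values directly collapses all three into one float.
print(len({float(ts) for ts in stamps}))    # 1
```

The trade-off is that the axis then shows offsets from `origin` rather than absolute times, so the origin has to be reported separately (for example in the axis label).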

@bryevdv
Member

bryevdv commented Dec 4, 2023

It's a design decision of JavaScript @jmakov, that we have to live with.

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Number

@bryevdv closed this as not planned Dec 4, 2023
@jbednar
Contributor

jbednar commented Dec 4, 2023

Is there a design document somewhere for Bokeh that talks about JavaScript BigInt vs Number? Obviously in most cases Number is more suited to plotting arbitrary quantities, and it looks to me like BigInt objects aren't necessarily easily interoperable with Number objects, but it would be nice to specify the design tradeoffs and issues somewhere. I don't see BigInt mentioned anywhere in the issues or docs.

@bryevdv
Member

bryevdv commented Dec 4, 2023

For reference, Bokeh began in 2012, but BigInt did not start to exist in JavaScript implementations until 2018 or so. There was no decision to even consider in 2012, and since then there is the inertia of years of status quo to overcome. This issue is not unknown, but it affects a relatively small segment of use-cases, fixing it will carry non-negligible risk and effort due to its low-level nature, and no-one has simultaneously had the time, money, and personal motivation to dive into it. Frankly, I would not expect any movement on this unless or until the work is explicitly funded.

Edit: it's worth noting that even three years ago, when I started the discussion linked above, BigInt still did not exist in some major browsers (e.g. Safari), which is why it was not mentioned, even that recently. It might be a viable option to consider these days, I don't know.

@mattpap
Contributor

mattpap commented Dec 4, 2023

We don't have any design documentation for this, because BigInt (and BigInt64Array and BigUint64Array) only recently became widely available (circa late 2020 and 2021 respectively; Safari significantly delayed adoption), and dropping legacy browsers in bokeh 3.0 finally allowed us to pursue modern APIs.

The biggest issue with BigInt is that it doesn't work with JS' number type at all, i.e. only BigInt op BigInt arithmetic is permitted and mixing the two is a TypeError. Thus even halving a big integer requires a special code path: x/2n vs. x/2 for the number type (n is a suffix indicating BigInt literals). Only a fraction of bokehjs' code does computations on raw data, but it's not an insignificant amount to deal with, both on the input side of the data pipeline and on its output side (tooltips, etc.). Given the amount of work involved, I figured it would be easier to employ WASM and get 64-bit support essentially for free. That work hasn't gotten very far yet, due to WASM's inherent issues and the scale of the work.

On top of that, JavaScript doesn't offer operator overloading, so it isn't possible to write generic, readable/idiomatic and performant code that works with both number types. Another issue is performance. Based on some online benchmarks, BigInt introduces a 4x slowdown compared to the number type, though that's not a fair comparison, because we would have to compare it with manual 64-bit arithmetic rather than floating-point arithmetic.

@jbednar
Contributor

jbednar commented Dec 5, 2023

Those are precisely the design considerations I was looking for, thanks! Not sure if this issue is the best place for them, but it's certainly better to have them here than not! Thanks so much.
