GH-39017: [JS] Add typeId as attribute #39018

Merged (4 commits) on Dec 25, 2023


kylebarron
Contributor

@kylebarron kylebarron commented Dec 1, 2023

Rationale for this change

Support reconstructing DataType after postMessage.

What changes are included in this PR?

Make typeId an attribute, not a getter.

Are these changes tested?

Passes all existing tests.

Are there any user-facing changes?

No
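
To illustrate the rationale, here is a minimal sketch of rebuilding a class instance from a structured clone by looking up an own `typeId` field. The classes, the `registry` map, and `revive` are hypothetical names for illustration, not the Arrow API. `structuredClone` (a global in Node 17+ and modern browsers) stands in for `postMessage`, which applies the same structured clone algorithm.

```javascript
// Hypothetical stand-ins for data type classes; not the Arrow implementation.
class Int16Like {
  constructor() {
    this.typeId = 1;   // own property: survives structured cloning
    this.bitWidth = 16;
  }
  describe() { return `Int${this.bitWidth}`; }
}

class Utf8Like {
  constructor() {
    this.typeId = 2;
  }
  describe() { return 'Utf8'; }
}

// Registry from type id to class, used on the receiving side.
const registry = new Map([
  [1, Int16Like],
  [2, Utf8Like],
]);

// Re-attach the prototype that structured cloning discarded.
function revive(plain) {
  const ctor = registry.get(plain.typeId);
  if (ctor === undefined) {
    throw new Error(`Unknown typeId: ${plain.typeId}`);
  }
  return Object.assign(Object.create(ctor.prototype), plain);
}

const received = structuredClone(new Int16Like());
const revived = revive(received);

console.log(received instanceof Int16Like); // false
console.log(revived instanceof Int16Like);  // true
console.log(revived.describe());            // "Int16"
```

The lookup only works because `typeId` is copied with the object; with a prototype getter, `plain.typeId` would be `undefined` on the receiving side.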


github-actions bot commented Dec 1, 2023

⚠️ GitHub issue #39017 has been automatically assigned in GitHub to PR creator.

@domoritz
Member

domoritz commented Dec 1, 2023

Makes sense to me overall. My main concern would be potential changes to the performance of creating types. Did you look at that?

@kylebarron
Contributor Author

I tested

const arrow = require('./targets/apache-arrow');

console.time("create data type")
for (let i = 0; i < 100000; i++) {
    new arrow.Uint16();
}
console.timeEnd("create data type")

on this branch and it took 3.861 ms; on main it took 1.587 ms. So if my moving of zeros is correct, each class instantiation costs roughly an extra 23 nanoseconds ((3.861 - 1.587) ms spread over 100,000 iterations)?
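
The per-instance estimate can be checked with a short sketch. `extraNsPerInstance` is a hypothetical helper; the timings and iteration count are the ones quoted above.

```javascript
// Extra cost per instantiation, in nanoseconds, from two total loop timings.
function extraNsPerInstance(branchMs, mainMs, iterations) {
  const NS_PER_MS = 1e6;
  return ((branchMs - mainMs) * NS_PER_MS) / iterations;
}

const extra = extraNsPerInstance(3.861, 1.587, 100000);
console.log(extra.toFixed(2)); // "22.74"
```

A single run of a loop like this is noisy (JIT warm-up, garbage collection), so the result is best read as an order of magnitude rather than a precise cost.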

@domoritz
Member

domoritz commented Dec 1, 2023

Since types are not singletons and we instantiate types a lot this may or may not matter. Let's run the benchmark suite and see.

There may also be additional memory usage, but again it might be negligible. @trxcllnt thoughts?

Could we move the property onto the prototype? Would that work with transferring?

@kylebarron
Contributor Author

> Let's run the benchmark suite and see.

I wasn't familiar with the benchmark suite; I'll try to run that.

> Could we move the property onto the prototype? Would that work with transferring?

From web.dev:

> Prototypes: If you use structuredClone() with a class instance, you’ll get a plain object as the return value, as structured cloning discards the object’s prototype chain.

So as far as I can tell, anything on the prototype is lost. The only options to support the structured clone are either:

  • Add the type id onto DataType as an attribute
  • Somehow serialize/prepare a DataType before moving to a worker
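
A minimal sketch of the difference, using two hypothetical classes (neither is Arrow's `DataType`): one exposes `typeId` through a prototype getter, the other assigns it in the constructor. `structuredClone` is called directly here; `postMessage` applies the same algorithm.

```javascript
// typeId lives on the prototype as a getter.
class GetterType {
  get typeId() { return 5; }
}

// typeId is an own property assigned per instance.
class FieldType {
  constructor() { this.typeId = 5; }
}

const clonedGetter = structuredClone(new GetterType());
const clonedField = structuredClone(new FieldType());

console.log(clonedGetter.typeId); // undefined
console.log(clonedField.typeId);  // 5
console.log(Object.getPrototypeOf(clonedField) === Object.prototype); // true
```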

@domoritz
Member

domoritz commented Dec 1, 2023

I'm good with this change but want to hear @trxcllnt's opinion before we merge.

IIRC, the reason why types are not singletons is that some types need details in the constructor. However, maybe we could add a type builder and then we could use singletons for the common types like float, string, and int. But that's beyond the scope of this pull request.
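
One way the singleton idea could look, as a rough sketch only: a cache keyed by constructor for types that take no constructor arguments. `Float64Like`, `BoolLike`, and `singleton` are hypothetical names and are not part of Arrow.

```javascript
// Hypothetical parameterless types; real types may need constructor details.
class Float64Like {
  constructor() { this.typeId = 3; }
}
class BoolLike {
  constructor() { this.typeId = 6; }
}

const instances = new Map();

// Return one shared, frozen instance per parameterless type.
function singleton(Ctor) {
  let instance = instances.get(Ctor);
  if (instance === undefined) {
    instance = Object.freeze(new Ctor());
    instances.set(Ctor, instance);
  }
  return instance;
}

console.log(singleton(Float64Like) === singleton(Float64Like)); // true
console.log(singleton(Float64Like) === singleton(BoolLike));    // false
```

Identity would not survive a structured clone, since each clone is a fresh object, and parameterized types would still need their own instances.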

@domoritz
Member

@ursabot please benchmark lang=JavaScript

@ursabot

ursabot commented Dec 16, 2023

Benchmark runs are scheduled for commit f4a5322. Watch https://buildkite.com/apache-arrow and https://conbench.ursa.dev for updates. A comment will be posted here when the runs are complete.


Thanks for your patience. Conbench analyzed the 1 benchmarking run that has been run so far on PR commit f4a5322.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details.

@domoritz
Member

Can you update it to add support for the new LargeUtf8 type? Did you see any differences in the benchmark suite?

I wonder whether it would be worth using singletons for common/primitive types. But that's orthogonal to this pull request and I am happy to merge if the benchmarks don't show huge differences.

I don't really trust this but https://conbench.ursa.dev/compare/runs/2d46394a52e04edab18a051c253c317d...3d356a645a664215934860d5a54e1828/ shows some big differences (where this is faster ????)

[Image: screenshot, 2023-12-16 22:11:27]

@kylebarron
Contributor Author

I ran yarn clean && yarn build && yarn perf on my M2 Pro chip on battery power.

This branch:

kyle at Kyles-MBP in ~/github/apache/arrow/js on kyle/typeid-attribute ✗                                    [03e935b0e]  15:27
> yarn perf
Prepare Data: 701.828ms
Running "vectorFromArray" suite...
from: numbers                  170 ops/s ±0.93%,  5.8 ms, 88 samples
from: booleans                 158 ops/s ±0.45%,  6.3 ms, 84 samples
from: dictionary               170 ops/s ±0.35%,  5.9 ms, 89 samples
Running "Iterate Vector" suite...
from: uint8Array             1,041 ops/s ±0.27%, 0.96 ms, 99 samples
from: uint16Array            1,002 ops/s ±0.26%, 0.99 ms, 96 samples
from: uint32Array              988 ops/s ±0.35%,    1 ms, 98 samples
from: uint64Array              361 ops/s ±0.17%,  2.8 ms, 94 samples
from: int8Array              1,043 ops/s ±0.16%, 0.96 ms, 98 samples
from: int16Array             1,001 ops/s ±0.18%,    1 ms, 98 samples
from: int32Array             1,015 ops/s ±0.42%, 0.98 ms, 99 samples
from: int64Array               349 ops/s ±0.17%,  2.9 ms, 95 samples
from: float32Array             905 ops/s ±0.47%,  1.1 ms, 95 samples
from: float64Array             921 ops/s ±0.99%,  1.1 ms, 97 samples
from: numbers                  929 ops/s ±0.34%,  1.1 ms, 98 samples
from: booleans                 334 ops/s ±0.35%,    3 ms, 91 samples
from: dictionary               358 ops/s ±0.32%,  2.8 ms, 93 samples
from: string                    89 ops/s ±0.62%,   11 ms, 78 samples
Running "Spread Vector" suite...
from: uint8Array               444 ops/s ±0.39%,  2.2 ms, 96 samples
from: uint16Array              435 ops/s ±0.87%,  2.3 ms, 94 samples
from: uint32Array              446 ops/s ±0.63%,  2.2 ms, 96 samples
from: uint64Array              192 ops/s ±0.53%,  5.2 ms, 78 samples
from: int8Array                443 ops/s ±0.49%,  2.2 ms, 96 samples
from: int16Array               450 ops/s ±0.25%,  2.2 ms, 97 samples
from: int32Array               449 ops/s ±0.34%,  2.2 ms, 97 samples
from: int64Array               195 ops/s ±0.65%,  5.1 ms, 85 samples
from: float32Array             379 ops/s ±0.62%,  2.6 ms, 82 samples
from: float64Array             376 ops/s ±0.70%,  2.6 ms, 90 samples
from: numbers                  379 ops/s ±0.63%,  2.6 ms, 83 samples
from: booleans                 203 ops/s ±0.23%,  4.9 ms, 88 samples
from: dictionary               217 ops/s ±0.31%,  4.6 ms, 86 samples
from: string                    74 ops/s ±0.24%,   14 ms, 77 samples
Running "toArray Vector" suite...
from: uint8Array        27,779,858 ops/s ±0.33%,    0 ms, 94 samples
from: uint16Array       27,641,412 ops/s ±0.25%,    0 ms, 98 samples
from: uint32Array       27,250,958 ops/s ±0.39%,    0 ms, 94 samples
from: uint64Array       28,013,695 ops/s ±0.38%,    0 ms, 94 samples
from: int8Array         27,400,403 ops/s ±0.27%,    0 ms, 99 samples
from: int16Array        27,375,344 ops/s ±0.43%,    0 ms, 96 samples
from: int32Array        26,809,273 ops/s ±0.59%,    0 ms, 90 samples
from: int64Array        27,522,709 ops/s ±0.89%,    0 ms, 94 samples
from: float32Array      24,712,256 ops/s ±0.42%,    0 ms, 98 samples
from: float64Array      24,668,548 ops/s ±0.64%,    0 ms, 96 samples
from: numbers           24,572,012 ops/s ±0.97%,    0 ms, 98 samples
from: booleans                 203 ops/s ±0.33%,  4.9 ms, 88 samples
from: dictionary               216 ops/s ±0.27%,  4.6 ms, 87 samples
from: string                    73 ops/s ±0.55%,   14 ms, 77 samples
Running "get Vector" suite...
from: uint8Array               424 ops/s ±0.28%,  2.4 ms, 92 samples
from: uint16Array              428 ops/s ±0.25%,  2.3 ms, 96 samples
from: uint32Array              431 ops/s ±0.37%,  2.3 ms, 93 samples
from: uint64Array              423 ops/s ±0.25%,  2.4 ms, 95 samples
from: int8Array                430 ops/s ±0.12%,  2.3 ms, 93 samples
from: int16Array               433 ops/s ±0.15%,  2.3 ms, 94 samples
from: int32Array               432 ops/s ±0.45%,  2.3 ms, 97 samples
from: int64Array               423 ops/s ±0.12%,  2.4 ms, 95 samples
from: float32Array            443 ops/s ±0.080%,  2.3 ms, 95 samples
from: float64Array             435 ops/s ±0.40%,  2.3 ms, 94 samples
from: numbers                  435 ops/s ±0.38%,  2.3 ms, 94 samples
from: booleans                 391 ops/s ±0.30%,  2.5 ms, 92 samples
from: dictionary              425 ops/s ±0.090%,  2.4 ms, 96 samples
from: string                    93 ops/s ±0.26%,   11 ms, 81 samples
Running "Parse" suite...
dataset: tracks, function: read recordBatches
       11,546 ops/s ±0.86%, 0.086 ms, 96 samples
dataset: tracks, function: write recordBatches
        1,269 ops/s ±4.4%,  0.73 ms, 85 samples
Running "Get values by index" suite...
dataset: tracks, column: lat, length: 1,000,000, type: Float32
         25.9 ops/s ±0.14%,   39 ms, 47 samples
dataset: tracks, column: lng, length: 1,000,000, type: Float32
         25.4 ops/s ±0.52%,   39 ms, 46 samples
dataset: tracks, column: origin, length: 1,000,000, type: Dictionary<Int8, Utf8>
         20.4 ops/s ±1.1%,    48 ms, 40 samples
dataset: tracks, column: destination, length: 1,000,000, type: Dictionary<Int8, Utf8>
         20.4 ops/s ±0.54%,   49 ms, 38 samples
Running "Iterate vectors" suite...
dataset: tracks, column: lat, length: 1,000,000, type: Float32
           94 ops/s ±0.59%,   11 ms, 82 samples
dataset: tracks, column: lng, length: 1,000,000, type: Float32
           94 ops/s ±0.38%,   11 ms, 81 samples
dataset: tracks, column: origin, length: 1,000,000, type: Dictionary<Int8, Utf8>
           36 ops/s ±0.36%,   28 ms, 64 samples
dataset: tracks, column: destination, length: 1,000,000, type: Dictionary<Int8, Utf8>
           36 ops/s ±0.24%,   28 ms, 64 samples
Running "Slice toArray vectors" suite...
dataset: tracks, column: lat, length: 1,000,000, type: Float32
        3,423 ops/s ±1.3%,  0.29 ms, 89 samples
dataset: tracks, column: lng, length: 1,000,000, type: Float32
        3,266 ops/s ±1.6%,   0.3 ms, 88 samples
dataset: tracks, column: origin, length: 1,000,000, type: Dictionary<Int8, Utf8>
         19.9 ops/s ±0.84%,   50 ms, 38 samples
dataset: tracks, column: destination, length: 1,000,000, type: Dictionary<Int8, Utf8>
         19.9 ops/s ±0.90%,   50 ms, 38 samples
Running "Slice vectors" suite...
dataset: tracks, column: lat, length: 1,000,000, type: Float32
    3,957,588 ops/s ±0.13%,    0 ms, 99 samples
dataset: tracks, column: lng, length: 1,000,000, type: Float32
    3,928,326 ops/s ±0.21%,    0 ms, 99 samples
dataset: tracks, column: origin, length: 1,000,000, type: Dictionary<Int8, Utf8>
    3,401,203 ops/s ±0.35%,    0 ms, 95 samples
dataset: tracks, column: destination, length: 1,000,000, type: Dictionary<Int8, Utf8>
    3,439,828 ops/s ±0.24%,    0 ms, 94 samples
Running "Spread vectors" suite...
dataset: tracks, column: lat, length: 1,000,000, type: Float32
         14.9 ops/s ±4.9%,    67 ms, 42 samples
dataset: tracks, column: lng, length: 1,000,000, type: Float32
         15.4 ops/s ±4.6%,    64 ms, 34 samples
dataset: tracks, column: origin, length: 1,000,000, type: Dictionary<Int8, Utf8>
         19.9 ops/s ±0.88%,   51 ms, 38 samples
dataset: tracks, column: destination, length: 1,000,000, type: Dictionary<Int8, Utf8>
           20 ops/s ±0.85%,   50 ms, 38 samples
Running "Table" suite...
Iterate, dataset: tracks, numRows: 1,000,000
           30 ops/s ±0.32%,   33 ms, 54 samples
Spread, dataset: tracks, numRows: 1,000,000
         8.59 ops/s ±6.3%,   106 ms, 26 samples
toArray, dataset: tracks, numRows: 1,000,000
         8.65 ops/s ±7.3%,   105 ms, 26 samples
get, dataset: tracks, numRows: 1,000,000
         15.5 ops/s ±0.35%,   64 ms, 43 samples
Running "Table Direct Count" suite...
dataset: tracks, column: lat, numRows: 1,000,000, type: Float32, test: gt, value: 0
         27.9 ops/s ±0.38%,   36 ms, 51 samples
dataset: tracks, column: lng, numRows: 1,000,000, type: Float32, test: gt, value: 0
         27.8 ops/s ±1.3%,    36 ms, 51 samples
dataset: tracks, column: origin, numRows: 1,000,000, type: Dictionary<Int8, Utf8>, test: eq, value: Seattle
           31 ops/s ±0.34%,   32 ms, 56 samples

Main branch:

kyle at Kyles-MBP in ~/github/apache/arrow/js on main ✗                                                                                                      [ec41209ea]  15:37
> yarn perf
Prepare Data: 716.741ms
Running "vectorFromArray" suite...
from: numbers                  164 ops/s ±1.0%,     6 ms, 85 samples
from: booleans                 148 ops/s ±0.15%,  6.8 ms, 86 samples
from: dictionary               177 ops/s ±0.15%,  5.7 ms, 91 samples
Running "Iterate Vector" suite...
from: uint8Array               969 ops/s ±0.25%,    1 ms, 98 samples
from: uint16Array              962 ops/s ±0.33%,    1 ms, 92 samples
from: uint32Array              954 ops/s ±0.41%,    1 ms, 98 samples
from: uint64Array              222 ops/s ±0.80%,  4.5 ms, 88 samples
from: int8Array                992 ops/s ±0.27%,    1 ms, 98 samples
from: int16Array               980 ops/s ±0.20%,    1 ms, 96 samples
from: int32Array               983 ops/s ±0.12%,    1 ms, 100 samples
from: int64Array               221 ops/s ±0.26%,  4.5 ms, 88 samples
from: float32Array             892 ops/s ±0.51%,  1.1 ms, 96 samples
from: float64Array             922 ops/s ±0.26%,  1.1 ms, 95 samples
from: numbers                  920 ops/s ±0.27%,  1.1 ms, 97 samples
from: booleans                 205 ops/s ±3.2%,   4.7 ms, 88 samples
from: dictionary               219 ops/s ±0.29%,  4.6 ms, 87 samples
from: string                    74 ops/s ±0.22%,   14 ms, 77 samples
Running "Spread Vector" suite...
from: uint8Array               440 ops/s ±0.33%,  2.3 ms, 95 samples
from: uint16Array              427 ops/s ±1.1%,   2.3 ms, 88 samples
from: uint32Array              432 ops/s ±0.28%,  2.3 ms, 93 samples
from: uint64Array              143 ops/s ±0.39%,    7 ms, 83 samples
from: int8Array                435 ops/s ±0.44%,  2.3 ms, 94 samples
from: int16Array               432 ops/s ±0.62%,  2.3 ms, 93 samples
from: int32Array               444 ops/s ±0.58%,  2.2 ms, 96 samples
from: int64Array               145 ops/s ±1.0%,   6.8 ms, 84 samples
from: float32Array             376 ops/s ±0.65%,  2.6 ms, 93 samples
from: float64Array             377 ops/s ±0.52%,  2.6 ms, 82 samples
from: numbers                  378 ops/s ±0.49%,  2.6 ms, 93 samples
from: booleans                 150 ops/s ±0.17%,  6.6 ms, 87 samples
from: dictionary               156 ops/s ±0.24%,  6.4 ms, 90 samples
from: string                    63 ops/s ±0.48%,   16 ms, 67 samples
Running "toArray Vector" suite...
from: uint8Array        25,110,513 ops/s ±0.34%,    0 ms, 97 samples
from: uint16Array       25,069,039 ops/s ±0.29%,    0 ms, 99 samples
from: uint32Array       25,195,442 ops/s ±0.12%,    0 ms, 99 samples
from: uint64Array       25,616,402 ops/s ±0.21%,    0 ms, 100 samples
from: int8Array         25,025,845 ops/s ±0.12%,    0 ms, 98 samples
from: int16Array        25,398,431 ops/s ±0.16%,    0 ms, 95 samples
from: int32Array        25,356,305 ops/s ±0.20%,    0 ms, 99 samples
from: int64Array        25,944,767 ops/s ±0.15%,    0 ms, 96 samples
from: float32Array      23,308,162 ops/s ±0.13%,    0 ms, 101 samples
from: float64Array      23,277,235 ops/s ±0.14%,    0 ms, 96 samples
from: numbers           23,174,411 ops/s ±0.31%,    0 ms, 98 samples
from: booleans                 150 ops/s ±0.23%,  6.7 ms, 87 samples
from: dictionary               157 ops/s ±0.26%,  6.4 ms, 90 samples
from: string                    64 ops/s ±0.54%,   16 ms, 68 samples
Running "get Vector" suite...
from: uint8Array               248 ops/s ±0.15%,    4 ms, 92 samples
from: uint16Array             250 ops/s ±0.080%,    4 ms, 93 samples
from: uint32Array              250 ops/s ±0.26%,    4 ms, 93 samples
from: uint64Array              247 ops/s ±0.26%,    4 ms, 92 samples
from: int8Array                248 ops/s ±0.17%,    4 ms, 92 samples
from: int16Array               250 ops/s ±0.13%,    4 ms, 93 samples
from: int32Array              250 ops/s ±0.080%,    4 ms, 93 samples
from: int64Array               247 ops/s ±0.13%,    4 ms, 92 samples
from: float32Array             254 ops/s ±0.50%,  3.9 ms, 94 samples
from: float64Array             250 ops/s ±0.27%,    4 ms, 93 samples
from: numbers                  251 ops/s ±0.22%,    4 ms, 93 samples
from: booleans                 234 ops/s ±0.22%,  4.3 ms, 93 samples
from: dictionary               241 ops/s ±0.18%,  4.1 ms, 89 samples
from: string                    77 ops/s ±0.20%,   13 ms, 80 samples
Running "Parse" suite...
dataset: tracks, function: read recordBatches
       11,778 ops/s ±0.67%, 0.084 ms, 90 samples
dataset: tracks, function: write recordBatches
        1,417 ops/s ±2.0%,  0.68 ms, 87 samples
Running "Get values by index" suite...
dataset: tracks, column: lat, length: 1,000,000, type: Float32
         18.5 ops/s ±0.48%,   54 ms, 50 samples
dataset: tracks, column: lng, length: 1,000,000, type: Float32
         18.3 ops/s ±0.26%,   55 ms, 50 samples
dataset: tracks, column: origin, length: 1,000,000, type: Dictionary<Int8, Utf8>
         15.4 ops/s ±0.11%,   65 ms, 43 samples
dataset: tracks, column: destination, length: 1,000,000, type: Dictionary<Int8, Utf8>
         15.4 ops/s ±0.19%,   65 ms, 43 samples
Running "Iterate vectors" suite...
dataset: tracks, column: lat, length: 1,000,000, type: Float32
           93 ops/s ±0.44%,   11 ms, 81 samples
dataset: tracks, column: lng, length: 1,000,000, type: Float32
           93 ops/s ±0.55%,   11 ms, 81 samples
dataset: tracks, column: origin, length: 1,000,000, type: Dictionary<Int8, Utf8>
         22.1 ops/s ±0.18%,   45 ms, 41 samples
dataset: tracks, column: destination, length: 1,000,000, type: Dictionary<Int8, Utf8>
           22 ops/s ±0.54%,   45 ms, 41 samples
Running "Slice toArray vectors" suite...
dataset: tracks, column: lat, length: 1,000,000, type: Float32
        3,158 ops/s ±1.7%,  0.32 ms, 83 samples
dataset: tracks, column: lng, length: 1,000,000, type: Float32
        3,224 ops/s ±1.3%,  0.31 ms, 88 samples
dataset: tracks, column: origin, length: 1,000,000, type: Dictionary<Int8, Utf8>
         15.1 ops/s ±0.72%,   66 ms, 41 samples
dataset: tracks, column: destination, length: 1,000,000, type: Dictionary<Int8, Utf8>
         15.1 ops/s ±0.27%,   66 ms, 41 samples
Running "Slice vectors" suite...
dataset: tracks, column: lat, length: 1,000,000, type: Float32
    3,568,919 ops/s ±0.60%,    0 ms, 97 samples
dataset: tracks, column: lng, length: 1,000,000, type: Float32
    3,629,150 ops/s ±0.44%,    0 ms, 98 samples
dataset: tracks, column: origin, length: 1,000,000, type: Dictionary<Int8, Utf8>
    3,243,659 ops/s ±0.41%,    0 ms, 98 samples
dataset: tracks, column: destination, length: 1,000,000, type: Dictionary<Int8, Utf8>
    3,265,349 ops/s ±0.11%,    0 ms, 92 samples
Running "Spread vectors" suite...
dataset: tracks, column: lat, length: 1,000,000, type: Float32
         14.9 ops/s ±5.3%,    65 ms, 42 samples
dataset: tracks, column: lng, length: 1,000,000, type: Float32
         15.1 ops/s ±4.5%,    66 ms, 41 samples
dataset: tracks, column: origin, length: 1,000,000, type: Dictionary<Int8, Utf8>
         14.7 ops/s ±1.2%,    66 ms, 42 samples
dataset: tracks, column: destination, length: 1,000,000, type: Dictionary<Int8, Utf8>
         15.1 ops/s ±0.12%,   66 ms, 42 samples
Running "Table" suite...
Iterate, dataset: tracks, numRows: 1,000,000
         19.9 ops/s ±0.25%,   50 ms, 37 samples
Spread, dataset: tracks, numRows: 1,000,000
         7.46 ops/s ±6.1%,   132 ms, 24 samples
toArray, dataset: tracks, numRows: 1,000,000
         7.42 ops/s ±6.2%,   130 ms, 23 samples
get, dataset: tracks, numRows: 1,000,000
         12.3 ops/s ±0.88%,   81 ms, 35 samples
Running "Table Direct Count" suite...
dataset: tracks, column: lat, numRows: 1,000,000, type: Float32, test: gt, value: 0
         18.9 ops/s ±0.61%,   53 ms, 52 samples
dataset: tracks, column: lng, numRows: 1,000,000, type: Float32, test: gt, value: 0
         19.1 ops/s ±0.15%,   52 ms, 52 samples
dataset: tracks, column: origin, numRows: 1,000,000, type: Dictionary<Int8, Utf8>, test: eq, value: Seattle
         20.2 ops/s ±0.12%,   49 ms, 38 samples

@kylebarron
Contributor Author

I hate statistics and forget how to compute if something is statistically significant, but moving from

Running "get Vector" suite...
from: uint8Array               248 ops/s ±0.15%,    4 ms, 92 samples

to

Running "get Vector" suite...
from: uint8Array               424 ops/s ±0.28%,  2.4 ms, 92 samples

seems like quite an impressive change. I can't find where Vector.get is defined (or overridden), but it seems wildly expensive for get to take 4 ms. Even now at 2.4 ms, that seems crazy long to find the index of the chunk and then select from it.
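
As a rough check rather than a formal significance test: if the ± figure is read as a relative margin of error around the mean (an assumption about the benchmark output format), then non-overlapping intervals are a conservative sign that the difference is not noise. `interval` and `overlaps` are hypothetical helpers.

```javascript
// Interval [low, high] around a mean, given a relative margin in percent.
function interval(opsPerSec, marginPercent) {
  const delta = opsPerSec * (marginPercent / 100);
  return [opsPerSec - delta, opsPerSec + delta];
}

// True when two closed intervals share at least one point.
function overlaps([lowA, highA], [lowB, highB]) {
  return lowA <= highB && lowB <= highA;
}

const mainRun = interval(248, 0.15);   // uint8Array on main
const branchRun = interval(424, 0.28); // uint8Array on this branch

console.log(overlaps(mainRun, branchRun)); // false
console.log((424 / 248).toFixed(2));       // "1.71"
```

The two runs were still taken at different times on a laptop on battery power, so factors outside the code change are not ruled out by this check.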

@domoritz
Member

get Vector is defined at

arrow/js/perf/index.ts

Lines 124 to 136 in ec41209

b.suite(
    `get Vector`,
    ...Object.entries(vectors).map(([name, vector]) =>
        b.add(`from: ${name}`, () => {
            for (let i = -1, n = vector.length; ++i < n;) {
                vector.get(i);
            }
        })),
    b.cycle(cycle)
);

It's doing more than one get per benchmark operation here (I think 100k, since that's the length of the test vectors). Absolute per-call numbers are hard to get here, I think.
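
Assuming 100,000 `get` calls per benchmark operation (the vector length is a guess in the comment above and is not verified here), the per-call cost can be estimated as follows. `nsPerCall` is a hypothetical helper.

```javascript
// Nanoseconds per inner call, given ops/s for a loop of callsPerOp calls.
function nsPerCall(opsPerSec, callsPerOp) {
  const NS_PER_SECOND = 1e9;
  return NS_PER_SECOND / (opsPerSec * callsPerOp);
}

const CALLS_PER_OP = 100000; // assumed vector length

console.log(nsPerCall(248, CALLS_PER_OP).toFixed(1)); // "40.3"
console.log(nsPerCall(424, CALLS_PER_OP).toFixed(1)); // "23.6"
```

Read this way, the 4 ms and 2.4 ms figures correspond to tens of nanoseconds per `get`, including loop overhead.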

@domoritz
Member

In the default configuration we use TypeScript, so you can just run yarn clean && yarn perf.

@domoritz
Member

Either way, this looks good. Let's merge and iterate from here. Maybe using singletons is even faster.

@domoritz domoritz merged commit 90f7eca into apache:main Dec 25, 2023
9 checks passed
@domoritz
Member

Before

$ perf/index.ts
Prepare Data: 955.886ms
Running "get Vector" suite...
from: uint8Array               169 ops/s ±0.34%,  5.9 ms, 87 samples
from: uint16Array              169 ops/s ±0.25%,  5.9 ms, 87 samples
from: uint32Array              168 ops/s ±0.28%,  5.9 ms, 87 samples
from: uint64Array              165 ops/s ±0.82%,    6 ms, 86 samples
from: int8Array                168 ops/s ±0.15%,    6 ms, 87 samples
from: int16Array               169 ops/s ±0.15%,  5.9 ms, 87 samples
from: int32Array               169 ops/s ±0.18%,  5.9 ms, 87 samples
from: int64Array               165 ops/s ±0.28%,    6 ms, 86 samples
from: float32Array             171 ops/s ±0.22%,  5.8 ms, 88 samples
from: float64Array             170 ops/s ±0.24%,  5.9 ms, 88 samples
from: numbers                  170 ops/s ±0.23%,  5.9 ms, 88 samples
from: booleans                 160 ops/s ±0.32%,  6.2 ms, 83 samples
from: dictionary               170 ops/s ±0.39%,  5.8 ms, 88 samples
from: string                    60 ops/s ±0.26%,   17 ms, 64 samples

after this change

$ perf/index.ts
Prepare Data: 956.121ms
Running "get Vector" suite...
from: uint8Array               278 ops/s ±0.31%,  3.6 ms, 90 samples
from: uint16Array              288 ops/s ±0.27%,  3.5 ms, 93 samples
from: uint32Array              297 ops/s ±0.23%,  3.4 ms, 90 samples
from: uint64Array              286 ops/s ±0.80%,  3.5 ms, 92 samples
from: int8Array                295 ops/s ±0.26%,  3.4 ms, 95 samples
from: int16Array               299 ops/s ±0.13%,  3.3 ms, 91 samples
from: int32Array               299 ops/s ±0.21%,  3.3 ms, 91 samples
from: int64Array               288 ops/s ±0.19%,  3.5 ms, 93 samples
from: float32Array             302 ops/s ±0.24%,  3.3 ms, 92 samples
from: float64Array             298 ops/s ±0.21%,  3.3 ms, 90 samples
from: numbers                  297 ops/s ±0.30%,  3.4 ms, 95 samples
from: booleans                 273 ops/s ±0.28%,  3.7 ms, 94 samples
from: dictionary               304 ops/s ±0.25%,  3.3 ms, 92 samples
from: string                    70 ops/s ±0.27%,   14 ms, 73 samples
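
To compare two runs like these line by line, a small parser can compute the throughput ratio per benchmark. This is a sketch that assumes the `from: <name> <ops> ops/s ...` line shape shown above; `parseLine` and `speedups` are hypothetical helpers, and only two lines from each run are included.

```javascript
// Parse one benchmark line into { name, opsPerSec }, or null if it does not match.
function parseLine(line) {
  const match = /^from:\s+(\S+)\s+([\d,.]+) ops\/s/.exec(line);
  if (match === null) return null;
  return { name: match[1], opsPerSec: Number(match[2].replace(/,/g, '')) };
}

// Map benchmark name -> after/before throughput ratio.
function speedups(beforeLines, afterLines) {
  const baseline = new Map();
  for (const line of beforeLines) {
    const parsed = parseLine(line);
    if (parsed !== null) baseline.set(parsed.name, parsed.opsPerSec);
  }
  const result = new Map();
  for (const line of afterLines) {
    const parsed = parseLine(line);
    if (parsed !== null && baseline.has(parsed.name)) {
      result.set(parsed.name, parsed.opsPerSec / baseline.get(parsed.name));
    }
  }
  return result;
}

const beforeRun = [
  'from: uint8Array               169 ops/s ±0.34%,  5.9 ms, 87 samples',
  'from: string                    60 ops/s ±0.26%,   17 ms, 64 samples',
];
const afterRun = [
  'from: uint8Array               278 ops/s ±0.31%,  3.6 ms, 90 samples',
  'from: string                    70 ops/s ±0.27%,   14 ms, 73 samples',
];

for (const [name, ratio] of speedups(beforeRun, afterRun)) {
  console.log(`${name}: ${ratio.toFixed(2)}x`);
}
// uint8Array: 1.64x
// string: 1.17x
```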


After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit 90f7eca.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 2 possible false positives for unstable benchmarks that are known to sometimes produce them.

@kylebarron
Contributor Author

> get Vector is defined at

Sorry, I meant I couldn't immediately tell where the implementation of get is defined. I saw

arrow/js/src/vector.ts

Lines 167 to 172 in 90f7eca

/**
 * Get an element value by position.
 * @param index The index of the element to read.
 */
// @ts-ignore
public get(index: number): T['TValue'] | null { return null; }

but couldn't tell where it gets overridden.

> In the default configuration we use TypeScript, so you can just run yarn clean && yarn perf.

Oh cool. I just wanted to make sure I wasn't accidentally testing on the wrong version.

@domoritz
Member

Get is defined in the get visitor: https://github.com/apache/arrow/blob/main/js/src/visitor/get.ts.
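
For readers unfamiliar with the pattern, here is a generic sketch of visitor-style dispatch keyed by a numeric type id. It is not the Arrow source: `GetVisitorSketch`, the type ids, and the data layout are all hypothetical, and the sketch only illustrates the general idea of choosing an element accessor from a type id.

```javascript
// Hypothetical type ids for the sketch.
const TypeId = { Int: 1, Utf8: 2 };

class GetVisitorSketch {
  // Pick a handler from the type id, then read the element.
  visit(data, index) {
    switch (data.type.typeId) {
      case TypeId.Int:
        return this.visitInt(data, index);
      case TypeId.Utf8:
        return this.visitUtf8(data, index);
      default:
        throw new Error(`Unsupported typeId: ${data.type.typeId}`);
    }
  }

  visitInt(data, index) {
    return data.values[index];
  }

  visitUtf8(data, index) {
    // Variable-length values: offsets delimit each string's bytes.
    const start = data.offsets[index];
    const end = data.offsets[index + 1];
    return new TextDecoder().decode(data.values.subarray(start, end));
  }
}

const visitor = new GetVisitorSketch();

const ints = { type: { typeId: TypeId.Int }, values: Int32Array.of(10, 20, 30) };
const strings = {
  type: { typeId: TypeId.Utf8 },
  offsets: Int32Array.of(0, 2, 5),
  values: new TextEncoder().encode('hiyou'),
};

console.log(visitor.visit(ints, 1));    // 20
console.log(visitor.visit(strings, 1)); // "you"
```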

@kylebarron kylebarron deleted the kyle/typeid-attribute branch December 26, 2023 17:45
clayburn pushed a commit to clayburn/arrow that referenced this pull request Jan 23, 2024
* Closes: apache#39017
dgreiss pushed a commit to dgreiss/arrow that referenced this pull request Feb 19, 2024
* Closes: apache#39017
Development

Successfully merging this pull request may close these issues.

[JS] Enable postMessage of Vector, Data, DataType objects
3 participants