-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GH-39017: [JS] Add typeId
as attribute
#39018
Conversation
|
Makes sense to me overall. My main concern would be potential changes to the performance of creating types. Did you look at that? |
I tested const arrow = require('./targets/apache-arrow');
console.time("create data type")
for (let i = 0; i < 100000; i++) {
new arrow.Uint16();
}
console.timeEnd("create data type") on this branch and it was 3.861ms, and on main it was 1.587ms. So if my moving of zeros is correct, each class instantiation is an extra 22 nanoseconds? |
Since types are not singletons and we instantiate types a lot this may or may not matter. Let's run the benchmark suite and see. There may also be additional memory usage as well but again might be negligible. @trxcllnt thoughts? Could we move the property onto the prototype? Would that work with transferring? |
I wasn't familiar with the benchmark suite; I'll try to run that
So as far as I can tell, anything on the prototype is lost. The only options to support the structured clone are either:
|
I'm good with this change but want to hear @trxcllnt's opinion before we merge. IIRC, the reason why types are not singletons is that some types need details in the constructor. However, maybe we could add a type builder and then we could use singletons for the common types like float, string, and int. But that's beyond the scope of this pull request. |
@ursabot please benchmark lang=JavaScript |
Benchmark runs are scheduled for commit f4a5322. Watch https://buildkite.com/apache-arrow and https://conbench.ursa.dev for updates. A comment will be posted here when the runs are complete. |
Thanks for your patience. Conbench analyzed the 1 benchmarking run that has been run so far on PR commit f4a5322. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. |
Can you update it to add support for the new LargeUTF8 type? Did you see any differences in the benchmark suite? I wonder whether it would be worth using singletons for common/primitive types. But that's orthogonal to this pull request and I am happy to merge if the benchmarks don't show huge differences. I don't really trust this but https://conbench.ursa.dev/compare/runs/2d46394a52e04edab18a051c253c317d...3d356a645a664215934860d5a54e1828/ shows some big differences (where this is faster ????) |
I ran This branch:
Main branch:
|
I hate statistics and forget how to compute if something is statistically significant, but moving from
to
seems like quite an impressive change. I can't find where |
get Vector is defined at Lines 124 to 136 in ec41209
|
In the default configuration we use typescript so you can just run |
Either way, this looks good. Let's merge and iterate from here. Maybe using singletons is even faster. |
Before
after this change
|
After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit 90f7eca. There were no benchmark performance regressions. 🎉 The full Conbench report has more details. It also includes information about 2 possible false positives for unstable benchmarks that are known to sometimes produce them. |
Sorry, I meant I couldn't immediately tell where the implementation of Lines 167 to 172 in 90f7eca
but couldn't tell where it gets overridden.
Oh cool. I just wanted to make sure I wasn't accidentally testing on the wrong version |
Get is defined in the get visitor: https://github.com/apache/arrow/blob/main/js/src/visitor/get.ts. |
### Rationale for this change Support reconstructing `DataType` after `postMessage`. ### What changes are included in this PR? Make `typeId` an attribute, not a getter. ### Are these changes tested? Passes all existing tests. <!-- We typically require tests for all PRs in order to: 1. Prevent the code from being accidentally broken by subsequent changes 2. Serve as another way to document the expected behavior of the code If tests are not included in your PR, please explain why (for example, are they covered by existing tests)? --> ### Are there any user-facing changes? No <!-- If there are user-facing changes then we may require documentation to be updated before approving the PR. --> <!-- If there are any breaking changes to public APIs, please uncomment the line below and explain which changes are breaking. --> <!-- **This PR includes breaking changes to public APIs.** --> <!-- Please uncomment the line below (and provide explanation) if the changes fix either (a) a security vulnerability, (b) a bug that caused incorrect or invalid data to be produced, or (c) a bug that causes a crash (even when the API contract is upheld). We use this to highlight fixes to issues that may affect users without their knowledge. For this reason, fixing bugs that cause errors don't count, since those are usually obvious. --> <!-- **This PR contains a "Critical Fix".** --> * Closes: apache#39017
### Rationale for this change Support reconstructing `DataType` after `postMessage`. ### What changes are included in this PR? Make `typeId` an attribute, not a getter. ### Are these changes tested? Passes all existing tests. <!-- We typically require tests for all PRs in order to: 1. Prevent the code from being accidentally broken by subsequent changes 2. Serve as another way to document the expected behavior of the code If tests are not included in your PR, please explain why (for example, are they covered by existing tests)? --> ### Are there any user-facing changes? No <!-- If there are user-facing changes then we may require documentation to be updated before approving the PR. --> <!-- If there are any breaking changes to public APIs, please uncomment the line below and explain which changes are breaking. --> <!-- **This PR includes breaking changes to public APIs.** --> <!-- Please uncomment the line below (and provide explanation) if the changes fix either (a) a security vulnerability, (b) a bug that caused incorrect or invalid data to be produced, or (c) a bug that causes a crash (even when the API contract is upheld). We use this to highlight fixes to issues that may affect users without their knowledge. For this reason, fixing bugs that cause errors don't count, since those are usually obvious. --> <!-- **This PR contains a "Critical Fix".** --> * Closes: apache#39017
Rationale for this change
Support reconstructing
DataType
afterpostMessage
.What changes are included in this PR?
Make
typeId
an attribute, not a getter.Are these changes tested?
Passes all existing tests.
Are there any user-facing changes?
No
Vector
,Data
,DataType
objects #39017