Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-5981: [C++] Propagate errors from MemoTable to DictionaryBuilder #6347

Closed

Conversation

pitrou
Copy link
Member

@pitrou pitrou commented Feb 3, 2020

No description provided.

@pitrou
Copy link
Member Author

pitrou commented Feb 3, 2020

This incurs a noticeable slowdown in building integer dictionaries (10-20% depending on the benchmark). String dictionaries are not affected.

@github-actions
Copy link

github-actions bot commented Feb 3, 2020

@wesm
Copy link
Member

wesm commented Feb 4, 2020

Makes me wonder if there should not also be a set of "unsafe" APIs that skip error propagation and instead set a "global" error state in the memo table instance. So after a sequence of multiple operations you can check if any error occurred. (Just thinking out loud)

@pitrou
Copy link
Member Author

pitrou commented Feb 4, 2020

Makes me wonder if there should not also be a set of "unsafe" APIs that skip error propagation and instead set a "global" error state in the memo table instance.

i'll try that out.

@pitrou
Copy link
Member Author

pitrou commented Feb 4, 2020

Ok, after trying it out, I don't think it's worthwhile. AdaptiveIntBuilder can fail at almost any point (because it can double the stored integer width dynamically). In general, even if we don't check errors when inserting values into the memo table, we'll check errors somewhere else, when growing another buffer somewhere.

@pitrou pitrou force-pushed the ARROW-5981-memo-table-error-checking branch from 33158e7 to 0bdaea6 Compare February 4, 2020 16:22
@wesm
Copy link
Member

wesm commented Feb 4, 2020

OK, I figure if it shows up in a performance critical context in the future this can be analyzed more closely for use-case-specific optimizations

@pitrou pitrou deleted the ARROW-5981-memo-table-error-checking branch February 6, 2020 13:07
kszucs pushed a commit that referenced this pull request Feb 7, 2020
Closes #6347 from pitrou/ARROW-5981-memo-table-error-checking and squashes the following commits:

0bdaea6 <Antoine Pitrou> Some cleanups
efe4411 <Antoine Pitrou> ARROW-5981:  Propagate errors from MemoTable to DictionaryBuilder

Authored-by: Antoine Pitrou <antoine@python.org>
Signed-off-by: François Saint-Jacques <fsaintjacques@gmail.com>
brills pushed a commit to brills/arrow that referenced this pull request Mar 31, 2020
Closes apache#6347 from pitrou/ARROW-5981-memo-table-error-checking and squashes the following commits:

0bdaea6 <Antoine Pitrou> Some cleanups
efe4411 <Antoine Pitrou> ARROW-5981:  Propagate errors from MemoTable to DictionaryBuilder

Authored-by: Antoine Pitrou <antoine@python.org>
Signed-off-by: François Saint-Jacques <fsaintjacques@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants