Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-39355: [Java] Improve JdbcConsumer exceptions #39356

Merged
merged 3 commits into from
Jan 8, 2024

Conversation

aiguofer
Copy link
Contributor

@aiguofer aiguofer commented Dec 22, 2023

Rationale for this change

This helps debug Arrow conversion errors from JDBC by exposing the JdbcFieldInfo for the related column and the ArrowType for the corresponding vector.

What changes are included in this PR?

A new JdbcConsumerException which is thrown by the CompositeJdbcConsumer while consuming data for a specific vector.

Are these changes tested?

N/A

Are there any user-facing changes?

Users can now catch JdbcConsumerExceptions and extract the related ArrowType and JdbcFieldInfo from it for debugging.

Copy link

⚠️ GitHub issue #39355 has been automatically assigned in GitHub to PR creator.

@github-actions github-actions bot added awaiting committer review Awaiting committer review and removed awaiting review Awaiting review labels Dec 22, 2023
@davisusanibar
Copy link
Contributor

Thanks to @aiguofer for this enhancement. Would it be possible to add a unit test to cover these new exception improvements?

@vibhatha
Copy link
Collaborator

vibhatha commented Jan 1, 2024

@aiguofer agreeing with @davisusanibar here.
It would be better to add a test case which is failing as expected and verify the exception message as an assertion.
Also just curious would this require documentation update?

@aiguofer
Copy link
Contributor Author

aiguofer commented Jan 2, 2024

@davisusanibar @vibhatha It would certainly be great to have tests for consumer! unfortunately there's not much prior art regarding this. I'm not very familiar with the test suite in general and this would likely require mocking a ResultSet along with the corresponding ResultSetMetaData in order to properly test this, which seems like a lot of work.

Copy link
Member

@lidavidm lidavidm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For testing, we do use hsqldb in the suite to integration test against an actual database (instead of mocking everything). That said mocking may be easier here overall

@github-actions github-actions bot added awaiting merge Awaiting merge and removed awaiting committer review Awaiting committer review labels Jan 2, 2024
@lidavidm
Copy link
Member

lidavidm commented Jan 2, 2024

Also just curious would this require documentation update?

It would be good to add a @throws in the docstring for ArrowVectorIterator. Actually, this might be considered a breaking change since people may expect SQLException currently. (And it's documented in sqlToArrowVectorIterator.)

Possibly we should extend SQLException instead?

@vibhatha
Copy link
Collaborator

vibhatha commented Jan 3, 2024

Possibly we should extend SQLException instead?

+1

@vibhatha
Copy link
Collaborator

vibhatha commented Jan 3, 2024

For testing, we do use hsqldb in the suite to integration test against an actual database (instead of mocking everything). That said mocking may be easier here overall

should we include it in this PR itself or should we create follow up?

@aiguofer
Copy link
Contributor Author

aiguofer commented Jan 5, 2024

Sorry all, I'm just getting back from break. The exceptions are not always a SQLException, they can also be a conversion error. For example, in our case what prompted this PR was this redacted stack trace:

java.lang.RuntimeException: Error occurred while getting next schema root.
	at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.next(ArrowVectorIterator.java:181)
        ...
Caused by: java.lang.RuntimeException: Error occurred while consuming data.
	at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.consumeData(ArrowVectorIterator.java:117)
	at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.load(ArrowVectorIterator.java:159)
	at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.next(ArrowVectorIterator.java:177)
	... 10 common frames omitted
Caused by: java.lang.UnsupportedOperationException: Decimal size greater than 16 bytes: 17
	at org.apache.arrow.vector.util.DecimalUtility.writeByteArrayToArrowBufHelper(DecimalUtility.java:175)
	at org.apache.arrow.vector.util.DecimalUtility.writeBigDecimalToArrowBuf(DecimalUtility.java:136)
	at org.apache.arrow.vector.DecimalVector.set(DecimalVector.java:365)
	at org.apache.arrow.adapter.jdbc.consumer.DecimalConsumer.set(DecimalConsumer.java:79)
	at org.apache.arrow.adapter.jdbc.consumer.DecimalConsumer$NullableDecimalConsumer.consume(DecimalConsumer.java:101)
	at org.apache.arrow.adapter.jdbc.consumer.CompositeJdbcConsumer.consume(CompositeJdbcConsumer.java:46)
	at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.consumeData(ArrowVectorIterator.java:106)
	... 12 common frames omitted

Currently, the JdbcConsumerException extends a RuntimeException so no throws is necessary is non-breaking. It could potentially be useful to extend Exception instead and add throws, but to your point that'd be a breaking change.

@lidavidm
Copy link
Member

lidavidm commented Jan 5, 2024

Ok, thanks. Then I think just documenting it in @throws is sufficient.

@aiguofer
Copy link
Contributor Author

aiguofer commented Jan 7, 2024

@lidavidm done!

@lidavidm lidavidm merged commit a71c941 into apache:main Jan 8, 2024
15 checks passed
@lidavidm lidavidm removed the awaiting merge Awaiting merge label Jan 8, 2024
@github-actions github-actions bot added the awaiting merge Awaiting merge label Jan 8, 2024
Copy link

After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit a71c941.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 5 possible false positives for unstable benchmarks that are known to sometimes produce them.

clayburn pushed a commit to clayburn/arrow that referenced this pull request Jan 23, 2024
### Rationale for this change

This helps debug Arrow conversion errors from JDBC by exposing the JdbcFieldInfo for the related column and the ArrowType for the corresponding vector.

### What changes are included in this PR?

A new JdbcConsumerException which is thrown by the CompositeJdbcConsumer while consuming data for a specific vector.

### Are these changes tested?

N/A

### Are there any user-facing changes?

Users can now catch `JdbcConsumerException`s and extract the related ArrowType and JdbcFieldInfo from it for debugging.

* Closes: apache#39355

Authored-by: Diego Fernandez <aiguo.fernandez@gmail.com>
Signed-off-by: David Li <li.davidm96@gmail.com>
dgreiss pushed a commit to dgreiss/arrow that referenced this pull request Feb 19, 2024
### Rationale for this change

This helps debug Arrow conversion errors from JDBC by exposing the JdbcFieldInfo for the related column and the ArrowType for the corresponding vector.

### What changes are included in this PR?

A new JdbcConsumerException which is thrown by the CompositeJdbcConsumer while consuming data for a specific vector.

### Are these changes tested?

N/A

### Are there any user-facing changes?

Users can now catch `JdbcConsumerException`s and extract the related ArrowType and JdbcFieldInfo from it for debugging.

* Closes: apache#39355

Authored-by: Diego Fernandez <aiguo.fernandez@gmail.com>
Signed-off-by: David Li <li.davidm96@gmail.com>
zanmato1984 pushed a commit to zanmato1984/arrow that referenced this pull request Feb 28, 2024
### Rationale for this change

This helps debug Arrow conversion errors from JDBC by exposing the JdbcFieldInfo for the related column and the ArrowType for the corresponding vector.

### What changes are included in this PR?

A new JdbcConsumerException which is thrown by the CompositeJdbcConsumer while consuming data for a specific vector.

### Are these changes tested?

N/A

### Are there any user-facing changes?

Users can now catch `JdbcConsumerException`s and extract the related ArrowType and JdbcFieldInfo from it for debugging.

* Closes: apache#39355

Authored-by: Diego Fernandez <aiguo.fernandez@gmail.com>
Signed-off-by: David Li <li.davidm96@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Java] Improve JdbcConsumer errors
4 participants