
Error message with SQL Server #31

Closed
aersam opened this issue Dec 20, 2022 · 9 comments
Labels
bug Something isn't working

Comments

@aersam

aersam commented Dec 20, 2022

Hi there!

After months of using this lib in production, we hit an error we do not quite understand.

External error: The number of diagnostic records returned by ODBC seems to exceed `32,767`. This means not all (diagnostic) records could be inspected. This in turn may be problematic if invariants rely on checking that certain errors did not occur. Usually this many warnings are only generated if one or more warnings per row is generated, and then there are many rows. Maybe try fewer rows, or fix the cause of some of these warnings/errors?

Do you know what that means? :)

@pacman82
Owner

Hello @aersam,

yes I do have a hint. Some background facts:

  1. ODBC allows both error and warning messages to be retrieved as indexed diagnostic records. The integer type used for this index is a signed 16-bit integer.
  2. arrow-odbc-py currently does not forward warnings to any python logging facility.

What happens here is that a single query you execute causes some warnings. odbc-api, the Rust crate used by arrow-odbc (which in turn is used by arrow-odbc-py), tries to extract all the warnings in order to log them. It does so by retrieving diagnostic records from ODBC at increasingly higher indices until one record comes back empty. The error you see happens if so many warnings or errors are generated that the count exceeds the maximum value representable by that index type.
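
To illustrate the failure mode, here is a rough Python sketch of that retrieval loop. The real code is Rust inside odbc-api and calls `SQLGetDiagRec` under the hood; `fetch_record` and the function name here are just hypothetical stand-ins.

```python
I16_MAX = 32_767  # ODBC indexes diagnostic records with a signed 16-bit integer


def collect_diagnostics(fetch_record):
    """Collect records at index 1, 2, ... until one comes back empty."""
    records = []
    for index in range(1, I16_MAX + 1):
        record = fetch_record(index)  # hypothetical wrapper around SQLGetDiagRec
        if record is None:
            # An empty record proves we have seen every diagnostic.
            return records
        records.append(record)
    # Index space exhausted without an empty record: we can no longer guarantee
    # we inspected every record, so fail loudly rather than risk missing a
    # truncation warning. This is the error reported above.
    raise RuntimeError("number of diagnostic records seems to exceed 32,767")
```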

How does a single query generate so many errors?

Typically this happens when one warning is generated per row affected by your query. All it then takes is for the query to affect at least 32,767 rows to trigger this error.

So what to do with this information?

  • Personally, I could explore options to not even ask for these diagnostic records if I am not going to log them in Python anyway (retrieving these records might also take some time).
  • Alternatively, if I feel more ambitious, I could explore options to sensibly forward the information to Python while behaving gracefully in this situation. I am currently a bit sickly, so please do not expect much in the short term.
  • There is also stuff you could do without any changes in arrow-odbc. After all, the overflow is just a downstream error caused by something wrong in the query itself. If you know the query which caused this, you may want to try it using a tool which surfaces the warnings so you can fix them. One option would be to use the odbcsv command line tool to execute the same query and pay attention to what is emitted to stderr (see the sketch below this list).
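
For example, you could run the query through odbcsv from a small Python script and capture its stderr. This is only a sketch: the flags are from memory, so double check them against `odbcsv --help`, and the connection string and query are placeholders.

```python
import subprocess

result = subprocess.run(
    [
        "odbcsv", "-v", "query",  # -v raises verbosity; flags may differ by version
        "--connection-string", "Driver={ODBC Driver 17 for SQL Server};Server=...;",
        "out.csv",
        "SELECT * FROM my_table",
    ],
    capture_output=True,
    text=True,
)
print(result.stderr)  # with one warning per affected row, they all end up here
```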

I am always interested to hear more about the use case and background. I am also curious whether you find the above hints helpful, or what you would suggest the behaviour of arrow-odbc-py should be in this kind of situation.

Thanks for reporting this issue, and sorry for the trouble you encountered.

Best, Markus

pacman82 added the bug label Dec 20, 2022
@aersam
Author

aersam commented Dec 20, 2022

Hi Markus
Your response time is already famous at my company :)

Your explanations gave me a good hint. I think it might be the smalldatetime data type of SQL Server; I'll try a CAST in my query as a quick fix.
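
Something along these lines, just as a sketch; our real tables and connection details differ, and the exact arrow-odbc-py signature may vary between versions:

```python
from arrow_odbc import read_arrow_batches_from_odbc

# Cast the smalldatetime column inside the query itself; table, column, and
# connection string here are placeholders.
reader = read_arrow_batches_from_odbc(
    query="SELECT id, CAST(modified_at AS DATETIME2) AS modified_at FROM my_table",
    connection_string="Driver={ODBC Driver 17 for SQL Server};Server=...;Database=...;",
    batch_size=100_000,
)
for batch in reader:
    ...  # each batch is a pyarrow.RecordBatch
```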

Actually I think the ODBC driver should be fixed. One warning per row, isn't that ridiculous? :)

@pacman82
Owner

Hi @aersam,

> Your response time is already famous at my company :)

I feel flattered, but I am also "lucky" that I am at home with a cold.

> Your explanations gave me a good hint.

Did you resolve the issue? If not, the arrow-odbc 0.37.0 wheels have been released. They emit one of the warnings together with the error.

> Actually I think the ODBC driver should be fixed. One warning per row, isn't that ridiculous? :)

I think the driver is fine. There are some parts of the ODBC standard itself which could be improved upon, though. One aspect related to this issue is that it is kind of hard to tell whether any column in a result set contains truncated values because not enough memory has been allocated for its buffers, e.g. if the max_text_size parameter is set to a value smaller than a value in a column. In order to detect such truncation, the arrow-odbc Rust crate relies on inspecting all diagnostic records and checking that there is no truncation error among them. However, this strategy is only reliable if it can see all diagnostic records. If it cannot (because too many have been generated), it can no longer reliably tell if all the data is there.
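
For illustration, the trade-off shows up in a call like this (arrow-odbc-py parameter names; the figures and query are made up):

```python
from arrow_odbc import read_arrow_batches_from_odbc

reader = read_arrow_batches_from_odbc(
    query="SELECT id, long_text FROM my_table",
    connection_string="Driver={ODBC Driver 17 for SQL Server};Server=...;",
    batch_size=10_000,
    # Caps the buffer reserved per text value, which avoids huge allocations
    # for e.g. VARCHAR(MAX) columns. Any value longer than this is truncated,
    # and that is exactly what the diagnostic-record inspection must detect.
    max_text_size=4096,  # placeholder figure
)
```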

There might be some things I could do here, but for now the better error message must suffice. Please tell me if you find the cause of your issue.

Cheers, Markus

@aersam
Author

aersam commented Jan 9, 2023

I did try some stuff, but was not very successful. Switching to datetime did not help. And even though I use arrow-odbc 0.3.7, I do not get diagnostics. How am I supposed to get these? I just do `str(error)`.

I tried odbcsv.exe with the same query, and it works without error. That seems a bit strange to me?

@pacman82
Owner

pacman82 commented Jan 9, 2023

Thanks for the information. That explains (almost) everything, including why it works with odbcsv. The full explanation has to wait until this evening, though. In the meantime, with odbcsv (and maybe the -v (verbose) flag) you should see the diagnostics.

@pacman82
Owner

pacman82 commented Jan 9, 2023

Okay, here is the explanation, also as a personal reference for myself:

Normally arrow-odbc tries to determine the size of variably sized columns using the driver, and allocates enough memory to hold the entire batch. This can go wrong, however, if either the ODBC driver is badly implemented or the database schema is way too generous and would allow individual values to hold up to 2GB or more. In these cases it may be sensible to limit e.g. the maximum length of a string, in order to avoid out-of-memory errors.

Inevitably, we have now paved the way for a new kind of error: what if the column buffer is now too small? Without domain knowledge there is no way to know. At the very least we can give the user an error in case it happens. To do so, arrow-odbc inspects all generated diagnostic records. If none of them indicates a truncated string, it concludes all the data is safe and sound even with the smaller column size. However, to make that conclusion it must see all errors, and due to how ODBC works under the hood, it cannot safely conclude it has seen all errors if the query generates 32,767 or more diagnostic records. In order to prevent silent data loss, it throws the exception you see instead.

The newer version of arrow-odbc should forward at least one of the diagnostic records to you in addition to that error. Whether it is helpful or not depends on your driver, but MSSQL is usually very good in that regard. I would be interested in the complete error message you see with the most recent version.

odbcsv checks for truncation very differently, by inspecting the indicator buffer for each returned row individually. This is (obviously) more robust in this scenario, but it comes with a performance penalty on the happy path.

My current idea for how to move forward with this is to combine the benefits of both methods in odbc-api: basically, check whether any diagnostic record is present at all, and only if so use the indicator buffers to check for truncation.
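
In sketch form (hypothetical names, not the actual odbc-api implementation; SQL_NO_TOTAL is the real ODBC sentinel, though):

```python
SQL_NO_TOTAL = -4  # ODBC sentinel: total length of the value is unknown


def value_truncated(indicator: int, buffer_capacity: int) -> bool:
    """The indicator holds a value's full length before any truncation."""
    return indicator == SQL_NO_TOTAL or indicator > buffer_capacity


def batch_truncated(has_diagnostics: bool, indicators, buffer_capacity: int) -> bool:
    # Happy path: no diagnostic records at all implies no truncation warning
    # either, so skip the per-value scan entirely.
    if not has_diagnostics:
        return False
    # Slow path: fall back to the robust per-value indicator check.
    return any(value_truncated(i, buffer_capacity) for i in indicators)
```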

odbcsv certainly will emit some warnings. You should try to fix those; then arrow-odbc-py will also work again. Otherwise you will have to wait until this is improved first in odbc-api, then in arrow-odbc for Rust, and finally updated in these Python bindings. This may take some time, mostly in the first step of implementing it in odbc-api.

Best, Markus

@pacman82
Owner

arrow-odbc-py 0.3.9 has just been released. It no longer escalates warnings into an error, since the internal truncation check now uses a different algorithm which does not rely on ODBC diagnostics anymore. Please tell me if this solves your problem.

@aersam
Author

aersam commented Jan 16, 2023

Working, too!
We now load 3228 tables with 34 billion rows and a total size of 1.8 TB successfully with this library :)

aersam closed this as completed Jan 16, 2023
@pacman82
Owner

Hello @aersam,

Thanks for the feedback and the stats. Always nice to hear back. Glad you are happy with the performance.

Cheers, Markus
