'utf-8' codec can't decode byte #493

Closed
azdobylak opened this issue Mar 10, 2023 · 10 comments

Comments

@azdobylak

After migrating to Vertica 12.0.3 (from 12.0.1), I have an issue with decoding messages. I am using a cluster with 3 nodes; running a query on the 3rd node returns an error:

import vertica_python

conn_info = {
    'user': '***',
    'password': '***',
    'host': '<node3>',  # the node that returns the error
    'port': 5433,
    'database': 'vertica_db',
    'unicode_error': 'ignore',
}

connection = vertica_python.connect(**conn_info)
cur = connection.cursor()
cur.execute('select 3;')
print(cur.fetchone())  # raises the UnicodeDecodeError shown in the traceback

Traceback (most recent call last):
  File "test_vertica_python.py", line 14, in <module>
    print(cur.fetchone())
  File "/home/user/.pyenv/versions/tsg-scripts/lib/python3.7/site-packages/vertica_python/vertica/cursor.py", line 330, in fetchone
    self._message = self.connection.read_message()
  File "/home/user/.pyenv/versions/tsg-scripts/lib/python3.7/site-packages/vertica_python/vertica/connection.py", line 694, in read_message
    message = BackendMessage.from_type(type_, self.read_bytes(size - 4))
  File "/home/user/.pyenv/versions/tsg-scripts/lib/python3.7/site-packages/vertica_python/vertica/messages/message.py", line 102, in from_type
    return klass(data, **kwargs)
  File "/home/user/.pyenv/versions/tsg-scripts/lib/python3.7/site-packages/vertica_python/vertica/messages/backend_messages/command_complete.py", line 58, in __init__
    self.command_tag = data.decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd0 in position 0: unexpected end of data

At the same time, the query executes correctly on the 1st and 2nd nodes. The issue also occurred on the 2nd node, and for some time the query worked on the 3rd node.
The issue was spotted only with vertica_python; it works fine with pyodbc and vsql, so I suppose it is possible to improve response handling. I'm using vertica_python==1.3.1. To rule out a network issue, I reproduced the problem on the server itself (executed the script locally); the error was the same.
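
For readers unfamiliar with the error text, here is a minimal sketch of how Python's UTF-8 decoder produces the messages quoted in this thread. It uses only the standard library. The byte strings are hypothetical samples chosen to trigger each message, not captured server responses, and `describe_decode_failure` is a helper name invented for this illustration.

```python
# Hypothetical byte strings chosen to mirror the error messages in this thread;
# they are NOT captured server responses.
samples = {
    "lone lead byte": b"\xd0",
    "stray continuation byte": b"\xaf",
    "lead byte followed by ASCII": b"a\xe9bc",
}


def describe_decode_failure(raw: bytes) -> str:
    """Return the UnicodeDecodeError details for raw, or 'ok' if it decodes."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start is the offset of the offending byte, exc.reason the message.
        return f"byte 0x{raw[exc.start]:02x} in position {exc.start}: {exc.reason}"
    return "ok"


for label, raw in samples.items():
    print(f"{label}: {describe_decode_failure(raw)}")
```

In each case the bytes are simply not valid UTF-8 text, which suggests the payload itself is wrong rather than encoded in a different character set.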

@azdobylak
Author

I tried to debug the issue by checking the fetched bytes on a working host and on the failing host.

  • Working host: type: b'C', size: 5, bytes: b'\x00'
  • Failing host: type: b'C', size: 6, bytes: b'\xc2\x00'

If I blindly overwrite the value C2 00 with 00 and proceed, then it works and returns the correct result. I would appreciate help from someone who actually understands the meaning of these codes.
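
To make the reported values easier to read, here is a rough sketch of how such a frame could be split apart. It assumes a 1-byte type code followed by a 4-byte big-endian size that counts itself but not the type byte, which is consistent with `read_bytes(size - 4)` in the traceback. It also assumes the command tag is null-terminated. `split_message` is a hypothetical helper, not part of vertica_python, and the frames are rebuilt from the values above rather than captured from the wire.

```python
import struct


def split_message(frame: bytes):
    """Split one backend frame into (type, size, payload).

    Assumption: 1-byte type code, then a 4-byte big-endian size that
    includes its own 4 bytes but not the type byte.
    """
    type_ = frame[:1]
    (size,) = struct.unpack("!I", frame[1:5])
    payload = frame[5:5 + size - 4]
    return type_, size, payload


# Frames rebuilt from the values reported above (hypothetical reconstruction).
good = b"C" + struct.pack("!I", 5) + b"\x00"
bad = b"C" + struct.pack("!I", 6) + b"\xc2\x00"

for name, frame in (("working host", good), ("failing host", bad)):
    type_, size, payload = split_message(frame)
    tag = payload.rstrip(b"\x00")  # assumption: the tag is null-terminated
    print(name, type_, size, payload, "tag bytes:", tag)
```

Under these assumptions the working host sends an empty command tag, while the failing host sends one extra byte (0xc2) in front of the terminator. That byte is a UTF-8 lead byte with nothing after it, so decoding fails.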

@sitingren
Member

sitingren commented Mar 10, 2023

Hi @azdobylak, I have no clue based on your description, and I cannot reproduce the problem in my environment.

  • Can you confirm this is a problem caused by the server upgrade (from 12.0.1 to 12.0.3) and is unrelated to any client version upgrade?
  • Do all 3 nodes have the same environment settings?
  • Have you turned on TLS?
  • Does that happen on specific queries?

@azdobylak
Author

azdobylak commented Mar 10, 2023

  • At first, it looked so. But we just upgraded the dev environment and I was unable to reproduce the error there. No relation to a client upgrade: same error with vertica_python==1.0.5.
  • Yes
  • TLS off
  • Yes. When I replaced select 3; with select <column> from <table>; it proceeded to the RowDescription condition and returned the correct response

@sitingren
Member

You should receive RowDescription (type: b'T'), DataRow (type: b'D'), and CommandComplete (type: b'C') messages, in that order.

It seems to be a problem not with vertica-python, but with either your environment or the server. If you still see this error in your upgraded dev environment, please leave your OS info, Python version, and server version (x.x.x-x), and I'll report this to the server team.
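
When logging message types while debugging, a check like the following could confirm the order described above. This is only a sketch: the three type codes come from this comment, other backend message types are ignored, and the "zero or more DataRow messages" rule as well as the names `MESSAGE_NAMES` and `follows_expected_order` are assumptions made for illustration.

```python
# Type codes taken from the comment above; other backend message types are omitted.
MESSAGE_NAMES = {b"T": "RowDescription", b"D": "DataRow", b"C": "CommandComplete"}


def follows_expected_order(type_codes) -> bool:
    """True if the codes are one T, then zero or more D, then one C.

    Assumption: a result set may contain any number of DataRow messages.
    """
    codes = list(type_codes)
    if len(codes) < 2 or codes[0] != b"T" or codes[-1] != b"C":
        return False
    return all(code == b"D" for code in codes[1:-1])


observed = [b"T", b"D", b"C"]
print([MESSAGE_NAMES[code] for code in observed], follows_expected_order(observed))
```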

@sitingren
Member

sitingren commented Apr 19, 2023

The workaround is published in vertica-python v1.3.2.

In the meantime, anyone who hits this problem is welcome to report their error details here; we'd like to collect as much info as possible to fix this server bug. v1.3.2 will issue a warning message if you hit this issue, and you can share that warning in this thread.
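
For anyone pinned to an older client, a tolerant decode with a warning might look roughly like this. To be clear, this is not the code shipped in v1.3.2. It is a standalone sketch of the general idea using only the standard library, and `decode_command_tag` is a hypothetical name.

```python
import warnings


def decode_command_tag(raw: bytes) -> str:
    """Decode a command tag, warning instead of raising on invalid UTF-8.

    Hypothetical helper; NOT the actual vertica-python workaround.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        warnings.warn(
            f"command tag length: {len(raw)}, content: {raw!r}, error: {exc}",
            RuntimeWarning,
        )
        # Fall back to replacing undecodable bytes with U+FFFD.
        return raw.decode("utf-8", errors="replace")


print(decode_command_tag(b"SELECT"))
```

The fallback keeps the result set usable while still surfacing the bad bytes so they can be reported.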

@snc-lin

snc-lin commented May 16, 2023

command tag length: 1
command tag content: b'\xaf'
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xaf in position 0: invalid start byte
Server version: 12.0.4
Query executed (if possible): xxx
The OS of each server node (if possible): CentOS 7.9.2009
The locale of each server node (if possible): en_US@collation=binary

@chathawee

I also got the same error on version 1.3.2.
This is the error: vertica error: 'utf-8' codec can't decode byte 0xe9 in position 1: invalid continuation byte

@sitingren
Member

Server version 12.0.4-3 has the full fix now. Please upgrade your server to fix this issue.
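
A small helper can compare a server version string against the release named above. This is a sketch under two assumptions: the version appears somewhere in the string in x.x.x-x form (the sample string below is illustrative, not output copied from a server), and every release at or after 12.0.4-3 carries the fix. The function names are hypothetical.

```python
import re

FIXED_IN = (12, 0, 4, 3)  # from the comment above: 12.0.4-3


def parse_server_version(text: str):
    """Extract (major, minor, patch, hotfix) from text containing 'x.x.x-x'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)-(\d+)", text)
    if match is None:
        raise ValueError(f"no x.x.x-x version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def at_or_after_fix_release(text: str) -> bool:
    # Assumption: every release at or after 12.0.4-3 carries the fix.
    # Older releases return False even if they never had the bug.
    return parse_server_version(text) >= FIXED_IN


# Illustrative version string; the exact server output format is assumed.
print(at_or_after_fix_release("Vertica Analytic Database v12.0.4-3"))
```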

@sitingren changed the title from "Unable to decode messages" to "'utf-8' codec can't decode byte" on May 26, 2023

@chathawee

@sitingren Did you mean upgrading Vertica to version 12.0.4-3?

@sitingren
Member

@chathawee Yes

@sitingren sitingren unpinned this issue Aug 4, 2023