fix conversion of varchar to binary varbinary and vice versa #1957

Merged

Conversation

@tanscorpio7 tanscorpio7 commented Oct 26, 2023

Description

Binary data types simply store bytes, but when a value is converted to or from varchar, using the correct encoding becomes important. For example, the symbol '™' is stored as
0xE284A2 in UTF-8, but as
0x99 in the Windows encoding (WIN1252).
From the user's perspective, the hex value they see must match their server encoding.
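
As a quick illustration of the two byte sequences above, the following Python sketch (not part of this PR) encodes the same character under both encodings. The helper name `to_hex` is hypothetical, and Python's `cp1252` codec stands in for the Windows encoding.

```python
# Illustration only: the same character yields different bytes per encoding.
SYMBOL = "\u2122"  # the trademark sign


def to_hex(text: str, encoding: str) -> str:
    """Return the encoded bytes of `text` as a 0x-prefixed hex string."""
    return "0x" + text.encode(encoding).hex().upper()


utf8_hex = to_hex(SYMBOL, "utf-8")   # three bytes in UTF-8
win_hex = to_hex(SYMBOL, "cp1252")   # one byte in Windows-1252

print(utf8_hex)  # 0xE284A2
print(win_hex)   # 0x99
```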

To fix this, we apply the necessary encoding conversion to the source data in the internal varchar <--> varbinary functions.
Length checks are done once the data (hex string) is in the server encoding.
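
The ordering matters because the byte count depends on the encoding. Below is a minimal Python model of the idea, not the actual C code in varbinary.c. The function names are hypothetical, and raising an error on overflow is an assumption made for the sketch (the real functions may truncate instead).

```python
# Model only: convert to the server encoding first, then check the length.
def varchar_to_varbinary(text: str, server_encoding: str, max_len: int) -> bytes:
    encoded = text.encode(server_encoding)
    # The length check runs on the bytes in the server encoding.
    # Assumption: overflow raises; the real code may truncate instead.
    if len(encoded) > max_len:
        raise ValueError(f"{len(encoded)} bytes exceed the declared length {max_len}")
    return encoded


def varbinary_to_varchar(data: bytes, server_encoding: str) -> str:
    # Reverse direction: interpret the stored bytes in the server encoding.
    return data.decode(server_encoding)


tm = "\u2122"
win_bytes = varchar_to_varbinary(tm, "cp1252", max_len=1)  # one byte, fits
round_trip = varbinary_to_varchar(win_bytes, "cp1252")

try:
    varchar_to_varbinary(tm, "utf-8", max_len=1)  # three bytes, does not fit
    utf8_fits = True
except ValueError:
    utf8_fits = False
```

With a WIN1252 server encoding the character fits a one-byte target, while the same check against the UTF-8 bytes would reject it.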

We must also handle string literals cast to binary types, so we explicitly call the varcharvarbinary conversion when a string literal is cast to a binary data type.

Also added casts from sys.BBF_VARBINARY to sys.BBF_BINARY and vice versa.
These casts are used in the geography and geometry cast functions,
which were updated accordingly,
i.e. CAST (CAST ($1 AS sys.VARCHAR) AS sys.bbf_varbinary) --> CAST ($1 AS sys.bbf_varbinary)
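
One plausible reading of why the intermediate sys.VARCHAR hop is dropped: once varchar conversions are encoding-aware, arbitrary binary payloads may no longer pass through a string type unchanged. The Python sketch below models that under stated assumptions (a WIN1252 server encoding, hypothetical function names); it is not Babelfish code.

```python
# Model only: a binary -> varchar -> binary hop versus a direct binary cast.
SERVER_ENCODING = "cp1252"  # assumed server encoding for this sketch


def direct_cast(raw: bytes) -> bytes:
    # Binary-to-binary: the bytes are passed through untouched.
    return bytes(raw)


def cast_via_varchar(raw: bytes) -> bytes:
    # The hop through a string type must interpret the bytes as characters.
    text = raw.decode(SERVER_ENCODING)
    return text.encode(SERVER_ENCODING)


printable = bytes.fromhex("99")
same_result = cast_via_varchar(printable) == direct_cast(printable)

# 0x81 has no character assigned in Python's cp1252 codec.
opaque = bytes.fromhex("0181")
try:
    cast_via_varchar(opaque)
    via_varchar_failed = False
except UnicodeDecodeError:
    via_varchar_failed = True
```

The direct cast accepts any byte sequence, which suits opaque payloads such as serialized geometry values.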

Engine PR: babelfish-for-postgresql/postgresql_modified_for_babelfish#248

Extension PR: #1957

Issues Resolved

[BABEL-1940]

Sign-off

Signed-off-by: Tanzeel Khan <tzlkhan@amazon.com>
Co-authored-by: Rohit Bhagat <rohitbgt@amazon.com>

Check List

  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is under the terms of the Apache 2.0 and PostgreSQL licenses, and grant any person obtaining a copy of the contribution permission to relicense all or a portion of my contribution to the PostgreSQL License solely to contribute all or a portion of my contribution to the PostgreSQL open source project.

For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Tanzeel Khan <tzlkhan@amazon.com>
@sumitj824

Please add tests.

@rohit01010

1> SELECT CONVERT(VARBINARY(10), CONVERT(VARCHAR(10), 0x330033))
2> go
Msg 33557097, Level 16, State 1, Server BABELFISH, Line 1
invalid byte sequence for encoding "UTF8": 0x00

I am getting the above error with the changes from this PR. Please resolve it and add this query to the test cases.
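
For context, the literal 0x330033 has a zero byte in the middle, and PostgreSQL text values cannot contain a zero byte, which is what the error message reports. A small Python sketch (illustrative only, variable names hypothetical) that locates the offending byte:

```python
# Illustration only: find the zero byte in the reported hex literal.
payload = bytes.fromhex("330033")

# Offsets of every zero byte in the payload.
nul_offsets = [i for i, b in enumerate(payload) if b == 0]

print(nul_offsets)  # [1]
```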

contrib/babelfishpg_common/src/varbinary.c (resolved)
contrib/babelfishpg_common/src/varbinary.c (outdated, resolved)
test/JDBC/expected/BABEL_1940.out (resolved)
Deepesh125 previously approved these changes Nov 10, 2023
contrib/babelfishpg_common/src/varbinary.c (outdated, resolved)
contrib/babelfishpg_common/src/varbinary.c (outdated, resolved)
@tanscorpio7 tanscorpio7 changed the title from "Varchar Varbinary inter conversion" to "fix conversion of varchar to binary varbinary and vice versa" Nov 10, 2023
@tanscorpio7 tanscorpio7 changed the title from "fix conversion of varchar to binary varbinary and vice versa" to "proper release sys cache reference after string literal hook" Nov 10, 2023
@tanscorpio7 tanscorpio7 changed the title from "proper release sys cache reference after string literal hook" to "release sys cache reference after string literal hook" Nov 10, 2023
@tanscorpio7 tanscorpio7 changed the title from "release sys cache reference after string literal hook" to "fix conversion of varchar to binary varbinary and vice versa" Nov 10, 2023
Deepesh125 pushed a commit to babelfish-for-postgresql/postgresql_modified_for_babelfish that referenced this pull request Nov 10, 2023
Release the syscache reference when returning non-null nodes from the string literal hook, to avoid a cache reference leak

Engine PR: #248
Extension PR: babelfish-for-postgresql/babelfish_extensions#1957

Task: 1940
Signed-off-by: Tanzeel Khan <tzlkhan@amazon.com>
@Deepesh125 Deepesh125 merged commit 62546c5 into babelfish-for-postgresql:BABEL_3_X_DEV Nov 10, 2023
28 checks passed
tanscorpio7 added a commit to tanscorpio7/postgresql_modified_for_babelfish that referenced this pull request Nov 10, 2023
Sairakan pushed a commit to amazon-aurora/postgresql_modified_for_babelfish that referenced this pull request Nov 16, 2023
Sairakan pushed a commit to amazon-aurora/postgresql_modified_for_babelfish that referenced this pull request Nov 17, 2023
deepakshi-mittal pushed a commit to amazon-aurora/postgresql_modified_for_babelfish that referenced this pull request Nov 22, 2023
priyansx pushed a commit to amazon-aurora/postgresql_modified_for_babelfish that referenced this pull request Nov 22, 2023
Comment on lines +49 to +54
CREATE CAST (sys.BBF_BINARY AS sys.BBF_VARBINARY)
WITHOUT FUNCTION AS IMPLICIT;

CREATE CAST (sys.BBF_VARBINARY AS sys.BBF_BINARY)
WITHOUT FUNCTION AS IMPLICIT;

No upgrade script changes for these?

@tanscorpio7 tanscorpio7 deleted the BABEL_1940 branch December 12, 2023 08:28
staticlibs pushed a commit to wiltondb/postgresql_modified_for_babelfish that referenced this pull request Mar 15, 2024
staticlibs pushed a commit to wiltondb/babelfish_extensions that referenced this pull request Mar 15, 2024
…sh-for-postgresql#1957)

Binary data types simply store bytes, but the encoding matters when a value is cast to or from a string data type. For example, the symbol '™' is stored as 0xE284A2 in UTF-8 but as 0x99 in WIN1252. From the user's perspective, the hex value they see must match their server encoding.

To fix this, we apply the necessary encoding conversion to the source data in the internal varchar <--> varbinary functions. Length checks are done once the data (hex string) is in the server encoding.

We must also handle string literals cast to binary types, so we explicitly call the varcharvarbinary conversion when a string literal is cast to a binary data type.

Also added casts from sys.BBF_VARBINARY to sys.BBF_BINARY and vice versa. These casts are used in the cast functions of the geography and geometry data types, which were updated accordingly,
i.e. CAST (CAST ($1 AS sys.VARCHAR) AS sys.bbf_varbinary) --> CAST ($1 AS sys.bbf_varbinary)

Engine PR: babelfish-for-postgresql/postgresql_modified_for_babelfish#248
Extension PR: babelfish-for-postgresql#1957

Task: BABEL-1940
Signed-off-by: Tanzeel Khan <tzlkhan@amazon.com>
Co-authored-by: Rohit Bhagat <rohitbgt@amazon.com>
5 participants