Translation of large text BLOB between UNICODE_FSS (UTF8) and other charsets [CORE2122] #976

Closed
firebird-issue-importer opened this issue Oct 14, 2008 · 18 comments

@firebird-issue-importer

Submitted by: @ibprovider

Attachments:
filters_1_63_dirty_patch.txt

I ran some tests to check the translation of BLOBs between UTF8 and other charsets.

With small BLOBs these tests work fine.

With large BLOBs I get the error "Cannot transliterate character between character sets".

For example:
- Meta: BLOB UNICODE_FSS
- Insert [connection ctype: UNICODE_FSS] large string with 1048576 UTF8 chars from CP943C charset
- Select [connection ctype: CP943C]: "Cannot transliterate character between character sets"

With 1024 chars there is no problem on select.

----------
- Meta: BLOB UNICODE_FSS
- Insert [connection ctype: CP943C] large string with 32767 CP943C chars: Cannot transliterate character between character sets

with 1024 chars - insert is OK.

----------
I also ran tests for BIG_5, TIS620 and WIN1251 and got a similar problem.
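
A failure that appears only above a certain size is often a sign that data is converted in fixed-size pieces and a multi-byte character gets cut at a piece boundary. The thread does not confirm that this is the cause here, so the following is only a sketch of that general mechanism in Python, not Firebird code. The helper names and the 1024-byte chunk size are hypothetical.

```python
import codecs


def decode_chunks_naively(data: bytes, chunk_size: int, encoding: str = "utf-8") -> str:
    """Decode every fixed-size chunk on its own (fails if a character is split)."""
    return "".join(
        data[i:i + chunk_size].decode(encoding)
        for i in range(0, len(data), chunk_size)
    )


def decode_chunks_incrementally(data: bytes, chunk_size: int, encoding: str = "utf-8") -> str:
    """Decode chunk by chunk, carrying incomplete trailing bytes into the next chunk."""
    decoder = codecs.getincrementaldecoder(encoding)()
    parts = [
        decoder.decode(data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    ]
    # Flush the decoder; raises if the input ends in the middle of a character.
    parts.append(decoder.decode(b"", final=True))
    return "".join(parts)


# Hypothetical test data: 2000 copies of a character that takes 3 bytes in UTF-8.
text = "\u3042" * 2000
payload = text.encode("utf-8")  # 6000 bytes

# A small input fits into a single chunk, so the naive approach works.
small_ok = decode_chunks_naively(payload[:300], 1024) == text[:100]

# A large input is split at byte 1024, which is in the middle of a character.
try:
    decode_chunks_naively(payload, 1024)
    naive_failed = False
except UnicodeDecodeError:
    naive_failed = True

# The incremental decoder keeps the partial character and succeeds.
incremental_ok = decode_chunks_incrementally(payload, 1024) == text
```

The point of the sketch is only that small inputs can pass while large ones fail with exactly the same data and charsets, which matches the symptom reported above.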

Banzay

Commits: e1cb23f acb1151 99246d8

@firebird-issue-importer firebird-issue-importer commented Oct 14, 2008

Modified by: @dyemanov

assignee: Adriano dos Santos Fernandes [ asfernandes ]

@firebird-issue-importer firebird-issue-importer commented Oct 15, 2008

Commented by: @ibprovider

If needed, I can send the private tests (for Windows 32/64) that demonstrate these problems.

@firebird-issue-importer firebird-issue-importer commented Oct 17, 2008

Commented by: @asfernandes

> Insert [connection ctype: UNICODE_FSS] large string with 1048576 UTF8 chars from CP943C charset

What do you mean? If your blob is created as UNICODE_FSS but you put CP943 bytes into it, you will obviously have problems.

If that is not the case, please send the test case.
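
To illustrate the concern about putting bytes of one charset into a column declared with another, here is a minimal Python sketch. It makes two assumptions: Python's `shift_jis` codec stands in for CP943C (Python has no CP943C codec), and a plain UTF-8 validity check stands in for the engine's UNICODE_FSS check. The helper name is hypothetical.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


text = "\u65e5\u672c\u8a9e"  # three Japanese characters

# Assumption: shift_jis is used here only as a stand-in for CP943C.
sjis_bytes = text.encode("shift_jis")
utf8_bytes = text.encode("utf-8")

# The same text gives different byte sequences in the two charsets.
same_bytes = sjis_bytes == utf8_bytes

# Shift-JIS bytes stored as if they were UTF-8 are rejected.
sjis_passes_as_utf8 = is_valid_utf8(sjis_bytes)

# Properly converted bytes are accepted.
utf8_passes = is_valid_utf8(utf8_bytes)
```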

@firebird-issue-importer firebird-issue-importer commented Oct 17, 2008

Modified by: @asfernandes

status: Open [ 1 ] => Resolved [ 5 ]

resolution: Fixed [ 1 ]

Fix Version: 2.5 Beta 1 [ 10251 ]

@firebird-issue-importer firebird-issue-importer commented Oct 18, 2008

Commented by: @ibprovider

Hi

The problem still occurs for the single-byte ICU charset TIS620.

Sample test:
blob.002.unicode.TBL_CS__TIS620.COL_BLOB.ins_UNICODE_FSS.sel_TIS620.len_32767.chars_TIS620.bind__wstr

And, after the correction, it occurs for all lengths with the multi-byte ICU charset CP943C.

Sample tests:
blob.002.unicode.TBL_CS__CP943C.COL_BLOB.ins_CP943C.sel_CP943C.len_*.chars_CP943C.bind__wstr

Of course, these may be other problems, and they can be resolved in separate changes.

See also our old BUG-1596 :-)

Thanks

@firebird-issue-importer firebird-issue-importer commented Oct 18, 2008

Commented by: @asfernandes

Does your test run in a loop, or does it have too many blob.002* tests?

I tried to run blob.002* and it never ends...

@firebird-issue-importer firebird-issue-importer commented Oct 18, 2008

Commented by: @ibprovider

I have a great workstation + patience :-)

@firebird-issue-importer firebird-issue-importer commented Oct 20, 2008

Commented by: @dyemanov

Re-opened upon request of the bug reporter. He insists the problem still exists.

@firebird-issue-importer firebird-issue-importer commented Oct 20, 2008

Modified by: @dyemanov

status: Resolved [ 5 ] => Reopened [ 4 ]

resolution: Fixed [ 1 ] =>

@firebird-issue-importer firebird-issue-importer commented Oct 20, 2008

Commented by: @asfernandes

Then I expect from Mr. Kovalenko the sources for his test, as well as a way to compile and debug it.

I can do nothing by looking in the debugger at junk bytes that the engine says are bad input!!!

@firebird-issue-importer firebird-issue-importer commented Oct 22, 2008

Commented by: @asfernandes

The real problem is the following: the test case generates bytes and converts them to UTF-8 using ICU (please correct me if I'm wrong, Dmitry K.). But the generated UTF-8 bytes are not valid UNICODE_FSS. Currently, the well-formedness check for UNICODE_FSS is done the same way as for UTF-8, so the string passes a stage that it shouldn't. Later, when converting from (wrong) UNICODE_FSS to TIS620, a transliteration error is raised.

So what really needs to be fixed is the UNICODE_FSS well-formedness check, and then I will ask Dmitry to correct his tests. :-)

This is at least the case for blob.002.unicode.TBL_CS__TIS620.COL_BLOB.ins_UNICODE_FSS.sel_TIS620.len_32767.chars_TIS620.bind__wstr. I haven't verified the others yet.
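
The idea of a charset-specific well-formedness check can be sketched as follows. This is Python, not the engine's code, and it rests on one assumption: that UNICODE_FSS allows at most 3 bytes per character while UTF-8 allows 4. The thread does not say which rule the test data actually violated, and the function name and `max_char_bytes` parameter are hypothetical.

```python
def is_well_formed(data: bytes, max_char_bytes: int = 3) -> bool:
    """Check UTF-8 style encoding, rejecting sequences longer than max_char_bytes.

    Assumption: max_char_bytes=3 models UNICODE_FSS, max_char_bytes=4 models UTF-8.
    """
    i = 0
    n = len(data)
    while i < n:
        lead = data[i]
        # Determine sequence length, initial code point bits and the
        # smallest code point allowed for that length (rejects overlong forms).
        if lead < 0x80:
            length, cp, min_cp = 1, lead, 0x0
        elif 0xC2 <= lead <= 0xDF:
            length, cp, min_cp = 2, lead & 0x1F, 0x80
        elif 0xE0 <= lead <= 0xEF:
            length, cp, min_cp = 3, lead & 0x0F, 0x800
        elif 0xF0 <= lead <= 0xF4:
            length, cp, min_cp = 4, lead & 0x07, 0x10000
        else:
            return False  # stray continuation byte or invalid lead byte

        if length > max_char_bytes or i + length > n:
            return False  # too long for this charset, or truncated input

        for j in range(i + 1, i + length):
            cont = data[j]
            if (cont & 0xC0) != 0x80:
                return False  # not a continuation byte
            cp = (cp << 6) | (cont & 0x3F)

        if cp < min_cp or 0xD800 <= cp <= 0xDFFF or cp > 0x10FFFF:
            return False  # overlong form, surrogate, or out of range

        i += length
    return True


thai = "\u0e01".encode("utf-8")       # a Thai letter, 3 bytes in UTF-8
emoji = "\U0001F600".encode("utf-8")  # a character outside the BMP, 4 bytes

thai_ok = is_well_formed(thai)                              # True
emoji_ok_as_fss = is_well_formed(emoji)                     # False under the 3-byte assumption
emoji_ok_as_utf8 = is_well_formed(emoji, max_char_bytes=4)  # True
```

Under that assumption, a string can pass a UTF-8 check and still be rejected by the stricter check, which is the kind of gap described above.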

@firebird-issue-importer firebird-issue-importer commented Oct 22, 2008

Commented by: @ibprovider

See the attached file.

But I continue to get the old (and new) errors when selecting from TBL_CS__CP943C as UNICODE_FSS.

I think this problem is linked with CORE2123.

@firebird-issue-importer firebird-issue-importer commented Oct 22, 2008

Modified by: @ibprovider

Attachment: filters_1_63_dirty_patch.txt [ 11110 ]

@firebird-issue-importer firebird-issue-importer commented Oct 23, 2008

Commented by: @ibprovider

>and then ask for Dmitry correct its tests. :-)
No problem, Adriano.

I have improved my tests. But now I get new, similar errors for all FB charsets :-(

Except ASCII, of course.

[ FB 2.1.1 without filters__dirty_patch ]

@firebird-issue-importer firebird-issue-importer commented Oct 27, 2008

Modified by: @asfernandes

status: Reopened [ 4 ] => Resolved [ 5 ]

resolution: Fixed [ 1 ]

@firebird-issue-importer firebird-issue-importer commented Jul 7, 2010

Modified by: @dyemanov

Fix Version: 2.1.4 [ 10361 ]

@firebird-issue-importer firebird-issue-importer commented Feb 4, 2011

Modified by: @pcisar

status: Resolved [ 5 ] => Closed [ 6 ]

@firebird-issue-importer firebird-issue-importer commented Jan 19, 2016

Modified by: @pavel-zotov

QA Status: No test
