charset #310

DiegoGuimaDev · 2019-04-18T17:59:15Z

I am trying to run a SELECT, but I'm receiving an error:
ERROR: invalid byte sequence for encoding "LATIN1": 0x00

ORACLE_CHARSET = WE8ISO8859P1
POSTGRES_CHARSET = LATIN1

The documentation lists them as equivalent. Is this a bug?

laurenz · 2019-04-19T09:41:59Z

You could say it is a shortcoming of PostgreSQL: it does not allow zero bytes in character data types, even though they are valid Unicode characters. The reason is that zero bytes are treated as end-of-string markers in C. Changing this would be too complicated, so it won't happen.
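A minimal sketch of how this restriction shows up on the PostgreSQL side, independent of Oracle. It assumes a psql session with `standard_conforming_strings` on (the default); the byte string is made up for illustration, and the exact error text may differ between versions:

```sql
-- The bytes 'a', 0x00, 'b' are fine as binary data (bytea) ...
SELECT '\x610062'::bytea;

-- ... but converting them to text should fail because of the zero byte,
-- with an "invalid byte sequence" error like the one reported above.
SELECT convert_from('\x610062'::bytea, 'LATIN1');

-- Building a zero byte directly is rejected as well.
SELECT chr(0);
```

The encoding is not the deciding factor here: the same zero byte should be rejected whether the database uses LATIN1 or UTF8.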

But what can you do to solve your problem?

Update the offending data at the source. Even though it is allowed, few people put zero bytes into Oracle strings by design, and usually nobody minds if they get stripped out:
```
UPDATE mytable
SET stringcol = replace(stringcol, chr(0))
WHERE stringcol LIKE '%' || chr(0) || '%';
```
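Before running the update, it may be worth checking how many rows are affected. A sketch in Oracle SQL, reusing the `mytable` and `stringcol` names from the statement above; the `affected_rows` alias is only illustrative:

```sql
-- Count the rows whose string column contains at least one zero byte.
SELECT count(*) AS affected_rows
FROM mytable
WHERE instr(stringcol, chr(0)) > 0;
```

If the count is zero for this column, the zero bytes are coming from a different column.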

Define the foreign table so that it does the same thing while reading from Oracle (the foreign table name `mytable` is a placeholder):
```
CREATE FOREIGN TABLE mytable (
   id integer NOT NULL,
   stringcol text
) SERVER oraserver
OPTIONS (table '(SELECT id, replace(stringcol, chr(0)) AS stringcol FROM mytable)');
```

Is there a good reason to use LATIN1 and not UTF8 in PostgreSQL?

laurenz · 2022-12-05T17:26:39Z

Just for the record, I'll mark this as a duplicate of #114.

laurenz added the problem label Apr 19, 2019

laurenz closed this as completed May 8, 2019

Repository owner locked as resolved and limited conversation to collaborators May 8, 2019

laurenz added the duplicate label Dec 5, 2022
