Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support of Oracle native VARCHAR2 as Pascal-like strings (C-like strings are lossy for some cases) #40

Closed
dbpm opened this issue May 10, 2016 · 4 comments
Assignees

Comments

@dbpm
Copy link

dbpm commented May 10, 2016

Oracle allows to store strings which contain 0x0 character in the middle, it could be shown in the sample query below:

SELECT value, LENGTH(VALUE), dump(VALUE) FROM (
  SELECT 'test' || chr(0) || 'me' VALUE FROM dual
)

ocilib is not able to handle such strings, part after 0x0 character is being lost, so it will be better to implement support of Oracle native OCI_Varchar2 complex datatype (with will contain data length and pointer to data), OCI_GetVarchar2(), OCI_BindVarchar2() in addition to otext, OCI_GetString(), OCI_BindString(). It need to be implemented as enhancement, not as replacement - the same string value might be accessed different ways.

Direct Path API also need to be extended to handle the OCI_Varchar2 type.

Initially the data loss issue was detected in code which replicates tables (ETL task) between production databases. The #0 symbol was initially introduced as data separator, CSV like, in third-party solution, so it's not possible to avoid such case.

@vrogier
Copy link
Owner

vrogier commented May 10, 2016

Hi,

I understand the issue.
Bujt you will get the same problem with all (at least 99%) Oracle OCI based libraries and even with the Java thin driver.
Even Sql*Plus cannot handle these situations, nor print these values.

If you were using the OCILIB C++ API, I would tell you to add you own template specialization for Statement::Bind() using a custom container
But apparently you are using the C API.

Maybe you could trick OCILIB by callingOCI_BindSetDataSize() with the "real size" beyond the null character.

I have to check if it could work. but even if it works for inserting, it will not work for selecting.

Instead of introducing a new type, i would rather try to find a way to give applications a way to indicate and retrieve the real length of a string bind/selected column row data.
If I can't find a good solution, I will think about adding a new type...

Best regards,

Vincent

@dbpm
Copy link
Author

dbpm commented May 10, 2016

Technically it could be something like RAW datatype, but existing API does not allow us to reuse OCI_GetRaw() call over varchars.

Though I don’t have objection to have OCI_GetVarchar2 as near 1:1 copy of OCI_GetRaw, the same could be true for binds, why not?

@vrogier
Copy link
Owner

vrogier commented May 11, 2016

Hi,

Regarding your issue:
-binding in and out is not a problem
-gathering the real lenght on select can be addressed with a new function OCI_GetDataSize

unsigned int OCI_API OCI_GetDataSize
(
    OCI_Resultset *rs,
    unsigned int   index
)
{
    OCI_Define *def = NULL;

    OCI_LIB_CALL_ENTER(unsigned int , 0)

    OCI_CHECK_PTR(OCI_IPC_RESULTSET, rs)
    OCI_CHECK_BOUND(rs->stmt->con, index, 1, rs->nb_defs)

    def = OCI_GetDefine(rs, index);

    if (def && OCI_DefineIsDataNotNull(def))
    {
        ub2* lens = (ub2 *)def->buf.lens;

        call_retval = lens[rs->row_cur - 1];

        if (OCI_CDT_TEXT == def->col.datatype)
        {
            call_retval /= sizeof(otext);
        }
    }

    OCI_LIB_CALL_EXIT()
}

Here is a complete sample code and its output:

int main()
{
    OCI_Connection *cn;
    OCI_Statement  *st;
    OCI_Resultset  *rs;
    otext buf[128] = "";

    if (!OCI_Initialize(err_handler, NULL, OCI_ENV_DEFAULT))
        return EXIT_FAILURE;

    cn = OCI_ConnectionCreate("db12c", "usr", "pwd", OCI_SESSION_DEFAULT);
    st = OCI_StatementCreate(cn);

    OCI_ExecuteStmt(st, "SELECT value, LENGTH(VALUE), dump(VALUE) FROM (SELECT 'test' || chr(0) || 'me' VALUE FROM dual)");
    rs = OCI_GetResultset(st);
    while (OCI_FetchNext(rs))
    {
        printf("%s (size=%d) | %s (size=%d) | %s (size=%d)\n",
            OCI_GetString(rs, 1), OCI_GetDataSize(rs, 1),
            OCI_GetString(rs, 2), OCI_GetDataSize(rs, 2),
            OCI_GetString(rs, 3), OCI_GetDataSize(rs, 3));
    }

    OCI_Prepare(st, "begin :1 := 'test' || chr(0) || 'me'; end;");
    OCI_BindString(st, ":1", buf, sizeof(buf));
    OCI_Execute(st);

    printf("%s (size=%d), buf+5=%s\n", buf, OCI_BindGetDataSize(OCI_GetBind(st, 1)), buf + strlen(buf) + 1);

    OCI_ExecuteStmt(st, "create table temp(value varchar2(50))");
    OCI_ExecuteStmt(st, "insert into temp values( 'test' || chr(0) || 'me')");
    OCI_ExecuteStmt(st, " select value from temp");
    rs = OCI_GetResultset(st);
    while (OCI_FetchNext(rs))
    {
        char *s = OCI_GetString(rs, 1);
        printf("%s (size=%d), buf+5=%s\n", s, OCI_GetDataSize(rs, 1), s + strlen(s) + 1);
    }

    OCI_ExecuteStmt(st, "drop table temp");

    OCI_Cleanup();

    return EXIT_SUCCESS;
}

Output:

test (size=7) | 7 (size=2) | Typ=1 Len=7: 116,101,115,116,0,109,101 (size=38)
test (size=7), buf+5=me
test (size=7), buf+5=me

Is that sounds okay for you ?

@dbpm
Copy link
Author

dbpm commented May 13, 2016

OCI_GetDataSize() - it's a brilliant solution, I much satisfied, thank you.
Separate OCI_Varchar2 datatype could be unreasonably complex solution, indeed.

So simple function call fully resolves our case. I also tested Direct Path API - it handles the 0x0 char in the middle of the string properly.

Please close the issue.

@dbpm dbpm closed this as completed May 13, 2016
@vrogier vrogier self-assigned this Jan 20, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants