Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

An error occurred translating string for a field to UTF-8 retrieving truncated unicode character #993

Open
maxiwheat opened this issue May 16, 2019 · 2 comments

Comments

@maxiwheat
Copy link

maxiwheat commented May 16, 2019

PHP Driver version or file name

5.6.1

SQL Server version

2014

Client operating system

Red Hat 7

PHP version

7.2.18

Microsoft ODBC Driver version

17

Table schema

Problem description

When I try to retrieve data stored in a column that contains a truncated 4 bytes UTF-8 (or USC-2) character, the driver generates an error: An error occurred translating string for a field to UTF-8: Error code 0x0 and the row cannot be fetched at all.

As I cannot always know in advance what the data would be in a column of a table, if it contains an invalid character, it prevents my code from getting the data.

Expected behavior and actual behavior

I expect the truncated string/chars to be return as is, replaced with a question mark or at least completely removed. This query runs okay, without any error in SSMS and the string appears truncated (the second character is replaced by a question mark like this �)

Repro code or steps to reproduce

    $connInfo = array(
        'Database' => 'MyDatabase',
        'UID' => 'MyUsername',
        'PWD' => 'MyPassword',
        'LoginTimeout' => 5,
        'CharacterSet' => 'UTF-8'
    );

    $conn = sqlsrv_connect('myserver.mydomain.com', $connInfo);
    
    // The christmas tree emoji is 4 bytes, if we store two of them in a NVARCHAR(3), which holds 6 bytes, the second christmas tree character is truncated
    $sql = "SET NOCOUNT ON;
            DECLARE @val NVARCHAR(3) = N'🎄🎄';
            CREATE TABLE #tmpTest (testCol NVARCHAR(3));
            INSERT INTO #tmpTest (testCol) VALUES (@val);
            SELECT * from #tmpTest;";

    $stmt = sqlsrv_query($conn, $sql);

    $row = sqlsrv_fetch_array($stmt, SQLSRV_FETCH_ASSOC);

    $errors = sqlsrv_errors(SQLSRV_ERR_ERRORS);
    // $errors contain "An error occurred translating string for a field to UTF-8: Error code 0x0"
    var_dump($row, $errors);
@maxiwheat maxiwheat changed the title An error occurred translating string for a field to UTF-8 retrieving truncated string An error occurred translating string for a field to UTF-8 retrieving truncated unicode character May 16, 2019
@david-puglielli
Copy link
Contributor

@maxiwheat Thank you for reporting this. I am able to reproduce it; we will keep this thread updated.

@yitam
Copy link
Contributor

yitam commented Jun 3, 2019

@maxiwheat this is by design. We have done some tests in our drivers -- insertion of invalid characters fails. The same happens with MySQL and PostgreSQL.

On the other hand, with fetching, we might consider the possibility of being more lenient. Since this changes the existing behavior, the default is the strict way but we might provide an option for ignoring errors such that fetching can continue, which means that we allow displaying bad data (unpredictable).

We will leave this issue open and see what the other users think.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants