Enclosure should be remove when stripping BOM sequence #102

Closed
onema opened this Issue Jun 4, 2015 · 6 comments

Comments

Projects
None yet
2 participants
@onema

onema commented Jun 4, 2015

I have the following code:

$csv = '"Paret name","Child name","Title"
"parent 1","child 1","title"
"parent 2","child 2","title"
';
$csvReader = Reader::createFromString($csv);
$assoc = $csvReader->fetchAssoc();

And the associative array returned by fetchAssoc looks like this:

Array (
    [0] => Array (
            ["Paret name"] => parent 1
            [Child name] => child 1
            [Title] => title
        )
    [1] => Array (
            ["Paret name"] => parent 2
            [Child name] => child 2
            [Title] => title
        )
)

Where parent name contains the quotes ". At fist, I thought there was something wrong with the fetchAssoc method but after close inspection I realized that the file I was getting the CSV from was UTF-8 with BOM. Then I try to set stripBom to true, but it didn't work as I was still getting the same output.

$csv = '"Paret name","Child name","Title"
"parent 1","child 1","title"
"parent 2","child 2","title"
';
$csvReader = Reader::createFromString($csv);
$csvReader->stripBom(true);
$assoc = $csvReader->fetchAssoc();

The only way for me to get it to work was to manually strip the BOM before creating the reader:

$csv = preg_replace('/\x{FEFF}/u', '', $csv);
  • Am I using the reader correctly?
  • Should I expect the reader to automatically remove the BOM for me if stripBom is set to true?

@nyamsprod nyamsprod added the bug label Jun 4, 2015

@nyamsprod nyamsprod changed the title from Strip BOM doens't seem to work properly to Reader::fetchAssoc remove the BOM sequence too late Jun 4, 2015

@nyamsprod nyamsprod changed the title from Reader::fetchAssoc remove the BOM sequence too late to Enclosure should be remove when stripping BOM sequence Jun 4, 2015

nyamsprod added a commit that referenced this issue Jun 4, 2015

Resolve issue #102
The enclosure character should also be stripped when
The BOM character is removed to properly output CSV content
@nyamsprod

This comment has been minimized.

Show comment
Hide comment
@nyamsprod

nyamsprod Jun 4, 2015

Member

@onema thanks for reporting the bug. A patch was made available on the master branch. After review and testing a patch release will be made next week. In the meantime you can test that the dev-master branch fix works for you

Member

nyamsprod commented Jun 4, 2015

@onema thanks for reporting the bug. A patch was made available on the master branch. After review and testing a patch release will be made next week. In the meantime you can test that the dev-master branch fix works for you

nyamsprod added a commit that referenced this issue Jun 4, 2015

Add missing test
Test regarding #102 issue added
@onema

This comment has been minimized.

Show comment
Hide comment
@onema

onema Jun 4, 2015

Awesome! will update and be testing it today.

onema commented Jun 4, 2015

Awesome! will update and be testing it today.

@onema

This comment has been minimized.

Show comment
Hide comment
@onema

onema Jun 4, 2015

I just tested the master branch and fetch* methods work as long as I call stripBom(true) before each call. It threw me off a bit, but looking at the code it seems this is the intended behavior; I didn't see anything related to this in the docs. Is this the intended behavior?

$csv = '"Paret name","Child name","Title"
"parent 1","child 1","title"
"parent 2","child 2","title"
';
$csvReader = Reader::createFromString($csv);
$csvReader->stripBom(true);
$assoc = $csvReader->fetchAssoc();
$all = $csvReader->fetchAll();

$assoc

Array (
    [0] => Array (
            [Paret name] => parent 1
            [Child name] => child 1
            [Title] => title
        )
    [1] => Array (
            [Paret name] => parent 2
            [Child name] => child 2
            [Title] => title
        )
)

$all

Array (
    [0] => Array (
            [0] => "Paret name"
            [1] => Child name
            [2] => Title
        )
    [1] => Array (
            [0] => parent 1
            [1] => child 1
            [2] => title
        )
    ...
)

onema commented Jun 4, 2015

I just tested the master branch and fetch* methods work as long as I call stripBom(true) before each call. It threw me off a bit, but looking at the code it seems this is the intended behavior; I didn't see anything related to this in the docs. Is this the intended behavior?

$csv = '"Paret name","Child name","Title"
"parent 1","child 1","title"
"parent 2","child 2","title"
';
$csvReader = Reader::createFromString($csv);
$csvReader->stripBom(true);
$assoc = $csvReader->fetchAssoc();
$all = $csvReader->fetchAll();

$assoc

Array (
    [0] => Array (
            [Paret name] => parent 1
            [Child name] => child 1
            [Title] => title
        )
    [1] => Array (
            [Paret name] => parent 2
            [Child name] => child 2
            [Title] => title
        )
)

$all

Array (
    [0] => Array (
            [0] => "Paret name"
            [1] => Child name
            [2] => Title
        )
    [1] => Array (
            [0] => parent 1
            [1] => child 1
            [2] => title
        )
    ...
)
@nyamsprod

This comment has been minimized.

Show comment
Hide comment
@nyamsprod

nyamsprod Jun 5, 2015

Member

Quoting the docs you are referring to:

After an extract/conversion method call, all query options are cleared;

This includes the stripBom method which is nothing more than a query option.

I know it may seems strange but this is a trade-off I choose when implementing the solution. Stripping BOM sequence is only useful when the first cell of the first row is expected in the result set. In all others situations you don't want to add another layer of Iterator when it is not needed.

Member

nyamsprod commented Jun 5, 2015

Quoting the docs you are referring to:

After an extract/conversion method call, all query options are cleared;

This includes the stripBom method which is nothing more than a query option.

I know it may seems strange but this is a trade-off I choose when implementing the solution. Stripping BOM sequence is only useful when the first cell of the first row is expected in the result set. In all others situations you don't want to add another layer of Iterator when it is not needed.

@onema

This comment has been minimized.

Show comment
Hide comment
@onema

onema Jun 5, 2015

👍 thank you for your work on this.

onema commented Jun 5, 2015

👍 thank you for your work on this.

@nyamsprod

This comment has been minimized.

Show comment
Hide comment
@nyamsprod

nyamsprod Jun 10, 2015

Member

The patch release 7.1.2 is released with the bug fix

Member

nyamsprod commented Jun 10, 2015

The patch release 7.1.2 is released with the bug fix

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment