Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix skip_empty_rows('skip') and trailing newlines #65

Merged
merged 3 commits into from Dec 2, 2023

Conversation

bugfood
Copy link
Contributor

@bugfood bugfood commented Nov 18, 2023

When reading line by line, Text::CSV returns an empty row if the input file contains an empty line, even if skip_empty_rows('skip') is in use.

This happens due to the following code (greatly abridged here):

$seenSomething = 0;
LOOP:
    while($self->__get_from_src($ctx, $src)) {
        $seenSomething = 1;
        # (This conditional happens when EOL is reached and
        # skip_empty_row is set).
        if (!defined $c or $ser == 2) {  # EOF
            $waitingForField = 0;
            last LOOP;
        }
    }

if ($waitingForField) {
    if ($seenSomething or !$ctx->{useIO}) {
        return 1;
    }
    $self->SetDiag(2012);
    return;
}

The call to SetDiag is necessary to set the EOF condition (2012 is EOF). Without this, __parse() returns a truthy value, so getline() returns an empty array.

In order to make the SetDiag call happen, we need to:

  • No longer unset $waitingForField (after skipping an empty row, we're still waiting).
  • Unset $seenSomething (we did see something, but we skipped it).

This fixes:
#64

@bugfood
Copy link
Contributor Author

bugfood commented Nov 18, 2023

I split this up into two commits.

  • The first commit fixes the bug I ran into, and includes tests.
  • The second commit fixes what I think are other cases of the same bug, but I didn't see where to test them.

bugfood added a commit to bugfood/Text-CSV_XS that referenced this pull request Nov 18, 2023
When reading line by line, Text::CSV_XS returns an empty row if the
input file contains an empty line, even if skip_empty_rows('skip') is in
use.

The mechanism for this is substantially the same as is decribed in the
corresponding PR for Text::CSV:
makamaka/Text-CSV#65
bugfood added a commit to bugfood/Text-CSV_XS that referenced this pull request Nov 18, 2023
When reading line by line, Text::CSV_XS returns an empty row if the
input file contains an empty line, even if skip_empty_rows('skip') is in
use.

The mechanism for this is substantially the same as is decribed in the
corresponding PR for Text::CSV:
makamaka/Text-CSV#65

This fixes:
Tux#51
1.53 is necessary in order to support upcoming tests for skipping empty
rows. This version includes:

Tux/Text-CSV_XS#52
@bugfood
Copy link
Contributor Author

bugfood commented Nov 21, 2023

I can see that CI tests failed because of running the tests against Text::CSV_XS 1.51.

Text::CSV_XS 1.53 should work, but that is apparently not yet released, so I don't think this change can be merged until Text::CSV_XS 1.53 is released.

I updated this PR to:

  1. Use Text::CSV_XS 1.53 instead of 1.51.
  2. Modify whitespace and use test descriptions according to Text::CSV_XS.

Thanks,
Corey

@bugfood
Copy link
Contributor Author

bugfood commented Nov 22, 2023

The build failures are expected until Text::CSV_XS 1.53 is released.

When reading line by line, Text::CSV returns an empty row if the input
file contains an empty line, even if skip_empty_rows('skip') is in use.

This happens due to the following code (greatly abridged here):

    $seenSomething = 0;
    LOOP:
        while($self->__get_from_src($ctx, $src)) {
            $seenSomething = 1;
            # (This conditional happens when EOL is reached and
            # skip_empty_row is set).
            if (!defined $c or $ser == 2) {  # EOF
                $waitingForField = 0;
                last LOOP;
            }
        }

    if ($waitingForField) {
        if ($seenSomething or !$ctx->{useIO}) {
            return 1;
        }
        $self->SetDiag(2012);
        return;
    }

The call to SetDiag is necessary to set the EOF condition (2012 is EOF).
Without this, __parse() returns a truthy value, so getline() returns an
empty array.

In order to make the SetDiag call happen, we need to:
* No longer unset $waitingForField (after skipping an empty row, we're
  still waiting).
* Unset $seenSomething (we did see something, but we skipped it).

This fixes:
makamaka#64

The corresponding change is now in Text::CSV_XS from:
Tux/Text-CSV_XS#52

This change includes whitespace and test descriptions from:
Tux/Text-CSV_XS@731e78c
This follows up on commit:
fix skip_empty_rows('skip') and trailing newlines

...for a couple places where I think this is a bug as well, but I don't
know what circumstances follow those code paths.
@bugfood
Copy link
Contributor Author

bugfood commented Nov 23, 2023

I removed an extra line I had accidentally left in 67_emptrow.t.

@charsbar charsbar merged commit 07cd652 into makamaka:master Dec 2, 2023
2 of 8 checks passed
@charsbar
Copy link
Collaborator

charsbar commented Dec 2, 2023

Thanks.

@bugfood
Copy link
Contributor Author

bugfood commented Dec 2, 2023

Thank you too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants