allow %d and %m to take either 1 or 2 characters, e.g. 03/04/199… (#160)
* First shot, but still no solution to #123

The problem seems to be that dates like 1/5/1999 are not read with
format "%m/%d/%Y", because 01/05/1999 is expected. I think that a
strict reading of ?strptime supports this, but base R behaves
differently and accepts single-digit days and months. So I think
the preferred behavior is to accept them.

This PR is messy and full of print statements. I believe that the
line marked with "need to set back the parser iterator/pointer one
character here?" should be replaced with setting the parser pointer
back one step, as we have a failed read here which does NOT raise
an error. The error then happens in the next step, once the "/" (or
other separator) has been absorbed, I suspect. Happy to revisit this
when I know how the parser works.

* allow %d and %m to be 1 or 2 characters long; fix #123

* tidy

* add NEWS entry

Fixes #123
Fixes #170
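
The idea described in the message above (read one digit, try a second, and leave the separator unread if the second read fails) can be sketched outside of vroom. This is a minimal stand-alone illustration, not the code in `src/DateTimeParser.h`: the names `consumeOneOrTwoDigits`, `consumeChar`, `parseMonthDayYear`, and `demo` are hypothetical, the format is hard-coded to "%m/%d/%Y" with a four-digit year, and no range checks on month or day are done. Unlike the patch, it peeks at the next character before consuming it, so no pointer rewind is needed.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper (not part of vroom): reads one or two decimal digits
// starting at `it`. Only the digits actually used are consumed, so a
// following separator such as '/' is left for the caller.
bool consumeOneOrTwoDigits(const char*& it, const char* end, int* out) {
  if (it == end || !std::isdigit(static_cast<unsigned char>(*it)))
    return false;  // no digit at all: fail without moving the pointer
  int value = *it++ - '0';
  if (it != end && std::isdigit(static_cast<unsigned char>(*it)))
    value = 10 * value + (*it++ - '0');  // optional second digit
  *out = value;
  return true;
}

// Hypothetical helper: consumes exactly the character `c`, or fails in place.
bool consumeChar(const char*& it, const char* end, char c) {
  if (it == end || *it != c)
    return false;
  ++it;
  return true;
}

// Illustrative parser for "%m/%d/%Y" that accepts 1/5/1999 and 01/05/1999.
// Month and day are returned 1-based; values are not range-checked.
bool parseMonthDayYear(const std::string& s, int* mon, int* day, int* year) {
  const char* it = s.data();
  const char* end = it + s.size();
  if (!consumeOneOrTwoDigits(it, end, mon)) return false;
  if (!consumeChar(it, end, '/')) return false;
  if (!consumeOneOrTwoDigits(it, end, day)) return false;
  if (!consumeChar(it, end, '/')) return false;
  int y = 0;
  for (int i = 0; i < 4; ++i) {  // year: exactly four digits
    if (it == end || !std::isdigit(static_cast<unsigned char>(*it)))
      return false;
    y = 10 * y + (*it++ - '0');
  }
  *year = y;
  return it == end;  // reject trailing characters
}

// Usage sketch: both the padded and the unpadded form give the same result.
void demo() {
  int m = 0, d = 0, y = 0;
  assert(parseMonthDayYear("1/5/1999", &m, &d, &y));
  assert(m == 1 && d == 5 && y == 1999);
  assert(parseMonthDayYear("01/05/1999", &m, &d, &y));
  assert(m == 1 && d == 5 && y == 1999);
}
```

Peeking before consuming sidesteps the question raised in the message about when the pointer has to be set back; whether that fits the real parser depends on how its `consumeInteger` treats a failed read.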
edzer authored and jimhester committed Sep 25, 2019
1 parent d6918c2 commit 9625ca7
Showing 2 changed files with 22 additions and 2 deletions.
2 changes: 2 additions & 0 deletions NEWS.md
@@ -1,5 +1,7 @@
# vroom (development version)

+* `col_date` now parses single digit month and day (@edzer, #123, #170)
+
* `vroom_fwf()` now handles files with dos newlines properly.

* Added benchmarks with _wide_ data for both numeric and character data (#87, @R3myG)
22 changes: 20 additions & 2 deletions src/DateTimeParser.h
@@ -214,7 +214,7 @@ class DateTimeParser {
year_ += (year_ < 69) ? 2000 : 1900;
break;
case 'm': // month
-if (!consumeInteger1(2, &mon_, false))
+if (!consumeInteger1length1_or_2(2, &mon_, false))
return false;
break;
case 'b': // abbreviated month name
@@ -226,7 +226,7 @@ class DateTimeParser {
return false;
break;
case 'd': // day
-if (!consumeInteger1(2, &day_, false))
+if (!consumeInteger1length1_or_2(2, &day_, false))
return false;
break;
case 'a': // abbreviated day of week
@@ -434,6 +434,24 @@ class DateTimeParser {
return true;
}

+  // Integer indexed from 1 (i.e. month and date) which can take 1 or 2 positions
+  inline bool consumeInteger1length1_or_2(int n, int* pOut, bool exact = true) {
+    int out1, out2;
+    if (!consumeInteger(1, &out1, true))
+      return false;
+    else {
+      if (consumeInteger(1, &out2, true))
+        *pOut = 10 * out1 + out2;
+      else {
+        *pOut = out1;
+        dateItr_--; // unconsume the last read non-integer char
+      }
+    }
+
+    (*pOut)--;
+    return true;
+  }
+
// Integer indexed from 1 with optional space
inline bool consumeInteger1WithSpace(int n, int* pOut) {
if (consumeThisChar(' '))