[R] default TZ parsing woes in CSV reader #30632

I am trying to use `open_dataset()` on a large collection of CSV files in which a timestamp column sometimes holds a bare date (e.g. `2021-02-01`) and sometimes a timestamp with a timezone designator (e.g. `2021-02-01T00:00:00Z`).

readr reads both forms without complaint when the column type is set to datetime (`col_types = "T"`, see below), but `arrow::read_csv_arrow()` requires `tz = "UTC"` in the schema for the zone-suffixed values and rejects `tz = "UTC"` for the bare dates. This is easiest to see in a simple example:


```r
x <- tempfile()
df <- data.frame(time = '2021-02-01T00:00:00Z')
readr::write_csv(df, x)
# Timestamp column declared without a timezone
schema <- arrow::schema(time = arrow::timestamp("s", ""))

# ERROR: cannot parse without tz = "UTC" in the schema.
# skip = 1 skips the header row, since the schema supplies the column names.
arrow::read_csv_arrow(x, schema = schema, skip = 1)

df2 <- readr::read_csv(x, col_types = "T")  # works fine
```

```r
x <- tempfile()
df <- data.frame(time = '2021-02-01')
readr::write_csv(df, x)
# Timestamp column declared with tz = "UTC"
schema <- arrow::schema(time = arrow::timestamp("s", "UTC"))

# ERROR: cannot parse with tz = "UTC" in the schema.
arrow::read_csv_arrow(x, schema = schema, skip = 1)

# Once again, readr has no issues:
df2 <- readr::read_csv(x, col_types = "T")
```

**Reporter**: [Carl Boettiger](https://issues.apache.org/jira/browse/ARROW-15124) / @cboettig
**Watchers**: [Rok Mihevc](https://issues.apache.org/jira/browse/ARROW-15124) / @rok

<sub>**Note**: *This issue was originally created as [ARROW-15124](https://issues.apache.org/jira/browse/ARROW-15124). Please see the [migration documentation](https://github.com/apache/arrow/issues/14542) for further details.*</sub>