Bad column type guessing on lat/long data points #316

@stephenturner

I've got some data with populations, latitude, longitude in this kind of format:

pop_name  geo_region  maxlon  minlon  maxlat  minlat
Ami       EastAsia    121E    121.5E  24N     22.5N
Hakka     EastAsia    105E    122E    35N     22N
Biaka     Africa      15E     20E     5N      2N
Mbuti     Africa      26E     30E     3N      0N

Here's the gist:
https://gist.github.com/stephenturner/faf8d38acd38fdbd6197

When I try to import that with readr:

library(dplyr)
library(readr)
read_tsv("https://gist.githubusercontent.com/stephenturner/faf8d38acd38fdbd6197/raw/e0b6e395bf0315f32fbb1b8415e186c82754f58b/readr_bad_coltype_inference.txt")

Here's what I get back:

Source: local data frame [4 x 6]

  pop_name geo_region maxlon minlon maxlat minlat
     (chr)      (chr)  (dbl)  (dbl)  (dbl)  (chr)
1      Ami   EastAsia    121  121.5     24  22.5N
2    Hakka   EastAsia    105  122.0     35    22N
3    Biaka     Africa     15   20.0      5     2N
4    Mbuti     Africa     26   30.0      3     0N

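Looks like readr is stripping the N/E/S/W suffix from maxlon, minlon, and maxlat and converting those columns to double, while minlat is left as character with the suffix intact.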
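
A possible workaround until the guessing is fixed, sketched under some assumptions: pass an explicit column specification so readr does no guessing at all, then convert the coordinates yourself. The compact string "cccccc" asks for six character columns. The sample rows are copied from the data above and written to a temporary file so the example does not depend on the gist URL. parse_coord is a hypothetical helper, not part of readr, and it assumes every value ends in exactly one hemisphere letter.

```r
library(readr)

# Sample rows copied from the issue, written to a temporary tab-separated file.
rows <- c(
  "pop_name\tgeo_region\tmaxlon\tminlon\tmaxlat\tminlat",
  "Ami\tEastAsia\t121E\t121.5E\t24N\t22.5N",
  "Hakka\tEastAsia\t105E\t122E\t35N\t22N",
  "Biaka\tAfrica\t15E\t20E\t5N\t2N",
  "Mbuti\tAfrica\t26E\t30E\t3N\t0N"
)
path <- tempfile(fileext = ".tsv")
writeLines(rows, path)

# "cccccc" is a compact column specification: six character columns,
# so no type guessing takes place and the N/E/S/W suffixes survive.
pops <- read_tsv(path, col_types = "cccccc")
suffixes_kept <- pops$maxlon

# Hypothetical helper (not part of readr): "22.5N" becomes 22.5 and
# "15W" becomes -15. Assumes a single trailing hemisphere letter.
parse_coord <- function(x) {
  value <- as.numeric(sub("[NSEW]$", "", x))
  hemisphere <- substr(x, nchar(x), nchar(x))
  ifelse(hemisphere %in% c("S", "W"), -value, value)
}

# Convert the four coordinate columns to signed doubles.
coord_cols <- c("maxlon", "minlon", "maxlat", "minlat")
pops[coord_cols] <- lapply(pops[coord_cols], parse_coord)
```

With this approach the hemisphere is encoded in the sign (south and west negative) instead of being silently dropped, which matters as soon as the data contains S or W values.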

Edit: sessionInfo, using readr 0.2.2

R version 3.2.2 (2015-08-14)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.5 (Yosemite)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] readr_0.2.2 dplyr_0.4.3

loaded via a namespace (and not attached):
[1] magrittr_1.5      R6_2.1.1          assertthat_0.1    rsconnect_0.4.1.9
[5] parallel_3.2.2    DBI_0.3.1         tools_3.2.2       Rcpp_0.12.2      

Labels: bug (an unexpected problem or unintended behavior)
