Bad column type guessing on lat/long data points #316

Closed
stephenturner opened this Issue Nov 16, 2015 · 1 comment


stephenturner commented Nov 16, 2015

I've got some data with populations, latitude, longitude in this kind of format:

pop_name  geo_region  maxlon  minlon  maxlat  minlat
Ami       EastAsia    121E    121.5E  24N     22.5N
Hakka     EastAsia    105E    122E    35N     22N
Biaka     Africa      15E     20E     5N      2N
Mbuti     Africa      26E     30E     3N      0N

Here's the gist:
https://gist.github.com/stephenturner/faf8d38acd38fdbd6197

When I try to import that with readr:

library(dplyr)
library(readr)
read_tsv("https://gist.githubusercontent.com/stephenturner/faf8d38acd38fdbd6197/raw/e0b6e395bf0315f32fbb1b8415e186c82754f58b/readr_bad_coltype_inference.txt")

Here's what I get back:

Source: local data frame [4 x 6]

  pop_name geo_region maxlon minlon maxlat minlat
     (chr)      (chr)  (dbl)  (dbl)  (dbl)  (chr)
1      Ami   EastAsia    121  121.5     24  22.5N
2    Hakka   EastAsia    105  122.0     35    22N
3    Biaka     Africa     15   20.0      5     2N
4    Mbuti     Africa     26   30.0      3     0N

It looks like readr is stripping the trailing N/E/S/W from maxlon, minlon, and maxlat and converting them to double, while minlat is left as character.
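Until the guessing is fixed, one possible workaround is to bypass it: read every column as character and convert the coordinates explicitly. The sketch below is only an illustration. It uses an inline copy of the four rows above rather than the gist URL, and `parse_coord` is a hypothetical helper (not part of readr) that assumes each value is a number followed by a single N, E, S, or W, with S and W mapped to negative values.

```r
library(readr)

# Inline, tab-separated copy of the sample rows from this issue.
# The trailing "" makes the string end with a newline.
tsv <- paste(
  "pop_name\tgeo_region\tmaxlon\tminlon\tmaxlat\tminlat",
  "Ami\tEastAsia\t121E\t121.5E\t24N\t22.5N",
  "Hakka\tEastAsia\t105E\t122E\t35N\t22N",
  "Biaka\tAfrica\t15E\t20E\t5N\t2N",
  "Mbuti\tAfrica\t26E\t30E\t3N\t0N",
  "",
  sep = "\n"
)

# "cccccc" requests character for all six columns, so no type guessing happens.
raw <- read_tsv(tsv, col_types = "cccccc")

# Hypothetical helper: "22.5N" -> 22.5, "15W" -> -15.
parse_coord <- function(x) {
  value <- as.numeric(sub("[NESW]$", "", x))  # drop the hemisphere letter
  sign <- ifelse(grepl("[SW]$", x), -1, 1)    # south and west are negative
  value * sign
}

coords <- c("maxlon", "minlon", "maxlat", "minlat")
raw[coords] <- lapply(raw[coords], parse_coord)
```

Doing the conversion by hand keeps the hemisphere information as a sign instead of silently dropping it.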

Edit: sessionInfo, using readr 0.2.2

R version 3.2.2 (2015-08-14)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.5 (Yosemite)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] readr_0.2.2 dplyr_0.4.3

loaded via a namespace (and not attached):
[1] magrittr_1.5      R6_2.1.1          assertthat_0.1    rsconnect_0.4.1.9
[5] parallel_3.2.2    DBI_0.3.1         tools_3.2.2       Rcpp_0.12.2      
Member

hadley commented Jun 1, 2016

Minimal reprex:

collector_guess(c("13T","13T","10N"))
#> [1] "number"

Probably an off-by-one error somewhere :(
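For comparison, the behaviour one would expect here is that a vector is only guessed as numeric when every value is a number in its entirety. A minimal base-R sketch of that check follows; `looks_numeric` is a hypothetical helper written for illustration and says nothing about how readr implements guessing internally.

```r
# Hypothetical check: TRUE only if every element is, from start to end,
# an optionally signed decimal number (with optional exponent).
looks_numeric <- function(x) {
  pattern <- "^[-+]?([0-9]+\\.?[0-9]*|\\.[0-9]+)([eE][-+]?[0-9]+)?$"
  all(grepl(pattern, x))
}

looks_numeric(c("13T", "13T", "10N"))  # trailing letters, so not numeric
looks_numeric(c("13", "10.5"))         # every value is a complete number
```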

@jimhester jimhester self-assigned this Jun 7, 2016

@jimhester jimhester added in progress and removed ready labels Jun 7, 2016

jimhester added a commit to jimhester/readr that referenced this issue Jun 7, 2016

jimhester added a commit to jimhester/readr that referenced this issue Jun 7, 2016

jimhester added a commit to jimhester/readr that referenced this issue Jun 7, 2016

@hadley hadley closed this in #419 Jun 7, 2016

hadley added a commit that referenced this issue Jun 7, 2016

@hadley hadley removed the in progress label Jun 7, 2016

@lock lock bot locked and limited conversation to collaborators Sep 25, 2018
