read.argo() cannot trim leading whitespace #2206
Comments
Below is what I get for the first problematic file (attached). The problem is that
, , 1
[,1]
[1,] " "
[2,] " "
[3,] "No significant salinity drift detected; r=1.000000 "
[,2]
[1,] " "
[2,] " "
[3,] "COEFFICIENT r FOR CONDUCTIVITY IS 1.000347, +/- 0.0003095027 "
[,3]
[1,] "ADDITIVE COEFFICIENT FOR PRESSURE ADJUSTMENT IS 0db "
[2,] " "
[3,] "r=1.000401, \xb1 2.177525e-005 "
Oh, hang on. Maybe the problem is that
The 0xb1 character means plus-or-minus. But I don't really know what that last entry is. The I'll look into whether there is a way to make
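As a quick check of the plus-or-minus reading, the byte can be inspected in base R. This is a minimal sketch that assumes the byte is Latin-1 encoded, which the thread suggests but the file itself does not state.

```r
# Assumption: the stray byte is Latin-1, where 0xb1 is PLUS-MINUS SIGN.
a <- "\xb1"                                   # a single byte, 0xb1
validUTF8(a)                                  # FALSE: a lone 0xb1 is not valid UTF-8
b <- iconv(a, from = "latin1", to = "UTF-8")  # re-encode as UTF-8
utf8ToInt(b)                                  # 177, i.e. U+00B1
charToRaw(b)                                  # c2 b1, the two-byte UTF-8 form
```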
Maybe I should just do as follows, because then I can employ the
gsub("^[ \t\r\n](.*)[ \t\r\n]$", "\\1", value, useBytes=TRUE)
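A byte-wise variant of this idea might look like the sketch below. The pattern quoted above removes only one whitespace character from each end, and matches only when whitespace is present at both ends, so the sketch uses two anchored passes instead. The name `trimBytes` is hypothetical and is not an oce function.

```r
# Hypothetical helper (not part of oce): strip leading and trailing
# whitespace byte by byte, so bytes that are invalid in UTF-8 are
# passed through instead of triggering an error.
trimBytes <- function(x) {
    x <- gsub("^[ \t\r\n]+", "", x, useBytes = TRUE)  # leading whitespace
    gsub("[ \t\r\n]+$", "", x, useBytes = TRUE)       # trailing whitespace
}

a <- "  r=1.000401, \xb1 2.177525e-005   "
b <- trimBytes(a)
nchar(b, type = "bytes")  # 27: padding removed, the 0xb1 byte is kept
```

The trimmed string still contains the raw 0xb1 byte, so it is still not valid UTF-8, and later string functions may complain about it.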
At https://blog.r-project.org/2022/07/12/speedups-in-operations-with-regular-expressions/ I see that the advice is to avoid
Or, I can do as follows. This avoids the use of
> a <- "r=1.000401, \xb1 2.177525e-005"
> trimws(a)
Error in sub(re, "", x, perl = TRUE) : input string 1 is invalid UTF-8
> b<-iconv(a,from="latin1", to="UTF-8")
> b
[1] "r=1.000401, ± 2.177525e-005"
> trimws(b)
[1] "r=1.000401, ± 2.177525e-005"
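The same conversion can be wrapped so that it touches only the entries that are not already valid UTF-8. This is a sketch, assuming that any invalid entry is Latin-1 (true for the 0xb1 case here, but not verified for other floats). The name `trimLatin1` is hypothetical, not an oce function.

```r
# Hypothetical helper (not part of oce). Assumption: every entry that
# fails the UTF-8 validity check is Latin-1 encoded.
trimLatin1 <- function(x) {
    bad <- !validUTF8(x)                                    # entries trimws() would reject
    x[bad] <- iconv(x[bad], from = "latin1", to = "UTF-8")  # re-encode only those
    trimws(x)                                               # now safe on every entry
}

x <- c("  abc  ", " r=1.000401, \xb1 2.177525e-005 ")
out <- trimLatin1(x)
out
```

Entries that are already valid UTF-8 are left untouched before trimming, so ordinary ASCII calibration strings go through `trimws()` as before.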
Seems OK on this test case. Note that it is displaying the +- properly, but that's not my concern -- my concern is whether it fails. I'd be interested to know whether this is a one-off problem with this float, or whether other Argo Canada files have this property. In any case, I'm going to run my approx. 1500 argo files now, and if they seem OK (i.e. no new problems) I'll do more local checks and then push to GH.
> library(oce)
> f<-"/Users/kelley/data/argo/argo_summer_project/D4900883_003.nc"
> d<-read.oce(f)
> d@metadata$scientificCalibCoefficient
, , 1
[,1]
[1,] ""
[2,] ""
[3,] "No significant salinity drift detected; r=1.000000"
[,2]
[1,] ""
[2,] ""
[3,] "COEFFICIENT r FOR CONDUCTIVITY IS 1.000347, +/- 0.0003095027"
[,3]
[1,] "ADDITIVE COEFFICIENT FOR PRESSURE ADJUSTMENT IS 0db"
[2,] ""
[3,] "r=1.000401, ± 2.177525e-005"
All my local tests passed, and also the R-CMD action worked. I'll start the R-hub action now. That takes maybe 30 minutes to an hour, so I'll come back to this later to close it.
The r-hub completed quickly! Maybe that's because it failed on the macos and windows machines. But the failure is because those machines cannot build
I guess I'll think twice before wasting electricity on r-hub builds. Their within-R system got so flakey that I gave up on it. I was hoping this gh-action system would be better, but ... maybe not so much. Closing time.
I'm seeing this in some argo work. We get a warning, so I suppose that may be all we need, but my plan is to look more carefully at those files, to see why this happens, and whether there is a reasonable workaround or better warning message.
Below are a few instances. Notice that this is one particular Argo float.