unable to read microCAT cnv files created with firmware version 3 #1457

Closed
clayton33 opened this issue Nov 9, 2018 · 24 comments

@clayton33
Collaborator

Short summary of problem

Unable to read SeaBird MicroCAT cnv files created with firmware version 3. As of now, oce is able to read cnv files from this instrument that were created with firmware version 2.

A related question, though separate from this issue: is read.ctd.sbe intended for, or suitable for, reading moored CTD data?

Trimmed files have been sent privately in an e-mail, data are OK to be public for testing purposes.

What you did

> library(oce)
Loading required package: testthat
Loading required package: gsw
> old <- read.oce('old.cnv')
Warning message:
In read.ctd.sbe(file, processingLog = processingLog, ...) :
  created 'salinity' from 'temperature', 'conductivity' and 'pressure'
> new <- read.oce('new.cnv')
Error in oceSetData(res, name = "salinity", value = S, unit = list(unit = expression(),  : 
  object 'S' not found
In addition: Warning messages:
1: In cnvName2oceName(lines[nameLines[iline]], columns, debug = debug -  :
  unrecognized SBE name 'cond0S/m'; consider using 'columns' to define this name
2: In cnvName2oceName(lines[nameLines[iline]], columns, debug = debug -  :
  unrecognized SBE name 'timeJV2'; consider using 'columns' to define this name
3: In read.ctd.sbe(file, processingLog = processingLog, ...) :
  cannot find salinity or conductivity in .cnv file; try using columns argument if the file actually contains these items

How urgent is this?

Not urgent. This can wait a few days

Output from sessionInfo()

sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Linux Mint 18

Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0

locale:
 [1] LC_CTYPE=en_CA.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_CA.UTF-8    
 [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_CA.UTF-8   
 [7] LC_PAPER=en_CA.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods  
[7] base     

other attached packages:
[1] oce_0.9-24     gsw_1.0-5      testthat_2.0.0

loaded via a namespace (and not attached):
[1] compiler_3.4.3 magrittr_1.5   R6_2.2.2       tools_3.4.3   
[5] yaml_2.1.19    Rcpp_0.12.18   rlang_0.1.6  
@dankelley
Owner

dankelley commented Nov 9, 2018

Thanks. I've started work on this. I've broken the task down as follows (cuz I like checklists) ... please note that I will tick and untick these things as I work within my not-committed branch, so the list will just be for my own purposes as a scratch paper, until I post another comment.

  • make 'columns' work
  • document 'columns' better
  • add 'columns' methodology to the build-test suite, so if recoding breaks it, I'll know
  • address the object 'S' not found message
  • add "cond0S/m" to the list of recognized column name patterns
  • add "timeJV2" to the list of recognized column name patterns
  • decide what to do with this timeJV2 thing ... I don't like calling it "time" because in oce, time quantities are usually POSIX times (or strings that can be decoded into POSIX times). See issue #1458, "seabird CTD (cnv) files should have better names for time columns".

@dankelley
Owner

dankelley commented Nov 9, 2018

Test code follows:

library(oce)
old <- read.oce("old.cnv")
new1 <- read.oce("new.cnv")
new2 <- read.oce("new.cnv",
    columns=list(conductivity=list(name="cond0S/m",
                                   unit=list(unit=expression(S/m), scale=""))))

Test data files attached. Note: must rename them to old.cnv and new.cnv before running the test code.

new.cnv.txt
old.cnv.txt

@dankelley
Owner

PS on the test files: the longitude is given unsigned, so plot() shows them on land :-)

@dankelley
Owner

Done in "develop" branch, commit c27acc1

Reporter recheck requested.

@richardsc
Collaborator

In answer to this question:

A related question, though separate from this issue: is read.ctd.sbe intended for, or suitable for, reading moored CTD data?

Yes! If you look in the ctd class metadata, there is a field called "deploymentType", which can be moored. If it is, the default plots are different (but that might be the only real practical difference that I know of).

@dankelley
Owner

Hm. If I set it to moored, it doesn't plot:

new2m <- oceSetMetadata(new2, "deploymentType", "moored")

then I get an error:

> plot(new2m, debug=10)
plot,ctd-method(..., eos="gsw", inset=FALSE, ...) {
Error in xy.coords(x, y, xlabel, ylabel, log) : 
  'x' and 'y' lengths differ
> traceback()
9: stop("'x' and 'y' lengths differ")
8: xy.coords(x, y, xlabel, ylabel, log)
7: plot.default(x, y, axes = FALSE, xaxs = xaxs, yaxs = yaxs, xlim = if (missing(xlim)) NULL else xlim, 
       ylim = if (missing(ylim)) NULL else ylim, xlab = xlab, ylab = ylab, 
       type = type, cex = cex, ...)
6: plot(x, y, axes = FALSE, xaxs = xaxs, yaxs = yaxs, xlim = if (missing(xlim)) NULL else xlim, 
       ylim = if (missing(ylim)) NULL else ylim, xlab = xlab, ylab = ylab, 
       type = type, cex = cex, ...)
5: plot(x, y, axes = FALSE, xaxs = xaxs, yaxs = yaxs, xlim = if (missing(xlim)) NULL else xlim, 
       ylim = if (missing(ylim)) NULL else ylim, xlab = xlab, ylab = ylab, 
       type = type, cex = cex, ...) at oce.R#1203
4: oce.plot.ts(x[["time"]], x[["salinity"]], ylab = resizableLabel("S", 
       "y", debug = debug - 1)) at ctd.R#3569
3: .local(x, ...)
2: plot(new2m, debug = 10)
1: plot(new2m, debug = 10)

I guess this could be a new issue, but I'll pursue it within this issue, because I think it relates to the use of a julian day, not stored as a time.

@dankelley
Owner

dankelley commented Nov 10, 2018

Yes, it's definitely because the object has no item named "time" in the data, so x[["time"]] is getting the time from the metadata, which for this file is "2018-10-11 16:27:31 UTC". However, that is the "System Upload Time", which I think is irrelevant.
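To illustrate the mismatch described above, here is a minimal base-R sketch (not the oce code itself). It assumes a single metadata time and a made-up salinity vector, and it calls xy.coords(), the function at frame 8 of the traceback, directly so that no graphics device is needed.

```r
# Hypothetical stand-ins: one upload time from the metadata, several salinity values.
metadataTime <- as.POSIXct("2018-10-11 16:27:31", tz = "UTC")
salinity <- c(30.1, 30.2, 30.3)  # made-up values, for illustration only

# xy.coords() refuses x and y of unequal length; capture the message it raises.
msg <- tryCatch({
    grDevices::xy.coords(metadataTime, salinity)
    "no error"
}, error = function(e) conditionMessage(e))
print(msg)
```

The length-1 time against a length-3 salinity reproduces the same kind of failure as in the traceback, which is why a per-sample "time" column is needed in the data.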

So, I think the best plan would be to dig into the timeJV2 column. For this, I'll probably need help from someone who has used this device, so I am asking an oce co-developer or Roger (if he is in github now) to advise me on this. Below I'm putting some code and results. I am using the full file here. I can't remember whether @richardsc has a copy of that file already, but I know that @clayton33 has a copy, and so does Roger (I cannot do the at-roger thing because I think he is not on GH yet ... Monday, maybe?)

library(oce)
microcat <- read.oce("2062a.cnv")
microcat <- oceSetMetadata(microcat, "deploymentType", "moored")
## Get system upload time, which should give us the year of the recovery
## time. We'll need that because we are going to construct time from
## the timeJV2 column.
microcat[["metadata"]][["time"]]
## Now, we ought to check on how many days the machine has been
## recording, because it could be multiple years, and if so we'll
## need to set our start time accordingly
range(microcat[["timeJV2"]])
## OK, under a year. Assuming no wrap-around (which would give
## an upper range of 365 or 366), we can now make a time column
t0 <- as.POSIXct("2018-01-01 00:00:00", tz="UTC")
t <- t0 + microcat[["timeJV2"]] * 86400
## Let's insert this into the data
microcat <- oceSetData(microcat, "time", t)
## Finally, to avoid confusion (whew), we destroy the time
## entry from metadata
microcat <- oceDeleteMetadata(microcat, "time")
plot(microcat)

NOTE: this code (or an update to it) is at https://github.com/dankelley/oce-issues/blob/master/14xx/1457/1457b.R but to make it work, you'll need the full .cnv file.

This creates results as below. You can see the thing going in the water. Note that the map is completely wonky because the longitude is incorrectly given as a positive number; changing that is trivial, of course, but not the main point here.

[Figure: 1457b, the plot produced by 1457b.R]

@dankelley
Owner

PS: time could be out by a day, of course; I've no idea whether seabird starts at 0 or at 1, etc.

@dankelley
Owner

REQUEST to BIO colleagues:

It would be wonderful if someone with access to a lot of your cnv files could run the following

git grep "^# name " *cnv > dk

and then email me (or post here) the resultant file named dk. This will let me ensure that oce can decode the column names that are likely to come up with BIO files. (I prefer to focus oce first on data that exist "in the wild", rather than trying to decode every possibility listed in every manufacturer's documentation. Writing code that is not going to get exercised is like inserting appendices into a body cavity ... just more stuff to yank out later.)

@pettipasrg

Hi Dan. I ran the code snippet above on 12 ".cnv" files, the results are posted in the attachment.
dk.txt

What I have encountered with the time formats (regardless of "timeJ" or "timeJV2") is that it is a day number starting with 1 at Jan 01 at 00:00 UTC. It will continue at year end so that Jan 01 of the next year is 366. However I see in the module that creates the ".cnv" files that there is an option to start the next year over at day 0. While I haven't seen that used here, I will have to try it out and let you know what happens....
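The convention described above (day 1 is Jan 01 at 00:00 UTC, and the count keeps running past the end of the year) could be turned into times roughly as follows. This is only a sketch in base R: the function name julianDayToTime and its year argument are made up for illustration and are not part of oce, and the year must be supplied by the user because the column itself does not carry it.

```r
# Convert a Sea-Bird style day-of-year (day 1 = Jan 01 00:00 UTC) to POSIXct.
# 'julianDayToTime' is a hypothetical helper, not an oce function.
julianDayToTime <- function(jd, year) {
    t0 <- ISOdatetime(year, 1, 1, 0, 0, 0, tz = "UTC")
    t0 + (jd - 1) * 86400  # subtract 1 because the count starts at 1, not 0
}

# Day 1 is the start of the year; day 366 of a non-leap year is Jan 01 of the next.
print(julianDayToTime(c(1, 114.625, 366), year = 2017))
```

Subtracting 1 is the only difference from adding the raw day number to Jan 01, and it matters: without it, every time would be one day late.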

@dankelley
Owner

dankelley commented Nov 13, 2018

Thanks, @pettipasrg -- I converted to a checklist, and will tick items off once I've ensured that they work (after recoding, if need be). I'll add any new ones that you send to this comment.

  • c0S/m: conductivity [S/m]
  • cond0S/m: Conductivity [S/m]
  • flag: 0.000e+00
  • pr: pressure [db]
  • prdM: Pressure, Strain Gauge [db]
  • sbeox0ML/L: Oxygen, SBE 43 [ml/l]
  • t090: temperature, ITS-90 [deg C]
  • timeJ: time [julian days]
  • timeJV2: Time, Instrument [julian days]
  • tv290C: Temperature [ITS-90, deg C]

@pettipasrg

Thanks. I tried the "start over at day 0" option in the Seabird conversion module and got

365.993056
1.000000
1.006944
when the time rolled over from 2017 to 2018. This is unlikely to occur in a file but remains a possibility.
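If such a wrapped file did turn up, the day numbers could be made continuous again before conversion to times. The sketch below is one possible approach in base R, not something oce is known to do: it treats every decrease in the day number as a year boundary and adds the length of the year just completed. The helper names daysInYear and unwrapJulianDay are hypothetical, and the starting year is assumed to be known.

```r
# Number of days in a (Gregorian) year.
daysInYear <- function(year) {
    ifelse((year %% 4 == 0 & year %% 100 != 0) | year %% 400 == 0, 366, 365)
}

# Undo "start over at the new year" wrapping of day-of-year values.
# Hypothetical helper; assumes samples are in time order.
unwrapJulianDay <- function(jd, startYear) {
    wraps <- c(0, cumsum(diff(jd) < 0))  # year boundaries crossed so far
    completed <- startYear + seq_len(max(wraps)) - 1
    offset <- c(0, cumsum(daysInYear(completed)))[wraps + 1]
    jd + offset
}

jd <- c(365.993056, 1.000000, 1.006944)  # the values reported above
print(unwrapJulianDay(jd, startYear = 2017))
```

With the three values reported above, the unwrapped sequence continues through 366, which is the non-wrapped convention described earlier in the thread.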

@dankelley
Owner

@pettipasrg -- what did the column name end up being? I hope SBE is giving a different name, so people can detect this.

Also, the test code https://github.com/dankelley/oce-issues/blob/master/14xx/1457/1457b.R has been updated, to spit out times as decoded by oce. The final 3 lines of output from that are now

t range:  2018-04-24 15:00:01.036 to 2018-10-11 16:20:00.988 
datacnv_date= # datcnv_date = Oct 11 2018 16:28:07, 7.26.1.8 [datcnv_vars = 4] 
systemUploadTime:  2018-10-11 16:27:31 

which suggests that this code is getting the time right.

If SBE is using the name timeJV2 for Julian Day starting at both 0 and 1, then it would be easy for me to cause the [[ accessor function (@clayton33 can explain what I mean by that) to do the sort of conversion I have in https://github.com/dankelley/oce-issues/blob/master/14xx/1457/1457b.R, saving the user from having to figure that out. (Anyone reading this can thumbs-up or down...)

@dankelley
Owner

Sorry, @pettipasrg, I now see that you were doing a test of JD wrapping, not JD start time. QUESTION: does the SBE JD always start at 1? (I think that's the matlab convention. If we know that this is always the case, then I'll code in so that the [[ accessor can retrieve time in the normal R convention.)

@pettipasrg

In my experience (I wish I could easily find documentation) it starts at day 1. But here's another wrinkle, I haven't seen it (here) before but it is an option in the Seabird time format.

name 3 = timeK: Time, Instrument [seconds]

I tried this and got records (the fourth column is the time)

18.8150   0.000020      0.107  577897201     0.0095  0.000e+00
18.8237   0.000017      0.096  577897801     0.0095  0.000e+00
18.8261   0.000017      0.105  577898401     0.0095  0.000e+00

Since the first record is 24-Apr-2018 15:00, that means that the zero time is 01-Jan-2000, so elapsed seconds since 01-Jan-2000. The sampling interval of this instrument is 10 minutes so the numbers make sense.
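The arithmetic above can be checked in base R. This is only a sketch of the conversion, using the three timeK values from the records shown and assuming the origin of 2000-01-01 UTC inferred in this comment.

```r
# timeK values from the three records above (seconds since 2000-01-01 UTC).
timeK <- c(577897201, 577897801, 577898401)

# Convert to POSIXct using the inferred origin.
t <- as.POSIXct(timeK, origin = "2000-01-01", tz = "UTC")
print(t)

# Spacing between samples, in minutes.
print(diff(timeK) / 60)
```

The first value converts to 24-Apr-2018 15:00:01 UTC, and the samples are 10 minutes apart, consistent with the sampling interval mentioned above.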

@dankelley
Owner

Yes, timeK is known to oce, but it is just stored as-is in the data slot of the object returned by the read.* function. FYI, below is a snapshot from the SBE docs, which shows that timeK is well described. I may code in the accessor this afternoon, since this is on my mind ... I want a person to be able to do plot(read.oce("file.cnv")) and get something reasonable, without having to insert times into the object "manually" by recognizing what timeJV2 etc mean.

[Image: screenshot of the SBE documentation describing timeK, 2018-11-13]

@dankelley
Owner

Hm. I wonder if any of the BIO people in this thread can suggest some flag in the .cnv file that would tell me that the microcat is in mooring mode. If not, I ought to add an argument to read.ctd.sbe(), perhaps.

@pettipasrg

I found the following in an old SBE19 file

  • SEACAT PROFILER V3.1 SN 2352 11/16/:2 13:06:23.983
  • strain gauge pressure sensor: S/N = 180159, range = 1000 psia, tc = 576
  • clk = 32767.289 iop = 157 vmain = 11.9 vlith = 5.0
  • mode = PROFILE ncasts = 9
  • sample rate = 1 scan every 0.5 seconds
  • minimum raw conductivity frequency for pump turn on = 3214 hertz
  • pump delay = 40 seconds
  • samples = 13828 free = 160300 lwait = 0 msec
  • SW1 = C8 battery cutoff = 7.3 volts
  • number of voltages sampled = 0
  • logdata = NO

Can't say for sure that it would be representative of a newer file. I'll ask around!

@pettipasrg

Pages 28, 34, 36 and 38 of SBE Data Processing have examples.

@dankelley
Owner

Dear reporter,

Do you think this issue (as defined by its title) has been addressed? If so, please close it. If not, please add a comment explaining what remains to be done. Thanks!

PS. This is a standardized reply.

@clayton33
Collaborator Author

Test code works for me. Closing this, but @pettipasrg can re-open it if need be.

@dankelley
Owner

Thanks, @clayton33 . To @pettipasrg if you run the code at https://github.com/dankelley/oce-issues/blob/master/14xx/1457/1457c.R you'll see that this works quite well. The temporal variability in the second of the two plots produced by that code (i.e. the plot showing only at-depth data) makes me think this is a pretty neat dataset!

@pettipasrg

The example from yesterday had no time field in the data portion, so the time was determined from the header. I asked around and we can't find any SBE data files with the mode set to "MOORED".

@richardsc
Collaborator

I don't know of any reliable header field that indicates what the deploymentType should be. There already is an arg in read.ctd(), which is what I use.

You could guess the deployment type by the instrument type, but I don't think that's worth the work (e.g. an SBE37 is almost always used as a moored instrument, and an SBE19/25 is almost always a profiling instrument)


4 participants