Proposal: read.adp.ad2cp() should only read one "data type" #2005

richardsc · 2022-10-05T23:43:09Z

Basically, rather than try and read all the present data types into a single object (currently handled by doing, e.g. obj@data$average$... for the 'average' data type), we create an argument for read.adp.ad2cp() that specifies which one to read. The argument could be type or dataID or dataType (I like the latter).

This allows objects to fit the traditional oce style of @data and @metadata slots, without mixing up things that have different data fields and metadata. If an analyst needs to work with more than one of the data types together (e.g. velocities and echosounder), they can devise their own methods based working with the separate objects.

The text was updated successfully, but these errors were encountered:

richardsc · 2022-10-05T23:44:04Z

read.adp.ad2cp() could be made so tha if the user doesn't specify dataType, then it will parse the file for the types that are present, and print them to the console as a guide for what things can be read.

dankelley · 2022-10-06T09:10:40Z

Two very good ideas, thanks.

dankelley · 2022-10-06T09:12:55Z

Work on this issue, likely to begin on the weekend, will be in a branch named i2005 which descends from branch ad2cp (which in turn descends from branch develop).

dankelley · 2022-10-06T11:57:49Z

specifies which one to read. The argument could be type or dataID or dataType (I >like the latter).

@richardsc I'd like to settle on the param name. I'll give some notes, then I'll vote, and then I'll ask you to vote.

Presently, we have an argument that does this called which. (Importantly, it permits multiple types, which will not be allowed in the changes). But we use which a lot in plotting things, so I don't like it here. I prefer your suggestions type, dataID and dataType.

I'm not keen on type because that's already an argument for some read functions. It means the type of instrument, and it has appeared in some objects as @metadata$type for over a decade, so I think it is "taken".

I prefer dataType over dataID because it's more inclusive. I would allow a string or a numeric code (base-10 or base-16), so folks could choose their method.

My vote:

-1 for type
-0.5 for dataID (because I think that means only the numeric codes, which not everyone will remember)
0 for which (only >0 because it exists now, meaning I liked it before)
+1 for dataType

Can you vote, @richardsc so I can make a plan?

richardsc · 2022-10-06T17:02:01Z

I vote +1 for dataType, and I like your idea of accepting both the "name" (which would be in the docs in a table) and the base-16 code that is in the Nortek manual (i.e. the above screenshot).

dankelley · 2022-10-06T17:28:23Z

Good. I'll go with dataType. There will be concerns with what to put in metadata for each type, but they ought to be in new issues so I can keep track.

dankelley · 2022-10-07T11:43:29Z

As usual, I am starting with the documentation. It is below, in first-draft form, so the wording might change but not the meaning -- unless @richardsc interjects soonish.

As I go along, I am also modifying the test suite as appropriate. But I have not commented-out the now-failing tests, so that the package in this branch will not build-test without error. (I want those errors to tell me what to do next.)

dankelley · 2022-10-07T13:14:03Z

Here's an example of the unspecified-dataType usage (with a private file)

[1] "~/Dropbox/oce_secret_data/ad2cp/secret1_trimmed.ad2cp"
> read.adp.ad2cp(f1)
  IDhex IDdec    name occurance
1  0x15    21   burst        88
2  0x16    22 average        11
3  0xa0   160    text         1

dankelley · 2022-10-09T11:02:10Z

NB: I'm changing some warnings to messages, because they don't indicate a problematic state. This will involve changing tests like

            expect_warning(
                expect_warning(
                    d1 <- read.oce(f1, dataType="average"),
                    "using to=100 based on file contents"),
                "setting plan=0")

to read instead like

            expect_message(
                expect_message(
                    d1 <- read.oce(f1, dataType="average"),
                    "using to=100 based on file contents"),
                "setting plan=0")

An advantage of this is that people might use options(warn=3) to get into the debugger at spots that are actually problematic (e.g. an early end-of-file).

dankelley · 2022-10-09T18:09:15Z

I've made some progress in the "i2005" branch, commit b2199da. Only a small part of plotting is known to work (and I've not even begun to update its documentation).

dankelley · 2022-10-11T10:43:21Z

I am going to use this comment, which will be edited in-place, for a checklist of items remaining to be done.

read.adp.ad2cp(..., dataType="echosounderRaw") not computing distance (done in i2005 branch commit 9696ae8)

dankelley · 2023-01-01T16:18:26Z

I have started looking at this again, because I want this in the new CRAN. As a first step, I updated "ad2cp" based on "develop". Then I updated "i2005" based on "ad2cp" (which was tricky because there was a merge conflict that amounted to several hundred lines). Then I tested "ad2cp" and since it passed, I deleted the "i2005" branch.

My next step will be to merge "ad2cp" into "develop", and then to delete "ad2cp".

Why the branch deletions? Because the longer code sits in a branch other than "develop", the more likely it becomes that there will be merge conflicts that (a) waste my time and (b) may introduce errors because it's hard to compare logic flow across similar-but-different 200-line code chunks.

Since the ad2cp code was marked provisional, I don't see any reason to deprecate functions, or to make a new one called (something)NEW. The problem is that the object structure is changed. That means that all the generics -- summary(), subset() and plot etc -- need to be changed, not just read.*()..

dankelley · 2023-01-01T17:32:21Z

I've merged the "ad2cp" branch into "commit", and deleted "ad2cp" locally and on github.

dankelley · 2023-01-01T17:36:44Z

@richardsc I think this issue ought to be closed. It was basically completed a few months ago, and if there are problems, they ought to be discussed in new issues. Note that the "ad2cp" branch no longer exists; testing ought to be done in the "develop" branch.

Closing this gets us one step closer to being able to contemplate a CRAN release. And I always like to have a month or so of stable code before doing that. I am not too worried that read.adp.ad2cp() now works differently than it did prior to the posting of this issue, because it was always marked as provisional code, subject to change. And I am pretty certain that the new code is much more useful than the old code.

dankelley · 2023-02-27T19:22:25Z

During a discussion, CR and I decided to merge this. I've done that, checked that local build/check works, and pushed to GH as commit 68fb448 of the "develop" branch, so I am closing this issue.

richardsc added the ad2cp label Oct 5, 2022

dankelley self-assigned this Oct 6, 2022

dankelley mentioned this issue Oct 10, 2022

drop the ignoreChecksums parameter of read.adp.ad2cp()? #2007

Closed

richardsc mentioned this issue Nov 23, 2022

Support for MIDAS .ntk file format #2017

Closed

dankelley added the for next CRAN label Dec 30, 2022

dankelley closed this as completed Feb 27, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Proposal: read.adp.ad2cp() should only read one "data type" #2005

Proposal: read.adp.ad2cp() should only read one "data type" #2005

richardsc commented Oct 5, 2022

richardsc commented Oct 5, 2022

dankelley commented Oct 6, 2022

dankelley commented Oct 6, 2022

dankelley commented Oct 6, 2022

richardsc commented Oct 6, 2022

dankelley commented Oct 6, 2022

dankelley commented Oct 7, 2022

dankelley commented Oct 7, 2022

dankelley commented Oct 9, 2022

dankelley commented Oct 9, 2022

dankelley commented Oct 11, 2022 •

edited

Loading

dankelley commented Jan 1, 2023

dankelley commented Jan 1, 2023

dankelley commented Jan 1, 2023

dankelley commented Feb 27, 2023

Proposal: read.adp.ad2cp() should only read one "data type" #2005

Proposal: read.adp.ad2cp() should only read one "data type" #2005

Comments

richardsc commented Oct 5, 2022

richardsc commented Oct 5, 2022

dankelley commented Oct 6, 2022

dankelley commented Oct 6, 2022

dankelley commented Oct 6, 2022

richardsc commented Oct 6, 2022

dankelley commented Oct 6, 2022

dankelley commented Oct 7, 2022

dankelley commented Oct 7, 2022

dankelley commented Oct 9, 2022

dankelley commented Oct 9, 2022

dankelley commented Oct 11, 2022 • edited Loading

dankelley commented Jan 1, 2023

dankelley commented Jan 1, 2023

dankelley commented Jan 1, 2023

dankelley commented Feb 27, 2023

dankelley commented Oct 11, 2022 •

edited

Loading