
Package object structure: Dimensions, and maybe other things #21

Closed
rogiersbart opened this issue Sep 17, 2020 · 5 comments
Assignees
Labels
api 🔨 change likely to affect existing code bug 💣 unexpected problem or unintended behavior

Comments

@rogiersbart
Owner

I just noticed that the current development branch contains a change to many (if not all) of the package object structures: the parameters related to dimensions are now exposed as object$dimensions$np instead of object$np, for instance. That is quite a breaking change, which I should have noticed earlier, and it is actually breaking some of my RMODFLOW code.

I am prepared to adapt, but I believe we need to discuss this. My philosophy was that any dataset/variable defined in the online guide that can easily be exposed to the user as object$name should be exposed that way. In the case of MODFLOW parameters, I agree it becomes too difficult, but plain integers such as hob$nh, hob$mobs and hob$maxm should still be available like that if you ask me. That seems much more user friendly.

Any thoughts?

@rogiersbart rogiersbart added bug 💣 unexpected problem or unintended behavior api 🔨 change likely to affect existing code labels Sep 17, 2020
@cneyens
Collaborator

cneyens commented Sep 17, 2020

I think this was introduced in #8 and added for HOB in #14. My apologies for not explicitly stating the breaking changes. Let me first explain the choices; note that they only affect boundary condition objects and HOB.

The idea for #8 was to make it easier (less verbose) to create boundary condition objects. Instead of supplying every data set as a separate function argument to rmf_create_*, you would only supply rmf_list or rmf_2d_array objects plus dis, and let the function create everything else, like np and itmp, in the background. If you look at MODFLOW boundary condition file structures, they can be condensed into three parts (this is heavily opinionated):

1. dimensions;
2. data: either some sort of "table" (discrete boundary conditions & HOB) or 2D matrices (continuous boundary conditions);
3. information on the stress period timing (not for HOB).

So that's what these objects look like in RMODFLOW: lists with dimensions, data (or recharge/evt) and kper elements.
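For concreteness, a hedged sketch of that shape for a RIV object (the field names and values here are illustrative, not the exact RMODFLOW internals):

```r
# Illustrative sketch only: the general shape of a discrete boundary
# condition object (here RIV), not the actual RMODFLOW implementation.
riv <- list(
  dimensions = list(np = 0, mxactr = 2),  # (1) dimensions: data set sizes
  data = data.frame(                      # (2) data: one row per boundary cell
    k = c(1, 1), i = c(5, 6), j = c(3, 3),
    stage = c(10.2, 10.1), cond = c(50, 50), rbot = c(9.5, 9.4)
  ),
  kper = data.frame(kper = 1:2, active = c(TRUE, TRUE))  # (3) timing
)

riv$data            # pull the whole table at once
riv$dimensions$np   # dimensions now live one level deeper
```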

For dimensions, I get that it's less user-friendly than e.g. hob$nh.

For data, however, I think it's more user-friendly to have a data.frame than to have individual vectors such as riv$stage, riv$conductance etc. If you want to look at your river data, pulling the entire data frame at once makes more sense to me than pulling a bunch of individual vectors. Furthermore, the MODFLOW data structure for RIV, GHB, etc. is literally a data frame-like structure. The same holds for HOB, although there can be list-columns in there for multi-layer observations. For RCH and EVT, the data are lists of rmf_2d_arrays.

Regarding the stress period timing, kper gives you an overview of which tables/arrays are active during each stress period. I think this is more informative than object$itmp, although it does make a strong abstraction from the initial MODFLOW file structure. In general, however, I don't think users are interested in the itmp values, but rather in "which arrays are active in stress period x".

Removing the dimensions element makes sense to me and shouldn't be too difficult. Removing data and/or kper, on the other hand, would require a rewrite of the internals. For those two, I think it's easier to write S3 $ methods that obtain or compute certain object parameters on the fly, if we want to go down that road (see $.crs in sf for a good example).
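A minimal sketch of that idea, assuming a hypothetical "hob" class (the class and element names here are illustrative, not the actual RMODFLOW implementation):

```r
# Sketch of an S3 `$` method that computes a dimension on the fly, in the
# spirit of sf's `$.crs`. Class and element names are hypothetical.
`$.hob` <- function(x, name) {
  if (name == "nh") {
    nrow(x[["data"]])   # derive the number of head observations from the data
  } else {
    x[[name]]           # fall back to plain list extraction
  }
}

hob <- structure(list(data = data.frame(obsnam = c("h1", "h2"))), class = "hob")
hob$nh  # computed on the fly rather than stored as a separate element
```

Using `x[["data"]]` inside the method avoids re-dispatching on `$` and thus avoids infinite recursion.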

@rogiersbart
Owner Author

OK, makes sense, definitely the data frame part. (I am just wondering now what the status of data.frame vs tibble use currently is, but I might look into that later. Note that I would prefer using the latter throughout the package, mainly because of the more user-friendly printing.) I still need to look deeper into the kper thing at some point, but that's OK for now. I will remove the dimensions layer in a feature branch ASAP then, and check whether rmf_optimize() and rmf_analyze() are fixed by that.

@cneyens
Collaborator

cneyens commented Sep 21, 2020

> (just wondering now what the status of data.frame vs tibble use is currently

Currently, everything except rmf_as_tibble uses data.frame. This should be straightforward to adjust.
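For reference, the difference is mostly in printing; a minimal illustration (assuming the tibble package is installed):

```r
library(tibble)

# Same underlying data, different print method.
df  <- data.frame(k = 1, i = 5, j = 3, stage = 10.2)
tbl <- as_tibble(df)

tbl  # prints its dimensions and column types, and truncates large tables
```

Since as_tibble() leaves the underlying data intact, switching the package over should indeed be mostly mechanical.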

> I will remove the dimensions layer in a feature branch then asap

If you want me to do that, just let me know, since I introduced it in the first place. More than happy to help.

@rogiersbart
Owner Author

If you have time for that, that would be great. Just put it in a data-frame-to-tibble feature branch or so; it should be straightforward to merge.

rogiersbart added a commit that referenced this issue Sep 22, 2020
@rogiersbart
Owner Author

Oh, I see I misinterpreted your suggestion above. Well, I had to fix things ASAP to continue working on one of the projects. The tibble change can be done later, I suppose; it's more of a UI thing than anything else, so maybe I'll try to do it when introducing rui throughout the package (unless you feel like taking action already, of course). I will close this one now.
