Adding derived vectors/matrices to a CompadreData object #26

Closed
patrickbarks opened this issue Aug 21, 2018 · 4 comments
Comments

@patrickbarks
Collaborator

Related to #25, I think it would be useful if there were a way for a user to add new slots containing vectors, matrices, or lists to a CompadreData object.

In many demographic analyses users will want to derive vectors or matrices from an MPM (e.g. population vector, stable distribution, sensitivity matrix, collapsed or rearranged matrix), and it would be ideal if these remained part of the CompadreData object to maintain the mapping with the metadata and original MPM. This becomes particularly important if the user wishes to subsequently subset the db based on some derived value(s).

Many of the functions in Rcompadre/Rage return vectors or matrices derived from an MPM (e.g. reprodStages, identifyReprodStages, rearrangeMatrix, splitMatrix, collapseMatrix), and I gather these currently need to be stored separately from the db.

I don't know much about S4 and so don't have a sense of whether this is feasible, but perhaps @tdjames1 and @iainmstott can weigh in.

@iainmstott
Collaborator

I can jump in quite quickly on this one.

The short answer is: it's not possible and probably not desirable with S4.

The long answer is...

Some justification:
S4 is designed for very well-defined data structures (which is why we chose it: the structure of the databases is very rigid). It's not possible to add slots unless we specify what class they're for, and there's no way of knowing what class the user would want to add. We could add something like 'list' and leave it empty, but it's definitely a fudge. We'd have to decide on the name of that slot, so the user wouldn't be able to call it what they want. That extra slot would give us one extra variable, but that's it: if we want more than one we'd also have to anticipate how many slots people may want, as it's not possible (or desirable) to add slots to an S4 object ad hoc. An S4 object is designed to be inflexible: it really is the case that once you define the structure, that's it.
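To illustrate the rigidity described above, here is a minimal sketch in R. The class `DemoDB` and its slot names are hypothetical stand-ins, not the actual CompadreData definition; the point is only that assigning to a slot missing from the class definition fails.

```r
library(methods)

# Hypothetical stand-in class; NOT the real CompadreData definition.
setClass("DemoDB", slots = c(metadata = "data.frame", mat = "list"))

db <- new("DemoDB",
          metadata = data.frame(id = 1:2),
          mat = list(diag(2), diag(2)))

# Try to attach a slot that the class definition does not declare.
# The assignment is checked against the class and raises an error.
result <- tryCatch({
  db@stableDist <- list(c(0.5, 0.5), c(0.5, 0.5))
  "assigned"
}, error = function(e) "error")

print(result)
```

Because the set of slots is fixed when the class is defined, the only ways to carry extra data are to declare the slot up front or to define a new class.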

Options to do what you need:

  • It is possible to add data to the metadata: saving the matrices as strings in extra metadata columns is a possibility, but I realise it's clunky when you're trying to work with the data, if you want to keep the new variable associated with the database all the time.
  • This one may give you a satisfactory solution. The CompadreData and CompadreM classes are exported, so it's possible for a user to have access to these and define a new class of their own desired structure, which inherits from this/these classes, if they so wish. This would do exactly what you want, as you would get to define the new slots. It's easy to do. It wouldn't be a part of the package, though. We could include it in an "Advanced users" section in the vignette or examples, perhaps.
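The second option could look roughly like the sketch below. It assumes a hypothetical base class `BaseDB` standing in for the exported CompadreData class (whose real slots may differ), and a hypothetical subclass `DerivedDB` that adds one user-defined slot, `stableDist`.

```r
library(methods)

# Hypothetical stand-in for the exported package class.
setClass("BaseDB", slots = c(metadata = "data.frame", mat = "list"))

# User-defined subclass: inherits every slot of BaseDB and adds its own.
setClass("DerivedDB", contains = "BaseDB", slots = c(stableDist = "list"))

# One illustrative 2 x 2 projection matrix (values are made up).
mats <- list(matrix(c(0, 0.5, 2, 0.8), nrow = 2))

# Stable stage distribution: dominant right eigenvector, scaled to sum to 1.
stable <- lapply(mats, function(m) {
  w <- Re(eigen(m)$vectors[, 1])
  w / sum(w)
})

db <- new("DerivedDB",
          metadata = data.frame(id = 1L),
          mat = mats,
          stableDist = stable)

# The new object is still recognised as an instance of the base class,
# so methods written for BaseDB are dispatched to it as well.
print(is(db, "BaseDB"))
```

The trade-off is the one noted above: the subclass lives in the user's own code rather than in the package, so each user has to maintain their own definition.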

I get where you're coming from... the new structure will probably change the way I work with the data too. But for me, the benefits of locking in the structure of the databases for general use outweigh this.

If/when we get to functions that query and pull straight from the servers, having this locked-in structure will also be really helpful, I think.

@levisc8
Collaborator

levisc8 commented Aug 21, 2018

The metadata slot is an S3 data.frame so you can add whatever you want to that as long as we don't impose a dimension or column name check in the CompadreData class definition. The main drawback here is that you can only add to the metadata table, not actually create a new slot. But see #27 for some other (perhaps overly verbose) thoughts.

@patrickbarks
Collaborator Author

I somehow didn't realize that an S3 data.frame could have list-columns. I thought that was extra tibble magic, but I see now that it works just fine with data.frame too. So that basically solves all my problems.
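A minimal sketch of the list-column approach with a plain data.frame follows. The column names and matrix values are illustrative assumptions, not the actual metadata layout of a CompadreData object.

```r
# Illustrative metadata table; column names are assumptions.
meta <- data.frame(Species = c("sp_A", "sp_B"), stringsAsFactors = FALSE)

# Two made-up 2 x 2 projection matrices, one per row of the metadata.
mats <- list(matrix(c(0, 0.5, 2, 0.8), nrow = 2),
             matrix(c(0, 0.3, 0.5, 0.4), nrow = 2))

# Assigning a list with one element per row creates a list-column.
meta$matA <- mats

# Derived scalar: dominant eigenvalue (population growth rate) per matrix.
meta$lambda <- vapply(meta$matA,
                      function(m) max(Re(eigen(m)$values)),
                      numeric(1))

# Derived vector stored as another list-column: stable stage distribution.
meta$stableDist <- lapply(meta$matA, function(m) {
  w <- Re(eigen(m)$vectors[, 1])
  w / sum(w)
})

# Subsetting rows on a derived value keeps every list-column aligned.
growing <- meta[meta$lambda > 1, ]
print(growing$Species)
```

Each derived vector or matrix stays on the same row as its source matrix and metadata, which is the mapping the original request was after.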

Regardless, I like the idea of merging metadata and mat into a data slot (#27), which makes adding additional list-columns for derived vectors/mats slightly more intuitive.

@patrickbarks
Collaborator Author

Adding derived columns is now straightforward thanks to @iainmstott's updates in #32 and #43, so closing this issue.


3 participants