Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nicer model hashes; model_hash_to_model method #86

Open
wants to merge 13 commits into
base: develop
Choose a base branch
from

Conversation

dilpath
Copy link
Member

@dilpath dilpath commented Mar 26, 2024

Model hashes are now {MODEL_SUBSPACE_ID}-{MODEL_SUBSPACE_INDICES}-{PETAB_HASH}.

This makes possible a petab_select_problem.model_hash_to_model method, that converts a model hash into a model.

For parameters with > 10 options, alphabet characters will be used (similar to hex, but now 62 characters from [0-9][A-Z][a-z]. For parameters with > 62 options, the actual index delimited by . will be used.
e.g.
[estimate, 0, estimate, 0, 0] (values) -> [1,0,1,0,0] (indices) -> model_subspace_A-10100-petab_hash (hash)
[1, 36, 0] (indices) -> model_subspace_A-1Z0-petab_hash (hash)
[1, 63, 0] (indices) -> model_subspace_A-1.63.0-petab_hash (hash)

Also possible to just go with the last option with .s for simpler implementation and broad applicability. However, I expect most users will have parameters that can only be one of (0, estimate}, in which case all indices will be one of {0, 1}, so the more compact first representation is probably preferred.

The PEtab hash is computed from only the following information:

  • absolute location of the PEtab problem YAML file in the filesystem
  • nominal values of parameters in the model's PEtab problem
  • estimated parameters in the model's PEtab problem

Dilan Pathirana added 2 commits March 26, 2024 19:23
Copy link
Member

@dweindl dweindl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, very convenient, thx.

But wait, is it really model_subspace_A-1.63.0? I think it produces model_subspace_A.1-63-0.

I am not sufficiently familiar with the matter to tell whether there can be any collisions or not.

Some docstrings + return type annotations would be great.

petab_select/model.py Outdated Show resolved Hide resolved
petab_select/model.py Outdated Show resolved Hide resolved
petab_select/model.py Outdated Show resolved Hide resolved
@dweindl
Copy link
Member

dweindl commented Mar 26, 2024

Also possible to just go with the last option with .s for simpler implementation and broad applicability. However, I expect most users will have parameters that can only be one of (0, estimate}, in which case all indices will be one of {0, 1}, so the more compact first representation is probably preferred.

No strong opinion. Let's see it in action and decide then.

dilpath and others added 4 commits March 26, 2024 23:04
Co-authored-by: Daniel Weindl <dweindl@users.noreply.github.com>
@dilpath
Copy link
Member Author

dilpath commented Mar 26, 2024

But wait, is it really model_subspace_A-1.63.0? I think it produces model_subspace_A.1-63-0.

🙈 Thanks! I doubt this case in the current implementation will ever be used, I've never heard of anyone using more than maybe 3 or so options for a parameter. Not sure whether PEtab Select would even handle this case well.

I am not sufficiently familiar with the matter to tell whether there can be any collisions or not.

It should be that every model subspace ID is unique in the context of a single PEtab Select problem, which also means that every model subspace ID exists in only one row of one model space file, of possibly many model space files. So, a model_subspace_id with its model subspace indices should correspond to one model in the full model space, at most.

It could be that multiple model subspace ID + indices combinations correspond to the same "model" (same mathematical model structure and estimated parameters), due to e.g. model subspace rows in the model space file that match in everything except their unique model subspace ID. This would mean that the "same" model that exists in different subspaces could be calibrated multiple times (once per duplicate) during model selection. But, this would always be under a unique hash, so no collisions at least.

No strong opinion. Let's see it in action and decide then.

Sounds good!

@dweindl
Copy link
Member

dweindl commented Mar 27, 2024

It should be that every model subspace ID is unique in the context of a single PEtab Select problem

Thanks for the explanation. I think it would be good to stress in the docs that those hashes are context-dependent (which is usually not the case for hashes). I.e., after modifying a given PEtab select problem, a certain hash may identify a different model. This shouldn't really be a practical limitation, but users need to be aware.

petab_select/model.py Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants