Generate metrics from external regressors using F stats #1064
I went through and added comments. The method looks solid to me. I have a few thoughts on style though:
- There's a lot of commented vestigial code.
- The docstrings should have type annotations, per "Incorporate type hints when possible" (#704). I would also recommend requiring parameters to be named in the function calls (i.e., by putting a leading `*` before the parameters).
- Some variable names do not match the rest of the codebase (e.g., you use `n_time` instead of `n_vols`).
- It would be great if strings and lines in docstrings were broken on punctuation. This will result in cleaner diffs in the future.
- We're going to need extensive tests to cover the new code.
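As an illustration of the type-hint and keyword-only suggestions above, here is a minimal sketch. The function `count_outlier_vols` and its parameters are hypothetical and are not taken from the tedana codebase; only the leading `*`, the annotations, and the `n_vols` naming reflect the review comments.

```python
from typing import Sequence


def count_outlier_vols(
    *,  # everything after the bare * must be passed by name
    data: Sequence[float],
    n_vols: int,
    threshold: float = 3.0,
) -> int:
    """Count volumes whose value exceeds a threshold.

    Parameters
    ----------
    data : Sequence[float]
        One value per volume.
    n_vols : int
        Number of volumes expected in ``data``.
    threshold : float, optional
        Cutoff above which a volume is counted.
        Default is 3.0.

    Returns
    -------
    int
        Number of volumes above ``threshold``.
    """
    if len(data) != n_vols:
        raise ValueError(f"Expected {n_vols} volumes, got {len(data)}.")
    return sum(value > threshold for value in data)


# Callers must name every argument, so the call site documents itself.
n_high = count_outlier_vols(data=[0.5, 4.2, 3.1, 1.0], n_vols=4)

# A positional call raises TypeError instead of silently swapping arguments.
try:
    count_outlier_vols([0.5, 4.2, 3.1, 1.0], 4)
    positional_allowed = True
except TypeError:
    positional_allowed = False
```

The bare `*` costs nothing at runtime and makes later reordering of parameters safe, since no caller can depend on position.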
Co-authored-by: Taylor Salo <salot@pennmedicine.upenn.edu>
@tsalo I just added type hints to
@tsalo Can you explain more what you mean by "It would be great if strings and lines in docstrings were broken on punctuation. This will result in cleaner diffs in the future."?
Absolutely. Take a paragraph in the text, like the one below:

This is sentence one. This is
sentence two. This is sentence
three- but it's still going.

Since sentences are a unit within the paragraph, we're more likely to change them than random words.

-This is sentence one. This is
+This is sentence one. This is a
-sentence two. This is sentence
+new sentence two with other
-three- but it's still going.
+changes. This is sentence three-
+but it's still going.

The diff on that, where the only consideration is line length, is much more extensive (and harder to interpret for a reviewer) than the diff if we broke up the text on punctuation:

This is sentence one.
-This is sentence two.
+This is a new sentence
+two with other changes.
This is sentence three-
but it's still going.
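The comparison above can be checked with Python's standard `difflib` module. This is a small sketch, not part of the PR; the helper name `changed_lines` is made up for illustration, and it simply counts the `+` and `-` lines in a unified diff for each wrapping style.

```python
import difflib


def changed_lines(before: str, after: str) -> int:
    """Count added plus removed lines in a unified diff of two texts."""
    diff = list(
        difflib.unified_diff(
            before.splitlines(), after.splitlines(), lineterm="", n=0
        )
    )
    body = diff[2:]  # drop the '---' and '+++' file header lines
    return sum(line.startswith(("+", "-")) for line in body)


# Same edit to the same text, wrapped only by line length.
wrapped_before = (
    "This is sentence one. This is\n"
    "sentence two. This is sentence\n"
    "three- but it's still going."
)
wrapped_after = (
    "This is sentence one. This is a\n"
    "new sentence two with other\n"
    "changes. This is sentence three-\n"
    "but it's still going."
)

# Same edit again, with lines broken on punctuation.
semantic_before = (
    "This is sentence one.\n"
    "This is sentence two.\n"
    "This is sentence three-\n"
    "but it's still going."
)
semantic_after = (
    "This is sentence one.\n"
    "This is a new sentence\n"
    "two with other changes.\n"
    "This is sentence three-\n"
    "but it's still going."
)

n_wrapped = changed_lines(wrapped_before, wrapped_after)
n_semantic = changed_lines(semantic_before, semantic_after)
```

For this text the length-wrapped version touches 7 lines and the punctuation-wrapped version touches 3, matching the two diffs shown in the comment.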
Closes #1009. This is an alternative approach to #1008 and #1021.
If a user provides external regressors, this will calculate a fit to those regressors to include as metrics in the component table and for use in decision trees.
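To make the idea of "a fit to those regressors" concrete, here is a minimal sketch of a nested-model F test using NumPy. It is an assumption about the general technique, not the implementation in this PR: the function name `nested_f_stat`, the simulated data, and the mean-only baseline model are all hypothetical.

```python
import numpy as np


def nested_f_stat(y, x_full, x_reduced):
    """F statistic comparing a full design matrix against a nested reduced one.

    Assumes both design matrices have full column rank and that the
    columns of x_reduced are a subset of the columns of x_full.
    """
    n_vols = y.shape[0]

    def sse(design):
        # Ordinary least squares fit, then the sum of squared residuals.
        betas, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ betas
        return float(resid @ resid)

    p_full, p_reduced = x_full.shape[1], x_reduced.shape[1]
    sse_full, sse_reduced = sse(x_full), sse(x_reduced)
    numerator = (sse_reduced - sse_full) / (p_full - p_reduced)
    denominator = sse_full / (n_vols - p_full)
    return numerator / denominator


rng = np.random.default_rng(0)
n_vols = 200
baseline = np.ones((n_vols, 1))  # mean removal only (polynomial order 0)
motion = rng.standard_normal((n_vols, 2))  # stand-in external regressors
x_full = np.hstack([baseline, motion])

# One simulated component driven by the regressors, one that is pure noise.
comp_motion = motion @ np.array([1.5, -1.0]) + rng.standard_normal(n_vols)
comp_noise = rng.standard_normal(n_vols)

f_motion = nested_f_stat(comp_motion, x_full, baseline)
f_noise = nested_f_stat(comp_noise, x_full, baseline)
```

A component time series that the external regressors explain well gets a large F value, while an unrelated one stays small. A value like this could then be stored per component and thresholded in a decision tree.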
Changes proposed in this pull request:
- `metrics.external`, for calculating metrics from external regressors
- `task_keep`: If regressors under this label are included, these will be excluded from the full F model and a separate full F model will be calculated using just these regressors. This was a suggestion from @dowdlelt so that it would be possible to identify and conservatively retain components that fit to the overall task design.
- `--external`, to pass in a TSV with external regressors.
- `external_regressor_config`, which is a dictionary with the following keys:
  - `info`: A description of what the config does that's saved
  - `report`: A description to add to `report.txt`
  - `calc_stats`: Currently the only option is "F", but this leaves open possibilities for additional options.
  - `detrend`: Will automatically calculate the number of polynomial detrending regressors if true, but can also be a number for users to specify. If this is false, then it will be set to 0 (mean removal) and a warning will be logged.
  - `f_states_partial_models` [optional]: A list of titles for the partial models (e.g., `["Motion", "CSF"]`)
  - If `f_states_partial_models` has model names, each model name is its own key and contains either a list of column labels or a regular expression wildcard (e.g., `"Motion": ["^mot_.*$"]` means the Motion partial model will include any external regressor column label that begins with `mot_`, and `"Motion": ["mot_x", "mot_y", "mot_z"]` specifies 3 specific column label names)
  - `task_keep` [optional]: Contents are a regex wildcard or specific names to define task regressors that will not be included in the full model
- `demo_minimal_external_regressors_motion_task_models.json` uses all the above options. It uses the partial models to add a classification_tag without changing results, and it retains components that correlate with the task (R2>0.5) and have kappa>elbow, regardless of what rho is.
- `demo_minimal_external_regressors_single_model.json` uses the minimum number of options to run with external regressors.
To do: