Add dataset data_normalize_metric_pam50 #4045

Closed
wants to merge 2 commits into from

mguaypaq
Member

Fixes #4044.

For testing, I successfully downloaded the data with:

sct_download_data -d data_normalize_metric_pam50

Question for the reviewer(s): is data_normalize_metric_pam50 the name we want to use for this dataset?

mguaypaq added the feature (category: new functionality) and sct_download_data labels on Feb 21, 2023
mguaypaq added this to the 6.0 milestone on Feb 21, 2023
mguaypaq self-assigned this on Feb 21, 2023
@sandrinebedard
Member

Will it be downloaded with the SCT installation?

> Question for the reviewer(s): is data_normalize_metric_pam50 the name we want to use for this dataset?

Yes, it is fine!

@mguaypaq
Member Author

> Will it be downloaded with the SCT installation?

Not by default. Should it be? If so, we just need to add it to install_sct here:

# Download data
print info "Installing data..."
run mkdir -p "$SCT_DIR/$DATA_DIR"
for data in PAM50 optic_models pmj_models deepseg_sc_models deepseg_gm_models deepseg_lesion_models c2c3_disc_models deepreg_models; do
  run sct_download_data -d "$data"
done

and to install_sct.bat here:

rem Install external dependencies
echo:
echo ### Downloading model files and binaries...
FOR %%D IN (PAM50 optic_models pmj_models deepseg_sc_models deepseg_gm_models deepseg_lesion_models c2c3_disc_models binaries_win deepreg_models) DO sct_download_data -d %%D -k || goto error

spinalcordtoolbox/download.py
@@ -159,7 +159,13 @@
             "https://github.com/ivadomed/multimodal-registration/releases/download/r20220512/models.zip"
         ],
         "default_location": os.path.join(__sct_dir__, "data", "deepreg_models"),
-    }
+    },
+    "data_normalize_metric_pam50": {
Member

> Question for the reviewer(s): is data_normalize_metric_pam50 the name we want to use for this dataset?

Some nitpicks:

  • Elsewhere in sct_download_data, we use PAM50 (capitalized), so it might be good to unify that.
  • Part of me wants to use a name that represents a noun, similar to the other dataset names. For example, rather than "data normalize metric pam50", we would say "PAM50-normalized metrics", which could then become the dataset name PAM50_normalized_metrics.

Member

I am fine with renaming the repo and folder!

+            "https://github.com/spinalcordtoolbox/data_normalize_metric_pam50/archive/refs/tags/r20230221.zip"
+        ],
+        "default_location": os.path.join(__sct_dir__, "data", "data_normalize_metric_pam50"),
+    },
 }
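
For readers following the diff, here is a minimal sketch of how an entry like this could be looked up by name. It is not the code in spinalcordtoolbox/download.py: `SCT_DIR`, `DATASETS`, `resolve_dataset`, and the `"mirrors"` key are hypothetical names chosen for illustration, since the diff above only shows the URL list and the `"default_location"` key.

```python
import os

# Hypothetical stand-in for the SCT install directory (__sct_dir__ in the diff).
SCT_DIR = os.path.join(os.path.expanduser("~"), "sct")

# Assumed registry shape: dataset name -> download URLs + install folder.
# The "mirrors" key name is an assumption; only "default_location" is visible in the diff.
DATASETS = {
    "data_normalize_metric_pam50": {
        "mirrors": [
            "https://github.com/spinalcordtoolbox/data_normalize_metric_pam50/archive/refs/tags/r20230221.zip",
        ],
        "default_location": os.path.join(SCT_DIR, "data", "data_normalize_metric_pam50"),
    },
}


def resolve_dataset(name, registry=DATASETS):
    """Return (list of URLs, destination folder) for a registered dataset name."""
    try:
        entry = registry[name]
    except KeyError:
        raise ValueError(f"Unknown dataset: {name!r}") from None
    return list(entry["mirrors"]), entry["default_location"]


if __name__ == "__main__":
    urls, dest = resolve_dataset("data_normalize_metric_pam50")
    print(urls[0])
    print(dest)
```

Under these assumptions, renaming the dataset (as discussed above) would mean changing the dictionary key, the repository URL, and the folder name together.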


Member

Side-question: Would it be beneficial to combine this PR with #4003?

I suggest this because of the following scenario:

  • We merge this PR to master.
  • #4003 is updated with master to include the new dataset link.
  • We start playing around with #4003 to compute compression metrics from the CSV dataset.
  • During testing, we identify some flaw or concern with the dataset (perhaps some issue with how the metrics were generated via #3977).
  • We need to regenerate the dataset and create a new release link, and thus a new change to master.

By comparison, if this dataset link were to be included as part of #4003, then it could potentially save a bit of clutter on master, and keep related development grouped in the same location, since any changes to the dataset would be part of #4003's development.

(But, maybe the scenario I describe is unlikely, or easy enough to handle if something goes wrong?)

Member

yes, we can include this in #4003, I agree!

@sandrinebedard
Member

OK great! We discussed this in the sct_dev meeting and with the compression metrics project team: we want it included in the SCT installation!

@sandrinebedard
Member

Closing since I merged this branch into #4003 (commit 124cf66).

mguaypaq deleted the mgp/add-data branch on March 13, 2023 at 19:31
Linked issue: Add data_normalize_metric_pam50 in sct_download_data