Make check in RSMTool more explicit #393

desilinguist · 2020-03-05T18:20:57Z

There's currently a place in RSMTool where we raise a ValueError if a coefficients file already exists in the output directory. There's no documentation as to why this is being checked.

From @aloukina, this is why we do this:

This is what I think it’s doing in a roundabout way: it uses the existence of a coefficients file as a shortcut to establish that we are dealing with a linear model. If you first run an LR experiment and then follow it with SVR using the same id and the same output_directory, you’ll get this error. The coefficients file will be there but the modeler will not be able to scale them because these lines will fail.

We should replace this roundabout check with a much more explicit one, e.g., simply testing whether predconfig.get_coefficients() fails. That would make the code much more readable.


desilinguist · 2020-05-01T15:28:27Z

@aloukina perhaps what we should do here is to just delete the original coefficients file and print a warning saying that an old coefficients file was detected and removed?

aloukina · 2020-05-04T13:33:26Z

I am more in favor of raising an error and letting the user do the cleanup: if there is indeed an old LR experiment there, there would be other files too: coefficients_scaled, _model_fit, _betas etc. If these stay, it can create a lot of confusion down the line.

aloukina · 2020-05-04T13:35:02Z

Another solution is if the user specifies -f when running rsmtool, we first clean up any output with the same id from the output directories.

aloukina · 2020-05-04T13:36:32Z

The danger here is that we cannot really use glob: if the user is not being very careful and has two experiments in the same folder, "exp1" and "exp1_updated", and then decides to re-run "exp1", we'll end up deleting both outputs unless we actually look for specific filenames.
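
To illustrate the concern, here is a minimal sketch. The file names and the suffix list are hypothetical and only loosely modeled on the outputs mentioned above; they are not RSMTool's actual naming scheme. It compares a prefix glob with a lookup of specific filenames for two experiments that share a folder.

```python
import tempfile
from pathlib import Path

# Hypothetical output suffixes, used only for this illustration.
SUFFIXES = ["coefficients.csv", "betas.csv"]


def files_matching_glob(output_dir, experiment_id):
    """Return names of all files whose name starts with '<experiment_id>_'."""
    return sorted(p.name for p in Path(output_dir).glob(f"{experiment_id}_*"))


def files_matching_exact(output_dir, experiment_id, suffixes=SUFFIXES):
    """Return names of only the specific files expected for this experiment id."""
    candidates = [f"{experiment_id}_{suffix}" for suffix in suffixes]
    return sorted(name for name in candidates if (Path(output_dir) / name).exists())


def demo():
    """Create outputs for 'exp1' and 'exp1_updated' and compare both lookups."""
    with tempfile.TemporaryDirectory() as tmp:
        for experiment_id in ["exp1", "exp1_updated"]:
            for suffix in SUFFIXES:
                (Path(tmp) / f"{experiment_id}_{suffix}").touch()
        return files_matching_glob(tmp, "exp1"), files_matching_exact(tmp, "exp1")


if __name__ == "__main__":
    globbed, exact = demo()
    print("glob would select:", globbed)
    print("exact lookup would select:", exact)
```

In this sketch the glob also picks up the "exp1_updated" files, while the exact lookup only selects the files that belong to "exp1".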

desilinguist · 2020-05-04T13:37:01Z

Hmm, I am not sure I want to change the semantics of -f at this juncture.

How about if I modify Modeler.scale_coefficients() to raise an error if its call to get_coefficients() fails, since that should be captured in the API too? And then tweak the error message in rsmtool.py? Would that work?
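
A rough sketch of that idea, using a toy class rather than RSMTool's actual Modeler: the class name, the constructor, the error message, and the assumption that a non-linear model reports `None` for its coefficients are all hypothetical. The point is only that the scaling method tests for the right thing (are there coefficients?) instead of relying on whether a file exists.

```python
class ToyModeler:
    """Toy stand-in for a modeler; not RSMTool's real class."""

    def __init__(self, coefficients=None):
        # A dict mapping feature name to weight for a linear model,
        # or None when the model has no coefficients (assumption).
        self._coefficients = coefficients

    def get_coefficients(self):
        """Return the coefficients, or None if the model has none."""
        return self._coefficients

    def scale_coefficients(self, factor):
        """Scale every coefficient by `factor`, failing loudly if there are none."""
        coefficients = self.get_coefficients()
        if coefficients is None:
            raise ValueError(
                "Cannot scale coefficients: this model does not provide any. "
                "Scaling is only supported for linear models."
            )
        return {name: weight * factor for name, weight in coefficients.items()}


if __name__ == "__main__":
    linear = ToyModeler({"length": 2.0, "grammar": -1.0})
    print(linear.scale_coefficients(0.5))

    non_linear = ToyModeler()
    try:
        non_linear.scale_coefficients(0.5)
    except ValueError as error:
        print(f"Error: {error}")
```

The caller can then catch the ValueError and add experiment-specific context to the message, which keeps the current behavior of stopping with an error.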

aloukina · 2020-05-04T13:38:18Z

So keep the general current logic but make the code more readable by testing for the right thing, right?

desilinguist · 2020-05-04T13:38:34Z

Yes!

aloukina · 2020-05-04T13:38:47Z

Yes, I like that!

desilinguist added the bug label Mar 5, 2020 — with Slack

desilinguist added this to To do in RSMTool 8 via automation Mar 5, 2020

desilinguist moved this from To do to In progress in RSMTool 8 May 1, 2020

desilinguist mentioned this issue May 4, 2020

Improve check for linear model #428

Merged

desilinguist closed this as completed in #428 May 4, 2020

RSMTool 8 automation moved this from In progress to Done May 4, 2020
