Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modify normalize.py for easier usage of whitening method #96

Closed
AdeboyeML opened this issue Sep 21, 2020 · 9 comments
Closed

Modify normalize.py for easier usage of whitening method #96

AdeboyeML opened this issue Sep 21, 2020 · 9 comments
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@AdeboyeML
Copy link
Contributor

  • Modify normalize.py script to allow users use the whitening method once a whiten_method is given i.e. once whiten_method = ZCA/PCA/PCA-cor/ZCA-cor without the need to state method = whiten

suggested approach:

  • So whiten_method will be by default None, once a whiten_method is given by the user and is one of the methods we have in whiten method list, automatically method is set to 'whiten'.

for example: whiten_method_list = ['ZCA', 'PCA', 'PCA-cor', 'ZCA-cor']
that means: if (whiten_method != None) and (whiten_method in whiten_method_list): method = 'whiten'

@gwaybio
Copy link
Member

gwaybio commented Sep 21, 2020

Thanks for opening this issue @AdeboyeML - I am looking forward to your contributions!

So whiten_method will be by default None, once a whiten_method is given by the user and is one of the methods we have in whiten method list, automatically method is set to 'whiten'.

I like this approach in principle, but we should also do one other thing to avoid a potentially disastrous scenario.

Disastrous scenario

Imagine that a user copy and pasted code from one repo to another. In the first repo, the original code was:

normalize(
    profiles=anno_df,
    samples="all",
    method="whiten",
    whiten_method="ZCA"
)

However, the user wanted to switch from whitening to standardizing. If the user switched only this:

normalize(
    profiles=anno_df,
    samples="all",
    method="standardize",
    whiten_method="ZCA"
)

Then the code would default to method="whiten"? This would be bad!!

W would want the internal normalize() function to at least warn the user that a different method was provided than what is being used.

Alternative approach

One alternative approach could be to set whiten_method to a reasonable default that we determine in broadinstitute/lincs-cell-painting#38 (comment)

In this case, we need to make it clear in the normalize() docstring that the whitening options are only used when method="whitening".

@AdeboyeML
Copy link
Contributor Author

Yes, the alternative approach is more elegant, we should give the user the preference to always set method to their desired normalization method.

@gwaybio
Copy link
Member

gwaybio commented Sep 23, 2020

Let's go with ZCA-cor as default for pycytominer

@gwaybio
Copy link
Member

gwaybio commented Sep 23, 2020

see broadinstitute/lincs-cell-painting#38 (comment) for the decision.

We'll close this issue once we make the software fix and it passes old (and potentially new) tests.

@AdeboyeML - do you want to make this fix?

@AdeboyeML
Copy link
Contributor Author

Yes, I will make the fix.

@AdeboyeML
Copy link
Contributor Author

I have changed the whiten_method default to "ZCA-cor" and I also ran the test_normalize.py on my local machine and it passed the test.

@gwaybio
Copy link
Member

gwaybio commented Sep 24, 2020

Thanks for opening the PR in #97 @AdeboyeML ! The best kind of PRs are the ones that make minimal changes.

I didn't realize that we basically already have the alternate scenario as described above #96 (comment)

I think we should go ahead and merge #97 (I will do this) but we should still keep this issue open. We should keep it open because my alternative solution isn't really that satisfying...

there must be some happy medium between the current method (still error prone) and the suggestion you list above:

Modify normalize.py script to allow users use the whitening method once a whiten_method is given i.e. once whiten_method = ZCA/PCA/PCA-cor/ZCA-cor without the need to state method = whiten

Can you think of a better way? Maybe splitting out whitening from normalize.py? Maybe removing the the whiten_method all together and create four different whitening options? (e.g. method="whiten_ZCA") 🤷

@AdeboyeML
Copy link
Contributor Author

Yes, maybe we could split out whitening from normalize.py and create a new function called whiten , where the only parameter the user has to input will be the whiten_method, this may be more straightforward

@gwaybio
Copy link
Member

gwaybio commented Sep 24, 2020

sounds good - let's put this "on the shelf" for now.

Edit: idiom 2 in the updated link

@gwaybio gwaybio closed this as completed Mar 22, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good first issue Good for newcomers
Projects
None yet
Development

No branches or pull requests

2 participants