Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Perturbation space - pseudobulk, discriminative classifier, clustering #316

Merged
merged 20 commits into from
Jul 19, 2023

Conversation

AlejandroTL
Copy link
Contributor

PR Checklist

Description of changes

Update of the PerturbationalSpace, including:

  • Pseudobulk
  • Discriminative classifier (includes Pytorch Lightning models and trainer)
  • Clustering K-Means and DBSCAN

@github-actions github-actions bot added the enhancement New feature or request label Jul 11, 2023
@Zethson Zethson changed the title Feature/perturbation space Perturbation space - pseudobulk, discriminative classifier, clustering Jul 12, 2023
Copy link
Member

@Zethson Zethson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much!

Things I did:

  1. Solved all pre-commit errors. You seem to like mutable defaults (e.g. populated lists) for argument defaults. Don't do this, please :)
  2. Moved all Made the PerturbationSpace class abstract and all other classes are now implementing it and will be called standalone as discussed
  3. Moved the _utils stuff into the discriminator file because this is where it's supposed to live
  4. Random minor stuff

General comments:

  1. Don't run any functions explicitly in test files. pytest picks them up and runs them on its own.
  2. The tests are currently bad for two reasons: 1. Some of them are broken and don't even run; 2. They should actually test the calculated perturbation space. Currently you're just running the function and then hoping for the best. Create artificial data that allows you to test for whether the calculated perturbation space makes sense. e.g. 3 very distinct clusters of cells and then measure the R2 between the 3 points of the perturbation space that you calculated and it should be at least SOMENUMBER or so.
  3. Ensure that every function signature that people will use (call) primarily is similar. All of them should have a copy parameter, a layer parameter, .... you get the gist. The experience when working with the spaces should be very consistent.
  4. Document the functions properly with docstrings. We'll be exposing some functions that are not yet documented (all call primarily)
  5. Do NOT add type hints in docstrings (as shitty VScode does automatically). We have them in the function signature and Sphinx will renders this eventually.

Please resolve my comments as you go through them. In an ideal world (that we 100% do not have due to lack of time) one would add one commit per comment.

If stuff is unclear - ask away!

.gitignore Show resolved Hide resolved
pertpy/tools/_perturbation_space/_clustering.py Outdated Show resolved Hide resolved
pertpy/tools/_perturbation_space/_clustering.py Outdated Show resolved Hide resolved
pertpy/tools/_perturbation_space/_clustering.py Outdated Show resolved Hide resolved
pertpy/tools/_perturbation_space/_clustering.py Outdated Show resolved Hide resolved
pertpy/tools/_perturbation_space/_simple.py Outdated Show resolved Hide resolved
pertpy/tools/_perturbation_space/_simple.py Show resolved Hide resolved
@Zethson Zethson merged commit 412110c into development Jul 19, 2023
12 of 17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants