Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update ponyo simulations #20

Merged
merged 9 commits into from
Dec 1, 2020
Merged

Update ponyo simulations #20

merged 9 commits into from
Dec 1, 2020

Conversation

ajlee21
Copy link
Contributor

@ajlee21 ajlee21 commented Nov 20, 2020

This PR addresses issues #18 and #17

This PR also updates the test scripts based on these new changes.

Some minor associated updates:

  • Change "file" to "filename"
  • Pull out files defined in notebook to config file
  • Update readme based on config file updates

This will be ponyo v 0.3

Note: Linter is failing due to being on a fork

@coveralls
Copy link

coveralls commented Nov 25, 2020

Pull Request Test Coverage Report for Build 394793710

  • 71 of 71 (100.0%) changed or added relevant lines in 4 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+1.09%) to 98.867%

Totals Coverage Status
Change from base Build 348099773: 1.09%
Covered Lines: 349
Relevant Lines: 353

💛 - Coveralls

@ajlee21 ajlee21 marked this pull request as ready for review November 25, 2020 19:08
Copy link

@dongbohu dongbohu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

Copy link
Member

@jjc2718 jjc2718 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Just a few comments.

"""
Returns sample ids (found in gene expression df) associated with
a given list of experiment ids (found in the metadata)

Arguments
----------
experiment_ids_file: str
File containing all cleaned experiment ids
metadata_filename: str
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be useful to link to an example of a metadata file here (e.g. the one in human_tests).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good tip

@@ -437,20 +469,14 @@ def shift_template_experiment(

Arguments
----------
normalized_data_file: str
File containing normalized gene expression data
normalized_data: str
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a pd.DataFrame, right? Not a str?

@@ -199,12 +199,15 @@ def simulate_by_random_sampling(

def simulate_by_latent_transformation(
num_simulated_experiments,
normalized_data_file,
normalized_data_filename,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing you could think about is changing these simulate_by_X functions to also take a DataFrame as input (i.e. normalized_data rather than normalized_data_filename). This would make them consistent with shift_template_experiment, as well as possibly making them faster (although I'm not sure if you're calling these ones in a loop anywhere).

Might be a non-trivial amount of work to update other projects to conform with this, though, so if you want to create a separate issue and think about it later that's fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! I was waffling about this too. So the other two methods don't run in a for loop so its not required that we use the dataframe. I left them because it makes more sense me to input filenames instead of dataframes (i know its an extra step but feels like its one less hassle) and this one method is more of an exception.

I'll add an issue. One thing I could see doing is making a wrapper function to call the for loop

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants