Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create boxplot function in visualize #218

Merged
merged 12 commits into from
Sep 13, 2020
Merged

Create boxplot function in visualize #218

merged 12 commits into from
Sep 13, 2020

Conversation

alex-l-kong
Copy link
Contributor

@alex-l-kong alex-l-kong commented Sep 10, 2020

What is the purpose of this PR?

Addresses and closes #204. Addresses and closes #183.

How did you implement your changes

Add a function draw_boxplot to visualize.py which will allow the user to draw (and customize) a boxplot. In the meantime, clean up visualize.py and visualize_test.py.

Remaining issues

The previous way save_dir was being used was more of like a save_prefix. I changed the nature of save_dir and added a little more error checking to make sure we weren't trying to save to a save_dir that didn't exist.

@alex-l-kong alex-l-kong self-assigned this Sep 10, 2020
Copy link
Member

@ngreenwald ngreenwald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. Can you add it in as a cell to the visualization notebook so we can see what it looks like?

@ngreenwald
Copy link
Member

Also looks like you took care of some of #183, can you see if there's anything else?

Copy link
Contributor

@ackagel ackagel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks super duper; just a few comments

ark/analysis/visualize.py Outdated Show resolved Hide resolved
ark/analysis/visualize.py Outdated Show resolved Hide resolved
Comment on lines +11 to +12
random_data = test_utils.make_segmented_csv(100)
random_data = random_data[random_data['PatientID'].isin(np.arange(1, 5))]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could make this an option in the test_utils if you think cases like this will become more common

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only did it because I didn't want the boxplot to display like 10 different categories on one chart. I don't imagine we'll be doing this too often unless we plan to add a lot more faceted graphing functions in visualize.py, for example.

@ngreenwald
Copy link
Member

ngreenwald commented Sep 11, 2020 via email

Copy link
Member

@ngreenwald ngreenwald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seaborn defaults look quite nice. However, you picked Na to plot, aka Sodium, which is not actually a protein target: it's a naturally abundant metal. AKA we don't really care at all about sodium expression. Try picking CD20, CD45, PD-1, or one of the other biological targets.

@alex-l-kong
Copy link
Contributor Author

Seaborn defaults look quite nice. However, you picked Na to plot, aka Sodium, which is not actually a protein target: it's a naturally abundant metal. AKA we don't really care at all about sodium expression. Try picking CD20, CD45, PD-1, or one of the other biological targets.

NaBrO (sorry, I couldn't resist). But yeah, I'll fix that up.

Copy link
Member

@ngreenwald ngreenwald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. Small tweak to function description

ark/analysis/visualize.py Outdated Show resolved Hide resolved
Copy link
Contributor

@ackagel ackagel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks gucci

@ngreenwald
Copy link
Member

woops sorry, forgot to click go. You guys can merge stuff in once it's approved

@ngreenwald ngreenwald merged commit d764892 into master Sep 13, 2020
@ngreenwald ngreenwald deleted the boxplot_func branch September 13, 2020 23:47
alex-l-kong added a commit that referenced this pull request Jan 14, 2021
* Add boxplot viz function

* Add test function for draw_boxplot and change save_dir functionality so it actually acts like a save_dir

* Fix leftover PYCODESTYLE errors

* Increase coverage with error tests, and fix split_vals check

* PYCODESTYLE

* Add new changes, including demo of draw_boxplot

* Fix visualize cells test and add plot barchart tests

* Finish up visualization cleanup, and adding creation of viz directory in notebook

* Fix documentation for get_sorted_data

* Change column to visualize boxplot for

* Make get_sorted_data args and variable names more clear for documentation purposes
y2kbugger pushed a commit that referenced this pull request Jul 29, 2021
* Add boxplot viz function

* Add test function for draw_boxplot and change save_dir functionality so it actually acts like a save_dir

* Fix leftover PYCODESTYLE errors

* Increase coverage with error tests, and fix split_vals check

* PYCODESTYLE

* Add new changes, including demo of draw_boxplot

* Fix visualize cells test and add plot barchart tests

* Finish up visualization cleanup, and adding creation of viz directory in notebook

* Fix documentation for get_sorted_data

* Change column to visualize boxplot for

* Make get_sorted_data args and variable names more clear for documentation purposes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Default box and whisker visualization function Code cleanup for analysis/visualize.py
3 participants