Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix warnings from anndata v0.6.22 #730

Merged
merged 8 commits into from
Jul 15, 2019
Merged

Conversation

ivirshup
Copy link
Member

@ivirshup ivirshup commented Jul 7, 2019

Fixes for deprecation warnings from anndatav0.6.22. This is mostly by replacing access to .X with access via {obs,var}_vector or sc.get.obs_df.

Supercedes #713, fixes #700 and #690.

Functions changed:

  • sc.pl.violin
    • Dataframe for plotting now constructed with obs_df, access to adata.X no longer used.
  • sc.pl.scatter
    • Changed default layer from "X" to "None". "X" is still supported, but should throw a deprecation warning if it's explicitly used.
    • Replace usage of ._get_obs_array with .obs_vector
  • sc.pl._tools.scatterplots.plot_scatter
    • Normalized access to layers, now sparse and dense should similarly.
  • sc.get.obs_df
    • Added support for use_raw

This should also future proof retrieving colors for always-2d anndata.
Adding support to var_df would probably just involve writing a seperate function.
@ivirshup ivirshup requested a review from fidelram July 7, 2019 06:54
Additionally, added a test
@falexwolf
Copy link
Member

Great! Don't we want to add anndata v0.6.22 to the requirements and the version check at startup in scanpy/__init__.py? At least we should make sure anndata is 0.6.21 as otherwise .obs_vector doesn't exist (https://anndata.readthedocs.io/en/stable/#post-v0-6-june-6-2019).

Otherwise, happy to make Scanpy v1.4.4.

@ivirshup
Copy link
Member Author

ivirshup commented Jul 8, 2019

Sure! I wansn't sure if there were other bugs to fix or PRs to merge before we wanted to make a release. I'd also like to get @fidelram to give this a look over. I think I didn't break anything, but he'd be in a better position to tell if that were the case.

@fidelram
Copy link
Collaborator

@ivirshup I checked and now I don't get warnings :)

However, I could not plot the layer. In the following example, I make a new layer that is the negative of the default layer. As you see, bot the default and the negative ('test') layer are identical.

image

@ivirshup
Copy link
Member Author

ivirshup commented Jul 11, 2019

@ivirshup
Copy link
Member Author

Also, heads up, a bunch of plots break going from matplotlib v3.1.0 -> v3.1.1.

@ivirshup
Copy link
Member Author

ivirshup commented Jul 12, 2019

Fixed. Also added a couple tests. I didn't change the test mentioned above though, which might be a good thing to do.

@fidelram
Copy link
Collaborator

@ivirshup Indeed the problem is use_raw=True by default. In the test, I think what happens is that the raw data is being plotted and thus no error appears. The tolerance for the image difference may hide the problem if indeed the test image is correct.

To avoid this confusion when plotting a layer I think it is better to override use_raw. This is how it was supposed to be working before the changes according to the documentation:

layer : typing.Union[str, NoneType], optional (default: None)
    Name of the AnnData object layer that wants to be plotted. By default
    adata.raw.X is plotted. If `use_raw=False` is set, then `adata.X` is plotted.
    If `layer` is set to a valid layer name, then the layer is plotted. `layer`
    takes precedence over `use_raw`.

The current logic is around this lines https://github.com/theislab/scanpy/blob/master/scanpy/plotting/_tools/scatterplots.py#L744

PS: the use_raw has been a source of many confusions for me. Now I know when raw is used by default but for new users this may not be obvious. One solution is to add a warning message everytime that use_raw is set to True by the code.

@fidelram
Copy link
Collaborator

@ivirshup I tested and now is working as expected. Thanks for adding the new tests. From my side is ready to go.

@ivirshup
Copy link
Member Author

@fidelram I've made sure that only use_raw or layer has been specified, though the default of use_raw being True is still used if layer isn't set. I think it would be good if this was covered in the docs for use_raw as well.

I also get tripped up by use_raw pretty frequently. I think a warning for this behavior would be good, but I don't like the idea of the default setting issuing a warning. What if we changed the default to false? This would need a deprecation warning for a bit and waiting until the next big release though.

@ivirshup
Copy link
Member Author

I've updated the docs a little and am going to go ahead and merge this. Thanks for the feedback @fidelram!

@ivirshup ivirshup merged commit b9800e0 into scverse:master Jul 15, 2019
@ivirshup
Copy link
Member Author

@falexwolf, realized I didn't change the AnnData dependency. I'm not totally sure what to do with that, since we've already got a requirement on 0.6.22 due to scipy and statsmodels.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

plot_scatter throws error when sparse layers used for color
3 participants