Plot a celltypist.dotplot to visualise celltypist's classification using a probability threshold #33

Closed
sanchezy opened this issue Aug 8, 2022 · 3 comments

@sanchezy

sanchezy commented Aug 8, 2022

Hi all,
I am trying to visualise the results of a classification that used a probability threshold and majority voting with celltypist.dotplot, but I get an error:

Traceback (most recent call last):
  File "celltypist-scRNA-test.py", line 66, in <module>
    celltypist.dotplot(predictions, use_as_reference = 'predicted.celltype.l2', use_as_prediction = 'majority_voting', save ='scRNA-test-celltypist-probabilistic-majority_voting.png')
  File "/Users/ysanchez/opt/anaconda3/envs/transcriptomicsconda/lib/python3.8/site-packages/celltypist/plot.py", line 140, in dotplot
    dot_size_df, dot_color_df = _get_fraction_prob_df(predictions, use_as_reference, use_as_prediction, None, None)
  File "/Users/ysanchez/opt/anaconda3/envs/transcriptomicsconda/lib/python3.8/site-packages/celltypist/plot.py", line 33, in _get_fraction_prob_df
    score = [row[pred[index]] for index, row in predictions.probability_matrix.iterrows()]
  File "/Users/ysanchez/opt/anaconda3/envs/transcriptomicsconda/lib/python3.8/site-packages/celltypist/plot.py", line 33, in <listcomp>
    score = [row[pred[index]] for index, row in predictions.probability_matrix.iterrows()]
  File "/Users/ysanchez/opt/anaconda3/envs/transcriptomicsconda/lib/python3.8/site-packages/pandas/core/series.py", line 851, in __getitem__
    return self._get_value(key)
  File "/Users/ysanchez/opt/anaconda3/envs/transcriptomicsconda/lib/python3.8/site-packages/pandas/core/series.py", line 959, in _get_value
    loc = self.index.get_loc(label)
  File "/Users/ysanchez/opt/anaconda3/envs/transcriptomicsconda/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 3082, in get_loc
    raise KeyError(key) from err
KeyError: 'Unassigned'

Could you please let me know if there is a way to get around this?

Many thanks for your help!

@ChuanXu1
Collaborator

ChuanXu1 commented Aug 8, 2022

@sanchezy, this plot function is currently designed only for best-match mode - I will extend it to 'prob match' mode in the future.
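
To illustrate why the lookup in the traceback fails, here is a minimal sketch using pandas only. The toy `probability_matrix` and `pred` objects are assumptions standing in for `predictions.probability_matrix` and the predicted labels; the list comprehension mirrors the line shown in the traceback, but nothing else here is CellTypist's own code.

```python
import pandas as pd

# Toy probability matrix (assumed data): rows are cells, columns are model cell types.
probability_matrix = pd.DataFrame(
    {"B cell": [0.9, 0.2], "T cell": [0.1, 0.3]},
    index=["cell_1", "cell_2"],
)
# With a probability threshold, a low-confidence cell can be labelled 'Unassigned',
# which is not a column of the probability matrix.
pred = pd.Series(["B cell", "Unassigned"], index=probability_matrix.index)


def lookup_scores(prob, labels):
    # Same per-cell lookup as in the traceback: take the probability of the predicted label.
    return [row[labels[index]] for index, row in prob.iterrows()]


try:
    lookup_scores(probability_matrix, pred)
    failed_key = None
except KeyError as err:
    # The label that has no matching column in the matrix.
    failed_key = err.args[0]

print(failed_key)
```

Any label that is not one of the matrix columns triggers the same error, which is why best-match predictions (always a model cell type) plot without problems.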

Generally you should avoid dot plotting the result derived from probability thresholding, as the dot plot itself incorporates probability information, based on which you can judge cell type prediction confidence.

However, if you really want to plot the probability threshold result, a temporary workaround is as follows:
import numpy as np
# Labels that occur in the predictions but have no column in the probability matrix (e.g. 'Unassigned')
all_celltypes = predictions.predicted_labels.predicted_labels.cat.categories
new_celltypes = all_celltypes.difference(predictions.probability_matrix.columns)
# Add one column per missing label, filled with each cell's maximum probability
predictions.probability_matrix[new_celltypes] = np.repeat(predictions.probability_matrix.max(axis=1).values[:, np.newaxis], len(new_celltypes), axis=1)
Then apply the dot plot function to predictions.
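
For readers who want to see what the workaround does before running it on real data, here is a small self-contained sketch. The toy `probability_matrix` and `predicted_labels` are assumed stand-ins for `predictions.probability_matrix` and `predictions.predicted_labels.predicted_labels`; the three workaround lines are applied unchanged apart from those names.

```python
import numpy as np
import pandas as pd

# Toy stand-ins (assumed data) for the CellTypist prediction attributes.
probability_matrix = pd.DataFrame(
    {"B cell": [0.9, 0.2], "T cell": [0.1, 0.3]},
    index=["cell_1", "cell_2"],
)
predicted_labels = pd.Series(
    ["B cell", "Unassigned"], index=probability_matrix.index, dtype="category"
)

# Labels used in the predictions that have no column in the probability matrix.
all_celltypes = predicted_labels.cat.categories
new_celltypes = all_celltypes.difference(probability_matrix.columns)

# Add one column per missing label, filled with each cell's maximum probability.
probability_matrix[new_celltypes] = np.repeat(
    probability_matrix.max(axis=1).values[:, np.newaxis], len(new_celltypes), axis=1
)

# The per-cell lookup from the traceback now succeeds for every label.
scores = [row[predicted_labels[index]] for index, row in probability_matrix.iterrows()]
print(probability_matrix)
print(scores)
```

Note that the score assigned to an 'Unassigned' cell is simply its highest probability across the model's cell types, so the added column is a placeholder rather than a real class probability.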

@sanchezy
Author

sanchezy commented Aug 9, 2022

Thanks a lot! It works!

@ChuanXu1
Collaborator

I have added support for this kind of plot. It will be available in CellTypist >= 1.2.0.
