IrChain instantiation fails when no expr parameter is specified #253

Closed
diitaz93 opened this issue Mar 18, 2021 · 5 comments · Fixed by #241

@diitaz93

Thanks for this amazing toolkit!
I am trying to import my data in an "other format", in a similar fashion to the importing tutorial. However, my data does not have a TCR expression field, so when I omit it, it takes the default value (expr=None), and the internal float(expr) cast then fails. This toy example reproduces the error:

import numpy as np
import pandas as pd
import scanpy as sc
import scirpy as ir
from matplotlib import pyplot as plt

ir.io.IrChain(locus='TRA',cdr3_nt='CCACACTTTGTCA')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-4-c503e025a0de> in <module>
----> 1 ir.io.IrChain(locus='TRA',cdr3_nt='CCACACTTTGTCA')

~/anaconda3/lib/python3.7/site-packages/scirpy/io/_datastructures.py in __init__(self, locus, cdr3, cdr3_nt, expr, expr_raw, is_productive, v_gene, d_gene, j_gene, c_gene, junction_ins)
     95         self.cdr3 = cdr3.upper() if not _is_na(cdr3) else None
     96         self.cdr3_nt = cdr3_nt.upper() if not _is_na(cdr3_nt) else None
---> 97         self.expr = float(expr)
     98         self.expr_raw = float(expr_raw) if not _is_na(expr_raw) else None
     99         self.is_productive = _is_true(is_productive)

TypeError: float() argument must be a string or a number, not 'NoneType'
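
The traceback shows that lines 95, 96 and 98 guard their casts with `_is_na`, while line 97 calls `float(expr)` directly. The following is a minimal sketch of that difference, not scirpy's actual code: the `_is_na` below is a simplified, hypothetical stand-in, and `optional_float` is a name made up for illustration.

```python
import math


def _is_na(value):
    """Simplified, hypothetical stand-in for the `_is_na` helper in the traceback."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    return isinstance(value, str) and value.strip().lower() in {"", "na", "nan", "none"}


def optional_float(value):
    """Cast to float, mapping missing values to None instead of raising."""
    return float(value) if not _is_na(value) else None


# The unguarded cast (as on line 97) fails for the default expr=None ...
try:
    float(None)
except TypeError as err:
    unguarded_error = err

# ... while the guarded pattern (as used for expr_raw on line 98) tolerates it.
guarded_result = optional_float(None)
```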

As you can see I am using anaconda3, python 3.7 in a jupyter notebook with Ubuntu 20.04. The versions of the modules are the following:

sc.logging.print_header()
scanpy==1.7.1 anndata==0.7.5 umap==0.4.6 numpy==1.17.1 scipy==1.5.2 pandas==1.2.3 scikit-learn==0.23.2 statsmodels==0.11.1 python-igraph==0.9.0
ir.__version__
'0.6.1'
@grst
Collaborator

grst commented Mar 18, 2021

Hi Sebastian,

I'm afraid the expr column is currently mandatory. As a workaround, you could simply pass expr=1 for all chains.
I can try to make it optional while working on #241.
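
One way to apply this workaround to a whole dataset is to fill in `expr` before building the chains. This is only a sketch with plain dictionaries: `with_default_expr` is a hypothetical helper, the records are toy data, and the scirpy call appears only as a comment because it assumes scirpy 0.6.1 is installed.

```python
def with_default_expr(record, default=1):
    """Return a copy of `record` with 'expr' set to `default` if missing or None."""
    filled = dict(record)
    if filled.get("expr") is None:
        filled["expr"] = default
    return filled


# Toy records: the first has no expression value, the second does.
records = [
    {"locus": "TRA", "cdr3_nt": "CCACACTTTGTCA"},
    {"locus": "TRA", "cdr3_nt": "CCACACTTTGTCA", "expr": 3},
]

chain_kwargs = [with_default_expr(r) for r in records]

# With scirpy 0.6.1 installed, each dict could then be passed on, e.g.:
# chains = [ir.io.IrChain(**kwargs) for kwargs in chain_kwargs]
```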

It is required because, when multiple VJ or VDJ chains are provided per cell, the chain with the higher expression is considered the "primary" chain, while the one with the lower expression is considered "secondary". If more than two chains are provided, the ones with the lowest expression are discarded and the cell is flagged as "multichain".
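
The ranking described above can be sketched as follows, for the chains of one type (VJ or VDJ) in a single cell. This is an illustration under assumptions, not scirpy's implementation: `rank_chains`, the dictionary layout and the chain labels are all hypothetical.

```python
def rank_chains(chains):
    """Pick primary and secondary chains by expression.

    `chains` holds the chains of one type (VJ or VDJ) for one cell.
    Returns (primary, secondary, is_multichain); chains beyond the
    two most highly expressed ones are dropped.
    """
    ordered = sorted(chains, key=lambda chain: chain["expr"], reverse=True)
    primary = ordered[0] if ordered else None
    secondary = ordered[1] if len(ordered) > 1 else None
    is_multichain = len(ordered) > 2
    return primary, secondary, is_multichain


# Three hypothetical VJ chains for one cell.
vj_chains = [
    {"label": "chain_a", "expr": 2},
    {"label": "chain_b", "expr": 7},
    {"label": "chain_c", "expr": 4},
]

primary, secondary, is_multichain = rank_chains(vj_chains)
```

In this sketch, passing expr=1 for every chain makes all chains tie, and Python's stable sort then keeps them in input order. How scirpy itself breaks ties is not stated in this thread.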

Cheers,
Gregor

@diitaz93
Author

Hi Gregor,
Thanks for your answer, the workaround of setting expr=1 works fine for now.

Cheers,
Sebastian

@grst grst added this to In progress in scirpy-dev Mar 21, 2021
@grst grst closed this as completed in #241 Apr 7, 2021
scirpy-dev automation moved this from In progress to Done Apr 7, 2021
@grst
Collaborator

grst commented Apr 7, 2021

Hi @diitaz93,

#241 is now merged into master; a release will probably follow at the end of April.
I'd be happy to get feedback. If you want to give it a try, you can install the development version using

pip install git+https://github.com/icbi-lab/scirpy.git@master

The docs of the development version are available here.

In #241, I changed the scirpy data structure to use AIRR Rearrangement column names everywhere. Additionally, all columns are now optional, except for those mandated by the standard.

@diitaz93
Author

Hi @grst,
Sorry for the late response. I have tested the development version as in the tutorial and everything works fine. I saw that a missing expression value (now consensus_count) produces a warning instead of the previous error, which I think is a great idea.
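
The warn-instead-of-fail behaviour mentioned here could look roughly like the sketch below. It is an assumption about the general pattern, not scirpy's code: `read_consensus_count`, the warning text and the choice to return None are all hypothetical.

```python
import warnings


def read_consensus_count(row):
    """Return consensus_count as a float, or None (with a warning) if it is missing."""
    value = row.get("consensus_count")
    if value is None:
        warnings.warn("consensus_count is missing for this chain", UserWarning)
        return None
    return float(value)


# Capture the warning to show that a missing value no longer raises.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    count = read_consensus_count({"locus": "TRA"})
```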

However, in the section of the Loading tutorial on importing data in other formats, I believe that cells 18 and 19 are swapped, as adata_tcr.obs is called before adata_tcr is defined.

Cheers!
Sebastian

@grst
Collaborator

grst commented Apr 28, 2021

Good catch, fixed in #263!
