Error when running example data: "XLMHG error" #1

Closed
tilofrei opened this issue Jun 4, 2019 · 7 comments

@tilofrei

tilofrei commented Jun 4, 2019

Hi,

Thank you for publishing your new algorithm. I am really excited about it! I think finding the right markers at the protein level after single-cell RNA-seq is a common obstacle.

Could you help me troubleshoot the error messages I get when running the test dataset you provided?

The first message reports the missing mhg_cython C extension, which does not seem to be critical. The XLMHG error appears to be a problem with the markers file. I get the same error messages when running my own data.

(comet_env) COMPUTER1:example_ins USER1$ Comet tabmarker.txt tabvis.txt tabcluster.txt output/
Warning (xlmhg): Failed to import "mhg_cython" C extension.
Warning (xlmhg): Failed to import the "mhg_cython" C extension.Falling back to the pure Python implementation, which is very slow.
Started on 2019-06-03T22:14:39.260567
Reading data...
Generating complement data...
[1]
########
# Processing cluster 1...
########
2 gene combinations
Running t test on singletons...
Calculating fold change
Running XL-mHG on singletons...
X = 0
L = 10
Cluster size 5
XLMHG error
('get_xlmhg_stat() takes from 3 to 4 positional arguments but 6 were given', 'occurred at index AAMP')
q-val error
local variable 'xlmhg' referenced before assignment
error in sliding values
local variable 'xlmhg' referenced before assignment
Creating discrete expression matrix...
discrete matrix construction failed
local variable 'cutoff_value' referenced before assignment
Process Process-1:
Traceback (most recent call last):
  File "/usr/local/Cellar/python/3.7.2_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/local/Cellar/python/3.7.2_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/anaconda3/bin/comet_env/lib/python3.7/site-packages/Comet/__main__.py", line 329, in process
    discrete_exp_full = discrete_exp.copy()
UnboundLocalError: local variable 'discrete_exp' referenced before assignment
[2]

The same error messages are then repeated for cluster 2. The script ends without creating any output files. I'm running Python 3.7.2 on macOS 10.13.6 and installed via virtualenv / git clone.
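
One possible reading of the log above, offered as an assumption rather than a confirmed diagnosis: only the first message (the `get_xlmhg_stat()` TypeError) is the root failure, and the later "referenced before assignment" messages follow from it because a result variable was never bound. The sketch below reproduces that pattern in isolation. `run_test` and `process_cluster` are hypothetical stand-ins, not COMET's code.

```python
def run_test(values):
    """Hypothetical stand-in for a test function called with too many arguments."""
    def stat(a, b, c=None):  # accepts 2 to 3 positional arguments
        return a + b
    return stat(*values)


def process_cluster(values):
    """Mimic a pipeline step that records an error and carries on."""
    messages = []
    try:
        result = run_test(values)  # on TypeError, `result` is never bound
    except TypeError as err:
        messages.append("test error: " + type(err).__name__)
    try:
        summary = result * 2  # fails if `result` was never assigned
    except UnboundLocalError as err:
        messages.append("follow-on error: " + type(err).__name__)
        summary = None
    return summary, messages
```

Calling `process_cluster((1, 2, 3, 4, 5))` records a TypeError followed by an UnboundLocalError, while `process_cluster((1, 2))` completes normally. If the same mechanism is at work here, fixing the first error should make the later ones disappear.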

Thank you!

@Cnrdelaney
Contributor

Hello!

Thanks for reaching out. I think I have the solution. The 'mhg_cython' extension is installed as part of the xlmhg package, which we use as our statistical test. I believe the xlmhg package currently supports only Python 3.6, and it is required for the tool to run. If you use a Python 3.6 virtual environment, it should install correctly and work from there. Come back here if you run into more issues!
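
A minimal sketch for checking the active environment against the advice above. The module path `xlmhg.mhg_cython` is an assumption inferred from the warning text in the log, and `check_environment` is a hypothetical helper, not part of COMET or xlmhg.

```python
import importlib
import sys


def check_environment(expected=(3, 6), modules=("xlmhg", "xlmhg.mhg_cython")):
    """Report the interpreter version and which of `modules` can be imported."""
    report = {
        "python": "{}.{}.{}".format(*sys.version_info[:3]),
        # True only if major.minor equals the expected version (3.6 by default)
        "python_matches": sys.version_info[:2] == tuple(expected),
        "modules": {},
    }
    for name in modules:
        try:
            importlib.import_module(name)
            report["modules"][name] = True
        except ImportError:  # also covers ModuleNotFoundError
            report["modules"][name] = False
    return report
```

Running `print(check_environment())` inside the activated virtual environment shows whether the interpreter is a 3.6 release and whether the C extension can be imported there.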

@MeromitSinger

Please see "Installing via a Virtual Environment" in our documentation:
https://hgmd.readthedocs.io/en/latest/quickstart.html
and let us know if that doesn't work for you.

@tilofrei
Author

tilofrei commented Jun 4, 2019

Switching back to Python 3.6 did indeed do the trick. Thank you for your quick help, @Cnrdelaney and @MeromitSinger!

(In case anybody else is using scanpy for scRNAseq data exploration, this is the export formatting I used:)

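import pandas as pd

# `adata` is the scanpy AnnData object (cells x genes) from your own analysis.
# If adata.X is a sparse matrix, convert it to a dense array first.

# markers.txt: expression matrix with genes as rows and cells as columns
markers = pd.DataFrame(data=adata.T.X, index=adata.var_names, columns=adata.obs_names)
markers.to_csv('markers.txt', sep='\t', index_label=False)

# vis.txt: embedding coordinates per cell (here the ForceAtlas2 graph layout)
vis = pd.DataFrame(data=adata.obsm['X_draw_graph_fa'], index=adata.obs_names)
vis.to_csv('vis.txt', header=False, sep='\t')

# cluster.txt: cluster assignment per cell, taken from adata.obs['cluster']
cluster = pd.DataFrame(data=adata.obs['cluster'], index=adata.obs_names)
cluster.to_csv('cluster.txt', header=False, sep='\t')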
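
A minimal sketch for sanity-checking the three files written by the export snippet above, using only the standard library. It checks that the files agree with each other on cell names. It does not verify that they match COMET's documented input format. `read_rows` and `check_exports` are hypothetical helper names.

```python
import csv


def read_rows(path):
    """Read a tab-separated file into a list of rows (lists of strings)."""
    with open(path, newline="") as handle:
        return list(csv.reader(handle, delimiter="\t"))


def check_exports(markers_path, vis_path, cluster_path):
    """Compare cell names across markers.txt, vis.txt and cluster.txt."""
    markers = read_rows(markers_path)
    header, first_gene_row = markers[0], markers[1]
    # With index_label=False the header holds only cell names, so each
    # data row has one extra leading field (the gene name).
    vis_cells = [row[0] for row in read_rows(vis_path)]
    cluster_cells = [row[0] for row in read_rows(cluster_path)]
    return {
        "n_cells": len(header),
        "header_offset_ok": len(first_gene_row) == len(header) + 1,
        "vis_matches": vis_cells == header,
        "cluster_matches": cluster_cells == header,
    }
```

For example, `check_exports('markers.txt', 'vis.txt', 'cluster.txt')` returns a small dictionary of booleans. Any `False` value points to a mismatch worth inspecting before running the tool.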

@tilofrei tilofrei closed this as completed Jun 4, 2019
@MeromitSinger

Thanks @tilofreiwald! We do intend COMET to be a simple extension of scanpy, and your input is appreciated :-)

@EannaFennell

Hello,
I'm getting exactly the same error with Python 3.6.10.
I use a conda environment instead of virtualenv, though. Should that matter?
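
Since the error can apparently also occur on Python 3.6, one thing worth checking is whether the installed function accepts the number of arguments it is called with. This is a diagnostic sketch under the assumption that a signature mismatch between package versions is involved, which the thread does not confirm. `positional_arg_range` and `stand_in` are hypothetical names, and the thread does not show which module `get_xlmhg_stat` lives in.

```python
import inspect


def positional_arg_range(func):
    """Return (min, max) positional arguments accepted; max is None for *args."""
    try:
        params = list(inspect.signature(func).parameters.values())
    except (TypeError, ValueError):
        return None  # some compiled functions expose no signature
    positional = [p for p in params
                  if p.kind in (p.POSITIONAL_ONLY, p.POSITIONAL_OR_KEYWORD)]
    required = sum(1 for p in positional if p.default is p.empty)
    if any(p.kind is p.VAR_POSITIONAL for p in params):
        return (required, None)
    return (required, len(positional))


def stand_in(v, X, L, table=None):
    """Hypothetical function taking 3 to 4 positional arguments."""
    return None
```

Here `positional_arg_range(stand_in)` gives `(3, 4)`, matching the "takes from 3 to 4 positional arguments" wording in the log. Passing the real function object from the installed package would show whether a call with 6 arguments can succeed in a given environment.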

@kevinwang09

Dear maintainer, can anything be done to support Python 3.7 and beyond? According to the release schedule at https://www.python.org/downloads/, Python 3.6 will no longer be supported after the end of this year. Thank you.

@Christopher-Yang-Chq

Hello, @MeromitSinger
I ran into the same problem when running the example data. I followed the recommendation and created a virtual environment with Python 3.6 on Windows 10, but it kept producing the same error. I noticed that @tilofrei did the analysis on macOS. Is there any difference between Windows and macOS?
