error in xpore diffmod #219

Open
lyj95618 opened this issue Jun 3, 2024 · 2 comments

Comments


lyj95618 commented Jun 3, 2024

Hi,

I got the following error when I ran xpore diffmod:

Loading python/3.9.13
  Loading requirement: gcc/7.2.0 readline/8.1 curl/7.74.0 libxml2/2.9.1
    pcre/8.44.utf8 libpng/1.2.59 sqlite/3.35.3 geos/3.4.2 libtiff/4.0.9
    proj/7.2.0 tcltk/8.6.11 CpG-tools/1.1.0
Using the signal of unmodified RNA from /hpf/largeprojects/ccmbio/yliang/long_read_RNA/nanopore_brian/python_venv/lib/python3.9/site-packages/xpore/diffmod/model_kmer.csv
Process Consumer-11:
Traceback (most recent call last):
  File "/hpf/largeprojects/ccmbio/yliang/long_read_RNA/nanopore_brian/python_venv/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3790, in get_loc
    return self._engine.get_loc(casted_key)
  File "index.pyx", line 152, in pandas._libs.index.IndexEngine.get_loc
  File "index.pyx", line 181, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 7080, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 7088, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'GCTATGCTC'

This is my input YAML config:

data:
    MIA:
        rep1: /hpf/largeprojects/ccmbio/acelik_files/kalish/nanopore/nanopore/lauren_test/debug_nanopolish/new_xpore/8327-M2/dataprep
        rep2: /hpf/largeprojects/ccmbio/acelik_files/kalish/nanopore/nanopore/lauren_test/debug_nanopolish/new_xpore/8327-M3/dataprep
        rep3: /hpf/largeprojects/ccmbio/acelik_files/kalish/nanopore/nanopore/lauren_test/debug_nanopolish/new_xpore/Sample1/dataprep
    PBS:
        rep1: /hpf/largeprojects/ccmbio/acelik_files/kalish/nanopore/nanopore/lauren_test/debug_nanopolish/new_xpore/4147-M1/dataprep
        rep2: /hpf/largeprojects/ccmbio/acelik_files/kalish/nanopore/nanopore/lauren_test/debug_nanopolish/new_xpore/4147-M2/dataprep
        rep3: /hpf/largeprojects/ccmbio/acelik_files/kalish/nanopore/nanopore/lauren_test/debug_nanopolish/new_xpore/Sample2/dataprep

out: /hpf/largeprojects/ccmbio/acelik_files/kalish/nanopore/nanopore/lauren_test/debug_nanopolish/new_xpore/diffmod_output

Sample1 and Sample2 were from an R10 flowcell and the rest were from R9 flowcells. Since nanopolish doesn't support R10 data, I processed all the data with f5c, which supports both R10 and R9 and does the same thing as nanopolish. I noticed some differences in the eventalign.txt output: the two R10 samples are reported with 9-mers, while the R9 data are reported with 5-mers.

R10 data eventalign output:

contig	position	reference_kmer	read_index	strand	event_index	event_level_mean	event_stdv	event_length	model_kmer	model_mean	model_stdv	standardized_level	start_idx	end_idx
ENSMUST00000103679.2	4	GATAAGGAT	0	t	995	102.35	3.626	0.00350	GATAAGGAT	97.12	3.70	1.23	49450	49464
ENSMUST00000103679.2	5	ATAAGGATT	0	t	996	116.55	6.388	0.00350	ATAAGGATT	111.40	3.22	1.40	49436	49450

R9 data:

contig	position	reference_kmer	read_index	strand	event_index	event_level_mean	event_stdv	event_length	model_kmer	model_mean	model_stdv	standardized_level	start_idx	end_idx
ENSMUST00000181768.2	21	AGGTG	0	t	1	108.57	6.003	0.00400	AGGTG	117.25	3.37	-2.28	126162	126174

Could this difference be why xpore raises this error?

Thanks for the help!
Laur

Collaborator

yuukiiwa commented Jun 4, 2024

Hi Laur (tagging you here @lyj95618),

xPore only supports 5-mer comparison for now, so 9-mer input doesn't work.

Thanks!

Best wishes,
Yuk Kei
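
For anyone hitting the same KeyError, a quick way to confirm the cause is to check the k-mer length in each eventalign.txt before running dataprep and diffmod. The snippet below is only a minimal sketch, not part of xPore or f5c: the helper names `kmer_lengths` and `is_xpore_compatible` are hypothetical, and it assumes a tab-separated file with the `reference_kmer` column shown in the headers quoted in this issue. The two example rows are copied from the R10 and R9 outputs above.

```python
import csv
import io
from collections import Counter

# Column names copied from the eventalign header quoted in this issue.
HEADER = ["contig", "position", "reference_kmer", "read_index", "strand",
          "event_index", "event_level_mean", "event_stdv", "event_length",
          "model_kmer", "model_mean", "model_stdv", "standardized_level",
          "start_idx", "end_idx"]


def kmer_lengths(handle, column="reference_kmer", max_rows=100_000):
    """Count the k-mer lengths found in the first `max_rows` eventalign rows.

    Hypothetical helper: `handle` is any open text file or file-like object
    containing a tab-separated eventalign table with a header line.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    counts = Counter()
    for i, row in enumerate(reader):
        if i >= max_rows:
            break
        counts[len(row[column])] += 1
    return counts


def is_xpore_compatible(counts):
    """True only if every k-mer seen is a 5-mer (the only size xPore supports)."""
    return set(counts) == {5}


def as_tsv(rows):
    """Build an in-memory TSV so the example runs without real files."""
    return io.StringIO("\n".join("\t".join(r) for r in rows) + "\n")


# One row from the R10 output and one from the R9 output quoted above.
R10_ROW = ["ENSMUST00000103679.2", "4", "GATAAGGAT", "0", "t", "995", "102.35",
           "3.626", "0.00350", "GATAAGGAT", "97.12", "3.70", "1.23",
           "49450", "49464"]
R9_ROW = ["ENSMUST00000181768.2", "21", "AGGTG", "0", "t", "1", "108.57",
          "6.003", "0.00400", "AGGTG", "117.25", "3.37", "-2.28",
          "126162", "126174"]

r10_counts = kmer_lengths(as_tsv([HEADER, R10_ROW]))
r9_counts = kmer_lengths(as_tsv([HEADER, R9_ROW]))

print("R10:", dict(r10_counts), "compatible:", is_xpore_compatible(r10_counts))
print("R9: ", dict(r9_counts), "compatible:", is_xpore_compatible(r9_counts))
```

On a real run you would pass `open("eventalign.txt")` instead of the in-memory example. Any sample that reports a length other than 5 would be expected to trigger the same lookup failure against the 5-mer model table.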

Author

lyj95618 commented Jun 7, 2024

Thank you for your reply and the suggestion in another thread about changing the 9mer to 5mer!

I have one more question about the xPore comparison analysis. Since I am combining data from R9 flowcells and an R10 (rna004) flowcell, is there a way for xPore to adjust for the potential batch effect?

My comparison condition:

Sample A (R9), Sample B (R9), Sample1 (rna004) Vs Sample D (R9), Sample E (R9), Sample2 (rna004)

Thanks,
Laur
