HOME-pairwise for different numbers of replicate #31

Open
cell101 opened this issue Oct 21, 2019 · 4 comments

Comments

@cell101

cell101 commented Oct 21, 2019

Hi,
I'm trying to use HOME for DMR analysis.
I have 6 replicates for WT and 4 replicates for the mutant.
When I run HOME-pairwise, I get the following error message:

[lee@ko44 HOME]$ ${HOME_DMR}/HOME-pairwise -t CG -npp 16 -i ${data}/HOME_DMR_sample_file_CG.txt -o ${data}/HOME_DMR_gz_out --BSSeeker2 --delta 0.2 --minc 5
Traceback (most recent call last):
File "/home/lee/NGS/sw/HOME_met/bin/HOME-pairwise", line 4, in <module>
__import__('pkg_resources').run_script('HOME==1.0.0', 'HOME-pairwise')
File "/home/lee/NGS/sw/HOME_met/lib/python2.7/site-packages/pkg_resources/__init__.py", line 666, in run_script
self.require(requires)[0].run_script(script_name, ns)
File "/home/lee/NGS/sw/HOME_met/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1469, in run_script
exec(script_code, namespace, namespace)
File "/home/lee/NGS/sw/HOME_met/lib/python2.7/site-packages/HOME-1.0.0-py2.7.egg/EGG-INFO/scripts/HOME-pairwise", line 314, in <module>
File "/home/lee/NGS/sw/HOME_met/lib/python2.7/site-packages/pandas/io/parsers.py", line 498, in parser_f
return _read(filepath_or_buffer, kwds)
File "/home/lee/NGS/sw/HOME_met/lib/python2.7/site-packages/pandas/io/parsers.py", line 285, in _read
return parser.read()
File "/home/lee/NGS/sw/HOME_met/lib/python2.7/site-packages/pandas/io/parsers.py", line 747, in read
ret = self._engine.read(nrows)
File "/home/lee/NGS/sw/HOME_met/lib/python2.7/site-packages/pandas/io/parsers.py", line 1197, in read
data = self._reader.read(nrows)
File "pandas/parser.pyx", line 766, in pandas.parser.TextReader.read (pandas/parser.c:7988)
File "pandas/parser.pyx", line 788, in pandas.parser.TextReader._read_low_memory (pandas/parser.c:8244)
File "pandas/parser.pyx", line 842, in pandas.parser.TextReader._read_rows (pandas/parser.c:8970)
File "pandas/parser.pyx", line 829, in pandas.parser.TextReader._tokenize_rows (pandas/parser.c:8838)
File "pandas/parser.pyx", line 1833, in pandas.parser.raise_parser_error (pandas/parser.c:22649)
pandas.parser.CParserError: Error tokenizing data. C error: Expected 5 fields in line 2, saw 7

I set up HOME_DMR_sample_file_CG.txt as follows:
mutant /data/path/mutant.r1.CGmap.gz data/path/mutant.r2.CGmap.gz data/path/mutant.r3.CGmap.gz data/path/mutant.r4.CGmap.gz
WT /data/path/WT.r1.CGmap.gz /data/path/WT.r2.CGmap.gz /data/path/WT.r3.CGmap.gz /data/path/WT.r4.CGmap.gz /data/path/WT.r5.CGmap.gz /data/path/WT.r6.CGmap.gz

I wonder whether HOME does not support samples with different numbers of replicates.

Do you have any suggestions for this case?

@Akanksha2511
Collaborator

Akanksha2511 commented Oct 22, 2019

Hi,
HOME supports samples with different numbers of replicates, so that's not the issue.
Is your HOME_DMR_sample_file_CG.txt tab-separated?

If not, please separate the fields with tabs.

Also, there seems to be a missing leading slash in the paths for replicates 2, 3 and 4 of the mutant. Please check the paths for the replicates and provide full paths (HOME does not support relative paths at the moment).

Thanks,
Akanksha
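
The two checks suggested above (tab separation and absolute paths) can be automated before running HOME-pairwise. The following is a minimal sketch in Python, not part of HOME: the function name `check_sample_file` is hypothetical, and the assumed layout (a sample name followed by tab-separated replicate paths) is taken from the sample file shown in this thread.

```python
import posixpath


def check_sample_file(text):
    """Return a list of problems found in the content of a sample file.

    Assumed layout per line (not confirmed by HOME's documentation here):
    <sample name><TAB><replicate path><TAB><replicate path>...
    """
    problems = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) < 2:
            # No tab at all: the fields are probably separated by spaces.
            problems.append("line %d: no tab found between fields" % lineno)
            continue
        for path in fields[1:]:
            if path == "":
                continue  # empty field, e.g. from a trailing tab
            if " " in path:
                problems.append(
                    "line %d: %r contains a space; fields may not be tab-separated"
                    % (lineno, path)
                )
            if not posixpath.isabs(path):
                problems.append(
                    "line %d: %r is not an absolute path" % (lineno, path)
                )
    return problems
```

Calling `check_sample_file(open(sample_file).read())` returns an empty list when neither problem is detected. It only inspects the text and does not check that the listed files exist.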

@cell101
Author

cell101 commented Oct 22, 2019

Hi Akanksha,

Following your comments, I used full paths and tab separation,
but it still did not work and showed the same error message:

pandas.parser.CParserError: Error tokenizing data. C error: Expected 5 fields in line 2, saw 7

I then tested HOME_DMR_sample_file_CG.txt with blank tabs appended so that both lines have the same number of fields.
With that change it works:

mutant /data/path/mutant.r1.CGmap.gz data/path/mutant.r2.CGmap.gz data/path/mutant.r3.CGmap.gz data/path/mutant.r4.CGmap.gz {tab} {tab}
WT /data/path/WT.r1.CGmap.gz /data/path/WT.r2.CGmap.gz /data/path/WT.r3.CGmap.gz /data/path/WT.r4.CGmap.gz /data/path/WT.r5.CGmap.gz /data/path/WT.r6.CGmap.gz

I don't know why, but it works.
I think that in HOME-pairwise, pandas.parser needs the mutant and WT input lines to have the same number of fields.
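
The error message is consistent with a parser that takes the expected number of fields from the first line and rejects any later line that has more. The sketch below is a simplified standard-library model of that rule, not pandas' actual implementation. It assumes HOME reads the sample file with pandas without explicit column names, which the traceback suggests but this thread does not confirm. The function name `first_row_rule` and the shortened file names are hypothetical.

```python
def first_row_rule(lines, sep="\t"):
    """Simplified model of the check behind 'Expected N fields in line M'.

    The first line fixes the expected field count. A later line with more
    fields raises an error; a line with fewer fields is padded with None.
    """
    rows = [line.split(sep) for line in lines]
    expected = len(rows[0])
    for lineno, row in enumerate(rows, start=1):
        if len(row) > expected:
            raise ValueError(
                "Expected %d fields in line %d, saw %d"
                % (expected, lineno, len(row))
            )
    return [row + [None] * (expected - len(row)) for row in rows]


# Sample name plus 4 replicates (5 fields) and plus 6 replicates (7 fields).
mutant = "\t".join(["mutant", "m.r1", "m.r2", "m.r3", "m.r4"])
wt = "\t".join(["WT", "w.r1", "w.r2", "w.r3", "w.r4", "w.r5", "w.r6"])

try:
    first_row_rule([mutant, wt])
except ValueError as err:
    print(err)  # Expected 5 fields in line 2, saw 7

# Two blank tabs appended to the mutant line give it 7 fields as well.
padded = first_row_rule([mutant + "\t\t", wt])
```

In this model the padded mutant line passes because its two trailing tabs create two empty fields. Whether HOME handles those empty fields correctly in every case is not established here.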

I also tried the testcase files with the following sample file:

sample1 /home/lee/NGS/sw/HOME/testcase/CG/sample1_r1.txt /home/lee/NGS/sw/HOME/testcase/CG/sample1_r2.txt
sample2 /home/lee/NGS/sw/HOME/testcase/CG/sample2_r1.txt /home/lee/NGS/sw/HOME/testcase/CG/sample2_r2.txt /home/lee/NGS/sw/HOME/testcase/CG/sample3_r1.txt

(I added sample3_r1.txt because with 1 replicate vs 2 replicates I got the following message.)
error: cannot handle 1 replicate in 1 group and more than 1 in other

Traceback (most recent call last):
File "/home/lee/NGS/sw/HOME_met/bin/HOME-pairwise", line 4, in <module>
__import__('pkg_resources').run_script('HOME==1.0.0', 'HOME-pairwise')
File "/home/lee/NGS/sw/HOME_met/lib/python2.7/site-packages/pkg_resources/__init__.py", line 666, in run_script
self.require(requires)[0].run_script(script_name, ns)
File "/home/lee/NGS/sw/HOME_met/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1469, in run_script
exec(script_code, namespace, namespace)
File "/home/lee/NGS/sw/HOME_met/lib/python2.7/site-packages/HOME-1.0.0-py2.7.egg/EGG-INFO/scripts/HOME-pairwise", line 314, in <module>
File "/home/lee/NGS/sw/HOME_met/lib/python2.7/site-packages/pandas/io/parsers.py", line 498, in parser_f
return _read(filepath_or_buffer, kwds)
File "/home/lee/NGS/sw/HOME_met/lib/python2.7/site-packages/pandas/io/parsers.py", line 285, in _read
return parser.read()
File "/home/lee/NGS/sw/HOME_met/lib/python2.7/site-packages/pandas/io/parsers.py", line 747, in read
ret = self._engine.read(nrows)
File "/home/lee/NGS/sw/HOME_met/lib/python2.7/site-packages/pandas/io/parsers.py", line 1197, in read
data = self._reader.read(nrows)
File "pandas/parser.pyx", line 766, in pandas.parser.TextReader.read (pandas/parser.c:7988)
File "pandas/parser.pyx", line 788, in pandas.parser.TextReader._read_low_memory (pandas/parser.c:8244)
File "pandas/parser.pyx", line 842, in pandas.parser.TextReader._read_rows (pandas/parser.c:8970)
File "pandas/parser.pyx", line 829, in pandas.parser.TextReader._tokenize_rows (pandas/parser.c:8838)
File "pandas/parser.pyx", line 1833, in pandas.parser.raise_parser_error (pandas/parser.c:22649)
pandas.parser.CParserError: Error tokenizing data. C error: Expected 3 fields in line 2, saw 4

If I add a blank tab so that both lines have the same number of fields,
sample1 /home/lee/NGS/sw/HOME/testcase/CG/sample1_r1.txt /home/lee/NGS/sw/HOME/testcase/CG/sample1_r2.txt {tab}
sample2 /home/lee/NGS/sw/HOME/testcase/CG/sample2_r1.txt /home/lee/NGS/sw/HOME/testcase/CG/sample2_r2.txt /home/lee/NGS/sw/HOME/testcase/CG/sample3_r1.txt

it works
Preparing the DMRs from HOME.....
GOOD LUCK !
DMRs for sample1_VS_sample2_13 done
DMRs for sample1_VS_sample2_10 done
DMRs for sample1_VS_sample2_12 done
Congratulations the DMRs are ready

I hope this report helps to improve HOME.
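
The blank-tab workaround described above can be applied automatically instead of by hand. This is a minimal sketch, not part of HOME: `pad_sample_lines` is a hypothetical helper that appends empty tab-separated fields so that every line has as many fields as the longest one. It only reproduces the workaround and does not address the underlying parsing in HOME-pairwise.

```python
def pad_sample_lines(text, sep="\t"):
    """Pad each non-blank line with empty fields up to the widest line."""
    rows = [line.split(sep) for line in text.splitlines() if line.strip()]
    if not rows:
        return ""
    width = max(len(row) for row in rows)
    # Joining the extra empty strings produces the trailing tabs.
    padded = [sep.join(row + [""] * (width - len(row))) for row in rows]
    return "\n".join(padded) + "\n"
```

For a two-line file with 2 and 3 replicates, the shorter line gains one trailing tab, which matches the `{tab}` added manually in the testcase above.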

@Akanksha2511
Collaborator

OK, glad it worked. We tested with equal numbers of replicates and it works perfectly fine, but we will test it again. Thanks.

@Fred6887

I get the exact same error, but for me the blank tab doesn't work. There is an issue when replicate numbers are not the same. I have 2 replicates for the WT and 4 for the mutant.
