Failed to open the database file - Program halted (file path variable related error) #4

Open
MDSharma opened this issue Aug 19, 2018 · 4 comments


MDSharma commented Aug 19, 2018

Hi folks, there appears to be a small error in the KREATION.py script when it executes cd-hit-est. The error is reproduced below. I think that when the cd-hit-est step runs, it's not picking up the correct path for the *transcripts_org_clu.fa file, and cd-hit-est then crashes.

Done with 91116 scaffolds, 0 gaps finished, 163126 gaps overall
time elapsed: 44m
time for the whole pipeline: 673m
mv: missing file operand
Try 'mv --help' for more information.

Fatal Error:
Failed to open the database file
Program halted !!

It looks like the mv command fails because an operand is missing, and cd-hit-est then fails because its input path is incomplete. A quick fix would be much appreciated.

cd-hit-est -i /21_transcripts_org_clu.fa -o /scratch/abcd-2/pipeline/Cluster/21/transcripts_clust.fa -c 0.99 -M 2000M -T 10
================================================================
Program: CD-HIT, V4.7 (+OpenMP), Aug 18 2018, 08:08:27
Command: cd-hit-est -i /21_transcripts_org_clu.fa -o
         /scratch/abcd-2/pipeline/Cluster/21/transcripts_clust.fa
         -c 0.99 -M 2000M -T 10

Started: Sun Aug 19 21:02:31 2018
================================================================
                            Output
----------------------------------------------------------------

Fatal Error:
Failed to open the database file
Program halted !!

MDSharma (Author) commented

This seems to be the problematic section:

        os.system(cmd_rn)
        os.system("cd-hit-est -i "+dirname+"/"+str(i)+"_transcripts_org_clu.fa -o "+output+"/Cluster/"+str(i)+"/transcripts_clust.fa -c 0.99 -M 2000M -T 10 >> "+output+"/Cluster/"+str(i)+"/transcripts_clust.log")

A few lines above it, dirname is set as follows:

        dirname = os.path.dirname(ts)
        #ts1=ts.replace("transcripts.fa","transcripts_org.fa")
        ts1=ts.replace(str(filename.strip()),str(i)+"_transcripts_org.fa")
        print "moving stuff 1"
        os.system("mv "+ts+" "+ts1)
        cmd_rn = "perl "+cwd+"/src/rename_sequence.pl "+ts1+" "+str(i)+""
        os.system(cmd_rn)
        os.system("cd-hit-est -i "+dirname+"/"+str(i)+"_transcripts_org_clu.fa -o "+output+"/Cluster/"+str(i)+"/transcripts_clust.fa -c 0.99 -M 2000M -T 10 >> "+output+"/Cluster/"+str(i)+"/transcripts_clust.log")
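
One reading consistent with both error messages is that `ts` is an empty string at this point (for example, because the preceding `find` matched nothing). The sketch below only rebuilds the strings from the snippet above under that assumption; it runs no commands. The function name `build_commands` and the sample file name `transcripts.fa` are illustrative, not taken from KREATION.py.

```python
import os.path

def build_commands(ts, filename, i):
    """Rebuild the mv command and the cd-hit-est input path the way the snippet does."""
    dirname = os.path.dirname(ts)
    ts1 = ts.replace(filename.strip(), str(i) + "_transcripts_org.fa")
    mv_cmd = "mv " + ts + " " + ts1
    cdhit_input = dirname + "/" + str(i) + "_transcripts_org_clu.fa"
    return mv_cmd, cdhit_input

# Assumption: the lookup that fills `ts` returned nothing, so ts == "".
mv_cmd, cdhit_input = build_commands("", "transcripts.fa", 21)

# With an empty ts, mv receives no operands at all and the cd-hit-est
# input collapses to a file in the filesystem root.
print(repr(mv_cmd))
print(cdhit_input)
```

With an empty `ts`, `os.path.dirname` also returns an empty string, which would explain why the cd-hit-est input in the log starts directly with `/21_`.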

@MDSharma MDSharma changed the title Failed to open the database file - Program halted (cd-hit-est related error) Failed to open the database file - Program halted (file path variable related error) Aug 19, 2018

rafaeltiveron commented Oct 20, 2018

@daokoder
@weizhongli


rafaeltiveron commented Oct 22, 2018

The problem is on these lines:

status, ts=commands.getstatusoutput("find "+output2.strip()+" -name "+filename.strip())
#status, ts=commands.getstatusoutput("find "+output.strip()+" -name transcripts_"+str(i)+".fa")
dirname = os.path.dirname(ts)
#ts1=ts.replace("transcripts.fa","transcripts_org.fa")
ts1=ts.replace(str(filename.strip()),str(i)+"_transcripts_org.fa")
os.system("mv "+ts+" "+ts1)

If "filename.strip()" is replaced with the literal file name in the assignments to "ts" and "ts1", like this:

status, ts=commands.getstatusoutput("find "+output2.strip()+" -name transcripts.fa")

and

ts1=ts.replace("transcripts.fa",str(i)+"_transcripts_org.fa")

I think the problem will be solved when Oases is used as the assembler. Let's try it. For SOAPdenovo I don't recommend this change, because it is not needed to get the pipeline working.
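
Whichever file name is used, the step could also fail early when the file is not found instead of passing an empty path to `mv`. The following is a minimal Python 3 sketch of that idea, not the maintainers' fix: `find_transcripts` and `rename_transcripts` are hypothetical helpers, and the original script uses the Python 2 `commands` module, which no longer exists in Python 3. Unlike `ts.replace(...)`, this version changes only the base name of the file.

```python
import os
import shutil

def find_transcripts(search_root, filename):
    """Return the first path under search_root whose base name is filename, or None."""
    for dirpath, _dirnames, filenames in os.walk(search_root):
        if filename in filenames:
            return os.path.join(dirpath, filename)
    return None

def rename_transcripts(search_root, filename, k):
    """Rename the assembler output to <k>_transcripts_org.fa, failing loudly if absent."""
    ts = find_transcripts(search_root, filename.strip())
    if ts is None:
        # Stop here rather than running mv and cd-hit-est with empty paths.
        raise FileNotFoundError(
            "%s not found under %s" % (filename.strip(), search_root)
        )
    dirname = os.path.dirname(ts)
    ts1 = os.path.join(dirname, "%d_transcripts_org.fa" % k)
    shutil.move(ts, ts1)
    return dirname, ts1
```

A caller would then build the cd-hit-est input path from the returned `dirname`, so a missing file surfaces as a Python exception at this step rather than as a cd-hit-est error later.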


rafaeltiveron commented Oct 26, 2018

I've tested the changes suggested above, but the pipeline still doesn't seem to recognize files it generated itself:

rm: could not remove '/.../Assembly//21/oasesPipeline_21/21_transcripts_org_clu.fa': File or directory not found
rm: Could not remove '/.../Cluster/Combined/combine_p.fa': File or directory not found
mv: unable to get status from '/.../Cluster/Combined/combine.fa': File or directory not found
rm: could not remove '/.../Assembly//23/oasesPipeline_23/23_transcripts_org_clu.fa': File or directory not found
rm: Could not remove '/.../Cluster/Combined/combine_p.fa': File or directory not found
rm: could not remove '/.../Assembly//25/oasesPipeline_25/25_transcripts_org_clu.fa': File or directory not found
rm: could not remove '/.../Assembly//27/oasesPipeline_27/27_transcripts_org_clu.fa': File or directory not found
rm: could not remove '/.../Assembly//29/oasesPipeline_29/29_transcripts_org_clu.fa': File or directory not found
rm: could not remove '/.../Assembly//31/oasesPipeline_31/31_transcripts_org_clu.fa': File or directory not found

Looking at the folders, these files were generated and populated normally up to the k-mer 31 cycle. After that, the workflow breaks:

velvetg: Could not write to oasesPipeline_33/Log: No such file or directory
Traceback (most recent call last):
  File "/mnt/hgfs/SharedFolders/programas/KREATION/oases/oases_pipeline_2.py", line 123, in <module>
    main()
  File "/mnt/hgfs/SharedFolders/programas/KREATION/oases/oases_pipeline_2.py", line 117, in main
    singleKAssemblies(options)
  File "/mnt/hgfs/SharedFolders/programas/KREATION/oases/oases_pipeline_2.py", line 45, in singleKAssemblies
    assert p.returncode == 0, "Velvetg failed at k = %i\n%s" % (k, output[0])
AssertionError: Velvetg failed at k = 33

All commands were run from the root account.
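
The velvetg message names a relative path (`oasesPipeline_33/Log`), which is resolved against the current working directory of the process. As a rough diagnostic, one could check which per-k directories actually exist under a given base directory before the pipeline moves on. This is a sketch only: `missing_kmer_dirs` is a hypothetical helper, and the `oasesPipeline_` prefix is taken from the paths in the messages above.

```python
import os

def missing_kmer_dirs(base_dir, kmers, prefix="oasesPipeline_"):
    """Return the k values whose <prefix><k> directory is absent under base_dir."""
    return [
        k for k in kmers
        if not os.path.isdir(os.path.join(base_dir, prefix + str(k)))
    ]

# Example: check the odd k values from 21 to 33 relative to the current directory.
if __name__ == "__main__":
    print(missing_kmer_dirs(os.getcwd(), range(21, 35, 2)))
```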
