Pre-aligned BAM files, and reversion to single end for HiC-Pro use #105
Hi James,
I just tried this method, but it doesn't recognize the format of the resulting SAM file. I have removed some of the condition names to protect the privacy of those with whom I work. See the results below:
Could you try with BAM files instead of SAM?
I did, but I could try making SAM files instead of BAM if it is helpful. Here is the syntax I used to create them:
samtools view -f 0x40 /lscratch/${SLURM_JOBID}/X_AHJNHNAFXX.1009_NEXTSEQ-2017-04-25.fq.hg19.bwa.sorted.bam > /lscratch/${SLURM_JOBID}/Pg/X_AHJNHNAFXX.1009_NEXTSEQ-2017-04-25.fq.hg19.bwa.sorted_R1.bam
samtools view -f 0x40 /lscratch/${SLURM_JOBID}/X_AHJY2LAFXX.1005_NEXTSEQ-2017-04-18.fq.hg19.bwa.sorted.bam > /lscratch/${SLURM_JOBID}/Neg/X_AHJY2LAFXX.1005_NEXTSEQ-2017-04-18.fq.hg19.bwa.sorted_R1.bam
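One thing worth checking in the commands above: `samtools view` writes SAM text without a header to stdout unless `-b` (BAM output) is given, so a redirected file named `.bam` may not actually be BAM. Below is a minimal sketch of how the split into one file per mate could be scripted. The helper name `split_mates_commands` and the example paths are hypothetical. The name-sort step rests on the assumption that the downstream pairing expects both files to list reads in the same order.

```python
import shlex

# SAM flag bits as defined in the SAM specification.
FIRST_IN_PAIR = 0x40
SECOND_IN_PAIR = 0x80


def split_mates_commands(in_bam, out_prefix, pair1_ext="_R1", pair2_ext="_R2"):
    """Build samtools argument lists that split a paired BAM into one BAM per mate.

    Hypothetical helper: it only builds the commands, it does not run them.
    """
    name_sorted = f"{out_prefix}.namesorted.bam"
    return [
        # Assumption: sort by read name so both outputs list reads in the same order.
        ["samtools", "sort", "-n", "-o", name_sorted, in_bam],
        # -b writes BAM; without it, samtools view emits headerless SAM text.
        ["samtools", "view", "-b", "-f", hex(FIRST_IN_PAIR),
         "-o", f"{out_prefix}{pair1_ext}.bam", name_sorted],
        ["samtools", "view", "-b", "-f", hex(SECOND_IN_PAIR),
         "-o", f"{out_prefix}{pair2_ext}.bam", name_sorted],
    ]


if __name__ == "__main__":
    # Example paths are placeholders, not the files from this thread.
    for cmd in split_mates_commands("sample.bam", "rawdata/sample/sample"):
        print(shlex.join(cmd))
```

Printing the commands first, rather than running them, makes it easy to review them before passing each list to `subprocess.run`.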
Here is the config file:
# Please change the variable settings below if necessary
#########################################################################
## Paths and Settings - Do not edit !
#########################################################################
TMP_DIR = tmp
#######################################################################
## SYSTEM - PBS - Start Editing Here !!
#######################################################################
JOB_NAME = hicpro-HiChip-Control
#########################################################################
## Data
#########################################################################
PAIR1_EXT = _R1
#######################################################################
## Alignment options
#######################################################################
FORMAT = phred33
BOWTIE2_IDX_PATH = /fdb/igenomes/Homo_sapiens/UCSC/hg19/Sequence/Bowtie2Index/
#######################################################################
## Annotation files
#######################################################################
REFERENCE_GENOME = genome
#######################################################################
## Allele specific
#######################################################################
ALLELE_SPECIFIC_SNP =
#######################################################################
## Digestion Hi-C
#######################################################################
GENOME_FRAGMENT = /data/path/HindIII_resfrag_hg19.bed
#######################################################################
## Hi-C processing
#######################################################################
MIN_CIS_DIST =
#######################################################################
## Contact Maps
#######################################################################
BIN_SIZE = 1000 2500 5000 10000 25000 500000 1000000
#######################################################################
## ICE Normalization
#######################################################################
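For anyone who wants to sanity-check a config like the one above before launching a run, here is a minimal sketch of reading its `KEY = VALUE` lines. The function name `parse_hicpro_config` is hypothetical and this is not how HiC-Pro itself reads the file; it only assumes that lines starting with `#` are comments and that a key may have an empty value.

```python
def parse_hicpro_config(text):
    """Parse KEY = VALUE lines, skipping blank lines and '#' comments.

    Hypothetical helper for inspection only; not part of HiC-Pro.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def empty_keys(settings):
    """Return the keys that were left without a value, in sorted order."""
    return sorted(key for key, value in settings.items() if value == "")


if __name__ == "__main__":
    example = "PAIR1_EXT = _R1\nMIN_CIS_DIST =\nBIN_SIZE = 1000 2500 5000\n"
    parsed = parse_hicpro_config(example)
    print(empty_keys(parsed))
    print([int(size) for size in parsed["BIN_SIZE"].split()])
```

Listing the empty keys makes it easier to confirm that blanks such as `MIN_CIS_DIST` were left empty on purpose.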
Could you please show me the content of:
You have to organise the data with one folder per sample.
Best
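The one-folder-per-sample layout can be checked before starting a run. The sketch below is a simplified assumption about what the layout check looks for: each sample folder holds files ending in `_R1.bam` with a matching `_R2.bam`. The function name `layout_problems` is hypothetical, and the `_R1`/`_R2` defaults mirror the `PAIR1_EXT` setting and the error message quoted in this thread.

```python
from pathlib import Path


def layout_problems(rawdata, pair1_ext="_R1", pair2_ext="_R2", suffix=".bam"):
    """List deviations from a one-folder-per-sample layout with paired files.

    Hypothetical helper; a simplified stand-in for the real layout check.
    """
    root = Path(rawdata)
    problems = []
    # Files sitting directly in the input directory are not inside a sample folder.
    for loose in sorted(root.glob(f"*{suffix}")):
        problems.append(f"{loose.name}: not inside a sample folder")
    tail = pair1_ext + suffix
    for sample in sorted(p for p in root.iterdir() if p.is_dir()):
        mates1 = sorted(sample.glob(f"*{tail}"))
        if not mates1:
            problems.append(f"{sample.name}: no *{tail} file")
        for r1 in mates1:
            # Swap the trailing _R1.bam for _R2.bam to find the expected mate file.
            r2 = r1.with_name(r1.name[: -len(tail)] + pair2_ext + suffix)
            if not r2.exists():
                problems.append(f"{sample.name}: missing {r2.name}")
    return problems


if __name__ == "__main__":
    for problem in layout_problems("."):
        print(problem)
```

An empty list means every sample folder has at least one complete `_R1`/`_R2` pair under these assumptions.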
Here is the directory listing after performing the requested operations (in the deleted post, I noticed I had run ls before the samtools lines, hence the lack of split reads):
ok. Sounds good, but please remove the |
Done.
samtools view -f 0x40 /lscratch/${SLURM_JOBID}/X_AHJNHNAFXX.1009_NEXTSEQ-2017-04-25.fq.hg19.bwa.sorted.bam > /lscratch/${SLURM_JOBID}/Pg/X_AHJNHNAFXX.1009_NEXTSEQ-2017-04-25.fq.hg19.bwa.sorted_R1.bam
samtools view -f 0x40 /lscratch/${SLURM_JOBID}/X_AHJY2LAFXX.1005_NEXTSEQ-2017-04-18.fq.hg19.bwa.sorted.bam > /lscratch/${SLURM_JOBID}/Neg/X_AHJY2LAFXX.1005_NEXTSEQ-2017-04-18.fq.hg19.bwa.sorted_R1.bam
OUTPUT:
Run HiC-Pro 2.9.0
Thu Nov 2 13:54:10 EDT 2017
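A quick way to tell whether a file with a `.bam` name really is BAM is to look at its first bytes: BAM is BGZF (gzip-compatible) compressed, and the decompressed stream starts with the magic bytes `BAM\1`. The sketch below uses only the standard library; the function name `looks_like_bam` is hypothetical.

```python
import gzip

# Magic bytes at the start of the decompressed BAM stream (SAM/BAM specification).
BAM_MAGIC = b"BAM\x01"


def looks_like_bam(path):
    """Return True if the file is gzip/BGZF compressed and starts with the BAM magic."""
    try:
        with gzip.open(path, "rb") as handle:
            return handle.read(4) == BAM_MAGIC
    except (OSError, EOFError):
        # Plain-text SAM (or any non-gzip file) ends up here.
        return False


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        print(name, "BAM" if looks_like_bam(name) else "not BAM")
```

A file that fails this check but opens in a text editor is most likely SAM text, whatever its extension says.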
From pysam-developers/pysam#51, it seems that there may be a way to make it work by setting check_sq to False on line 222 of mergeSAM.py.
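For context, `check_sq` is a parameter of pysam's `AlignmentFile` that controls whether a file without `@SQ` header entries is rejected when opened for reading. Whether a missing header is the actual cause in this thread is an assumption. A minimal standard-library sketch for checking whether SAM text carries any `@SQ` lines (the function name `count_sq_lines` is hypothetical):

```python
def count_sq_lines(sam_lines):
    """Count @SQ header lines at the top of SAM text.

    Header lines start with '@' and come before the first alignment record,
    so scanning stops at the first non-header line.
    """
    count = 0
    for line in sam_lines:
        if not line.startswith("@"):
            break
        if line.startswith("@SQ\t"):
            count += 1
    return count


if __name__ == "__main__":
    import sys

    # Usage: samtools view -h file.bam | python this_script.py
    print(count_sq_lines(sys.stdin))
```

A count of zero means the reference sequences are not declared in the header, which is the situation `check_sq=False` is meant to tolerate.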
Could you show the mergeSAM.py command line that was used? It should be written in the hicpro_T47D-HiChip-Pg.log file.
/usr/local/Anaconda/envs_app/hicpro/2.9.0/bin/python /usr/local/Anaconda/envs_app/hicpro/2.9.0/HiC-Pro_2.9.0/scripts/mergeSAM.py -q 0 -t -v -f bam_input/Neg/X_AHJY2LAFXX.1005_NEXTSEQ-2017-04-18.fq.hg19.bwa.sorted_R1.bam -r bam_input/Neg/X_AHJY2LAFXX.1005_NEXTSEQ-2017-04-18.fq.hg19.bwa.sorted_R2.bam -o bowtie_results/bwt2/Neg/X_AHJY2LAFXX.1005_NEXTSEQ-2017-04-18.fq.hg19.bwa.sorted.bam.bwt2pairs.bam > logs/Neg/mergeSAM.log
Why do you have bam_input/... instead of rawdata/...?
I figured it would be best to store the BAM files in a separate directory so that only the files I wanted (the BAM files) would be read. I've included the output. It's very small and consists only of logs. I've also sent you the data, with permission from the experimenter. Thanks so much for being willing to look into this. I'd love to use this with BAM files and do the alignment myself, if possible. Making BAM input more seamless should also improve the software for many others outside of my use case. If you can, please keep the input data confidential.
Hi James, |
Sounds good. Thanks so much for looking into this. If you find a way to make it work, let me know. I'd like to get this working, as bypassing the alignment step would speed up processing quite a bit and would also allow workarounds when alignment issues arise. If there is anything further I can do, let me know. I imagine you are quite busy.
Was there ever a resolution to this? I'm also trying to run HiC-Pro on some pre-aligned data and having a lot of trouble. It would save me a lot of time to bypass the bowtie alignment step, so I'd love to find out if it's possible.
Not yet, but it is my top priority!
We previously found that we couldn't do it with files that were not aligned by bowtie, as HiC-Pro's functions depend on BAM files in bowtie's format. Just a thought, but it might work if your files were aligned with bowtie.
Thanks! My files are bowtie aligned (they're actually just subsampled from a previous alignment run through HiC-Pro), but I'm not sure exactly which files HiC-Pro needs for the next steps. The docs just say ".bam" files, but does this mean the bwt2pairs.bam, or bwt2merged.bam files? And does it also require any of the .mapstat or .mmapstat or .mpairstat files?
Hi, |
Perfect, thanks! I was doing exactly what you said, but used the bwt2pairs.bam files, so that must be why it wasn't working.
Yes I know. This is exactly what I have to update. |
Hello, |
Yes, move to the nf-core-hic pipeline: https://nf-co.re/hic/latest/docs/usage/
Hello!
I'm having difficulty running the program in a stepwise fashion:
After copying two bam files to lscratch, I receive the following error:
"Exit: Error: Directory Hierarchy of rawdata '/lscratch/52514766/' is not correct. Paired '.bam' files with _R1/_R2 are required for 'proc_hic quality_checks merge_persample build_contact_maps ice_norm' step(s)"
I was aware that R1 and R2 fastq files were needed, but I was unaware that R1 and R2 BAM files would be needed. What is the best tool to convert a single, aligned BAM file to a pair of BAM files that HiC-Pro will accept?
My syntax is as follows (I am attempting to do all of the steps except for the first, which requires fastq files).
HiC-Pro -i /lscratch/${SLURM_JOBID}/ -o /path/to/output -c config_file.txt -s proc_hic -s quality_checks -s merge_persample -s build_contact_maps -s ice_norm
Thanks,
James D