Building Index WGBS Error #16
Comments
Hi Bonnie,
Actually, you need to download BS-Seeker2 from the homepage: https://github.com/BSSeeker/BSseeker2 . Other scripts in the package are needed as well.
For your second question, I suggest you unzip the .fa.gz to a .fa file, and then build the genome.
Best,
Weilong
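On the second point, a minimal sketch of decompressing a gzipped FASTA file in Python, for cases where `gunzip` is not at hand. The function name `gunzip_fasta` and the file names in the usage line are illustrative and not part of BS-Seeker2.

```python
import gzip
import shutil


def gunzip_fasta(src_gz, dest_fa):
    """Decompress src_gz (a .fa.gz file) into dest_fa as a plain FASTA file."""
    # Copy in chunks so a multi-gigabyte genome is never held in memory.
    with gzip.open(src_gz, "rb") as fin, open(dest_fa, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dest_fa
```

Usage would look like `gunzip_fasta("genome.fa.gz", "genome.fa")`. Renaming the file without decompressing it does not change its contents.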
At 2017-11-10 10:57:01, "bacantre" <notifications@github.com> wrote:
Hello,
I am trying to use BS Seeker 2 to make an index of the Bovine Reference Genome for later alignment with WGBS data.
I ran the code:
python bs_seeker2-build.py -f ~/reference/file/location --aligner=bowtie2 -d ~/output/file/location
I got the error:
Traceback (most recent call last):
File "bs_seeker2-build.py", line 5, in <module>
from bs_index.wg_build import *
ImportError: No module named bs_index.wg_build
I had my university core download BS Seeker2 and Bowtie2 to the server I am using. I have also downloaded the bs_seeker2-build.py, bs_seeker2-align.py, and bs_seeker2-call_methylation.py files from here. Are there other files that I or the university core need to download?
I also assumed the .py files were meant to run as is, so I did not modify them, other than saving them as .txt and then renaming them to .py before transferring them to the server.
Also, do I need to unzip the reference genome from .fa.gz to .fa, or will it run as a .fa.gz?
Thank you,
Bonnie
|
Hello Weilong,
I have been in contact with my university and am trying to figure out what went wrong with the download of this. I ended up copying over this entire download and unzipped it. I then ran this code and got back this error:
python bs_seeker2-build.py -f UMD3.1_chromosomes.fa --aligner=bowtie2 -d ~/WGBS_Bovine_Brain/reference/
BS-Seeker2 v2.1.3 - Oct. 25, 2017
Traceback (most recent call last):
File "bs_seeker2-build.py", line 56, in <module>
if os.path.isfile(os.path.join(os.dbpath, fasta_file)):
AttributeError: 'module' object has no attribute 'dbpath'
I have the unzipped file located in ~/WGBS_Bovine_Brain/reference/ and am running this code within the BSseeker2-master folder, so that I am in the same location as bs_seeker2-build.py. I unzipped the reference genome (UMD3.1_chromosomes.fa from UMD3.1_chromosomes.fa.gz), but zcat left the .gz extension, so I just removed it in WinSCP. I am not actually sure whether this is a proper way to unzip it, or whether that is my problem.
Do you know what I am doing wrong? Are there any recommendations for what I need to tell my university to make sure this is uploaded correctly, if it is something on their end? They have me load BSseeker2 and Bowtie2 with the commands “spack load BSseeker2” and “spack load Bowtie2”, if this helps you.
Thank you,
Bonnie
|
Thanks for reporting this error message. It was a bug, but rarely reported. It has now been fixed in v2.1.5. Best, |
Hello Weilong,
Thank you for all of your help and for fixing this.
I tried to rerun the code again with the new version and got a little farther in the code, but am still receiving errors. This is what I got. It seems like it might be a problem with my reference genome file. Do you know from this error what I am doing wrong?
Code below.
python bs_seeker2-build.py -f ~/WGBS_Bovine_Brain/reference/UMD3.1_chromosomes.fa --aligner=bowtie2 -d ~/WGBS_Bovine_Brain/reference/
BS-Seeker2 v2.1.5 - Dec. 21, 2017
Reference genome file: /users/b/a/bacantre/WGBS_Bovine_Brain/reference/UMD3.1_chromosomes.fa
Reduced Representation Bisulfite Sequencing: False
Short reads aligner you are using: bowtie2
Builder path: /gpfs1/arch/spack-20170803/opt/spack/linux-rhel7-x86_64/gcc-5.4.0/bowtie2-2.2.5-ojakqj7byt2zuj6hn2d5ezuopyhudm5d/bin/bowtie2-build
[Preprocessing 8__fO_] Last: 0:00:00.023925 Total: 0:00:00.023955
[Preprocessing _______y_k______0XN_____D______a9___vt__S____K________Q0_X______] Last: 0:00:00.003813 Total: 0:00:00.027802
[Preprocessing _____Y_____2_____] Last: 0:00:00.056983 Total: 0:00:00.084818
[Preprocessing ____W_i__W___________47__po____] Last: 0:00:00.002809 Total: 0:00:00.087659
[Preprocessing _7_c___________9____________________M_a_______Y____b_c__1___] Last: 0:00:00.000844 Total: 0:00:00.088533
[Preprocessing ________Ff__] Last: 0:00:00.008730 Total: 0:00:00.097293
[Preprocessing _____649_____________1______9_____] Last: 0:00:00.008953 Total: 0:00:00.106276
[Preprocessing _h_] Last: 0:00:00.006022 Total: 0:00:00.112328
[Preprocessing __fo__cp_E__________2__T__U_____k____3_L____Z____________bXl_Yl_____________1____L_________N_8m] Last: 0:00:00.004124 Total: 0:00:00.116483
[Preprocessing _S____y___M_______p______L__7___E_] Last: 0:00:00.003323 Total: 0:00:00.119865
[Preprocessing __K8_______G_RT_____0___k__________p_____UO_Xy] Last: 0:00:00.004664 Total: 0:00:00.124559
[Preprocessing _l________] Last: 0:00:00.018344 Total: 0:00:00.142935
[Preprocessing ______I________H_WDQi_9_______CRqI__8kB__r_g_R__________70_____ZU____Y____i_c__dJ_____N__tp___5_Eq__________T_Fa_A_] Last: 0:00:00.002081 Total: 0:00:00.145049
[Preprocessing e____________C__H1____m____xX______v______t__1___D___m______i________1________i___Uz__cjR__L_] Last: 0:00:00.015408 Total: 0:00:00.160491
[Preprocessing _] Last: 0:00:00.003249 Total: 0:00:00.163771
[Preprocessing W______1_____m______c___h9_______gV_____3_WwIhx______BU_______bW______K_____n________Q__SaP_____O_S______f___Y_i______4x_J_______] Last: 0:00:00.001747 Total: 0:00:00.165547
[Preprocessing 5R__7____Y______TEM___B___D_n_k__p_________D__O_S____S__w_0kO_______t__] Last: 0:00:00.007278 Total: 0:00:00.172856
[Preprocessing ___U__q_A0____2___A_Q__h_D______C] Last: 0:00:00.030251 Total: 0:00:00.203164
[Preprocessing H_M_J___4T___Mk__L] Last: 0:00:00.015125 Total: 0:00:00.218320
[Preprocessing __C__k_kM_______eD__Dj_____5_________d_p__T___i____H_P] Last: 0:00:00.007477 Total: 0:00:00.225829
[Preprocessing __________B___F_0____pT_l_z_F___qit__F_l__5B6__K__l___EO________SOM_tL___b_______B_B_x___aKO____0____sz_] Last: 0:00:00.002442 Total: 0:00:00.228302
[Preprocessing __R__4_____6_______va______v____D________tX__] Last: 0:00:00.016771 Total: 0:00:00.245107
[Preprocessing ____yy________t_x___z_3_______________4__L________a_____LG__Wf___3_gef___W_N_Uf_____3___c_________3U__B____3__F_________A____r_____2_gv___ge_9j____W___] Last: 0:00:00.007155 Total: 0:00:00.252294
[Preprocessing ] Last: 0:00:00.003678 Total: 0:00:00.256004
[Preprocessing __1__TL__I__1] Last: 0:00:00.006726 Total: 0:00:00.262761
ERROR: BS Seeker found identical sequence ids (id: _) in the fasta file: /users/b/a/bacantre/WGBS_Bovine_Brain/reference/UMD3.1_chromosomes.fa. Please, make sure that all sequence ids are unique and contain only alphanumeric characters: A-Za-z0-9_
Thank you,
Bonnie
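The error above asks for sequence ids that are unique and use only A-Za-z0-9_. A minimal Python 3 sketch for checking a FASTA file against that rule before building is given here. How BS-Seeker2 itself derives an id from a header line is not shown in this thread, so taking the first whitespace-delimited token after `>` is an assumption, and `check_fasta_ids` is a hypothetical helper name.

```python
import re
from collections import Counter

VALID_ID = re.compile(r"^[A-Za-z0-9_]+$")


def check_fasta_ids(path):
    """Return (duplicate_ids, invalid_ids) found in the FASTA headers of path."""
    counts = Counter()
    # errors="replace" keeps the scan going even if the file is not really text.
    with open(path, "r", errors="replace") as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                # Assumption: the id is the first whitespace-delimited token.
                counts[fields[0] if fields else ""] += 1
    duplicates = sorted(i for i, n in counts.items() if n > 1)
    invalid = sorted(i for i in counts if not VALID_ID.match(i))
    return duplicates, invalid
```

Two empty lists mean the headers satisfy the rule quoted in the error message. Ids that look like random characters usually point to a file that is not plain text.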
|
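@bacantre Is your input genome file /users/b/a/bacantre/WGBS_Bovine_Brain/reference/UMD3.1_chromosomes.fa a TXT file or a binary file? It should be a TXT file. And do you really have a chromosome or contig named "8__fO_"? Please double-check that you specified the right genome file. Best, |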
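One way to answer the text-or-binary question: a file that was only renamed from .fa.gz to .fa is still compressed, and gzip data always starts with the two bytes 0x1f 0x8b. The sketch below checks for that signature; both function names are illustrative, and the FASTA check is only a rough heuristic that assumes the file begins directly with a `>` header.

```python
GZIP_MAGIC = b"\x1f\x8b"


def looks_gzipped(path):
    """Return True if the file starts with the two-byte gzip signature."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def looks_like_fasta(path):
    """Return True if the first byte is '>', as in a typical plain FASTA file."""
    with open(path, "rb") as handle:
        return handle.read(1) == b">"
```

If `looks_gzipped` returns True for a file named .fa, it needs to be decompressed rather than renamed.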
Hello Weilong,
Happy holidays. I decided to download a different reference genome format and then re-installed it appropriately. This allowed me to run the code:
python bs_seeker2-build.py -f ~/WGBS/reference/referenceUMD3.1.1.fa --aligner=bowtie2 -d ~/WGBS/reference/
I then got a folder named referenceUMD3.1.1.fa_bowtie2 that contains .data files, .bt2 files, and .log files.
This all seemed to work correctly.
When I went to do the alignment, it all went wrong again. I ran the code and got the error below:
python bs_seeker2-align.py -1 ~/WGBS/fastQfiles/D2239Amy_1.fq -2 ~/WGBS/fastQfiles/D2239Amy_2.fq --aligner=bowtie2 -o ~/WGBS/alignment/D2239Amy.bam -f bam -g ~/WGBS/reference/referenceUMD3.1.1.fa
BS-Seeker2 v2.1.5 - Dec. 21, 2017
ERROR: Index DIR "referenceUMD3.1.1.fa.." cannot be found in /gpfs1/home/b/a/bacantre/WGBS/reference/BSseeker2-master/bs_utils/reference_genomes.
Please run the bs_seeker2-build.py to create it with the correct parameters for -g, -r, --low, --up and --aligner.
It seems to want reference_genomes to be a directory, but it was created as a file in bs_utils. I have tried to make a directory called reference_genomes, but I get an error because a file with that name already exists. I have also tried moving the index directory (referenceUMD3.1.1.fa_bowtie2) and the reference genome to the bs_utils directory that I put in WGBS/reference.
I also tried redoing the build without specifying -d, but got this error:
python bs_seeker2-build.py -f ~/WGBS/reference/referenceUMD3.1.1.fa --aligner=bowtie2
BS-Seeker2 v2.1.5 - Dec. 21, 2017
Reference genome file: /users/b/a/bacantre/WGBS_Bovine_Brain/reference/referenceUMD3.1.1.fa
Reduced Representation Bisulfite Sequencing: False
Short reads aligner you are using: bowtie2
Builder path: /gpfs1/arch/spack-20170803/opt/spack/linux-rhel7-x86_64/gcc-5.4.0/bowtie2-2.2.5-ojakqj7byt2zuj6hn2d5ezuopyhudm5d/bin/bowtie2-build
ERROR: /gpfs1/home/b/a/bacantre/WGBS/reference/BSseeker2-master/bs_utils/reference_genomes must be a directory. Please, delete it or change the -d option.
What am I doing wrong? Is there a specific location the builds are supposed to be sent to? Was getting reference_genomes as a text file in bs_utils correct for running the build code?
Additionally:
I originally tried this while also adding -d ~/WGBS/reference/referenceUMD3.1.1.fa_bowtie2 to indicate the index, but got the same error for the reference location either way. I originally thought the problem was the index file, so I took out the -d option.
Do I need to reference the index folder? Your examples just use the -g, so I stuck with that to keep it simple.
Thank you,
Bonnie
|
Hi @bacantre, as you built the index using the following command:
Then you need to specify the folder in the following way:
Please note that, for bs_seeker2-align.py,
Let me know if it still does not work. Best, |
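A hedged sketch of what specifying the folder can look like, modeled on the `-g <genome file name> -d <index folder>` pattern used in the lynx example later in this thread. It is an assumption that the same pattern applies to this WGBS case. `build_align_args` is a hypothetical helper, not part of BS-Seeker2, and it only uses flags that appear in this thread.

```python
import os


def build_align_args(fastq1, fastq2, genome_fasta, db_dir, out_bam,
                     aligner="bowtie2"):
    """Assemble a bs_seeker2-align.py argument list (hypothetical helper)."""
    return [
        "python", "bs_seeker2-align.py",
        "-1", fastq1, "-2", fastq2,
        "--aligner=" + aligner,                # same aligner as in the build step
        "-o", out_bam, "-f", "bam",
        "-g", os.path.basename(genome_fasta),  # genome file name only
        "-d", db_dir,                          # folder given to -d when building
    ]


# Paths are taken from the commands quoted earlier in this thread.
args = build_align_args(
    "~/WGBS/fastQfiles/D2239Amy_1.fq", "~/WGBS/fastQfiles/D2239Amy_2.fq",
    "~/WGBS/reference/referenceUMD3.1.1.fa", "~/WGBS/reference/",
    "~/WGBS/alignment/D2239Amy.bam",
)
# Printed for pasting into a shell, which is what expands "~".
print(" ".join(args))
```

The point of the sketch is that the build and align steps share the same `-d` folder and the same aligner.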
Related issue: I've tried two genome indexes, one in the default directory and one here:
python bs_seeker2-build.py -f /shafer3/lynx_meth/genome/lynx.fa --aligner=bowtie2 -r -c AT-TAAT,ATGCA-T -d /shafer3/lynx_meth/genome/bs2/
Failed alignment, cannot find directory:
python bs_seeker2-align.py -1 /shafer3/lynx_meth/data/raw_fastq/1_R1.fastq -2 /shafer3/lynx_meth/data/raw_fastq/1_R2.fastq --aligner=bowtie2 -o /shafer3/lynx_meth/data/bs_bam/0001.bam -f bam -g lynx.fa -d /shafer3/lynx_meth/genome/bs2/
BS-Seeker2 v2.1.3 - Oct. 25, 2017
ERROR: Index DIR "lynx.fa.." cannot be found in /shafer3/lynx_meth/genome/bs2/.
Please run the bs_seeker2-build.py to create it with the correct parameters for -g, -r, --low, --up and --aligner.
Contents of the indexed directory: lynx.fa_rrbs_ATTAAT-ATGCAT_20_500_bowtie2
index_directory.txt
Head of genome:
Uploading head_lynx.txt…
Bowtie2 warns that "Warning: Encountered reference sequence with only gaps", but I have indexed and aligned successfully with Bismark (using Bowtie2), so I don't see why this isn't working. Originally the genome was named ena.fa, but I renamed it to lynx.fa; both indexes gave the same results.
My dd enzymes were AseI / NsiI.
Thanks for any tips!
Justin
|
Hi Justin,
As you ran this command for building the genome:
python bs_seeker2-build.py -f /shafer3/lynx_meth/genome/lynx.fa --aligner=bowtie2 -r -c AT-TAAT,ATGCA-T -d /shafer3/lynx_meth/genome/bs2/
you also need to specify the parameters "-r" (for RRBS) and "-c" (for your enzymes) for the alignment:
python bs_seeker2-align.py -1 /shafer3/lynx_meth/data/raw_fastq/1_R1.fastq -2 /shafer3/lynx_meth/data/raw_fastq/1_R2.fastq --aligner=bowtie2 -o /shafer3/lynx_meth/data/bs_bam/0001.bam -f bam -g lynx.fa -d /shafer3/lynx_meth/genome/bs2/ -r -c AT-TAAT,ATGCA-T
Let me know if it still does not work.
Best,
Weilong
|
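The folder names seen in this thread (`referenceUMD3.1.1.fa_bowtie2` and `lynx.fa_rrbs_ATTAAT-ATGCAT_20_500_bowtie2`) suggest that the index folder name encodes the genome file name, the RRBS settings, and the aligner, which would explain why the align options have to match the build options. The sketch below reproduces that pattern as inferred from those two names only; it is not taken from the BS-Seeker2 source, and treating 20 and 500 as the values of `--low` and `--up` is an assumption.

```python
def expected_index_dir(genome_name, aligner, rrbs=False, cut_sites=None,
                       low=20, up=500):
    """Guess the index folder name from the naming pattern seen in this thread."""
    name = genome_name
    if rrbs:
        # "AT-TAAT,ATGCA-T" -> "ATTAAT-ATGCAT": drop the cut marks, join by "-".
        sites = "-".join(s.replace("-", "") for s in cut_sites.split(","))
        name += "_rrbs_%s_%d_%d" % (sites, low, up)
    return name + "_" + aligner
```

Comparing the returned name with the folders that actually exist under the `-d` path is a quick way to spot a missing `-r`, `-c`, or `--aligner` option.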
I am having a related issue trying to use BSseeker2 to map PE reads of WGBS to the reference tomato genome (AEKE03.fasta). I did not indicate a specific path with the -d option but used the default for the index with this command:
[clr@rra-login1 BSseeker2-master]$ python bs_seeker2-build.py -f AEKE03.fasta --aligner bowtie2
This resulted in a directory full of .data files:
C_C2T.1.bt2
and then ran for PE conversion to single end mode:
got the error for pysam:
We installed it with:
then ran:
and got this error:
ERROR: Index DIR "AEKE03.fasta.." cannot be found in /shares/pi_clr/BSSeeker/BSseeker2-master/bs_utils/reference_genomes.
Does it need -r, --low or --up for WGBS? Or do I need to modify the way that the index is built for PE? |
As you used "--aligner bowtie2" for building the index, you also need to use "--aligner=bowtie2" for the alignment step. Best, |
Thanks so much for your quick response!! It looks like it started with this command:
[clr@rra-login1 BSseeker2-master]$ python bs_seeker2-align.py -1 10_P_1.fq -2 10_P_2.fq -g AEKE03.fasta --aligner=bowtie2 -o 10_P.bam -u unmapped
And ran this far to a new error:
OSError: [Errno 2] No such file or directory
[2019-08-23 09:50:15] Mode: Bowtie2, local alignment
[2019-08-23 09:50:15] Reference genome library path: /shares/pi_clr/BSSeeker/BSseeker2-master/bs_utils/reference_genomes/AEKE03.fasta_bowtie2 |
I may have solved this error by "adding" bowtie2 again since it read (on line 6):
now that line says:
I had run
[clr@rra-login1 ~]$ module add apps/bowtie/2.3.4.1
to create the index, but I guess I have to re-add it for each session? Now it's reading: |
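An `OSError: [Errno 2] No such file or directory` at this stage is consistent with the aligner executable not being on `PATH` in the current session. A minimal Python 3 sketch of a check that can be run before starting an alignment is shown here; `missing_tools` is a hypothetical helper, and the tool names are the Bowtie2 executables mentioned in this thread.

```python
import shutil


def missing_tools(names):
    """Return the executables from names that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools(["bowtie2", "bowtie2-build"])
    if missing:
        print("Not on PATH in this session:", ", ".join(missing))
    else:
        print("All aligner executables found.")
```

On clusters that use environment modules, a module typically has to be loaded again in each new login session or inside each job script.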
Hi again! It went this far but I'm not sure it was finished?
[2019-08-23 12:52:49] Launched: /apps/bowtie/2.3.4.1/bin/bowtie2 --local --quiet -D 50 --no-mixed --norc --sam-nohead --no-discordant -k 2 -p 2 -X 500 --fr -x /shares/pi_clr/BSSeeker/BSseeker2-master/bs_utils/reference_genomes/AEKE03.fasta_bowtie2/W_C2T -f -1 /tmp/bs_seeker2_10_P.bam_-bowtie2-local-TMP-a1eCZi/Trimed_FCT_1.fa.tmp-8477077 -2 /tmp/bs_seeker2_10_P.bam_-bowtie2-local-TMP-a1eCZi/Trimed_RGA_2.fa.tmp-8477077 -S /tmp/bs_seeker2_10_P.bam_-bowtie2-local-TMP-a1eCZi/W_C2T_fr_m4.mapping.tmp-8477077 |
Hi @christinalrichards, sorry for the late reply; I might have missed this message in my email. It takes some time to run if you have a lot of data. Best, |