lost features from Augustus input #193

Open
splaisan opened this issue Aug 16, 2018 · 8 comments
Comments

@splaisan

splaisan commented Aug 16, 2018

I ran GAG on an Augustus output for E. coli and only find genes in the GAG output. Neither the transcripts nor the CDS (both present in the input) are transferred to genome.mrna.fasta and genome.proteins.fasta.

The features are also absent in other gag gff outputs (.ignored .invalid)!

-rw-r--r-- 1 u0002316 domain users 1.6M Aug 16 12:14 genome.comments.gff
-rw-r--r-- 1 u0002316 domain users 4.5M Aug 16 12:14 genome.fasta
-rw-r--r-- 1 u0002316 domain users 238K Aug 16 12:14 genome.gff
-rw-r--r-- 1 u0002316 domain users 332K Aug 16 12:14 genome.ignored.gff
-rw-r--r-- 1 u0002316 domain users 587K Aug 16 12:14 genome.invalid.gff
-rw-r--r-- 1 u0002316 domain users 0 Aug 16 12:14 genome.mrna.fasta
-rw-r--r-- 1 u0002316 domain users 0 Aug 16 12:14 genome.proteins.fasta
-rw-r--r-- 1 u0002316 domain users 0 Aug 16 12:14 genome.removed.gff
-rw-r--r-- 1 u0002316 domain users 1.8K Aug 16 12:14 genome.stats
-rw-r--r-- 1 u0002316 domain users 161K Aug 16 12:14 genome.tbl

                                 Genome      
                                 ------      
Total sequence length            4704622     
Number of genes                  4185        
Number of mRNAs                  0           
Number of exons                  0           
Number of introns                0           
Number of CDS                    0           
Overlapping genes                1137        
Contained genes                  0           
CDS: complete                    0           
CDS: start, no stop              0           
CDS: stop, no start              0           
CDS: no stop, no start           0           
Total gene length                3995097     
Total mRNA length                0           
Total exon length                0           
Total intron length              0           
Total CDS length                 0           
Shortest gene                    72          
Shortest mRNA                    0           
Shortest exon                    0           
Shortest intron                  0           
Shortest CDS                     0           
Longest gene                     7077        
Longest mRNA                     0           
Longest exon                     0           
Longest intron                   0           
Longest CDS                      0           
mean gene length                 955         
mean mRNA length                 0           
mean exon length                 0           
mean intron length               0           
mean CDS length                  0           
% of genome covered by genes     84.9        
% of genome covered by CDS       0.0         
mean mRNAs per gene              0           
mean exons per mRNA              0           
mean introns per mRNA            0    

My input looks like the excerpt below. What is wrong with it?
Thanks

##gff-version 3
# This output was generated with AUGUSTUS (version 3.3.1).
# AUGUSTUS is a gene prediction tool written by M. Stanke (mario.stanke@uni-greifswald.de),
# O. Keller, S. König, L. Gerischer, L. Romoth and Katharina Hoff.
# Please cite: Mario Stanke, Mark Diekhans, Robert Baertsch, David Haussler (2008),
# Using native and syntenically mapped cDNA alignments to improve de novo gene finding
# Bioinformatics 24: 637-644, doi 10.1093/bioinformatics/btn013
# No extrinsic information on sequences given.
# Initialising the parameters using config directory /opt/biotools/Augustus/config/ ...
# E_coli_K12 version. Using species specific transition matrix: /opt/biotools/Augustus/config/species/E_coli_K12/E_coli_K12_trans_shadow_bacterium.pbl
# Using species specific overlap length distribution: /opt/biotools/Augustus/config/species/E_coli_K12/E_coli_K12_ovlp_len.pbl
# admissible start codons and their probabilities: ATA(0), ATC(0), ATG(0.915), ATT(0), CTG(0.000562), GTG(0.0703), TTG(0.0141)
# Looks like job-133-85882ac6-9f24-45f0-ae08-edbb6552e6b7-file.fasta is in fasta format.
# We have hints for 0 sequences and for 0 of the sequences in the input set.
#
# ----- prediction on sequence number 1 (length = 4673810, name = 000000F|arrow) -----
#
# Predicted genes for sequence number 1 on both strands
# start gene g1
000000F|arrow	AUGUSTUS	gene	83	2383	0.97	+	.	ID=g1
000000F|arrow	AUGUSTUS	transcript	83	2383	0.97	+	.	ID=g1.t1;Parent=g1
000000F|arrow	AUGUSTUS	start_codon	83	85	.	+	0	Parent=g1.t1
000000F|arrow	AUGUSTUS	CDS	83	2383	0.97	+	0	ID=g1.t1.cds;Parent=g1.t1
000000F|arrow	AUGUSTUS	stop_codon	2381	2383	.	+	0	Parent=g1.t1
# protein sequence = [MYAQTNEYGFLETPYRKVTDGVVTDEIHYLSAIEEGNYVIAQANSNLDEEGHFVEDLVTCRSKGESSLFSRDQVDYMD
# VSTQQVVSVGASLIPFLEHDDANRALMGANMQRQAVPTLRADKPLVGTGMERAVAVDSGVTAVAKRGGVVQYVDASRIVIKVNEDEMYPGEAGIDIYN
# LTKYTRSNQNTCINQMPCVSLGEPVERGDVLADGPSTDLGELALGQNMRVAFMPWNGYNFEDSILVSERVVQEDRFTTIHIQELACVSRDTKLGPEEI
# TADIPNVGEAALSKLDESGIVYIGAEVTGGDILVGKVTPKGETQLTPEEKLLRAIFGEKASDVKDSSLRVPNGVSGTVIDVQVFTRDGVEKDKRALEI
# EEMQLKQAKKDLSEELQILEAGLFSRIRAVLVAGGVEAEKLDKLPRDRWLELGLTDEEKQNQLEQLAEQYDELKHEFEKKLEAKRRKITQGDDLAPGV
# LKIVKVYLAVKRRIQPGDKMAGRHGNKGVISKINPIEDMPYDENGTPVDIVLNPLGVPSRMNIGQILETHLGMAAKGIGDKINAMLKQQQEVAKLREF
# IQRAYDLGADVRQKVDLSTFSDEEVMRLAENLRKGMPIATPVFDGAKEAEIKELLKLGDLPTSGQIRLYDGRTGEQFERPVTVGYMYMLKLNHLVDDK
# MHARSTGSYSLVTQQPLGGKAQFGGQRFGEMEVWALEAYGAAYTLQEMLTVKSDDVNGRTKMYKNIVDGNHQMEPGMPESFNVLLKEIRSLGINIELE
# DE]
# end gene g1
###
@Edison2021

Edison2021 commented Oct 12, 2018

I have the same problem. I guess GAG reads column #9 rather than column #3 for CDS.
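
For reference, GFF3 feature lines have nine tab-separated columns: the feature type (gene, transcript, CDS, ...) is column 3 and the attributes (ID, Parent) are column 9. The thread does not establish which column or feature type GAG is tripping on, so the following is only a minimal diagnostic sketch for checking what an input file contains. The helper names `parse_gff3_line` and `count_feature_types` are hypothetical and are not part of GAG.

```python
from collections import Counter

# Names of the nine tab-separated GFF3 columns, in order.
GFF3_COLUMNS = ("seqid", "source", "type", "start", "end",
                "score", "strand", "phase", "attributes")


def parse_gff3_line(line):
    """Return a dict of the nine GFF3 columns, or None for comment/blank/malformed lines."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    fields = line.split("\t")
    if len(fields) != 9:
        return None  # not a well-formed feature line
    return dict(zip(GFF3_COLUMNS, fields))


def count_feature_types(lines):
    """Tally column 3 (feature type) over an iterable of GFF3 lines."""
    counts = Counter()
    for line in lines:
        record = parse_gff3_line(line)
        if record is not None:
            counts[record["type"]] += 1
    return counts


# Three feature lines taken from the excerpt in this issue, plus two comment lines.
example = [
    "##gff-version 3\n",
    "# start gene g1\n",
    "000000F|arrow\tAUGUSTUS\tgene\t83\t2383\t0.97\t+\t.\tID=g1\n",
    "000000F|arrow\tAUGUSTUS\ttranscript\t83\t2383\t0.97\t+\t.\tID=g1.t1;Parent=g1\n",
    "000000F|arrow\tAUGUSTUS\tCDS\t83\t2383\t0.97\t+\t0\tID=g1.t1.cds;Parent=g1.t1\n",
]
print(count_feature_types(example))
```

Running the same tally on the input GFF and on genome.ignored.gff and genome.invalid.gff would show which feature types end up where.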

@Juke34

Juke34 commented Feb 19, 2020

You could try agat_sp_gxf_to_gff3.pl from AGAT to fix your gff file first.

@splaisan
Author

I tried this, and GAG does not like the result:

python /opt/biotools/GAG/gag.py --fasta job-133-85882ac6-9f24-45f0-ae08-edbb6552e6b7-file.fasta --gff agat_job-133_Augustus.gff3 --out gag_out_agat-gff
Reading fasta...
Done.
Reading gff...
Traceback (most recent call last):
  File "/opt/biotools/GAG/gag.py", line 50, in <module>
    main()
  File "/opt/biotools/GAG/gag.py", line 46, in main
    controller.execute(args)
  File "/opt/biotools/GAG/src/controller.py", line 74, in execute
    self.read_gff(gffpath, out_dir)
  File "/opt/biotools/GAG/src/controller.py", line 286, in read_gff
    genes, comments, invalids, ignored = gffreader.read_file(reader)
  File "/opt/biotools/GAG/src/gff_reader.py", line 336, in read_file
    if len(line) == 0 or line.startswith('#'):
TypeError: startswith first arg must be bytes or a tuple of bytes, not str

I am not sure yet what the cause could be.
AGAT has removed the translations and comment lines.

The original GFF starts like this:

##gff-version 3
# This output was generated with AUGUSTUS (version 3.3.1).
# AUGUSTUS is a gene prediction tool written by M. Stanke (mario.stanke@uni-greifswald.de),
# O. Keller, S. König, L. Gerischer, L. Romoth and Katharina Hoff.
# Please cite: Mario Stanke, Mark Diekhans, Robert Baertsch, David Haussler (2008),
# Using native and syntenically mapped cDNA alignments to improve de novo gene finding
# Bioinformatics 24: 637-644, doi 10.1093/bioinformatics/btn013
# No extrinsic information on sequences given.
# Initialising the parameters using config directory /opt/biotools/Augustus/config/ ...
# E_coli_K12 version. Using species specific transition matrix: /opt/biotools/Augustus/config/species/E_coli_K12/E_coli_K12_trans_shadow_bacterium.pbl
# Using species specific overlap length distribution: /opt/biotools/Augustus/config/species/E_coli_K12/E_coli_K12_ovlp_len.pbl
# admissible start codons and their probabilities: ATA(0), ATC(0), ATG(0.915), ATT(0), CTG(0.000562), GTG(0.0703), TTG(0.0141)
# Looks like job-133-85882ac6-9f24-45f0-ae08-edbb6552e6b7-file.fasta is in fasta format.
# We have hints for 0 sequences and for 0 of the sequences in the input set.
#
# ----- prediction on sequence number 1 (length = 4673810, name = 000000F|arrow) -----
#
# Predicted genes for sequence number 1 on both strands
# start gene g1
000000F|arrow   AUGUSTUS        gene    83      2383    0.97    +       .       ID=g1
000000F|arrow   AUGUSTUS        transcript      83      2383    0.97    +       .       ID=g1.t1;Parent=g1
000000F|arrow   AUGUSTUS        start_codon     83      85      .       +       0       Parent=g1.t1
000000F|arrow   AUGUSTUS        CDS     83      2383    0.97    +       0       ID=g1.t1.cds;Parent=g1.t1
000000F|arrow   AUGUSTUS        stop_codon      2381    2383    .       +       0       Parent=g1.t1
# protein sequence = [MYAQTNEYGFLETPYRKVTDGVVTDEIHYLSAIEEGNYVIAQANSNLDEEGHFVEDLVTCRSKGESSLFSRDQVDYMD
# VSTQQVVSVGASLIPFLEHDDANRALMGANMQRQAVPTLRADKPLVGTGMERAVAVDSGVTAVAKRGGVVQYVDASRIVIKVNEDEMYPGEAGIDIYN
# LTKYTRSNQNTCINQMPCVSLGEPVERGDVLADGPSTDLGELALGQNMRVAFMPWNGYNFEDSILVSERVVQEDRFTTIHIQELACVSRDTKLGPEEI
# TADIPNVGEAALSKLDESGIVYIGAEVTGGDILVGKVTPKGETQLTPEEKLLRAIFGEKASDVKDSSLRVPNGVSGTVIDVQVFTRDGVEKDKRALEI
# EEMQLKQAKKDLSEELQILEAGLFSRIRAVLVAGGVEAEKLDKLPRDRWLELGLTDEEKQNQLEQLAEQYDELKHEFEKKLEAKRRKITQGDDLAPGV
# LKIVKVYLAVKRRIQPGDKMAGRHGNKGVISKINPIEDMPYDENGTPVDIVLNPLGVPSRMNIGQILETHLGMAAKGIGDKINAMLKQQQEVAKLREF
# IQRAYDLGADVRQKVDLSTFSDEEVMRLAENLRKGMPIATPVFDGAKEAEIKELLKLGDLPTSGQIRLYDGRTGEQFERPVTVGYMYMLKLNHLVDDK
# MHARSTGSYSLVTQQPLGGKAQFGGQRFGEMEVWALEAYGAAYTLQEMLTVKSDDVNGRTKMYKNIVDGNHQMEPGMPESFNVLLKEIRSLGINIELE
# DE]
# end gene g1
###
# start gene g2
000000F|arrow   AUGUSTUS        gene    2460    6683    1       +       .       ID=g2
000000F|arrow   AUGUSTUS        transcript      2460    6683    1       +       .       ID=g2.t1;Parent=g2
000000F|arrow   AUGUSTUS        start_codon     2460    2462    .       +       0       Parent=g2.t1
000000F|arrow   AUGUSTUS        CDS     2460    6683    1       +       0       ID=g2.t1.cds;Parent=g2.t1
000000F|arrow   AUGUSTUS        stop_codon      6681    6683    .       +       0       Parent=g2.t1
# protein sequence = [MKDLLKFLKAQTKTEEFDAIKIALASPDMIRSWSFGEVKKPETINYRTFKPERDGLFCARIFGPVKDYECLCGKYKRL
# KHRGVICEKCGVEVTQTKVRRERMGHIELASPTAHIWFLKSLPSRIGLLLDMPLRDIERVLYFESYVVIEGGMTNLERQQILTEEQYLDALEEFGDEF
# DAKMGAEAIQALLKSMDLEQECEQLREELNETNSETKRKKLTKRIKLLEAFVQSGNKPEWMILTVLPVLPPDLRPLVPLDGGRFATSDLNDLYRRVIN
# RNNRLKRLLDLAAPDIIVRNEKRMLQEAVDALLDNGRRGRAITGSNKRPLKSLADMIKGKQGRFRQNLLGKRVDYSGRSVITVGPYLRLHQCGLPKKM
# ALELFKPFIYGKLELRGLATTIKAAKKMVEREEAVVWDILDEVIREHPVLLNRAPTLHRLGIQAFEPVLIEGKAIQLHPLVCAAYNADFDGDQMAVHV
# PLTLEAQLEARALMMSTNNILSPANGEPIIVPSQDVVLGLYYMTRDCVNAKGEGMVLTGPKEAERLYRSGLASLHARVKVRITEYEKDANGELVAKTS
# LKDTTVGRAILWMIVPKGLPYSIVNQALGKKAISKMLNTCYRILGLKPTVIFADQIMYTGFAYAARSGASVGIDDMVIPEKKHEIISEAEAEVAEIQE
# QFQSGLVTAGERYNKVIDIWAAANDRVSKAMMDNLQTETVINRDGQEEKQVSFNSIYMMADSGARGSAAQIRQLAGMRGLMAKPDGSIIETPITANFR
# EGLNVLQYFISTHGARKGLADTALKTANSGYLTRRLVDVAQDLVVTEDDCGTHEGIMMTPVIEGGDVKEPLRDRVLGRVTAEDVLKPGTADILVPRNT
# LLHEQWCDLLEENSVDAVKVRSVVSCDTDFGVCAHCYGRDLARGHIINKGEAIGVIAAQSIGEPGTQLTMRTFHIGGAASRAAAESSIQVKNKGSIKL
# SNVKSVVNSSGKLVITSRNTELKLIDEFGRTKESYKVPYGAVLAKGDGEQVAGGETVANWDPHTMPVITEVSGFVRFTDMIDGQTITRQTDELTGLSS
# LVVLDSAERTAGGKDLRPALKIVDAQGNDVLIPGTDMPAQYFLPGKAIVQLEDGVQISSGDTLARIPQESGGTKDITGGLPRVADLFEARRPKEPAIL
# AEISGIVSFGKETKGKRRLVITPVDGSDPYEEMIPKWRQLNVFEGERVERGDVISDGPEAPHDILRLRGVHAVTRYIVNEVQDVYRLQGVKINDKHIE
# VIVRQMLRKATIVNAGSSDFLEGEQVEYSRVKIANRELEANGKVGATYSRDLLGITKASLATESFISAASFQETTRVLTEAAVAGKRDELRGLKENVI
# VGRLIPAGTGYAYHQDRMRRRAAGEAPAAPQVTAEDASASLAELLNAGLGGSDNE]
# end gene g2
###
...

The AGAT-fixed file starts like this:

##gff-version 3
# This output was generated with AUGUSTUS (version 3.3.1).
# AUGUSTUS is a gene prediction tool written by M. Stanke (mario.stanke@uni-greifswald.de),
# O. Keller, S. König, L. Gerischer, L. Romoth and Katharina Hoff.
# Please cite: Mario Stanke, Mark Diekhans, Robert Baertsch, David Haussler (2008),
# Using native and syntenically mapped cDNA alignments to improve de novo gene finding
# Bioinformatics 24: 637-644, doi 10.1093/bioinformatics/btn013
# No extrinsic information on sequences given.
# Initialising the parameters using config directory /opt/biotools/Augustus/config/ ...
# E_coli_K12 version. Using species specific transition matrix: /opt/biotools/Augustus/config/species/E_coli_K12/E_coli_K12_trans_shadow_bacterium.pbl
# Using species specific overlap length distribution: /opt/biotools/Augustus/config/species/E_coli_K12/E_coli_K12_ovlp_len.pbl
# admissible start codons and their probabilities: ATA(0), ATC(0), ATG(0.915), ATT(0), CTG(0.000562), GTG(0.0703), TTG(0.0141)
# Looks like job-133-85882ac6-9f24-45f0-ae08-edbb6552e6b7-file.fasta is in fasta format.
# We have hints for 0 sequences and for 0 of the sequences in the input set.
#
# ----- prediction on sequence number 1 (length = 4673810, name = 000000F|arrow) -----
#
# Predicted genes for sequence number 1 on both strands
# start gene g1
000000F|arrow   AUGUSTUS        gene    83      2383    0.97    +       .       ID=g1
000000F|arrow   AUGUSTUS        transcript      83      2383    0.97    +       .       ID=g1.t1;Parent=g1
000000F|arrow   AUGUSTUS        exon    83      2383    0.97    +       .       ID=nbis_NEW-exon-3283;Parent=g1.t1
000000F|arrow   AUGUSTUS        CDS     83      2383    0.97    +       0       ID=g1.t1.cds;Parent=g1.t1
000000F|arrow   AUGUSTUS        start_codon     83      85      .       +       0       ID=start_codon-1;Parent=g1.t1
000000F|arrow   AUGUSTUS        stop_codon      2381    2383    .       +       0       ID=stop_codon-1;Parent=g1.t1
000000F|arrow   AUGUSTUS        gene    2460    6683    1       +       .       ID=g2
000000F|arrow   AUGUSTUS        transcript      2460    6683    1       +       .       ID=g2.t1;Parent=g2
000000F|arrow   AUGUSTUS        exon    2460    6683    1       +       .       ID=nbis_NEW-exon-924;Parent=g2.t1
000000F|arrow   AUGUSTUS        CDS     2460    6683    1       +       0       ID=g2.t1.cds;Parent=g2.t1
000000F|arrow   AUGUSTUS        start_codon     2460    2462    .       +       0       ID=start_codon-2;Parent=g2.t1
000000F|arrow   AUGUSTUS        stop_codon      6681    6683    .       +       0       ID=stop_codon-2;Parent=g2.t1

@Juke34

Juke34 commented Feb 19, 2020

Then it could be due to the empty comment line
#
Could you try removing all lines starting with # before running GAG?
Otherwise, you could use EMBLmyGFF3 to submit via ENA instead of NCBI (the data will end up in the same place in the end). I know EMBLmyGFF3 works fine with Augustus annotations.
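
Dropping the comment lines can be sketched as below. This is a minimal illustration, assuming it is acceptable to keep only the `##gff-version` directive and discard every other line that starts with `#` (including the `###` separators and the Augustus protein-sequence comments). The function name `strip_gff_comments` is hypothetical and is not part of GAG or AGAT.

```python
def strip_gff_comments(lines, keep_version_directive=True):
    """Yield GFF lines with '#' comment lines and blank lines removed.

    The '##gff-version' directive is kept by default; every other line
    starting with '#' is dropped.
    """
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue  # skip blank lines
        if stripped.startswith("#"):
            if keep_version_directive and stripped.startswith("##gff-version"):
                yield line
            continue  # drop all other comment and directive lines
        yield line


# Lines modelled on the Augustus excerpt in this issue.
example = [
    "##gff-version 3\n",
    "# This output was generated with AUGUSTUS (version 3.3.1).\n",
    "#\n",
    "# start gene g1\n",
    "000000F|arrow\tAUGUSTUS\tgene\t83\t2383\t0.97\t+\t.\tID=g1\n",
    "# protein sequence = [MYAQ...]\n",
    "# end gene g1\n",
    "###\n",
]
cleaned = list(strip_gff_comments(example))
print("".join(cleaned), end="")
```

To apply it to a file, pass an open text-mode file object as `lines` and write the yielded lines to a new file.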

@splaisan
Author

Thanks Jacques,

I kept only the ##gff-version header line and removed all other lines matching ^# before applying AGAT, and GAG still fails.
What could be causing this string type error?

EMBLmyGFF3 is Python 2 only, and I cannot install it right now using conda.

python /opt/biotools/GAG/gag.py --fasta job-133-85882ac6-9f24-45f0-ae08-edbb6552e6b7-file.fasta --gff agat.gff3 --out gag_out_agat-gff
Reading fasta...
Done.
Reading gff...
Traceback (most recent call last):
  File "/opt/biotools/GAG/gag.py", line 50, in <module>
    main()
  File "/opt/biotools/GAG/gag.py", line 46, in main
    controller.execute(args)
  File "/opt/biotools/GAG/src/controller.py", line 74, in execute
    self.read_gff(gffpath, out_dir)
  File "/opt/biotools/GAG/src/controller.py", line 286, in read_gff
    genes, comments, invalids, ignored = gffreader.read_file(reader)
  File "/opt/biotools/GAG/src/gff_reader.py", line 336, in read_file
    if len(line) == 0 or line.startswith('#'):
TypeError: startswith first arg must be bytes or a tuple of bytes, not str

##gff-version 3
000003F|arrow   AUGUSTUS        gene    44      1345    1       +       .       ID=g4163
000003F|arrow   AUGUSTUS        transcript      44      1345    1       +       .       ID=g4163.t1;Parent=g4163
000003F|arrow   AUGUSTUS        exon    44      1345    1       +       .       ID=nbis_NEW-exon-3212;Parent=g4163.t1
000003F|arrow   AUGUSTUS        CDS     44      1345    1       +       0       ID=g4163.t1.cds;Parent=g4163.t1
000003F|arrow   AUGUSTUS        start_codon     44      46      .       +       0       ID=start_codon-4163;Parent=g4163.t1
000003F|arrow   AUGUSTUS        stop_codon      1343    1345    .       +       0       ID=stop_codon-4163;Parent=g4163.t1
000003F|arrow   AUGUSTUS        gene    2698    3009    0.76    +       .       ID=g4164
000003F|arrow   AUGUSTUS        transcript      2698    3009    0.76    +       .       ID=g4164.t1;Parent=g4164
000003F|arrow   AUGUSTUS        exon    2698    3009    0.76    +       .       ID=nbis_NEW-exon-711;Parent=g4164.t1
000003F|arrow   AUGUSTUS        CDS     2698    3009    0.76    +       0       ID=g4164.t1.cds;Parent=g4164.t1
000003F|arrow   AUGUSTUS        start_codon     2698    2700    .       +       0       ID=start_codon-4164;Parent=g4164.t1
000003F|arrow   AUGUSTUS        stop_codon      3007    3009    .       +       0       ID=stop_codon-4164;Parent=g4164.t1
000003F|arrow   AUGUSTUS        gene    3206    3394    0.73    +       .       ID=g4165
000003F|arrow   AUGUSTUS        transcript      3206    3394    0.73    +       .       ID=g4165.t1;Parent=g4165
000003F|arrow   AUGUSTUS        exon    3206    3394    0.73    +       .       ID=nbis_NEW-exon-618;Parent=g4165.t1
000003F|arrow   AUGUSTUS        CDS     3206    3394    0.73    +       0       ID=g4165.t1.cds;Parent=g4165.t1
000003F|arrow   AUGUSTUS        start_codon     3206    3208    .       +       0       ID=start_codon-4165;Parent=g4165.t1
000003F|arrow   AUGUSTUS        stop_codon      3392    3394    .       +       0       ID=stop_codon-4165;Parent=g4165.t1
000003F|arrow   AUGUSTUS        gene    3358    3669    0.93    +       .       ID=g4166
000003F|arrow   AUGUSTUS        transcript      3358    3669    0.93    +       .       ID=g4166.t1;Parent=g4166
000003F|arrow   AUGUSTUS        exon    3358    3669    0.93    +       .       ID=nbis_NEW-exon-1365;Parent=g4166.t1
000003F|arrow   AUGUSTUS        CDS     3358    3669    0.93    +       0       ID=g4166.t1.cds;Parent=g4166.t1
000003F|arrow   AUGUSTUS        start_codon     3358    3360    .       +       0       ID=start_codon-4166;Parent=g4166.t1
000003F|arrow   AUGUSTUS        stop_codon      3667    3669    .       +       0       ID=stop_codon-4166;Parent=g4166.t1
000003F|arrow   AUGUSTUS        gene    3702    4157    1       -       .       ID=g4167
000003F|arrow   AUGUSTUS        transcript      3702    4157    1       -       .       ID=g4167.t1;Parent=g4167
000003F|arrow   AUGUSTUS        exon    3702    4157    1       -       .       ID=nbis_NEW-exon-2264;Parent=g4167.t1
000003F|arrow   AUGUSTUS        CDS     3702    4157    1       -       0       ID=g4167.t1.cds;Parent=g4167.t1
000003F|arrow   AUGUSTUS        start_codon     4155    4157    .       -       0       ID=start_codon-4167;Parent=g4167.t1
000003F|arrow   AUGUSTUS        stop_codon      3702    3704    .       -       0       ID=stop_codon-4167;Parent=g4167.t1
000003F|arrow   AUGUSTUS        gene    4301    4492    0.98    -       .       ID=g4168
000003F|arrow   AUGUSTUS        transcript      4301    4492    0.98    -       .       ID=g4168.t1;Parent=g4168
000003F|arrow   AUGUSTUS        exon    4301    4492    0.98    -       .       ID=nbis_NEW-exon-1181;Parent=g4168.t1
000003F|arrow   AUGUSTUS        CDS     4301    4492    0.98    -       0       ID=g4168.t1.cds;Parent=g4168.t1
000003F|arrow   AUGUSTUS        start_codon     4490    4492    .       -       0       ID=start_codon-4168;Parent=g4168.t1
000003F|arrow   AUGUSTUS        stop_codon      4301    4303    .       -       0       ID=stop_codon-4168;Parent=g4168.t1
...
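
On the `TypeError` in the traceback above: in Python 3, that message is what `bytes.startswith` raises when it is given a `str` argument, which suggests that `line` is a `bytes` object at `gff_reader.py` line 336. One possible cause, not confirmed anywhere in this thread, is that GAG is being run under Python 3 while reading the GFF as bytes. The sketch below only reproduces the error on in-memory streams; `count_comment_lines` is a hypothetical helper, not GAG's code, and it reuses the condition shown in the traceback.

```python
import io

gff_text = (
    "##gff-version 3\n"
    "000003F|arrow\tAUGUSTUS\tgene\t44\t1345\t1\t+\t.\tID=g4163\n"
)


def count_comment_lines(stream):
    """Count lines that pass the same test as the line shown in the traceback."""
    return sum(1 for line in stream if len(line) == 0 or line.startswith("#"))


# Text stream: each line is a str, so startswith('#') works.
text_result = count_comment_lines(io.StringIO(gff_text))

# Binary stream: each line is bytes, so startswith('#') raises TypeError in Python 3.
try:
    count_comment_lines(io.BytesIO(gff_text.encode("utf-8")))
    binary_error = None
except TypeError as exc:
    binary_error = str(exc)

print(text_result)
print(binary_error)
```

If this is indeed the cause, the content of the GFF file would not matter, which would be consistent with the error persisting after the comment lines were removed.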

@Juke34

Juke34 commented Feb 20, 2020

EMBLmyGFF3 v2 is in python3

@splaisan
Author

this is not what conda tells me :-)

(agat) u0002316@gbw-s-pacbio01:~$ python --version
Python 3.6.10 :: Anaconda, Inc.

(agat) u0002316@gbw-s-pacbio01:~$ conda install -c bioconda emblmygff3
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: | 
Found conflicts! Looking for incompatible packages.
This can take several minutes.  Press CTRL-C to abort.
failed                                                                                                                                                                              

UnsatisfiableError: The following specifications were found
to be incompatible with the existing python installation in your environment:

Specifications:

  - emblmygff3 -> python[version='<3']

Your python: python=3.6

If python is on the left-most side of the chain, that's the version you've asked for.
When python appears to the right, that indicates that the thing on the left is somehow
not available for the python version you are constrained to. Note that conda will not
change your python version to a different minor version unless you explicitly specify
that.

@Juke34

Juke34 commented Feb 20, 2020

I had just pushed it to Bioconda, so it may not have been on their server yet. I have checked now and it's there.
Let me know if you still don't see it.
