mikado serialise: sqlalchemy.exc.InvalidRequestError: No transaction is begun. #206

alelim-bio · 2019-08-05T22:45:31Z

I have moved onto mikado serialise and have ran into another error. I was wondering if I could get your help?

Below is the error that I am currently getting.

2019-08-05 18:12:49,628 - main - __init__.py:123 - ERROR - main - MainProcess - Mikado crashed, cause:
2019-08-05 18:12:49,629 - main - __init__.py:124 - ERROR - main - MainProcess - No transaction is begun.
Traceback (most recent call last):
  File "/pylon5/mc5fr6p/alelim/CONDA/anaconda3/envs/bio/lib/python3.6/site-packages/Mikado/__init__.py", line 109, in main
    args.func(args)
  File "/pylon5/mc5fr6p/alelim/CONDA/anaconda3/envs/bio/lib/python3.6/site-packages/Mikado/subprograms/serialise.py", line 336, in serialise
    load_orfs(args, logger)
  File "/pylon5/mc5fr6p/alelim/CONDA/anaconda3/envs/bio/lib/python3.6/site-packages/Mikado/subprograms/serialise.py", line 145, in load_orfs
    serializer()
  File "/pylon5/mc5fr6p/alelim/CONDA/anaconda3/envs/bio/lib/python3.6/site-packages/Mikado/serializers/orf.py", line 317, in __call__
    self.serialize()
  File "/pylon5/mc5fr6p/alelim/CONDA/anaconda3/envs/bio/lib/python3.6/site-packages/Mikado/serializers/orf.py", line 291, in serialize
    self.session.commit()
  File "/pylon5/mc5fr6p/alelim/CONDA/anaconda3/envs/bio/lib/python3.6/site-packages/sqlalchemy/orm/session.py", line 1024, in commit
    raise sa_exc.InvalidRequestError("No transaction is begun.")
sqlalchemy.exc.InvalidRequestError: No transaction is begun.

Here is the command I am currently using:

mikado serialise --json-conf configuration.yaml --xml mikado.blast.xml.gz --orfs ../stringtie/longest_orfs.pep.transdecoder.bed

Additionally, I have attached a toy dataset of what I am working with.

Kind Regards,

Alex Lim

Question unrelated to the error if you have some time:

I was wondering if it was possible to learn what was done to generate the .gff3 file from the Trinity transcripts using GMAP. If my understanding is correct usually you get a standard alignment file in (.sam/.bam) format?

toy_bed.zip

The text was updated successfully, but these errors were encountered:

lucventurini · 2019-08-06T09:18:49Z

Hi @AsclepiusDoc, many thanks for reporting this. I will try to replicate and solve it ASAP.

lucventurini · 2019-08-06T14:56:06Z

Dear @AsclepiusDoc , I was able to reproduce the bug. I will fix it today.
However, the bug was triggered because you provided Mikado serialise with the wrong input file - ie ORFs called on the input transcript assemblies for Mikado prepare (StringTie, if I am not mistaken) rather than the output of Mikado prepare itself.

To avoid this, I would recommend using the pipeline manager we developed - Daijin, included in Mikado - to perform the analysis on cotton. Specifically, Daijin can be used to run all the mikado steps.

In order:
1- run mikado configure, as you did before, but adding the --daijin switch.
a- if you want to use TransDecoder instead of Prodigal, please add --use-transdecoder to the command line
b- if you want to use BLAST+ instead of DIAMOND, please add --use-blast to the command line
2- run daijin to perform the Mikado pipeline:

daijin mikado --jobs {maximum number of jobs} --cores {# of cores} --threads {# of cores} -nd --nolock configuration.yaml

This will ensure that the steps are executed correctly.
Further information can be found on the online documentation.

Thank you for reporting the bug!

lucventurini · 2019-08-06T15:09:13Z

PS:

I was wondering if it was possible to learn what was done to generate the .gff3 file from the Trinity transcripts using GMAP. If my understanding is correct usually you get a standard alignment file in (.sam/.bam) format?

gmap can return a GFF3 output if called with the -f 2 switch.

…d by @AsclepiusDoc will raise an error.

lucventurini · 2019-08-07T09:51:24Z

Dear @AsclepiusDoc , now Mikado will die informatively when a situation like yours arises; this should make it easier to spot a problem in the pipeline's usage.

The rationale is that if Mikado had not died during serialisation, the data present in the database would have been useless - Mikado would have had no way to link the ORFs with the transcripts present in the GTF (as their names change due to the juxtaposition of a label).

…e log for Mikado serialise, and 09b97aa

alelim-bio · 2019-08-07T17:28:19Z

Hello @lucventurini ,

Thank you so much for your help! That did the trick and I was able to get Mikado outputs. Additionally, I followed your suggestion and was able to get a Trinity output for Mikado however, I seem to have ran into another parsing issue with the Trinity .gff3 file. I have attached the error here if that is okay?

Error:

2019-08-07 13:05:09,936 - prepare - prepare.py:496 - ERROR - prepare - MainProcess - tr_TRINITY_GG_1628_c45_g1_i1.mrna1 is present multiple times in /pylon5/mc5fr6p/alelim/ACTIVE_ANALYSIS/cotton/transcript_assembly/SA_1766/genome_guided_trinity/gmap_alignment/cleaned_trinity_gmap.gff3(label: tr). This breaks the input parsing.
        Please ensure that the file is properly formatted, with unique IDs.
Traceback (most recent call last):
  File "/pylon5/mc5fr6p/alelim/CONDA/anaconda3/envs/bio/lib/python3.6/site-packages/Mikado/preparation/prepare.py", line 466, in prepare
    max_intron=args.json_conf["prepare"]["max_intron_length"],)
  File "/pylon5/mc5fr6p/alelim/CONDA/anaconda3/envs/bio/lib/python3.6/site-packages/Mikado/preparation/prepare.py", line 373, in load_exon_lines
    _load_exon_lines_single_thread(args, shelve_names, logger, min_length, strip_cds, max_intron)
  File "/pylon5/mc5fr6p/alelim/CONDA/anaconda3/envs/bio/lib/python3.6/site-packages/Mikado/preparation/prepare.py", line 270, in _load_exon_lines_single_thread
    strand_specific=strand_specific or is_reference)
  File "/pylon5/mc5fr6p/alelim/CONDA/anaconda3/envs/bio/lib/python3.6/site-packages/Mikado/preparation/annotation_parser.py", line 438, in load_from_gff
    __raise_invalid(tid, gff_handle.name, label)
  File "/pylon5/mc5fr6p/alelim/CONDA/anaconda3/envs/bio/lib/python3.6/site-packages/Mikado/preparation/annotation_parser.py", line 169, in __raise_invalid
    "(label: {0})".format(label) if label != '' else ""))
Mikado.exceptions.InvalidAssembly: tr_TRINITY_GG_1628_c45_g1_i1.mrna1 is present multiple times in /pylon5/mc5fr6p/alelim/ACTIVE_ANALYSIS/cotton/transcript_assembly/SA_1766/genome_guided_trinity/gmap_alignment/cleaned_trinity_gmap.gff3(label: tr). This breaks the input parsing.
        Please ensure that the file is properly formatted, with unique IDs.

Additionally, I grepped the string that was reported to be non-unique and got the following results:

10000001        SA1766  mRNA    38780180        40626956        .       -       .       ID=TRINITY_GG_1628_c45_g1_i1.mrna1;Name=TRINITY_GG_1628_c45_g1_i1;Parent=TRINITY_GG_1628_c45_g1_i1.path1;coverage=100.0;identity=99.6;matches=232;mismatches=1;indels=0;unknowns=0
10000001        SA1766  exon    40626876        40626956        100     -       .       ID=TRINITY_GG_1628_c45_g1_i1.mrna1.exon1;Name=TRINITY_GG_1628_c45_g1_i1;Parent=TRINITY_GG_1628_c45_g1_i1.mrna1;Target=TRINITY_GG_1628_c45_g1_i1 81 1 .
10000001        SA1766  exon    38785950        38785957        100     -       .       ID=TRINITY_GG_1628_c45_g1_i1.mrna1.exon2;Name=TRINITY_GG_1628_c45_g1_i1;Parent=TRINITY_GG_1628_c45_g1_i1.mrna1;Target=TRINITY_GG_1628_c45_g1_i1 97 90 .
10000001        SA1766  exon    38781015        38781024        100     -       .       ID=TRINITY_GG_1628_c45_g1_i1.mrna1.exon3;Name=TRINITY_GG_1628_c45_g1_i1;Parent=TRINITY_GG_1628_c45_g1_i1.mrna1;Target=TRINITY_GG_1628_c45_g1_i1 107 98 .
1       SA1766  exon    38780180        38780313        99      -       .       ID=TRINITY_GG_1628_c45_g1_i1.mrna1.exon4;Name=TRINITY_GG_1628_c45_g1_i1;Parent=TRINITY_GG_1628_c45_g1_i1.mrna1;Target=TRINITY_GG_1628_c45_g1_i1 243 110 .
10000001        SA1766  CDS     40626876        40626913        100     -       0       ID=TRINITY_GG_1628_c45_g1_i1.mrna1.cds1;Name=TRINITY_GG_1628_c45_g1_i1;Parent=TRINITY_GG_1628_c45_g1_i1.mrna1;Target=TRINITY_GG_1628_c45_g1_i1 81 44 .
10000001        SA1766  CDS     38785950        38785957        100     -       1       ID=TRINITY_GG_1628_c45_g1_i1.mrna1.cds2;Name=TRINITY_GG_1628_c45_g1_i1;Parent=TRINITY_GG_1628_c45_g1_i1.mrna1;Target=TRINITY_GG_1628_c45_g1_i1 97 90 .
10000001        SA1766  CDS     38781015        38781024        100     -       0       ID=TRINITY_GG_1628_c45_g1_i1.mrna1.cds3;Name=TRINITY_GG_1628_c45_g1_i1;Parent=TRINITY_GG_1628_c45_g1_i1.mrna1;Target=TRINITY_GG_1628_c45_g1_i1 107 98 .
10000001        SA1766  CDS     38780296        38780313        100     -       0       ID=TRINITY_GG_1628_c45_g1_i1.mrna1.cds4;Name=TRINITY_GG_1628_c45_g1_i1;Parent=TRINITY_GG_1628_c45_g1_i1.mrna1;Target=TRINITY_GG_1628_c45_g1_i1 127 110 .

Furthermore, I will look into the Daijin pipeline as you have recommended. Once again thank you for all your help!

Kind Regards,

Alex Lim

lucventurini · 2019-08-07T23:55:44Z

Dear @AsclepiusDoc, the problem is that one of the lines is on chromosome 1 (rather than 10000001). That is clearly a mistake on GMAP part, as the culprit exon is actually missing from the correct chromosome (it's the starting exon of the model).

Mikado presumes that models should belong to the same scaffold /chromosome, so this is flagged as a clear a mistake. I do not know why it happened; it looks like a bug in GMAP. I would try a different version of the program to see whether the bug represents itself again.

Glad to hear that my suggestions solved the problem for you!

* Fix EI-CoreBioinformatics#206. Now Mikado serialise will raise an error and exit when trying to add ORFs for transcripts that have not been serialised (if any has already). This will avoid a common mistake (see in EI-CoreBioinformatics#206) whereupon experimenters will try to use ORFs called on Mikado prepare's **inputs** rather than its output.

lucventurini self-assigned this Aug 6, 2019

lucventurini added the bug label Aug 6, 2019

lucventurini added this to the 2.0 milestone Aug 6, 2019

lucventurini mentioned this issue Aug 6, 2019

Update documentation #136

Closed

lucventurini added a commit to lucventurini/mikado that referenced this issue Aug 7, 2019

Fix EI-CoreBioinformatics#206. Now situations like the one encountere…

09b97aa

…d by @AsclepiusDoc will raise an error.

lucventurini added a commit to lucventurini/mikado that referenced this issue Aug 7, 2019

Fix EI-CoreBioinformatics#206, a previously unknown bug on setting th…

00754ad

…e log for Mikado serialise, and 09b97aa

lucventurini added a commit to lucventurini/mikado that referenced this issue Aug 7, 2019

Fix EI-CoreBioinformatics#206 and previous commit 00754ad

64cb75d

lucventurini closed this as completed in cc1960b Aug 7, 2019

lucventurini added this to Closed in Version 2 Oct 15, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

mikado serialise: sqlalchemy.exc.InvalidRequestError: No transaction is begun. #206

mikado serialise: sqlalchemy.exc.InvalidRequestError: No transaction is begun. #206

alelim-bio commented Aug 5, 2019 •

edited

Loading

lucventurini commented Aug 6, 2019

lucventurini commented Aug 6, 2019

lucventurini commented Aug 6, 2019

lucventurini commented Aug 7, 2019

alelim-bio commented Aug 7, 2019 •

edited

Loading

lucventurini commented Aug 7, 2019

mikado serialise: sqlalchemy.exc.InvalidRequestError: No transaction is begun. #206

mikado serialise: sqlalchemy.exc.InvalidRequestError: No transaction is begun. #206

Comments

alelim-bio commented Aug 5, 2019 • edited Loading

lucventurini commented Aug 6, 2019

lucventurini commented Aug 6, 2019

lucventurini commented Aug 6, 2019

lucventurini commented Aug 7, 2019

alelim-bio commented Aug 7, 2019 • edited Loading

lucventurini commented Aug 7, 2019

alelim-bio commented Aug 5, 2019 •

edited

Loading

alelim-bio commented Aug 7, 2019 •

edited

Loading