Understand the meaning of Espresso SJ output #37
Those files are only intended to be useful as intermediate files for ESPRESSO itself to use, but if you find them useful that's great
Thank you very much Eric for your answers! It is very helpful.
Best regards,
Junjie
On Wed, Oct 4, 2023 at 1:32 PM Eric Kutschera wrote:
{chr}_SJ_simplified_list is written here:
https://github.com/Xinglab/espresso/blob/v1.3.2/src/ESPRESSO_S.pl#L547
The format is the SJ_cluster line:
SJ_cluster {group_number} {sort_index} {other_sort_index} {chr} {cluster_start_coord} {cluster_end_coord}
And then 1 line per SJ in that cluster:
{group_number} {chr}:{SJ_start_coord}:{SJ_end_coord}:{strand} {SJ_start_coord} {SJ_end_coord} {strand} {number_of_perfect_read} {number_of_reads} {1st_2_nt_in_intron} {last_2_nt_in_intron} {enum} {is_putative} {is_annotated} {is_high_confidence} {sort_index}
A perfect read for a splice junction has no mismatches, insertions, or deletions around the SJ. The {enum} is: 2 -> annotated, 1 -> strand determined based on 1st and last 2 nt, 0 -> strand not determined. is_putative is 1 if the SJ was seen in the input alignments.
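As a rough illustration of the layout described above, here is a minimal Python sketch that reads a {chr}_SJ_simplified_list file into clusters. It is not part of ESPRESSO: the names `SJCluster`, `SpliceJunction`, `STRAND_SOURCE`, and `parse_sj_simplified_list` are made up for this example, and it assumes the fields are separated by whitespace. The yes/no style flags are kept as text because the sample rows in this issue show "yes"/"no" in some of those columns.

```python
from dataclasses import dataclass, field
from typing import Iterable, List

# Meaning of the {enum} column, taken from the description in this thread.
STRAND_SOURCE = {
    2: "annotated",
    1: "strand determined from first and last 2 nt of the intron",
    0: "strand not determined",
}


@dataclass
class SpliceJunction:
    group_number: int
    sj_id: str              # {chr}:{SJ_start_coord}:{SJ_end_coord}:{strand}
    start: int
    end: int
    strand: str
    perfect_reads: int
    total_reads: int
    intron_first_2nt: str
    intron_last_2nt: str
    strand_source: int      # the {enum} column
    is_putative: str        # flags kept as raw text (assumption: may be yes/no or 1/0)
    is_annotated: str
    is_high_confidence: str
    sort_index: int


@dataclass
class SJCluster:
    group_number: int
    sort_index: int
    other_sort_index: int
    chrom: str
    start: int
    end: int
    junctions: List[SpliceJunction] = field(default_factory=list)


def parse_sj_simplified_list(lines: Iterable[str]) -> List[SJCluster]:
    """Group SJ lines under the SJ_cluster line that precedes them."""
    clusters: List[SJCluster] = []
    for line in lines:
        f = line.split()  # assumption: whitespace-separated fields
        if not f:
            continue
        if f[0] == "SJ_cluster":
            if len(f) != 7:
                raise ValueError(f"unexpected SJ_cluster line: {line!r}")
            clusters.append(
                SJCluster(int(f[1]), int(f[2]), int(f[3]), f[4], int(f[5]), int(f[6]))
            )
        else:
            if len(f) != 14:
                raise ValueError(f"unexpected SJ line: {line!r}")
            if not clusters:
                raise ValueError("SJ line found before any SJ_cluster line")
            clusters[-1].junctions.append(
                SpliceJunction(
                    int(f[0]), f[1], int(f[2]), int(f[3]), f[4],
                    int(f[5]), int(f[6]), f[7], f[8], int(f[9]),
                    f[10], f[11], f[12], int(f[13]),
                )
            )
    return clusters
```

For example, `parse_sj_simplified_list(open("chr1_SJ_simplified_list"))` would return one `SJCluster` per SJ_cluster line, and `STRAND_SOURCE[sj.strand_source]` gives a readable label for the {enum} value of a junction.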
SJ_group_all.fa is written here:
https://github.com/Xinglab/espresso/blob/v1.3.2/src/ESPRESSO_S.pl#L554
The format is 1 line to describe the SJ:
>{chr}:{SJ_start_coord}:{SJ_end_coord}:{strand} SJclst:{sort_index}: group:{group_number}:
and the next line is the genomic sequence: 25nt leading up to the SJ and 25nt after the SJ.
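A small Python sketch of reading those header/sequence pairs is shown here. It is only an illustration: `SJSequence` and `parse_sj_group_fasta` are hypothetical names, the header regular expression is written from the format described above (not taken from the ESPRESSO source), and it assumes that {sort_index} and {group_number} are integers and that each header is followed by exactly one sequence line.

```python
import re
from typing import Iterable, Iterator, NamedTuple


class SJSequence(NamedTuple):
    chrom: str
    start: int
    end: int
    strand: str
    sort_index: int
    group_number: int
    sequence: str  # flanking genomic sequence, returned as-is


# >{chr}:{SJ_start_coord}:{SJ_end_coord}:{strand} SJclst:{sort_index}: group:{group_number}:
_HEADER = re.compile(
    r"^>(?P<chrom>\S+):(?P<start>\d+):(?P<end>\d+):(?P<strand>[^:\s]+)"
    r"\s+SJclst:(?P<sort_index>-?\d+):"
    r"\s+group:(?P<group>-?\d+):\s*$"
)


def parse_sj_group_fasta(lines: Iterable[str]) -> Iterator[SJSequence]:
    """Yield one record per header line and the sequence line after it."""
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = _HEADER.match(line)
            if header is None:
                raise ValueError(f"unrecognized header: {line!r}")
        elif header is not None:
            yield SJSequence(
                header["chrom"],
                int(header["start"]),
                int(header["end"]),
                header["strand"],
                int(header["sort_index"]),
                int(header["group"]),
                line,
            )
            header = None  # assumption: one sequence line per header
        else:
            raise ValueError(f"sequence line without a header: {line!r}")
```

Calling `list(parse_sj_group_fasta(open("SJ_group_all.fa")))` would then give one record per junction, which can be matched to the SJ_simplified_list entries through the chromosome, coordinates, and strand.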
sj.list is written here:
https://github.com/Xinglab/espresso/blob/v1.3.2/src/ESPRESSO_S.pl#L880
The format is:
{group_number} {chr}:{SJ_start_coord}:{SJ_end_coord} {chr} {SJ_start_coord} {SJ_end_coord} {number_of_perfect_reads} {number_of_total_reads} {comma_separated_list_of_perfect_read_IDs_for_this_SJ} {comma_separated_list_of_all_read_IDs_for_this_SJ}
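To make the column order concrete, here is a hedged Python sketch that parses one sj.list line. `SJListEntry` and `parse_sj_list_line` are names invented for this example. The sketch assumes the fields are tab-separated in the real file (it falls back to splitting on any whitespace for text pasted without tabs, which only works when both read ID lists are non-empty), and it drops the trailing comma seen in the read ID lists of the sample rows in this issue.

```python
from typing import List, NamedTuple


class SJListEntry(NamedTuple):
    group_number: int
    sj_id: str                  # {chr}:{SJ_start_coord}:{SJ_end_coord}
    chrom: str
    start: int
    end: int
    perfect_read_count: int
    total_read_count: int
    perfect_read_ids: List[str]
    all_read_ids: List[str]


def _split_ids(text: str) -> List[str]:
    """Split a comma-separated ID list, ignoring empty items (e.g. after a trailing comma)."""
    return [item.strip() for item in text.split(",") if item.strip()]


def parse_sj_list_line(line: str) -> SJListEntry:
    line = line.rstrip("\r\n")
    # Assumption: tab-separated in the file; whitespace fallback for pasted text.
    fields = line.split("\t") if "\t" in line else line.split()
    if len(fields) != 9:
        raise ValueError(f"expected 9 fields, got {len(fields)}: {line!r}")
    return SJListEntry(
        int(fields[0]),
        fields[1],
        fields[2],
        int(fields[3]),
        int(fields[4]),
        int(fields[5]),
        int(fields[6]),
        _split_ids(fields[7]),
        _split_ids(fields[8]),
    )
```

With this, `[parse_sj_list_line(l) for l in open("sj.list") if l.strip()]` gives per-sample junction records, and the read ID lists can be used to look up which reads support each junction.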
Among the output files, there are a couple of types of splice junction files. Could you please explain the columns of the files that I listed below? If possible, could you also tell me a little about how these files were generated and what they could be used for? I am sorry for asking such basic questions, but I am trying to make full use of the ESPRESSO output. Thank you!
For example, chr1_SJ_simplified_list:
SJ_cluster 11475 0 0 chr1 3492124 3740774
11475 chr1:3492124:3740774:1 3492124 3740774 1 0 0 TBD TBD 2 no yes 1 0
SJ_cluster 11476 0 0 chr1 3492124 3740774
11476 chr1:3492124:3740774:1 3492124 3740774 1 0 0 TBD TBD 2 no yes 1 0
SJ_cluster 11477 0 0 chr1 3492124 3740774
11477 chr1:3492124:3740774:1 3492124 3740774 1 0 0 TBD TBD 2 no yes 1 0
SJ_cluster 11478 0 0 chr1 4562891 4563322
11478 chr1:4562891:4563322:1 4562891 4563322 1 1 1 CT AC 2 yes yes 1 0
SJ_cluster 11478 1 1 chr1 4562891 4563994
11478 chr1:4562891:4563994:1 4562891 4563994 1 0 0 TBD TBD 2 no yes 1 1
SJ_group_all.fa
Then in each sample folder (if I have multiple samples), there is
sj.list
1 chr12:72831310:72833445 chr12 72831310 72833445 1 1 m64060_200922_102352/3/ccs, m64060_200922_102352/3/ccs,
1 chr12:72837515:72839551 chr12 72837515 72839551 1 1 m64060_200922_102352/3/ccs, m64060_200922_102352/3/ccs,
1 chr12:72808405:72830456 chr12 72808405 72830456 1 1 m64060_200922_102352/3/ccs, m64060_200922_102352/3/ccs,
1 chr12:72839609:72840485 chr12 72839609 72840485 1 1 m64060_200922_102352/3/ccs, m64060_200922_102352/3/ccs,
1 chr12:72833563:72837406 chr12 72833563 72837406 1 1 m64060_200922_102352/3/ccs, m64060_200922_102352/3/ccs,