Code and data associated with the PastDB publication (Martin et al, 2021).

vastdb-pastdb/pastdb

PastDB scripts and associated data

Code and data associated with the PastDB website and publication (Martín et al., Genome Biol 2021). For any further enquiries, please feel free to contact Manuel Irimia (mirimia@gmail.com) and/or Guiomar Martín (guiomarm@igc.gulbenkian.pt). Additional information can be found in PastDB.

Full citation: Martín, G., Márquez, Y., Duque, P., Irimia, M. (2021). Alternative splicing landscapes in Arabidopsis thaliana across tissues and stress conditions highlight major functional differences with animals. Genome Biol, 22:35.


  • Scripts in bin (all Perl scripts include a help option explaining how to run them, as well as internal comments):

    • Get_Event_Stats.pl: to calculate general statistics per AS event from any INCLUSION table.
    • Get_PanAS_Events.pl: to define PanAS events from any INCLUSION table.
    • Get_Stress_Cores.pl: to get abiotic and biotic stress AS core sets, as well as the associated control sets.
    • Get_Tissue_Specific_AS.pl: to get tissue-specific AS events from any INCLUSION table.
    • Get_Tissue_Specific_GE.pl: to get genes with tissue-specific expression from any cRPKM/TPM table.
    • Quantify_AS_by_Subsampling.pl: to calculate the fraction of genes that are alternatively spliced, by event type, from any INCLUSION table.
    • Get_Plots_Stress_vs_Tissues.R: used to plot Figure 5c (comparing stress vs tissue AS contributions in the four species).
    • Calculate_SS_SCORES_From_PWMs.R: to calculate PWM-based splice site scores.
    • Pipeline_Get_Chain_Aln.sh: bash pipeline to obtain liftOver files.
    • Get_Results_From_Liftover.pl: used to parse the pairwise liftOver outputs.
    • Get_Results_From_ExOrthist.pl: used to perform the 4-way overlap between core AS sets.
  • Files from PastDB: the main data files used for the analyses are available for download in PastDB, and are also copied here:

  • Files in data/ folder:

    • General files:

      • AllEvents_for_comparison-Ath.txt.gz (1.3M)
      • Ath.Event-Gene.IDs.txt (9.7M)
      • Stress_vs_Tissues-input_table.tab (5.1M)
    • Splice sites to calculate SS scores based on PWMs:

      • Annotated_ACCEPTORS-Ath.fasta.gz (1.2M)
      • Annotated_DONORS-Ath.fasta.gz (615K)
      • REFERENCE-ALL_ANNOT-Ath163-3ss.fasta.gz (3.6M)
      • REFERENCE-ALL_ANNOT-Ath163-5ss.fasta.gz (1.8M)
    • Events lifted to Brassicaceae species, by event type:

      • EX-Ath-to-Aal-FILTERED.tab.gz (803K)
      • EX-Ath-to-Aly-FILTERED.tab.gz (920K)
      • EX-Ath-to-Bra-FILTERED.tab.gz (766K)
      • EX-Ath-to-Csa-FILTERED.tab.gz (892K)
      • INT-Ath-Aal-FILTERED.tab.gz (498K)
      • INT-Ath-Aly-FILTERED.tab.gz (1.0M)
      • INT-Ath-Bra-FILTERED.tab.gz (573K)
      • INT-Ath-Csa-FILTERED.tab.gz (941K)
      • ALTA-Ath-to-Aal-FILTERED.tab.gz (459K)
      • ALTA-Ath-to-Aly-FILTERED.tab.gz (648K)
      • ALTA-Ath-to-Bra-FILTERED.tab.gz (412K)
      • ALTA-Ath-to-Csa-FILTERED.tab.gz (580K)
      • ALTD-Ath-to-Aal-FILTERED.tab.gz (245K)
      • ALTD-Ath-to-Aly-FILTERED.tab.gz (356K)
      • ALTD-Ath-to-Bra-FILTERED.tab.gz (216K)
      • ALTD-Ath-to-Csa-FILTERED.tab.gz (315K)
    • Gene and exon orthology clusters:

      • gene_cluster_file-araTha10_ce11_dm6_hg38.gz (286K)
      • EX_clusters-int2b.tab (3.3M)
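
For readers unfamiliar with PWM-based splice site scores (the quantity computed by Calculate_SS_SCORES_From_PWMs.R), the general idea can be sketched as follows. This is only an illustrative Python sketch, not the code in this repository: the function names, the pseudocount, the uniform background and the toy donor sequences are all assumptions made for the example, and the actual R script may use different windows, backgrounds or normalisation.

```python
import math
from collections import Counter

BASES = "ACGT"


def build_pwm(sequences, pseudocount=1.0):
    """Build a position weight matrix of log2-odds scores.

    Assumptions (not taken from the repository): a uniform 0.25
    background and an additive pseudocount per base.
    """
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")
    pwm = []
    for pos in range(length):
        counts = Counter(seq[pos].upper() for seq in sequences)
        total = sum(counts[b] for b in BASES) + pseudocount * len(BASES)
        column = {
            b: math.log2(((counts[b] + pseudocount) / total) / 0.25)
            for b in BASES
        }
        pwm.append(column)
    return pwm


def score_site(pwm, site):
    """Sum the per-position log-odds; ambiguous bases (e.g. N) add 0."""
    if len(site) != len(pwm):
        raise ValueError("site length must match PWM length")
    return sum(col.get(base, 0.0) for col, base in zip(pwm, site.upper()))


# Hypothetical toy donor sites (made up for the example, not real data).
toy_donors = ["CAGGTAAGT", "AAGGTGAGT", "CAGGTAAGA", "TAGGTATGT"]
pwm = build_pwm(toy_donors)

consensus_score = score_site(pwm, "CAGGTAAGT")
mutant_score = score_site(pwm, "CACCTAAGT")  # GT dinucleotide context disrupted
print(consensus_score > mutant_score)
```

In practice, the reference sequences would come from files such as the REFERENCE-ALL_ANNOT FASTA files listed above, and the sites to score from the Annotated_ACCEPTORS/DONORS files; how those files are windowed and parsed is defined by the R script itself.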
