raramayo/Busco_Plot

Busco_Plot

Motivation

This script is adapted from the original code by Mathieu Seppey, available at https://gitlab.com/ezlab/busco/-/tree/master.
It creates figures that display the BUSCO lineage used to process genomic, transcriptomic, or proteomic files.

It can generate figures in 'jpeg', 'png', or 'tiff' format.

Code that was originally spread across several files has been consolidated into a single file for simplicity and convenience.

Documentation

usage: Busco_Plot.v1.0.0.py -wd PATH [-l {Eukaryotic Lineage,Metazoan Lineage}] [-f {jpeg,png,tiff}] [-rt RUN_TYPE] [--no_r] [-q] [-h]

####################################################################################################
ARAMAYO_LAB
BUSCO_Plot

Original Code Link:    https://gitlab.com/ezlab/busco/
Original Code Version: 4.0.0

Licensed under the MIT license.

This program was modified from code initially generated by Evgeny Zdobnov (ez@ezlab.org),
and as such it inherits its original MIT License.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without
even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

See LICENSE.md file for details.

You should have received a copy of the MIT License along with this program. If not,
see: https://opensource.org/license/MIT

Author current release:  Rodolfo Aramayo
                           WORK_EMAIL:     raramayo@tamu.edu
                           PERSONAL_EMAIL: rodolfo@aramayo.org
Author original release: Mathieu Seppey

MODULE__NAME:       Busco_Plot.v1.0.0.py
MODULE_VERSION:     1.0.0
MODULE_SYNOPSIS:    This module produces a graphic summary for BUSCO runs based on short summary files

This tool uses the short summary files produced by the Busco tool (https://busco.ezlab.org/ and https://gitlab.com/ezlab/busco/-/tree/master),
and ggplot2 (2.2.0+) (https://ggplot2.tidyverse.org/), to produce and execute a file containing R code needed to produce a figure in either:
    'jpeg' (default), 'png', or 'tiff' formats.

This tool assumes your system is able to run R. It also uses the following R libraries:

    dplyr (https://cran.r-project.org/web/packages/dplyr/readme/README.html)
    tidyr (https://tidyr.tidyverse.org/)
    forcats (https://forcats.tidyverse.org/)
    Cairo (https://cran.r-project.org/web/packages/Cairo/index.html)

If these libraries are not installed, the tool will attempt to install them.

To use this module, place all BUSCO short summary files (short_summary.[generic|specific].dataset.label.txt) in a single directory, and pass
the PATH of that directory to this module.

The resulting plots will be written in the same directory where the short summary files are present.

You can find both the resulting R script for customisation (if so desired) and the resulting figure in the format requested in the specified directory.

MAIN DEPENDENCY: R (https://www.r-project.org/)
####################################################################################################

required arguments:
  -wd PATH, --working_directory PATH
                        Define the location of the working directory where the Busco files are located
  -l {Eukaryotic Lineage,Metazoan Lineage}, --lineage {Eukaryotic Lineage,Metazoan Lineage}
                        Define the lineage used when running Busco. Default is 'Eukaryotic Lineage'. Choose between 'Eukaryotic Lineage' or 'Metazoan Lineage'.

optional arguments:
  -f {jpeg,png,tiff}, --file_type {jpeg,png,tiff}
                        select the output file format desired: 'jpeg', 'png', or 'tiff'
  -rt RUN_TYPE, --run_type RUN_TYPE
                        type of summary to use, `generic` or `specific`
  --no_r                Do not run R. Only create the R script file in the working directory
  -q, --quiet           Disable the info logs, displays only errors
  -h, --help            Show this help message and exit
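
The interface documented above can be approximated with Python's argparse. The sketch below is not the module's actual source. It mirrors only the flags, choices, and defaults shown in the help text. The function name `build_parser` and the `None` default for `--run_type` are assumptions made for illustration.

```python
import argparse

# Choices and defaults as listed in the help text above.
LINEAGES = ("Eukaryotic Lineage", "Metazoan Lineage")
FILE_TYPES = ("jpeg", "png", "tiff")


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser reproducing the documented options."""
    # add_help=False so that -h can be listed last, as in the help text.
    parser = argparse.ArgumentParser(prog="Busco_Plot.v1.0.0.py", add_help=False)

    required = parser.add_argument_group("required arguments")
    required.add_argument(
        "-wd", "--working_directory", metavar="PATH", required=True,
        help="Directory that holds the BUSCO short summary files",
    )
    required.add_argument(
        "-l", "--lineage", choices=LINEAGES, default=LINEAGES[0],
        help="Lineage used when running BUSCO",
    )

    optional = parser.add_argument_group("optional arguments")
    optional.add_argument(
        "-f", "--file_type", choices=FILE_TYPES, default="jpeg",
        help="Output figure format",
    )
    optional.add_argument(
        "-rt", "--run_type", metavar="RUN_TYPE", default=None,  # default is an assumption
        help="Type of summary to use, `generic` or `specific`",
    )
    optional.add_argument("--no_r", action="store_true",
                          help="Only write the R script, do not run R")
    optional.add_argument("-q", "--quiet", action="store_true",
                          help="Show only errors")
    optional.add_argument("-h", "--help", action="help",
                          help="Show this help message and exit")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Because the lineage names contain a space, they must be quoted on the command line, for example `-l "Metazoan Lineage"`.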

Development/Testing Environment:

Distributor ID:       Apple, Inc.
Description:          Apple M1 Max
Release:              14.4.1
Codename:             Sonoma

Distributor ID:       Ubuntu
Description:          Ubuntu 22.04.3 LTS
Release:              22.04
Codename:             jammy

Required Script Dependencies:

Version Number: R version 4.3.3 (2024-02-29) -- "Angel Food Cake"

R version 4.3.3 (2024-02-29) -- "Angel Food Cake"
Copyright (C) 2024 The R Foundation for Statistical Computing
Platform: aarch64-apple-darwin20 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under the terms of the
GNU General Public License versions 2 or 3.
For more information about these matters see
https://www.gnu.org/licenses/.

Version Number: 3.5.0

Package: ggplot2
Version: 3.5.0
Title: Create Elegant Data Visualisations Using the Grammar of Graphics

Version Number: 1.1.4

Type: Package
Package: dplyr
Title: A Grammar of Data Manipulation
Version: 1.1.4

Version Number: 1.3.1

Package: tidyr
Title: Tidy Messy Data
Version: 1.3.1

Version Number: 1.6.2

Package: Cairo
Version: 1.6-2
Title: R Graphics Device using Cairo Graphics Library for Creating
        High-Quality Bitmap (PNG, JPEG, TIFF), Vector (PDF, SVG,
        PostScript) and Display (X11 and Win32) Output

Version Number: 1.0.0

Package: forcats
Title: Tools for Working with Categorical Variables (Factors)
Version: 1.0.0
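
Before running the module, it can help to confirm that the working directory actually contains short summary files and that R is reachable. The following is a minimal pre-flight sketch, not part of Busco_Plot itself. It assumes the file naming pattern given in the documentation (short_summary.[generic|specific].dataset.label.txt) and assumes that R is started through an `Rscript` executable on the PATH. All function names are hypothetical.

```python
import shutil
from pathlib import Path


def find_summaries(working_directory, run_type=None):
    """Return the short summary files in a directory, sorted by name.

    run_type may be 'generic' or 'specific'; None matches both.
    """
    kind = run_type if run_type else "*"
    pattern = f"short_summary.{kind}.*.*.txt"
    return sorted(Path(working_directory).glob(pattern))


def split_name(path):
    """Split a summary file name into (run_type, dataset, label).

    Assumes the layout short_summary.<run_type>.<dataset>.<label>.txt,
    where only the label may itself contain dots.
    """
    name = Path(path).name
    stem = name[:-4] if name.endswith(".txt") else name
    _, run_type, dataset, label = stem.split(".", 3)
    return run_type, dataset, label


def r_available(executable="Rscript"):
    """True if the given R executable is found on the PATH (assumed name)."""
    return shutil.which(executable) is not None


if __name__ == "__main__":
    for summary in find_summaries("."):
        print(split_name(summary))
    if not r_available():
        print("Rscript not found; consider running with --no_r")
```

If R is missing, the documented `--no_r` option still produces the R script, which can then be run on another machine.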
