Java utilities for Bioinformatics
Pierre Lindenbaum PhD
http://plindenbaum.blogspot.com
see Cite
## Tools
| Tool | Description |
|---|---|
| SplitBam | Split a BAM by chromosome group. Creates EMPTY BAMs if no reads were found for a given group. |
| SamJS | Filtering a SAM/BAM with javascript (rhino). |
| VCFFilterJS | Filtering a VCF with javascript (rhino) |
| SortVCFOnRef | Sort a VCF using the order of the chromosomes in a REFerence index. |
| Illuminadir | Create a structured (**JSON** or **XML**) representation of a directory containing some Illumina FASTQs. |
| BamStats04 | Coverage statistics for a BED file. It uses the Cigar string instead of the start/end to compute the coverage |
| BamStats05 | Same as BamStats04 but grouped by gene. |
| BamStats01 | Statistics about the reads in a BAM. |
| VCFBed | Annotate a VCF with the content of a BED file indexed with tabix. |
| VCFPolyX | Number of repeated REF bases around POS. |
| VCFBigWig | Annotate a VCF with the data of a bigwig file. |
| VCFTabixml | Annotate a value from a vcf+xml file. The 4th column of the BED indexed with TABIX is an XML string. |
| GroupByGene | Group VCF data by gene/transcript. |
| VCFPredictions | Basic variant prediction using UCSC knownGenes. |
| FindCorruptedFiles | Reads filenames from stdin and prints corrupted NGS files (VCF/BAM/FASTQ). |
| VCF2XML | Transforms a VCF to XML. |
| VCFAnnoBam | Annotate a VCF with the coverage statistics of a BAM file + BED file of capture. It uses the Cigar string instead of the start/end to get the coverage. |
| VCFTrio | Check for mendelian incompatibilities in a VCF. |
| SamGrep | Search reads in a BAM |
| VCFFixIndels | Fix samtools INDELS for @SolenaLS |
| NgsFilesSummary | Scan folders and generate a summary of the files (SAMPLE/BAM, SAMPLE/VCF, etc.). |
| NoZeroVariationVCF | creates a VCF containing one fake variation if the input is empty. |
| HowManyBamDict | for @abinouze : quickly find the number of distinct BAM Dictionaries from a set of BAM files. |
| ExtendBed | Extends a BED file by 'X' bases. |
| CmpBams | Compare two or more BAMs. |
| IlluminaFastqStats | Statistics on Illumina Fastqs |
| Bam2Raster | Save a BAM alignment as a PNG image. |
| VcfRebase | Finds restriction sites overlapping variants in a VCF file |
| FastqRevComp | Reverse complement a FASTQ file for mate-pair alignment |
| PicardMetricsToXML | Convert Picard metrics files to XML. |
| Bam2Wig | Bam to Wiggle converter |
| TViewWeb | CGI/Web based version of samtools tview |
| VcfRegistryWeb | CGI/Web tool printing all variants at a given position for a collection of VCFs |
| BlastMapAnnots | Maps uniprot/genbank annotations on a blast result. See http://www.biostars.org/p/76056 |
| VcfViewGui | Simple java-Swing-based VCF viewer. |
| BamViewGui | Simple java-Swing-based BAM viewer. |
| Biostar81455 | Defining precisely the genomic context based on a position http://www.biostars.org/p/81455/ |
| MapUniProtFeatures | map Uniprot features on reference genome. |
| Biostar86363 | Set genotype of specific sample/genotype comb to unknown in multisample vcf file. |
| FixVCF | Fix a VCF HEADER when I forgot to declare a FILTER or an INFO field in the HEADER |
| Biostar78400 | Add the read group info to the sam file on a per lane basis |
| Biostar78285 | Extract regions of genome that have 0 coverage See http://www.biostars.org/p/78285/ |
| Biostar77288 | Low resolution sequence alignment visualization http://www.biostars.org/p/77288/ |
| Biostar77828 | Divide the human genome among X cores, taking into account gaps See http://www.biostars.org/p/77828/ |
| Biostar76892 | Fix strand of two paired reads close but on the same strand http://www.biostars.org/p/76892/ |
| VCFCompareGT | VCF : compare genotypes of two or more callers for the same samples. |
| SAM4WebLogo | Creates an Input file for BAM + WebLogo. |
| SAM2Tsv | Tabular view of each base of the reads vs the reference. |
| Biostar84786 | Table transposition |
| VCF2SQL | Generate the SQL code to insert a VCF into a database |
| VCFStripAnnotations | Removes one or more field from the INFO column from a VCF. |
| VCFGeneOntology | Finds and filters the GO terms for VCF annotated with SNPEFF or VEP |
| Biostar86480 | Genomic restriction finder See http://www.biostars.org/p/86480/ |
| BamToFastq | Shrink your FASTQ.bz2 files by 40+% using this one weird tip by ordering them by alignment to reference |
| PadEmptyFastq | Pad empty fastq sequence/qual with N/# |
| SamFixCigar | Replace 'M'(match) in SAM cigar by 'X' or '=' |
| FixVcfFormat | Fix PL format in VCF. Problem is described in http://gatkforums.broadinstitute.org/discussion/3453 |
| VcfToRdf | Convert a VCF to RDF. |
| VcfShuffle | Shuffle a VCF. |
| DownSampleVcf | Down sample a VCF. |
| VcfHead | Print the first variants of a VCF. |
| VcfTail | Print the last variants of a VCF |
| VcfCutSamples | Select/Exclude some samples from a VCF |
| VcfStats | Generate some statistics from a VCF |
| VcfSampleRename | Rename Samples in a VCF. |
| VcffilterSequenceOntology | Filter a VCF on Sequence Ontology (SO). |
| Biostar59647 | position of mismatches per read from a sam/bam file (XML) See http://www.biostars.org/p/59647/ |
| VcfRenameChromosomes | Rename chromosomes in a VCF (eg. convert hg19/ucsc to grch37/ensembl) |
| BamRenameChromosomes | Rename chromosomes in a BAM (eg. convert hg19/ucsc to grch37/ensembl) |
| BedRenameChromosomes | Rename chromosomes in a BED (eg. convert hg19/ucsc to grch37/ensembl) |
| BlastnToSnp | Map variations from a BLASTN-XML file. |
| Blast2Sam | Convert a BLASTN-XML input to SAM |
| VcfMapUniprot | Map uniprot features on VCF annotated with VEP or SNPEff. |
| VcfCompare | Compare two VCF files. |
| VcfBiomart | Annotate a VCF with the data from Biomart. |
| VcfLiftOver | LiftOver a VCF file. |
| BedLiftOver | LiftOver a BED file. |
| VcfConcat | Concatenate VCF files. |
| MergeSplittedBlast | Merge BLAST hits from a split database |
| FindMyVirus | Virus+host cell : split BAM into categories. |
| Biostar90204 | Linux split equivalent for BAM files. |
| VcfJaspar | Finds JASPAR profiles in VCF |
| GenomicJaspar | Finds JASPAR profiles in Fasta |
| VcfTreePack | Create a TreeMap from one or more VCF |
| BamTreePack | Create a TreeMap from one or more Bam. |
| FastqRecordTreePack | Create a TreeMap from one or more Fastq files. |
| WorldMapGenome | Map bed file to Genome + geographic data. |
| AddLinearIndexToBed | Use a Sequence dictionary to create a linear index for a BED file. Can be used as a X-Axis for a chart. |
| VCFComm | Compare multiple VCF files, output a new VCF file. |
| VcfIn | Prints variants that are contained/not contained into another VCF |
| Biostar92368 | Binary interactions depth See also http://www.biostars.org/p/92368 |
| VCFStopCodon | TODO |
| FastqGrep | Finds reads in fastq files |
| VcfCadd | Annotate a VCF with Combined Annotation Dependent Depletion (CADD) data. |
| SortVCFOnInfo | sort a VCF using a field in the INFO column |
| SamChangeReference | TODO |
| SamExtractClip | TODO |
| GCAndDepth | Extracts GC% and depth for multiple bam using a sliding window. |
| Biostar94573 | Getting a VCF file from a CLUSTALW or FASTA alignment |
| CompareBamAndBuild | Compare two BAM files mapped on two different builds. Requires a liftover chain file. |
| KnownGenesToBed | Convert UCSC KnownGene to BED. |
| Biostar95652 | Drawing a schematic genomic context tree. See also http://www.biostars.org/p/95652/ |
| SamToPsl | Convert SAM/BAM to PSL or BED12. |
| BWAMemNOp | merge the SA:Z:* attributes of a read mapped with bwa-mem and prints a read containing a cigar string with 'N' (Skipped region from the REF). |
| FastqEntropy | Compute the Entropy of a Fastq file (distribution of the length(gzipped(sequence))) |
| NgsFilesScanner | Build a persistent database of NGS files. Dump as XML. |
| SigFrame | GUI displaying CGH data |
| Biostar103303 | Calculate Percent Spliced In (PSI) |
| VCFComparePredictions | Compare the variant predictions of VCFs |
| BackLocate | Map a position in a protein back to the genomic coordinates. |
| FindAVariation | Search for variations in a set of VCF files. |
| AlleleFrequencyCalculator | VCF: Allele Frequency Calculator |
| BuildWikipediaOntology | Build a simple RDFS/XML ontology from Wikipedia Categories. |
| AlmostSortedVcf | Sort an 'almost' sorted VCF using an in-memory buffer. |
| Biostar105754 | bigwig: peak distance from specific genomic BED region |
| VcfRegulomeDB | Annotate a VCF with the RegulomeDB data (http://regulome.stanford.edu/) |
| Biostar106668 | unmark duplicates (deprecated) |
| BatchIGVPictures | GUI: Batch pictures with IGV |
| PubmedDump | Dump pubmed data as XML. |
| BamIndexReadNames | Build a dictionary of read names to be searched with BamQueryReadNames. |
| BamQueryReadNames | Query a Bam file indexed with BamIndexReadNames. |
| FastqShuffle | Shuffle Fastq files. |
| FastqSplitInterleaved | Split interleaved Fastq files |
| PubmedFilterJS | Filters pubmed XML using javascript. |
| ReferenceToVCF | Creates a VCF containing all possible substitutions in a Reference Genome. |
| VcfEnsemblReg | Annotate a VCF with the UCSC genome hub tracks for Ensembl Regulation. |
| FastqJS | Filters a FASTQ file using javascript. |
| Bam2SVG | Convert a BAM to SVG |
| LiftOverToSVG | Convert UCSC LiftOver chain files to animated SVG |
| VCFMerge | Combines VCF files. |
| FixVcfMissingGenotypes | Use BAM to fill missing genotypes in merged VCFs |
| NcbiTaxonomyToXml | Dump NCBI taxonomy tree as a hierarchical XML document |
| BamCmpCoverage | Creates the figure of a comparative view of the depths sample vs sample |
| FindAllCoveragesAtPosition | Find depth at specific position in a list of BAM files |
| VcfMultiToOne | Convert VCF with multiple samples to a VCF with one SAMPLE |
| Evs2Xml | Download data from Exome Variant Server as XML. |
| VcfRemoveGenotypeIfInVcf | Reset Genotypes in VCF if they've been found in another VCF indexed with tabix |
| Biostar130456 | Generate one VCF file for each sample from a multi-samples VCF |
| UniprotFilterJS | Filter Uniprot XML with a javascript expression. |
| SkipXmlElements | Filter XML elements with a javascript expression. |
| MiniCaller | Simple and Stupid Variant Caller designed for @AdrienLeger2 |
| VcfCompareCallersOneSample | For my colleague Julien. Compare VCF callers with VCFs containing one sample. |
| SamRetrieveSeqAndQual | Is there a tool to add seq and qual to BAM? for @sjackman |
| VcfEnsemblVepRest | Annotate a VCF with Ensembl REST API. |
| VcfCompareCallers | Compare two VCFs and print common/exclusive information for each sample/genotype |
| BamStats02 | Generate and explore statistics about the reads in a BAM (Sample/File/Flags/chroms/MAPQ) |
| BamTile | Bam tiling Path. |
| XContaminations | for @AdrienLeger2 : test for cross contamination between samples in same flowcell/runlane. |
| VCFJoinVcfJS | Join two VCF files using javascript. |
| Biostar139647 | Convert Clustal/Fasta alignment to SAM/BAM |
| BioAlcidae | Reformat bioinformatics files using javascript/rhino (~ awk) |
| VCFBedSetFilter | Set FILTER for VCF having intersection with BED |
| VCFReplaceTag | Replace the key in INFO/FORMAT/FILTER |
| VcfIndexTabix | sort, Compress (bgz) a VCF and create tabix index on the fly. |
| VcfPeekVcf | Peek INFO Tag and ID from another VCF |
| VcfGetVariantByIndex | Access a (plain or tabix-indexed) VCF file by the i-th index. |
| VcfMultiToOneAllele | VCF: "one variant with N ALT alleles" to "N variants with one ALT" |
| BedIndexTabix | Index and sort a BED on the fly with Tabix |
| VcfToHilbert | Plot a Hilbert Curve from a VCF file. |
| Biostar145820 | Shuffle BAM/Subsample BAM to a fixed number of alignments |
| PcrClipReads | Soft clip BAM files based on PCR target regions https://www.biostars.org/p/147136/ |
| ExtendReferenceWithReads | Extending ends of REF sequence with the help of reads in BAM https://www.biostars.org/p/148089/ |
| PcrSliceReads | Mark PCR reads to their PCR amplicon https://www.biostars.org/p/149687/ |
| SamJmx | Monitor/interrupt/break a BAM/SAM stream with java JMX |
| VcfJmx | Monitor/interrupt/break a VCF stream with java JMX |
| Gtf2Xml | convert gff to XML in order to be processed with XSLT |
| SortSamRefName | Sort a SAM/BAM on REF/contig and then on read/query name |
| Biostar154220 | Cap BAM to a given coverage. see https://www.biostars.org/p/154220 |
| VcfToBam | create a BAM from a VCF. |
| Biostar165777 | Split an XML file (e.g. BLAST) |
| BlastFilterJS | Filters an XML BLAST output with a javascript expression |
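
Several of the tools above perform small, well-defined record transformations. As an illustration of the kind of operation FastqRevComp describes (reverse complementing a FASTQ record), here is a minimal sketch. It is not the tool's actual implementation: the class name `FastqRevCompSketch` and its methods are hypothetical, and it assumes a plain four-line FASTQ record, maps any unrecognized base to `N`, and reverses the quality string so that each quality stays paired with its base.

```java
/**
 * Hypothetical sketch, not taken from the project's sources:
 * reverse complement one four-line FASTQ record.
 */
class FastqRevCompSketch {

    /** Complement of a single base; unrecognized symbols become 'N' (assumption). */
    static char complement(char base) {
        switch (base) {
            case 'A': return 'T';
            case 'C': return 'G';
            case 'G': return 'C';
            case 'T': return 'A';
            case 'a': return 't';
            case 'c': return 'g';
            case 'g': return 'c';
            case 't': return 'a';
            default:  return 'N';
        }
    }

    /** Reads the sequence from its last base to its first, complementing each base. */
    static String reverseComplement(String sequence) {
        StringBuilder sb = new StringBuilder(sequence.length());
        for (int i = sequence.length() - 1; i >= 0; i--) {
            sb.append(complement(sequence.charAt(i)));
        }
        return sb.toString();
    }

    /** Qualities are only reversed, so each one still refers to the same base. */
    static String reverseQualities(String qualities) {
        return new StringBuilder(qualities).reverse().toString();
    }

    /** record = { name line, sequence, separator line, qualities }. */
    static String[] reverseComplementRecord(String[] record) {
        if (record.length != 4 || record[1].length() != record[3].length()) {
            throw new IllegalArgumentException("malformed FASTQ record");
        }
        return new String[] {
            record[0],
            reverseComplement(record[1]),
            record[2],
            reverseQualities(record[3])
        };
    }

    public static void main(String[] args) {
        String[] record = { "@read1", "ACGTN", "+", "IIII#" };
        for (String line : reverseComplementRecord(record)) {
            System.out.println(line);
        }
    }
}
```

Running `main` prints the record with the sequence `NACGT` and the qualities `#IIII`; the name and separator lines are left unchanged.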