Skip to content

Latest commit

 

History

History
604 lines (420 loc) · 32.7 KB

2.7.md

File metadata and controls

604 lines (420 loc) · 32.7 KB

Release Notes v2.7

Ability to highlight variants of interest

VCF files may include a large number of variants and it might be helpful to highlight variants of interest based on specific values of their attributes.

Now, system admin can create/edit the special JSON-file (interest_profiles.json) where a list of condition profiles is being described.
Each profile in that file contains an own set of conditions based on the variant attributes. For a condition, the color is being specified.

At the GUI, any user can enable the highlight feature from the Settings (VCF tab) and select any profile from the described file:
ReleaseNotes_2.7

In the variants table, if the variant is satisfy to the certain condition of the selected profile - variant row will be highlighted in that condition color:
ReleaseNotes_2.7

Also at the VCF track, if the variant is satisfy to the certain condition of the selected profile - this variant will be highlighted in that condition color:
ReleaseNotes_2.7

For more details see here.

BLAST search

Now, users have the ability to perform BLAST Search from the NGB.
This allows to search nucleotide/amino acid sequences over BLAST databases and view the corresponding results.
BLAST databases should be previously uploaded into NGB (this could be as downloaded NCBI databases or custom ones).

Ways to start setup the BLAST search:

  • by the context menu at any genomic feature (except variants) in the "Browser" panel:
    ReleaseNotes_2.7
  • or manually open the BLAST panel from the Views menu:
    ReleaseNotes_2.7

Once the BLAST panel is opened, user should specify desired parameters and click the Search button:
ReleaseNotes_2.7

All BLAST search tasks are displayed at the History sub-tab of the BLAST panel.
This sub-tab is being opened automatically after the search starts:
ReleaseNotes_2.7

User can open any finished task by click its row and view the BLAST search results.
BLAST search results contain Sequences table - aggregated results grouped by their sequences.
This form is being opened in the same sub-tab (History), e.g.:
ReleaseNotes_2.7

User can click any row in the Sequence table and the form with details about all matches (alignments) of the search query to the certain sequence will be opened.
This form is being opened in the same sub-tab (History) as well, e.g.:
ReleaseNotes_2.7
ReleaseNotes_2.7

User can view any found match (alignment) at the separate track in the "Browser" panel.
To open the visualization track, user should click the "View at track" link near the match in the "Alignments info" form:
ReleaseNotes_2.7
ReleaseNotes_2.7

For more details see here.

Genes panel

A new panel was added to NGB - the Genes panel.
This panel displays a list of genes/transcripts/exons and other features of the current dataset (from corresponding GFF/GTF and GenBank files) in a tabular view.
ReleaseNotes_2.7

By default in the panel, only the following columns are shown:

  • Chr - chromosome/sequence name
  • Name - feature name
  • Id - feature ID
  • Type - feature type
  • Start - start position of the feature on the chromosome/sequence
  • End - end position of the feature on the chromosome/sequence
  • Strand - feature strand
  • Info - button to open the certain feature full info

Users can display or hide extra columns - from the list appeared by click the hamburger icon in the panel header:
ReleaseNotes_2.7
There are two types of additional columns in this list:

  • mandatory feature fields from origin gene files (Feature, Frame, Gene, Score, Source) are shown as is
  • optional feature attributes from the "Attributes" column of the origin gene files are shown in format <Attribute_name> (attr)

Table supports sorting, pagination and the feature of the downloading the shown data to the local workstation.

To view the full feature info - user can click the corresponding button and view the info in the separate pop-up:
ReleaseNotes_2.7

Additionally, users can modify gene/feature attributes shown in the Genes panel via the GUI.
These changes will be saved in NGB and will be available to all other users, but they don't touch the original GENE files:
ReleaseNotes_2.7
ReleaseNotes_2.7
User can modify (add/edit/remove) the attribute values manually for any gene/feature from the GENES panel and save changes:
ReleaseNotes_2.7

Performed changes can be viewed in the Genes panel:
ReleaseNotes_2.7

Also, the changes history (what was changed, when and by whom) can be viewed at the special tab of the feature Info pop-up:
ReleaseNotes_2.7

For more details see here.

Homologs search

Now, users can search for the homologous genes (similar genes in other species).
The search is being performed in the inner copy of NCBI Homologene database or in uploaded custom databases.

To start the search of homologous genes user should click the gene of interest (at any GENE track or at the Genes panel) and select the item "Show similar genes" in the context menu, e.g.:
ReleaseNotes_2.7

After, the search in the inner copy of NCBI Homologene database will be performed using the gene name and the current species.
Homologs panel with aggregated search results will be opened automatically:
ReleaseNotes_2.7

This panel contains two sub-tabs:

  • "Homologene" (opened by default) - for displaying search results of gene homologs from the NCBI Homologene database
  • "Orthologs & paralogs" - for displaying search results of gene orthologs and paralogs from the corresponding NCBI Database (NCBI Orthologs) and/or inner NGB databases

Click any row in the Homologene sub-tab to open details of the certain Homologene record:
ReleaseNotes_2.7
ReleaseNotes_2.7

Click any domain in the details table and a tooltip will appear with a full "legend" of the corresponding gene domains:
ReleaseNotes_2.7

You can also open the "Orthologs & paralogs" sub-tab to view the search results over orthologs and paralogs NGB databases (NCBI Orthologs or custom ones), e.g.:
ReleaseNotes_2.7

In this sub-tab, you also can click the row to view details:
ReleaseNotes_2.7

For more details see here.

Heatmap displaying support

Heatmap is a data visualization technique that shows magnitude of a phenomenon as color in two dimensions.
Under the hood, it is one of the way to display a plain matrix.
In genetic researches, this may be convenient to present gene expression data for different samples.
From the current version, NGB supports heatmaps displaying.

In NGB, heatmap files can be shown:

  • in the Heatmap panel that can be opened from the Views item of the main menu:
    ReleaseNotes_2.7
  • as a separate track:
    ReleaseNotes_2.7
  • or even at the Summary dataset view in the Browser panel:
    ReleaseNotes_2.7

User can zoom-in/zoom-out the heatmap and configure the color scheme.

Besides their main function to visualize data, heatmap labels can also be used to convenient navigation over NGB objects, e.g. a heatmap in which cell annotations are perceived as links to coordinates:
ReleaseNotes_2.7
ReleaseNotes_2.7

For more details see here.

Strain Lineage

From the current version, NGB supports visualization of a lineage for a family of strains or reference genomes.
Lineage is represented as a one-directional graph where each node represent a strain (genome) and each edge between nodes represent an impact used to create a progeny from a parent strain.

To open a lineage, in which the current genome is included, user should use the corresponding item in the General menu of the reference track:
ReleaseNotes_2.7

The lineage will be displayed in the separate panel (LINEAGE panel):
ReleaseNotes_2.7

At the top of the lineage, its description is displayed.
Strains (tree nodes) are displayed as rectangles with info inside.
Links between strains (edges) are displayed as corresponding arrows.

Inside the node, its name and sequencing date are displayed. Additionally, for a node, can be specified text description and key-value pairs of metadata (attributes).
To view full node's info, user should click the node:
ReleaseNotes_2.7

If a node is associated with a reference ID, node's name is displayed as a hyperlink.
User can click it to navigate to the corresponding reference track.

At the edge, its type of interaction is displayed. Additionally, for an edge, can be specified key-value pairs of metadata (attributes).
To view full edge's info, user should click the edge:
ReleaseNotes_2.7

For more details see here.

Multi-sample VCF support

NGB supports VCF format - users can view VCF-tracks in the Browser, open separate variants to get info, view the full list of variants from the VCF file in the Variants panel.
Previously, this functionality was fully supported for single-sample VCFs.
In strain analysis, it is important to be able to work with several strains at the same time. Including with variants data received from multi-strain VCF files.
From the current version, NGB supports multi-sample (multi-strain) VCF files.

The main difference from a single-sample VCF file: genotype data for samples is presented as separate columns after a FORMAT column - one column per sample (strain).
Format defined in the FORMAT column is applied to the information in each sample column.

Multi-sample VCF file is displayed in the Datasets panel like regular VCF file.
For displaying of multi-sample VCF track in the Browser, the following views are available:

  • Expanded - each sample is being displayed similar to the regular single sample VCF-track, but all sample variants are being located in one line (without vertical distribution):
    ReleaseNotes_2.7
  • Collapsed - samples are being displaying similar to the reads at alignment tracks. Each sample line includes variants that are related to the corresponding sample:
    ReleaseNotes_2.7

Additionally, for the multi-sample VCF track can be enabled:

  • Density - the histogram above the variants block. That histogram displays summary count of variants in each position over all samples of the track:
    ReleaseNotes_2.7
  • Merge samples regimen (only with the collapsed view simultaneously) - in this regimen, all samples are being "merged" in one line, i.e. each variation displayed at the track represents the "merged" variations in that position:
    ReleaseNotes_2.7

When the Variants panel is opened for multi-sample VCFs, an additional column "Samples" appears in the table. This column contains the list of samples ID (names) in which the specific variation is presented, e.g.:
ReleaseNotes_2.7

Samples in the multi-sample VCF file can be "renamed" via the GUI:
ReleaseNotes_2.7

The pop-up will appear where user can specify an alias for any sample of the current file, e.g.:
ReleaseNotes_2.7

After aliases saving:

  • if a sample has an alias - this alias will be displayed
  • if a sample doesn't have an alias - for that sample, its origin sample name will be displayed
    ReleaseNotes_2.7
    These rules are also being applied to sample names that are shown in the Variants panel.

For more details see here.

Motifs search

Sequence motifs are short, recurring patterns in DNA that are presumed to have a biological function.
From the current version, NGB supports the search for a specific nucleotide sequence(s) in a certain reference genome/chromosome via the "built-in" search engine.
Such search is also used to find motifs.

User can start the motifs search from the "General" context menu of the reference track and the corresponding item in it:
ReleaseNotes_2.7

In the appeared pop-up, user should:

  • specify a search pattern
  • specify a name (title) for the forthcoming motifs search task (optionally)
  • determine where the search should be performed - only over the current (opened) chromosome (default behavior) or over the full reference sequence
  • once the motifs search setup is finished, click the Search button:
    ReleaseNotes_2.7

After that, the search will be performed over the opened chromosome/reference.
Once the search is performed:

  • the additional panel "MOTIFS" that contains search results in a table view will appear at the panels section:
    ReleaseNotes_2.7
  • two additional tracks that show results on forward and reverse strands will appear at the Browser panel: ReleaseNotes_2.7

Also, users have the ability to start the motif search from the BAM tracks as well:

  • open the context menu by click any read at the BAM track and select the "Motifs search" item, e.g.:
    ReleaseNotes_2.7
  • the "Search motifs" pop-up will be opened. Selected read will be used and already specified as pattern value:
    ReleaseNotes_2.7

For more details see here.

Metabolic pathways

Metabolic pathways are series of chemical reactions that start with a substrate and finish with an end product.
These pathways are being displayed as certain kind of infographics, often called "maps".
From the current version, NGB supports visualization of pathway maps.

To view pathways - open the PATHWAYS panel:
ReleaseNotes_2.7
ReleaseNotes_2.7

By default at this panel, all loaded in NGB pathway maps are shown.
Maps can be as public ones received from different sources like BioCyc database and as well as internal, produced by pathway modelling software.

For the first ones - the collage source type is used. Pathways are displayed similar to the BioCyc pathway collage view:
ReleaseNotes_2.7
ReleaseNotes_2.7
Nodes of such pathways can be presented as hyperlinks - user can click them and the corresponding gene/feature will be opened:
ReleaseNotes_2.7

For the last ones - the custom source type is used:
ReleaseNotes_2.7
ReleaseNotes_2.7

All objects at a pathway map are draggable.
To drag any object - user should click it and, holding the mouse button, relocate as wants.

User can search over the pathway objects.
Found results are highlighted by green at the pathway image:
ReleaseNotes_2.7

Also user has the ability to add annotation(s) to the displayed pathway map.
Such annotation allows to colorize the pathway map.
The following annotation types are supported:

  • heatmap file, previously registered in NGB
  • variation file, previously registered in NGB
  • data in TSV/CSV format, that can be uploaded from the local workstation
  • manually specified annotation in NGB GUI

For example, adding the manual annotation:
ReleaseNotes_2.7
ReleaseNotes_2.7

User can add and apply several annotations for the map, edit or remove them.

For more details see here.


Other

FeatureCounts support

Now, NGB supports FeatureCounts file format registration and displaying.

FeatureCounts file is being parsed into GFF format during the registration.

FeatureCounts tracks are being displayed in the way similar to GENE tracks, but here only genes and exons are displayed:
ReleaseNotes_2.7

Also, these files can be displayed in "Bar Graph" mode - as histograms:
ReleaseNotes_2.7

For displayed bars, color scheme and scale modes can be configured:
ReleaseNotes_2.7

In case of several source inputs (several alignment files) for the FeatureCounts file, at the FeatureCounts track several sub-tracks are shown - a single sub-track for each source SAM/BAM file:
ReleaseNotes_2.7

For more details see here.

GenBank support

Now, NGB supports the uploading, registration and displaying of GenBank files.
GeneBank file can be loaded to NGB in 2 ways:

  • as reference file with annotations. In this case NGB reads GeneBank file, convert it to 2 separate files - FASTA and GTF, and register these files as reference and gene files
  • as gene annotation file to an existing reference. In this case NGB converts only features data from the GeneBank file into GTF file and register this file as gene annotation file

Visualization of GenBank gene tracks differs from the general ones - features are shown as separate rectangles (or triangles at insufficient zoom). Each feature has name and strand. Each feature type has its own color:
ReleaseNotes_2.7

For more details see here.

Customize gene tracks

If the GENE file has more than one feature type - user can customize the view of the GENE track:

  • show/hide certain feature types via the Features sub-menu in the GENE track header:
    ReleaseNotes_2.7
    ReleaseNotes_2.7
    For details see here.
  • create a "duplicate" of the GENE track with only certain feature type enabled (other feature types are hidden by default):
    ReleaseNotes_2.7
    ReleaseNotes_2.7
    For details see here.

Notifier about zoom level

At a large scale (e.g. whole chromosome view) GENE/BED track is shown as histogram.
This may be misleading for some users.

Now, when a GENE/BED track at histogram view is opened a tooltip with notification appears:
ReleaseNotes_2.7

User can close it by the cross-button or just zoom-in till the medium level appears and this tooltip will disappear.
To not show such notification again - user should set the corresponding checkbox.

Restore panel view

In the current version, the ability to restore default view of the tables in VARIANTS and GENES panels was implemented.
For that, a new item was added into the options context menu of the panels:
ReleaseNotes_2.7
ReleaseNotes_2.7

By click this item, the corresponding table will be restored to its default view:

  • all extra-columns will be hidden
  • default columns will be shown (with their default order)
  • all configured sortings and filters will be reseted

ReleaseNotes_2.7

Download table data

Users can filter and/or sort the table data in GENES or VARIANTS panel as they wish.
In some cases, these "customized" data may be helpful in further work.

Therefore in the current version, to the GENES and VARIANTS panels the ability to download data was added.
User can select the format for the file and set whether the header should be included to the downloading file:
ReleaseNotes_2.7

Once the setup is finished, user should click the Download button. Table will be downloaded automatically:
ReleaseNotes_2.7

The downloaded table will contain only the same data that was displayed in the table before the download (considering all filters and sortings).

For details see here.

Sessions sharing

Previously, saved sessions were stored locally. But in some cases, it would be convenient for the users to have the ability to give access for a session to other users - i.e. to share the session.
In this version, such ability is implemented.
Now, if the user save a session - it will be accessible to all other users that have access to the files of this session. Sessions are stored globally.

Additionally, the ability to specify a description for the session appeared (not only session name):
ReleaseNotes_2.7

To the sessions table new columns were added - Description, Reference, Owner (for the user who saved the session):
ReleaseNotes_2.7

Additionally in the current version, to the Sessions panel was added ability to configure filters - to filter sessions list by any parameter(s):
ReleaseNotes_2.7

Molecular viewer enhancements

In the current version, the Molecular viewer functionality was expanded by the following features:

  • ability to change the view (display) mode for the model
  • ability to change the color scheme for the model

For that, new items were added in the right-upper corner of the model form:
ReleaseNotes_2.7

To change the view (display) mode - you can select one of the accessiblee variants:
ReleaseNotes_2.7

To change the color scheme - you can select one of the accessiblee variants:
ReleaseNotes_2.7

Additionally, from the current version - the full state of the opened model is stored when saving the session. The following options are being saved:

  • model, selected structure and chain
  • view of the model - display mode and color scheme
  • the set angle
  • the set zoom level

So, if user will open such session - the Molecular viewer panel will be opened with the full same model image as it was saved.

Download file from the GUI

In this version, the ability to download original file from datasets or explicitly from tracks was implemented.

To download an original file from the track - use the General menu in the track header:
ReleaseNotes_2.7

To download an original file from the files tree in the DATASETS panel - use the context menu by the right-click on the desired file:
ReleaseNotes_2.7

To download files related to a dataset and invisible in the files tree in the DATASETS panel - use the context menu by the right-click on the dataset:
ReleaseNotes_2.7

Display dataset description

Previously, when user opened a dataset - the summary files/variants info appeared in the "Browser" panel till any certain position/chromosome had been opened.
It could be convenient to display some additional info instead summary - for example, dataset description.
In this version such ability was implemented - for a dataset, an html description file can be added:

  • admin can add/remove dataset description via new NGB CLI commands add_description, remove_description
  • if a dataset has a description file - this file is shown in the "Browser" panel by default when opening dataset
  • such file is not displayed in the datasets tree (at the "Datasets" panel)
  • only one description can be added to a single dataset ReleaseNotes_2.7

If a dataset has an own description file - a new control appears for that dataset in the "Browser" panel header where user can switch between description and previously summary view:
ReleaseNotes_2.7

Dataset notes

In this version additionally to description, the ability to create any count of text notes to any dataset was implemented.
These notes can be opened and viewed in the "Browser" panel when a dataset is selected.
Notes are created as plain text notes, Markdown formatting is supported as well.

Notes management is organized via the menu in the "Browser" panel header.
To create a new note page - click the Add note item from the menu:
ReleaseNotes_2.7

A new form to create a note will appear. User should specify a title for a note and content, e.g.:
ReleaseNotes_2.7

Saved note page will be opened in the browser:
ReleaseNotes_2.7

User can view a notes list and select any note to open it from the notes management section in the "Browser" display menu:
ReleaseNotes_2.7

For details see here.

Dataset tree changes

In the current version the behavior for files selecting in the Datasets panel was changed.
Now:

  • if user sets a checkbox next to a specific file in the closed dataset - this will open the selected file and the dataset object (reference with annotation). Dataset object will be selected as well:
    ReleaseNotes_2.7
    ReleaseNotes_2.7
  • if all files in the dataset were manually unselected - the dataset object itself will stay opened (reference with annotation):
    ReleaseNotes_2.7
    ReleaseNotes_2.7
  • added a restriction for maximum allowed count of the inner dataset files that can be opened automatically at once (default value - 10). If any dataset has more files than this count and the user will try to open such dataset (by clicking a checkbox next to a dataset name) - checkbox will stay unselected, no files will be opened. When hovering such dataset object - a corresponding warning about large count of files will appear in a tooltip. User should expand such dataset and select necessary files manually.
    If user selects manually any file in this dataset - the dataset checkbox becomes selected too but still disabled:
    ReleaseNotes_2.7

Markdown for the root page

Previously, admin could set any html page as the root NGB page (home.url property of the NGB global settings).
In this version, the ability to use markdown files for this property was added - if the passed home.url has a .md extension - GUI will parse and render it as a Markdown source.

Dataset metadata

In the current version, the ability to attach key-value metadata to datasets and files was implemented.
This can be used:

  • to create specific tags/labels to datasets or files
  • to find datasets and files by their metadata values
  • to display tags as dataset/file details (in a tooltip or the summary page)

To manage metadata for a dataset/file - use the context menu in the Datasets panel:
ReleaseNotes_2.7

User can specify any count of key-value pairs for a dataset/file:
ReleaseNotes_2.7

Now, search in the dataset panel supports the search by attributes as well:
ReleaseNotes_2.7

Metadata can be viewed in a tooltip when hovering a dataset/file in the Dataset panel and at the summary page:
ReleaseNotes_2.7

For details see here.

Documentation link

Previously, users could get access to the NGB documentation (installation guide, user manual, release notes and etc.) only from the corresponding MarkDown pages in the NGB GitHub project.
In this version, the ability for users to have a quick-access to the NGB documentation in more comfortable and readable html-view was added.

To open the documentation users should just click the corresponding link in a tooltip of the NGB icon in the main menu:
ReleaseNotes_2.7

Documentation will be opened in a new tab.

Split view

Previously, NGB has already supported the split view for the Browser panel, e.g. to open mate regions of pair alignments or show pair for inter-chromosomal variants.
In the current version, this functionality was expanded and now, users can split browser for any location of interest - via the unified coordinates and search control.

For that, user should specify two locations, separated by space, to the unified coordinates and search control, and confirm the input by press the Enter key, e.g.:
ReleaseNotes_2.7
Each of the both locations can be specified in any currently supporting input format for a single location navigation via the unified coordinates and search control.

After that, the Browser pane will be splitted to two panes, according to specified locations:
ReleaseNotes_2.7

For details see here.

BAM coverage statistics

Users can view the coverage in any position of the BAM track - when hovering the Coverage graph - the coverage info is shown in the tooltip.
In certain cases, it is more useful to view some statistics of the coverage over the whole chromosome/reference.
In the current version, the ability was implemented to calculate and register BAM coverage statistics for different bp intervals and view this statistics in a table view on GUI.

To open the coverage statistics, click the corresponding item in the Coverage context menu of the BAM track:
ReleaseNotes_2.7

This will open the new Coverage panel:
ReleaseNotes_2.7
Table includes info about average coverage on different intervals of the specific chromosome.
The size (step) of the interval is specified during the coverage index creation.
For each BAM file there can be registered any count of indecies with dfferent bp intervals.

By click any row in the Coverage panel, the corresponding interval will be opened in the Browser panel:
ReleaseNotes_2.7

For details see here.