In [2]:
from IPython.display import HTML

HTML('''<script>
code_show=true; 
function code_toggle() {
 if (code_show){
 $('div.input').hide();
 } else {
 $('div.input').show();
 }
 code_show = !code_show
} 
$( document ).ready(code_toggle);
</script>
<form action="javascript:code_toggle()"><input type="submit" value="Click here to toggle on/off the raw code."></form>''')

<img style="float: left; margin-right:20px;" src="https://raw.githubusercontent.com/ODiogoSilva/TriFusion-tutorials/master/tutorials/images/trifusion-icon-64.png"><p style="font-weight:bold; font-size:30px; color:#37abc8; margin-bottom:50px;">Creating and using active data set groups.</p>

# What are active datasets

Most operations in TriFusion can be applied either to the total data set (all files and taxa currently loaded) or to custom-made data sets, named _active_ data sets. Defining active data sets lets you quickly apply operations to different sets of files and/or taxa. Data set groups can serve multiple, highly user-specific purposes, such as:

- creating output files with or without outgroup taxa;
- separating data sets with nuclear and mitochondrial alignments;
- creating alignments with only specific taxa families;
- removing problematic files/taxa from output;
- etc.


# How to define active datasets 

Active datasets can be created/modified in two main ways:

- [Toggle file/taxa buttons in TriFusion's side panel](#Toggle-file/taxa-buttons-in-the-side-panel)
    - [Mouse click toggling](#Mouse-click-toggling)
    - [Import selection from file](#Import-selection-from-file)
- [Create data set groups](#Create-dataset-groups)
    - [Manual group creation inside TriFusion](#Manual-group-creation-inside-TriFusion)
    - [Group creation from file](#Group-creation-from-file)

Once created, the desired custom data set groups can be specified in the respective dropdown menus of the __Process__ and __Statistics__ modules.

# Input data

For this tutorial, we will use a medium-sized data set of 614 genes and 48 taxa (which can be downloaded [here](https://github.com/ODiogoSilva/TriFusion-tutorials/raw/master/tutorials/Datasets/Process/medium_protein_dataset/medium_protein_dataset.zip)). 

# How to use active data sets

_Active_ data sets are selected by default. You can switch to the _total_ data set or to any user-made data set by selecting the desired group in the corresponding dropdown menu.

Dropdown menu for data set selection on the __Process__ screen:

<img width="90%" src="https://raw.githubusercontent.com/ODiogoSilva/TriFusion-tutorials/master/tutorials/images/process_data_set_selection.png">

Dropdown menu for data set selection on the __Statistics__ screen:

<img width="90%" src="https://raw.githubusercontent.com/ODiogoSilva/TriFusion-tutorials/master/tutorials/images/statistics_data_set_selection.png">


# Toggle file/taxa buttons in the side panel

## Mouse click toggling

By default, when data is loaded into TriFusion, all files/taxa are active, so the _total_ and _active_ datasets are the same. The quickest way to modify the _active_ dataset is to navigate to _Menu_ > _Open/View Data_ and toggle the corresponding file/taxa buttons. Shift + clicking is also supported to select multiple contiguous files/taxa.

<figure>
    <br>
    <p style="font-size: 14px; text-align: center; font-weight: bold;">Click figure to animate</p>
    <img class="animation" width="90%" src="https://raw.githubusercontent.com/ODiogoSilva/TriFusion-tutorials/master/tutorials/images/dataset_toggle.png" alt="Static Image" data-alt="https://raw.githubusercontent.com/ODiogoSilva/TriFusion-tutorials/master/tutorials/gifs/process_tutorial2_mouse_toggle.gif">
</figure>

Active files/taxa will appear with a blue background, while inactive buttons will have no background. A label below the button list displays how many files/taxa are currently active.

## Import selection from file

When dealing with a large number of files/taxa, it may be more convenient to provide the _active_ dataset through a text file. This should be a single text file with one file/taxon name per line (the file extension is not important). You can create it yourself, or download it [here](https://github.com/ODiogoSilva/TriFusion-tutorials/raw/master/tutorials/Datasets/Process/medium_protein_dataset/taxa_list.txt).

In [None]:
# Example of a text file for taxa selection in TriFusion
Agaricus_bisporus
Botrytis_cinerea
Coniophora_puteana

Providing this file to TriFusion will select these three taxa for the _active_ dataset. The file can be provided by clicking the "__+__" button at the bottom of the panel, and then "__Select taxa names from .txt__". 

<img width="90%" src="https://raw.githubusercontent.com/ODiogoSilva/TriFusion-tutorials/master/tutorials/images/dataset_toggle_fromfile.png">

After loading the file, __ONLY__ the specified taxa will become active, regardless of the previous _active_ dataset. Names that do not match any of the files/taxa present in TriFusion will be ignored.

<figure>
    <br>
    <p style="font-size: 14px; text-align: center; font-weight: bold;">Click figure to animate</p>
    <img class="animation" width="90%" src="https://raw.githubusercontent.com/ODiogoSilva/TriFusion-tutorials/master/tutorials/images/dataset_selected_after_file.png" alt="Static Image" data-alt="https://raw.githubusercontent.com/ODiogoSilva/TriFusion-tutorials/master/tutorials/gifs/process_tutorial2_select_from_file.gif">
</figure>
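
The selection rule described above (only the listed names become active, and names that match nothing are ignored) can be sketched in a few lines of Python. This is an illustration of the described behaviour only, not TriFusion's actual code: the function names `parse_name_list` and `select_active`, as well as the `Hypothetical_taxon` and `Unknown_taxon` names, are assumptions made for the example.

```python
def parse_name_list(text):
    """Return the non-empty, stripped lines of a selection file, in order."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def select_active(loaded_names, requested_names):
    """Split the requested names into matched (active) and unmatched (ignored).

    Illustrative only: mirrors the behaviour described in the tutorial,
    not TriFusion's implementation.
    """
    loaded = set(loaded_names)
    active = [name for name in requested_names if name in loaded]
    ignored = [name for name in requested_names if name not in loaded]
    return active, ignored


# Hypothetical loaded taxa; the last name is made up for this example.
loaded_taxa = [
    "Agaricus_bisporus",
    "Botrytis_cinerea",
    "Coniophora_puteana",
    "Hypothetical_taxon",
]

# Contents of a selection file; "Unknown_taxon" matches nothing loaded.
file_text = "Agaricus_bisporus\nBotrytis_cinerea\nUnknown_taxon\n"

active, ignored = select_active(loaded_taxa, parse_name_list(file_text))
print("active: ", active)
print("ignored:", ignored)
```

Note that `Coniophora_puteana` and `Hypothetical_taxon` end up inactive here even if they were active before, because the file replaces the previous selection rather than adding to it.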

If you wish to save the active dataset to a new file so that you can use it later, click the "__+__" button and then "__Export selected taxa names to .txt__".

# Create dataset groups

When a workflow requires applying operations to several different sets of taxa/files, it is more convenient to define all dataset groups first and then use the dropdown menus to select the desired _active_ data set. Dataset groups can be defined in TriFusion by navigating to _Menu_ > _Dataset Groups_.

<img width="90%" src="https://raw.githubusercontent.com/ODiogoSilva/TriFusion-tutorials/master/tutorials/images/dataset_creationpanel.png">

File and taxon groups are sorted into two tabs, like in the _Open/View Data_ panel, and clicking the __Set new [file|taxa] group__ button will start the creation of the corresponding group. 

<img width="90%" src="https://raw.githubusercontent.com/ODiogoSilva/TriFusion-tutorials/master/tutorials/images/dataset_triage.png">

Here you can choose to create the dataset group either manually in TriFusion, or by providing the names of the files/taxa in a text file.


## Manual group creation inside TriFusion

__WARNING:__ This option is discouraged for very large data sets (~200 items). In these cases, use the [group creation from file](#Group-creation-from-file) option.

Let's create a new taxa group by clicking the _Taxa_ tab and then the __Set new taxa group__ button. Here, groups can be created by selecting the desired files/taxa from the _All [files|taxa]_ column and using the arrow buttons to move them to the _Selected taxa_ column. Once the selection is complete, give the group a unique name to finish defining it. If you wish to create multiple groups in one sitting, click the __Apply__ button to create the group but remain in the dialog for further group definition.

<figure>
    <br>
    <p style="font-size: 14px; text-align: center; font-weight: bold;">Click figure to animate</p>
    <img class="animation" width="90%" src="https://raw.githubusercontent.com/ODiogoSilva/TriFusion-tutorials/master/tutorials/images/dataset_manual_creation.png" alt="Static Image" data-alt="https://raw.githubusercontent.com/ODiogoSilva/TriFusion-tutorials/master/tutorials/gifs/process_tutorial2_manual_selection.gif">
</figure>

Previously created groups will be listed under the _Created groups_ column, where they can be selected (this will move the group members to the _Selected taxa_ column) or removed. 

## Group creation from file

Here, you will only have to provide a text file with the names of the files/taxa you wish to select for the current group. The text file is the same as described in the [import selection from file](#Import-selection-from-file) section. 


In [None]:
# Example of a text file for taxa selection in TriFusion
Agaricus_bisporus
Botrytis_cinerea
Coniophora_puteana

After providing the file with the group members list, specify a unique name for the new data set group, and that's it!

<img width="90%" src="https://raw.githubusercontent.com/ODiogoSilva/TriFusion-tutorials/master/tutorials/images/dataset_file_creation.png">

<figure>
    <br>
    <p style="font-size: 14px; text-align: center; font-weight: bold;">Click figure to animate</p>
    <img class="animation" width="90%" src="https://raw.githubusercontent.com/ODiogoSilva/TriFusion-tutorials/master/tutorials/images/dataset_name_specification.png" alt="Static Image" data-alt="https://raw.githubusercontent.com/ODiogoSilva/TriFusion-tutorials/master/tutorials/gifs/process_tutorial2_file_selection.gif">
</figure>
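
When several groups are needed, the corresponding text files can be prepared in one go outside TriFusion. The sketch below writes one file per group with one name per line, which is the format described above. The helper `write_group_files`, the group names `group_a` and `group_b`, and the use of a temporary directory are all assumptions made for this example.

```python
import tempfile
from pathlib import Path


def write_group_files(groups, out_dir):
    """Write one '<group name>.txt' file per group, one member name per line."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = {}
    for group_name, members in groups.items():
        path = out_dir / f"{group_name}.txt"
        path.write_text("\n".join(members) + "\n")
        written[group_name] = path
    return written


# Hypothetical groups built from the taxa names used in this tutorial.
groups = {
    "group_a": ["Agaricus_bisporus", "Botrytis_cinerea"],
    "group_b": ["Coniophora_puteana"],
}

# A temporary directory keeps the example self-contained.
with tempfile.TemporaryDirectory() as tmp:
    written = write_group_files(groups, tmp)
    contents = {name: path.read_text() for name, path in written.items()}

for name, text in contents.items():
    print(f"{name}.txt:")
    print(text)
```

In practice you would pass a regular directory instead of a temporary one, and then load each file through the group creation dialog, giving each group its own unique name.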


In [3]:
%%javascript
// Collect the GIF URL stored in the data-alt attribute of each animated figure.
var getGif = function() {
    var gif = [];
    $('.animation').each(function() {
        gif.push($(this).data('alt'));
    });
    return gif;
};
var gif = getGif();

// Preload all the GIFs so that they start playing immediately on click.
var image = [];

$.each(gif, function(index) {
    image[index]     = new Image();
    image[index].src = gif[index];
});

// Toggle between the static image and its GIF on click.
$('figure').on('click', function() {
    var $img   = $(this).children('img'),
        imgSrc = $img.attr('src'),
        imgAlt = $img.attr('data-alt');

    // Swap the two URLs. Read and write the attribute directly:
    // .data('alt') is cached by jQuery and would keep returning the original GIF URL.
    $img.attr('src', imgAlt).attr('data-alt', imgSrc);
});
