Finished remaining User's Guide pages
spficklin committed Aug 26, 2018
1 parent 9c08b02 commit 5ccbe09
Showing 29 changed files with 572 additions and 137 deletions.
68 changes: 28 additions & 40 deletions docs/user_guide/bulk_loader.rst
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@

Bulk Loader
===============
===========

The bulk loader is a tool that Tripal provides for loading data contained in tab-delimited files. Tripal supports loading of files in standard formats (e.g. ``FASTA``, ``GFF``, ``OBO``), but Chado can support a variety of different biological data types and there are often no community standard file formats for loading these data. For example, there is no file format for importing genotype and phenotype data. Those data can be stored in the feature, stock and natural diversity tables of Chado. The Bulk Loader was introduced in Tripal v1.1 and provides a web interface for building custom data loaders. In short, the site developer creates the bulk loader "template". This template can then be used and re-used for any tab-delimited file that follows the format described by the template. Additionally, bulk loading templates can be exported, allowing Tripal sites to share loaders with one another. Loading templates that have been shared are available on the Tripal website here: http://tripal.info/extensions/bulk-loader-templates.

Expand All @@ -12,7 +12,7 @@ The following commands can be executed to install the Tripal Bulk Loader using D
drush pm-enable tripal_bulk_loader
Plan How to Store Data
~~~~~~~~~~~~~~~~~~~~~~~~~
----------------------

To demonstrate use of the Bulk Loader, we will work through a brief example that imports a list of organisms and associates them with their NCBI taxonomy IDs. The input tab-delimited file will contain the list of all *Fragaria* (strawberry) species in NCBI at the time of writing.
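As a rough illustration of the input format, the following sketch reads a three-column tab-delimited file (taxonomy ID, genus, species) in Python. It is not part of Tripal; the function name ``read_rows`` and the sample rows are assumptions made for this example, and the taxonomy IDs are placeholders rather than real NCBI values.

```python
import csv
import io

# Illustrative rows only: these taxonomy IDs and species names are
# placeholders, not values taken from the real Fragaria input file.
SAMPLE = "1001\tFragaria\texampleone\n1002\tFragaria\texampletwo\n"


def read_rows(handle):
    """Yield (taxid, genus, species) tuples from a tab-delimited file."""
    for lineno, row in enumerate(csv.reader(handle, delimiter="\t"), start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != 3:
            raise ValueError(f"line {lineno}: expected 3 columns, got {len(row)}")
        taxid, genus, species = (col.strip() for col in row)
        yield taxid, genus, species


rows = list(read_rows(io.StringIO(SAMPLE)))
print(rows)
```

Checking the column count up front catches malformed files before any import is attempted.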

Expand All @@ -37,8 +37,10 @@ This file has three columns: NCBI taxonomy ID, genus and species:
To use the bulk loader you must be familiar with the Chado database schema and have an idea for where data should be stored. It is best practice to consult the GMOD website or consult the Chado community (via the `gmod-schema mailing list <https://lists.sourceforge.net/lists/listinfo/gmod-schema>`_) when deciding how to store data. For this example, we want to add the species to Chado, and we want to associate the NCBI taxonomy ID with these organisms. The first step, therefore, is to decide where in Chado these data should go. In Chado, organisms are stored in the **organism** table. This table has the following fields:

.. csv-table:: Chado organism table
:header: "Name", "Type", "Description"
`chado.organism Table Schema`

.. csv-table::
:header: "Name", "Type", "Description"

"organism_id", "serial", "PRIMARY KEY"
"abbreviation", "character varying(255)",
Expand All @@ -53,8 +55,9 @@ We can therefore store the second and third columns of the tab-delimited input f
In order to store a database external reference (such as for the NCBI Taxonomy ID) we need to use the following tables: **db**, **dbxref**, and **organism_dbxref**. The **db** table will house the entry for the NCBI Taxonomy; the **dbxref** table will house the entry for the taxonomy ID; and the **organism_dbxref** table will link the taxonomy ID stored in the **dbxref** table with the organism housed in the **organism** table. For reference, the fields of these tables are as follows:


`chado.db Table Schema`

.. csv-table:: chado.db structure
.. csv-table::
:header: "Name", "Type", "Description"

"db_id", "serial", "PRIMARY KEY"
Expand All @@ -64,7 +67,9 @@ In order to store a database external reference (such as for the NCBI Taxonomy I
"url", "character varying(255)"


.. csv-table:: chado.dbxref structure
`chado.dbxref Table Schema`

.. csv-table::
:header: "Name", "Type", "Description"

"dbxref_id", "serial", "PRIMARY KEY"
Expand All @@ -74,7 +79,9 @@ In order to store a database external reference (such as for the NCBI Taxonomy I
"description", "text"


.. csv-table:: chado.organism_dbxref structure
`chado.organism_dbxref Table Schema`

.. csv-table::
:header: "Name", "Type", "Description"

"organism_dbxref_id", "serial", "PRIMARY KEY"
Expand All @@ -85,13 +92,12 @@ In order to store a database external reference (such as for the NCBI Taxonomy I
For our bulk loader template, we will therefore need to insert values into the **organism**, **db**, **dbxref** and **organism_dbxref** tables. In our input file we have the genus and species and taxonomy ID so we can import these with a bulk loader template. However, we do not have information that will go into the db table (e.g. "NCBI Taxonomy"). This is not a problem as the bulk loader can use existing data to help with import. We simply need to use the "NCBI Taxonomy" database that is currently in the Chado instance of Tripal v3.
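The relationship between the four tables can be sketched with an in-memory SQLite database. This is only a simplified model: it assumes cut-down versions of the Chado tables (the real ones have more columns and live in PostgreSQL), a db record named ``NCBITaxon``, and placeholder organism values. The helper ``load_row`` is hypothetical and is not how the bulk loader is implemented.

```python
import sqlite3

# Simplified stand-ins for the four Chado tables; only the columns this
# example needs are modelled, and SQLite replaces PostgreSQL.
SCHEMA = """
CREATE TABLE organism (
    organism_id INTEGER PRIMARY KEY,
    genus TEXT NOT NULL,
    species TEXT NOT NULL,
    UNIQUE (genus, species)
);
CREATE TABLE db (
    db_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE dbxref (
    dbxref_id INTEGER PRIMARY KEY,
    db_id INTEGER NOT NULL REFERENCES db (db_id),
    accession TEXT NOT NULL,
    UNIQUE (db_id, accession)
);
CREATE TABLE organism_dbxref (
    organism_dbxref_id INTEGER PRIMARY KEY,
    organism_id INTEGER NOT NULL REFERENCES organism (organism_id),
    dbxref_id INTEGER NOT NULL REFERENCES dbxref (dbxref_id),
    UNIQUE (organism_id, dbxref_id)
);
"""

DB_NAME = "NCBITaxon"  # assumed name of the pre-existing db record


def load_row(conn, db_id, taxid, genus, species):
    """Insert one input row into organism, dbxref and organism_dbxref."""
    organism_id = conn.execute(
        "INSERT INTO organism (genus, species) VALUES (?, ?)", (genus, species)
    ).lastrowid
    dbxref_id = conn.execute(
        "INSERT INTO dbxref (db_id, accession) VALUES (?, ?)", (db_id, taxid)
    ).lastrowid
    conn.execute(
        "INSERT INTO organism_dbxref (organism_id, dbxref_id) VALUES (?, ?)",
        (organism_id, dbxref_id),
    )


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# On a real site this db record already exists; it is created here only
# so that the example is self-contained.
conn.execute("INSERT INTO db (name) VALUES (?)", (DB_NAME,))

# Look up the existing db record instead of inserting a new one.
db_id = conn.execute("SELECT db_id FROM db WHERE name = ?", (DB_NAME,)).fetchone()[0]
load_row(conn, db_id, "1001", "Fragaria", "exampleone")  # placeholder values

linked = conn.execute(
    """
    SELECT o.genus, o.species, d.name, x.accession
    FROM organism o
    JOIN organism_dbxref ox ON ox.organism_id = o.organism_id
    JOIN dbxref x ON x.dbxref_id = ox.dbxref_id
    JOIN db d ON d.db_id = x.db_id
    """
).fetchall()
print(linked)
```

The final query walks from the organism through the linker table to the cross reference and its database, which is the same path the finished template populates.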

Creating a New Bulk Loader Template
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-----------------------------------

Now that we know where all of the data in the input file will go and we have the necessary dependencies in the database (i.e. the NCBI Taxonomy database), we can create a new bulk loader template. Navigate to ``Tripal → Data Loaders → Chado Bulk Loader``, click the **Templates** tab in the top right corner, and finally click the link **Add Template**. The following page appears:

.. image:: ./bulk_loader.1.png


We first need to provide a name for our template. Try to name templates in a way that is meaningful to others. Currently only site administrators can load files using the bulk loader. However, future versions of Tripal will provide functionality to allow other privileged users to use the bulk loader templates. Thus, it is important to name the templates so that others can easily identify the purpose of the template. For this example, enter the name **NCBI Taxonomy Importer (taxid, genus, species)**. The following page appears:

.. image:: ./bulk_loader.2.png
Expand Down Expand Up @@ -129,7 +135,6 @@ Next, we need to add the **species** field to the record. Click the **Add Field*
* Chado Field/Column: species
* Column: 3


We now have two fields for our organism record:

.. image:: ./bulk_loader.5.png
Expand All @@ -151,7 +156,6 @@ To this point, we have built the loader such that it can load two of the three c
* Constant Value: NCBITaxon
* Check "Ensure the value is in the table"


Here we use a field type of **Constant** rather than **Data**. This is because we are providing the value to be used in the record rather than using a value from the input file. The value we are providing is "NCBI Taxonomy" which is the name of the database we added previously. The goal is to match the name "NCBI Taxonomy" with an entry in the **db** table. Click the **Save Changes** button.

We now see a second record on the **Edit Template** page. However, the mode for this record is insert. We do not want to insert this value into the table; we want to select it, because we need the corresponding **db_id** for the **dbxref** record. To change this, click the Edit link to the left of the **NCBI Taxonomy DB** record. Here we want to select only the option **SELECT ONCE**. We choose this option because the database entry returned by the record applies to the entire input file, so we only need to select it one time. Otherwise, the select statement would execute for each row in the input file, causing excess queries. Finally, click **Save Record**. The **NCBI Taxonomy DB** record now has a mode of **select once**. Because the field is a constant, the bulk loader need not execute that record for every row it imports from our input file: it selects the record once, and the record is then available throughout the entire import process.
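The saving from **SELECT ONCE** can be illustrated with a small sketch that counts lookups. The names ``CountingLookup`` and ``load`` are hypothetical and only model the behaviour described above; they are not Tripal's implementation, and the rows use placeholder values.

```python
class CountingLookup:
    """Stand-in for the db table that counts how often it is queried."""

    def __init__(self, table):
        self.table = table
        self.queries = 0

    def select_db_id(self, name):
        self.queries += 1
        return self.table[name]


def load(rows, lookup, db_name, select_once):
    """Pair each row's taxonomy ID with the db_id of the constant db record."""
    cached = None
    linked = []
    for taxid, _genus, _species in rows:
        if select_once:
            if cached is None:
                cached = lookup.select_db_id(db_name)  # first row only
            db_id = cached
        else:
            db_id = lookup.select_db_id(db_name)  # repeated for every row
        linked.append((db_id, taxid))
    return linked


# Placeholder rows, not real NCBI data.
rows = [("1001", "Fragaria", "a"), ("1002", "Fragaria", "b"), ("1003", "Fragaria", "c")]

once = CountingLookup({"NCBITaxon": 1})
per_row = CountingLookup({"NCBITaxon": 1})
result_once = load(rows, once, "NCBITaxon", select_once=True)
result_per_row = load(rows, per_row, "NCBITaxon", select_once=False)
print(once.queries, per_row.queries)
```

Both modes produce the same links; only the number of select statements differs, which matters for large input files.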
Expand All @@ -174,11 +178,8 @@ Now that we have a record that selects the **db_id** we can now create the **dbx

Click the Save Changes button. The Edit Template page appears.


.. image:: ./bulk_loader.6.png



Again, we need to edit the record to make the loader more fault tolerant. Click the Edit link to the left of the Taxonomy ID record. Select the following:

* Insert
Expand All @@ -196,7 +197,6 @@ To complete this record, we need to add the accession field. Click the Add field

At this stage, we should have three records: Organism, NCBI Taxonomy DB, and Taxonomy ID. We can now add the final record that will insert a record into the **organism_dbxref** table. Create this new record with the following details:


* For the record:
* Record: New Record
* Unique Record Name: Taxonomy/Organism Linker
Expand Down Expand Up @@ -227,18 +227,16 @@ Create the second field:

We are now done! We have created a bulk loader template that reads in a file with three columns containing an NCBI taxonomy ID, a genus and a species. The loader places the genus and species in the **organism** table, adds the NCBI taxonomy ID to the **dbxref** table, links it to the NCBI Taxonomy entry in the **db** table, and then adds an entry to the **organism_dbxref** table that links the organism to the NCBI taxonomy ID. The following screenshot shows how the template should appear:


.. image:: ./bulk_loader.7.png


To save the template, click the **Save Template** link at the bottom of the page.

Creating a Bulk Loader Job (importing a file)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
---------------------------------------------

Now that we have created a bulk loader template we can use it to import a file. We will import the ``Fragaria.txt`` file downloaded previously. To import a file using a bulk loader template, click the **Add Content** link in the administrative menu and click **Bulk Loading Job**. A bulk loading job is required each time we want to load a file. Below is a screenshot of the page used for creating a bulk loading job.


.. image:: ./bulk_loader.8.png

Provide the following values:

Expand All @@ -250,15 +248,11 @@ Provide the following values:

Click **Save**. The page then appears as follows:


.. image:: ./bulk_loader.8.png

.. image:: ./bulk_loader.9.png

You can see details about the constants that are used by the template and where the fields from the input file will be stored by clicking the **Data Fields** tab in the table of contents on the left sidebar.


.. image:: ./bulk_loader.9.png

.. image:: ./bulk_loader.10.png

Now that we have created a job, we can submit it for execution by clicking the **Submit Job** button. This adds a job to the Tripal Jobs system, and we can launch the job as we have previously in this tutorial:

Expand Down Expand Up @@ -296,32 +290,26 @@ After execution of the job you should see similar output to the terminal window:
Our *Fragaria* species should now be loaded, and we return to the Tripal site to see them. Click on the **Organisms** link in the **Search Data** menu. In the search form that appears, type "Fragaria" in the **Genus** text box and click the **Filter** button. We should see the list of newly added *Fragaria* species.

.. image:: ./bulk_loader.10.png


Before the organisms will have Tripal pages, the Chado records need to be **Published**. You can publish them by navigating to ``admin -> Tripal Content -> Publish Tripal Content``. Select the **organism** table from the dropdown and run the job.

.. image:: ./bulk_loader.11.png

Before the organisms will have Tripal pages, the Chado records need to be **Published**. You can publish them by navigating to **Tripal Content -> Publish Tripal Content**. Select the **organism** table from the dropdown and run the job.

.. note::

In Tripal 2, records were synced by navigating to ``Tripal → Chado Modules → Organisms``.


In Tripal 2, records were synced by navigating to **Tripal → Chado Modules → Organisms**.

Once complete, return to the search form, find a *Fragaria* species that has been published and view its page. You should see a Cross References link in the left table of contents. If you click that link you should see the NCBI Taxonomy ID with a link to the page:

.. image:: ./bulk_loader.11.png
.. image:: ./bulk_loader.12.png


Sharing Your Templates with Others
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
----------------------------------

Now that our template for loading organisms with NCBI Taxonomy IDs is complete, we can share it with anyone else who has a Tripal-based site. We simply export the template in text format, place it in a text file or directly in an email, and send it to a collaborator for import into their site. To do this, navigate to ``Tripal → Chado Data Loaders → Bulk Loader`` and click the **Templates** tab at the top. Here we find a table of all the templates we have created. We should see our template named **NCBI Taxonomy Importer (taxid, genus, species)**. In the far right column is a link to export that template. Clicking that link will redirect you to a page where the template is provided as a serialized PHP array.

.. image:: ./bulk_loader.12.png
Now that our template for loading organisms with NCBI Taxonomy IDs is complete, we can share it with anyone else who has a Tripal-based site. We simply export the template in text format, place it in a text file or directly in an email, and send it to a collaborator for import into their site. To do this, navigate to **Tripal → Chado Data Loaders → Bulk Loader** and click the **Templates** tab at the top. Here we find a table of all the templates we have created. We should see our template named **NCBI Taxonomy Importer (taxid, genus, species)**. In the far right column is a link to export that template. Clicking that link will redirect you to a page where the template is provided as a serialized PHP array.

.. image:: ./bulk_loader.13.png

Simply cut-and-paste all of the text in the **Export** field and send it to a collaborator.
Cut-and-paste all of the text in the **Export** field and send it to a collaborator.

To import a template that may have been created by someone else, navigate to ``Tripal → Chado Data Loaders → Bulk Loader`` and click the **Templates** tab. A link titled **Import Template** appears above the table of existing importers. The page that appears when that link is clicked will allow you to import any template shared with you.
To import a template that may have been created by someone else, navigate to **Tripal → Chado Data Loaders → Bulk Loader** and click the **Templates** tab. A link titled **Import Template** appears above the table of existing importers. The page that appears when that link is clicked will allow you to import any template shared with you.
15 changes: 8 additions & 7 deletions docs/user_guide/configuring_page_display.rst
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@ Configuring Page Display

This is one of the many exciting new features of Tripal v3.x. In this version of Tripal we have taken integration with Drupal Fields to a whole new level, representing each piece of content (in Chado or otherwise) as a Drupal Field. What this means for site builders is unprecedented control over content display and arrangement through the administrative user interface: no more editing PHP template files to change the order, grouping or wording of content!

You can configure the display of a given Tripal Content Type by navigating to ``Structure → Tripal Content Types`` and then selecting the "Manage Display" link beside the content type you would like to configure.
You can configure the display of a given Tripal Content Type by navigating to **Structure → Tripal Content Types** and then selecting the **Manage Display** link beside the content type you would like to configure.

.. image:: ./configuring_page_display.1.png

Expand All @@ -17,15 +17,15 @@ The Manage Display User Interface lists each Drupal Field in the order they will


Rearranging Fields
~~~~~~~~~~~~~~~~~~~
------------------

To rearrange the fields within a Tripal pane, simply drag them into the order you would like. For example, the description is currently within the Summary table, but it makes much more sense for it to be below the table while still within the summary. To do this, drag the description field to the bottom of the summary table and then move it in one level as shown in the following screenshot. Then click the **Save** button at the bottom to save the changes.

.. image:: configuring_page_display.3.rearrange.png


Removing Fields and/or Field Lables
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Removing Fields and/or Field Labels
-----------------------------------

Now say we don't want the label "Description" in front of the description content since it's pretty self-explanatory. We can do that by changing the drop-down beside "Description", which currently says "Above", to "Hidden". This removes the label for the field, assuming it's not within a table.

Expand All @@ -35,21 +35,22 @@ There may also be data you want to collect from your user but don't want to disp

Don't forget to save the configuration often as you are changing it. You will not see changes to the page unless the **Save** button at the bottom of the Manage Display UI is clicked.


Changing Tripal Pane Names
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
--------------------------

The name of a Tripal Pane is displayed both in the header of the Pane itself and in the Table of Contents. To change this name, click the gear button to the far right of the Tripal Pane you would like to change. This will bring up a blue pane of settings. Changing the Field Group Label will change the display name of the pane. For example, the following screenshot shows how you would change the "Cross References" Tripal Pane to be labeled "External Resources" instead, if that is what you prefer. Then just click the Update button to see your changes take effect.

.. image:: ./configuring_page_display.4.png


Display/Hide Tripal Panes on Page Load
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
--------------------------------------

You can also easily control which Tripal Panes are displayed to the user on initial page load. By default, the Summary Pane is the only one configured to show. However, if you would prefer all panes, or a specific subset of panes, to show on page load, click the gear button to the far right of each Tripal Pane you want displayed and uncheck the "Hide panel on page load" checkbox. This gives you complete control over which panes your user sees first. If more than one pane is displayed on page load, they will be shown in the order they are listed on the Manage Display UI.

Display/Hide Empty Fields
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-------------------------

By default, Tripal v3 hides all empty fields from the user. However, like most behaviour in Tripal, this can be configured. If you would prefer to show all fields to the user regardless of whether there is content for that particular page, navigate to ``Structure → Tripal Content Types`` and then click on the edit link beside the Tripal Content Type you would like to show empty fields for. Near the bottom of this form is a **Field Display** drop-down. Change this drop-down to "show empty fields" and then click **Save Content Type**. As an example, we have changed this setting for the organism content type and, as you can see below, all fields available to the organism content type are now visible, including empty fields such as cross references and relationships.

Binary file added docs/user_guide/example_genomics/analyses.1.png
Binary file added docs/user_guide/example_genomics/analyses.2.png