Commit

Merge c9c1008 into 65298b7
mboudet committed Aug 6, 2021
2 parents 65298b7 + c9c1008 commit 1ccc3d6
Showing 34 changed files with 698 additions and 61 deletions.
1 change: 1 addition & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -15,3 +15,4 @@ AskOmics is a visual SPARQL query interface supporting both intuitive data integ
## Documentation

All documentation, including installation instructions, is [here](https://flaskomics.readthedocs.io/en/latest/)
A Galaxy Training tutorial is available [here](https://training.galaxyproject.org/training-material/topics/transcriptomics/tutorials/rna-seq-analysis-with-askomics-it/tutorial.html)
6 changes: 3 additions & 3 deletions docs/abstraction.md
@@ -1,6 +1,6 @@
During integration of TSV/CSV, GFF and BED files, AskOmics creates RDF triples that describe the data. This set of triples is called the *Abstraction*. These triples define *Entities*, *Attributes* and *Relations*. The abstraction is used to build the *Query builder*.

Raw RDF can be integrated into AskOmics. In this case, the abstraction has to be built manually. The following documentation explains how to write an AskOmics abstraction in Turtle format.
Raw RDF can be integrated into AskOmics. In this case, the abstraction has to be built manually. The following documentation explains how to manually write an AskOmics abstraction in Turtle format.

# Namespaces

Expand All @@ -19,7 +19,7 @@ PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
```
<br />

!!! info
!!! note "Info"
Namespaces `:` and `askomics:` are defined in the AskOmics config file (`config/askomics.ini`)

# Entity
Expand All @@ -34,7 +34,7 @@ The entity is a class. In the query builder, it is represented with a graph node
```
<br />

!!! info
!!! note "Info"
`:EntityName rdf:type :startPoint` is not mandatory. If the entity has this triple, a query can be started from this node.

# Attributes
140 changes: 138 additions & 2 deletions docs/data.md
@@ -1,5 +1,141 @@
In this tutorial, we will learn how to build CSV/TSV file for AskOmics.
# CSV/TSV files

AskOmics will integrate a CSV/TSV file using its header. The *type* of each column will be predicted, but you will be able to modify it before integration.

![CSV/TSV integration](img/csv_convert.png){: .center}

*-- Work in progress --*

## Entity (first column)

### Entity URI

The first column of the file defines the entity itself: the column name becomes the entity name, and the values become the entity **URIs**.
Each **URI** is created as follows:

* If the value is a **URL**, it will be integrated as it is.
* If the value is a [CURIE](https://www.w3.org/TR/2010/NOTE-curie-20101216/), it will be transformed into a URL before integration. The list of managed CURIE formats is available [here](https://github.com/askomics/flaskomics/blob/master/askomics/libaskomics/prefix.cc.json).
* Otherwise, the value will be appended to either the AskOmics *namespace_data* value or a custom base URI, if one is specified in the integration form.
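
The three rules above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the AskOmics implementation: the function `resolve_entity_uri`, the `NAMESPACE_DATA` value and the one-entry `PREFIXES` table are hypothetical stand-ins for the configured *namespace_data* and the prefix list linked above.

```python
from urllib.parse import urlparse

# Hypothetical values: the real base URI comes from the AskOmics
# configuration (namespace_data) and the real prefixes from prefix.cc.json.
NAMESPACE_DATA = "http://example.org/data/"
PREFIXES = {"uniprot": "http://purl.uniprot.org/uniprot/"}


def resolve_entity_uri(value, base_uri=NAMESPACE_DATA, prefixes=PREFIXES):
    """Return a URI for a first-column value, following the three rules."""
    parsed = urlparse(value)
    # Rule 1: a full URL is kept as it is.
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return value
    # Rule 2: a CURIE (prefix:reference) with a known prefix is expanded.
    prefix, separator, reference = value.partition(":")
    if separator and prefix in prefixes:
        return prefixes[prefix] + reference
    # Rule 3: any other value is appended to the base URI.
    return base_uri + value


for raw in ("http://nextprot.org/rdf/gene/ENSG00000169594", "uniprot:P12345", "gene_1"):
    print(raw, "->", resolve_entity_uri(raw))
```

Passing a different `base_uri` mimics the custom base URI option of the integration form.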

!!! Warning
Unless you are trying to merge entities, make sure your URIs are unique across **both your personal and public datasets**.

### Entity type

The entity type can be either "starting entity" or "entity". A "starting entity" can be used to start a query on the AskOmics homepage. Both types appear as nodes in the AskOmics interface.

### Inheritance

The entity can inherit the attributes and relations of a 'mother' entity. This means you will be able to query the sub-entity on both its own attributes and relations and those of its 'mother'. The 'mother' entity, however, will not have access to any 'daughter' attributes or relations.

To set up inheritance, the column name needs to be formatted as follows:

* *daughter_entity_name*<*mother_entity_name* (with the < symbol)

!!! Warning
The values of this column must be URIs of the 'mother' entity.
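
As an illustration of the header convention, the sketch below builds a small tab-separated file whose first column declares an *mRNA* entity inheriting from *transcript*, then splits the header on the `<` symbol. The file content and the helper `parse_entity_header` are hypothetical and only mirror the naming rule described here.

```python
import csv
import io

# Hypothetical file: 'mRNA' inherits from 'transcript', so the first column
# holds URIs (here simple values) of existing 'transcript' entities.
EXAMPLE_TSV = "mRNA<transcript\tcoding\ntranscript_1\tTrue\ntranscript_2\tFalse\n"


def parse_entity_header(column_name):
    """Split a first-column header into (entity_name, mother_name or None)."""
    daughter, separator, mother = column_name.partition("<")
    if not separator:
        return column_name.strip(), None
    return daughter.strip(), mother.strip()


reader = csv.reader(io.StringIO(EXAMPLE_TSV), delimiter="\t")
header = next(reader)
entity, mother = parse_entity_header(header[0])
print(entity, mother)  # mRNA transcript
```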

## Attributes

Each column after the first one will be integrated as an *attribute* of the entity. The column name will be set as the name of the attribute.
Several attribute types are available (AskOmics will guess the type of a column based on its name and its values).
The type of an attribute dictates how it is handled in the query form (e.g. text field, value selector).

Attributes can take the following types :

### Base types

- Numeric: if the values are numeric
- Text: if all values are strings
- Date: if all values are dates (using *dateutil.parser*)
- *(Auto-detected terms : 'date', 'time', 'birthday', 'day')*
- Category: if there is a limited number of repeated values
- Boolean: if the values are binary ("True" and "False", or "0" and "1")

!!! Warning
If the date format is ambiguous (e.g. 01/02/2020), AskOmics will interpret it as *day/month/year*.
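
To make the day-first rule concrete, here is a minimal sketch using only the Python standard library. It is a stand-in, not the AskOmics code: the helper `parse_day_first` is hypothetical, and the actual parsing is described above as relying on *dateutil.parser*, whose `parse` function accepts a `dayfirst` argument for the same purpose.

```python
from datetime import datetime


def parse_day_first(value):
    """Parse a slash-separated date, reading it as day/month/year."""
    return datetime.strptime(value, "%d/%m/%Y").date()


# 03/04/2020 is read as 3 April 2020, not 4 March 2020.
print(parse_day_first("03/04/2020").isoformat())  # 2020-04-03
```

Writing dates in an unambiguous form such as `2020-04-03` in the input file avoids the question entirely.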

### FALDO types

If the entity describes a locatable element on a genome (based on the FALDO ontology):

- [Reference](http://biohackathon.org/resource/faldo#reference): chromosome *(Auto-detected terms : 'chr', 'ref')*
- [Strand](http://biohackathon.org/resource/faldo#StrandedPosition): strand *(Auto-detected terms : 'strand')*
- Start: start position *(Auto-detected terms : 'start', 'begin')*
- End: end position *(Auto-detected terms : 'end', 'stop')*

!!! Warning
To mark an entity as a *FALDO entity*, you need to provide **at least** *'Start'* and *'End'* columns.
'Reference' and/or 'Strand' are optional, but will enable more specific queries (e.g. *Same reference* or *Same strand*)
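
The sketch below shows one way the auto-detected terms could map column names to FALDO types, and how the *Start* plus *End* requirement could be checked. The terms are copied from the list above; the case-insensitive substring matching and the helper names `guess_faldo_type` and `is_faldo_entity` are assumptions made for illustration, and the real detection may differ.

```python
# Auto-detected terms copied from the FALDO type list.
FALDO_TERMS = {
    "reference": ("chr", "ref"),
    "strand": ("strand",),
    "start": ("start", "begin"),
    "end": ("end", "stop"),
}


def guess_faldo_type(column_name):
    """Return the FALDO type suggested by a column name, or None.

    Assumption: a term matches when it appears anywhere in the
    lower-cased column name.
    """
    name = column_name.lower()
    for faldo_type, terms in FALDO_TERMS.items():
        if any(term in name for term in terms):
            return faldo_type
    return None


def is_faldo_entity(header):
    """True when the attribute columns include both a start and an end."""
    detected = {guess_faldo_type(column) for column in header[1:]}
    return {"start", "end"} <= detected


print(is_faldo_entity(["gene", "chromosome", "strand", "start", "end"]))
print(is_faldo_entity(["gene", "chromosome", "start"]))
```

Because the predicted type can be changed before integration, a wrong guess from a heuristic like this one would not be final.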

### Relations

A column can also symbolize a relation to another entity. In this case, the column name must be of the form :

* *relationName@RelatedEntityName* (with the @ symbol)

Two types are available :

- Directed: Relation from this entity to the targeted one *(e.g. A is B’s father, but B is not A’s father)*
- Symmetric: Relation that works in both directions *(e.g. A loves B, and B loves A)*

!!! Warning
The content of the column must be URIs of the related entity.
*(The related entity and its URIs may be created afterwards)*

Linked URIs must match one of these three formats :

- Full URI
- CURIE
- Simple value (the value will be transformed into a URI using the AskOmics *namespace_data* value)

This link between entities will show up in the query screen, allowing users to query related entities.
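
The relation naming rule can be illustrated with a short sketch that splits a column name on the `@` symbol. The header and the helper `parse_relation_header` are hypothetical examples; only the *relationName@RelatedEntityName* convention comes from this page.

```python
def parse_relation_header(column_name):
    """Split 'relationName@RelatedEntityName' into (relation, target).

    Returns None when the column name does not describe a relation.
    """
    relation, separator, target = column_name.partition("@")
    if not (separator and relation and target):
        return None
    return relation.strip(), target.strip()


# Hypothetical header of a 'transcript' file linking each row to a 'gene'.
header = ["transcript", "length", "belongs_to@gene"]
relations = {name: parse_relation_header(name) for name in header[1:]}
print(relations)
```

In this example the `belongs_to@gene` column would hold the URIs (full URI, CURIE or simple value) of the related *gene* entities.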

!!! note "Info"
**All** FALDO entities will be automatically linked with the *included_in* relation, without needing an explicit link.
You can still specify your own relations.

!!! Warning
For federated queries, the syntax is slightly different. Please refer to [this page](abstraction.md#linking-your-own-data) for more information.


# GFF files

!!! Warning
Only the *GFF3* format is managed by AskOmics.

Each GFF file can be integrated into several entities. You will be able to select the entities you wish to integrate beforehand. Available entities are the values of the 'type' column of the GFF file. The relations between entities (eg: *Parents* or *Derives_from*) will also be integrated.

![Integration interface for GFF files](img/gff_preview.png){: .center}

Extracted attributes are the following :

- Reference
- Strand
- Start
- End
- Any attribute in the "attributes" column
- "Parents" and "Derives_from" will be converted into relations

!!! note "Info"
All entities extracted from GFF files are *FALDO entities*, and will be linked implicitly with the *included_in* relation.
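
For readers unfamiliar with the layout of a GFF3 line, the sketch below splits one feature line into its nine standard columns and decodes the attributes column. It follows the GFF3 format itself (tab-separated columns, `tag=value` pairs separated by `;`, comma-separated multiple values, URL-encoded characters) and makes no claim about how AskOmics parses these files. The function name `parse_gff3_line` and the sample line are hypothetical.

```python
from urllib.parse import unquote

GFF3_COLUMNS = (
    "seqid", "source", "type", "start", "end",
    "score", "strand", "phase", "attributes",
)


def parse_gff3_line(line):
    """Parse one GFF3 feature line into a dictionary."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError("a GFF3 feature line has exactly 9 tab-separated columns")
    record = dict(zip(GFF3_COLUMNS, fields))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    attributes = {}
    for pair in record["attributes"].split(";"):
        if pair:
            tag, _, value = pair.partition("=")
            # A tag may carry several comma-separated values.
            attributes[tag] = [unquote(item) for item in value.split(",")]
    record["attributes"] = attributes
    return record


sample = "chr1\tsrc\tmRNA\t100\t200\t.\t+\t.\tID=mrna1;Parent=gene1"
feature = parse_gff3_line(sample)
print(feature["type"], feature["seqid"], feature["start"], feature["end"])
print(feature["attributes"])
```

Here the `type` column (`mRNA`) would be the entity name offered for selection, and the `Parent` tag is the kind of attribute that expresses a relation to another feature.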

# BED files

Each BED file will be integrated into one entity (the default entity name will be the file name, but it can be customized).

Extracted attributes are the following :

- Reference
- Strand
- Start
- End
- Score

!!! note "Info"
All entities extracted from BED files are *FALDO entities*, and will be linked implicitly with the *included_in* relation.
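
The extracted attributes correspond to standard BED columns, as this sketch shows for a single line. It reflects the BED format (chrom, chromStart, chromEnd, name, score, strand) rather than the AskOmics parser; `parse_bed_line` and the sample line are hypothetical, and the sketch keeps the coordinates exactly as written in the file.

```python
def parse_bed_line(line):
    """Parse one BED line, keeping the fields listed above."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise ValueError("a BED line needs at least chrom, chromStart and chromEnd")
    return {
        "reference": fields[0],
        "start": int(fields[1]),  # BED start positions are 0-based
        "end": int(fields[2]),
        # Score and strand are optional columns 5 and 6.
        "score": fields[4] if len(fields) > 4 else None,
        "strand": fields[5] if len(fields) > 5 else None,
    }


print(parse_bed_line("chr1\t10\t20\tfeature_1\t5\t-"))
```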

# TTL Files

You can integrate TTL files in AskOmics, either to integrate your own data, or to enable [federated queries](federation.md) to remote endpoints.
In both cases, you will need to generate your data in, or convert it to, the AskOmics format.

This can be done either [manually](abstraction.md) or [automatically](federation.md#auto-generate-external-abstraction-with-abstractor)
2 changes: 1 addition & 1 deletion docs/docs.md
@@ -1,4 +1,4 @@
all the documentation (including what you are reading) can be found [here](https://flaskomics.readthedocs.io). Files are on the [AskOmics repository](https://github.com/askomics/flaskomics/tree/master/docs).
All the documentation (including what you are reading) can be found [here](https://flaskomics.readthedocs.io). Files are on the [AskOmics repository](https://github.com/askomics/flaskomics/tree/master/docs).

# Serve doc locally

26 changes: 10 additions & 16 deletions docs/federation.md
Expand Up @@ -15,39 +15,33 @@ abstractor -e <endpoint_url> -p <entity_prefix> -o <output_file>
```

!!! Warning
Abstractor scans everything in the SPARQL endpoint. You may review the generated file to delete unwanted content.
Abstractor scans everything in the SPARQL endpoint. You may wish to review the generated file to delete unwanted content.


## Integrate external abstraction into AskOmics

Once the external endpoint's abstraction is generated, it's time to add it to AskOmics. Upload it and integrate it.
![integrate_external](img/integrate_external.png)
![Integrating an external abstraction](img/integrate_external.png){: .center}

!!! Warning
Check that `advanced options` > `Distant endpoint` contains the URL of the external endpoint.



# Query external endpoint

## Simple query
## Starting entities

If AskOmics contains local data, external start points are not displayed by default on the start page. Use the `Source` dropdown button to display external entities.

![external_startpoint](img/external_startpoint.png)


## Federated query


External entities can be queried just like local entities. But to link a local dataset to the external endpoint, the file must be structured in a certain way.
![External startpoint](img/external_startpoint.png){: .center}

### Build file
## Linking to your own data

The input file must describe the relation with the external entity. It goes through the header, who must contain the URI of the targeted entity. Content of the file must also be the exact uri of the targeted entity.
To link a local dataset to the external endpoint, the file must be structured in a certain way.

The input file must describe the relation with the external entity. Much like a 'normal' relation, it goes through the header. In this case, however, the column name must contain either the full URI or the CURIE of the entity (e.g. *http://nextprot.org/rdf#Gene*). The values of the column must also be the exact URI (full URI or CURIE) of the targeted entity.

For example, the file below describes an entity *gene* that is linked to an external entity *Gene*. The external one is prefixed with the full URI used in the external endpoint. In the content of the file, full URIs have to be used too.
For example, the file below describes an entity *gene* that is linked to an external entity *Gene*. The external one is prefixed with the full URI used in the external endpoint. In the content of the file, full URIs or CURIEs have to be used too.


gene|value|concern@http://nextprot.org/rdf#Gene
Expand All @@ -56,6 +50,6 @@ gene_1|0|http://nextprot.org/rdf/gene/ENSG00000169594
gene_2|1|http://nextprot.org/rdf/gene/ENSG00000156603


### Perform a federated query
## Perform a federated query

Once the relations are well described, link between local and distant entities are automatically done by AskOmics. The Query is distributed to the concerned endpoint and results are returned like a classic query.
Once the relations are described, links between local and distant entities are automatically done by AskOmics. The query is distributed to the external endpoint and results are returned like a classic query.
17 changes: 7 additions & 10 deletions docs/galaxy.md
Expand Up @@ -17,7 +17,7 @@ On your Galaxy account, go to the top menu *User* → *API Keys* and copy your A

On AskOmics, go to <navbar><i class="fa fa-user"></i> Your Name</navbar> → <navbar><i class="fa fa-cog"></i> Account management</navbar> → **Connect a Galaxy account** and enter the Galaxy URL and API Key.

![askogalaxy](img/askogalaxy.png)
![](img/askogalaxy.png){: .center}

Once a Galaxy account is added to AskOmics, you can access all your Galaxy datasets from AskOmics.

Expand Down Expand Up @@ -51,30 +51,30 @@ Galaxy Interactive Tools (GxITs) are a method to run containerized tools that ar

Search for the AskOmics Interactive tool using the search bar.

![Search a Galaxy Tool](img/galaxy_search_tool.png)
![](img/galaxy_search_tool.png){: .center}


Choose input files to automatically upload them into AskOmics

![Input files](img/galaxy_input_data.png)
![](img/galaxy_input_data.png){: .center}

!!! Tip
You will be able to add more input files later

A dedicated AskOmics instance will be deployed into the cluster. Wait a few minutes and go to the instance using the `click here to display` link.

![Galaxy](img/galaxy_execute_it.png)
![](img/galaxy_execute_it.png){: .center}

Once you are in your AskOmics instance, you can see your uploaded files in the <navbar><i class="fa fa-file"></i> Files</navbar> tab.

![Galaxy](img/galaxy_askomics_files.png)
![](img/galaxy_askomics_files.png){: .center}

## Upload additional files

In addition to the <navbar><i class="fa fa-upload"></i> Computer</navbar> and <navbar><i class="fa fa-upload"></i> URL</navbar> buttons, you can now use the <navbar><i class="fa fa-upload"></i> galaxy</navbar> button to import datasets from your Galaxy histories.


![Galaxy](img/galaxy_import_from_galaxy.png)
![](img/galaxy_import_from_galaxy.png){: .center}

## Integrate and Query

Expand All @@ -84,7 +84,4 @@ follow the [tutorial](/tutorial#data-integration) to integrate and query your da

Once you have your result, use `Send result to Galaxy` to export a TSV file into your most recently used Galaxy history.

![Galaxy](img/galaxy_history_result.png)



![](img/galaxy_history_result.png){: .center}
Binary file modified docs/img/askogalaxy.png
Binary file added docs/img/askograph.png
Binary file added docs/img/attribute_box.png
Binary file added docs/img/attributes.png
Binary file added docs/img/csv_convert.png
Binary file added docs/img/custom_nodes.png
Binary file modified docs/img/external_startpoint.png
Binary file added docs/img/faldo.png
Binary file added docs/img/filters.png
Binary file added docs/img/form.png
Binary file added docs/img/form_edit.png
Binary file added docs/img/form_example.png
Binary file modified docs/img/gff.png
Binary file modified docs/img/gff_preview.png
Binary file added docs/img/minus.png
Binary file added docs/img/sparql.png
Binary file modified docs/img/startpoint.png
Binary file added docs/img/template.png
Binary file added docs/img/union.png
Binary file added docs/img/union_duplicated.png
11 changes: 10 additions & 1 deletion docs/index.md
@@ -1,8 +1,17 @@
AskOmics is a visual SPARQL query interface supporting both intuitive data integration and querying while shielding the user from most of the technical difficulties underlying RDF and SPARQL
AskOmics is a visual SPARQL query interface supporting both intuitive data integration and querying while shielding the user from most of the technical difficulties underlying RDF and SPARQL.

AskOmics allows heterogeneous data (formatted as tabular files, GFF, BED or native RDF files) to be loaded into an RDF triplestore, to be transparently and interactively queried through a user-friendly interface.

AskOmics also supports federated queries to external SPARQL endpoints.

![AskOmics](img/askograph.png){: .center}

- AskOmics Tutorials
- [Overview tutorial](tutorial.md): How to use AskOmics with example data
- [Prepare your data](data.md): How to format your own data for AskOmics
- [Building a query](query.md): Learn how the query building interface works
- [Results interface](results.md): In-depth guide of the results interface
- [Template & forms](template.md): An overview of the templates & forms functionalities
- [Build an RDF abstraction](abstraction.md): Learn how to build an RDF abstraction for RDF data
- [Perform federated queries](federation.md): How to query your own data with external resources
- [Use AskOmics with Galaxy](galaxy.md): How to connect AskOmics with your Galaxy history