# Simple query to OcéanIA Platform
## This example shows how to extract arbitrary biological sub-sequences from a FASTA file available in the OcéanIA Platform

### 1. Install oceania-query-fasta package

In [1]:
!pip install oceania-query-fasta



### 2. Prepare request params

In [2]:
# TARA_SAMPLE_KEY is the key used to identify a file in the OcéanIA services
TARA_SAMPLE_KEY = "data/raw/tara/OM-RGC_v2/assemblies/TARA_A100000171.scaftig.gz"

# REQUEST_PARAMS is a list of tuples, each identifying one subsequence to extract.
# Each tuple has the form (sequence_id, start_index, stop_index, sequence_type).
# sequence_type is optional; accepted values are "raw", "complement" and
# "reverse_complement". If omitted, it defaults to "raw".
REQUEST_PARAMS = [
    ("TARA_A100000171_G_scaffold48_1", 10, 50, "complement"),
    ("TARA_A100000171_G_scaffold48_1", 10, 50),
    ("TARA_A100000171_G_scaffold48_1", 10, 50, "reverse_complement"),
    ("TARA_A100000171_G_scaffold181_1", 0, 50),
    ("TARA_A100000171_G_scaffold181_1", 100, 200),
    ("TARA_A100000171_G_scaffold181_1", 200, 230),
    ("TARA_A100000171_G_scaffold493_2", 54, 76),
    ("TARA_A100000171_G_scaffold50396_2", 87, 105),
    ("TARA_A100000171_G_C2001995_1", 20, 635),
    ("TARA_A100000171_G_C2026460_1", 0, 100),
]
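
To make the three sequence types concrete, here is a minimal local sketch of what "raw", "complement" and "reverse_complement" usually mean for a DNA string, together with the "raw" default for three-element tuples. The helper names `apply_sequence_type` and `normalize_request` are hypothetical and are not part of `oceania-query-fasta`; the service performs the transformation itself, so this code is illustration only. The two sample strings are the "raw" and "complement" sequences that this notebook prints for `("TARA_A100000171_G_scaffold48_1", 10, 50)`. Their length of 40 suggests the stop index is exclusive, although that is an inference from the output rather than a documented guarantee.

```python
# Illustration only: hypothetical helpers, not the oceania-query-fasta API.

# Translation table mapping each nucleotide to its complementary base.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

ACCEPTED_TYPES = ("raw", "complement", "reverse_complement")


def apply_sequence_type(sequence, sequence_type="raw"):
    """Return the sequence transformed according to sequence_type."""
    if sequence_type == "raw":
        return sequence
    if sequence_type == "complement":
        return sequence.translate(_COMPLEMENT)
    if sequence_type == "reverse_complement":
        return sequence.translate(_COMPLEMENT)[::-1]
    raise ValueError(f"sequence_type must be one of {ACCEPTED_TYPES}")


def normalize_request(param):
    """Return a 4-tuple, filling in "raw" when the type is omitted."""
    if len(param) == 3:
        return (*param, "raw")
    if len(param) == 4 and param[3] in ACCEPTED_TYPES:
        return tuple(param)
    raise ValueError(f"invalid request tuple: {param!r}")


# Sequences copied from this notebook's printed result (rows 1 and 0).
raw_seq = "TGGCATTGCATCCGGTATAATAAAAGTACCAGAAGGTGTT"
complement_seq = "ACCGTAACGTAGGCCATATTATTTTCATGGTCTTCCACAA"

# The local complement reproduces the sequence returned by the service.
print(apply_sequence_type(raw_seq, "complement") == complement_seq)
print(normalize_request(("TARA_A100000171_G_scaffold48_1", 10, 50)))
```

A check like this can be handy for sanity-testing returned sequences locally, but the values returned by the OcéanIA services remain the reference.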

### 3. Perform call to the OcéanIA services and print results

In [3]:
from oceania import get_sequences_from_fasta

request_result = get_sequences_from_fasta(
    TARA_SAMPLE_KEY,
    REQUEST_PARAMS
)

# request_result is loaded as a pandas.DataFrame
print(f"Result loaded into a {type(request_result).__name__}")
print(request_result)

[29-06-2021 15:02:36] Sending request for fasta sequences
[29-06-2021 15:02:37] Request accepted
[29-06-2021 15:02:37] Waiting for results...
[29-06-2021 15:02:48] Done. Elapsed time: 11.784561458975077 seconds


Result loaded into a DataFrame
                                  id  start  end                type  \
0     TARA_A100000171_G_scaffold48_1     10   50          complement   
1     TARA_A100000171_G_scaffold48_1     10   50                 raw   
2     TARA_A100000171_G_scaffold48_1     10   50  reverse_complement   
3    TARA_A100000171_G_scaffold181_1      0   50                 raw   
4    TARA_A100000171_G_scaffold181_1    100  200                 raw   
5    TARA_A100000171_G_scaffold181_1    200  230                 raw   
6    TARA_A100000171_G_scaffold493_2     54   76                 raw   
7  TARA_A100000171_G_scaffold50396_2     87  105                 raw   
8       TARA_A100000171_G_C2001995_1     20  635                 raw   
9       TARA_A100000171_G_C2026460_1      0  100                 raw   

                                            sequence  
0           ACCGTAACGTAGGCCATATTATTTTCATGGTCTTCCACAA  
1           TGGCATTGCATCCGGTATAATAAAAGTACCAGAAGGTGTT  
2          