# Sort HSPs by Query Offset

This notebook shows how to use pandas DataFrames to sort HSPs (high-scoring segment pairs) by query (or subject) offset.

## Installation
First, install the API library into your virtual environment:

In [None]:
%pip install --quiet ncbi-cloudblast-api

For this demo, you also need to install `pandas`:

In [None]:
%pip install --quiet pandas==0.24.2

## Before you start
To use this library, you must provide the address of a CloudBlast API service endpoint:

In [None]:
API_ADDRESS = ""  # set the API service address, e.g. "35.245.159.177:5000"

## Perform a BLAST Search

In [None]:
from ncbi_cloudblast_api.api_client import APIClient

if not API_ADDRESS:
    raise ValueError("Please set value for API_ADDRESS in the previous step.")

client = APIClient(API_ADDRESS)

In [None]:
query = "u93237"

print(f"Running BLAST search for: {query} ...")

# The "search" method waits for the BLAST search to complete
# and then returns the result.
res = client.search(accession=query)
print("Done.")

## Sort the Search Result Using pandas.DataFrame

In [None]:
from pandas import DataFrame

# A list of fields to get from the search result
fields = ["qaccver", "saccver", "pident", "length", "mismatch",
          "gapopen", "qstart", "qend", "sstart", "send",
          "evalue", "bitscore"]

# Keep only the columns listed above from the search result
df = res.as_dataframe()[fields]
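
Selecting columns with a list of names, as in the cell above, fails with a `KeyError` if any name is missing from the DataFrame. The following is a minimal sketch of a more forgiving selection, assuming a small made-up table in place of the real search result; the names `raw`, `wanted`, `present`, and `subset` are hypothetical and not part of the CloudBlast API.

```python
import pandas as pd

# Made-up stand-in for a search result (illustrative values only)
raw = pd.DataFrame({
    "qaccver": ["Q1", "Q1"],
    "saccver": ["S2", "S1"],
    "qstart": [30, 1],
    "qend": [60, 25],
    "staxid": [9606, 10090],
})

# Fields we would like to keep; "evalue" is absent from the toy table
wanted = ["qaccver", "saccver", "qstart", "qend", "evalue"]

# Keep only the requested fields that actually exist, in the requested order
present = [name for name in wanted if name in raw.columns]
subset = raw[present]

print(subset.columns.tolist())
```

Whether silently dropping a missing field is acceptable depends on the analysis; raising an error may be the safer choice when every field is required.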

In [None]:
# First 20 HSPs (in default sort order)
df.head(20)

In [None]:
# Sort by subject sequence accession.version and query start position
df.sort_values(["saccver", "qstart"])
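
The same approach works for the subject offset. The sketch below uses a small made-up table rather than a real search result, and it assumes that `send` can be smaller than `sstart` for some hits (as happens for minus-strand nucleotide alignments in BLAST tabular output), so it sorts on the leftmost subject coordinate. The names `hsps`, `sleft`, and `by_subject` are hypothetical.

```python
import pandas as pd

# Toy HSP rows (made-up values, for illustration only)
hsps = pd.DataFrame({
    "saccver": ["B.1", "A.1", "A.1", "B.1"],
    "qstart": [50, 120, 10, 5],
    "qend": [90, 200, 80, 40],
    "sstart": [300, 40, 500, 100],
    "send": [340, 120, 430, 135],
})

# Leftmost subject coordinate, regardless of which end is listed first
hsps["sleft"] = hsps[["sstart", "send"]].min(axis=1)

# sort_values returns a new DataFrame; the original row order is unchanged
by_subject = hsps.sort_values(["saccver", "sleft"]).reset_index(drop=True)

print(by_subject)
```

If all hits are known to be on the plus strand, sorting directly on `["saccver", "sstart"]` is enough and the helper column can be dropped.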