# Download Advanced Search matches - Spectra Intelligence
This notebook shows how to download multiple files matched by an Advanced Search API query.

### Used Spectra Intelligence classes
- **AdvancedSearch** (*TCA-0320*)
- **FileDownload** (*TCA-0201*)

### Credentials
Credentials are loaded from a local file instead of being written here in plain text.
To learn how to create the credentials file, see the **Storing and using the credentials** section in the [README file](./README.md).
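The setup cell in this notebook reads `credentials.json` and looks up the `ticloud` key, then `username` and `password` inside it. The following is a minimal sketch of a loader that checks for that layout before any API object is created. The `load_ticloud_credentials` helper is hypothetical and not part of the SDK, the layout is inferred only from the keys this notebook reads, and the README remains the authoritative description of the file.

```python
import json

# Assumed layout of credentials.json, inferred from the keys this notebook reads:
# {"ticloud": {"username": "example-user", "password": "example-password"}}


def load_ticloud_credentials(path="credentials.json"):
    """Return (username, password) from the credentials file, or raise KeyError."""
    with open(path) as credentials_file:
        credentials = json.load(credentials_file)

    ticloud = credentials.get("ticloud")
    if not ticloud:
        raise KeyError("credentials file has no 'ticloud' section")

    # Report every missing field at once instead of failing on the first one.
    missing = [key for key in ("username", "password") if not ticloud.get(key)]
    if missing:
        raise KeyError(f"'ticloud' section is missing: {', '.join(missing)}")

    return ticloud["username"], ticloud["password"]
```

Failing early with a named key is usually easier to debug than the `AttributeError` raised when `.get("ticloud")` returns `None`.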

In [None]:
import json
import os
from ReversingLabs.SDK.ticloud import AdvancedSearch, FileDownload

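# Load API credentials from the local credentials file.
with open("credentials.json") as credentials_file:
    CREDENTIALS = json.load(credentials_file)

USERNAME = CREDENTIALS.get("ticloud").get("username")
PASSWORD = CREDENTIALS.get("ticloud").get("password")

# The user agent string is stored one directory above the notebook.
cwd = os.getcwd()
upper_level = os.path.dirname(cwd)
with open(os.path.join(upper_level, "user_agent.json")) as user_agent_file:
    USER_AGENT = json.load(user_agent_file)["user_agent"]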

advanced_search = AdvancedSearch(
    host="https://data.reversinglabs.com",
    username=USERNAME,
    password=PASSWORD,
    user_agent=USER_AGENT
)

file_download = FileDownload(
    host="https://data.reversinglabs.com",
    username=USERNAME,
    password=PASSWORD,
    user_agent=USER_AGENT
)

### 1. Batch-download function
This function will be used for downloading each batch of matched samples. Execute it so you can use it further on in the notebook.

In [None]:
def download_batch(sample_list, download_path):
    """Download every sample in sample_list into download_path."""
    for sample in sample_list:
        sha1 = sample.get("sha1")
        response = file_download.download_sample(sample_hash=sha1)

        # Save each sample under its SHA-1 hash.
        with open(os.path.join(download_path, sha1), "wb") as file_handle:
            file_handle.write(response.content)

### 2. Main function
To retrieve batches of samples matched by the Advanced Search API and download them as files, use the `download_advanced_search_matches` function.  
This function accepts the following parameters:
- `query_string`: advanced search query
- `download_path`: full path to an existing folder where the files are saved
- `sorting_criteria`: field used to sort the results
- `sorting_order`: `desc` or `asc`
- `records_per_page`: number of matches per page; matches are retrieved one page at a time
- `max_results`: the maximum number of matches you want retrieved and downloaded

Execute the cell to define the function so you can call it multiple times.
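To see how `records_per_page` and `max_results` interact, here is a minimal sketch of the trimming logic that runs without any API calls. The `collect_up_to` helper and the `fake_pages` data are hypothetical stand-ins for the paged search responses; only the arithmetic mirrors what the main function does.

```python
def collect_up_to(pages, max_results=None):
    """Yield per-page batches, trimming the last one so the total stays within max_results."""
    result_count = 0
    for page in pages:
        if max_results and result_count + len(page) >= max_results:
            # Keep only as many entries as are still needed, then stop paging.
            yield page[:max_results - result_count]
            return
        result_count += len(page)
        yield page


# Stand-in for three API pages of 20 matches each (records_per_page=20).
fake_pages = [list(range(start, start + 20)) for start in (0, 20, 40)]
batches = list(collect_up_to(fake_pages, max_results=50))
print([len(batch) for batch in batches])  # [20, 20, 10]
```

With 20 records per page and a limit of 50, the third page is cut to its first 10 entries and no further pages are requested. Passing `max_results=None` disables the limit, so every page is yielded in full.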

In [None]:
def download_advanced_search_matches(query_string, download_path, sorting_criteria=None, sorting_order="desc", 
                                     records_per_page=100, max_results=500):
    page_number = 1
    more_pages = True
    result_count = 0
    
    while more_pages:
        response = advanced_search.search(
            query_string=query_string,
            page_number=page_number,
            records_per_page=records_per_page,
            sorting_criteria=sorting_criteria,
            sorting_order=sorting_order
        )
        
        resp_json = response.json()
        results = resp_json.get("rl", {}).get("web_search_api", {}).get("entries", [])
        more_pages = resp_json.get("rl", {}).get("web_search_api", {}).get("more_pages")
        
        result_count += len(results)
        page_number += 1
        
        # Trim the final page so the total never exceeds max_results,
        # then stop requesting further pages.
        if max_results and result_count >= max_results:
            excess = result_count - max_results
            results = results[:len(results) - excess]
            more_pages = False

        download_batch(sample_list=results, download_path=download_path)

### 3. Run

In [None]:
download_advanced_search_matches(
    query_string="change_me",
    download_path="/change/me",
    records_per_page=20,
    max_results=100
)
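Because each sample is saved under its SHA-1 hash, the file name can double as an integrity check after the run. The following sketch assumes the download folder contains only files written by this notebook; the `find_hash_mismatches` helper is hypothetical and not part of the SDK.

```python
import hashlib
import os


def find_hash_mismatches(download_path):
    """Return names of files whose SHA-1 digest differs from their file name."""
    mismatches = []
    for name in sorted(os.listdir(download_path)):
        file_path = os.path.join(download_path, name)
        if not os.path.isfile(file_path):
            continue

        digest = hashlib.sha1()
        with open(file_path, "rb") as file_handle:
            # Read in chunks so large samples are not loaded into memory at once.
            for chunk in iter(lambda: file_handle.read(1024 * 1024), b""):
                digest.update(chunk)

        if digest.hexdigest() != name.lower():
            mismatches.append(name)
    return mismatches
```

An empty list means every file matches its name. A mismatch may indicate a truncated download or a response body that was written to disk without being the sample itself.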