<a href="https://colab.research.google.com/github/atlasfutures/memex/blob/docs_private/docs/tutorial/tutorials/clinical-trials-matching/Clinical_Trials_Matching.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Match Patients to Clinical Trials
### An implementation of [Zero-Shot Clinical Trial Patient Matching with LLMs](https://arxiv.org/pdf/2402.05125.pdf) from Stanford Medicine's Shah Lab using Memex

# Introduction
This tutorial shows how to match patients to clinical trials with the Memex SDK in a Python environment, following the zero-shot LLM approach from Stanford's Shah Lab. It installs the Memex SDK, downloads synthetic patient data and clinical trials data, and uploads both to the Memex platform. It then runs saved queries and functions that summarize each patient, extract trial criteria, and score patient-trial matches.

The data diagram below illustrates the full analytics pipeline that this notebook implements.

[![](https://mermaid.ink/img/pako:eNp9VNty2jAQ_ZUd9bGEAYMvuJ3MkAC5kWlufWjrPihGttXIEmPJaUgm_961TAy4M9aTjs7Zs961tG8kVitGQpII9TfOaGHgYRZJwKXLx7Sg6wy4XJcGVtTQmqjW9FdE7jfSZIxG5PfufI7nseCSx1SYglOh-6l6biRMrurNFI6OjuGkcjFFGZuyYLCmhjO5zfQRMUchKheonL-YgsYGrC_EBTcMd43yxFqeonChipwaoI2jLvOcFptGemqlsyq7ZfjrLnvGtVF72oXVnlW2ghrDZJMZqFwhUFrDH8UlGNVKyJlubGbW5rwq45mKkpomowaaUi61-b-kszqoBucWXKDDNE0Llu5ZANWaaZ1Xbk3whdVfVkXGCturTcFkajJQSRP39Ri27cSGxdne98I0DMNHUbItPEGIWZnc4tMWniFWBZXpR8D8MH7R5s9aBudtwUVLcHmIY4FFz1gCdRAkXIjwUzLxWrTgaWYONUnS0ljbLTtIJpFsvYAlS5ubC3BVtfR2Cbff53c_9m7_Es9v7r5d3zzAZ0DBHnXVqgWWrWqtfb3F1_N0bzaCwaD6Z-qJ2T5-abPDTtbpZEed7LiTdTtZr5P1O9nggCU9kjN8yHyF4-mt0kYE503OIhLi9lHQ-CkikXxHIS2NwnEUkxCnCeuRco1DhM04xf-XkzDBOYSnbMXxJlzXA8_OvR5ZU_lTqZ0GMQnfyAsJg0l_OPI833V93_Uno6BHNiR0g74_dB1_PBp745HjOOP3Hnm1DoN-4A0dJxi6g_HAGzl-8P4PPLGuSw?type=png)](https://mermaid-js.github.io/mermaid-live-editor/edit#pako:eNp9VNty2jAQ_ZUd9bGEAYMvuJ3MkAC5kWlufWjrPihGttXIEmPJaUgm_961TAy4M9aTjs7Zs961tG8kVitGQpII9TfOaGHgYRZJwKXLx7Sg6wy4XJcGVtTQmqjW9FdE7jfSZIxG5PfufI7nseCSx1SYglOh-6l6biRMrurNFI6OjuGkcjFFGZuyYLCmhjO5zfQRMUchKheonL-YgsYGrC_EBTcMd43yxFqeonChipwaoI2jLvOcFptGemqlsyq7ZfjrLnvGtVF72oXVnlW2ghrDZJMZqFwhUFrDH8UlGNVKyJlubGbW5rwq45mKkpomowaaUi61-b-kszqoBucWXKDDNE0Llu5ZANWaaZ1Xbk3whdVfVkXGCturTcFkajJQSRP39Ri27cSGxdne98I0DMNHUbItPEGIWZnc4tMWniFWBZXpR8D8MH7R5s9aBudtwUVLcHmIY4FFz1gCdRAkXIjwUzLxWrTgaWYONUnS0ljbLTtIJpFsvYAlS5ubC3BVtfR2Cbff53c_9m7_Es9v7r5d3zzAZ0DBHnXVqgWWrWqtfb3F1_N0bzaCwaD6Z-qJ2T5-abPDTtbpZEed7LiTdTtZr5P1O9nggCU9kjN8yHyF4-mt0kYE503OIhLi9lHQ-CkikXxHIS2NwnEUkxCnCeuRco1DhM04xf-XkzDBOYSnbMXxJlzXA8_OvR5ZU_lTqZ0GMQnfyAsJg0l_OPI833V93_Uno6BHNiR0g74_dB1_PBp745HjOOP3Hnm1DoN-4A0dJxi6g_HAGzl-8P4PPLGuSw)

In [None]:
# @title pip install memex
!pip install -q memexdata==0.1.0a218

In [None]:
# @title connect to MemexSession
import os 
from getpass import getpass
from memex import MemexSession

api_key = os.environ.get('MEMEX_API_KEY')
if not api_key:
    api_key = getpass("Enter Memex API key: ")
mx = MemexSession("https://xyz.memexdata.com", api_key=api_key)

In [None]:
# @title imports
import os
import json
import zipfile
import requests
from pathlib import Path

## Download synthetic patient data and ClinicalTrials.gov data

The synthetic patient population used here was generated with Synthea. You can view the notebook that generates it here: [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://drive.google.com/file/d/1cQa0usjjJ8GAS25-j3WqfSJZG1NV48D8/view?usp=sharing)


In [None]:
# @title download data and project files

os.makedirs('data', exist_ok=True)

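def download(file):
    """Download `file` from the sample data host to the same relative path locally."""
    DOWNLOAD_DIR = './'
    BASE_URL = "https://sample.memexdata.com/clinical_trials/"
    response = requests.get(BASE_URL + file, allow_redirects=True)
    response.raise_for_status()  # fail loudly instead of saving an error page to disk
    with open(f'{DOWNLOAD_DIR}{file}', 'wb') as out:
        out.write(response.content)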

# synthetic patient population created with Synthea
download('data/synthea501.zip')

with zipfile.ZipFile('data/synthea501.zip', 'r') as zip_ref:
    zip_ref.extractall('.')

# psychiatric clinical trials downloaded from clinicaltrials.gov
download('data/ctg.json')

download('code.zip')

with zipfile.ZipFile('code.zip', 'r') as zip_ref:
    zip_ref.extractall('.')

SQL_DIR = './queries/'
FUNCTION_DIR = './functions/'
# Collect project files recursively by extension
csv_files = list(Path('synthea501').rglob('*.csv'))
sql_files = list(Path(SQL_DIR).rglob('*.sql'))
function_files = list(Path(FUNCTION_DIR).rglob('*.json'))


In [None]:
# @title load project files
print("Loading datasets\n-----------------\n")

# Clinical trials data
trialpath = Path('data/ctg.json')
print(f"Clinical trials data:\n- {trialpath.stem}")
with trialpath.open() as file:
    mx.upload_dataset(file, trialpath.name)

print("\nPatient data (OMOP format):")
# Upload Synthea files, skipping three tables that are not uploaded
for fpath in csv_files:
    if fpath.stem not in {'claims_transactions', 'observations', 'procedures'}:
        print(f"- {fpath.stem}")
        with fpath.open() as file:
            mx.upload_dataset(file, fpath.name)

print("\n\nLoading queries\n---------------")
# Load queries
for querypath in sql_files:
    print(f"- {querypath.stem}")
    with querypath.open() as file:
        mx.save_query(query=file.read(), name=querypath.stem, overwrite=True)

print("\n\nLoading functions\n---------------")
# Load functions
for functionpath in function_files:
    print(f"- {functionpath.stem}")
    with functionpath.open() as file:
        mx.save_function({
            "function": json.load(file),
            "overwrite": True})



Patient data (OMOP format):
- medications
- providers
- payer_transitions
- imaging_studies
- supplies
- payers
- claims
- allergies
- organizations
- conditions
- careplans
- encounters
- devices
- immunizations
- patients

Loading queries
---------------
- join_pts_criteria
- extract_criteria
- match_pts
- eval_patients
- format_patient_data
- summarize_pts
- aggregate_criteria
- patients_nested

Loading functions
---------------
- summarize
- shah_individual_py
- match_pts
- format_patient_history
- extract_criteria_py

In [None]:
# @title Define helper functions
from pydantic import BaseModel
from typing import List, Union, Optional

def pprint_function(f):
    def recreate_pydantic_class(schema):
        class_definition = f"class {schema['title']}(BaseModel):\n"
        for prop_name, prop_attrs in schema['properties'].items():
            # Handle 'anyOf' types, which include multiple possible types
            if 'anyOf' in prop_attrs:
                type_names = {'string': 'str', 'boolean': 'bool', 'number': 'float',
                              'integer': 'int', 'null': 'None'}
                # .get() skips options without a 'type' key instead of raising KeyError
                types = [type_names[opt['type']] for opt in prop_attrs['anyOf']
                         if opt.get('type') in type_names]
                class_definition += f"    {prop_name}: Optional[Union[{', '.join(types)}]]\n"
            else:
                # Basic types handling; properties with other or missing types are skipped
                basic_types = {'string': 'str', 'boolean': 'bool', 'array': 'List[str]'}
                prop_type = basic_types.get(prop_attrs.get('type'))
                if prop_type:
                    class_definition += f"    {prop_name}: {prop_type}\n"

        return class_definition

    # Check for a single output Pydantic type
    schema = f.get('return_type', {}).get('schema', None)
    if schema:
        pydantic_class_definition = recreate_pydantic_class(schema)
        print(pydantic_class_definition)
    else:
        # Print the Pydantic classes for each parameter type that is a Pydantic schema
        for param in f['parameter_types']:
            if param['kind'] == 'generic' and param['origin'] == 'list':
                for arg in param['args']:
                    if arg['kind'] == 'pydantic':
                        schema = arg['schema']
                        pydantic_class_definition = recreate_pydantic_class(schema)
                        print(pydantic_class_definition)

    # Print function content and source if available
    content = f['content']
    print(content)
    source = f.get('source', None)
    if source:
        print(f['source'])


## Inspecting project assets
Let's take a look at some of the project assets we just imported.

All of these are also available in the UI.

In [None]:
pprint_function(mx.get_function("shah_individual_py"))

In [None]:
print(mx.get_query("eval_patients")["query"])

# Transform patient data for data synthesis


[![](https://mermaid.ink/img/pako:eNp9VN1vmzAQ_1dO3uPSKJAECJsqpU3Sr1T93MM29uA4BryCHWHTNq36v-9wUkKYFB6Qz78P353h3glTS05CEmfqhaW0MPA4iSTgo8tFUtBVCkKuSgNLaugGqJ7x74g8rKVJOY3In93-FPdZJqRgNDOFoJnuJuq5pnC5bJnDTnuN2plipQYVg0mFbmBWw5kRSlboPoBZgFSGL5R62stmDEdHx3BS5WqKkpmy4LCiRnC5radJPrHkU5tEkVMDtObqMs9psd6jn1r6pPK2qHjbeWPuRjX4ddlTFKFqhqrpqykoM2C7BKwQhuOqljTfM3vUWZVZRo3hsqYDlUsMlNbwVwkJRrVyFlzXlhNrc16d_Uyzkpo6YQ00oUJq838eZxvRJji3wQU6jJOk4EnDAqjWXOu8cqvFF5Z_WfWIKey9NgWXiUmrK_zUfT-GbQ-w5yxt5AvjMAwXWcm34QmGeCqX2_i0FU8wVgWVyadguq-ftfGzlsF5m3DRIlzuxyzDoic8ho0IYpFl4Zd45LXglxSb-onGcQu1hlu0F48i2fpD5jypvx-Aq6qZd3O4-zG9_9n4IOe4f3t_c337CF8BCQ3oqlUFzFt1WvvNEn_dpwezzjj0qttST9x28FsbdQ6i7kG0fxAdHESHB1HvIOofRIM9lHRIznEKiCXOxveKGxEcMzmPSIjLRUYZjppIfiCRlkbhLGQkxCHDO6Rc4WzhE0Hx_nISxjgEcZcvBc6E6820tUO3Q1ZU_lJqx8GYhO_klYTBqOv0Pc8fDn1_6I_6QYesSTgMur4zdP1Bf-AN-q7rDj465M069LqB57hu4HhBz-n5juN__AN0h81n?type=png)](https://mermaid-js.github.io/mermaid-live-editor/edit#pako:eNp9VN1vmzAQ_1dO3uPSKJAECJsqpU3Sr1T93MM29uA4BryCHWHTNq36v-9wUkKYFB6Qz78P353h3glTS05CEmfqhaW0MPA4iSTgo8tFUtBVCkKuSgNLaugGqJ7x74g8rKVJOY3In93-FPdZJqRgNDOFoJnuJuq5pnC5bJnDTnuN2plipQYVg0mFbmBWw5kRSlboPoBZgFSGL5R62stmDEdHx3BS5WqKkpmy4LCiRnC5radJPrHkU5tEkVMDtObqMs9psd6jn1r6pPK2qHjbeWPuRjX4ddlTFKFqhqrpqykoM2C7BKwQhuOqljTfM3vUWZVZRo3hsqYDlUsMlNbwVwkJRrVyFlzXlhNrc16d_Uyzkpo6YQ00oUJq838eZxvRJji3wQU6jJOk4EnDAqjWXOu8cqvFF5Z_WfWIKey9NgWXiUmrK_zUfT-GbQ-w5yxt5AvjMAwXWcm34QmGeCqX2_i0FU8wVgWVyadguq-ftfGzlsF5m3DRIlzuxyzDoic8ho0IYpFl4Zd45LXglxSb-onGcQu1hlu0F48i2fpD5jypvx-Aq6qZd3O4-zG9_9n4IOe4f3t_c337CF8BCQ3oqlUFzFt1WvvNEn_dpwezzjj0qttST9x28FsbdQ6i7kG0fxAdHESHB1HvIOofRIM9lHRIznEKiCXOxveKGxEcMzmPSIjLRUYZjppIfiCRlkbhLGQkxCHDO6Rc4WzhE0Hx_nISxjgEcZcvBc6E6820tUO3Q1ZU_lJqx8GYhO_klYTBqOv0Pc8fDn1_6I_6QYesSTgMur4zdP1Bf-AN-q7rDj465M069LqB57hu4HhBz-n5juN__AN0h81n)

A key step in this process is building nested patient data, which consolidates each patient's records into a single structured form that is easier to analyze and to compare against clinical trial criteria. Later steps pass this structured data to an LLM to summarize patients and evaluate them against trial criteria.

The patients_nested query creates a nested representation of each patient's data, including medical encounters, conditions, medications, allergies, and other relevant health information. Running this query and saving the result as a table called patients_nested makes that representation available to the later steps.
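
To make the nesting concrete, here is a minimal pure-Python sketch of the idea. It is not the patients_nested SQL itself, which ships in the downloaded queries directory. The sample rows, the column names (Id, PATIENT, DESCRIPTION, and so on), and the helper names group_by_patient and nest_patients are illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical flat rows, loosely shaped like per-table CSV exports.
# Column names are assumptions made for this sketch.
patients = [
    {"Id": "p1", "BIRTHDATE": "1980-02-11", "GENDER": "F"},
    {"Id": "p2", "BIRTHDATE": "1955-07-30", "GENDER": "M"},
]
conditions = [
    {"PATIENT": "p1", "START": "2019-03-01", "DESCRIPTION": "Major depression"},
    {"PATIENT": "p1", "START": "2021-06-15", "DESCRIPTION": "Hypertension"},
]
medications = [
    {"PATIENT": "p2", "START": "2020-01-10", "DESCRIPTION": "Sertraline"},
]


def group_by_patient(rows):
    """Index child rows by patient ID, dropping the redundant foreign key."""
    grouped = defaultdict(list)
    for row in rows:
        child = {k: v for k, v in row.items() if k != "PATIENT"}
        grouped[row["PATIENT"]].append(child)
    return grouped


def nest_patients(patients, **child_tables):
    """Return one record per patient with each child table attached as a list."""
    indexes = {name: group_by_patient(rows) for name, rows in child_tables.items()}
    nested = []
    for patient in patients:
        record = dict(patient)
        for name, index in indexes.items():
            # Patients with no rows in a child table get an empty list.
            record[name] = index.get(patient["Id"], [])
        nested.append(record)
    return nested


nested = nest_patients(patients, conditions=conditions, medications=medications)
```

Each element of nested is a single patient record whose conditions and medications keys hold lists of child rows, which is the general shape a nested table gives each patient.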

In [None]:
patients_nested = mx.get_query('patients_nested')
mx.save_as_table('patients_nested', patients_nested["query"], overwrite=True)

In [None]:
format_patient_data = mx.get_query("format_patient_data")
mx.save_as_table('pt_histories', format_patient_data["query"], overwrite=True)

In [None]:
# This query uses an LLM call to summarize patients:
# summarize = mx.get_function("summarize")
summarize_pts = mx.get_query("summarize_pts")
mx.save_as_table('pt_summaries', summarize_pts["query"], model="gpt-3.5-turbo", temperature=0.1, max_tokens=4000, use_cache=True, overwrite=True)

## Format clinical trials criteria

[![](https://mermaid.ink/img/pako:eNp9VNtymzAQ_ZUd9bGOx2AbMO1kxont3Oxpbn1oSx9kLEANSB4kkjiZ_HsXYWObzKAHRqtz9mh3xe47CeWKEZ9EqXwJE5preJwEAnCpYhnndJ0AF-tCw4pqWgHlGv8JyMNG6ITRgPzdn0_xPEy54CFNdc5pqrqxfK4pTKwa4nDgCycnJ6cwQ4npq85pqMFIQJhzzXB3dNMMSvIFkmcp1ZqJmgZUrNCQSsE_yQVoCWuqORMar80ymnOmjqQWpYgMCwUyAp1wdRCUCZaFmktRoscApg9CaraU8ulzjtV3bOI8K-ul8yLURc7qcExNd35nhnhuYskzqoE2wt7U1HNDnZSaVUJve00MX0vD3ccwMfzLsrDPNC2orukKaEy5UPpzkS8qp8q4NMYVKozjOGfxgQRQpZhSWalWO18Z_nUZYSgxY6VzJmKdlDXc-X0_he0DY7ZhcvgoY9_3l2nBtuYZmngrE1v7vGFP0JY5FfHOYXrsP2viFw2ByybhqkG4PrbDFJOesAgqJ4h4mvpfopHTgF8SLOoOjaIGagS3aC8aBaLRG3MW1z8TwE1ZzLs53P2c3v86-H_neH57_2Nx-whfAQkH0E0jC5g38jTy1Rab9ulBb1IGvfK15BMzFfzWRK1W1G5F-63ooBUdtqJOK-q2ot4RSjokY9h_fIVT8b3kBgT7PGMB8XG7TGmIvR6IDyTSQkucgiHxsbVZhxRr7Gg24RTfLyN-hOMPT9mKY0cuqjlrxm2HrKn4LeWegzbx38kr8b1R1-o7jjscuu7QHfW9DtkQf-h1XWtou4P-wBn0bdsefHTIm1HodT3Hsm3Pcrye1XMty_34D-TjzB0?type=png)](https://mermaid-js.github.io/mermaid-live-editor/edit#pako:eNp9VNtymzAQ_ZUd9bGOx2AbMO1kxont3Oxpbn1oSx9kLEANSB4kkjiZ_HsXYWObzKAHRqtz9mh3xe47CeWKEZ9EqXwJE5preJwEAnCpYhnndJ0AF-tCw4pqWgHlGv8JyMNG6ITRgPzdn0_xPEy54CFNdc5pqrqxfK4pTKwa4nDgCycnJ6cwQ4npq85pqMFIQJhzzXB3dNMMSvIFkmcp1ZqJmgZUrNCQSsE_yQVoCWuqORMar80ymnOmjqQWpYgMCwUyAp1wdRCUCZaFmktRoscApg9CaraU8ulzjtV3bOI8K-ul8yLURc7qcExNd35nhnhuYskzqoE2wt7U1HNDnZSaVUJve00MX0vD3ccwMfzLsrDPNC2orukKaEy5UPpzkS8qp8q4NMYVKozjOGfxgQRQpZhSWalWO18Z_nUZYSgxY6VzJmKdlDXc-X0_he0DY7ZhcvgoY9_3l2nBtuYZmngrE1v7vGFP0JY5FfHOYXrsP2viFw2ByybhqkG4PrbDFJOesAgqJ4h4mvpfopHTgF8SLOoOjaIGagS3aC8aBaLRG3MW1z8TwE1ZzLs53P2c3v86-H_neH57_2Nx-whfAQkH0E0jC5g38jTy1Rab9ulBb1IGvfK15BMzFfzWRK1W1G5F-63ooBUdtqJOK-q2ot4RSjokY9h_fIVT8b3kBgT7PGMB8XG7TGmIvR6IDyTSQkucgiHxsbVZhxRr7Gg24RTfLyN-hOMPT9mKY0cuqjlrxm2HrKn4LeWegzbx38kr8b1R1-o7jjscuu7QHfW9DtkQf-h1XWtou4P-wBn0bdsefHTIm1HodT3Hsm3Pcrye1XMty_34D-TjzB0)

In [None]:
# extract_criteria_py = mx.get_function("extract_criteria_py")
extract_criteria = mx.get_query("extract_criteria")
mx.save_as_table('extracted_criteria_100', extract_criteria["query"], model="gpt-3.5-turbo", max_tokens=1000, temperature=0.1, use_cache=True, overwrite=True)

In [None]:
join_pts_criteria = mx.get_query("join_pts_criteria")
mx.save_as_table("pt_criteria_and_summary", join_pts_criteria["query"], overwrite=True)

[![](https://mermaid.ink/img/pako:eNp9VNtymzAQ_ZUd9bGOx2AbMO1kxont3Jxpbn1oSx9kLEANSB4kkjiZ_HsXYUNMZuCB0eqcs9pdafeNhHLNiE-iVD6HCc01PMwCAfipYhXndJMAF5tCw5pqWgHlN_0TkPut0AmjAfnb7M9xP0y54CFNdc5pqvqxfKopTKxbziEQjfoa1QsZFgpkBDrhChrMqFiouRQleghgHCCkZispHw_iaVbncHR0DBd4wjSOcxZTzWBDNWdCA1WKKZXhUh2oL4zmssw1lDkDpXMmYp2UAey134_x-DJTyKgOE6Y-J1v9p8bZSelM50Woi7wJwBR3rzsxxFNTixydAq15qsgymm9r6qmhzkqfBuGvjU8sn5aG28QwM_xz5M-faFp8qIECGlMulIYw55phQvUhZ5WoMuZooLUoXbzonIZ6l_0n2cLIzso0Uqo1EzUFqFijIZWCf5IL0LKVIN9VcXcTU9_3V2nBduYJmniDbA-ftuwZ2jKnIt4L5of6RRs_azk4bxMuWoTLQztM8QHNWASVCCKepv6XaOK04OcE89-jUdRCjcMdOogm--zrTlmyuH5RAFflnd8u4fbn_O7Xh0e7xP2bux_XNw_wFZDwAbpqZQHLVp7GfbXEFn6819uUwaB89fKRmQp-a6NWJ2p3osNOdNSJjjtRpxN1O1HvACU9kjFsQr7GGflWcgOCwyZjAfFxuUppiAMnEO9IpIWWOBND4mN_sx4pNtjWbMYp3l9G_AiHIe6yNce2vK6mrhm-PbKh4reUDQdt4r-RF-J7k741dBx3PHbdsTsZej2yJf7Y67vW2HZHw5EzGtq2PXrvkVfjYdD3HMu2PcvxBtbAtSz3_T_LA897?type=png)](https://mermaid-js.github.io/mermaid-live-editor/edit#pako:eNp9VNtymzAQ_ZUd9bGOx2AbMO1kxont3Jxpbn1oSx9kLEANSB4kkjiZ_HsXYUNMZuCB0eqcs9pdafeNhHLNiE-iVD6HCc01PMwCAfipYhXndJMAF5tCw5pqWgHlN_0TkPut0AmjAfnb7M9xP0y54CFNdc5pqvqxfKopTKxbziEQjfoa1QsZFgpkBDrhChrMqFiouRQleghgHCCkZispHw_iaVbncHR0DBd4wjSOcxZTzWBDNWdCA1WKKZXhUh2oL4zmssw1lDkDpXMmYp2UAey134_x-DJTyKgOE6Y-J1v9p8bZSelM50Woi7wJwBR3rzsxxFNTixydAq15qsgymm9r6qmhzkqfBuGvjU8sn5aG28QwM_xz5M-faFp8qIECGlMulIYw55phQvUhZ5WoMuZooLUoXbzonIZ6l_0n2cLIzso0Uqo1EzUFqFijIZWCf5IL0LKVIN9VcXcTU9_3V2nBduYJmniDbA-ftuwZ2jKnIt4L5of6RRs_azk4bxMuWoTLQztM8QHNWASVCCKepv6XaOK04OcE89-jUdRCjcMdOogm--zrTlmyuH5RAFflnd8u4fbn_O7Xh0e7xP2bux_XNw_wFZDwAbpqZQHLVp7GfbXEFn6819uUwaB89fKRmQp-a6NWJ2p3osNOdNSJjjtRpxN1O1HvACU9kjFsQr7GGflWcgOCwyZjAfFxuUppiAMnEO9IpIWWOBND4mN_sx4pNtjWbMYp3l9G_AiHIe6yNce2vK6mrhm-PbKh4reUDQdt4r-RF-J7k741dBx3PHbdsTsZej2yJf7Y67vW2HZHw5EzGtq2PXrvkVfjYdD3HMu2PcvxBtbAtSz3_T_LA897)
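
The join step above produces the pt_criteria_and_summary table. Its SQL is not reproduced in this notebook, but a table with that name presumably pairs each patient summary with each extracted criterion so that every pair can be evaluated on its own. A minimal sketch of that pairing follows; the field names, the sample rows, and the helper pair_patients_with_criteria are hypothetical.

```python
from itertools import product

# Hypothetical inputs. In the notebook, the real rows would come from the
# pt_summaries and extracted_criteria_100 tables.
patient_summaries = [
    {"patient_id": "p1", "summary": "43-year-old woman with recurrent depression."},
    {"patient_id": "p2", "summary": "68-year-old man with generalized anxiety."},
]
trial_criteria = [
    {"trial_id": "TRIAL-A", "kind": "inclusion", "criterion": "Age 18 to 65 years"},
    {"trial_id": "TRIAL-A", "kind": "exclusion", "criterion": "Substance use disorder"},
    {"trial_id": "TRIAL-B", "kind": "inclusion", "criterion": "Diagnosis of anxiety"},
]


def pair_patients_with_criteria(summaries, criteria):
    """Cross join: one row per (patient, criterion) combination."""
    return [{**patient, **criterion} for patient, criterion in product(summaries, criteria)]


pairs = pair_patients_with_criteria(patient_summaries, trial_criteria)
print(len(pairs))  # 2 patients x 3 criteria
```

Evaluating one criterion at a time keeps each LLM prompt small, at the cost of one call per pair, which is where a setting like use_cache=True in the next cell helps on reruns.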

In [None]:
# This query uses the top-performing prompt from the Shah Lab paper on clinical trials matching (https://arxiv.org/pdf/2402.05125.pdf):
shah_individual_py = mx.get_function("shah_individual_py")
# pprint_function(shah_individual_py)
eval_patients = mx.get_query("eval_patients")
mx.save_as_table("evaled_pts", eval_patients["query"], model="gpt-3.5-turbo", max_tokens=2000, temperature=0.1, use_cache=True, overwrite=True)

In [None]:
aggregate_criteria = mx.get_query("aggregate_criteria")
mx.save_as_table("assessed_pts", aggregate_criteria["query"], overwrite=True)

In [None]:
match_pts = mx.get_query("match_pts")
mx.save_as_table("pt_trial_match_scores", match_pts["query"], overwrite=True)
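
The SQL behind aggregate_criteria and match_pts is not shown in this notebook. As a rough illustration of how per-criterion verdicts could roll up into a patient-trial score, the sketch below counts a criterion as satisfied when an inclusion criterion is met or an exclusion criterion is not met, and reports the satisfied fraction. The field names, the sample rows, and the scoring rule itself are assumptions, not the logic of the saved queries.

```python
from collections import defaultdict

# Hypothetical per-criterion verdicts: one row per (patient, trial, criterion).
assessments = [
    {"patient_id": "p1", "trial_id": "TRIAL-A", "kind": "inclusion", "met": True},
    {"patient_id": "p1", "trial_id": "TRIAL-A", "kind": "exclusion", "met": False},
    {"patient_id": "p2", "trial_id": "TRIAL-A", "kind": "inclusion", "met": False},
    {"patient_id": "p2", "trial_id": "TRIAL-A", "kind": "exclusion", "met": False},
]


def score_matches(assessments):
    """Score each (patient, trial) pair by the fraction of criteria satisfied."""
    totals = defaultdict(lambda: [0, 0])  # (patient, trial) -> [satisfied, total]
    for row in assessments:
        # Inclusion criteria must be met; exclusion criteria must not be met.
        satisfied = row["met"] if row["kind"] == "inclusion" else not row["met"]
        counts = totals[(row["patient_id"], row["trial_id"])]
        counts[0] += int(satisfied)
        counts[1] += 1
    return {
        key: {"score": satisfied / total, "eligible": satisfied == total}
        for key, (satisfied, total) in totals.items()
    }


scores = score_matches(assessments)
for (patient_id, trial_id), result in scores.items():
    print(patient_id, trial_id, result)
```

A fractional score ranks near-misses, while the eligible flag applies the stricter rule that every criterion must be satisfied; which of the two the pt_trial_match_scores table reflects depends on the actual match_pts query.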