In [None]:
from opensearchpy import OpenSearch
from angle_emb import AnglE, Prompts

In [None]:
# Initialize AnglE embedding model
angle = AnglE.from_pretrained('WhereIsAI/UAE-Large-V1', pooling_strategy='cls').cuda()
# Use Prompts.C to produce retrieval-optimized embeddings
angle.set_prompt(prompt=Prompts.C)

In [None]:
# OpenSearch instance parameters
host = 'localhost'
port = 9200
auth = ('admin', 'admin')

# Create the client with SSL/TLS enabled. Certificate verification and the
# related warnings are turned off, which is only appropriate for a local test instance.
client = OpenSearch(
    hosts=[{'host': host, 'port': port}],
    http_auth=auth,
    use_ssl=True,
    verify_certs=False,
    ssl_show_warn=False,
)

In [None]:
# Testing connection to OpenSearch
client.info()

In [None]:
# Query embedding
query = "where does hypotonia typically appear?"
query_emb = angle.encode({'text': query})

In [None]:
# Dense (k-NN) query: the 3 nearest neighbours of the query embedding
search_query_dense = {
    "query": {
        "knn": {
            "embedding": {
                "vector": query_emb[0].tolist(),
                "k": 3
            }
        }
    }
}

# Sparse (lexical) query: full-text match on the chunk text
search_query_sparse = {
    "query": {
        "match": {
            "chunk": "where hypotonia is typical?"
        }
    }
}

In [None]:
# Send the dense (k-NN) and sparse (match) queries to OpenSearch
results_dense = client.search(index="pubmed_500_200", body=search_query_dense)
results_sparse = client.search(index="pubmed_500_200", body=search_query_sparse)

In [None]:
# Helper function to view the responses easily
def pretty_response(response):
    for hit in response['hits']['hits']:
        doc_id = hit['_id']  # avoid shadowing the built-in id()
        pmid = hit['_source']['pmid']
        chunk_id = hit['_source']['chunk_id']
        chunk = hit['_source']['chunk']
        print(f"\nID: {doc_id}\nPMID: {pmid}\nChunk ID: {chunk_id}\nText: {chunk}")

In [18]:
pretty_response(results_dense)


ID: owtVOI0BrWW60NSubD18
PMID: 23333410
Chunk ID: 6
Text: VIQ score in the severe delay range. Abnormal muscle tonicity was found in 35% (hypotonicity 33%, hypertonicity 2%). Need for ECMO, prolonged ventilation, hypotonicity, and other surrogate markers of disease severity (P<0.05) were associated with borderline or delayed neurological outcome. CONCLUSION: The majority of CDH children are functioning in the average range at early preschool and preschool age. Neuromuscular hypotonicity is common in CDH survivors. CDH severity appears to be predictive

ID: DQtVOI0BrWW60NSuZDy8
PMID: 20301331
Chunk ID: 1
Text: In infancy, hypotonia is typical, and acquisition of developmental motor milestones is often both aberrant in pattern and delayed. Intelligence and life span are usually near normal, although craniocervical junction compression increases the risk of death in infancy. Additional complications include obstructive sleep apnea, middle ear dysfunction, kyphosis, and spinal stenosis. D

In [19]:
pretty_response(results_sparse)


ID: DAtVOI0BrWW60NSuZDy8
PMID: 20301331
Chunk ID: 0
Text: CLINICAL CHARACTERISTICS: Achondroplasia is the most common cause of disproportionate short stature. Affected individuals have rhizomelic shortening of the limbs, macrocephaly, and characteristic facial features with frontal bossing and midface retrusion. In infancy, hypotonia is typical, and acquisition of developmental motor milestones is often both aberrant in pattern and delayed. Intelligence and life span are usually near normal, although craniocervical junction compression increases the

ID: DQtVOI0BrWW60NSuZDy8
PMID: 20301331
Chunk ID: 1
Text: In infancy, hypotonia is typical, and acquisition of developmental motor milestones is often both aberrant in pattern and delayed. Intelligence and life span are usually near normal, although craniocervical junction compression increases the risk of death in infancy. Additional complications include obstructive sleep apnea, middle ear dysfunction, kyphosis, and spinal stenosis. DIA
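
The two result sets above share one chunk (ID `DQtVOI0BrWW60NSuZDy8`). A minimal sketch of how that overlap could be checked programmatically follows. The helper names `hit_ids` and `compare_hits` are hypothetical and not part of the notebook. The stand-in responses contain only the `_id` values printed above, so the cell runs without an OpenSearch connection.

```python
def hit_ids(response):
    """Return the document IDs of a search response, in ranked order."""
    return [hit["_id"] for hit in response["hits"]["hits"]]


def compare_hits(dense_response, sparse_response):
    """Split hit IDs into those shared by both responses and those unique to each."""
    dense_ids = hit_ids(dense_response)
    sparse_ids = hit_ids(sparse_response)
    return {
        "shared": [d for d in dense_ids if d in sparse_ids],
        "dense_only": [d for d in dense_ids if d not in sparse_ids],
        "sparse_only": [s for s in sparse_ids if s not in dense_ids],
    }


# Stand-in responses (assumption: only the IDs printed in the outputs above)
example_dense = {"hits": {"hits": [
    {"_id": "owtVOI0BrWW60NSubD18"},
    {"_id": "DQtVOI0BrWW60NSuZDy8"},
]}}
example_sparse = {"hits": {"hits": [
    {"_id": "DAtVOI0BrWW60NSuZDy8"},
    {"_id": "DQtVOI0BrWW60NSuZDy8"},
]}}

comparison = compare_hits(example_dense, example_sparse)
print(comparison)
```

With a live cluster, `compare_hits(results_dense, results_sparse)` would apply the same comparison to the actual responses.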