## 1. Subscribe to the model package

To subscribe to the model package:
1. Open the model package listing page: [LOINC Clinical Terminology Mapper](https://aws.amazon.com/marketplace/pp/prodview-idfcekbznwtlq)
1. On the AWS Marketplace listing, click the **Continue to subscribe** button.
1. On the **Subscribe to this software** page, review the terms and click **Accept Offer** if you and your organization agree with the EULA, pricing, and support terms.
1. After you click the **Continue to configuration** button and choose a **region**, a **Product Arn** is displayed. This is the model package ARN that you need to specify when creating a deployable model using Boto3. Copy the ARN corresponding to your region and assign it to `model_package_arn` in the following cell.

## LOINC Clinical Terminology Mapper

- **Model**: `loinc_vdb_resolver`
- **Model Description**: This pretrained pipeline extracts clinical entities from clinical text and maps them to their corresponding Logical Observation Identifiers Names and Codes (LOINC) codes.

In [1]:
model_package_arn = "<Customer to specify Model package ARN corresponding to their AWS region>"
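
Because the ARN is region-specific, it can help to check it before creating the model. The following is a minimal sketch, assuming the standard `aws` partition and the usual `arn:aws:sagemaker:<region>:<account>:model-package/<name>` layout. The helper names and the example ARN are hypothetical and are not part of the SageMaker SDK.

```python
import re

# Layout of a SageMaker model package ARN (standard "aws" partition assumed).
_ARN_RE = re.compile(
    r"^arn:aws:sagemaker:(?P<region>[a-z0-9-]+):(?P<account>\d{12}):"
    r"model-package/(?P<name>.+)$"
)


def parse_model_package_arn(arn):
    """Return the region, account, and package name, or raise ValueError."""
    match = _ARN_RE.match(arn)
    if match is None:
        raise ValueError(f"Not a model package ARN: {arn!r}")
    return match.groupdict()


def check_arn_region(arn, region):
    """Raise ValueError when the ARN belongs to a different region."""
    parts = parse_model_package_arn(arn)
    if parts["region"] != region:
        raise ValueError(
            f"ARN is for {parts['region']}, but the session uses {region}"
        )
    return parts


# Hypothetical ARN, for illustration only.
example_arn = "arn:aws:sagemaker:us-east-1:123456789012:model-package/example-package"
print(check_arn_region(example_arn, "us-east-1"))
```

In the notebook, the second argument would be the session's region once the session has been created.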

In [None]:
import json
import os
import boto3
import pandas as pd
import sagemaker as sage
from sagemaker import ModelPackage
from sagemaker import get_execution_role
from IPython.display import display
from urllib.parse import urlparse


In [3]:
sagemaker_session = sage.Session()
s3_bucket = sagemaker_session.default_bucket()
region = sagemaker_session.boto_region_name
account_id = boto3.client("sts").get_caller_identity().get("Account")
role = get_execution_role()

sagemaker = boto3.client("sagemaker")
s3_client = sagemaker_session.boto_session.client("s3")
ecr = boto3.client("ecr")
sm_runtime = boto3.client("sagemaker-runtime")

# Set display options
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
pd.set_option('display.max_colwidth', None)

In [4]:
model_name = "loinc-vdb-resolver"

real_time_inference_instance_type = "ml.m4.xlarge"
batch_transform_inference_instance_type = "ml.m4.2xlarge"

## 2. Create a deployable model from the model package

In [6]:
model = ModelPackage(
    role=role, 
    model_package_arn=model_package_arn,
    sagemaker_session=sagemaker_session,
)

### Input Format

To use the model, you need to provide input in one of the following supported formats:

#### JSON Format

Provide input as JSON. We support two variations within this format:

1. **Array of Text Documents**:
   Use an array containing multiple text documents. Each element represents a separate text document.

   ```json
   {
       "text": [
           "Text document 1",
           "Text document 2",
           ...
       ]
   }
   ```

2. **Single Text Document**:
   Provide a single text document as a string.

   ```json
   {
       "text": "Single text document"
   }
   ```

#### JSON Lines (JSONL) Format

Provide input in JSON Lines format, where each line is a JSON object representing a text document.

```
{"text": "Text document 1"}
{"text": "Text document 2"}
```
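
The three input shapes above can be checked on the client before a request is sent. The following is a rough sketch of such a check; `extract_texts` is a hypothetical helper, not something the model package provides, and it only assumes the `text` key and the two content types used in this notebook.

```python
import json


def extract_texts(body, content_type):
    """Normalize a request body into a list of text documents."""
    if content_type == "application/json":
        text = json.loads(body)["text"]
        # A single string and an array of strings are both accepted.
        texts = [text] if isinstance(text, str) else list(text)
    elif content_type == "application/jsonlines":
        # One JSON object per non-empty line.
        texts = [
            json.loads(line)["text"]
            for line in body.split("\n")
            if line.strip()
        ]
    else:
        raise ValueError(f"Unsupported content type: {content_type}")
    if not all(isinstance(item, str) for item in texts):
        raise TypeError("Every text document must be a string")
    return texts


print(extract_texts('{"text": "Single text document"}', "application/json"))
print(extract_texts('{"text": ["Text document 1", "Text document 2"]}', "application/json"))
print(extract_texts('{"text": "Text document 1"}\n{"text": "Text document 2"}', "application/jsonlines"))
```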

## 3. Create an endpoint and perform real-time inference

To understand how real-time inference with Amazon SageMaker works, see the [SageMaker hosting documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/how-it-works-hosting.html).

### A. Deploy the SageMaker model to an endpoint

In [7]:
predictor = model.deploy(
    initial_instance_count=1,
    instance_type=real_time_inference_instance_type, 
    endpoint_name=model_name,
)


Once the endpoint has been created, you can perform real-time inference.

In [8]:
def invoke_realtime_endpoint(record, content_type="application/json", accept="application/json"):
    response = sm_runtime.invoke_endpoint(
        EndpointName=model_name,
        ContentType=content_type,
        Accept=accept,
        Body=json.dumps(record) if content_type == "application/json" else record,
    )

    response_body = response["Body"].read().decode("utf-8")

    if accept == "application/json":
        return json.loads(response_body)
    elif accept == "application/jsonlines":
        return response_body
    else:
        raise ValueError(f"Unsupported accept type: {accept}")

### Initial Setup

In [9]:
docs = [
    "The patient is a 22-year-old female with a history of obesity. She has a Body mass index (BMI) of 33.5 kg/m2, aspartate aminotransferase 64, and alanine aminotransferase 126.", 
    "Final diagnosis:  : DNA was extracted from the peripheral blood specimen and a polymerase chain reaction (PCR)-based assay performed that is designed to detect the presence of two separate mutations in the FLT3 gene: (1) within a susceptible region that includes coding sequence for the intracellular juxtamembrane domain and (2)   Negative. Neither expansion of the region susceptible to nor changes consistent with mutation of the codon for ASP835 were identified.",
]

sample_text = """A 65-year-old woman presents to the office with generalized fatigue for the last 4 months. She used to walk 1 mile each evening but now gets tired after 1-2 blocks. She has a history of Crohn disease and hypertension for which she receives appropriate medications. She is married and lives with her husband. She eats a balanced diet that includes chicken, fish, pork, fruits, and vegetables. She rarely drinks alcohol and denies tobacco use. Her vital signs are within normal limits. A physical examination is unremarkable. Laboratory studies show the following: Hemoglobin: 9.8 g/dL, Hematocrit: 32%, Mean Corpuscular Volume: 110 μm3"""

### JSON

In [10]:
input_json_data = {"text": sample_text}
response_json = invoke_realtime_endpoint(input_json_data, content_type="application/json", accept="application/json")
pd.DataFrame(response_json["predictions"][0])

Unnamed: 0,begin,end,ner_chunk,ner_label,ner_confidence,concept_code,resolution,score,all_codes,concept_name_detailed,domain_id,all_resolutions,all_score
0,442,456,Her vital signs,Test,0.78679997,8716-3,vital signs,0.850316,"[8716-3, LP75862-0, 29274-8, 52481-9, LP133943-3]","[Vital signs [Vital signs], Vital signs - acute [Vital signs - acute], Vital signs measurements [Vital signs measurements], Vital signs - acute [CARE] [Vital signs - acute [CARE]], EMS vital signs [EMS vital signs]]","[Observation, Observation, Observation, Observation, Observation]","[vital signs, vital signs - acute, vital signs measurements, vital signs - acute [care], ems vital signs]","[0.8503159284591675, 0.7810258865356445, 0.7806203961372375, 0.7770128846168518, 0.7719455361366272]"
1,484,505,A physical examination,Test,0.91120005,LP7801-6,physical exam,0.909418,"[LP7801-6, LP94385-9, 29271-4, 100223-7, 19793-9]","[Physical exam [Physical exam], Physical examination by body areas [Physical examination by body areas], Eye physical examination [Eye physical examination], Examination [Physical findings of Retina Narrative], examination ?? [Examination extent landmark [Description] Biliary duct Narrative ERCP]]","[Measurement, Observation, Observation, Observation, Observation]","[physical exam, physical examination by body areas, eye physical examination, examination, examination ??]","[0.9094181060791016, 0.8629760146141052, 0.8507091999053955, 0.840781033039093, 0.839738130569458]"
2,524,541,Laboratory studies,Test,0.7098,LP74124-6,laboratory studies,1.000001,"[LP74124-6, 26436-6, 11502-2, H&P.HX.LAB, 34075-2]","[Laboratory studies [Laboratory studies], Laboratory studies (set) [Laboratory studies (set)], Laboratory report [Laboratory report], History for laboratory studies [History for laboratory studies], Lab tests [FDA package insert Laboratory tests section]]","[Observation, Observation, Observation, Observation, Observation]","[laboratory studies, laboratory studies (set), laboratory report, history for laboratory studies, lab tests]","[1.0000007152557373, 0.8880069255828857, 0.8414250016212463, 0.8363252878189087, 0.8293611407279968]"
3,563,572,Hemoglobin,Test,0.9994,LP14449-0,hemoglobin,1.0,"[LP14449-0, 10346-5, 15082-1, LP16434-0, LP30932-5]","[Hemoglobin [Hemoglobin], Haemoglobin [Hemoglobin A [Units/volume] in Blood by Electrophoresis], Hemiglobin [Methemoglobin [Mass/volume] in Blood], Hemoglobin H [Hemoglobin H], Hemoglobin N [Hemoglobin N]]","[Observation, Observation, Observation, Observation, Observation]","[hemoglobin, haemoglobin, hemiglobin, hemoglobin h, hemoglobin n]","[1.0000001192092896, 0.9589886665344238, 0.9433583617210388, 0.8850629329681396, 0.8679291009902954]"
4,579,582,g/dL,Test,0.6455,18286-5,dl,0.829474,"[18286-5, 76650-1, 43740-0, 2367-1, LA19932-5]","[DL [Donath Landsteiner Ab [Presence] in Blood], Gd [Gadolinium [Mass/mass] in Hair], G/I [Glucose/Insulin [Mass Ratio] in Serum or Plasma], GLD [Glutamate dehydrogenase [Enzymatic activity/volume] in Serum or Plasma], G/G (wild type) [G/G (wild type)]]","[Observation, Observation, Observation, Observation, Meas Value]","[dl, gd, g/i, gld, g/g (wild type)]","[0.8294735550880432, 0.8090059757232666, 0.8026497960090637, 0.7987967729568481, 0.7837276458740234]"
5,585,594,Hematocrit,Test,0.9983,LP15101-6,hematocrit,0.999999,"[LP15101-6, LP308151-2, LP392494-3, LP392484-4, LP392480-2]","[Hematocrit [Hematocrit], Hematocrit/Hemoglobin [Hematocrit/Hemoglobin], Hematocrit | Stem cell product | Hematology and Cell counts [Hematocrit | Stem cell product | Hematology and Cell counts], Hematocrit | Blood venous | Hematology and Cell counts [Hematocrit | Blood venous | Hematology and Cell counts], Hematocrit | Blood arterial | Hematology and Cell counts [Hematocrit | Blood arterial | Hematology and Cell counts]]","[Observation, Observation, Measurement, Measurement, Measurement]","[hematocrit, hematocrit/hemoglobin, hematocrit | stem cell product | hematology and cell counts, hematocrit | blood venous | hematology and cell counts, hematocrit | blood arterial | hematology and cell counts]","[0.999998927116394, 0.9189162850379944, 0.9059693813323975, 0.8906344175338745, 0.8904138803482056]"
6,602,624,Mean Corpuscular Volume,Test,0.7142666,LP15191-7,erythrocyte mean corpuscular volume,0.882469,"[LP15191-7, LP66395-2, 51641-9, LP393360-5, LP393361-3]","[Erythrocyte mean corpuscular volume [Erythrocyte mean corpuscular volume], Mean sphered cell volume [Mean sphered cell volume], Mean sphered cell volume [Entitic volume] in Red Blood Cells [Mean sphered cell volume [Entitic volume] in Red Blood Cells], Erythrocyte mean corpuscular volume | Blood cord | Hematology and Cell counts [Erythrocyte mean corpuscular volume | Blood cord | Hematology and Cell counts], Erythrocyte mean corpuscular volume | Red Blood Cells | Hematology and Cell counts [Erythrocyte mean corpuscular volume | Red Blood Cells | Hematology and Cell counts]]","[Observation, Observation, Observation, Measurement, Measurement]","[erythrocyte mean corpuscular volume, mean sphered cell volume, mean sphered cell volume [entitic volume] in red blood cells, erythrocyte mean corpuscular volume | blood cord | hematology and cell counts, erythrocyte mean corpuscular volume | red blood cells | hematology and cell counts]","[0.8824691772460938, 0.8562332391738892, 0.8337935209274292, 0.828377902507782, 0.8251137733459473]"
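
Each prediction carries parallel lists (`all_codes`, `all_resolutions`, `all_score`) holding the ranked candidate codes. The sketch below pairs them up and keeps only candidates above a threshold. The helper name and the `0.9` cutoff are illustrative assumptions, and the record is the `Hemoglobin` row from the output above, reduced to the fields used here.

```python
def ranked_candidates(prediction, min_score=0.0):
    """Pair each candidate code with its resolution and score."""
    rows = zip(
        prediction["all_codes"],
        prediction["all_resolutions"],
        prediction["all_score"],
    )
    return [
        {"code": code, "resolution": resolution, "score": score}
        for code, resolution, score in rows
        if score >= min_score
    ]


# The "Hemoglobin" row from the output above, reduced to the fields used here.
hemoglobin = {
    "ner_chunk": "Hemoglobin",
    "concept_code": "LP14449-0",
    "all_codes": ["LP14449-0", "10346-5", "15082-1", "LP16434-0", "LP30932-5"],
    "all_resolutions": ["hemoglobin", "haemoglobin", "hemiglobin", "hemoglobin h", "hemoglobin n"],
    "all_score": [
        1.0000001192092896,
        0.9589886665344238,
        0.9433583617210388,
        0.8850629329681396,
        0.8679291009902954,
    ],
}

# 0.9 is an arbitrary cutoff chosen for this example.
for row in ranked_candidates(hemoglobin, min_score=0.9):
    print(row)
```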


### JSON Lines

In [12]:
def create_jsonl(records):
    if isinstance(records, str):
        records = [records]
    json_records = [{"text": text} for text in records]
    json_lines = "\n".join(json.dumps(record) for record in json_records)
    return json_lines

In [13]:
input_jsonl_data = create_jsonl(sample_text)
data = invoke_realtime_endpoint(input_jsonl_data, content_type="application/jsonlines" , accept="application/jsonlines" )
print(data)

{"predictions": [{"begin": 442, "end": 456, "ner_chunk": "Her vital signs", "ner_label": "Test", "ner_confidence": "0.78679997", "concept_code": "8716-3", "resolution": "vital signs", "score": 0.8503159284591675, "all_codes": ["8716-3", "LP75862-0", "29274-8", "52481-9", "LP133943-3"], "concept_name_detailed": ["Vital signs [Vital signs]", "Vital signs - acute [Vital signs - acute]", "Vital signs measurements [Vital signs measurements]", "Vital signs - acute [CARE] [Vital signs - acute [CARE]]", "EMS vital signs [EMS vital signs]"], "domain_id": ["Observation", "Observation", "Observation", "Observation", "Observation"], "all_resolutions": ["vital signs", "vital signs - acute", "vital signs measurements", "vital signs - acute [care]", "ems vital signs"], "all_score": [0.8503159284591675, 0.7810258865356445, 0.7806203961372375, 0.7770128846168518, 0.7719455361366272]}, {"begin": 484, "end": 505, "ner_chunk": "A physical examination", "ner_label": "Test", "ner_confidence": "0.91120005", 
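
With `accept="application/jsonlines"` the response comes back as a raw string. A minimal sketch for turning it into Python objects follows, assuming each non-empty line is one JSON object with a `predictions` key, as the output above suggests. The sample body is abbreviated to two fields per prediction and is for illustration only.

```python
import json


def parse_jsonl(body):
    """Parse a JSON Lines string into a list of Python objects."""
    return [json.loads(line) for line in body.split("\n") if line.strip()]


# Abbreviated sample body: one JSON object per line.
sample_body = "\n".join(
    [
        json.dumps({"predictions": [{"ner_chunk": "Hemoglobin", "concept_code": "LP14449-0"}]}),
        json.dumps({"predictions": [{"ner_chunk": "Hematocrit", "concept_code": "LP15101-6"}]}),
    ]
) + "\n"

for record in parse_jsonl(sample_body):
    for prediction in record["predictions"]:
        print(prediction["ner_chunk"], prediction["concept_code"])
```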

### B. Delete the endpoint

Now that you have successfully performed real-time inference, you no longer need the endpoint. Delete it to avoid being charged.

In [15]:
model.sagemaker_session.delete_endpoint(model_name)
model.sagemaker_session.delete_endpoint_config(model_name)
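
If inference raises an exception before the cleanup cell runs, the endpoint keeps running and accruing charges. One possible guard is a context manager that always issues the two delete calls shown above. This is only a sketch: `endpoint_cleanup` and `FakeSession` are hypothetical, and the fake session stands in for `model.sagemaker_session` so that the example runs without AWS access.

```python
from contextlib import contextmanager


@contextmanager
def endpoint_cleanup(session, endpoint_name):
    """Delete the endpoint and its config on exit, even after an error."""
    try:
        yield endpoint_name
    finally:
        session.delete_endpoint(endpoint_name)
        session.delete_endpoint_config(endpoint_name)


class FakeSession:
    """Stand-in for a SageMaker session; it only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def delete_endpoint(self, name):
        self.calls.append(("delete_endpoint", name))

    def delete_endpoint_config(self, name):
        self.calls.append(("delete_endpoint_config", name))


fake = FakeSession()
try:
    with endpoint_cleanup(fake, "loinc-vdb-resolver"):
        raise RuntimeError("simulated inference failure")
except RuntimeError:
    pass

print(fake.calls)
```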

## 4. Batch inference

In [16]:
validation_json_file_name = "input.json"
validation_jsonl_file_name = "input.jsonl"

validation_input_json_path = f"s3://{s3_bucket}/{model_name}/validation-input/json/"
validation_output_json_path = f"s3://{s3_bucket}/{model_name}/validation-output/json/"

validation_input_jsonl_path = f"s3://{s3_bucket}/{model_name}/validation-input/jsonl/"
validation_output_jsonl_path = f"s3://{s3_bucket}/{model_name}/validation-output/jsonl/"

def upload_to_s3(input_data, file_name):
    file_format = os.path.splitext(file_name)[1].lower()
    s3_client.put_object(
        Bucket=s3_bucket,
        Key=f"{model_name}/validation-input/{file_format[1:]}/{file_name}",
        Body=input_data.encode("UTF-8"),
    )
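
The S3 key used by `upload_to_s3` is derived from the file extension, so `input.json` and `input.jsonl` land under separate prefixes. The sketch below isolates that derivation as a pure function. `input_key` is a hypothetical helper, and the check for a missing extension is an addition not present in the cell above.

```python
import os


def input_key(model_name, file_name):
    """Build the S3 key for a validation input file."""
    extension = os.path.splitext(file_name)[1].lower().lstrip(".")
    if not extension:
        raise ValueError(f"File name has no extension: {file_name!r}")
    return f"{model_name}/validation-input/{extension}/{file_name}"


print(input_key("loinc-vdb-resolver", "input.json"))
print(input_key("loinc-vdb-resolver", "input.jsonl"))
```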

In [17]:
# Create JSON and JSON Lines data
input_jsonl_data = create_jsonl(docs)
input_json_data = json.dumps({"text": docs})

# Upload JSON and JSON Lines data to S3
upload_to_s3(input_json_data, validation_json_file_name)
upload_to_s3(input_jsonl_data, validation_jsonl_file_name)

### JSON

In [None]:
transformer = model.transformer(
    instance_count=1,
    instance_type=batch_transform_inference_instance_type,
    accept="application/json",
    output_path=validation_output_json_path
)

transformer.transform(validation_input_json_path, content_type="application/json")
transformer.wait()

In [None]:
def retrieve_json_output_from_s3(validation_file_name):
    parsed_url = urlparse(transformer.output_path)
    file_key = f"{parsed_url.path[1:]}{validation_file_name}.out"
    response = s3_client.get_object(Bucket=s3_bucket, Key=file_key)

    data = json.loads(response["Body"].read().decode("utf-8"))
    display(data)
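
Batch transform writes each result next to the output path, using the input file name with an `.out` suffix, which is why the key above ends in `.out`. The sketch below shows the same derivation as a standalone function. `output_location` and the bucket name are hypothetical.

```python
from urllib.parse import urlparse


def output_location(output_path, input_file_name):
    """Return the (bucket, key) pair for a batch transform result."""
    parsed = urlparse(output_path)
    prefix = parsed.path.lstrip("/")
    # Ensure the prefix and the file name are separated by exactly one slash.
    if prefix and not prefix.endswith("/"):
        prefix += "/"
    return parsed.netloc, f"{prefix}{input_file_name}.out"


# "example-bucket" is a placeholder bucket name.
bucket, key = output_location(
    "s3://example-bucket/loinc-vdb-resolver/validation-output/json/",
    "input.json",
)
print(bucket)
print(key)
```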

In [21]:
retrieve_json_output_from_s3(validation_json_file_name)

{'predictions': [[{'begin': 90,
    'end': 93,
    'ner_chunk': 'BMI)',
    'ner_label': 'Test',
    'ner_confidence': '0.698',
    'concept_code': '39156-5',
    'resolution': 'bmi',
    'score': 0.9296666383743286,
    'all_codes': ['39156-5', 'LP35925-4', '89270-3', '39156-5', '88087-2'],
    'concept_name_detailed': ['BMI [Body mass index (BMI) [Ratio]]',
     'Body mass index (BMI) [Body mass index (BMI)]',
     'BMI Est [Body mass index (BMI) [Ratio] Estimated]',
     'Body mass index (BMI) [Ratio] [Body mass index (BMI) [Ratio]]',
     'Estimated BMI >40 [Estimated BMI greater than 40]'],
    'domain_id': ['Observation',
     'Observation',
     'Observation',
     'Observation',
     'Observation'],
    'all_resolutions': ['bmi',
     'body mass index (bmi)',
     'bmi est',
     'body mass index (bmi) [ratio]',
     'estimated bmi >40'],
    'all_score': [0.9296666383743286,
     0.860340416431427,
     0.8320395946502686,
     0.8205130696296692,
     0.8036540150642395]},
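
For the array-of-documents input, the output above nests `predictions` as a list of lists, apparently one inner list per input document. A minimal sketch for flattening it into rows while keeping track of the source document follows. The `doc_index` field is an addition made here, and the sample response is abbreviated.

```python
def flatten_predictions(response):
    """Flatten per-document predictions into one list of rows."""
    rows = []
    for doc_index, doc_predictions in enumerate(response["predictions"]):
        for prediction in doc_predictions:
            rows.append({"doc_index": doc_index, **prediction})
    return rows


# Abbreviated response: the first prediction from the output above.
# The second inner list is an empty placeholder, not real model output.
response = {
    "predictions": [
        [{"begin": 90, "end": 93, "ner_chunk": "BMI)", "concept_code": "39156-5"}],
        [],
    ]
}

for row in flatten_predictions(response):
    print(row)
```

The resulting rows can be passed to `pd.DataFrame` in the same way as the real-time response earlier in the notebook.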
  

### JSON Lines

In [None]:
transformer = model.transformer(
    instance_count=1,
    instance_type=batch_transform_inference_instance_type,
    accept="application/jsonlines",
    output_path=validation_output_jsonl_path
)
transformer.transform(validation_input_jsonl_path, content_type="application/jsonlines")
transformer.wait()

In [None]:
def retrieve_jsonlines_output_from_s3(validation_file_name):

    parsed_url = urlparse(transformer.output_path)
    file_key = f"{parsed_url.path[1:]}{validation_file_name}.out"
    response = s3_client.get_object(Bucket=s3_bucket, Key=file_key)

    data = response["Body"].read().decode("utf-8")
    print(data)

In [24]:
retrieve_jsonlines_output_from_s3(validation_jsonl_file_name)

{"predictions": [{"begin": 90, "end": 93, "ner_chunk": "BMI)", "ner_label": "Test", "ner_confidence": "0.698", "concept_code": "39156-5", "resolution": "bmi", "score": 0.9296666383743286, "all_codes": ["39156-5", "LP35925-4", "89270-3", "39156-5", "88087-2"], "concept_name_detailed": ["BMI [Body mass index (BMI) [Ratio]]", "Body mass index (BMI) [Body mass index (BMI)]", "BMI Est [Body mass index (BMI) [Ratio] Estimated]", "Body mass index (BMI) [Ratio] [Body mass index (BMI) [Ratio]]", "Estimated BMI >40 [Estimated BMI greater than 40]"], "domain_id": ["Observation", "Observation", "Observation", "Observation", "Observation"], "all_resolutions": ["bmi", "body mass index (bmi)", "bmi est", "body mass index (bmi) [ratio]", "estimated bmi >40"], "all_score": [0.9296666383743286, 0.860340416431427, 0.8320395946502686, 0.8205130696296692, 0.8036540150642395]}, {"begin": 110, "end": 135, "ner_chunk": "aspartate aminotransferase", "ner_label": "Test", "ner_confidence": "0.8905", "concept_cod

In [25]:
model.delete_model()

INFO:sagemaker:Deleting model with name: loinc-vdb-resolver-2025-08-15-14-29-41-686


### Unsubscribe from the listing (optional)

If you would like to unsubscribe from the model package, follow these steps. Before you cancel the subscription, ensure that you do not have any [deployable model](https://console.aws.amazon.com/sagemaker/home#/models) created from the model package or using the algorithm. Note: you can find this information by looking at the container name associated with the model.

**Steps to unsubscribe from the product on AWS Marketplace**:
1. Navigate to the __Machine Learning__ tab on [__Your Software subscriptions page__](https://aws.amazon.com/marketplace/ai/library?productType=ml&ref_=mlmp_gitdemo_indust).
2. Locate the listing whose subscription you want to cancel, and then choose __Cancel Subscription__.

