## 1. Subscribe to the model package

To subscribe to the model package:
1. Open the model package listing page <font color='red'> For Seller to update:[Title_of_your_product](Provide link to your marketplace listing of your product).</font>
1. On the AWS Marketplace listing, click on the **Continue to subscribe** button.
1. On the **Subscribe to this software** page, review and click on **"Accept Offer"** if you and your organization agree with the EULA, pricing, and support terms.
1. Once you click on the **Continue to configuration** button and then choose a **region**, you will see a **Product Arn** displayed. This is the model package ARN that you need to specify when creating a deployable model using Boto3. Copy the ARN corresponding to your region and specify it in the following cell.
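
Because the ARN is region-specific, it can be worth checking that the copied value matches the region your session uses before deploying. The following is a minimal sketch, assuming the general ARN layout `arn:<partition>:sagemaker:<region>:<account-id>:model-package/<name>`; the helper names and the example ARN are hypothetical and not part of the listing.

```python
def parse_model_package_arn(arn):
    """Split a model package ARN into its components.

    Assumes the general ARN layout
    arn:<partition>:sagemaker:<region>:<account-id>:model-package/<name>.
    """
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "sagemaker":
        raise ValueError(f"Not a SageMaker ARN: {arn!r}")
    if not parts[5].startswith("model-package/"):
        raise ValueError(f"Not a model package ARN: {arn!r}")
    return {
        "partition": parts[1],
        "region": parts[3],
        "account_id": parts[4],
        "resource": parts[5],
    }


def check_arn_region(arn, expected_region):
    """Return the ARN unchanged, or raise if it belongs to another region."""
    arn_region = parse_model_package_arn(arn)["region"]
    if arn_region != expected_region:
        raise ValueError(
            f"ARN is for region {arn_region!r}, but the session uses {expected_region!r}"
        )
    return arn


# Hypothetical ARN, for illustration only.
example_arn = "arn:aws:sagemaker:us-east-1:123456789012:model-package/example-model"
print(parse_model_package_arn(example_arn)["region"])
```

In this notebook, the check could be run as `check_arn_region(model_package_arn, region)` once both variables are defined.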

## Pipeline for ICD-O

- **Model**: `icdo_resolver_pipeline`
- **Model Description**: This pipeline extracts oncological entities from clinical texts and maps them to their corresponding ICD-O codes using `sbiobert_base_cased_mli` Sentence-BERT embeddings.

In [None]:
model_package_arn = "<Customer to specify Model package ARN corresponding to their AWS region>"

In [8]:
import base64
import json
import uuid
from sagemaker import ModelPackage
import sagemaker as sage
from sagemaker import get_execution_role
import boto3
from IPython.display import Image, display
from PIL import Image as ImageEdit
import numpy as np

In [9]:
sagemaker_session = sage.Session()
s3_bucket = sagemaker_session.default_bucket()
region = sagemaker_session.boto_region_name
account_id = boto3.client("sts").get_caller_identity().get("Account")
role = get_execution_role()

sagemaker = boto3.client("sagemaker")
s3_client = sagemaker_session.boto_session.client("s3")
ecr = boto3.client("ecr")
sm_runtime = boto3.client("sagemaker-runtime")

## 2. Create an endpoint and perform real-time inference

If you want to understand how real-time inference with Amazon SageMaker works, see [Documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/how-it-works-hosting.html).

In [10]:
model_name = "icdo-resolver-pipeline"

real_time_inference_instance_type = "ml.m4.xlarge"
batch_transform_inference_instance_type = "ml.m4.xlarge"


### A. Create an endpoint

In [11]:
# create a deployable model from the model package.
model = ModelPackage(
    role=role, model_package_arn=model_package_arn, sagemaker_session=sagemaker_session
)

# Deploy the model
predictor = model.deploy(1, real_time_inference_instance_type, endpoint_name=model_name)

----------!

Once the endpoint has been created, you can perform real-time inference.
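
The `----------!` progress output above indicates that `deploy` waited for the endpoint to come up. If you ever need to check readiness separately (for example, after reconnecting to an endpoint created earlier), a polling loop along these lines is one option. This is a sketch, not part of the original notebook: it relies on the SageMaker client's `describe_endpoint` call and its `EndpointStatus` field, while the function name and the polling parameters are assumptions chosen for illustration.

```python
import time


def wait_until_in_service(sm_client, endpoint_name, poll_seconds=30, max_polls=60):
    """Poll describe_endpoint until the endpoint reports 'InService'.

    sm_client is expected to behave like boto3.client("sagemaker").
    Raises RuntimeError on 'Failed' and TimeoutError if max_polls is exhausted.
    """
    for _ in range(max_polls):
        description = sm_client.describe_endpoint(EndpointName=endpoint_name)
        status = description["EndpointStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            raise RuntimeError(f"Endpoint {endpoint_name!r} failed to deploy")
        time.sleep(poll_seconds)
    raise TimeoutError(f"Endpoint {endpoint_name!r} not in service after {max_polls} polls")
```

With the clients defined earlier in this notebook, the call would look like `wait_until_in_service(sagemaker, model_name)`, since `sagemaker` is the name bound to the Boto3 SageMaker client here.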

In [12]:
import json
import pandas as pd
import os
import boto3

# Set display options
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
pd.set_option('display.max_colwidth', None)

def process_data_and_invoke_realtime_endpoint(data, content_type, accept):

    content_type_to_format = {'application/json': 'json', 'application/jsonlines': 'jsonl'}
    input_format = content_type_to_format.get(content_type)
    if content_type not in content_type_to_format.keys() or accept not in content_type_to_format.keys():
        raise ValueError("Invalid content_type or accept. It should be either 'application/json' or 'application/jsonlines'.")

    i = 1
    input_dir = f'inputs/real-time/{input_format}'
    output_dir = f'outputs/real-time/{input_format}'
    s3_input_dir = f"{model_name}/validation-input/real-time/{input_format}"
    s3_output_dir = f"{model_name}/validation-output/real-time/{input_format}"

    input_file_name = f'{input_dir}/input{i}.{input_format}'
    output_file_name = f'{output_dir}/{os.path.basename(input_file_name)}.out'

    while os.path.exists(input_file_name) or os.path.exists(output_file_name):
        i += 1
        input_file_name = f'{input_dir}/input{i}.{input_format}'
        output_file_name = f'{output_dir}/{os.path.basename(input_file_name)}.out'

    os.makedirs(os.path.dirname(input_file_name), exist_ok=True)
    os.makedirs(os.path.dirname(output_file_name), exist_ok=True)

    input_data = json.dumps(data) if content_type == 'application/json' else data

    # Write input data to file
    with open(input_file_name, 'w') as f:
        f.write(input_data)

    # Upload input data to S3
    s3_client.put_object(Bucket=s3_bucket, Key=f"{s3_input_dir}/{os.path.basename(input_file_name)}", Body=bytes(input_data.encode('UTF-8')))

    # Invoke the SageMaker endpoint
    response = sm_runtime.invoke_endpoint(
        EndpointName=model_name,
        ContentType=content_type,
        Accept=accept,
        Body=input_data,
    )

    # Read response data
    response_data = json.loads(response["Body"].read().decode("utf-8")) if accept == 'application/json' else response['Body'].read().decode('utf-8')

    # Save response data to file
    with open(output_file_name, 'w') as f_out:
        if accept == 'application/json':
            json.dump(response_data, f_out, indent=4)
        else:
            for item in response_data.split('\n'):
                f_out.write(item + '\n')

    # Upload response data to S3
    output_s3_key = f"{s3_output_dir}/{os.path.basename(output_file_name)}"
    if accept == 'application/json':
        s3_client.put_object(Bucket=s3_bucket, Key=output_s3_key, Body=json.dumps(response_data).encode('UTF-8'))
    else:
        s3_client.put_object(Bucket=s3_bucket, Key=output_s3_key, Body=response_data)

    return response_data

### Initial Setup

In [13]:
docs = [
    "A few studies have demonstrated that ANGPTL1 functions as a tumor suppressor gene in breast cancer  , hepatocellular carcinoma  , colorectal cancer and parathyroid carcinoma.", 
    "TRAF6 is a putative oncogene in a variety of cancers including  bladder cancer , and skin cancer. WWP2 appears to regulate the expression of the well characterized tumor suppressor phosphatase and tensin homolog (PTEN) in endometrial cancer and squamous cell carcinoma.",
]


sample_text = "TRIM50 has only been shown to act as a tumor suppressor in hepatocellular carcinoma and ovary cancer. HOXA10 exerts an oncogenic role in several tumors endometrial adenocarcinoma ."

### JSON

#### Example 1

  **Input format**:
  
  
```json
{
    "text": "Single text document"
}
```

In [14]:
input_json_data = {"text": sample_text}

data =  process_data_and_invoke_realtime_endpoint(input_json_data, content_type="application/json" , accept="application/json" )
pd.DataFrame(data["predictions"])

Unnamed: 0,0,1,2,3,4,5
0,"{'ner_chunk': 'tumor', 'begin': 39, 'end': 43, 'ner_label': 'Tumor_Finding', 'ner_confidence': '0.9416', 'code': '8000/1', 'resolution': 'tumor', 'all_k_codes': '8000/1:::8040/1:::8001/1:::9365/3:::8000/6:::8103/0:::9364/3:::8940/0:::8561/0:::9230/1:::8000/3:::9365/3-C76.1:::8100/0:::8158/3:::800:::8711/0:::9135/1:::8935/1:::8010/3:::8815/1:::8960/3:::8312/3:::8153/3', 'all_k_resolutions': 'tumor:::tumorlet:::tumor cells:::askin tumor:::tumor, secondary:::pilar tumor:::ewing tumor:::mixed tumor:::warthin tumor:::codman tumor:::cancer:::askin tumor of thorax:::brooke tumor:::acth-producing tumor:::neoplasms:::glomus tumor:::dabska tumor:::stromal tumor:::carcinoma:::localized fibrous tumor:::wilms tumor:::grawitz tumor:::g cell tumor', 'all_k_distances': '0.0000:::6.2854:::7.2306:::8.0490:::8.2619:::9.2572:::9.3351:::9.5498:::9.8335:::10.1951:::10.2648:::10.3364:::10.7574:::10.8951:::10.9171:::10.9504:::10.9568:::11.1854:::11.2265:::11.2477:::11.2602:::11.2972:::11.3218'}","{'ner_chunk': 'hepatocellular', 'begin': 59, 'end': 72, 'ner_label': 'Histological_Type', 'ner_confidence': '0.9981', 'code': 'C22.0', 'resolution': 'liver', 'all_k_codes': 'C22.0:::C22:::C22.1:::8140/0-C22.0:::9080/1-C22.0:::8010/3-C22.0:::C24.0:::8140/3-C22.0:::8170/3:::C24.9:::8440/0-C22.0:::C21.2:::C18.3:::C71:::C42.3:::C75.9:::C42.2:::C26.0:::8576/3:::8001/3-C22.0:::8000/3-C22.0:::8800/3-C22.0:::C25.3', 'all_k_resolutions': 'liver:::liver and intrahepatic bile ducts:::intrahepatic bile duct:::adenoma, of liver:::teratoma, of liver:::carcinoma, of liver:::extrahepatic bile duct:::adenocarcinoma, of liver:::hepatoma:::billiary tract:::cystadenoma, of liver:::cloacogenic zone:::hepatic flexure of colon:::brain:::reticuloendothelial system:::endocrine gland:::spleen:::intestinal tract:::hepatoid carcinoma:::tumor cells, malignant of liver:::neoplasm, malignant of liver:::sarcoma, of liver:::pancreatic duct', 'all_k_distances': 
'8.9670:::9.5345:::10.0203:::10.5846:::11.4080:::11.4539:::11.5473:::11.7515:::12.0136:::12.1413:::12.1659:::12.4287:::12.5463:::12.5747:::12.5802:::12.6025:::12.6953:::12.6954:::12.8245:::12.8977:::12.9020:::12.9039:::12.9249'}","{'ner_chunk': 'carcinoma', 'begin': 74, 'end': 82, 'ner_label': 'Cancer_Dx', 'ner_confidence': '0.9491', 'code': '8010/3', 'resolution': 'carcinoma', 'all_k_codes': '8010/3:::8010/9:::8420/3:::8480/3:::8240/3:::8550/3:::8140/3:::8010/6:::8530/3:::8980/3:::8070/3:::8010/3-C06.9:::8054/3:::8520/3:::8010/2:::8244/3:::8141/3:::8051/3:::8575/3:::8481/3:::8110/3:::8441/3', 'all_k_resolutions': 'carcinoma:::carcinomatosis:::ceruminous carcinoma:::mucous carcinoma:::carcinoid:::acinar carcinoma:::adenocarcinoma:::secondary carcinoma:::inflammatory carcinoma:::carcinosarcoma:::squamous carcinoma:::carcinoma, of mouth:::warty carcinoma:::lobular carcinoma:::carcinoma in situ:::composite carcinoid:::scirrhous carcinoma:::verrucous carcinoma:::metaplastic carcinoma:::mucin-producing carcinoma:::matrical carcinoma:::serous carcinoma', 'all_k_distances': '0.0000:::3.8145:::6.1551:::6.1916:::6.2402:::6.7504:::6.9016:::6.9301:::7.1193:::7.1926:::7.2897:::7.3397:::7.4187:::7.4689:::7.4910:::7.6357:::7.6536:::7.7765:::7.7787:::7.7825:::7.8719:::7.8739'}","{'ner_chunk': 'ovary cancer', 'begin': 88, 'end': 99, 'ner_label': 'Oncological', 'ner_confidence': '0.84344995', 'code': '8010/3-C56.9', 'resolution': 'carcinoma, of ovary', 'all_k_codes': '8010/3-C56.9:::8140/3-C56.9:::8980/3-C56.9:::8441/3-C56.9:::8230/3-C56.9:::8051/3-C56.9:::8560/3-C56.9:::8021/3-C56.9:::8010/2-C56.9:::8510/3-C56.9:::8440/3-C56.9:::8050/3-C56.9:::8410/3-C56.9:::8575/3-C56.9:::8070/3-C56.9:::8262/3-C56.9:::8474/3-C56.9:::8140/2-C56.9:::8144/3-C56.9:::9081/3-C56.9:::8640/3-C56.9:::8246/3-C56.9:::9070/3-C56.9:::8245/3-C56.9:::8141/3-C56.9', 'all_k_resolutions': 'carcinoma, of ovary:::adenocarcinoma, of ovary:::carcinosarcoma, of ovary:::serous carcinoma, of ovary:::solid carcinoma, of 
ovary:::verrucous carcinoma, of ovary:::adenosquamous carcinoma of ovary:::carcinoma, anaplastic, of ovary:::carcinoma in situ, of ovary:::medullary carcinoma, of ovary:::cystadenocarcinoma, of ovary:::papillary carcinoma, of ovary:::sebaceous carcinoma of ovary:::metaplastic carcinoma, of ovary:::squamous cell carcinoma, of ovary:::villous adenocarcinoma of ovary:::seromucinous carcinoma of ovary:::adenocarcinoma in situ, of ovary:::adenocarcinoma, intestinal type of ovary:::teratocarcinoma of ovary:::sertoli cell carcinoma of ovary:::neuroendocrine carcinoma, of ovary:::embryonal carcinoma, of ovary:::adenocarcinoid tumor of ovary:::scirrhous adenocarcinoma of ovary', 'all_k_distances': '6.0432:::6.5364:::6.7072:::6.7773:::6.8416:::7.0828:::7.0911:::7.1390:::7.1420:::7.1727:::7.1923:::7.2456:::7.2635:::7.3015:::7.3667:::7.4366:::7.4374:::7.5513:::7.5794:::7.6882:::7.6929:::7.6994:::7.7832:::7.8821:::7.9171'}","{'ner_chunk': 'tumors', 'begin': 145, 'end': 150, 'ner_label': 'Tumor_Finding', 'ner_confidence': '0.8977', 'code': '8000/1', 'resolution': 'tumor', 'all_k_codes': '8000/1:::8040/1:::8001/1:::8000/6:::9365/3:::8940/0:::8103/0:::800:::8000/3:::8561/0:::9364/3:::8010/3:::9230/1:::8600/0:::8100/0:::8810/0:::8001/3:::C76.1:::9365/3-C76.1:::801-804:::881-883', 'all_k_resolutions': 'tumor:::tumorlet:::tumor cells:::tumor, secondary:::askin tumor:::mixed tumor:::pilar tumor:::neoplasms:::cancer:::warthin tumor:::ewing tumor:::carcinoma:::codman tumor:::thecoma:::brooke tumor:::fibroma:::tumor cells, malignant:::thorax:::askin tumor of thorax:::epithelial neoplasms:::fibromatous neoplasms', 'all_k_distances': '4.6173:::7.7560:::8.0597:::8.7563:::9.1611:::9.5313:::9.7827:::9.8473:::9.9463:::10.0686:::10.1560:::10.6913:::10.7213:::10.9294:::10.9957:::11.0306:::11.0859:::11.0929:::11.2195:::11.2438:::11.2929'}","{'ner_chunk': 'endometrial adenocarcinoma', 'begin': 152, 'end': 177, 'ner_label': 'Cancer_Dx', 'ner_confidence': '0.92715', 'code': '8380/3', 'resolution': 
'endometrioid adenocarcinoma', 'all_k_codes': '8380/3:::8560/3-C54.1:::8141/3-C54.1:::8262/3-C54.1:::8380/3-C57.9:::8480/3-C54.1:::8260/3-C54.1:::8211/3-C54.1:::8380/3-C54.0:::8380/3-C54.1:::8382/3-C54.1:::8380/3-C57.4:::8440/3-C54.1:::8380/3-C53.0:::8380/0:::8245/3-C54.1:::8323/3-C54.1:::8480/2-C54.1:::8481/3-C54.1:::8441/2-C54.1:::8382/3-C57.9:::8380/3-C53.8:::8141/3-C53.0:::8933/3-C54.1', 'all_k_resolutions': 'endometrioid adenocarcinoma:::adenosquamous carcinoma of endometrium:::scirrhous adenocarcinoma of endometrium:::villous adenocarcinoma of endometrium:::endometrioid adenocarcinoma, of female genital tract:::mucinous adenocarcinoma of endometrium:::papillary adenocarcinoma, of endometrium:::tubular adenocarcinoma of endometrium:::endometrioid adenocarcinoma, of isthmus uteri:::endometrioid adenocarcinoma, of endometrium:::endometrioid adenocarcinoma, secretory variant of endometrium:::endometrioid adenocarcinoma, of uterine adnexa:::cystadenocarcinoma, of endometrium:::endometrioid adenocarcinoma, of endocervix:::endometrioid adenoma:::adenocarcinoid tumor of endometrium:::mixed cell adenocarcinoma of endometrium:::mucinous adenocarcinoma in situ of endometrium:::mucin-producing adenocarcinoma of endometrium:::serous intraepithelial carcinoma of endometrium:::endometrioid adenocarcinoma, secretory variant of female genital tract:::endometrioid adenocarcinoma, of overlapping lesion of cervix uteri:::scirrhous adenocarcinoma of endocervix:::adenosarcoma of endometrium', 'all_k_distances': '3.1721:::4.4318:::4.5417:::4.7098:::4.7277:::4.8140:::5.0247:::5.1204:::5.1665:::5.2567:::5.3635:::5.4178:::5.4381:::5.4606:::5.4638:::5.5854:::5.5956:::5.6376:::5.6406:::5.6458:::5.6582:::5.6823:::5.6906:::5.7367'}"
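
The raw output is hard to read because each cell holds a whole entity dictionary and the candidate lists are packed into `:::`-delimited strings. The sketch below flattens the response into one record per entity and splits the candidate fields. It assumes, based on the DataFrame shape shown above, that `data["predictions"]` is a list with one inner list of entity dictionaries per document; the function name, the `top_k` parameter, and the abbreviated sample are illustrative only.

```python
def flatten_predictions(predictions, top_k=3):
    """Turn nested predictions into one flat record per detected entity.

    predictions: list (one item per document) of lists of entity dicts.
    The ':::'-delimited candidate fields are split and zipped together.
    """
    rows = []
    for doc_index, entities in enumerate(predictions):
        for entity in entities:
            codes = entity["all_k_codes"].split(":::")
            resolutions = entity["all_k_resolutions"].split(":::")
            distances = [float(d) for d in entity["all_k_distances"].split(":::")]
            rows.append({
                "document": doc_index,
                "ner_chunk": entity["ner_chunk"],
                "begin": entity["begin"],
                "end": entity["end"],
                "ner_label": entity["ner_label"],
                "ner_confidence": float(entity["ner_confidence"]),
                "code": entity["code"],
                "resolution": entity["resolution"],
                # Keep only the top_k closest candidates as (code, name, distance).
                "top_candidates": list(zip(codes, resolutions, distances))[:top_k],
            })
    return rows


# Sample abbreviated from the output above (first three candidates only).
sample_predictions = [[
    {
        "ner_chunk": "carcinoma", "begin": 74, "end": 82,
        "ner_label": "Cancer_Dx", "ner_confidence": "0.9491",
        "code": "8010/3", "resolution": "carcinoma",
        "all_k_codes": "8010/3:::8010/9:::8420/3",
        "all_k_resolutions": "carcinoma:::carcinomatosis:::ceruminous carcinoma",
        "all_k_distances": "0.0000:::3.8145:::6.1551",
    }
]]

for row in flatten_predictions(sample_predictions):
    print(row["ner_chunk"], row["code"], row["top_candidates"])
```

The resulting list of dictionaries can be passed to `pd.DataFrame(...)` to get one row per entity with named columns.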


#### Example 2

  **Input format**:
  
  
```json
{
    "text": [
        "Text document 1",
        "Text document 2",
        ...
    ]
}
```

In [15]:
input_json_data = {"text": docs}

data =  process_data_and_invoke_realtime_endpoint(input_json_data, content_type="application/json" , accept="application/json" )
pd.DataFrame(data["predictions"])

Unnamed: 0,0,1,2,3,4,5,6
0,"{'ner_chunk': 'tumor', 'begin': 60, 'end': 64, 'ner_label': 'Tumor_Finding', 'ner_confidence': '0.9633', 'code': '8000/1', 'resolution': 'tumor', 'all_k_codes': '8000/1:::8040/1:::8001/1:::9365/3:::8000/6:::8103/0:::9364/3:::8940/0:::8561/0:::9230/1:::8000/3:::9365/3-C76.1:::8100/0:::8158/3:::800:::8711/0:::9135/1:::8935/1:::8010/3:::8815/1:::8960/3:::8312/3:::8153/3', 'all_k_resolutions': 'tumor:::tumorlet:::tumor cells:::askin tumor:::tumor, secondary:::pilar tumor:::ewing tumor:::mixed tumor:::warthin tumor:::codman tumor:::cancer:::askin tumor of thorax:::brooke tumor:::acth-producing tumor:::neoplasms:::glomus tumor:::dabska tumor:::stromal tumor:::carcinoma:::localized fibrous tumor:::wilms tumor:::grawitz tumor:::g cell tumor', 'all_k_distances': '0.0000:::6.2854:::7.2306:::8.0490:::8.2619:::9.2572:::9.3351:::9.5498:::9.8335:::10.1951:::10.2648:::10.3364:::10.7574:::10.8951:::10.9171:::10.9504:::10.9568:::11.1854:::11.2265:::11.2477:::11.2602:::11.2972:::11.3218'}","{'ner_chunk': 'breast cancer', 'begin': 85, 'end': 97, 'ner_label': 'Oncological', 'ner_confidence': '0.939', 'code': '8010/3-C50.9', 'resolution': 'carcinoma, of breast', 'all_k_codes': '8010/3-C50.9:::8140/3-C50.9:::8980/3-C50.9:::8230/3-C50.9:::8520/3-C50.9:::8575/3-C50.9:::8510/3-C50.9:::8051/3-C50.9:::8440/3-C50.9:::8201/3-C50.9:::8410/3-C50.9:::8070/3-C50.9:::8050/3-C50.9:::8550/3-C50.9:::8530/3-C50.9:::8560/3-C50.9:::8501/3-C50.9:::8800/3-C50.9:::8810/3-C50.9:::8010/2-C50.9:::8010/3-C50.6:::8145/3-C50.9:::8500/3-C50.9:::8021/3-C50.9:::9010/0-C50.9', 'all_k_resolutions': 'carcinoma, of breast:::adenocarcinoma, of breast:::carcinosarcoma, of breast:::solid carcinoma, of breast:::lobular carcinoma, of breast:::metaplastic carcinoma, of breast:::medullary carcinoma, of breast:::verrucous carcinoma, of breast:::cystadenocarcinoma, of breast:::cribriform carcinoma, of breast:::sebaceous carcinoma of breast:::squamous cell carcinoma, of breast:::papillary carcinoma, of breast:::acinar cell 
carcinoma of breast:::inflammatory carcinoma of breast:::adenosquamous carcinoma of breast:::comedocarcinoma, of breast:::sarcoma, of breast:::fibrosarcoma, of breast:::carcinoma in situ, of breast:::carcinoma, of axillary tail of breast:::carcinoma, diffuse type of breast:::infiltrating duct carcinoma, of breast:::carcinoma, anaplastic, of breast:::fibroadenoma, of breast', 'all_k_distances': '5.7740:::6.7579:::7.2199:::7.2368:::7.2713:::7.4459:::7.4703:::7.5246:::7.7778:::7.8426:::7.8444:::7.8521:::7.8565:::7.8621:::7.8664:::7.8862:::7.8897:::7.9308:::8.0135:::8.1355:::8.2741:::8.2747:::8.3045:::8.3297:::8.3516'}","{'ner_chunk': 'hepatocellular', 'begin': 102, 'end': 115, 'ner_label': 'Histological_Type', 'ner_confidence': '0.9995', 'code': 'C22.0', 'resolution': 'liver', 'all_k_codes': 'C22.0:::C22:::C22.1:::8140/0-C22.0:::9080/1-C22.0:::8010/3-C22.0:::C24.0:::8140/3-C22.0:::8170/3:::C24.9:::8440/0-C22.0:::C21.2:::C18.3:::C71:::C42.3:::C75.9:::C42.2:::C26.0:::8576/3:::8001/3-C22.0:::8000/3-C22.0:::8800/3-C22.0:::C25.3', 'all_k_resolutions': 'liver:::liver and intrahepatic bile ducts:::intrahepatic bile duct:::adenoma, of liver:::teratoma, of liver:::carcinoma, of liver:::extrahepatic bile duct:::adenocarcinoma, of liver:::hepatoma:::billiary tract:::cystadenoma, of liver:::cloacogenic zone:::hepatic flexure of colon:::brain:::reticuloendothelial system:::endocrine gland:::spleen:::intestinal tract:::hepatoid carcinoma:::tumor cells, malignant of liver:::neoplasm, malignant of liver:::sarcoma, of liver:::pancreatic duct', 'all_k_distances': '8.9670:::9.5345:::10.0203:::10.5846:::11.4080:::11.4539:::11.5473:::11.7515:::12.0136:::12.1413:::12.1659:::12.4287:::12.5463:::12.5747:::12.5802:::12.6025:::12.6953:::12.6954:::12.8245:::12.8977:::12.9020:::12.9039:::12.9249'}","{'ner_chunk': 'carcinoma', 'begin': 117, 'end': 125, 'ner_label': 'Cancer_Dx', 'ner_confidence': '0.971', 'code': '8010/3', 'resolution': 'carcinoma', 'all_k_codes': 
'8010/3:::8010/9:::8420/3:::8480/3:::8240/3:::8550/3:::8140/3:::8010/6:::8530/3:::8980/3:::8070/3:::8010/3-C06.9:::8054/3:::8520/3:::8010/2:::8244/3:::8141/3:::8051/3:::8575/3:::8481/3:::8110/3:::8441/3', 'all_k_resolutions': 'carcinoma:::carcinomatosis:::ceruminous carcinoma:::mucous carcinoma:::carcinoid:::acinar carcinoma:::adenocarcinoma:::secondary carcinoma:::inflammatory carcinoma:::carcinosarcoma:::squamous carcinoma:::carcinoma, of mouth:::warty carcinoma:::lobular carcinoma:::carcinoma in situ:::composite carcinoid:::scirrhous carcinoma:::verrucous carcinoma:::metaplastic carcinoma:::mucin-producing carcinoma:::matrical carcinoma:::serous carcinoma', 'all_k_distances': '0.0000:::3.8145:::6.1551:::6.1916:::6.2402:::6.7504:::6.9016:::6.9301:::7.1193:::7.1926:::7.2897:::7.3397:::7.4187:::7.4689:::7.4910:::7.6357:::7.6536:::7.7765:::7.7787:::7.7825:::7.8719:::7.8739'}","{'ner_chunk': 'colorectal cancer', 'begin': 130, 'end': 146, 'ner_label': 'Oncological', 'ner_confidence': '0.9151', 'code': '8010/3-C18.9', 'resolution': 'carcinoma, of colon', 'all_k_codes': '8010/3-C18.9:::8140/3-C18.9:::8010/3-C18.6:::8140/3-C18.6:::8144/3-C18.9:::8070/3-C18.9:::8144/3-C18.4:::8230/3-C18.9:::8230/3-C18.6:::8980/3-C18.9:::8010/3-C18.4:::8010/3-C18.2:::8144/3-C18.3:::8140/3-C18.4:::8010/3-C17.1:::8145/3-C18.9:::8070/3-C18.6:::8010/3-C26.0:::8144/3-C18.0:::8230/3-C18.4:::8145/3-C18.4:::8140/3-C26.0:::8144/3-C18.2:::8140/3-C18.3:::8010/3-C18.0', 'all_k_resolutions': 'carcinoma, of colon:::adenocarcinoma, of colon:::carcinoma, of descending colon:::adenocarcinoma, of descending colon:::adenocarcinoma, intestinal type of colon:::squamous cell carcinoma, of colon:::adenocarcinoma, intestinal type of transverse colon:::solid carcinoma, of colon:::solid carcinoma, of descending colon:::carcinosarcoma, of colon:::carcinoma, of transverse colon:::carcinoma, of ascending colon:::adenocarcinoma, intestinal type of hepatic flexure of colon:::adenocarcinoma, of transverse 
colon:::carcinoma, of jejunum:::carcinoma, diffuse type of colon:::squamous cell carcinoma, of descending colon:::carcinoma, of intestinal tract:::adenocarcinoma, intestinal type of cecum:::solid carcinoma, of transverse colon:::carcinoma, diffuse type of transverse colon:::adenocarcinoma, of intestinal tract:::adenocarcinoma, intestinal type of ascending colon:::adenocarcinoma, of hepatic flexure of colon:::carcinoma, of cecum', 'all_k_distances': '7.9763:::8.4326:::8.4762:::8.6332:::8.6565:::8.6715:::8.6837:::8.7222:::8.8114:::8.8851:::8.9076:::8.9230:::8.9351:::8.9363:::9.0841:::9.0984:::9.1072:::9.1194:::9.1574:::9.1843:::9.1883:::9.1909:::9.1956:::9.2173:::9.2272'}","{'ner_chunk': 'parathyroid carcinoma', 'begin': 152, 'end': 172, 'ner_label': 'Oncological', 'ner_confidence': '0.8334', 'code': '8010/3-C75.0', 'resolution': 'carcinoma, of parathyroid gland', 'all_k_codes': '8010/3-C75.0:::8140/3-C75.0:::8141/3-C75.0:::8320/3-C75.0:::8010/2-C75.0:::8022/3-C75.0:::8021/3-C75.0:::8015/3-C75.0:::8140/2-C75.0:::8290/3-C75.0:::9081/3-C75.0:::8323/3-C75.0:::8000/3-C75.0:::9070/3-C75.0:::8012/3-C75.0:::8001/3-C75.0:::9060/3-C75.0:::8147/3-C75.0:::9064/3-C75.0:::8000/6-C75.0:::8011/3-C75.0:::8321/0-C75.0:::8143/3-C75.0:::8255/3-C75.0:::9503/3-C75.0', 'all_k_resolutions': 'carcinoma, of parathyroid gland:::adenocarcinoma, of parathyroid gland:::scirrhous adenocarcinoma of parathyroid gland:::granular cell carcinoma of parathyroid gland:::carcinoma in situ, of parathyroid gland:::pleomorphic carcinoma of parathyroid gland:::carcinoma, anaplastic, of parathyroid gland:::glassy cell carcinoma of parathyroid gland:::adenocarcinoma in situ, of parathyroid gland:::oxyphilic adenocarcinoma of parathyroid gland:::teratocarcinoma of parathyroid gland:::mixed cell adenocarcinoma of parathyroid gland:::neoplasm, malignant of parathyroid gland:::embryonal carcinoma, of parathyroid gland:::large cell carcinoma, of parathyroid gland:::tumor cells, malignant of parathyroid 
gland:::dysgerminoma of parathyroid gland:::basal cell adenocarcinoma of parathyroid gland:::germinoma of parathyroid gland:::neoplasm, metastatic of parathyroid gland:::epithelioma, malignant of parathyroid gland:::chief cell adenoma of parathyroid gland:::superficial spreading adenocarcinoma of parathyroid gland:::adenocarcinoma with mixed subtypes of parathyroid gland:::neuroepithelioma, of parathyroid gland', 'all_k_distances': '5.6365:::5.9059:::6.3216:::6.3662:::6.4008:::6.4764:::6.5107:::6.5738:::6.9338:::7.1116:::7.1136:::7.1309:::7.3423:::7.3451:::7.4478:::7.4911:::7.5117:::7.5150:::7.5195:::7.6392:::7.6936:::7.8961:::7.9731:::8.0687:::8.1319'}",
1,"{'ner_chunk': 'cancers', 'begin': 45, 'end': 51, 'ner_label': 'Oncological', 'ner_confidence': '0.962', 'code': '8000/3', 'resolution': 'cancer', 'all_k_codes': '8000/3:::8010/3:::8010/9:::800:::8420/3:::8140/3:::8010/3-C76.0:::8010/6:::8010/3-C44.5:::8010/3-C26.0:::8010/3-C76.1:::8000/1:::8240/3:::8010/3-C06.9:::8021/3:::8010/9-C44.9:::8530/3:::8550/3:::8001/1:::8010/3-C77.8:::8230/3:::8010/3-C21.0:::8070/3:::8010/3-C44.9', 'all_k_resolutions': 'cancer:::carcinoma:::carcinomatosis:::neoplasms:::ceruminous carcinoma:::adenocarcinoma:::carcinoma, of head, face or neck:::secondary carcinoma:::carcinoma, of skin of trunk:::carcinoma, of intestinal tract:::carcinoma, of thorax:::neoplasm:::carcinoid:::carcinoma, of mouth:::carcinoma, anaplastic:::carcinomatosis of skin:::inflammatory carcinoma:::acinar carcinoma:::tumor cells:::carcinoma, of lymph nodes of multiple regions:::solid carcinoma:::carcinoma, of anus:::squamous carcinoma:::carcinoma, of skin', 'all_k_distances': '4.5858:::8.0682:::8.9112:::9.9677:::10.0723:::10.1706:::10.3023:::10.3163:::10.3208:::10.3436:::10.3452:::10.3817:::10.3965:::10.4648:::10.4772:::10.5091:::10.5263:::10.6105:::10.6343:::10.6414:::10.6562:::10.6711:::10.6929:::10.7156'}","{'ner_chunk': 'bladder cancer', 'begin': 64, 'end': 77, 'ner_label': 'Oncological', 'ner_confidence': '0.92655003', 'code': '8010/3-C67.9', 'resolution': 'carcinoma, of bladder', 'all_k_codes': '8010/3-C67.9:::8010/3-C67.5:::8230/3-C67.9:::8140/3-C67.9:::8441/3-C67.9:::8120/3-C67.9:::8070/3-C67.9:::8980/3-C67.9:::8140/3-C67.5:::8230/3-C67.5:::8051/3-C67.9:::8510/3-C67.9:::8050/3-C67.9:::8051/3-C67.5:::8560/3-C67.9:::8010/2-C67.9:::8070/3-C67.5:::8120/3-C67.5:::8010/3-C67.1:::8130/3-C67.9:::8120/2-C67.9:::8510/3-C67.5:::8050/3-C67.5:::8120/3:::8980/3-C67.5', 'all_k_resolutions': 'carcinoma, of bladder:::carcinoma, of bladder neck:::solid carcinoma, of bladder:::adenocarcinoma, of bladder:::serous carcinoma, of bladder:::transitional cell carcinoma, of 
bladder:::squamous cell carcinoma, of bladder:::carcinosarcoma, of bladder:::adenocarcinoma, of bladder neck:::solid carcinoma, of bladder neck:::verrucous carcinoma, of bladder:::medullary carcinoma, of bladder:::papillary carcinoma, of bladder:::verrucous carcinoma, of bladder neck:::adenosquamous carcinoma of bladder:::carcinoma in situ, of bladder:::squamous cell carcinoma, of bladder neck:::transitional cell carcinoma, of bladder neck:::carcinoma, of dome of bladder:::papillary urothelial carcinoma of bladder:::urothelial carcinoma in situ of bladder:::medullary carcinoma, of bladder neck:::papillary carcinoma, of bladder neck:::urothelial carcinoma:::carcinosarcoma, of bladder neck', 'all_k_distances': '6.5813:::6.8966:::7.0982:::7.1115:::7.1915:::7.2715:::7.3158:::7.5210:::7.6405:::7.6936:::7.7054:::7.7917:::7.8292:::7.8487:::7.9305:::8.0157:::8.0769:::8.1050:::8.1611:::8.1740:::8.2709:::8.2711:::8.2749:::8.2969:::8.3491'}","{'ner_chunk': 'skin cancer', 'begin': 85, 'end': 95, 'ner_label': 'Oncological', 'ner_confidence': '0.8052', 'code': '8010/3-C44.9', 'resolution': 'carcinoma, of skin', 'all_k_codes': '8010/3-C44.9:::8010/9-C44.9:::8070/3-C44.9:::8140/3-C44.9:::8980/3-C44.9:::8010/3-C44.5:::8409/3-C44.9:::8560/3-C44.9:::8051/3-C44.9:::8010/2-C44.9:::8201/3-C44.9:::8575/3-C44.9:::8390/3:::8230/3-C44.9:::8070/3:::8094/3-C44.9:::8410/3-C44.9:::8110/3-C44.9:::8010/3:::8070/3-C44.5:::8010/3-C44.4:::8051/3-C44.5:::8247/3-C44.9:::8440/3-C44.9', 'all_k_resolutions': 'carcinoma, of skin:::carcinomatosis of skin:::squamous cell carcinoma, of skin:::adenocarcinoma, of skin:::carcinosarcoma, of skin:::carcinoma, of skin of trunk:::porocarcinoma, of skin:::adenosquamous carcinoma of skin:::verrucous carcinoma, of skin:::carcinoma in situ, of skin:::cribriform carcinoma, of skin:::metaplastic carcinoma, of skin:::skin appendage carcinoma:::solid carcinoma, of skin:::squamous carcinoma:::basosquamous carcinoma of skin:::sebaceous carcinoma of skin:::pilomatrical 
carcinoma of skin:::carcinoma:::squamous cell carcinoma, of skin of trunk:::carcinoma, of skin of scalp and neck:::verrucous carcinoma, of skin of trunk:::merkel cell carcinoma of skin:::cystadenocarcinoma, of skin', 'all_k_distances': '7.0142:::7.0919:::7.6409:::7.9804:::7.9894:::8.1074:::8.2439:::8.2525:::8.3252:::8.3588:::8.3853:::8.4741:::8.4997:::8.5081:::8.5755:::8.7097:::8.8165:::8.8351:::8.8808:::8.9103:::8.9701:::9.1659:::9.1903:::9.2141'}","{'ner_chunk': 'tumor', 'begin': 164, 'end': 168, 'ner_label': 'Oncological', 'ner_confidence': '0.9592', 'code': '8000/1', 'resolution': 'tumor', 'all_k_codes': '8000/1:::8040/1:::8001/1:::9365/3:::8000/6:::8103/0:::9364/3:::8940/0:::8561/0:::9230/1:::8000/3:::9365/3-C76.1:::8100/0:::8158/3:::800:::8711/0:::9135/1:::8935/1:::8010/3:::8815/1:::8960/3:::8312/3:::8153/3', 'all_k_resolutions': 'tumor:::tumorlet:::tumor cells:::askin tumor:::tumor, secondary:::pilar tumor:::ewing tumor:::mixed tumor:::warthin tumor:::codman tumor:::cancer:::askin tumor of thorax:::brooke tumor:::acth-producing tumor:::neoplasms:::glomus tumor:::dabska tumor:::stromal tumor:::carcinoma:::localized fibrous tumor:::wilms tumor:::grawitz tumor:::g cell tumor', 'all_k_distances': '0.0000:::6.2854:::7.2306:::8.0490:::8.2619:::9.2572:::9.3351:::9.5498:::9.8335:::10.1951:::10.2648:::10.3364:::10.7574:::10.8951:::10.9171:::10.9504:::10.9568:::11.1854:::11.2265:::11.2477:::11.2602:::11.2972:::11.3218'}","{'ner_chunk': 'endometrial cancer', 'begin': 222, 'end': 239, 'ner_label': 'Oncological', 'ner_confidence': '0.96370006', 'code': '8380/3', 'resolution': 'endometrioid carcinoma', 'all_k_codes': '8380/3:::8010/3-C54.1:::8380/3-C57.9:::8575/3-C54.1:::8560/3-C54.1:::8441/3-C54.1:::8140/3-C54.1:::8051/3-C54.1:::8384/3-C54.1:::8230/3-C54.1:::8440/3-C54.1:::8021/3-C54.1:::8010/2-C54.1:::8070/3-C54.1:::8380/3-C53.0:::8262/3-C54.1:::8575/3-C53.0:::8201/3-C54.1:::8120/3-C54.1:::8980/3-C54.1:::8050/3-C54.1:::8380/3-C54.1:::8510/3-C54.1:::8140/2-C54.1', 
'all_k_resolutions': 'endometrioid carcinoma:::carcinoma, of endometrium:::endometrioid adenocarcinoma, of female genital tract:::metaplastic carcinoma, of endometrium:::adenosquamous carcinoma of endometrium:::serous carcinoma, of endometrium:::adenocarcinoma, of endometrium:::verrucous carcinoma, of endometrium:::adenocarcinoma, endocervical type, of endometrium:::solid carcinoma, of endometrium:::cystadenocarcinoma, of endometrium:::carcinoma, anaplastic, of endometrium:::carcinoma in situ, of endometrium:::squamous cell carcinoma, of endometrium:::endometrioid adenocarcinoma, of endocervix:::villous adenocarcinoma of endometrium:::metaplastic carcinoma, of endocervix:::cribriform carcinoma, of endometrium:::transitional cell carcinoma, of endometrium:::carcinosarcoma, of endometrium:::papillary carcinoma, of endometrium:::endometrioid adenocarcinoma, of endometrium:::medullary carcinoma, of endometrium:::adenocarcinoma in situ, of endometrium', 'all_k_distances': '5.9497:::6.3493:::6.4270:::6.7552:::6.9049:::6.9146:::6.9304:::6.9807:::7.0144:::7.0688:::7.0777:::7.0950:::7.1113:::7.1223:::7.1493:::7.2301:::7.2546:::7.2696:::7.2941:::7.3189:::7.3300:::7.3586:::7.3971:::7.3997'}","{'ner_chunk': 'squamous cell', 'begin': 245, 'end': 257, 'ner_label': 'Histological_Type', 'ner_confidence': '0.9893', 'code': '805-808', 'resolution': 'squamous cell neoplasms', 'all_k_codes': '805-808:::8070/3:::C08.1:::C32.2:::9084/0:::C09.1:::8075/3:::C44.0:::C69.0:::8720/0:::C44.4:::C44:::C06.0:::C33:::8074/3:::C32:::C72.0:::8074/3-C76.0:::8077/0:::C02.2:::8070/3-C44.9:::8075/3-C44.9:::C09:::C00.5', 'all_k_resolutions': 'squamous cell neoplasms:::squamous carcinoma:::sublingual gland:::subglottis:::dermoid:::tonsillar pillar:::squamous cell carcinoma, adenoid:::skin of lip:::conjunctiva:::nevus:::skin of scalp and neck:::skin:::cheeck mucosa:::trachea:::squamous cell carcinoma, spindle cell:::larynx:::spinal cord:::squamous cell carcinoma, spindle cell of head, face or neck:::low 
grade squamous intraepithelial lesion:::ventral surface of tongue:::squamous cell carcinoma, of skin:::squamous cell carcinoma, adenoid of skin:::tonsil:::mucosa of lip', 'all_k_distances': '9.6168:::10.4551:::11.0612:::11.5796:::11.7520:::11.9531:::12.0110:::12.0516:::12.0576:::12.0851:::12.0955:::12.0970:::12.1025:::12.1131:::12.1507:::12.1573:::12.1589:::12.3159:::12.4133:::12.4633:::12.4834:::12.5242:::12.5655:::12.5964'}","{'ner_chunk': 'carcinoma', 'begin': 259, 'end': 267, 'ner_label': 'Cancer_Dx', 'ner_confidence': '0.994', 'code': '8010/3', 'resolution': 'carcinoma', 'all_k_codes': '8010/3:::8010/9:::8420/3:::8480/3:::8240/3:::8550/3:::8140/3:::8010/6:::8530/3:::8980/3:::8070/3:::8010/3-C06.9:::8054/3:::8520/3:::8010/2:::8244/3:::8141/3:::8051/3:::8575/3:::8481/3:::8110/3:::8441/3', 'all_k_resolutions': 'carcinoma:::carcinomatosis:::ceruminous carcinoma:::mucous carcinoma:::carcinoid:::acinar carcinoma:::adenocarcinoma:::secondary carcinoma:::inflammatory carcinoma:::carcinosarcoma:::squamous carcinoma:::carcinoma, of mouth:::warty carcinoma:::lobular carcinoma:::carcinoma in situ:::composite carcinoid:::scirrhous carcinoma:::verrucous carcinoma:::metaplastic carcinoma:::mucin-producing carcinoma:::matrical carcinoma:::serous carcinoma', 'all_k_distances': '0.0000:::3.8145:::6.1551:::6.1916:::6.2402:::6.7504:::6.9016:::6.9301:::7.1193:::7.1926:::7.2897:::7.3397:::7.4187:::7.4689:::7.4910:::7.6357:::7.6536:::7.7765:::7.7787:::7.7825:::7.8719:::7.8739'}"


### JSON Lines

In [16]:
import json

def create_jsonl(records):
    json_records = []

    for text in records:
        record = {
            "text": text
        }
        json_records.append(record)

    json_lines = '\n'.join(json.dumps(record) for record in json_records)

    return json_lines

input_jsonl_data = create_jsonl(docs)

#### Example 1

  **Input format**:
  
```json
{"text": "Text document 1"}
{"text": "Text document 2"}
```
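The format above is one JSON object per line, each carrying a `text` key. A minimal sketch for checking a payload before sending it is shown below; `validate_jsonl` is a hypothetical helper, not part of the original notebook, and the rule that `text` must be a string is an assumption drawn from the example input.

```python
import json

def validate_jsonl(payload):
    """Return the texts in a JSON Lines payload; raise ValueError on a malformed line."""
    texts = []
    for line_number, line in enumerate(payload.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines between records
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {line_number} is not valid JSON") from exc
        # Assumption: every record is an object with a string "text" field.
        if not isinstance(record, dict) or not isinstance(record.get("text"), str):
            raise ValueError(f"line {line_number} needs a string 'text' field")
        texts.append(record["text"])
    return texts

sample = '{"text": "Text document 1"}\n{"text": "Text document 2"}'
print(validate_jsonl(sample))
```

Running a check like this on `input_jsonl_data` before invoking the endpoint can surface formatting mistakes locally instead of as an endpoint error.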

In [17]:
data = process_data_and_invoke_realtime_endpoint(input_jsonl_data, content_type="application/jsonlines", accept="application/jsonlines")
print(data)

{"predictions": [{"ner_chunk": "tumor", "begin": 60, "end": 64, "ner_label": "Tumor_Finding", "ner_confidence": "0.9633", "code": "8000/1", "resolution": "tumor", "all_k_codes": "8000/1:::8040/1:::8001/1:::9365/3:::8000/6:::8103/0:::9364/3:::8940/0:::8561/0:::9230/1:::8000/3:::9365/3-C76.1:::8100/0:::8158/3:::800:::8711/0:::9135/1:::8935/1:::8010/3:::8815/1:::8960/3:::8312/3:::8153/3", "all_k_resolutions": "tumor:::tumorlet:::tumor cells:::askin tumor:::tumor, secondary:::pilar tumor:::ewing tumor:::mixed tumor:::warthin tumor:::codman tumor:::cancer:::askin tumor of thorax:::brooke tumor:::acth-producing tumor:::neoplasms:::glomus tumor:::dabska tumor:::stromal tumor:::carcinoma:::localized fibrous tumor:::wilms tumor:::grawitz tumor:::g cell tumor", "all_k_distances": "0.0000:::6.2854:::7.2306:::8.0490:::8.2619:::9.2572:::9.3351:::9.5498:::9.8335:::10.1951:::10.2648:::10.3364:::10.7574:::10.8951:::10.9171:::10.9504:::10.9568:::11.1854:::11.2265:::11.2477:::11.2602:::11.2972:::11.3218
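In the output shown here, each prediction packs its ranked candidates into three `:::`-delimited strings: `all_k_codes`, `all_k_resolutions`, and `all_k_distances`. The sketch below splits them back into aligned tuples. `top_candidates` is a hypothetical helper, and the `example` record is abbreviated to the first three candidates of the `tumor` prediction above.

```python
def top_candidates(prediction, k=3):
    """Return up to k (code, resolution, distance) tuples from one prediction."""
    codes = prediction["all_k_codes"].split(":::")
    names = prediction["all_k_resolutions"].split(":::")
    distances = [float(d) for d in prediction["all_k_distances"].split(":::")]
    # zip stops at the shortest list, so a length mismatch cannot raise here.
    return list(zip(codes, names, distances))[:k]

# Abbreviated copy of the "tumor" prediction shown in the output above.
example = {
    "ner_chunk": "tumor",
    "code": "8000/1",
    "all_k_codes": "8000/1:::8040/1:::8001/1",
    "all_k_resolutions": "tumor:::tumorlet:::tumor cells",
    "all_k_distances": "0.0000:::6.2854:::7.2306",
}

for code, name, distance in top_candidates(example):
    print(code, name, distance)
```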

### C. Delete the endpoint

Now that you have performed real-time inference, you no longer need the endpoint. Delete it to avoid being charged.

In [18]:
model.sagemaker_session.delete_endpoint(model_name)
model.sagemaker_session.delete_endpoint_config(model_name)
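The same cleanup can be expressed against the low-level boto3 SageMaker client, which exposes `delete_endpoint` and `delete_endpoint_config`. The sketch below assumes, as the cell above does, that the endpoint configuration shares the endpoint's name; `delete_endpoint_resources` is a hypothetical helper, not part of the notebook.

```python
def delete_endpoint_resources(sm_client, endpoint_name, endpoint_config_name=None):
    """Delete an endpoint and its configuration; return the two names used."""
    # Assumption: the config is named after the endpoint unless told otherwise.
    config_name = endpoint_config_name or endpoint_name
    sm_client.delete_endpoint(EndpointName=endpoint_name)
    sm_client.delete_endpoint_config(EndpointConfigName=config_name)
    return endpoint_name, config_name

# Usage inside the notebook (requires AWS credentials, so it is left commented out):
# delete_endpoint_resources(boto3.client("sagemaker"), model_name)
```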

## 3. Batch inference

In [19]:
import json
import os

input_dir = 'inputs/batch'
json_input_dir = f"{input_dir}/json"
jsonl_input_dir = f"{input_dir}/jsonl"

output_dir = 'outputs/batch'
json_output_dir = f"{output_dir}/json"
jsonl_output_dir = f"{output_dir}/jsonl"

os.makedirs(json_input_dir, exist_ok=True)
os.makedirs(jsonl_input_dir, exist_ok=True)
os.makedirs(json_output_dir, exist_ok=True)
os.makedirs(jsonl_output_dir, exist_ok=True)

validation_json_file_name = "input.json"

validation_jsonl_file_name = "input.jsonl"

validation_input_json_path = f"s3://{s3_bucket}/{model_name}/validation-input/batch/json/"
validation_output_json_path = f"s3://{s3_bucket}/{model_name}/validation-output/batch/json/"

validation_input_jsonl_path = f"s3://{s3_bucket}/{model_name}/validation-input/batch/jsonl/"
validation_output_jsonl_path = f"s3://{s3_bucket}/{model_name}/validation-output/batch/jsonl/"

def write_and_upload_to_s3(input_data, file_name):
    # The file extension (".json" or ".jsonl") selects the S3 sub-prefix.
    file_format = os.path.splitext(file_name)[1].lower()
    if file_format == ".json":
        # JSON input arrives as a dict; JSON Lines input is already a string.
        input_data = json.dumps(input_data)

    # Keep a local copy of the payload.
    with open(file_name, "w") as f:
        f.write(input_data)

    s3_client.put_object(
        Bucket=s3_bucket,
        Key=f"{model_name}/validation-input/batch/{file_format[1:]}/{os.path.basename(file_name)}",
        Body=input_data.encode("utf-8"),
    )

In [20]:
input_jsonl_data = create_jsonl(docs)
input_json_data = {"text": docs}

write_and_upload_to_s3(input_json_data, f"{json_input_dir}/{validation_json_file_name}")

write_and_upload_to_s3(input_jsonl_data, f"{jsonl_input_dir}/{validation_jsonl_file_name}")
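The upload helper derives the S3 key from the file extension and base name, so the local directory part of the path is dropped. The sketch below isolates that step; `batch_input_key` is a hypothetical helper and `"my-model"` is a placeholder for `model_name`.

```python
import os

def batch_input_key(model_name, file_name):
    """Build the S3 key that the upload helper would use for a local file."""
    extension = os.path.splitext(file_name)[1].lower().lstrip(".")
    return f"{model_name}/validation-input/batch/{extension}/{os.path.basename(file_name)}"

print(batch_input_key("my-model", "inputs/batch/json/input.json"))
# my-model/validation-input/batch/json/input.json
```

Because the key ends in the same `json/` or `jsonl/` folder as `validation_input_json_path` and `validation_input_jsonl_path`, each transform job only sees the files in its own format.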

### JSON

In [21]:
# Initialize a SageMaker Transformer object for making predictions
transformer = model.transformer(
    instance_count=1,
    instance_type=batch_transform_inference_instance_type,
    accept="application/json",
    output_path=validation_output_json_path
)

transformer.transform(validation_input_json_path, content_type="application/json")
transformer.wait()

INFO:sagemaker:Creating transform job with name: icdo-resolver-pipeline-en-2024-12-03-10-27-47-864


........................................24/12/03 10:34:27 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).

INFO:     Started server process [7]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8080 (Press CTRL+C to quit)
📋 Loading license number 0 from /root/.johnsnowlabs/licenses/license_number_{number}_for_Spark-Healthcare_Spark-OCR.json
👌 Launched cpu optimized session with with: 🚀Spark-NLP==5.5.0, 💊Spark-Healthcare==5.5.0, running on ⚡ Py

In [22]:
from urllib.parse import urlparse
import pandas as pd

def process_s3_json_output_and_save(validation_file_name):

    output_file_path = f"{json_output_dir}/{validation_file_name}.out"
    parsed_url = urlparse(transformer.output_path)
    file_key = f"{parsed_url.path[1:]}{validation_file_name}.out"
    response = s3_client.get_object(Bucket=s3_bucket, Key=file_key)

    data = json.loads(response["Body"].read().decode("utf-8"))
    df = pd.DataFrame(data["predictions"])
    display(df)

    # Save the data to the output file
    with open(output_file_path, 'w') as f_out:
        json.dump(data, f_out, indent=4)
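The function above relies on batch transform writing each result under the job's output path with `.out` appended to the input file name. The sketch below separates that lookup into a pure function; `transform_output_location` is a hypothetical helper, and the bucket and prefix in the example are placeholders.

```python
from urllib.parse import urlparse

def transform_output_location(output_path, input_file_name):
    """Split an s3:// output path into (bucket, key) for one input file's result."""
    parsed = urlparse(output_path)
    prefix = parsed.path.lstrip("/")
    # Tolerate an output path given without a trailing slash.
    if prefix and not prefix.endswith("/"):
        prefix += "/"
    return parsed.netloc, f"{prefix}{input_file_name}.out"

bucket, key = transform_output_location(
    "s3://my-bucket/my-model/validation-output/batch/json/", "input.json"
)
print(bucket, key)
# my-bucket my-model/validation-output/batch/json/input.json.out
```

Taking the bucket from the parsed URL rather than from `s3_bucket` keeps the lookup correct if the transformer is ever pointed at a different bucket.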

In [23]:
process_s3_json_output_and_save(validation_json_file_name)

Unnamed: 0,0,1,2,3,4,5,6
0,"{'ner_chunk': 'tumor', 'begin': 60, 'end': 64, 'ner_label': 'Tumor_Finding', 'ner_confidence': '0.9633', 'code': '8000/1', 'resolution': 'tumor', 'all_k_codes': '8000/1:::8040/1:::8001/1:::9365/3:::8000/6:::8103/0:::9364/3:::8940/0:::8561/0:::9230/1:::8000/3:::9365/3-C76.1:::8100/0:::8158/3:::800:::8711/0:::9135/1:::8935/1:::8010/3:::8815/1:::8960/3:::8312/3:::8153/3', 'all_k_resolutions': 'tumor:::tumorlet:::tumor cells:::askin tumor:::tumor, secondary:::pilar tumor:::ewing tumor:::mixed tumor:::warthin tumor:::codman tumor:::cancer:::askin tumor of thorax:::brooke tumor:::acth-producing tumor:::neoplasms:::glomus tumor:::dabska tumor:::stromal tumor:::carcinoma:::localized fibrous tumor:::wilms tumor:::grawitz tumor:::g cell tumor', 'all_k_distances': '0.0000:::6.2854:::7.2306:::8.0490:::8.2619:::9.2572:::9.3351:::9.5498:::9.8335:::10.1951:::10.2648:::10.3364:::10.7574:::10.8951:::10.9171:::10.9504:::10.9568:::11.1854:::11.2265:::11.2477:::11.2602:::11.2972:::11.3218'}","{'ner_chunk': 'breast cancer', 'begin': 85, 'end': 97, 'ner_label': 'Oncological', 'ner_confidence': '0.939', 'code': '8010/3-C50.9', 'resolution': 'carcinoma, of breast', 'all_k_codes': '8010/3-C50.9:::8140/3-C50.9:::8980/3-C50.9:::8230/3-C50.9:::8520/3-C50.9:::8575/3-C50.9:::8510/3-C50.9:::8051/3-C50.9:::8440/3-C50.9:::8201/3-C50.9:::8410/3-C50.9:::8070/3-C50.9:::8050/3-C50.9:::8550/3-C50.9:::8530/3-C50.9:::8560/3-C50.9:::8501/3-C50.9:::8800/3-C50.9:::8810/3-C50.9:::8010/2-C50.9:::8010/3-C50.6:::8145/3-C50.9:::8500/3-C50.9:::8021/3-C50.9:::9010/0-C50.9', 'all_k_resolutions': 'carcinoma, of breast:::adenocarcinoma, of breast:::carcinosarcoma, of breast:::solid carcinoma, of breast:::lobular carcinoma, of breast:::metaplastic carcinoma, of breast:::medullary carcinoma, of breast:::verrucous carcinoma, of breast:::cystadenocarcinoma, of breast:::cribriform carcinoma, of breast:::sebaceous carcinoma of breast:::squamous cell carcinoma, of breast:::papillary carcinoma, of breast:::acinar cell 
carcinoma of breast:::inflammatory carcinoma of breast:::adenosquamous carcinoma of breast:::comedocarcinoma, of breast:::sarcoma, of breast:::fibrosarcoma, of breast:::carcinoma in situ, of breast:::carcinoma, of axillary tail of breast:::carcinoma, diffuse type of breast:::infiltrating duct carcinoma, of breast:::carcinoma, anaplastic, of breast:::fibroadenoma, of breast', 'all_k_distances': '5.7740:::6.7579:::7.2199:::7.2368:::7.2713:::7.4459:::7.4703:::7.5246:::7.7778:::7.8426:::7.8444:::7.8521:::7.8565:::7.8621:::7.8664:::7.8862:::7.8897:::7.9308:::8.0135:::8.1355:::8.2741:::8.2747:::8.3045:::8.3297:::8.3516'}","{'ner_chunk': 'hepatocellular', 'begin': 102, 'end': 115, 'ner_label': 'Histological_Type', 'ner_confidence': '0.9995', 'code': 'C22.0', 'resolution': 'liver', 'all_k_codes': 'C22.0:::C22:::C22.1:::8140/0-C22.0:::9080/1-C22.0:::8010/3-C22.0:::C24.0:::8140/3-C22.0:::8170/3:::C24.9:::8440/0-C22.0:::C21.2:::C18.3:::C71:::C42.3:::C75.9:::C42.2:::C26.0:::8576/3:::8001/3-C22.0:::8000/3-C22.0:::8800/3-C22.0:::C25.3', 'all_k_resolutions': 'liver:::liver and intrahepatic bile ducts:::intrahepatic bile duct:::adenoma, of liver:::teratoma, of liver:::carcinoma, of liver:::extrahepatic bile duct:::adenocarcinoma, of liver:::hepatoma:::billiary tract:::cystadenoma, of liver:::cloacogenic zone:::hepatic flexure of colon:::brain:::reticuloendothelial system:::endocrine gland:::spleen:::intestinal tract:::hepatoid carcinoma:::tumor cells, malignant of liver:::neoplasm, malignant of liver:::sarcoma, of liver:::pancreatic duct', 'all_k_distances': '8.9670:::9.5345:::10.0203:::10.5846:::11.4080:::11.4539:::11.5473:::11.7515:::12.0136:::12.1413:::12.1659:::12.4287:::12.5463:::12.5747:::12.5802:::12.6025:::12.6953:::12.6954:::12.8245:::12.8977:::12.9020:::12.9039:::12.9249'}","{'ner_chunk': 'carcinoma', 'begin': 117, 'end': 125, 'ner_label': 'Cancer_Dx', 'ner_confidence': '0.971', 'code': '8010/3', 'resolution': 'carcinoma', 'all_k_codes': 
'8010/3:::8010/9:::8420/3:::8480/3:::8240/3:::8550/3:::8140/3:::8010/6:::8530/3:::8980/3:::8070/3:::8010/3-C06.9:::8054/3:::8520/3:::8010/2:::8244/3:::8141/3:::8051/3:::8575/3:::8481/3:::8110/3:::8441/3', 'all_k_resolutions': 'carcinoma:::carcinomatosis:::ceruminous carcinoma:::mucous carcinoma:::carcinoid:::acinar carcinoma:::adenocarcinoma:::secondary carcinoma:::inflammatory carcinoma:::carcinosarcoma:::squamous carcinoma:::carcinoma, of mouth:::warty carcinoma:::lobular carcinoma:::carcinoma in situ:::composite carcinoid:::scirrhous carcinoma:::verrucous carcinoma:::metaplastic carcinoma:::mucin-producing carcinoma:::matrical carcinoma:::serous carcinoma', 'all_k_distances': '0.0000:::3.8145:::6.1551:::6.1916:::6.2402:::6.7504:::6.9016:::6.9301:::7.1193:::7.1926:::7.2897:::7.3397:::7.4187:::7.4689:::7.4910:::7.6357:::7.6536:::7.7765:::7.7787:::7.7825:::7.8719:::7.8739'}","{'ner_chunk': 'colorectal cancer', 'begin': 130, 'end': 146, 'ner_label': 'Oncological', 'ner_confidence': '0.9151', 'code': '8010/3-C18.9', 'resolution': 'carcinoma, of colon', 'all_k_codes': '8010/3-C18.9:::8140/3-C18.9:::8010/3-C18.6:::8140/3-C18.6:::8144/3-C18.9:::8070/3-C18.9:::8144/3-C18.4:::8230/3-C18.9:::8230/3-C18.6:::8980/3-C18.9:::8010/3-C18.4:::8010/3-C18.2:::8144/3-C18.3:::8140/3-C18.4:::8010/3-C17.1:::8145/3-C18.9:::8070/3-C18.6:::8010/3-C26.0:::8144/3-C18.0:::8230/3-C18.4:::8145/3-C18.4:::8140/3-C26.0:::8144/3-C18.2:::8140/3-C18.3:::8010/3-C18.0', 'all_k_resolutions': 'carcinoma, of colon:::adenocarcinoma, of colon:::carcinoma, of descending colon:::adenocarcinoma, of descending colon:::adenocarcinoma, intestinal type of colon:::squamous cell carcinoma, of colon:::adenocarcinoma, intestinal type of transverse colon:::solid carcinoma, of colon:::solid carcinoma, of descending colon:::carcinosarcoma, of colon:::carcinoma, of transverse colon:::carcinoma, of ascending colon:::adenocarcinoma, intestinal type of hepatic flexure of colon:::adenocarcinoma, of transverse 
colon:::carcinoma, of jejunum:::carcinoma, diffuse type of colon:::squamous cell carcinoma, of descending colon:::carcinoma, of intestinal tract:::adenocarcinoma, intestinal type of cecum:::solid carcinoma, of transverse colon:::carcinoma, diffuse type of transverse colon:::adenocarcinoma, of intestinal tract:::adenocarcinoma, intestinal type of ascending colon:::adenocarcinoma, of hepatic flexure of colon:::carcinoma, of cecum', 'all_k_distances': '7.9763:::8.4326:::8.4762:::8.6332:::8.6565:::8.6715:::8.6837:::8.7222:::8.8114:::8.8851:::8.9076:::8.9230:::8.9351:::8.9363:::9.0841:::9.0984:::9.1072:::9.1194:::9.1574:::9.1843:::9.1883:::9.1909:::9.1956:::9.2173:::9.2272'}","{'ner_chunk': 'parathyroid carcinoma', 'begin': 152, 'end': 172, 'ner_label': 'Oncological', 'ner_confidence': '0.8334', 'code': '8010/3-C75.0', 'resolution': 'carcinoma, of parathyroid gland', 'all_k_codes': '8010/3-C75.0:::8140/3-C75.0:::8141/3-C75.0:::8320/3-C75.0:::8010/2-C75.0:::8022/3-C75.0:::8021/3-C75.0:::8015/3-C75.0:::8140/2-C75.0:::8290/3-C75.0:::9081/3-C75.0:::8323/3-C75.0:::8000/3-C75.0:::9070/3-C75.0:::8012/3-C75.0:::8001/3-C75.0:::9060/3-C75.0:::8147/3-C75.0:::9064/3-C75.0:::8000/6-C75.0:::8011/3-C75.0:::8321/0-C75.0:::8143/3-C75.0:::8255/3-C75.0:::9503/3-C75.0', 'all_k_resolutions': 'carcinoma, of parathyroid gland:::adenocarcinoma, of parathyroid gland:::scirrhous adenocarcinoma of parathyroid gland:::granular cell carcinoma of parathyroid gland:::carcinoma in situ, of parathyroid gland:::pleomorphic carcinoma of parathyroid gland:::carcinoma, anaplastic, of parathyroid gland:::glassy cell carcinoma of parathyroid gland:::adenocarcinoma in situ, of parathyroid gland:::oxyphilic adenocarcinoma of parathyroid gland:::teratocarcinoma of parathyroid gland:::mixed cell adenocarcinoma of parathyroid gland:::neoplasm, malignant of parathyroid gland:::embryonal carcinoma, of parathyroid gland:::large cell carcinoma, of parathyroid gland:::tumor cells, malignant of parathyroid 
gland:::dysgerminoma of parathyroid gland:::basal cell adenocarcinoma of parathyroid gland:::germinoma of parathyroid gland:::neoplasm, metastatic of parathyroid gland:::epithelioma, malignant of parathyroid gland:::chief cell adenoma of parathyroid gland:::superficial spreading adenocarcinoma of parathyroid gland:::adenocarcinoma with mixed subtypes of parathyroid gland:::neuroepithelioma, of parathyroid gland', 'all_k_distances': '5.6365:::5.9059:::6.3216:::6.3662:::6.4008:::6.4764:::6.5107:::6.5738:::6.9338:::7.1116:::7.1136:::7.1309:::7.3423:::7.3451:::7.4478:::7.4911:::7.5117:::7.5150:::7.5195:::7.6392:::7.6936:::7.8961:::7.9731:::8.0687:::8.1319'}",
1,"{'ner_chunk': 'cancers', 'begin': 45, 'end': 51, 'ner_label': 'Oncological', 'ner_confidence': '0.962', 'code': '8000/3', 'resolution': 'cancer', 'all_k_codes': '8000/3:::8010/3:::8010/9:::800:::8420/3:::8140/3:::8010/3-C76.0:::8010/6:::8010/3-C44.5:::8010/3-C26.0:::8010/3-C76.1:::8000/1:::8240/3:::8010/3-C06.9:::8021/3:::8010/9-C44.9:::8530/3:::8550/3:::8001/1:::8010/3-C77.8:::8230/3:::8010/3-C21.0:::8070/3:::8010/3-C44.9', 'all_k_resolutions': 'cancer:::carcinoma:::carcinomatosis:::neoplasms:::ceruminous carcinoma:::adenocarcinoma:::carcinoma, of head, face or neck:::secondary carcinoma:::carcinoma, of skin of trunk:::carcinoma, of intestinal tract:::carcinoma, of thorax:::neoplasm:::carcinoid:::carcinoma, of mouth:::carcinoma, anaplastic:::carcinomatosis of skin:::inflammatory carcinoma:::acinar carcinoma:::tumor cells:::carcinoma, of lymph nodes of multiple regions:::solid carcinoma:::carcinoma, of anus:::squamous carcinoma:::carcinoma, of skin', 'all_k_distances': '4.5858:::8.0682:::8.9112:::9.9677:::10.0723:::10.1706:::10.3023:::10.3163:::10.3208:::10.3436:::10.3452:::10.3817:::10.3965:::10.4648:::10.4772:::10.5091:::10.5263:::10.6105:::10.6343:::10.6414:::10.6562:::10.6711:::10.6929:::10.7156'}","{'ner_chunk': 'bladder cancer', 'begin': 64, 'end': 77, 'ner_label': 'Oncological', 'ner_confidence': '0.92655003', 'code': '8010/3-C67.9', 'resolution': 'carcinoma, of bladder', 'all_k_codes': '8010/3-C67.9:::8010/3-C67.5:::8230/3-C67.9:::8140/3-C67.9:::8441/3-C67.9:::8120/3-C67.9:::8070/3-C67.9:::8980/3-C67.9:::8140/3-C67.5:::8230/3-C67.5:::8051/3-C67.9:::8510/3-C67.9:::8050/3-C67.9:::8051/3-C67.5:::8560/3-C67.9:::8010/2-C67.9:::8070/3-C67.5:::8120/3-C67.5:::8010/3-C67.1:::8130/3-C67.9:::8120/2-C67.9:::8510/3-C67.5:::8050/3-C67.5:::8120/3:::8980/3-C67.5', 'all_k_resolutions': 'carcinoma, of bladder:::carcinoma, of bladder neck:::solid carcinoma, of bladder:::adenocarcinoma, of bladder:::serous carcinoma, of bladder:::transitional cell carcinoma, of 
bladder:::squamous cell carcinoma, of bladder:::carcinosarcoma, of bladder:::adenocarcinoma, of bladder neck:::solid carcinoma, of bladder neck:::verrucous carcinoma, of bladder:::medullary carcinoma, of bladder:::papillary carcinoma, of bladder:::verrucous carcinoma, of bladder neck:::adenosquamous carcinoma of bladder:::carcinoma in situ, of bladder:::squamous cell carcinoma, of bladder neck:::transitional cell carcinoma, of bladder neck:::carcinoma, of dome of bladder:::papillary urothelial carcinoma of bladder:::urothelial carcinoma in situ of bladder:::medullary carcinoma, of bladder neck:::papillary carcinoma, of bladder neck:::urothelial carcinoma:::carcinosarcoma, of bladder neck', 'all_k_distances': '6.5813:::6.8966:::7.0982:::7.1115:::7.1915:::7.2715:::7.3158:::7.5210:::7.6405:::7.6936:::7.7054:::7.7917:::7.8292:::7.8487:::7.9305:::8.0157:::8.0769:::8.1050:::8.1611:::8.1740:::8.2709:::8.2711:::8.2749:::8.2969:::8.3491'}","{'ner_chunk': 'skin cancer', 'begin': 85, 'end': 95, 'ner_label': 'Oncological', 'ner_confidence': '0.8052', 'code': '8010/3-C44.9', 'resolution': 'carcinoma, of skin', 'all_k_codes': '8010/3-C44.9:::8010/9-C44.9:::8070/3-C44.9:::8140/3-C44.9:::8980/3-C44.9:::8010/3-C44.5:::8409/3-C44.9:::8560/3-C44.9:::8051/3-C44.9:::8010/2-C44.9:::8201/3-C44.9:::8575/3-C44.9:::8390/3:::8230/3-C44.9:::8070/3:::8094/3-C44.9:::8410/3-C44.9:::8110/3-C44.9:::8010/3:::8070/3-C44.5:::8010/3-C44.4:::8051/3-C44.5:::8247/3-C44.9:::8440/3-C44.9', 'all_k_resolutions': 'carcinoma, of skin:::carcinomatosis of skin:::squamous cell carcinoma, of skin:::adenocarcinoma, of skin:::carcinosarcoma, of skin:::carcinoma, of skin of trunk:::porocarcinoma, of skin:::adenosquamous carcinoma of skin:::verrucous carcinoma, of skin:::carcinoma in situ, of skin:::cribriform carcinoma, of skin:::metaplastic carcinoma, of skin:::skin appendage carcinoma:::solid carcinoma, of skin:::squamous carcinoma:::basosquamous carcinoma of skin:::sebaceous carcinoma of skin:::pilomatrical 
carcinoma of skin:::carcinoma:::squamous cell carcinoma, of skin of trunk:::carcinoma, of skin of scalp and neck:::verrucous carcinoma, of skin of trunk:::merkel cell carcinoma of skin:::cystadenocarcinoma, of skin', 'all_k_distances': '7.0142:::7.0919:::7.6409:::7.9804:::7.9894:::8.1074:::8.2439:::8.2525:::8.3252:::8.3588:::8.3853:::8.4741:::8.4997:::8.5081:::8.5755:::8.7097:::8.8165:::8.8351:::8.8808:::8.9103:::8.9701:::9.1659:::9.1903:::9.2141'}","{'ner_chunk': 'tumor', 'begin': 164, 'end': 168, 'ner_label': 'Oncological', 'ner_confidence': '0.9592', 'code': '8000/1', 'resolution': 'tumor', 'all_k_codes': '8000/1:::8040/1:::8001/1:::9365/3:::8000/6:::8103/0:::9364/3:::8940/0:::8561/0:::9230/1:::8000/3:::9365/3-C76.1:::8100/0:::8158/3:::800:::8711/0:::9135/1:::8935/1:::8010/3:::8815/1:::8960/3:::8312/3:::8153/3', 'all_k_resolutions': 'tumor:::tumorlet:::tumor cells:::askin tumor:::tumor, secondary:::pilar tumor:::ewing tumor:::mixed tumor:::warthin tumor:::codman tumor:::cancer:::askin tumor of thorax:::brooke tumor:::acth-producing tumor:::neoplasms:::glomus tumor:::dabska tumor:::stromal tumor:::carcinoma:::localized fibrous tumor:::wilms tumor:::grawitz tumor:::g cell tumor', 'all_k_distances': '0.0000:::6.2854:::7.2306:::8.0490:::8.2619:::9.2572:::9.3351:::9.5498:::9.8335:::10.1951:::10.2648:::10.3364:::10.7574:::10.8951:::10.9171:::10.9504:::10.9568:::11.1854:::11.2265:::11.2477:::11.2602:::11.2972:::11.3218'}","{'ner_chunk': 'endometrial cancer', 'begin': 222, 'end': 239, 'ner_label': 'Oncological', 'ner_confidence': '0.96370006', 'code': '8380/3', 'resolution': 'endometrioid carcinoma', 'all_k_codes': '8380/3:::8010/3-C54.1:::8380/3-C57.9:::8575/3-C54.1:::8560/3-C54.1:::8441/3-C54.1:::8140/3-C54.1:::8051/3-C54.1:::8384/3-C54.1:::8230/3-C54.1:::8440/3-C54.1:::8021/3-C54.1:::8010/2-C54.1:::8070/3-C54.1:::8380/3-C53.0:::8262/3-C54.1:::8575/3-C53.0:::8201/3-C54.1:::8120/3-C54.1:::8980/3-C54.1:::8050/3-C54.1:::8380/3-C54.1:::8510/3-C54.1:::8140/2-C54.1', 
'all_k_resolutions': 'endometrioid carcinoma:::carcinoma, of endometrium:::endometrioid adenocarcinoma, of female genital tract:::metaplastic carcinoma, of endometrium:::adenosquamous carcinoma of endometrium:::serous carcinoma, of endometrium:::adenocarcinoma, of endometrium:::verrucous carcinoma, of endometrium:::adenocarcinoma, endocervical type, of endometrium:::solid carcinoma, of endometrium:::cystadenocarcinoma, of endometrium:::carcinoma, anaplastic, of endometrium:::carcinoma in situ, of endometrium:::squamous cell carcinoma, of endometrium:::endometrioid adenocarcinoma, of endocervix:::villous adenocarcinoma of endometrium:::metaplastic carcinoma, of endocervix:::cribriform carcinoma, of endometrium:::transitional cell carcinoma, of endometrium:::carcinosarcoma, of endometrium:::papillary carcinoma, of endometrium:::endometrioid adenocarcinoma, of endometrium:::medullary carcinoma, of endometrium:::adenocarcinoma in situ, of endometrium', 'all_k_distances': '5.9497:::6.3493:::6.4270:::6.7552:::6.9049:::6.9146:::6.9304:::6.9807:::7.0144:::7.0688:::7.0777:::7.0950:::7.1113:::7.1223:::7.1493:::7.2301:::7.2546:::7.2696:::7.2941:::7.3189:::7.3300:::7.3586:::7.3971:::7.3997'}","{'ner_chunk': 'squamous cell', 'begin': 245, 'end': 257, 'ner_label': 'Histological_Type', 'ner_confidence': '0.9893', 'code': '805-808', 'resolution': 'squamous cell neoplasms', 'all_k_codes': '805-808:::8070/3:::C08.1:::C32.2:::9084/0:::C09.1:::8075/3:::C44.0:::C69.0:::8720/0:::C44.4:::C44:::C06.0:::C33:::8074/3:::C32:::C72.0:::8074/3-C76.0:::8077/0:::C02.2:::8070/3-C44.9:::8075/3-C44.9:::C09:::C00.5', 'all_k_resolutions': 'squamous cell neoplasms:::squamous carcinoma:::sublingual gland:::subglottis:::dermoid:::tonsillar pillar:::squamous cell carcinoma, adenoid:::skin of lip:::conjunctiva:::nevus:::skin of scalp and neck:::skin:::cheeck mucosa:::trachea:::squamous cell carcinoma, spindle cell:::larynx:::spinal cord:::squamous cell carcinoma, spindle cell of head, face or neck:::low 
grade squamous intraepithelial lesion:::ventral surface of tongue:::squamous cell carcinoma, of skin:::squamous cell carcinoma, adenoid of skin:::tonsil:::mucosa of lip', 'all_k_distances': '9.6168:::10.4551:::11.0612:::11.5796:::11.7520:::11.9531:::12.0110:::12.0516:::12.0576:::12.0851:::12.0955:::12.0970:::12.1025:::12.1131:::12.1507:::12.1573:::12.1589:::12.3159:::12.4133:::12.4633:::12.4834:::12.5242:::12.5655:::12.5964'}","{'ner_chunk': 'carcinoma', 'begin': 259, 'end': 267, 'ner_label': 'Cancer_Dx', 'ner_confidence': '0.994', 'code': '8010/3', 'resolution': 'carcinoma', 'all_k_codes': '8010/3:::8010/9:::8420/3:::8480/3:::8240/3:::8550/3:::8140/3:::8010/6:::8530/3:::8980/3:::8070/3:::8010/3-C06.9:::8054/3:::8520/3:::8010/2:::8244/3:::8141/3:::8051/3:::8575/3:::8481/3:::8110/3:::8441/3', 'all_k_resolutions': 'carcinoma:::carcinomatosis:::ceruminous carcinoma:::mucous carcinoma:::carcinoid:::acinar carcinoma:::adenocarcinoma:::secondary carcinoma:::inflammatory carcinoma:::carcinosarcoma:::squamous carcinoma:::carcinoma, of mouth:::warty carcinoma:::lobular carcinoma:::carcinoma in situ:::composite carcinoid:::scirrhous carcinoma:::verrucous carcinoma:::metaplastic carcinoma:::mucin-producing carcinoma:::matrical carcinoma:::serous carcinoma', 'all_k_distances': '0.0000:::3.8145:::6.1551:::6.1916:::6.2402:::6.7504:::6.9016:::6.9301:::7.1193:::7.1926:::7.2897:::7.3397:::7.4187:::7.4689:::7.4910:::7.6357:::7.6536:::7.7765:::7.7787:::7.7825:::7.8719:::7.8739'}"


### JSON Lines

In [24]:
transformer = model.transformer(
    instance_count=1,
    instance_type=batch_transform_inference_instance_type,
    accept="application/jsonlines",
    output_path=validation_output_jsonl_path
)
transformer.transform(validation_input_jsonl_path, content_type="application/jsonlines")
transformer.wait()

INFO:sagemaker:Creating model with name: icdo-resolver-pipeline-en-2024-12-03-10-36-16-203
INFO:sagemaker:Creating transform job with name: icdo-resolver-pipeline-en-2024-12-03-10-36-16-939


............................................24/12/03 10:43:38 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
INFO:     Started server process [7]
INFO:     Waiting for application startup.
INFO:     Application sta

In [25]:
from urllib.parse import urlparse

def process_s3_jsonlines_output_and_save(validation_file_name):

    output_file_path = f"{jsonl_output_dir}/{validation_file_name}.out"
    parsed_url = urlparse(transformer.output_path)
    file_key = f"{parsed_url.path[1:]}{validation_file_name}.out"
    response = s3_client.get_object(Bucket=s3_bucket, Key=file_key)

    data = response["Body"].read().decode("utf-8")
    print(data)

    # Save the data to the output file
    with open(output_file_path, 'w') as f_out:
        for item in data.split('\n'):
            f_out.write(item + '\n')
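The JSON Lines output holds one `{"predictions": [...]}` object per input document, so it can be flattened into one row per detected entity for inspection. The sketch below is a minimal version of that step; `flatten_jsonl_predictions` is a hypothetical helper, and the sample is abbreviated from the predictions shown earlier in this notebook.

```python
import json

def flatten_jsonl_predictions(jsonl_text):
    """Turn JSON Lines transform output into one dict per detected entity."""
    rows = []
    lines = [line for line in jsonl_text.splitlines() if line.strip()]
    for doc_index, line in enumerate(lines):
        for prediction in json.loads(line)["predictions"]:
            rows.append({
                "document": doc_index,
                "ner_chunk": prediction["ner_chunk"],
                "ner_label": prediction["ner_label"],
                "code": prediction["code"],
                "resolution": prediction["resolution"],
            })
    return rows

# Abbreviated sample: only the fields used above are kept.
sample_output = "\n".join([
    json.dumps({"predictions": [
        {"ner_chunk": "tumor", "ner_label": "Tumor_Finding", "code": "8000/1", "resolution": "tumor"},
        {"ner_chunk": "breast cancer", "ner_label": "Oncological", "code": "8010/3-C50.9", "resolution": "carcinoma, of breast"},
    ]}),
    json.dumps({"predictions": [
        {"ner_chunk": "cancers", "ner_label": "Oncological", "code": "8000/3", "resolution": "cancer"},
    ]}),
])

for row in flatten_jsonl_predictions(sample_output):
    print(row)
```

The resulting list of dicts can be passed directly to `pd.DataFrame` if a tabular view is preferred.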

In [26]:
process_s3_jsonlines_output_and_save(validation_jsonl_file_name)

{"predictions": [{"ner_chunk": "tumor", "begin": 60, "end": 64, "ner_label": "Tumor_Finding", "ner_confidence": "0.9633", "code": "8000/1", "resolution": "tumor", "all_k_codes": "8000/1:::8040/1:::8001/1:::9365/3:::8000/6:::8103/0:::9364/3:::8940/0:::8561/0:::9230/1:::8000/3:::9365/3-C76.1:::8100/0:::8158/3:::800:::8711/0:::9135/1:::8935/1:::8010/3:::8815/1:::8960/3:::8312/3:::8153/3", "all_k_resolutions": "tumor:::tumorlet:::tumor cells:::askin tumor:::tumor, secondary:::pilar tumor:::ewing tumor:::mixed tumor:::warthin tumor:::codman tumor:::cancer:::askin tumor of thorax:::brooke tumor:::acth-producing tumor:::neoplasms:::glomus tumor:::dabska tumor:::stromal tumor:::carcinoma:::localized fibrous tumor:::wilms tumor:::grawitz tumor:::g cell tumor", "all_k_distances": "0.0000:::6.2854:::7.2306:::8.0490:::8.2619:::9.2572:::9.3351:::9.5498:::9.8335:::10.1951:::10.2648:::10.3364:::10.7574:::10.8951:::10.9171:::10.9504:::10.9568:::11.1854:::11.2265:::11.2477:::11.2602:::11.2972:::11.3218

In [27]:
model.delete_model()

INFO:sagemaker:Deleting model with name: icdo-resolver-pipeline-en-2024-12-03-10-36-16-203


### Unsubscribe from the listing (optional)

If you would like to unsubscribe from the model package, follow these steps. Before you cancel the subscription, ensure that you do not have any [deployable model](https://console.aws.amazon.com/sagemaker/home#/models) created from the model package or using the algorithm. Note: you can find this information by looking at the container name associated with the model.

**Steps to unsubscribe from the product on AWS Marketplace**:
1. Navigate to the __Machine Learning__ tab on [__Your Software subscriptions page__](https://aws.amazon.com/marketplace/ai/library?productType=ml&ref_=mlmp_gitdemo_indust).
2. Locate the listing whose subscription you want to cancel, and then choose __Cancel Subscription__.

