[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/spark-nlp-workshop/blob/master/jupyter/enterprise/healthcare/EntityResolution_ICDO.ipynb)

<img src="https://nlp.johnsnowlabs.com/assets/images/logo.png" width="180" height="50" style="float: left;">

# COLAB ENVIRONMENT SETUP

In [12]:
import json

with open('keys.json') as f:
    license_keys = json.load(f)

license_keys.keys()


dict_keys(['version', 'secret', 'SPARK_NLP_LICENSE', 'JSL_OCR_LICENSE', 'AWS_ACCESS_KEY_ID', 'AWS_SECRET_ACCESS_KEY', 'JSL_OCR_SECRET'])

In [13]:
import os

# Install java
! apt-get update -qq
! apt-get install -y openjdk-8-jdk-headless -qq > /dev/null

secret = license_keys.get("secret",license_keys.get('SPARK_NLP_SECRET', ""))
spark_version = os.environ.get("SPARK_VERSION", license_keys.get("SPARK_VERSION","2.4"))
version = license_keys.get("version",license_keys.get('SPARK_NLP_PUBLIC_VERSION', ""))
jsl_version = license_keys.get("jsl_version",license_keys.get('SPARK_NLP_VERSION', ""))

os.environ["JAVA_HOME"] = "/usr/lib/jvm/java-8-openjdk-amd64"
os.environ["PATH"] = os.environ["JAVA_HOME"] + "/bin:" + os.environ["PATH"]
! java -version

os.environ['SPARK_NLP_LICENSE'] = license_keys['SPARK_NLP_LICENSE']
os.environ['JSL_OCR_LICENSE'] = license_keys['JSL_OCR_LICENSE']
os.environ['AWS_ACCESS_KEY_ID']= license_keys['AWS_ACCESS_KEY_ID']
os.environ['AWS_SECRET_ACCESS_KEY'] = license_keys['AWS_SECRET_ACCESS_KEY']

print(spark_version, version, jsl_version)

! python -m pip install "pyspark==$spark_version".*
! python -m pip install --upgrade spark-nlp-jsl==$jsl_version  --extra-index-url https://pypi.johnsnowlabs.com/$secret

import sparknlp
import sparknlp_jsl
from sparknlp.base import *
from sparknlp.annotator import *
from sparknlp_jsl.annotator import *
from pyspark.ml import Pipeline
from pyspark.sql import SparkSession

print (sparknlp.version())
print (sparknlp_jsl.version())

spark = sparknlp_jsl.start(secret, gpu=False, spark23=(spark_version[:3]=="2.3"))

openjdk version "1.8.0_252"
OpenJDK Runtime Environment (build 1.8.0_252-8u252-b09-1~18.04-b09)
OpenJDK 64-Bit Server VM (build 25.252-b09, mixed mode)
Looking in indexes: https://pypi.org/simple, https://pypi.johnsnowlabs.com/8zvTuUjWPt
Requirement already up-to-date: spark-nlp-jsl==2.5.2 in /usr/local/lib/python3.6/dist-packages (2.5.2)
2.5.2


# ICD-O Entity Resolution - version 2.5.1

## Example for ICD-O Entity Resolution Pipeline
A common NLP problem in medical applications is identifying histology behavior in documented cancer studies.

In this example we use Spark NLP to identify histology behavior expressions and resolve each one to an ICD-O code.

Some cancer-related clinical notes (taken from https://www.cancernetwork.com/case-studies and oncology.medicinematters.com):  
https://www.cancernetwork.com/case-studies/large-scrotal-mass-multifocal-intra-abdominal-retroperitoneal-and-pelvic-metastases  
https://oncology.medicinematters.com/lymphoma/chronic-lymphocytic-leukemia/case-study-small-b-cell-lymphocytic-lymphoma-and-chronic-lymphoc/12133054  
https://oncology.medicinematters.com/lymphoma/epidemiology/central-nervous-system-lymphoma/12124056  
https://oncology.medicinematters.com/lymphoma/case-study-cutaneous-t-cell-lymphoma/12129416

Note 1: Desmoplastic small round cell tumor
<div style="border:2px solid #747474; background-color: #e3e3e3; margin: 5px; padding: 10px"> 
A 35-year-old African-American man was referred to our urology clinic by his primary care physician for consultation about a large left scrotal mass. The patient reported a 3-month history of left scrotal swelling that had progressively increased in size and was associated with mild left scrotal pain. He also had complaints of mild constipation, with hard stools every other day. He denied any urinary complaints. On physical examination, a hard paratesticular mass could be palpated in the left hemiscrotum extending into the left groin, separate from the left testicle, and measuring approximately 10 × 7 cm in size. A hard, lower abdominal mass in the suprapubic region could also be palpated in the midline. The patient was admitted urgently to the hospital for further evaluation with cross-sectional imaging and blood work.

Laboratory results, including results of a complete blood cell count with differential, liver function tests, coagulation panel, and basic chemistry panel, were unremarkable except for a serum creatinine level of 2.6 mg/dL. Typical markers for a testicular germ cell tumor were within normal limits: the beta–human chorionic gonadotropin level was less than 1 mIU/mL and the alpha fetoprotein level was less than 2.8 ng/mL. A CT scan of the chest, abdomen, and pelvis with intravenous contrast was obtained, and it showed large multifocal intra-abdominal, retroperitoneal, and pelvic masses (Figure 1). On cross-sectional imaging, a 7.8-cm para-aortic mass was visualized compressing the proximal portion of the left ureter, creating moderate left hydroureteronephrosis. Additionally, three separate pelvic masses were present in the retrovesical space, each measuring approximately 5 to 10 cm at their largest diameter; these displaced the bladder anteriorly and the rectum posteriorly.

The patient underwent ultrasound-guided needle biopsy of one of the pelvic masses on hospital day 3 for definitive diagnosis. Microscopic examination of the tissue by our pathologist revealed cellular islands with oval to elongated, irregular, and hyperchromatic nuclei; scant cytoplasm; and invading fibrous tissue—as well as three mitoses per high-powered field (Figure 2). Immunohistochemical staining demonstrated strong positivity for cytokeratin AE1/AE3, vimentin, and desmin. Further mutational analysis of the cells detected the presence of an EWS-WT1 fusion transcript consistent with a diagnosis of desmoplastic small round cell tumor.
</div>

Note 2: SLL and CLL
<div style="border:2px solid #747474; background-color: #e3e3e3; margin: 5px; padding: 10px"> 
A 72-year-old man with a history of diabetes mellitus, hypertension, and hypercholesterolemia self-palpated a left submandibular lump in 2012. Complete blood count (CBC) in his internist’s office showed solitary leukocytosis (white count 22) with predominant lymphocytes for which he was referred to a hematologist. Peripheral blood flow cytometry on 04/11/12 confirmed chronic lymphocytic leukemia (CLL)/small lymphocytic lymphoma (SLL): abnormal cell population comprising 63% of CD45 positive leukocytes, co-expressing CD5 and CD23 in CD19-positive B cells. CD38 was negative but other prognostic markers were not assessed at that time. The patient was observed regularly for the next 3 years and his white count trend was as follows: 22.8 (4/2012) --> 28.5 (07/2012) --> 32.2 (12/2012) --> 36.5 (02/2013) --> 42 (09/2013) --> 44.9 (01/2014) --> 75.8 (2/2015). His other counts stayed normal until early 2015 when he also developed anemia (hemoglobin [HGB] 10.9) although platelets remained normal at 215. He had been noticing enlargement of his cervical, submandibular, supraclavicular, and axillary lymphadenopathy for several months since 2014 and a positron emission tomography (PET)/computed tomography (CT) scan done in 12/2014 had shown extensive diffuse lymphadenopathy within the neck, chest, abdomen, and pelvis. Maximum standardized uptake value (SUV max) was similar to low baseline activity within the vasculature of the neck and chest. In the abdomen and pelvis, however, there was mild to moderately hypermetabolic adenopathy measuring up to SUV of 4. The largest right neck nodes measured up to 2.3 x 3 cm and left neck nodes measured up to 2.3 x 1.5 cm. His right axillary lymphadenopathy measured up to 5.5 x 2.6 cm and on the left measured up to 4.8 x 3.4 cm. Lymph nodes on the right abdomen and pelvis measured up to 6.7 cm and seemed to have some mass effect with compression on the urinary bladder without symptoms. 
He underwent a bone marrow biopsy on 02/03/15, which revealed hypercellular marrow (60%) with involvement by CLL (30%); flow cytometry showed CD38 and ZAP-70 positivity; fluorescence in situ hybridization (FISH) analysis showed 13q deletion/monosomy 13; IgVH was unmutated; karyotype was 46XY.
</div>

Note 3: CNS lymphoma
<div style="border:2px solid #747474; background-color: #e3e3e3; margin: 5px; padding: 10px"> 
A 56-year-old woman began to experience vertigo, headaches, and frequent falls. A computed tomography (CT) scan of the brain revealed the presence of a 1.6 x 1.6 x 2.1 cm mass involving the fourth ventricle (Figure 14.1). A gadolinium-enhanced magnetic resonance imaging (MRI) scan confirmed the presence of the mass, and a stereotactic biopsy was performed that demonstrated a primary central nervous system lymphoma (PCNSL) with a diffuse large B-cell histology. Complete blood count (CBC), lactate dehydrogenase (LDH), and beta-2-microglobulin were normal. Systemic staging with a positron emission tomography (PET)/CT scan and bone marrow biopsy showed no evidence of lymphomatous involvement outside the CNS. An eye exam and lumbar puncture showed no evidence of either ocular or leptomeningeal involvement.
</div>

Note 4: Cutaneous T-cell lymphoma
<div style="border:2px solid #747474; background-color: #e3e3e3; margin: 5px; padding: 10px"> 
An 83-year-old female presented with a progressing pruritic cutaneous rash that started 8 years ago. On clinical exam there were numerous coalescing, infiltrated, scaly, and partially crusted erythematous plaques distributed over her trunk and extremities and a large fungating ulcerated nodule on her right thigh covering 75% of her total body surface area (Figure 10.1). Lymphoma associated alopecia and a left axillary lymphadenopathy were also noted. For the past 3–4 months she reported fatigue, severe pruritus, night sweats, 20 pounds of weight loss, and loss of appetite. 
</div>

In [0]:
import sys, os, time, pandas as pd

from sparknlp.base import *
from sparknlp.annotator import *
from sparknlp_jsl.annotator import *
import pyspark.sql.functions as F
from pyspark.sql.types import StructType, StructField, StringType

pd.set_option('display.max_colwidth', 250)
pd.set_option('display.max_rows', 500)

## Let's create a dataset with all four case studies

In [0]:
notes = []
notes.append("""A 35-year-old African-American man was referred to our urology clinic by his primary care physician for consultation about a large left scrotal mass. The patient reported a 3-month history of left scrotal swelling that had progressively increased in size and was associated with mild left scrotal pain. He also had complaints of mild constipation, with hard stools every other day. He denied any urinary complaints. On physical examination, a hard paratesticular mass could be palpated in the left hemiscrotum extending into the left groin, separate from the left testicle, and measuring approximately 10 × 7 cm in size. A hard, lower abdominal mass in the suprapubic region could also be palpated in the midline. The patient was admitted urgently to the hospital for further evaluation with cross-sectional imaging and blood work.
Laboratory results, including results of a complete blood cell count with differential, liver function tests, coagulation panel, and basic chemistry panel, were unremarkable except for a serum creatinine level of 2.6 mg/dL. Typical markers for a testicular germ cell tumor were within normal limits: the beta–human chorionic gonadotropin level was less than 1 mIU/mL and the alpha fetoprotein level was less than 2.8 ng/mL. A CT scan of the chest, abdomen, and pelvis with intravenous contrast was obtained, and it showed large multifocal intra-abdominal, retroperitoneal, and pelvic masses (Figure 1). On cross-sectional imaging, a 7.8-cm para-aortic mass was visualized compressing the proximal portion of the left ureter, creating moderate left hydroureteronephrosis. Additionally, three separate pelvic masses were present in the retrovesical space, each measuring approximately 5 to 10 cm at their largest diameter; these displaced the bladder anteriorly and the rectum posteriorly.
The patient underwent ultrasound-guided needle biopsy of one of the pelvic masses on hospital day 3 for definitive diagnosis. Microscopic examination of the tissue by our pathologist revealed cellular islands with oval to elongated, irregular, and hyperchromatic nuclei; scant cytoplasm; and invading fibrous tissue—as well as three mitoses per high-powered field (Figure 2). Immunohistochemical staining demonstrated strong positivity for cytokeratin AE1/AE3, vimentin, and desmin. Further mutational analysis of the cells detected the presence of an EWS-WT1 fusion transcript consistent with a diagnosis of desmoplastic small round cell tumor.""")
notes.append("""A 72-year-old man with a history of diabetes mellitus, hypertension, and hypercholesterolemia self-palpated a left submandibular lump in 2012. Complete blood count (CBC) in his internist’s office showed solitary leukocytosis (white count 22) with predominant lymphocytes for which he was referred to a hematologist. Peripheral blood flow cytometry on 04/11/12 confirmed chronic lymphocytic leukemia (CLL)/small lymphocytic lymphoma (SLL): abnormal cell population comprising 63% of CD45 positive leukocytes, co-expressing CD5 and CD23 in CD19-positive B cells. CD38 was negative but other prognostic markers were not assessed at that time. The patient was observed regularly for the next 3 years and his white count trend was as follows: 22.8 (4/2012) --> 28.5 (07/2012) --> 32.2 (12/2012) --> 36.5 (02/2013) --> 42 (09/2013) --> 44.9 (01/2014) --> 75.8 (2/2015). His other counts stayed normal until early 2015 when he also developed anemia (hemoglobin [HGB] 10.9) although platelets remained normal at 215. He had been noticing enlargement of his cervical, submandibular, supraclavicular, and axillary lymphadenopathy for several months since 2014 and a positron emission tomography (PET)/computed tomography (CT) scan done in 12/2014 had shown extensive diffuse lymphadenopathy within the neck, chest, abdomen, and pelvis. Maximum standardized uptake value (SUV max) was similar to low baseline activity within the vasculature of the neck and chest. In the abdomen and pelvis, however, there was mild to moderately hypermetabolic adenopathy measuring up to SUV of 4. The largest right neck nodes measured up to 2.3 x 3 cm and left neck nodes measured up to 2.3 x 1.5 cm. His right axillary lymphadenopathy measured up to 5.5 x 2.6 cm and on the left measured up to 4.8 x 3.4 cm. Lymph nodes on the right abdomen and pelvis measured up to 6.7 cm and seemed to have some mass effect with compression on the urinary bladder without symptoms. 
He underwent a bone marrow biopsy on 02/03/15, which revealed hypercellular marrow (60%) with involvement by CLL (30%); flow cytometry showed CD38 and ZAP-70 positivity; fluorescence in situ hybridization (FISH) analysis showed 13q deletion/monosomy 13; IgVH was unmutated; karyotype was 46XY.""")
notes.append("A 56-year-old woman began to experience vertigo, headaches, and frequent falls. A computed tomography (CT) scan of the brain revealed the presence of a 1.6 x 1.6 x 2.1 cm mass involving the fourth ventricle (Figure 14.1). A gadolinium-enhanced magnetic resonance imaging (MRI) scan confirmed the presence of the mass, and a stereotactic biopsy was performed that demonstrated a primary central nervous system lymphoma (PCNSL) with a diffuse large B-cell histology. Complete blood count (CBC), lactate dehydrogenase (LDH), and beta-2-microglobulin were normal. Systemic staging with a positron emission tomography (PET)/CT scan and bone marrow biopsy showed no evidence of lymphomatous involvement outside the CNS. An eye exam and lumbar puncture showed no evidence of either ocular or leptomeningeal involvement.") 
notes.append("An 83-year-old female presented with a progressing pruritic cutaneous rash that started 8 years ago. On clinical exam there were numerous coalescing, infiltrated, scaly, and partially crusted erythematous plaques distributed over her trunk and extremities and a large fungating ulcerated nodule on her right thigh covering 75% of her total body surface area (Figure 10.1). Lymphoma associated alopecia and a left axillary lymphadenopathy were also noted. For the past 3–4 months she reported fatigue, severe pruritus, night sweats, 20 pounds of weight loss, and loss of appetite.")

# One row per note; the note index is cast to str to match the StringType "doc" column
data = spark.createDataFrame([(str(i), n) for i, n in enumerate(notes)],
                             StructType([StructField("doc", StringType()),
                                         StructField("description", StringType())]))

## And let's build a Spark NLP pipeline with the following stages:
- DocumentAssembler: Entry annotator for our pipelines; it creates the data structure for the Annotation Framework
- SentenceDetector: Annotator that pragmatically separates complete sentences inside each document
- Tokenizer: Annotator that splits sentences into tokens (generally words)
- StopWordsCleaner: Annotator that removes words defined as stop words in Spark ML
- WordEmbeddings: Vectorization of word tokens, in this case using word embeddings trained on PubMed, ICD10 and other clinical resources
- ChunkEmbeddings: Aggregates the WordEmbeddings for each NER chunk
- BioNLP NER + NerConverter: These annotators return chunks related to cancer and genetics
- ChunkEntityResolver: Annotator that performs a nearest-neighbour (KNN) search, in this case with a model trained on ICD-O Histology Behavior
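
To make the ChunkEmbeddings and ChunkEntityResolver stages listed above more concrete, here is a minimal pure-Python sketch of the general idea: pool the token vectors of a chunk into one vector, then rank reference entries by distance. It is not the library's implementation. The averaging strategy, the cosine distance, the toy vectors and the `toy-code-*` labels are all assumptions made for illustration.

```python
import math

def average_vectors(vectors):
    """Element-wise mean of equally sized vectors (one vector per token)."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]

def cosine_distance(a, b):
    """1 - cosine similarity; values near 0.0 mean the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def nearest_codes(chunk_vector, reference, k=2):
    """Return the labels of the k reference entries closest to chunk_vector."""
    ranked = sorted(reference, key=lambda entry: cosine_distance(chunk_vector, entry[1]))
    return [code for code, _ in ranked[:k]]

# Toy 3-dimensional "word embeddings" (made-up numbers, not from embeddings_clinical).
toy_embeddings = {
    "germ": [0.9, 0.1, 0.0],
    "cell": [0.7, 0.3, 0.0],
    "tumor": [0.8, 0.0, 0.2],
}

# Toy reference index: (label, vector) pairs standing in for coded entries.
toy_reference = [
    ("toy-code-A", [0.8, 0.1, 0.1]),
    ("toy-code-B", [0.0, 0.9, 0.1]),
    ("toy-code-C", [0.1, 0.1, 0.9]),
]

chunk_tokens = ["germ", "cell", "tumor"]
chunk_vector = average_vectors([toy_embeddings[t] for t in chunk_tokens])
top_codes = nearest_codes(chunk_vector, toy_reference, k=2)
print(chunk_vector, top_codes)
```

In the real pipeline the vectors come from `embeddings_clinical` and the reference entries from the pretrained resolver models; the sketch only shows how pooling and neighbour ranking fit together.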

In [16]:
#Usual preparation Annotators
docAssembler = DocumentAssembler().setInputCol("description").setOutputCol("document")
sentenceDetector = SentenceDetector().setInputCols("document").setOutputCol("sentence")
tokenizer = Tokenizer().setInputCols("sentence").setOutputCol("token")
stopCleaner = StopWordsCleaner().setInputCols("token").setOutputCol("clean_token")

#Embeddings and their aggregations per chunk
embeddings = WordEmbeddingsModel.pretrained("embeddings_clinical", "en", "clinical/models")\
    .setInputCols("sentence", "clean_token")\
    .setOutputCol("embeddings")
chunkEmbeddings = ChunkEmbeddings()\
  .setInputCols("ner_chunk", "embeddings")\
  .setOutputCol("chunk_embeddings")

# Annotators responsible for the Cancer Genetics Entity Recognition task
nerTagger = NerDLModel.pretrained("ner_bionlp", "en", "clinical/models")\
    .setInputCols("sentence", "clean_token","embeddings")\
    .setOutputCol("ner")
nerConverter = NerConverter()\
    .setInputCols("sentence", "token", "ner")\
    .setOutputCol("ner_chunk")\
    .setWhiteList(['Cellular_component', 'Multi-tissue_structure', 'Organism_substance',
                   'Gene_or_gene_product', 'Organism_subdivision', 'Cancer', 'Cell'])

embeddings_clinical download started this may take some time.
Approximate size to download 1.6 GB
[OK!]
ner_bionlp download started this may take some time.
Approximate size to download 13.9 MB
[OK!]


In [17]:
chunkResolverCM = ChunkEntityResolverModel.pretrained("chunkresolve_icd10cm_neoplasms_clinical", "en", "clinical/models")\
    .setNeighbours(100).setAlternatives(10)\
    .setEnableLevenshtein(True).setDistanceWeights([1,2,2,0,0,2])\
    .setInputCols("clean_token","chunk_embeddings")\
    .setOutputCol("icdcm_code")

chunkResolverO = ChunkEntityResolverModel.pretrained("chunkresolve_icdo_clinical", "en", "clinical/models")\
    .setNeighbours(100).setAlternatives(10)\
    .setEnableLevenshtein(True).setDistanceWeights([1,2,2,0,0,2])\
    .setInputCols("clean_token","chunk_embeddings")\
    .setOutputCol("icdo_code")

chunkresolve_icd10cm_neoplasms_clinical download started this may take some time.
Approximate size to download 20.4 MB
[OK!]
chunkresolve_icdo_clinical download started this may take some time.
Approximate size to download 8.2 MB
[OK!]


In [0]:
pipelineFull = Pipeline().setStages([
    docAssembler, 
    sentenceDetector, 
    tokenizer, 
    stopCleaner, 
    embeddings, 
    nerTagger,
    nerConverter,
    chunkEmbeddings, 
    chunkResolverCM,
    chunkResolverO])

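Let's fit our pipeline so that it is ready to start transforming. The embeddings, NER and resolver stages are pretrained, so fitting mainly assembles the stages into a `PipelineModel`.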

In [0]:
pipelineModelFull = pipelineFull.fit(data)

In [0]:
output = pipelineModelFull.transform(data).cache()

## The key parts of our model are the **WordEmbeddings and ChunkEntityResolver**: 

### WordEmbeddings:   
Word2Vec model trained on semantically augmented datasets using information from curated Datasets in JSL Data Market.  

### EntityResolver:  
Trained on an augmented ICD-O dataset from JSL Data Market, it resolves the matched expressions to histology codes. Besides the code in the "result" field, it provides further metadata about the matching process:  

- all_k_results -> Sorted ResolverLabels in the top `alternatives` that match the distance `threshold`
- all_k_resolutions -> Respective ResolverNormalized strings
- all_k_distances -> Respective distance values after aggregation
- all_k_wmd_distances -> Respective WMD distance values
- all_k_tfidf_distances -> Respective TFIDF Cosine distance values
- all_k_jaccard_distances -> Respective Jaccard distance values
- all_k_sorensen_distances -> Respective SorensenDice distance values
- all_k_jaro_distances -> Respective JaroWinkler distance values
- all_k_levenshtein_distances -> Respective Levenshtein distance values
- all_k_confidences -> Respective normalized probabilities based on inverse distance values
- all_k_confidence_ratios -> Convenience indicator calculated as the ratio between the i-th and the (i+1)-th probability
- target_text -> The actual searched string
- resolved_text -> The top ResolverNormalized string
- confidence -> Top probability
- confidence_ratio -> Top confidence ratio
- distance -> Top distance value
- sentence -> Sentence index
- chunk -> Chunk Index
- token -> Token index
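
The relationship between the distance, confidence and confidence-ratio fields above can be sketched in plain Python. This is only one plausible reading of "normalized probabilities based on inverse distance values", not the library's actual formula. The weighted mean, the assumption that the six weights line up with the six distance metrics in the order listed, the `:::` separator for every `all_k_*` field, and all numbers are illustrative assumptions.

```python
def parse_k_values(joined, cast=str):
    """Split a ':::'-joined metadata string into a list of values."""
    return [cast(part) for part in joined.split(":::")]

def aggregate_distance(distances, weights):
    """Weighted mean of per-metric distances; a weight of 0 ignores that metric."""
    total_weight = sum(weights)
    return sum(d * w for d, w in zip(distances, weights)) / total_weight

def inverse_distance_confidences(distances):
    """Probabilities proportional to 1 / distance (distances must be > 0)."""
    inverses = [1.0 / d for d in distances]
    total = sum(inverses)
    return [inv / total for inv in inverses]

def confidence_ratios(confidences):
    """Ratio between the i-th and the (i+1)-th probability."""
    return [a / b for a, b in zip(confidences, confidences[1:])]

# Same shape as the weights passed to setDistanceWeights in this notebook.
weights = [1, 2, 2, 0, 0, 2]

# Made-up per-metric distances for three candidate codes, best candidate first.
per_metric = [
    [1.0, 1.0, 1.0, 9.0, 9.0, 1.0],
    [2.0, 2.0, 2.0, 0.0, 0.0, 2.0],
    [4.0, 4.0, 4.0, 5.0, 5.0, 4.0],
]

aggregated = [aggregate_distance(d, weights) for d in per_metric]  # [1.0, 2.0, 4.0]
confidences = inverse_distance_confidences(aggregated)
ratios = confidence_ratios(confidences)
print(aggregated, confidences, ratios)

# Hypothetical metadata string, parsed back into floats.
print(parse_k_values("0.5:::0.3:::0.2", float))
```

A large top confidence ratio suggests the best candidate is well separated from the runner-up, while a ratio close to 1 suggests an ambiguous match.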

In [0]:
# analysis_df = output\
# .selectExpr("doc","""explode(arrays_zip(
# ner_chunk.result,
# ner_chunk.metadata,
# icdcm_code.result,
# icdo_code.result,
# icdcm_code.metadata,
# icdo_code.metadata
# )) as arrays""")\
# .selectExpr("doc",
#             "arrays['0'] as chunk",
#             "arrays['1'].entity as entity",
#             "concat_ws('::',arrays['2'],float(arrays['4'].all_k_confidences)) as cmcode_conf",
#             "split(arrays['4'].all_k_resolutions,':::') as cm_res",
#             "concat_ws('::',arrays['3'],float(arrays['5'].all_k_confidences)) as ocode_conf",
#             "split(arrays['5'].all_k_resolutions,':::') as o_res")\
# .where("entity='Cancer'")\
# .toPandas()
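
The commented-out query above relies on `explode(arrays_zip(...))` to turn the parallel per-document arrays into one row per chunk before filtering on the `Cancer` entity. A rough pure-Python analogue of that reshaping is sketched below. The first two chunks and codes are copied from the results table in this notebook; the third chunk, its `Cell` label and its `toy-code` value are hypothetical and only there to show the filter. Unlike Spark's `arrays_zip`, Python's `zip` truncates to the shortest array, so the sketch assumes all arrays have the same length.

```python
def zip_and_explode(row, array_columns):
    """Yield one dict per array position, similar to explode(arrays_zip(...))."""
    arrays = [row[name] for name in array_columns]
    for values in zip(*arrays):
        exploded = {"doc": row["doc"]}
        exploded.update(dict(zip(array_columns, values)))
        yield exploded

# One document with three parallel arrays (one entry per detected chunk).
toy_row = {
    "doc": "0",
    "chunk": ["testicular germ cell tumor", "pelvic masses", "hypothetical chunk"],
    "entity": ["Cancer", "Cancer", "Cell"],
    "icdo_code": ["9085/3", "8312/3", "toy-code"],
}

rows = list(zip_and_explode(toy_row, ["chunk", "entity", "icdo_code"]))

# Equivalent of .where("entity='Cancer'")
cancer_rows = [r for r in rows if r["entity"] == "Cancer"]
for r in cancer_rows:
    print(r["doc"], r["chunk"], r["icdo_code"])
```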

In [22]:
#analysis_df

Unnamed: 0,doc,chunk,entity,cmcode_conf,cm_res,ocode_conf,o_res
0,0,testicular germ cell tumor,Cancer,C6290,"[Malignant neoplasm of unspecified testis, unspecified whether descended or undescended, Malignant (primary) neoplasm, unspecified, Malignant neoplasm of unspecified ovary, Malignant neoplasm of right ovary, Malignant neoplasm of left ovary, Beni...",9085/3,"[Mixed germ cell tumor, Germ cell tumor, nonseminomatous, Germ cell tumors with associated hematological malignancy, Androblastoma, malignant, Merkel cell carcinoma, Vipoma, Granulosa cell-theca cell tumor, mal., Granulosa cell tumor, malignant, ..."
1,0,pelvic masses,Cancer,C763,"[Malignant neoplasm of pelvis, Malignant neoplasm of specified parts of peritoneum, Benign neoplasm of peripheral nerves and autonomic nervous system of pelvis, Benign neoplasm of peripheral nerves and autonomic nervous system of abdomen, Maligna...",8312/3,"[Renal cell carcinoma, Craniopharyngioma, Neoplasm, benign, Blue nevus, malignant, Papillary craniopharyngioma, T-cell large granular lymphocytic leukemia, Adamantinomatous craniopharyngioma, Paraganglioma, malignant, Hemangiopericytoma, NOS, Aor..."
2,0,desmoplastic small,Cancer,C439,"[Malignant melanoma of skin, unspecified, Benign neoplasm of unspecified part of small intestine, Neoplasm of uncertain behavior of connective and other soft tissue, Malignant neoplasm of small intestine, unspecified, Secondary malignant neoplasm...",8806/3,"[Desmoplastic small round cell tumor, Desmoplastic melanoma, malignant, Desmoplastic medulloblastoma, Duct carcinoma, desmoplastic type, Malignant lymphoma, non-Hodgkin, Small cell sarcoma, Oat cell carcinoma, Small cell osteosarcoma, Malignant l..."
3,0,cell tumor,Cancer,D489,"[Neoplasm of uncertain behavior, unspecified, Mast cell sarcoma, Malignant neoplasm of unspecified testis, unspecified whether descended or undescended, Merkel cell carcinoma, unspecified, Benign neoplasm of endocrine pancreas, Benign neoplasm of...",8630/3,"[Androblastoma, malignant, Enterochromaffin-like cell tumor, malignant, Granulosa cell-theca cell tumor, mal., Merkel cell carcinoma, Acinar cell carcinoma, Mixed germ cell tumor, Granular cell tumor, malignant, Islet cell carcinoma, Insulinoma, ..."
4,1,chronic lymphocytic leukemia,Cancer,C9110,"[Chronic lymphocytic leukemia of B-cell type not having achieved remission, Chronic myeloid leukemia, BCR/ABL-positive, not having achieved remission, Chronic myelomonocytic leukemia not having achieved remission, Acute lymphoblastic leukemia not...",9805/3,"[Acute biphenotypic leukemia, Precursor T-cell lymphoblastic lymphoma, Juvenile myelomonocytic leukemia, Chronic lymphocytic leukemia/small lymphocytic lymphoma, Burkitt cell leukemia, Precursor cell lymphoblastic leukemia, NOS, Chronic myeloid l..."
5,1,lymphocytic lymphoma,Cancer,C8350,"[Lymphoblastic (diffuse) lymphoma, unspecified site, Small cell B-cell lymphoma, unspecified site, Mantle cell lymphoma, spleen, Waldenstrom macroglobulinemia, Non-Hodgkin lymphoma, unspecified, unspecified site, Enteropathy-type (intestinal) T-c...",9673/3,"[Mantle cell lymphoma, Chronic lymphocytic leukemia/small lymphocytic lymphoma, Follicular lymphoma, NOS, Waldenstrom macroglobulinemia, Malignant lymphoma, non-Hodgkin, Hodgkin lymphoma, lymphocytic deplet., NOS, Follicular lymphoma, grade 1, Se..."
6,1,SLL,Cancer,C8295,"[Follicular lymphoma, unspecified, lymph nodes of inguinal region and lower limb, Follicular lymphoma grade II, unspecified site, Follicular lymphoma grade IIIa, spleen, Follicular lymphoma grade IIIb, spleen, Follicular lymphoma grade III, unspe...",9705/3,"[Angioimmunoblastic T-cell lymphoma, Mantle cell lymphoma, Hodgkin lymphoma, lymphocyte-rich, Primary effusion lymphoma, Hodgkin lymphoma, nodular sclerosis, NOS, Malignant lymphoma, non-Hodgkin, Composite Hodgkin and non-Hodgkin lymphoma, Plasma..."
7,1,CLL,Cancer,C8337,"[Diffuse large B-cell lymphoma, spleen, Other lymphoid leukemia not having achieved remission, Prolymphocytic leukemia of T-cell type not having achieved remission, Prolymphocytic leukemia of B-cell type not having achieved remission, Lymphoid le...",9679/3,"[Mediastinal large B-cell lymphoma, Acute myeloid leukemia, Hairy cell leukemia, Plasma cell leukemia, Prolymphocytic leukemia, NOS, Acute monocytic leukemia, Acute megakaryoblastic leukemia, Burkitt cell leukemia, Chronic myelogenous leukemia, B..."
8,2,central nervous system lymphoma,Cancer,C8589,"[Other specified types of non-Hodgkin lymphoma, extranodal and solid organ sites, Other non-follicular lymphoma, unspecified site, Other non-follicular lymphoma, spleen, Malignant neoplasm of central nervous system, unspecified, Other non-follicu...",9501/3,"[Medulloepithelioma, NOS, Medulloblastoma, WNT-activated, Medulloblastoma, non-WNT/non-SHH, Craniopharyngioma, Medulloblastoma, NOS, Oligodendroglioma, NOS, Medulloblastoma, SHH-activated and TP53-mutant, Oligodendroglioma, anaplastic, Central os..."
9,2,PCNSL,Cancer,C8448,"[Peripheral T-cell lymphoma, not classified, lymph nodes of multiple sites, Peripheral T-cell lymphoma, not classified, intrathoracic lymph nodes, Peripheral T-cell lymphoma, not classified, lymph nodes of inguinal region and lower limb, Peripher...",9679/3,"[Mediastinal large B-cell lymphoma, Hydroa vacciniforme-like lymphoma, Histiocytic sarcoma, Mantle cell lymphoma, Sezary syndrome, Malignant lymphoma, non-Hodgkin, Angioimmunoblastic T-cell lymphoma, Waldenstrom macroglobulinemia, Plasmablastic l..."
