# Identifying reports that describe CRC and extracting the TNM stage


Andres Tamm

2025-07-21

<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Prepare-data" data-toc-modified-id="Prepare-data-1">Prepare data</a></span></li><li><span><a href="#Step-1.-Identify-reports-that-describe-current-colorectal-cancer" data-toc-modified-id="Step-1.-Identify-reports-that-describe-current-colorectal-cancer-2">Step 1. Identify reports that describe current colorectal cancer</a></span></li><li><span><a href="#Step-2.-Extract-TNM-phrases-from-reports" data-toc-modified-id="Step-2.-Extract-TNM-phrases-from-reports-3">Step 2. Extract TNM phrases from reports</a></span></li><li><span><a href="#Step-3.-Extract-TNM-values-from-phrases" data-toc-modified-id="Step-3.-Extract-TNM-values-from-phrases-4">Step 3. Extract TNM values from phrases</a></span></li></ul></div>

In [24]:
import pandas as pd
import numpy as np
from textmining.reports import get_crc_reports
from textmining.tnm.clean import add_tumour_tnm
from textmining.tnm.tnm import get_tnm_phrase, get_tnm_values
from pathlib import Path

## Prepare data

Reports should be in a Pandas DataFrame, where one of the columns contains the report text. 


I would usually load real data like this:

```python
data_dir = Path("C:\\path\\to\\folder\\with\\data")
filename = "histopathology_OUH.csv"
df = pd.read_csv(data_dir / filename)
```

In [25]:
# Get reports 
reports = ['Metastatic tumour from colorectal primary, T3 N0',
           'T1 N0 MX (colorectal cancer)',
           'pT3/2/1 N0 Mx. Malignant neoplasm ascending colon',
           'pT2a/b N0 Mx (sigmoid tumour)',
           'T4a & b N0 M1 invasive carcinoma, descending colon',
           'T1-weighted image, ... rectal tumour staged as ymrT2',
           'Colorectal tumour. Stage: T4b / T4a / T3 / T2 / T1',
           'Sigmoid adenocarcinoma, ... Summary: pT1 (sigmoid, txt txt txt txt), N3b M0',
           'Colorectal tumour in situ, Tis N0 M0',
           'Clinical information: T1 N1 (sigmoid tumour)'
           ]
df = pd.DataFrame(reports, columns=['report_text_anon'])
df['subject'] = '01'

pd.set_option('display.max_colwidth', 500, 'display.max_rows', 1000, 'display.min_rows', 1000)
display(df)

Unnamed: 0,report_text_anon,subject
0,"Metastatic tumour from colorectal primary, T3 N0",1
1,T1 N0 MX (colorectal cancer),1
2,pT3/2/1 N0 Mx. Malignant neoplasm ascending colon,1
3,pT2a/b N0 Mx (sigmoid tumour),1
4,"T4a & b N0 M1 invasive carcinoma, descending colon",1
5,"T1-weighted image, ... rectal tumour staged as ymrT2",1
6,Colorectal tumour. Stage: T4b / T4a / T3 / T2 / T1,1
7,"Sigmoid adenocarcinoma, ... Summary: pT1 (sigmoid, txt txt txt txt), N3b M0",1
8,"Colorectal tumour in situ, Tis N0 M0",1
9,Clinical information: T1 N1 (sigmoid tumour),1


## Step 1. Identify reports that describe current colorectal cancer

Main arguments
* `df`: Pandas DataFrame that contains reports (one report per row)
* `col`: name of column in `df` that contains reports

Outputs
* dataframe that contains reports that describe colorectal cancer (a subset of rows of `df`)
* dataframe that contains all matches for colorectal cancer - some of these matches are marked as excluded (`exclusion_indicator = 1`), because they do not correspond to current colorectal cancer
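
The relationship between a match, its context and its exclusion flag can be sketched in a few lines. This is only an illustration, not how `get_crc_reports` is implemented: the keyword pattern, the `EXCLUSION_CUES` dictionary and the function `find_tumour_matches` are all hypothetical and far simpler than the package's own vocabularies.

```python
import re

# Hypothetical, heavily simplified vocabularies (not the package's own lists)
TUMOUR = re.compile(
    r"colorectal (?:cancer|tumour)|malignant neoplasm|adenocarcinoma|carcinoma|tumour",
    re.IGNORECASE,
)
EXCLUSION_CUES = {"metastatic": "metastatic", "clinical information": "historic"}


def find_tumour_matches(reports):
    """Return one dict per keyword match, with context and an exclusion flag."""
    out = []
    for row, text in enumerate(reports):
        for m in TUMOUR.finditer(text):
            left, right = text[:m.start()], text[m.end():]
            # A match is excluded if any cue occurs in the text to its left
            reasons = [r for cue, r in EXCLUSION_CUES.items() if cue in left.lower()]
            out.append({
                "row": row,
                "left": left,
                "target": m.group(0),
                "right": right,
                "exclusion_indicator": int(bool(reasons)),
                "exclusion_reason": ";".join(reasons),
            })
    return out


toy_reports = [
    "Metastatic tumour from colorectal primary, T3 N0",
    "T1 N0 MX (colorectal cancer)",
]
toy_matches = find_tumour_matches(toy_reports)
for m in toy_matches:
    print(m["row"], repr(m["target"]), m["exclusion_indicator"], m["exclusion_reason"])
```

The first toy report is flagged for exclusion because "Metastatic" precedes the keyword, while the second is kept. A report would then count as describing current CRC if it has at least one match that is not excluded.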


In [26]:
pd.set_option('display.max_colwidth', 2000, 'display.min_rows', 1000, 'display.max_rows', 1000)

In [27]:
# Run
df_crc, matches_crc = get_crc_reports(df=df, col='report_text_anon')


Sites included: ['caecum' 'right (ascending) colon' 'hepatic flexure' 'transverse colon'
 'splenic flexure' 'left (descending) colon' 'sigmoid colon' 'rectum'
 'colon' 'colon and rectum']
Sites excluded: ['liver' 'lung' 'pelvis' 'uterus' 'ovaries' 'bladder' 'spleen'
 'anastomosis' 'adrenal gland' 'kidney' 'bone' 'pleura' 'brain' 'head'
 'prostate']



Time elapsed: 0.011885631084442138 minutes





In [28]:
# Get included and excluded matches
matches_incl = matches_crc.loc[matches_crc.exclusion_indicator==0]
matches_excl = matches_crc.loc[matches_crc.exclusion_indicator==1]

In [29]:
# Included matches
#   'row' is the row number in the input DataFrame, counting from 0 (0 = first row, 1 = second row, ...)
print('{} matches for tumour keywords were included'.format(matches_incl.shape[0]))
matches_incl[['row', 'left', 'target', 'right']]

8 matches for tumour keywords were included


Unnamed: 0,row,left,target,right
7,1,T1 N0 MX (,colorectal cancer,)
1,2,pT3/2/1 N0 Mx.,Malignant neoplasm,ascending colon
2,3,pT2a/b N0 Mx (sigmoid,tumour,)
3,4,T4a & b N0 M1 invasive,carcinoma,", descending colon"
4,5,"T1-weighted image, ... rectal",tumour,staged as ymrT2
8,6,,Colorectal tumour,. Stage: T4b / T4a / T3 / T2 / T1
5,7,Sigmoid,adenocarcinoma,", ... Summary: pT1 (sigmoid, txt txt txt txt), N3b M0"
9,8,,Colorectal tumour,"in situ, Tis N0 M0"


In [30]:
# Excluded matches
print('{} matches for tumour keywords were excluded'.format(matches_excl.shape[0]))
matches_excl[['row', 'left', 'target', 'right', 'exclusion_indicator', 'exclusion_reason']]

2 matches for tumour keywords were excluded


Unnamed: 0,row,left,target,right,exclusion_indicator,exclusion_reason
0,0,Metastatic,tumour,"from colorectal primary, T3 N0",1,metastatic;
6,9,Clinical information: T1 N1 (sigmoid,tumour,),1,site historic or general;historic;


In [31]:
# If some included matches are not correct after review, they can be manually excluded
# In that case, the CRC reports can be identified as
df['row'] = np.arange(df.shape[0])
df['crc_nlp'] = 0
matches_incl = matches_crc.loc[matches_crc.exclusion_indicator==0]
matches_incl_processed = matches_incl # processed matches, add any processing steps
df.loc[df.row.isin(matches_incl_processed.row), 'crc_nlp'] = 1
df_crc = df.loc[df.crc_nlp == 1]
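
A manual exclusion step could look roughly like the sketch below. The toy DataFrames and the list `rows_to_drop` are hypothetical stand-ins. Only the column names `row`, `exclusion_indicator` and `crc_nlp` are taken from the code above.

```python
import pandas as pd

# Toy stand-ins for `df` and `matches_crc` (hypothetical values)
df_toy = pd.DataFrame({"report_text_anon": ["report a", "report b", "report c"]})
matches_toy = pd.DataFrame({"row": [0, 1, 2], "exclusion_indicator": [0, 0, 1]})

# Rows whose included matches were judged incorrect on manual review (hypothetical)
rows_to_drop = [1]

# Keep included matches, then remove the manually rejected ones
incl = matches_toy.loc[matches_toy.exclusion_indicator == 0]
incl_processed = incl.loc[~incl.row.isin(rows_to_drop)]

# Flag reports that still have at least one included match
df_toy["row"] = range(df_toy.shape[0])
df_toy["crc_nlp"] = 0
df_toy.loc[df_toy.row.isin(incl_processed.row), "crc_nlp"] = 1
df_crc_toy = df_toy.loc[df_toy.crc_nlp == 1]
print(df_crc_toy)
```

Here only the first toy report remains flagged: the second was rejected manually and the third was already excluded automatically.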

## Step 2. Extract TNM phrases from reports

I first run `get_tnm_phrase` to get all TNM sequences (e.g. `T1 N0 M0`) and all phrases with single TNM values (e.g. `stage: T1`).

I then run `add_tumour_tnm` to identify tumour keywords that occur near the TNM phrases. This can help decide which tumour a TNM phrase refers to, but the step is optional.

Main arguments for `get_tnm_phrase`
* `df`  : DataFrame that contains reports
* `col` : column in `df` that contains the report text
* `remove_unusual` : remove unusual TNM phrases from output. For example, if four or more T-values are given in sequence, it is likely a multiple-choice option rather than an actual TNM stage. True by default.
* `remove_historical`: remove TNM phrases that were marked as historical based on nearby words. False by default, because that part of the code may not be accurate at the moment.
* `remove_falsepos`: remove phrases with single TNM values if they lack inclusion keywords or contain exclusion keywords. For example, `T1-weighted` is removed, as it is not a T-stage. True by default.

Main arguments for `add_tumour_tnm`
* `df`         : dataframe that contains reports
* `matches`    : dataframe that contains matches for TNM phrases - this is the first output of `get_tnm_phrase()`
* `col_report` : column in `df` that contains reports


In [32]:
# Display opts
pd.set_option('display.max_colwidth', 500, 'display.max_rows', 1000, 'display.min_rows', 1000)

In [33]:
# Extract TNM phrases
#  remove_historical = False, as the detection of historical TNM phrases is likely not accurate at the moment
matches, check_phrases, check_cleaning, check_rm = get_tnm_phrase(df=df_crc, col='report_text_anon', 
                                                                  remove_unusual=True, 
                                                                  remove_historical=False, 
                                                                  remove_falsepos=True)


Extracting TNM sequences ...

7 matches for TNM sequences

Extracting single TNM values ...

Extracting individual TNM values for category: T
8 matches, 0 marked for exclusion

Extracting individual TNM values for category: N
6 matches, 1 marked for exclusion

Extracting individual TNM values for category: M
6 matches, 1 marked for exclusion

Extracting individual TNM values for category: L
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: V
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: R
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: SM
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: H
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: G
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: Pn
0 matches, 0 marked for exclusion

Time elapsed: 0.28 minutes
0 matches have at least 100 capital letters
0 matches have at least 100 digits

1 matches have at least 4 T values in a sequence

Number of matches after splitting phrases: 7


In [34]:
# Extract tumour keywords that occur near each TNM phrase
# This can help to later decide which tumour the TNM phrase refers to
matches = add_tumour_tnm(df=df_crc, matches=matches, col_report='report_text_anon', targetcol='target_before_clean')

Finding nearby tumour keywords for each TNM phrase

Sites included: ['caecum' 'right (ascending) colon' 'hepatic flexure' 'transverse colon'
 'splenic flexure' 'left (descending) colon' 'sigmoid colon' 'rectum'
 'colon' 'colon and rectum']
Sites excluded: ['liver' 'lung' 'pelvis' 'uterus' 'ovaries' 'bladder' 'spleen'
 'anastomosis' 'adrenal gland' 'kidney' 'bone' 'pleura' 'brain' 'head'
 'prostate']



Time elapsed: 0.00986556609471639 minutes
  Number of matches for tumour keywords that were included: 8
Time elapsed: 0.70 seconds


In [35]:
# View unique values for extracted phrases
check_phrases

Unnamed: 0,target,length
1,pT3/2/1 N0 Mx,13
2,pT2a/2b N0 Mx,13
3,T4a/4b N0 M1,12
4,pT1 N3b M0,10
5,Tis N0 M0,9
0,T1 N0 MX,8
6,ymrT2,5


In [36]:
# Check cleaning of TNM phrases
check_cleaning

Unnamed: 0,target_before_clean,target,length
1,pT3/2/1 N0 Mx,pT3/2/1 N0 Mx,13
2,pT2a/b N0 Mx,pT2a/2b N0 Mx,13
3,T4a & b N0 M1,T4a/4b N0 M1,12
4,"pT1 (sigmoid, txt txt txt txt), N3b M0",pT1 N3b M0,10
5,Tis N0 M0,Tis N0 M0,9
0,T1 N0 MX,T1 N0 MX,8
6,ymrT2,ymrT2,5
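
The kind of normalisation visible in the table above can be imitated with a few regular expressions. This is a rough sketch under my own assumptions, not the package's cleaning code, and the function `clean_tnm_phrase` is hypothetical. It covers only the three patterns that appear in the example reports.

```python
import re


def clean_tnm_phrase(phrase):
    """Toy normalisation of a TNM phrase (hypothetical helper)."""
    # Drop parenthesised free text, plus a comma that directly follows it
    s = re.sub(r"\s*\([^)]*\),?", "", phrase)
    # Unify the separator between alternatives: ' & ' becomes '/'
    s = re.sub(r"\s*&\s*", "/", s)
    # Expand a bare letter alternative by repeating the digit: '2a/b' becomes '2a/2b'
    s = re.sub(r"(\d)([a-d])/([a-d])\b", r"\1\2/\1\3", s)
    return s


for raw in ["pT2a/b N0 Mx", "T4a & b N0 M1",
            "pT1 (sigmoid, txt txt txt txt), N3b M0", "pT3/2/1 N0 Mx"]:
    print(raw, "->", clean_tnm_phrase(raw))
```

On these four inputs the sketch reproduces the `target` values shown in the table. Real reports would need many more rules.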


In [37]:
# View all included matches - detailed view
cols =  ['sentence', 'left', 'target_before_split', 'target_before_clean', 'target', 
         'right', 'exclusion_indicator', 'exclusion_reason', 'phrase_with_tumour']
matches[cols]

Unnamed: 0,sentence,left,target_before_split,target_before_clean,target,right,exclusion_indicator,exclusion_reason,phrase_with_tumour
0,T1 N0 MX (colorectal cancer),,T1 N0 MX,T1 N0 MX,T1 N0 MX,(colorectal cancer),0,,<<T1 N0 MX>> <<COLORECTAL CANCER>>)
1,pT3/2/1 N0 Mx,,pT3/2/1 N0 Mx,pT3/2/1 N0 Mx,pT3/2/1 N0 Mx,. Malignant neoplasm ascending colon,0,,<<PT3/2/1 N0 MX>>.<<MALIGNANT NEOPLASM>> ascending colon
2,pT2a/b N0 Mx (sigmoid tumour),,pT2a/b N0 Mx,pT2a/b N0 Mx,pT2a/2b N0 Mx,(sigmoid tumour),0,,<<PT2A/B N0 MX>> (sigmoid<<TUMOUR>>)
3,"T4a & b N0 M1 invasive carcinoma, descending colon",,T4a & b N0 M1,T4a & b N0 M1,T4a/4b N0 M1,"invasive carcinoma, descending colon",0,,"<<T4A & B N0 M1>> invasive<<CARCINOMA>>, descending colon"
4,rectal tumour staged as ymrT2,"T1-weighted image, ... rectal tumour staged as",ymrT2,ymrT2,ymrT2,,0,,"t1-weighted image, ... rectal <<TUMOUR>> staged as<<YMRT2>>"
5,"Summary: pT1 (sigmoid, txt txt txt txt), N3b M0","Sigmoid adenocarcinoma, ... Summary:","pT1 (sigmoid, txt txt txt txt), N3b M0","pT1 (sigmoid, txt txt txt txt), N3b M0",pT1 N3b M0,,0,,"sigmoid <<ADENOCARCINOMA>>, ... summary:<<PT1 (SIGMOID, TXT TXT TXT TXT), N3B M0>>"
6,"Colorectal tumour in situ, Tis N0 M0","Colorectal tumour in situ,",Tis N0 M0,Tis N0 M0,Tis N0 M0,,0,,"<<COLORECTAL TUMOUR>> in situ,<<TIS N0 M0>>"


In [38]:
# View all included matches - simpler view
cols =  ['left', 'target_before_clean', 'target', 'right']
matches[cols]

Unnamed: 0,left,target_before_clean,target,right
0,,T1 N0 MX,T1 N0 MX,(colorectal cancer)
1,,pT3/2/1 N0 Mx,pT3/2/1 N0 Mx,. Malignant neoplasm ascending colon
2,,pT2a/b N0 Mx,pT2a/2b N0 Mx,(sigmoid tumour)
3,,T4a & b N0 M1,T4a/4b N0 M1,"invasive carcinoma, descending colon"
4,"T1-weighted image, ... rectal tumour staged as",ymrT2,ymrT2,
5,"Sigmoid adenocarcinoma, ... Summary:","pT1 (sigmoid, txt txt txt txt), N3b M0",pT1 N3b M0,
6,"Colorectal tumour in situ,",Tis N0 M0,Tis N0 M0,


In [39]:
# View matches marked for exclusion
check_rm

Unnamed: 0,row,start,end,left,target,right,exclusion_indicator,exclusion_reason,solitary_indicator,sentence_left,sentence_right,sentence,rank
4,5,,,Colorectal tumour. Stage:,T4b / T4a / T3 / T2 / T1,,1,4 or more T-values;,,,,,


In [40]:
# See if any matches marked for exclusion are among included matches
cols =  ['left', 'target_before_clean', 'target', 'right']
matches.loc[matches.exclusion_indicator==1, cols]

Unnamed: 0,left,target_before_clean,target,right


## Step 3. Extract TNM values from phrases

Arguments for `tnm.get_tnm_values()`:
* `df` : Pandas dataframe that contains reports
* `matches` : TNM phrases that were extracted for each report, the first output of `tnm.get_tnm_phrase()`
* `col` : name of column in `df` that contains reports
* `pathology_prefix` : if True, the output columns will have 'p' prefix, e.g. 'pT'
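
As a rough illustration of what extracting values from a cleaned phrase involves, the sketch below pulls out the T prefix, the maximum and minimum T-value, and the N and M values. It is a simplification under my own assumptions and not the implementation of `get_tnm_values`: the patterns, the ordering rule in `_rank` (`is` ranked below `0`) and the function `tnm_values` are hypothetical.

```python
import re

# Simplified patterns (assumptions); alternatives are separated by '/'
T_RE = re.compile(r"([a-z]*)T((?:is|[0-4][a-d]?)(?:/(?:is|[0-4][a-d]?))*)")
N_RE = re.compile(r"N([0-3][a-c]?|[Xx])")
M_RE = re.compile(r"M([01][a-c]?|[Xx])")


def _rank(value):
    # Assumed ordering: 'is' < 0 < 1 < ...; letter suffixes break ties (2a < 2b)
    if value == "is":
        return (-1, "")
    return (int(value[0]), value[1:])


def _norm(value):
    # Write the unknown category consistently as 'X'
    return "X" if value.lower() == "x" else value


def tnm_values(phrase):
    """Return prefix, max/min T, N and M from a cleaned TNM phrase."""
    out = {"T_pre": None, "T": None, "T_min": None, "N": None, "M": None}
    t = T_RE.search(phrase)
    if t:
        values = sorted(t.group(2).split("/"), key=_rank)
        out.update(T_pre=t.group(1), T=values[-1], T_min=values[0])
    n = N_RE.search(phrase)
    if n:
        out["N"] = _norm(n.group(1))
    m = M_RE.search(phrase)
    if m:
        out["M"] = _norm(m.group(1))
    return out


print(tnm_values("pT3/2/1 N0 Mx"))
print(tnm_values("pT2a/2b N0 Mx"))
print(tnm_values("ymrT2"))
```

For `pT3/2/1 N0 Mx` this gives prefix `p`, maximum T `3` and minimum T `1`, which mirrors the distinction between the plain and the `_min` columns in the package output.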

In [41]:
# Get TNM values from phrases
df_crc, s = get_tnm_values(df_crc, matches=matches, col='report_text_anon', pathology_prefix=False)




Extracting values from the phrase ...




Extracting additional perineural invasion




Time elapsed: 0.03 seconds

Getting minimum and maximum values ...
Time elapsed: 0.01 minutes


In [42]:
# Column names in df after tnm values were added
print('Columns in df_crc:')
for i, c in enumerate(df_crc.columns):
    print('{}: {}'.format(i,c))

Columns in df_crc:
0: report_text_anon
1: subject
2: crc_nlp
3: T_pre
4: T
5: N
6: M
7: V
8: R
9: L
10: Pn
11: SM
12: H
13: G
14: T_pre_min
15: T_min
16: N_min
17: M_min
18: V_min
19: R_min
20: L_min
21: Pn_min
22: SM_min
23: H_min
24: G_min
25: sha


In [43]:
# View subset of output
df_crc[['report_text_anon', 'T_pre', 'T', 'N', 'M', 'T_pre_min', 'T_min', 'N_min', 'M_min']].fillna('')

Unnamed: 0,report_text_anon,T_pre,T,N,M,T_pre_min,T_min,N_min,M_min
0,T1 N0 MX (colorectal cancer),,1,0,X,,1,0,X
1,pT3/2/1 N0 Mx. Malignant neoplasm ascending colon,p,3,0,X,p,1,0,X
2,pT2a/b N0 Mx (sigmoid tumour),p,2b,0,X,p,2a,0,X
3,"T4a & b N0 M1 invasive carcinoma, descending colon",,4b,0,1,,4a,0,1
4,"T1-weighted image, ... rectal tumour staged as ymrT2",mry,2,,,mry,2,,
5,Colorectal tumour. Stage: T4b / T4a / T3 / T2 / T1,,,,,,,,
6,"Sigmoid adenocarcinoma, ... Summary: pT1 (sigmoid, txt txt txt txt), N3b M0",p,1,3b,0,p,1,3b,0
7,"Colorectal tumour in situ, Tis N0 M0",,is,0,0,,is,0,0


## Run the code on multiple cores to speed it up

In [44]:
from textmining.reports import get_crc_reports_par
from textmining.tnm.tnm import get_tnm_phrase_par


In [45]:
# Identify CRC reports, dividing the data into 10 chunks which are processed in parallel if there are at least 10 cores
df_crc, matches_crc = get_crc_reports_par(nchunks=10, njobs=-1, df=df, col='report_text_anon')


Sites included: ['caecum' 'right (ascending) colon' 'hepatic flexure' 'transverse colon'
 'splenic flexure' 'left (descending) colon' 'sigmoid colon' 'rectum'
 'colon' 'colon and rectum']
Sites excluded: ['liver' 'lung' 'pelvis' 'uterus' 'ovaries' 'bladder' 'spleen'
 'anastomosis' 'adrenal gland' 'kidney' 'bone' 'pleura' 'brain' 'head'
 'prostate']


Time elapsed: 0.009233347574869792 minutes
Time elapsed: 0.009463131427764893 minutes
Time elapsed: 0.009452847639719646 minutes
Time elapsed: 0.009517800807952882 minutes
Time elapsed: 0.00960391362508138 minutes
Time elapsed: 0.009610132376352946 minutes
Time elapsed: 0.009651696681976319 minutes
Time elapsed: 0.009677565097808838 minutes
Time elapsed: 0.011207131544748943 minutes
Time elapsed: 0.01105191707611084 minutes
Time elapsed: 0.011647884051005046 minutes
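
The general idea of splitting the reports into chunks and processing the chunks concurrently can be sketched with the standard library. This is not how `get_crc_reports_par` is implemented. The helpers `split_into_chunks`, `process_chunk` and `run_parallel` are hypothetical, and the sketch uses threads for simplicity, whereas a real speed-up for CPU-bound text processing would need separate processes.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, nchunks):
    """Split a list into at most `nchunks` contiguous, near-equal chunks."""
    nchunks = max(1, min(nchunks, len(items)))
    size, extra = divmod(len(items), nchunks)
    chunks, start = [], 0
    for i in range(nchunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def process_chunk(chunk):
    # Placeholder for the per-chunk work (hypothetical): flag reports mentioning 'tumour'
    return ["tumour" in text.lower() for text in chunk]


def run_parallel(items, nchunks=10, max_workers=None):
    """Process chunks concurrently and return results in the original order."""
    chunks = split_into_chunks(items, nchunks)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(process_chunk, chunks))  # map keeps chunk order
    return [flag for part in results for flag in part]


reports_toy = ["Colorectal tumour", "No malignancy seen", "Sigmoid tumour", "Normal mucosa"]
flags = run_parallel(reports_toy, nchunks=3)
print(flags)
```

Because every chunk is processed independently, per-chunk messages such as the site lists and timings are printed once per chunk, which is why they repeat in the output above.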


In [46]:
# Identify CRC reports, dividing the data into 10 chunks which are processed in parallel if there are at least 10 cores
df_crc, matches_crc = get_crc_reports_par(nchunks=10, njobs=-1, df=df, col='report_text_anon')


Sites included: ['caecum' 'right (ascending) colon' 'hepatic flexure' 'transverse colon'
 'splenic flexure' 'left (descending) colon' 'sigmoid colon' 'rectum'
 'colon' 'colon and rectum']
Sites excluded: ['liver' 'lung' 'pelvis' 'uterus' 'ovaries' 'bladder' 'spleen'
 'anastomosis' 'adrenal gland' 'kidney' 'bone' 'pleura' 'brain' 'head'
 'prostate']



Time elapsed: 0.007356413205464681 minutes
Time elapsed: 0.007300047079722086 minutes
Time elapsed: 0.007416550318400065 minutes
Time elapsed: 0.007358316580454508 minutes
Time elapsed: 0.007570036252339681 minutes
Time elapsed: 0.007555683453877767 minutes
Time elapsed: 0.007580900192260742 minutes
Time elapsed: 0.007314717769622803 minutes
Time elapsed: 0.0074923157691955565 minutes
Time elapsed: 0.007468012968699137 minutes
Time elapsed: 0.00816035270690918 minutes
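
The internals of `get_crc_reports_par` are not shown in this notebook. As a rough illustration of the general pattern that the `nchunks` argument suggests (split the DataFrame into chunks, apply a function to each chunk concurrently, concatenate the results), here is a minimal sketch. The names `flag_crc` and `apply_in_chunks` are hypothetical and not part of `textmining`; the keyword flag is only a stand-in for the real CRC detection, and a thread pool is used to keep the example self-contained (a process pool would be the more usual choice for CPU-bound text mining).

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np
import pandas as pd


def flag_crc(chunk, col):
    # Hypothetical stand-in for the real per-chunk function: a crude keyword flag
    out = chunk.copy()
    out["crc_flag"] = out[col].str.contains("colorectal|colon|rectal", case=False).astype(int)
    return out


def apply_in_chunks(df, func, nchunks=10, max_workers=None, **kwargs):
    # Never create more chunks than there are rows, and always at least one
    nchunks = max(1, min(nchunks, len(df)))
    # Split row positions into nchunks groups of near-equal size
    parts = np.array_split(np.arange(len(df)), nchunks)
    chunks = [df.iloc[idx] for idx in parts]
    # Process the chunks concurrently; map() returns results in input order
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(lambda c: func(c, **kwargs), chunks))
    # Reassemble a single DataFrame, keeping the original index
    return pd.concat(results)


toy = pd.DataFrame({"report_text_anon": [
    "T1 N0 MX (colorectal cancer)",
    "Normal gastric mucosa",
    "pT2 N0 sigmoid colon tumour",
]})
flagged = apply_in_chunks(toy, flag_crc, nchunks=10, col="report_text_anon")
print(flagged)
```

Because each worker prints its own messages, output from a chunked run tends to be interleaved, which is why the same log lines can appear several times in the cell output.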



In [47]:
# As before, the matches can also be checked manually, and CRC reports identified from the checked matches
df['row'] = np.arange(df.shape[0])  # row number used to link matches to reports
df['crc_nlp'] = 0                   # indicator, set to 1 for reports identified as CRC
matches_incl = matches_crc.loc[matches_crc.exclusion_indicator==0]  # keep matches not marked for exclusion
matches_incl_processed = matches_incl  # placeholder for processed matches: add any processing steps here
df.loc[df.row.isin(matches_incl_processed.row), 'crc_nlp'] = 1
df_crc = df.loc[df.crc_nlp == 1]
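
To make the manual-check step more concrete, the following toy sketch shows one way a processing step could be slotted in before the reports are flagged. It assumes, as in the cell above, that the matches table has `row` and `exclusion_indicator` columns; the `reviewer_reject` column and the `_toy` names are hypothetical and only illustrate a reviewer marking a match as a false positive.

```python
import numpy as np
import pandas as pd

# Toy reports, with a row number that links matches back to reports
df_toy = pd.DataFrame({"report_text_anon": ["report A", "report B", "report C"]})
df_toy["row"] = np.arange(df_toy.shape[0])

# Toy matches table: one line per match, several matches may share a report
matches_toy = pd.DataFrame({
    "row": [0, 0, 2],
    "exclusion_indicator": [0, 1, 0],  # set by the automatic rules
    "reviewer_reject": [0, 0, 1],      # hypothetical column filled in during manual review
})

# Keep matches that passed both the automatic rules and the manual review
keep = matches_toy.loc[
    (matches_toy.exclusion_indicator == 0) & (matches_toy.reviewer_reject == 0)
]

# A report is flagged if at least one of its matches was kept
df_toy["crc_nlp"] = df_toy.row.isin(keep.row).astype(int)
df_crc_toy = df_toy.loc[df_toy.crc_nlp == 1]
print(df_toy)
```

Here only the first report is retained: its second match is excluded automatically, but its first match survives both checks, whereas the only match for the third report is rejected by the reviewer.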

In [48]:
# Extract TNM phrases: the data are split into 10 chunks, which are processed in parallel when at least 10 cores are available
matches, check_phrases, check_cleaning, check_rm = get_tnm_phrase_par(nchunks=10, njobs=-1, 
                                                                      df=df_crc, col='report_text_anon', 
                                                                      remove_unusual=True, 
                                                                      remove_historical=False, 
                                                                      remove_falsepos=True)


Extracting TNM sequences ...

Extracting TNM sequences ...

Extracting TNM sequences ...

Extracting TNM sequences ...

Extracting TNM sequences ...

Extracting TNM sequences ...

Extracting TNM sequences ...

Extracting TNM sequences ...

Extracting TNM sequences ...

Extracting TNM sequences ...




0 matches for TNM sequences

Extracting single TNM values ...

Extracting individual TNM values for category: T
0 matches for TNM sequences

Extracting single TNM values ...

Extracting individual TNM values for category: T



0 matches, 0 marked for exclusion

Extracting individual TNM values for category: N
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: N



0 matches, 0 marked for exclusion

Extracting individual TNM values for category: M
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: M



0 matches, 0 marked for exclusion

Extracting individual TNM values for category: L
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: L



0 matches, 0 marked for exclusion

Extracting individual TNM values for category: V
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: V



0 matches, 0 marked for exclusion

Extracting individual TNM values for category: R
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: R



0 matches, 0 marked for exclusion

Extracting individual TNM values for category: SM
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: SM



0 matches, 0 marked for exclusion

Extracting individual TNM values for category: H
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: H



0 matches for TNM sequences

Extracting single TNM values ...
1 matches for TNM sequences

Extracting single TNM values ...
1 matches for TNM sequences

Extracting single TNM values ...

Extracting individual TNM values for category: T

Extracting individual TNM values for category: T

Extracting individual TNM values for category: T
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: G
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: G



1 matches for TNM sequences

Extracting single TNM values ...

Extracting individual TNM values for category: T
1 matches for TNM sequences

Extracting single TNM values ...

Extracting individual TNM values for category: T
1 matches for TNM sequences

Extracting single TNM values ...

Extracting individual TNM values for category: T
1 matches for TNM sequences

Extracting single TNM values ...

Extracting individual TNM values for category: T
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: Pn
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: Pn
1 matches for TNM sequences

Extracting single TNM values ...

Extracting individual TNM values for category: T



0 matches, 0 marked for exclusion
0 matches, 0 marked for exclusion
Time elapsed: 0.29 minutes
Time elapsed: 0.29 minutes



1 matches, 0 marked for exclusion

Extracting individual TNM values for category: N
1 matches, 0 marked for exclusion

Extracting individual TNM values for category: N
1 matches, 0 marked for exclusion

Extracting individual TNM values for category: N




1 matches, 0 marked for exclusion

Extracting individual TNM values for category: N
1 matches, 0 marked for exclusion

Extracting individual TNM values for category: N
1 matches, 0 marked for exclusion

Extracting individual TNM values for category: N
1 matches, 0 marked for exclusion

Extracting individual TNM values for category: N
1 matches, 0 marked for exclusion

Extracting individual TNM values for category: N



0 matches, 0 marked for exclusion

Extracting individual TNM values for category: M
1 matches, 0 marked for exclusion

Extracting individual TNM values for category: M
1 matches, 1 marked for exclusion

Extracting individual TNM values for category: M



0 matches, 0 marked for exclusion

Extracting individual TNM values for category: M
1 matches, 0 marked for exclusion

Extracting individual TNM values for category: M
1 matches, 0 marked for exclusion

Extracting individual TNM values for category: M
1 matches, 0 marked for exclusion

Extracting individual TNM values for category: M
1 matches, 0 marked for exclusion

Extracting individual TNM values for category: M



0 matches, 0 marked for exclusion

Extracting individual TNM values for category: L
1 matches, 0 marked for exclusion

Extracting individual TNM values for category: L
1 matches, 0 marked for exclusion

Extracting individual TNM values for category: L



0 matches, 0 marked for exclusion

Extracting individual TNM values for category: V
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: V
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: V
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: L
1 matches, 0 marked for exclusion

Extracting individual TNM values for category: L



1 matches, 0 marked for exclusion

Extracting individual TNM values for category: L
1 matches, 1 marked for exclusion

Extracting individual TNM values for category: L
1 matches, 0 marked for exclusion

Extracting individual TNM values for category: L



0 matches, 0 marked for exclusion

Extracting individual TNM values for category: R
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: R
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: R
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: V
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: V
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: V



0 matches, 0 marked for exclusion

Extracting individual TNM values for category: V
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: V



0 matches, 0 marked for exclusion

Extracting individual TNM values for category: R
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: SM
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: R
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: R
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: SM
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: SM
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: R
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: R




0 matches, 0 marked for exclusion

Extracting individual TNM values for category: SM
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: H
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: H
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: SM
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: SM
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: H
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: SM
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: SM


0 matches, 0 marked for exclusion

Extracting individual TNM values for category: H
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: H
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: G
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: G
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: H
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: H
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: G
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: H


0 matches, 0 marked for exclusion

Extracting individual TNM values for category: G
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: Pn
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: G
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: Pn
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: G
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: Pn
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: G
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: G


0 matches, 0 marked for exclusion
Time elapsed: 0.41 minutes
0 matches have at least 100 capital letters
0 matches have at least 100 digits
0 matches, 0 marked for exclusion
Time elapsed: 0.41 minutes
0 matches have at least 100 capital letters
0 matches have at least 100 digits
0 matches, 0 marked for exclusion
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: Pn
Time elapsed: 0.41 minutes


0 matches have at least 100 capital letters
0 matches have at least 100 digits
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: Pn
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: Pn
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: Pn
0 matches, 0 marked for exclusion

Extracting individual TNM values for category: Pn


0 matches, 0 marked for exclusion
0 matches, 0 marked for exclusion
Time elapsed: 0.42 minutes
Time elapsed: 0.42 minutes
0 matches have at least 100 capital letters
0 matches, 0 marked for exclusion
0 matches have at least 100 digits
0 matches have at least 100 capital letters
0 matches have at least 100 digits
Time elapsed: 0.42 minutes
0 matches have at least 100 capital letters
0 matches have at least 100 digits
0 matches, 0 marked for exclusion
0 matches have at least 4 T values in a sequence
0 matches, 0 marked for exclusion
Time elapsed: 0.43 minutes
0 matches have at least 100 capital letters
0 matches have at least 100 digits
Time elapsed: 0.43 minutes
0 matches have at least 100 capital letters


0 matches have at least 100 digits
0 matches have at least 4 T values in a sequence
0 matches have at least 4 T values in a sequence
Number of matches after splitting phrases: 1


Number of matches after splitting phrases: 1
Number of matches after splitting phrases: 1


0 matches have at least 4 T values in a sequence
1 matches have at least 4 T values in a sequence
0 matches have at least 4 T values in a sequence
Number of matches after splitting phrases: 0
0 matches have at least 4 T values in a sequence
0 matches have at least 4 T values in a sequence
Number of matches after splitting phrases: 1


Number of matches after splitting phrases: 1
Number of matches after splitting phrases: 1
Number of matches after splitting phrases: 1


Time elapsed: 0.4639725685119629 minutes


In [49]:
# Get TNM values from phrases
df_crc, s = get_tnm_values(df_crc, matches=matches, col='report_text_anon', pathology_prefix=False)




Extracting values from the phrase ...


100%|██████████| 7/7 [00:00<00:00, 851.88it/s]
100%|██████████| 7/7 [00:00<00:00, 1063.00it/s]
100%|██████████| 7/7 [00:00<00:00, 14956.76it/s]
100%|██████████| 7/7 [00:00<00:00, 20000.09it/s]
100%|██████████| 7/7 [00:00<00:00, 2584.06it/s]
100%|██████████| 7/7 [00:00<00:00, 1048.05it/s]
100%|██████████| 7/7 [00:00<00:00, 22584.71it/s]
100%|██████████| 7/7 [00:00<00:00, 28339.89it/s]
100%|██████████| 7/7 [00:00<00:00, 25266.89it/s]
100%|██████████| 7/7 [00:00<00:00, 16700.87it/s]
100%|██████████| 7/7 [00:00<00:00, 25619.66it/s]
100%|██████████| 7/7 [00:00<00:00, 17455.49it/s]
100%|██████████| 7/7 [00:00<00:00, 18954.25it/s]
100%|██████████| 7/7 [00:00<00:00, 22091.89it/s]
100%|██████████| 7/7 [00:00<00:00, 18101.19it/s]
100%|██████████| 7/7 [00:00<00:00, 28449.74it/s]
100%|██████████| 19/19 [00:00<00:00, 1010.60it/s]


Extracting additional perineural invasion


100%|██████████| 8/8 [00:00<00:00, 46281.98it/s]
0it [00:00, ?it/s]
0it [00:00, ?it/s]

Time elapsed: 0.03 seconds

Getting minimum and maximum values ...





Time elapsed: 0.00 minutes
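
The log line "Getting minimum and maximum values ..." refers to phrases that list several alternatives for one category, such as `pT3/2/1 N0 Mx` from the example reports. The sketch below illustrates the idea for the T category only. It is not the implementation used by `get_tnm_values`: the helper name `t_min_max` and the regular expression are hypothetical, and the pattern only handles numeric stages separated by `/`.

```python
import re


def t_min_max(phrase):
    """Return (min, max) of the numeric T stages in a phrase, or (None, None)."""
    # Hypothetical pattern: 'T' plus a digit 0-4, then optional '/digit' alternatives.
    match = re.search(r"T([0-4])((?:\s*/\s*[0-4])*)", phrase)
    if match is None:
        return None, None
    # First digit comes from group 1, any alternatives from group 2.
    digits = [int(match.group(1))]
    digits += [int(d) for d in re.findall(r"[0-4]", match.group(2))]
    return min(digits), max(digits)


print(t_min_max("pT3/2/1 N0 Mx"))  # (1, 3)
print(t_min_max("T1 N0 MX"))       # (1, 1)
```

Sub-stages such as `T4a`, non-numeric stages such as `Tis`, and other separators (`&`, `-`) are deliberately left out of this sketch and would need additional patterns.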
