# Task 1: Create a Prescription Parser using CRF
This task tests your ability to build a doctor's prescription parser using a Conditional Random Field (CRF) model.

Your job is to build a prescription parser that takes a prescription (a sentence) as input and labels each word in it with one of the predefined labels.

### Problem: SEQUENCE PREDICTION - Label words in a sentence
#### Input : Doctor Prescription in the form of a sentence split into tokens
- Ex: Take 2 tablets once a day for 10 days

#### Output : FHIR Labels
- ('Take', 'Method')
- ('2', 'Qty') 
- ('tablets', 'Form')
- ('once', 'Frequency')
- ('a', 'Period') 
- ('day', 'PeriodUnit')
- ('for', 'FOR')
- ('10', 'Duration')
- ('days', 'DurationUnit') 

### Major Steps
- Install the necessary libraries
- Import the libraries
- Create training data with labels
    - Split the sentence into tokens
    - Compute POS tags
    - Create triples
- Extract features
- Split the data into training and testing set
- Create CRF model
- Save the CRF model
- Load the CRF model
- Predict on test data
- Accuracy

#### Install the necessary libraries

In [1641]:
!pip install scikit-learn numpy pandas nltk python-crfsuite




#### Import the necessary libraries

In [1643]:
import numpy as np
import pandas as pd
import nltk
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report

### Input data (GIVEN)
#### Creating the inputs to the ML model in the following form:
- sigs --> ['take 3 tabs for 10 days']       INPUT SIG
- input_sigs --> [['take', '3', 'tabs', 'for', '10', 'days']]      TOKENS
- output_labels --> [['Method','Qty', 'Form', 'FOR', 'Duration', 'DurationUnit']]       LABELS

In [1645]:
sigs = ["for 5 to 6 days", "inject 2 units", "x 2 weeks", "x 3 days", "every day", "every 2 weeks", "every 3 days", "every 1 to 2 months", "every 2 to 6 weeks", "every 4 to 6 days", "take two to four tabs", "take 2 to 4 tabs", "take 3 tabs orally bid for 10 days at bedtime", "swallow three capsules tid orally", "take 2 capsules po every 6 hours", "take 2 tabs po for 10 days", "take 100 caps by mouth tid for 10 weeks", "take 2 tabs after an hour", "2 tabs every 4-6 hours", "every 4 to 6 hours", "q46h", "q4-6h", "2 hours before breakfast", "before 30 mins at bedtime", "30 mins before bed", "and 100 tabs twice a month", "100 tabs twice a month", "100 tabs once a month", "100 tabs thrice a month", "3 tabs daily for 3 days then 1 tab per day at bed", "30 tabs 10 days tid", "take 30 tabs for 10 days three times a day", "qid q6h", "bid", "qid", "30 tabs before dinner and bedtime", "30 tabs before dinner & bedtime", "take 3 tabs at bedtime", "30 tabs thrice daily for 10 days ", "30 tabs for 10 days three times a day", "Take 2 tablets a day", "qid for 10 days", "every day", "take 2 caps at bedtime", "apply 3 drops before bedtime", "take three capsules daily", "swallow 3 pills once a day", "swallow three pills thrice a day", "apply daily", "apply three drops before bedtime", "every 6 hours", "before food", "after food", "for 20 days", "for twenty days", "with meals"]
input_sigs = [['for', '5', 'to', '6', 'days'], ['inject', '2', 'units'], ['x', '2', 'weeks'], ['x', '3', 'days'], ['every', 'day'], ['every', '2', 'weeks'], ['every', '3', 'days'], ['every', '1', 'to', '2', 'months'], ['every', '2', 'to', '6', 'weeks'], ['every', '4', 'to', '6', 'days'], ['take', 'two', 'to', 'four', 'tabs'], ['take', '2', 'to', '4', 'tabs'], ['take', '3', 'tabs', 'orally', 'bid', 'for', '10', 'days', 'at', 'bedtime'], ['swallow', 'three', 'capsules', 'tid', 'orally'], ['take', '2', 'capsules', 'po', 'every', '6', 'hours'], ['take', '2', 'tabs', 'po', 'for', '10', 'days'], ['take', '100', 'caps', 'by', 'mouth', 'tid', 'for', '10', 'weeks'], ['take', '2', 'tabs', 'after', 'an', 'hour'], ['2', 'tabs', 'every', '4-6', 'hours'], ['every', '4', 'to', '6', 'hours'], ['q46h'], ['q4-6h'], ['2', 'hours', 'before', 'breakfast'], ['before', '30', 'mins', 'at', 'bedtime'], ['30', 'mins', 'before', 'bed'], ['and', '100', 'tabs', 'twice', 'a', 'month'], ['100', 'tabs', 'twice', 'a', 'month'], ['100', 'tabs', 'once', 'a', 'month'], ['100', 'tabs', 'thrice', 'a', 'month'], ['3', 'tabs', 'daily', 'for', '3', 'days', 'then', '1', 'tab', 'per', 'day', 'at', 'bed'], ['30', 'tabs', '10', 'days', 'tid'], ['take', '30', 'tabs', 'for', '10', 'days', 'three', 'times', 'a', 'day'], ['qid', 'q6h'], ['bid'], ['qid'], ['30', 'tabs', 'before', 'dinner', 'and', 'bedtime'], ['30', 'tabs', 'before', 'dinner', '&', 'bedtime'], ['take', '3', 'tabs', 'at', 'bedtime'], ['30', 'tabs', 'thrice', 'daily', 'for', '10', 'days'], ['30', 'tabs', 'for', '10', 'days', 'three', 'times', 'a', 'day'], ['take', '2', 'tablets', 'a', 'day'], ['qid', 'for', '10', 'days'], ['every', 'day'], ['take', '2', 'caps', 'at', 'bedtime'], ['apply', '3', 'drops', 'before', 'bedtime'], ['take', 'three', 'capsules', 'daily'], ['swallow', '3', 'pills', 'once', 'a', 'day'], ['swallow', 'three', 'pills', 'thrice', 'a', 'day'], ['apply', 'daily'], ['apply', 'three', 'drops', 'before', 'bedtime'], ['every', '6', 
'hours'], ['before', 'food'], ['after', 'food'], ['for', '20', 'days'], ['for', 'twenty', 'days'], ['with', 'meals']]
output_labels = [['FOR', 'Duration', 'TO', 'DurationMax', 'DurationUnit'], ['Method', 'Qty', 'Form'], ['FOR', 'Duration', 'DurationUnit'], ['FOR', 'Duration', 'DurationUnit'], ['EVERY', 'Period'], ['EVERY', 'Period', 'PeriodUnit'], ['EVERY', 'Period', 'PeriodUnit'], ['EVERY', 'Period', 'TO', 'PeriodMax', 'PeriodUnit'], ['EVERY', 'Period', 'TO', 'PeriodMax', 'PeriodUnit'], ['EVERY', 'Period', 'TO', 'PeriodMax', 'PeriodUnit'], ['Method', 'Qty', 'TO', 'Qty', 'Form'], ['Method', 'Qty', 'TO', 'Qty', 'Form'], ['Method', 'Qty', 'Form', 'PO', 'BID', 'FOR', 'Duration', 'DurationUnit', 'AT', 'WHEN'], ['Method', 'Qty', 'Form', 'TID', 'PO'], ['Method', 'Qty', 'Form', 'PO', 'EVERY', 'Period', 'PeriodUnit'], ['Method', 'Qty', 'Form', 'PO', 'FOR', 'Duration', 'DurationUnit'], ['Method', 'Qty', 'Form', 'BY', 'PO', 'TID', 'FOR', 'Duration', 'DurationUnit'], ['Method', 'Qty', 'Form', 'AFTER', 'Period', 'PeriodUnit'], ['Qty', 'Form', 'EVERY', 'Period', 'PeriodUnit'], ['EVERY', 'Period', 'TO', 'PeriodMax', 'PeriodUnit'], ['Q46H'], ['Q4-6H'], ['Qty', 'PeriodUnit', 'BEFORE', 'WHEN'], ['BEFORE', 'Qty', 'M', 'AT', 'WHEN'], ['Qty', 'M', 'BEFORE', 'WHEN'], ['AND', 'Qty', 'Form', 'Frequency', 'Period', 'PeriodUnit'], ['Qty', 'Form', 'Frequency', 'Period', 'PeriodUnit'], ['Qty', 'Form', 'Frequency', 'Period', 'PeriodUnit'], ['Qty', 'Form', 'Frequency', 'Period', 'PeriodUnit'], ['Qty', 'Form', 'Frequency', 'FOR', 'Duration', 'DurationUnit', 'THEN', 'Qty', 'Form', 'Frequency', 'PeriodUnit', 'AT', 'WHEN'], ['Qty', 'Form', 'Duration', 'DurationUnit', 'TID'], ['Method', 'Qty', 'Form', 'FOR', 'Duration', 'DurationUnit', 'Qty', 'TIMES', 'Period', 'PeriodUnit'], ['QID', 'Q6H'], ['BID'], ['QID'],['Qty', 'Form', 'BEFORE', 'WHEN', 'AND', 'WHEN'], ['Qty', 'Form', 'BEFORE', 'WHEN', 'AND', 'WHEN'], ['Method', 'Qty', 'Form', 'AT', 'WHEN'], ['Qty', 'Form', 'Frequency', 'DAILY', 'FOR', 'Duration', 'DurationUnit'], ['Qty', 'Form', 'FOR', 'Duration', 'DurationUnit', 'Frequency', 'TIMES', 'Period', 
'PeriodUnit'], ['Method', 'Qty', 'Form', 'Period', 'PeriodUnit'], ['QID', 'FOR', 'Duration', 'DurationUnit'], ['EVERY', 'PeriodUnit'], ['Method', 'Qty', 'Form', 'AT', 'WHEN'], ['Method', 'Qty', 'Form', 'BEFORE', 'WHEN'], ['Method', 'Qty', 'Form', 'DAILY'], ['Method', 'Qty', 'Form', 'Frequency', 'Period', 'PeriodUnit'], ['Method', 'Qty', 'Form', 'Frequency', 'Period', 'PeriodUnit'], ['Method', 'DAILY'], ['Method', 'Qty', 'Form', 'BEFORE', 'WHEN'], ['EVERY', 'Period', 'PeriodUnit'], ['BEFORE', 'FOOD'], ['AFTER', 'FOOD'], ['FOR', 'Duration', 'DurationUnit'], ['FOR', 'Duration', 'DurationUnit'], ['WITH', 'FOOD']]

In [1646]:
len(sigs), len(input_sigs) , len(output_labels)

(56, 56, 56)
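
Matching list lengths only show that there is one label list per sig. Because tokens and labels are paired by position, it can also help to confirm that each token list has exactly as many labels as tokens. A minimal sketch, where the helper name `find_misaligned` is hypothetical and the two short lists stand in for `input_sigs` and `output_labels`:

```python
def find_misaligned(token_lists, label_lists):
    """Return indices of sigs whose token and label counts differ."""
    if len(token_lists) != len(label_lists):
        raise ValueError("token_lists and label_lists differ in length")
    return [
        i for i, (tokens, labels) in enumerate(zip(token_lists, label_lists))
        if len(tokens) != len(labels)
    ]

# Stand-in data; in the notebook you would pass input_sigs and output_labels
demo_tokens = [['inject', '2', 'units'], ['every', 'day']]
demo_labels = [['Method', 'Qty', 'Form'], ['EVERY']]  # second entry is deliberately one label short
print(find_misaligned(demo_tokens, demo_labels))
```

An empty result means every sig is aligned; any index returned points at a sig where `zip` would silently drop tokens or labels.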

### Creating a Tuples Maker method
Write a function **tuples_maker(input_sigs, output_labels)** that pairs each token with its label and returns **output** in the form shown below.

Input(s): 
- input_sigs
- output_labels

Output:

[[('for', 'FOR'),
  ('5', 'Duration'),
  ('to', 'TO'),
  ('6', 'DurationMax'),
  ('days', 'DurationUnit')], [second sentence], ...]

In [1648]:
def tuples_maker(input_sigs, output_labels):
    
    sample_data = []
    for tokens, labels in zip(input_sigs, output_labels):
        sentence_tuples = [(token, label) for token, label in zip(tokens, labels)]
        sample_data.append(sentence_tuples)

    return sample_data

In [1649]:
result = tuples_maker(input_sigs, output_labels)
result


[[('for', 'FOR'),
  ('5', 'Duration'),
  ('to', 'TO'),
  ('6', 'DurationMax'),
  ('days', 'DurationUnit')],
 [('inject', 'Method'), ('2', 'Qty'), ('units', 'Form')],
 [('x', 'FOR'), ('2', 'Duration'), ('weeks', 'DurationUnit')],
 [('x', 'FOR'), ('3', 'Duration'), ('days', 'DurationUnit')],
 [('every', 'EVERY'), ('day', 'Period')],
 [('every', 'EVERY'), ('2', 'Period'), ('weeks', 'PeriodUnit')],
 [('every', 'EVERY'), ('3', 'Period'), ('days', 'PeriodUnit')],
 [('every', 'EVERY'),
  ('1', 'Period'),
  ('to', 'TO'),
  ('2', 'PeriodMax'),
  ('months', 'PeriodUnit')],
 [('every', 'EVERY'),
  ('2', 'Period'),
  ('to', 'TO'),
  ('6', 'PeriodMax'),
  ('weeks', 'PeriodUnit')],
 [('every', 'EVERY'),
  ('4', 'Period'),
  ('to', 'TO'),
  ('6', 'PeriodMax'),
  ('days', 'PeriodUnit')],
 [('take', 'Method'),
  ('two', 'Qty'),
  ('to', 'TO'),
  ('four', 'Qty'),
  ('tabs', 'Form')],
 [('take', 'Method'),
  ('2', 'Qty'),
  ('to', 'TO'),
  ('4', 'Qty'),
  ('tabs', 'Form')],
 [('take', 'Method'),
  ('3', 
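
Some labels, such as `Q46H`, appear only once in the given data, so the model has very little to learn from for them. Counting labels over the `(token, label)` pairs makes this visible. A minimal sketch, assuming the structure returned by `tuples_maker`; `label_counts` is a hypothetical helper and `demo` is stand-in data:

```python
from collections import Counter

def label_counts(tagged_sentences):
    """Count how often each label occurs across all (token, label) pairs."""
    return Counter(label for sentence in tagged_sentences for _, label in sentence)

# Stand-in data; in the notebook you would pass the output of tuples_maker
demo = [
    [('inject', 'Method'), ('2', 'Qty'), ('units', 'Form')],
    [('x', 'FOR'), ('2', 'Duration'), ('weeks', 'DurationUnit')],
    [('take', 'Method'), ('2', 'Qty'), ('tabs', 'Form')],
]
counts = label_counts(demo)
print(counts.most_common(3))
```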

### Creating the triples_maker( ) for feature extraction
- input: tuples_maker_output
- output: 
[[('for', 'IN', 'FOR'),
  ('5', 'CD', 'Duration'),
  ('to', 'TO', 'TO'),
  ('6', 'CD', 'DurationMax'),
  ('days', 'NNS', 'DurationUnit')], [second sentence], ... ]

In [1651]:

def triples_maker(whole_data):
    sample_data = []
    
    for sentence in whole_data:
        # Extract tokens and labels separately
        tokens, labels = zip(*sentence)
        
        # Get POS tags for the tokens
        pos_tags = nltk.pos_tag(tokens)
        
        # Form triples of (token, POS, label) and append to sample_data
        sentence_data = [(token, pos, label) for (token, pos), label in zip(pos_tags, labels)]
        sample_data.append(sentence_data)    
    return sample_data 



In [1652]:
whole_data = tuples_maker(input_sigs, output_labels)

sample_data = triples_maker(whole_data)
sample_data


[[('for', 'IN', 'FOR'),
  ('5', 'CD', 'Duration'),
  ('to', 'TO', 'TO'),
  ('6', 'CD', 'DurationMax'),
  ('days', 'NNS', 'DurationUnit')],
 [('inject', 'JJ', 'Method'), ('2', 'CD', 'Qty'), ('units', 'NNS', 'Form')],
 [('x', 'RB', 'FOR'),
  ('2', 'CD', 'Duration'),
  ('weeks', 'NNS', 'DurationUnit')],
 [('x', 'RB', 'FOR'),
  ('3', 'CD', 'Duration'),
  ('days', 'NNS', 'DurationUnit')],
 [('every', 'DT', 'EVERY'), ('day', 'NN', 'Period')],
 [('every', 'DT', 'EVERY'),
  ('2', 'CD', 'Period'),
  ('weeks', 'NNS', 'PeriodUnit')],
 [('every', 'DT', 'EVERY'),
  ('3', 'CD', 'Period'),
  ('days', 'NNS', 'PeriodUnit')],
 [('every', 'DT', 'EVERY'),
  ('1', 'CD', 'Period'),
  ('to', 'TO', 'TO'),
  ('2', 'CD', 'PeriodMax'),
  ('months', 'NNS', 'PeriodUnit')],
 [('every', 'DT', 'EVERY'),
  ('2', 'CD', 'Period'),
  ('to', 'TO', 'TO'),
  ('6', 'CD', 'PeriodMax'),
  ('weeks', 'NNS', 'PeriodUnit')],
 [('every', 'DT', 'EVERY'),
  ('4', 'CD', 'Period'),
  ('to', 'TO', 'TO'),
  ('6', 'CD', 'PeriodMax'),
  ('
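
`nltk.pos_tag` needs a tagger resource that is downloaded separately with `nltk.download(...)`; the exact resource name depends on the NLTK version, so it is not spelled out here. The sketch below shows one way to keep the pipeline running when NLTK or that resource is missing. The helper `pos_tag_or_default` and its `"NN"` fallback are assumptions for illustration only; a fallback tag carries no information, so real tags are preferable for training.

```python
def pos_tag_or_default(tokens, default_tag="NN"):
    """Tag tokens with nltk.pos_tag, or with default_tag if NLTK or its tagger data is unavailable."""
    try:
        import nltk
        return nltk.pos_tag(list(tokens))
    except (ImportError, LookupError):
        # LookupError is what NLTK raises when a required resource is not installed
        return [(token, default_tag) for token in tokens]

tagged = pos_tag_or_default(['take', '2', 'tabs'])
print(tagged)
```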

### Creating the features extractor method (GIVEN as a BASELINE)
#### The features used are:
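- BOS, EOS, lowercased word, 2- and 3-character suffixes, uppercase, title case, digit, and POS tag, plus the lowercase, case, digit, and POS-tag features of the previous and next tokens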
#### Feel free to include more features

In [1654]:
def token_to_features(doc, i):
    word = doc[i][0]
    postag = doc[i][1]

    # Common features for all words
    features = [
        'bias',
        'word.lower=' + word.lower(),
        'word[-3:]=' + word[-3:],
        'word[-2:]=' + word[-2:],
        'word.isupper=%s' % word.isupper(),
        'word.istitle=%s' % word.istitle(),
        'word.isdigit=%s' % word.isdigit(),
        'postag=' + postag
    ]

    # Features for words that are not
    # at the beginning of a document
    if i > 0:
        word1 = doc[i-1][0]
        postag1 = doc[i-1][1]
        features.extend([
            '-1:word.lower=' + word1.lower(),
            '-1:word.istitle=%s' % word1.istitle(),
            '-1:word.isupper=%s' % word1.isupper(),
            '-1:word.isdigit=%s' % word1.isdigit(),
            '-1:postag=' + postag1
        ])
    else:
        # Indicate that it is the 'beginning of a document'
        features.append('BOS')

    # Features for words that are not
    # at the end of a document
    if i < len(doc)-1:
        word1 = doc[i+1][0]
        postag1 = doc[i+1][1]
        features.extend([
            '+1:word.lower=' + word1.lower(),
            '+1:word.istitle=%s' % word1.istitle(),
            '+1:word.isupper=%s' % word1.isupper(),
            '+1:word.isdigit=%s' % word1.isdigit(),
            '+1:postag=' + postag1
        ])
    else:
        # Indicate that it is the 'end of a document'
        features.append('EOS')

    return features
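
As one possible extension of the baseline, the sketch below adds a coarse word shape, a number-word flag, and a numeric-range flag, which might help with tokens such as `4-6`, `q4-6h`, or `three`. The names `word_shape`, `extra_token_features`, and `NUMBER_WORDS` are hypothetical, and whether these features improve the model has not been tested here.

```python
import re

# Small, illustrative set of number words; extend as needed
NUMBER_WORDS = {'one', 'two', 'three', 'four', 'five', 'six', 'seven',
                'eight', 'nine', 'ten', 'twenty'}

def word_shape(word):
    """Collapse each run of digits to 'd' and each run of ASCII letters to 'x'."""
    return re.sub(
        r'[0-9]+|[A-Za-z]+',
        lambda m: 'd' if m.group()[0].isdigit() else 'x',
        word,
    )

def extra_token_features(word):
    """Extra string features that could be appended to the list built by token_to_features."""
    return [
        'word.shape=' + word_shape(word),
        'word.isnumword=%s' % (word.lower() in NUMBER_WORDS),
        'word.isrange=%s' % bool(re.fullmatch(r'\d+-\d+', word)),
    ]

print(extra_token_features('4-6'))
print(extra_token_features('q4-6h'))
```

Inside `token_to_features`, these could be added with `features.extend(extra_token_features(word))` before the context features.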

### Running the feature extractor on the training data 
- Feature extraction
- Train-test-split

In [1656]:

# Function to extract features from all tokens in a document
def get_features(doc):
    return [token_to_features(doc, i) for i in range(len(doc))]

# Function to extract labels; the label is the last element of each (token, POS, label) triple
def get_labels(doc):
    return [label for _, _, label in doc]

# Prepare dataset
X = [get_features(doc) for doc in sample_data]
y = [get_labels(doc) for doc in sample_data]

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
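
With only 56 sigs, a single 80/20 split leaves about a dozen sigs for testing, so scores can swing with the choice of `random_state`. One option is k-fold cross-validation, where every sig is used for testing exactly once. A minimal sketch in plain Python; `k_fold_indices` is a hypothetical helper, not a scikit-learn function, and it assigns samples to folds round-robin without shuffling:

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs; sample i belongs to fold i % k."""
    folds = [list(range(start, n_samples, k)) for start in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

# 56 matches the number of sigs in this notebook
for train_idx, test_idx in k_fold_indices(56, 5):
    print(len(train_idx), len(test_idx))
```

Each pair of index lists can then be used to select training and test sequences from `X` and `y`, training one model per fold.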

### Training the CRF model with the features extracted using the feature extractor method

In [1658]:
import pycrfsuite


In [1659]:
# Initialise the trainer
trainer = pycrfsuite.Trainer(algorithm='lbfgs')

# Submit training data to the trainer
# (xseq/yseq are used so the loop does not overwrite the global label list `y`)
for xseq, yseq in zip(X_train, y_train):
    trainer.append(xseq, yseq)

trainer.set_params({
    'c1': 0.1,              # Coefficient for L1 regularization
    'c2': 0.1,              # Coefficient for L2 regularization
    'max_iterations': 156,  # Maximum number of L-BFGS iterations
    'epsilon': 1e-6,        # Convergence tolerance
    'delta': 1e-6,          # Stop when the objective improves by no more than this over recent iterations
    'feature.possible_transitions': True  # Also generate transition features for label pairs not seen in training
})

# Passing a file name to train() saves the model to that file once training finishes
trainer.train('crf_m_o_d_e_l.crfsuite')


Feature generation
type: CRF1d
feature.minfreq: 0.000000
feature.possible_states: 0
feature.possible_transitions: 1
0....1....2....3....4....5....6....7....8....9....10
Number of features: 1763
Seconds required: 0.001

L-BFGS optimization
c1: 0.100000
c2: 0.100000
num_memories: 6
max_iterations: 156
epsilon: 0.000001
stop: 10
delta: 0.000001
linesearch: MoreThuente
linesearch.max_iterations: 20

***** Iteration #1 *****
Loss: 568.143184
Feature norm: 1.000000
Error norm: 116.003715
Active features: 1749
Line search trials: 1
Line search step: 0.006435
Seconds required for this iteration: 0.000

***** Iteration #2 *****
Loss: 387.359975
Feature norm: 5.053745
Error norm: 145.752464
Active features: 1489
Line search trials: 1
Line search step: 1.000000
Seconds required for this iteration: 0.000

***** Iteration #3 *****
Loss: 282.295063
Feature norm: 5.958821
Error norm: 65.263181
Active features: 1497
Line search trials: 1
Line search step: 1.000000
Seconds required for this iteration: 
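
For reference, `c1` and `c2` weight an L1 and an L2 penalty on the feature weights $w$. Under the usual elastic-net form (a sketch; the exact scaling constants CRFsuite applies are an assumption here), training minimizes the negative log-likelihood of the training sequences plus both penalties:

```latex
\mathcal{L}(w) = -\sum_{n} \log p\left(y^{(n)} \mid x^{(n)}; w\right)
                 + c_1 \lVert w \rVert_1
                 + c_2 \lVert w \rVert_2^2
```

The L1 term tends to push some weights to exactly zero, which is consistent with `Active features` dropping from 1749 to 1489 between the first two iterations in the log above.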

### Predicting the test data with the built model

In [1661]:

# Load the trained model
crf_model = pycrfsuite.Tagger()
crf_model.open('crf_m_o_d_e_l.crfsuite')


<contextlib.closing at 0x3169409b0>

In [1662]:
# Predict the labels for the test data
y_pred = [crf_model.tag(x) for x in X_test]

# Flatten the predictions and true labels for evaluation
y_pred_flat = [item for sublist in y_pred for item in sublist]
y_test_flat = [item for sublist in y_test for item in sublist]
y_pred

[['FOR', 'Duration', 'TO', 'PeriodMax', 'PeriodUnit'],
 ['EVERY', 'Period', 'PeriodUnit'],
 ['QID'],
 ['Method', 'Qty', 'Form', 'TID', 'DAILY'],
 ['EVERY', 'Period', 'TO', 'PeriodMax', 'PeriodUnit'],
 ['EVERY', 'Period', 'PeriodUnit'],
 ['Qty', 'Form', 'BEFORE', 'WHEN', 'AND', 'WHEN'],
 ['Qty', 'Form', 'Frequency', 'Period', 'PeriodUnit'],
 ['Method', 'Qty', 'Form', 'BEFORE', 'WHEN'],
 ['Method',
  'Qty',
  'Form',
  'Frequency',
  'PeriodUnit',
  'FOR',
  'Duration',
  'DurationUnit',
  'AT',
  'WHEN'],
 ['FOR', 'Duration', 'DurationUnit'],
 ['FOR', 'Duration', 'DurationUnit']]

In [1663]:

# Calculate accuracy
accuracy = accuracy_score(y_test_flat, y_pred_flat)
accuracy


0.8888888888888888
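
The figure above is token-level accuracy: the share of individual tokens labeled correctly. For a parser, it can also be useful to know how many whole sigs were labeled without any error. A minimal sketch of both measures in plain Python; the function names and the demo sequences are illustrative, not taken from the test set:

```python
def token_accuracy(y_true, y_pred):
    """Share of tokens whose predicted label matches the true label."""
    pairs = [(t, p) for ts, ps in zip(y_true, y_pred) for t, p in zip(ts, ps)]
    return sum(t == p for t, p in pairs) / len(pairs)

def sequence_accuracy(y_true, y_pred):
    """Share of sigs whose entire label sequence is correct."""
    return sum(list(ts) == list(ps) for ts, ps in zip(y_true, y_pred)) / len(y_true)

# Illustrative data; in the notebook you would pass y_test and y_pred
demo_true = [['FOR', 'Duration', 'DurationUnit'], ['EVERY', 'Period', 'PeriodUnit']]
demo_pred = [['FOR', 'Duration', 'DurationUnit'], ['EVERY', 'Period', 'DurationUnit']]
print(token_accuracy(demo_true, demo_pred))
print(sequence_accuracy(demo_true, demo_pred))
```

In the demo, five of six tokens are correct but only one of two sigs is fully correct, which shows how the two measures can diverge.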

In [1664]:
# Generate the classification report; zero_division=1 reports undefined metrics
# (a label with no predicted or no true samples) as 1.0 instead of raising a warning

report = classification_report(y_test_flat, y_pred_flat, zero_division=1)

# Print the classification report
print("Classification Report:")
print(report)



Classification Report:
              precision    recall  f1-score   support

         AND       1.00      1.00      1.00         1
          AT       1.00      1.00      1.00         1
      BEFORE       1.00      1.00      1.00         2
         BID       1.00      0.00      0.00         2
       DAILY       0.00      1.00      0.00         0
    Duration       1.00      1.00      1.00         4
 DurationMax       1.00      0.00      0.00         1
DurationUnit       1.00      0.75      0.86         4
       EVERY       1.00      1.00      1.00         3
         FOR       1.00      1.00      1.00         4
        Form       1.00      1.00      1.00         5
   Frequency       0.50      1.00      0.67         1
      Method       1.00      1.00      1.00         3
          PO       1.00      0.00      0.00         2
      Period       1.00      1.00      1.00         4
   PeriodMax       0.50      1.00      0.67         1
  PeriodUnit       0.67      1.00      0.80         4
    

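The model reaches a token-level accuracy of 0.89 on the 12 held-out sigs, but precision, recall, and F1 vary widely across labels. Some labels (like AND, Method, and Qty) have perfect scores, while BID, PO, and DurationMax have zero recall: none of their test tokens were labeled correctly. Because `zero_division=1` reports undefined metrics as 1.00, the precision of 1.00 for BID and PO and the recall of 1.00 for DAILY (support 0) are artifacts of that setting rather than real performance. With at most five test tokens per label in the rows shown, these per-label scores are rough estimates at best.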
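
To see which labels are confused with which, the flattened true and predicted labels can be compared pairwise. A minimal sketch; `confusion_pairs` is a hypothetical helper and the demo lists are made up for illustration:

```python
from collections import Counter

def confusion_pairs(y_true_flat, y_pred_flat):
    """Count (true, predicted) label pairs for mislabeled tokens only."""
    return Counter((t, p) for t, p in zip(y_true_flat, y_pred_flat) if t != p)

# Illustrative data; in the notebook you would pass y_test_flat and y_pred_flat
demo_true = ['Method', 'Qty', 'Form', 'PO', 'BID']
demo_pred = ['Method', 'Qty', 'Form', 'Frequency', 'TID']
for (true, pred), n in confusion_pairs(demo_true, demo_pred).most_common():
    print(f'{true} -> {pred}: {n}')
```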

### Putting all the prediction logic inside a predict method

In [1667]:
import pycrfsuite

def predict(sig):
    """
    Predict labels for the given medical prescription sig using a CRF model.
    
    @param sig: A string representing a medical prescription sig written by a doctor.
    @return: A list of lists containing predicted labels for the given sig.
    
    >>> predict('2 tabs every 4 hours')
    [['Qty', 'Form', 'EVERY', 'Period', 'PeriodUnit']]
    >>> predict('2 tabs with food')
    [['Qty', 'Form', 'WITH', 'FOOD']]
    >>> predict('2 tabs qid x 30 days')
    [['Qty', 'Form', 'QID', 'FOR', 'Duration', 'DurationUnit']]
    """
    
    # Preprocess the input string into tokens
    tokens = sig.split()
    
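    # Assign a dummy POS tag ("NN") and an empty label to every token.
    # Note: training used nltk.pos_tag, so these POS features differ from the ones
    # the model was trained on; nltk.pos_tag(tokens) would keep them consistent.
    tokens_with_tags = [(word, "NN", "") for word in tokens]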
    
    # Convert the tokens into the format expected by token_to_features
    features = get_features(tokens_with_tags)  
    
    # Load the trained CRF model
    crf_model = pycrfsuite.Tagger()
    crf_model.open('crf_m_o_d_e_l.crfsuite')
    
    # Use the trained CRF model to make predictions
    predicted_labels = crf_model.tag(features)  
    
    # Return the predicted labels wrapped in a list to match expected output format
    return [predicted_labels]
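
All training tokens are lowercase, so the `word.isupper` and `word.istitle` features are only ever seen as `False` during training. An input such as `TID` or `Take` therefore produces feature values the model has never seen. Lowercasing the sig before tokenizing is one way to avoid that. A minimal sketch, with `normalize_sig` as a hypothetical helper that could replace `sig.split()` in `predict`:

```python
def normalize_sig(sig):
    """Lowercase a sig and split it on whitespace, matching the lowercase training tokens."""
    return sig.lower().split()

print(normalize_sig("  Take 2 Tablets  a day "))
```

Whether this improves predictions for mixed-case sigs would need to be checked against the trained model.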



### Sample predictions

In [1669]:
predictions = predict("take 2 tabs every 6 hours x 10 days")
predictions

[['Method',
  'Qty',
  'Form',
  'EVERY',
  'Period',
  'PeriodUnit',
  'FOR',
  'Duration',
  'DurationUnit']]

In [1670]:
predictions = predict("2 capsu for 10 day at bed")
predictions

[['Qty', 'Form', 'FOR', 'Duration', 'DurationUnit', 'AT', 'WHEN']]

In [1671]:
predictions = predict("2 capsu for 10 days at bed")
predictions

[['Qty', 'Form', 'FOR', 'Duration', 'DurationUnit', 'AT', 'WHEN']]

In [1672]:
predictions = predict("5 days 2 tabs at bed")
predictions

[['Duration', 'DurationUnit', 'Qty', 'Form', 'AT', 'WHEN']]

In [1673]:
predictions = predict("3 tabs qid x 10 weeks")
predictions

[['Qty', 'Form', 'QID', 'FOR', 'Duration', 'DurationUnit']]

In [1674]:
predictions = predict("x 30 days")
predictions

[['FOR', 'Duration', 'DurationUnit']]

In [1675]:
predictions = predict("x 20 months")
predictions

[['FOR', 'Period', 'PeriodUnit']]

In [1676]:
predictions = predict("take 2 tabs po tid for 10 days")
predictions

[['Method', 'Qty', 'Form', 'PO', 'TID', 'FOR', 'Duration', 'DurationUnit']]

In [1677]:
predictions = predict("take 2 capsules po every 6 hours")
predictions

[['Method', 'Qty', 'Form', 'PO', 'EVERY', 'Period', 'PeriodUnit']]

In [1678]:
predictions = predict("inject 2 units pu tid")
predictions

[['Method', 'Qty', 'Form', 'PO', 'TID']]

In [1679]:
predictions = predict("swallow 3 caps tid by mouth")
predictions

[['Method', 'Qty', 'Form', 'TID', 'BY', 'PO']]

In [1680]:
predictions = predict("inject 3 units orally")
predictions

[['Method', 'Qty', 'Form', 'PeriodUnit']]

In [1681]:
predictions = predict("orally take 3 tabs tid")
predictions

[['Method', 'Method', 'Qty', 'Form', 'TID']]

In [1682]:
predictions = predict("by mouth take three caps")
predictions

[['BY', 'PO', 'Method', 'Qty', 'Form']]

In [1683]:
predictions = predict("take 3 tabs orally three times a day for 10 days at bedtime")
predictions

[['Method',
  'Qty',
  'Form',
  'Frequency',
  'Frequency',
  'TIMES',
  'Period',
  'PeriodUnit',
  'FOR',
  'Duration',
  'DurationUnit',
  'AT',
  'WHEN']]

In [1684]:
predictions = predict("take 3 tabs orally bid for 10 days at bedtime")
predictions

[['Method',
  'Qty',
  'Form',
  'Frequency',
  'TID',
  'FOR',
  'Duration',
  'DurationUnit',
  'AT',
  'WHEN']]

In [1685]:
predictions = predict("take 3 tabs bid orally at bed")
predictions

[['Method', 'Qty', 'Form', 'Frequency', 'PeriodUnit', 'AT', 'WHEN']]

In [1686]:
predictions = predict("take 10 capsules by mouth qid")
predictions

[['Method', 'Qty', 'Form', 'BY', 'PO', 'QID']]

In [1687]:
predictions = predict("inject 10 units orally qid x 3 months")
predictions

[['Method', 'Qty', 'Form', 'Frequency', 'QID', 'Q6H', 'Period', 'PeriodUnit']]

In [1688]:
prediction = predict("please take 2 tablets per day for a month in the morning and evening each day")
prediction

In [1689]:
prediction = predict("Amoxcicillin QID 30 tablets")
prediction

In [1690]:
prediction = predict("take 3 tabs TID for 90 days with food")
prediction

[['Method',
  'Qty',
  'Form',
  'Frequency',
  'FOR',
  'Duration',
  'DurationUnit',
  'WITH',
  'FOOD']]

In [1691]:
prediction = predict("with food take 3 tablets per day for 90 days")
prediction

[['WITH',
  'FOOD',
  'Method',
  'Qty',
  'Form',
  'Frequency',
  'PeriodUnit',
  'FOR',
  'Duration',
  'DurationUnit']]

In [1692]:
prediction = predict("with food take 3 tablets per week for 90 weeks")
print(prediction)

[['WITH', 'FOOD', 'Method', 'Qty', 'Form', 'Frequency', 'PeriodUnit', 'FOR', 'Duration', 'DurationUnit']]


In [1693]:
prediction = predict("take 2-4 tabs")
print(prediction)

[['Method', 'Qty', 'Form']]


In [1694]:
prediction = predict("take 2 to 4 tabs")
prediction

[['Method', 'Qty', 'TO', 'Qty', 'Form']]

In [1695]:
prediction = predict("take two to four tabs")
prediction

[['Method', 'Qty', 'TO', 'Qty', 'Form']]

In [1696]:
prediction = predict("take 2-4 tabs for 8 to 9 days")
prediction

[['Method', 'Qty', 'Form', 'FOR', 'Duration', 'TO', 'PeriodMax', 'PeriodUnit']]

In [1697]:
prediction = predict("take 20 tabs every 6 to 8 days")
prediction

[['Method', 'Qty', 'Form', 'EVERY', 'Period', 'TO', 'PeriodMax', 'PeriodUnit']]

In [1698]:
prediction = predict("take 2 tabs every 4 to 6 days")
prediction

[['Method', 'Qty', 'Form', 'EVERY', 'Period', 'TO', 'PeriodMax', 'PeriodUnit']]

In [1699]:
prediction = predict("take 2 tabs every 2 to 10 weeks")
prediction

[['Method',
  'Qty',
  'Form',
  'EVERY',
  'Period',
  'TO',
  'Duration',
  'DurationUnit']]

In [1700]:
prediction = predict("take 2 tabs every 4 to 6 days")
prediction

[['Method', 'Qty', 'Form', 'EVERY', 'Period', 'TO', 'PeriodMax', 'PeriodUnit']]

In [1701]:
prediction = predict("take 2 tabs every 2 to 10 months")
prediction

[['Method', 'Qty', 'Form', 'EVERY', 'Period', 'TO', 'PeriodMax', 'PeriodUnit']]

In [1702]:
prediction = predict("every 60 mins")
prediction

[['EVERY', 'Period', 'PeriodUnit']]

In [1703]:
prediction = predict("every 10 mins")
prediction

[['EVERY', 'Period', 'PeriodUnit']]

In [1704]:
prediction = predict("every two to four months")
prediction

[['EVERY', 'Period', 'TO', 'PeriodMax', 'PeriodUnit']]

In [1705]:
prediction = predict("take 2 tabs every 3 to 4 days")
prediction

[['Method', 'Qty', 'Form', 'EVERY', 'Period', 'TO', 'PeriodMax', 'PeriodUnit']]

In [1706]:
prediction = predict("every 3 to 4 days take 20 tabs")
prediction

[['EVERY',
  'Period',
  'TO',
  'Duration',
  'DurationUnit',
  'Method',
  'Qty',
  'Form']]

In [1707]:
prediction = predict("once in every 3 days take 3 tabs")
prediction

[['Qty', 'Form', 'EVERY', 'Period', 'PeriodUnit', 'Method', 'Qty', 'Form']]

In [1708]:
prediction = predict("take 3 tabs once in every 3 days")
prediction

[['Method', 'Qty', 'Form', 'Frequency', 'PO', 'EVERY', 'Period', 'PeriodUnit']]

In [1709]:
prediction = predict("orally take 20 tabs every 4-6 weeks")
prediction

[['Method', 'Method', 'Qty', 'Form', 'EVERY', 'Period', 'PeriodUnit']]

In [1710]:
prediction = predict("10 tabs x 2 days")
prediction

[['Qty', 'Form', 'FOR', 'Duration', 'DurationUnit']]

In [1711]:
prediction = predict("3 capsule x 15 days")
prediction

[['Qty', 'Form', 'FOR', 'Duration', 'DurationUnit']]

In [1712]:
prediction = predict("10 tabs")
prediction

[['Qty', 'Form']]
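
`predict` returns labels only, while the task statement describes the output as `(token, label)` pairs. The sketch below pairs the two back up. `pair_tokens_with_labels` is a hypothetical helper; it assumes the same whitespace tokenization that `predict` uses, and the demo labels are copied from the `"10 tabs x 2 days"` output above rather than produced by a live model call.

```python
def pair_tokens_with_labels(sig, predictions):
    """Zip the whitespace tokens of sig with the first label sequence in predictions."""
    tokens = sig.split()
    labels = predictions[0]
    if len(tokens) != len(labels):
        raise ValueError("token and label counts differ")
    return list(zip(tokens, labels))

demo_sig = "10 tabs x 2 days"
demo_predictions = [['Qty', 'Form', 'FOR', 'Duration', 'DurationUnit']]
print(pair_tokens_with_labels(demo_sig, demo_predictions))
```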