# Predicting entities using a trained model

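Import the required modules. The parent directory is appended to `sys.path` so that the project packages can be imported from this notebook.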

In [1]:
from os import getcwd, path
import sys
import matplotlib.pyplot as plt

BASE_PATH = path.dirname(getcwd())
sys.path.append(BASE_PATH)

from entities_recognition.bilstm.predict import load_model, predict
from config import START_TAG, STOP_TAG

As in training, prediction needs two inputs:
- A list of text strings to run prediction on
- The tag dictionary (the same one used for training)

In [2]:
test_data = [
    'Contact us: hello_vietnam@yahoo.com',
    'hello.sunbox@gmail.com - drop us an email here!~',
    'This is a sentence with 2 email addresses. First one is luungoc2005@yahoo.com and second one: ngoc.nguyen@2359media.com'
]

tag_to_ix = {
    '-': 0,
    'B-EMAIL': 1,
    'I-EMAIL': 2,
    'O-EMAIL': 3,
    START_TAG: 4,
    STOP_TAG: 5
}

Load the pretrained model (its path is defined in `bilstm/train.py`). The `tag_to_ix` dictionary used during training must be passed in as a parameter so that the model can be reconstructed.
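
Because the same tag dictionary is needed at training and at prediction time, one option is to store it in a file next to the model and read it back here instead of retyping it. The following is only a sketch of that idea: `save_tags`, `load_tags`, and the file name `tag_to_ix.json` are hypothetical helpers and not part of this project, and the `START_TAG` / `STOP_TAG` values are placeholders for the constants imported from `config`.

```python
import json
import os
import tempfile

# Placeholder values (assumption); the notebook imports the real ones from `config`.
START_TAG = '<START>'
STOP_TAG = '<STOP>'

tag_to_ix = {
    '-': 0,
    'B-EMAIL': 1,
    'I-EMAIL': 2,
    'O-EMAIL': 3,
    START_TAG: 4,
    STOP_TAG: 5
}


def save_tags(tags, file_path):
    """Hypothetical helper: write the tag dictionary to a JSON file."""
    with open(file_path, 'w', encoding='utf-8') as f:
        json.dump(tags, f, indent=2)


def load_tags(file_path):
    """Hypothetical helper: read the tag dictionary back from a JSON file."""
    with open(file_path, encoding='utf-8') as f:
        return json.load(f)


# Round trip through a temporary directory to show that the mapping survives unchanged.
tags_path = os.path.join(tempfile.mkdtemp(), 'tag_to_ix.json')
save_tags(tag_to_ix, tags_path)
restored = load_tags(tags_path)

if restored != tag_to_ix:
    raise ValueError('Tag dictionary does not match the one used for training')
```

The restored dictionary could then be passed to `load_model` and `predict` in place of the hand-written one.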

In [3]:
model = load_model(tag_to_ix)

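Run the prediction function. It prints the raw predicted tag indices for each input, followed by the extracted entities, and returns the entities as a list of dictionaries (one per input).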

In [4]:
predict(model, test_data, tag_to_ix)

Raw predicted tags:
Importing /Users/2359media/Documents/botbot-nlp/data/glove/glove.6B.300d.pickle...
[0, 0, 0, 0, 0, 1, 2, 2, 2, 2, 2, 3]
[1, 2, 2, 2, 2, 2, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 2, 2, 2, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 2, 2, 2, 3]

---
Input: Contact us: hello_vietnam@yahoo.com
Output: 
{'EMAIL': ['hello_vietnam@yahoo.com']}

Input: hello.sunbox@gmail.com - drop us an email here!~
Output: 
{'EMAIL': ['hello.sunbox@gmail.com']}

Input: This is a sentence with 2 email addresses. First one is luungoc2005@yahoo.com and second one: ngoc.nguyen@2359media.com
Output: 
{'EMAIL': ['luungoc2005@yahoo.com', 'nguyen@2359media.com']}



[{'EMAIL': ['hello_vietnam@yahoo.com']},
 {'EMAIL': ['hello.sunbox@gmail.com']},
 {'EMAIL': ['luungoc2005@yahoo.com', 'nguyen@2359media.com']}]
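
The raw predicted tags printed above are indices into `tag_to_ix`. How they might be turned into the entity dictionaries can be sketched as follows. This is an illustration, not the project's actual implementation: `decode_entities` is a hypothetical helper, the token lists are an assumed tokenization (whitespace and punctuation as separate tokens, chosen so that the token counts match the tag sequences), and the `O-` prefix is assumed to mark the last token of an entity, as the output suggests.

```python
# Index-to-tag mapping for the four tags that appear in the raw output.
# START_TAG and STOP_TAG are left out because they do not occur in it.
ix_to_tag = {0: '-', 1: 'B-EMAIL', 2: 'I-EMAIL', 3: 'O-EMAIL'}


def decode_entities(tokens, tag_ids, ix_to_tag, outside_tag='-'):
    """Hypothetical helper: group tagged tokens into {entity_type: [text, ...]}."""
    entities = {}
    current_tokens = []
    current_type = None

    def flush():
        # Store the entity collected so far (if any) and reset the state.
        nonlocal current_tokens, current_type
        if current_tokens:
            entities.setdefault(current_type, []).append(''.join(current_tokens))
        current_tokens, current_type = [], None

    for token, tag_id in zip(tokens, tag_ids):
        tag = ix_to_tag[tag_id]
        if tag == outside_tag:
            flush()
            continue
        prefix, _, entity_type = tag.partition('-')
        # A 'B' tag, or a tag of a different type, starts a new entity.
        if prefix == 'B' or entity_type != current_type:
            flush()
            current_type = entity_type
        current_tokens.append(token)
        # Assumption: an 'O' prefix closes the current entity.
        if prefix == 'O':
            flush()
    flush()
    return entities


# First input, with the assumed tokenization and the tag indices printed above.
tokens = ['Contact', ' ', 'us', ':', ' ',
          'hello', '_', 'vietnam', '@', 'yahoo', '.', 'com']
tag_ids = [0, 0, 0, 0, 0, 1, 2, 2, 2, 2, 2, 3]
first = decode_entities(tokens, tag_ids, ix_to_tag)

# Tail of the third input: no 'B-EMAIL' tag, and the first two tokens are tagged '-'.
tail_tokens = ['ngoc', '.', 'nguyen', '@', '2359', 'media', '.', 'com']
tail_tag_ids = [0, 0, 2, 2, 2, 2, 2, 3]
tail = decode_entities(tail_tokens, tail_tag_ids, ix_to_tag)
```

Under these assumptions, `first` is `{'EMAIL': ['hello_vietnam@yahoo.com']}` and `tail` is `{'EMAIL': ['nguyen@2359media.com']}`. The second case mirrors the third prediction above: the tag sequence for the last address contains no `1` (`B-EMAIL`), which would explain why the model returned `nguyen@2359media.com` rather than the full `ngoc.nguyen@2359media.com`.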