<a href="https://colab.research.google.com/github/SuryaPradeepM/LLM_applications/blob/main/Promptify_Pkg_NER.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
%%capture
# !git clone https://github.com/promptslab/Promptify.git
!pip3 install openai
!pip3 install git+https://github.com/promptslab/Promptify.git

<h2>Features 🚀</h2>
<ul>
  <li>🧙‍♀️ NER in 2 lines of code with no training data required</li>
  <li>🔨 Easily add one shot, two shot, or few shot examples to the prompt</li>
  <li>✌ Output always provided as a Python object (e.g. list, dictionary) for easy parsing and filtering</li>
  <li>💥 Custom examples and samples can be easily added to the prompt</li>
  <li>💰 Optimized prompts to reduce OpenAI token costs (coming soon)</li>
</ul>




### Define an LLM (such as GPT-3) ✅

In [10]:
# %cd /content/Promptify

import ast
import json
import os
from pprint import pprint

from IPython.display import HTML, Markdown, display
from promptify import OpenAI, Prompter


# Read the OpenAI API key from the environment rather than hard-coding it.
# OPENAI_API_KEY is an assumed variable name; set it before running this cell.
api_key = os.environ["OPENAI_API_KEY"]


# Create an instance of the OpenAI model. All OpenAI models are currently supported;
# more generative models from Hugging Face and other platforms are planned.
model = OpenAI(api_key)
nlp_prompter = Prompter(model)


# Example text for demonstration
test = """

Please find detailed below our bank details as requested:- 
BANK NAME THE ROYAL BANK OF SCOTLAND 
 15 FOREGATE STREET 
 CHESTER 
 CH1 1HD 
SORT CODE 16 - 16 – 14 
ACCOUNT NUMBER 11179447 
ACCOUNT NAME MULTI FACTOR EUROPE 
SWIFT CODE RBOSGB2L 
IBAN CODE GB39RBOS16161411179447 
VAT REG NO. GB 862 6906 95

"""
print(test)



Please find detailed below our bank details as requested:- 
BANK NAME THE ROYAL BANK OF SCOTLAND 
 15 FOREGATE STREET 
 CHESTER 
 CH1 1HD 
SORT CODE 16 - 16 – 14 
ACCOUNT NUMBER 11179447 
ACCOUNT NAME MULTI FACTOR EUROPE 
SWIFT CODE RBOSGB2L 
IBAN CODE GB39RBOS16161411179447 
VAT REG NO. GB 862 6906 95




In [24]:
# Define our custom domain and custom NER label set

DOMAIN = "Invoice"
LABEL_SET = ["ACC_NUM", "IBAN", "BANK_NAME", "ADDRESS", "SORT", "SWIFT"]

### 1: Named Entity Recognition (NER) Example in 2 Lines of code, with no training data required 🚀



In [8]:
# Named Entity Recognition with no labels, no description, and no one-shot examples
# Simple prompt with instructions only
# The domain name gives the model extra context for better results; the parameter is optional
# Output will be a Python object -> [ {'E': Entity Name, 'T': Type of Entity} ]


result = nlp_prompter.fit('ner.jinja',
                          domain      = DOMAIN,
                          text_input  = test)

for res in result:
    pprint(res, compact=True)

{'parsed': {'data': {'completion': [{'E': 'The Royal Bank of Scotland',
                                     'T': 'BankName'},
                                    {'E': '15 Foregate Street, Chester, CH1 '
                                          '1HD',
                                     'T': 'Address'},
                                    {'E': '16-16-14', 'T': 'SortCode'},
                                    {'E': '11179447', 'T': 'AccountNumber'},
                                    {'E': 'Multi Factor Europe',
                                     'T': 'AccountName'},
                                    {'E': 'RBOSGB2L', 'T': 'SwiftCode'},
                                    {'E': 'GB39RBOS16161411179447',
                                     'T': 'IBANCode'},
                                    {'E': 'GB 862 6906 95', 'T': 'VATRegNo'},
                                    {'branch': 'Bank details',
                                     'group': 'Bank account details'}],
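
The nested result above can be flattened into a plain list of entities before further processing. The following is a minimal sketch, assuming the `parsed -> data -> completion` nesting seen in this output; `extract_entities` and `sample_result` are hypothetical names used for illustration and are not part of Promptify.

```python
def extract_entities(result):
    """Collect {'E': ..., 'T': ...} items from a result list shaped like the output above."""
    entities = []
    for res in result:
        # Walk the nesting defensively; a missing key yields an empty list.
        completion = res.get("parsed", {}).get("data", {}).get("completion", [])
        for item in completion:
            # Keep only dicts that carry both an entity and a type.
            # This skips extras such as {'branch': ..., 'group': ...}.
            if isinstance(item, dict) and "E" in item and "T" in item:
                entities.append({"E": item["E"], "T": item["T"]})
    return entities


# Abbreviated sample copied from the output above (not a live API response).
sample_result = [{"parsed": {"data": {"completion": [
    {"E": "The Royal Bank of Scotland", "T": "BankName"},
    {"E": "RBOSGB2L", "T": "SwiftCode"},
    {"branch": "Bank details", "group": "Bank account details"},
]}}}]

entities = extract_entities(sample_result)
for entity in entities:
    print(entity["T"], "->", entity["E"])
```

Checking for both keys, rather than assuming every item is an entity, keeps the trailing `branch`/`group` dictionary out of the entity list.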
           

### 2: Named Entity Recognition (NER) with Custom Tags/Entities 🚀



In [14]:
# Case : 2
# Perform NER with custom tags only (the prompt handles out-of-bounds predictions)


result = nlp_prompter.fit('ner.jinja',
                          domain      = DOMAIN,
                          text_input  = test, 
                          labels      = LABEL_SET)


pprint(result, compact=True)

[{'parsed': {'data': {'completion': [{'E': 'THE ROYAL BANK OF SCOTLAND',
                                      'T': 'BANK_NAME'},
                                     {'E': '15 FOREGATE STREET CHESTER CH1 1HD',
                                      'T': 'ADDRESS'},
                                     {'E': '16 - 16 – 14', 'T': 'SORT'},
                                     {'E': '11179447', 'T': 'ACC_NUM'},
                                     {'E': 'RBOSGB2L', 'T': 'SWIFT'},
                                     {'E': 'GB39RBOS16161411179447',
                                      'T': 'IBAN'},
                                     {'branch': 'BANK NAME',
                                      'group': 'BANK DETAILS'},
                                     {'branch': 'ADDRESS',
                                      'group': 'BANK DETAILS'},
                                     {'branch': 'SORT CODE',
                                      'group': 'BANK DETAILS'},
                         

[{'E': 'THE ROYAL BANK OF SCOTLAND', 'T': 'BANK_NAME'},
 {'E': '15 FOREGATE STREET CHESTER CH1 1HD', 'T': 'ADDRESS'},
 {'E': '16 - 16 – 14', 'T': 'SORT'},
 {'E': '11179447', 'T': 'ACC_NUM'},
 {'E': 'RBOSGB2L', 'T': 'SWIFT'},
 {'E': 'GB39RBOS16161411179447', 'T': 'IBAN'},
 {'branch': 'BANK NAME', 'group': 'BANK DETAILS'},
 {'branch': 'ADDRESS', 'group': 'BANK DETAILS'},
 {'branch': 'SORT CODE', 'group': 'BANK DETAILS'},
 {'branch': 'ACCOUNT NUMBER', 'group': 'BANK DETAILS'},
 {'branch': 'ACCOUNT NAME', 'group': 'BANK DETAILS'},
 {'branch': 'SWIFT CODE', 'group': 'BANK DETAILS'},
 {'branch': 'IBAN CODE', 'group': 'BANK DETAILS'},
 {'branch': 'VAT REG NO.', 'group': 'VAT DETAILS'}]
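
Even with a custom label set, it can be worth checking the predictions against that set on the client side. The sketch below assumes entities arrive as `{'E': text, 'T': label}` dictionaries as in the output above; `keep_known_labels` is a hypothetical helper, and the `VATRegNo` item is an invented out-of-set prediction included only to exercise the filter.

```python
LABEL_SET = ["ACC_NUM", "IBAN", "BANK_NAME", "ADDRESS", "SORT", "SWIFT"]


def keep_known_labels(items, labels):
    """Return only the entity dicts whose 'T' value is in the allowed label set."""
    allowed = set(labels)
    return [
        item for item in items
        if isinstance(item, dict) and item.get("T") in allowed
    ]


completion = [
    {"E": "THE ROYAL BANK OF SCOTLAND", "T": "BANK_NAME"},
    {"E": "11179447", "T": "ACC_NUM"},
    {"E": "GB 862 6906 95", "T": "VATRegNo"},          # hypothetical out-of-set label
    {"branch": "BANK NAME", "group": "BANK DETAILS"},  # has no 'T' key, so it is dropped
]

filtered = keep_known_labels(completion, LABEL_SET)
print(filtered)
```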


### 3: Named Entity Recognition (NER) with One-Shot Example 🚀

In [20]:
# Case : 3
# Perform NER with a one-shot example added to the prompt
# Observe the changes in the model's output
# The example format used here is -> [ [Text, [{'E': Entity Type, 'W': Entity Text}]] ]

one_shot_text = "1. Bank Account Number: 123610771 SBI Bank, Paradise Branch, Secunderabad, Hyderabad, India\n"
one_shot = [[one_shot_text, [{'E': 'ACC_NUM', 'W': '123610771'}, {'E': 'BANK_NAME', 'W': 'SBI Bank'}, {'E': 'ADDRESS', 'W': 'Paradise Branch, Secunderabad, Hyderabad, India'}]]]


result = nlp_prompter.fit('ner.jinja',
                          domain      = DOMAIN,
                          text_input  = test,
                          examples    = one_shot,
                          labels      = LABEL_SET)


pprint(result, compact=True)

[{'parsed': {'data': {'completion': [[{'E': 'BANK_NAME',
                                       'W': 'THE ROYAL BANK OF SCOTLAND'},
                                      {'E': 'ADDRESS',
                                       'W': '15 FOREGATE STREET, CHESTER, CH1 '
                                            '1HD'},
                                      {'E': 'SORT', 'W': '16-16-14'},
                                      {'E': 'ACC_NUM', 'W': '11179447'},
                                      {'E': 'SWIFT', 'W': 'RBOSGB2L'},
                                      {'E': 'IBAN',
                                       'W': 'GB39RBOS16161411179447'}]],
                      'suggestions': []},
             'object_type': <class 'list'>,
             'status': 'completed'},
  'text': " [[{'E': 'BANK_NAME', 'W': 'THE ROYAL BANK OF SCOTLAND'}, {'E': "
          "'ADDRESS', 'W': '15 FOREGATE STREET, CHESTER, CH1 1HD'}, {'E': "
          "'SORT', 'W': '16-16-14'}, {'E': 'ACC_NUM', 'W': '111794

In [21]:
pprint(ast.literal_eval(result[0]['text'].strip()))

[[{'E': 'BANK_NAME', 'W': 'THE ROYAL BANK OF SCOTLAND'},
  {'E': 'ADDRESS', 'W': '15 FOREGATE STREET, CHESTER, CH1 1HD'},
  {'E': 'SORT', 'W': '16-16-14'},
  {'E': 'ACC_NUM', 'W': '11179447'},
  {'E': 'SWIFT', 'W': 'RBOSGB2L'},
  {'E': 'IBAN', 'W': 'GB39RBOS16161411179447'}]]
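
Because the raw `text` field is model output, `ast.literal_eval` raises an exception when the string is cut off or is not a valid Python literal. A minimal defensive sketch follows; `parse_completion` is a hypothetical helper name, and the two sample strings are shortened versions of the output above.

```python
import ast


def parse_completion(raw_text):
    """Parse a Python-literal string; return None if it is not a valid literal."""
    try:
        return ast.literal_eval(raw_text.strip())
    except (ValueError, SyntaxError):
        # Malformed or truncated model output ends up here.
        return None


good = " [[{'E': 'SORT', 'W': '16-16-14'}]]"
truncated = " [[{'E': 'SORT', 'W': '16-16"

print(parse_completion(good))
print(parse_completion(truncated))
```

Returning `None` lets the caller decide whether to retry the request or skip the record instead of stopping the whole notebook run.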


### 5: Named Entity Recognition (NER) with some Domain Knowledge 🚀

In [25]:
# Case : 5
# Add domain knowledge and a description to the prompt to improve the output

result = nlp_prompter.fit('ner.jinja',
                          domain      = DOMAIN,
                          text_input  = test,
                          examples    = one_shot,
                          description = "The paragraph shows various banking details from a generated invoice document",
                          labels      = LABEL_SET)

pprint(result, compact=True)

390 3610
[{'parsed': {'data': {'completion': [[{'E': 'BANK_NAME',
                                       'W': 'THE ROYAL BANK OF SCOTLAND'},
                                      {'E': 'ADDRESS',
                                       'W': '15 FOREGATE STREET, CHESTER CH1 '
                                            '1HD'},
                                      {'E': 'SORT', 'W': '16-16-14'},
                                      {'E': 'ACC_NUM', 'W': '11179447'},
                                      {'E': 'SWIFT', 'W': 'RBOSGB2L'},
                                      {'E': 'IBAN',
                                       'W': 'GB39RBOS16161411179447'}]],
                      'suggestions': []},
             'object_type': <class 'list'>,
             'status': 'completed'},
  'text': " [[{'E': 'BANK_NAME', 'W': 'THE ROYAL BANK OF SCOTLAND'}, {'E': "
          "'ADDRESS', 'W': '15 FOREGATE STREET, CHESTER CH1 1HD'}, {'E': "
          "'SORT', 'W': '16-16-14'}, {'E': 'ACC_NUM', 'W': 

In [26]:
pprint(ast.literal_eval(result[0]['text'].strip()))

[[{'E': 'BANK_NAME', 'W': 'THE ROYAL BANK OF SCOTLAND'},
  {'E': 'ADDRESS', 'W': '15 FOREGATE STREET, CHESTER CH1 1HD'},
  {'E': 'SORT', 'W': '16-16-14'},
  {'E': 'ACC_NUM', 'W': '11179447'},
  {'E': 'SWIFT', 'W': 'RBOSGB2L'},
  {'E': 'IBAN', 'W': 'GB39RBOS16161411179447'}]]
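
To see what the added description changed, the parsed outputs of Case 3 and Case 5 can be compared label by label. This is a rough sketch, assuming the `[[{'E': label, 'W': text}, ...]]` shape shown above and one value per label; `to_label_map` and `diff_runs` are hypothetical helpers, and the two sample runs are abbreviated copies of the outputs in this notebook.

```python
def to_label_map(parsed):
    """Flatten [[{'E': label, 'W': text}, ...]] into {label: text}."""
    return {item["E"]: item["W"] for group in parsed for item in group}


def diff_runs(first, second):
    """Return {label: (first_value, second_value)} for labels whose values differ."""
    first_map, second_map = to_label_map(first), to_label_map(second)
    labels = sorted(set(first_map) | set(second_map))
    return {
        label: (first_map.get(label), second_map.get(label))
        for label in labels
        if first_map.get(label) != second_map.get(label)
    }


one_shot_run = [[
    {"E": "ADDRESS", "W": "15 FOREGATE STREET, CHESTER, CH1 1HD"},
    {"E": "SORT", "W": "16-16-14"},
]]
described_run = [[
    {"E": "ADDRESS", "W": "15 FOREGATE STREET, CHESTER CH1 1HD"},
    {"E": "SORT", "W": "16-16-14"},
]]

changes = diff_runs(one_shot_run, described_run)
print(changes)
```

On the outputs recorded here, the only difference is the punctuation of the `ADDRESS` value.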
