frameBERT is available for both English FrameNet 1.7 and Korean FrameNet 1.2.

About

frameBERT is a BERT-based frame-semantic parser that interprets the meaning of text in terms of FrameNet.

A frame (in frame semantics) is a schematic representation of a situation or an event. For example, given the sentence "The center's director pledged a thorough review of safety procedures", frameBERT identifies several frames such as Being_born and Death for lexical units (e.g., center.n, director.n, and pledge.v).

Prerequisites

python 3
pytorch (Link)
transformers (Link)
Korean FrameNet (Link)
keras (Link)
nltk (for target identification)
flask_restful (for REST API service)
flask_cors (for REST API service)

For nltk, please download the following packages from a Python session:

import nltk
nltk.download('averaged_perceptron_tagger')
nltk.download('wordnet')

How to use

Install

Install frameBERT and Korean FrameNet.

(Note: Korean FrameNet will no longer be a mandatory package after the next update.)

git clone https://github.com/machinereading/frameBERT.git
cd frameBERT
git clone https://github.com/machinereading/koreanframenet.git

How to use a frame-semantic parser for a language (English or Korean)

1. Download the pretrained model

Download two pretrained model files to {your_model_dir} (e.g. /home/model/).

English Model (recommended for English): (download)
Multilingual Model (English & Korean): (download)

2. Import the model in your Python code (make sure that your code is in the parent folder of frameBERT)

from frameBERT import frame_parser

model_path = '/home/model/'  # absolute path to {your_model_dir}
parser = frame_parser.FrameParser(model_path=model_path, language='en')

Optional: if you do not want to use the LU dictionary, set the argument masking=False.

3. Parse the input text

text = 'Hemingway was born on July 21, 1899 in Illinois, and died of suicide at the age of 62.'
parsed = parser.parser(text, sent_id='1', result_format='graph')

The result will then be:

[('frame:Giving_birth#1', 'frdf:lu', 'born'),
 ('frame:Giving_birth#1', 'frdf:Giving_birth-Child', 'Hemingway'),
 ('frame:Giving_birth#1', 'frdf:Giving_birth-Time', 'on July 21, 1899'),
 ('frame:Giving_birth#1', 'frdf:Giving_birth-Place', 'in Illinois,'),
 ('frame:Death#1', 'frdf:lu', 'died'),
 ('frame:Death#1', 'frdf:Death-Protagonist', 'Hemingway'),
 ('frame:Death#1', 'frdf:Death-Explanation', 'of suicide'),
 ('frame:Killing#1', 'frdf:lu', 'suicide'),
 ('frame:Killing#1', 'frdf:Killing-Victim', 'Hemingway'),
 ('frame:Age#1', 'frdf:lu', 'age'),
 ('frame:Age#1', 'frdf:Age-Age', 'of 62.')]

You can also run the Korean frameBERT on Korean text:

parser = frame_parser.FrameParser(model_path=model_path, language='ko')
text = '헤밍웨이는 1899년 7월 21일 미국 일리노이에서 태어났고 62세에 자살로 사망했다.'
parsed = parser.parser(text, sent_id='1', result_format='all')

Optional: sent_id and result_format are not mandatory arguments. result_format accepts one of conll, graph, textae, or all. The result can be returned in the following three formats:

(1) triple format (result_format='graph')
(2) conll format (result_format='conll')
(3) pubannotation format (result_format='textae')

Alternatively, you can get all of the results in JSON with result_format='all'.

Result Format

triple format (as a graph): The result is a list of triples.

[
    ('frame:Giving_birth#1', 'frdf:lu', 'born'), 
    ('frame:Giving_birth#1', 'frdf:Giving_birth-Child', 'Hemingway'), 
    ('frame:Giving_birth#1', 'frdf:Giving_birth-Time', 'on July 21, 1899'), 
    ('frame:Giving_birth#1', 'frdf:Giving_birth-Place', 'in Illinois,'), 
    ...
]
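The triples are easier to work with when grouped by frame instance. The following is a minimal sketch, not part of frameBERT: the helper name group_triples is hypothetical, and it assumes subjects always look like 'frame:NAME#N' and argument predicates like 'frdf:NAME-ROLE', as in the output above.

```python
from collections import defaultdict


def group_triples(triples):
    """Group (subject, predicate, object) triples by frame instance.

    Assumes the 'frame:NAME#N' / 'frdf:NAME-ROLE' naming seen in the README output.
    """
    frames = defaultdict(lambda: {"lu": None, "args": {}})
    for subject, predicate, obj in triples:
        frame_id = subject[len("frame:"):]      # 'frame:Giving_birth#1' -> 'Giving_birth#1'
        frame_name = frame_id.split("#", 1)[0]  # 'Giving_birth'
        if predicate == "frdf:lu":
            frames[frame_id]["lu"] = obj
        else:
            # Strip the 'frdf:Giving_birth-' prefix to keep only the role name.
            prefix = f"frdf:{frame_name}-"
            role = predicate[len(prefix):] if predicate.startswith(prefix) else predicate
            frames[frame_id]["args"][role] = obj
    return dict(frames)


triples = [
    ('frame:Giving_birth#1', 'frdf:lu', 'born'),
    ('frame:Giving_birth#1', 'frdf:Giving_birth-Child', 'Hemingway'),
    ('frame:Giving_birth#1', 'frdf:Giving_birth-Time', 'on July 21, 1899'),
    ('frame:Giving_birth#1', 'frdf:Giving_birth-Place', 'in Illinois,'),
]
grouped = group_triples(triples)
print(grouped['Giving_birth#1'])
```

Each key of the returned dictionary is one frame instance, holding its lexical unit and a role-to-text mapping of its arguments.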

conll format: The result is a list of frame-semantic structures. Each structure is a list consisting of four lists: (1) tokens, (2) lexical units, (3) frames, and (4) arguments. For example, for the given input text, the output has the following format:

[
    [
        ['Hemingway', 'was', 'born', 'on', 'July', '21,', '1899', 'in', 'Illinois,', 'and', 'died', 'of', 'suicide', 'at', 'the', 'age', 'of', '62.'], 
        ['_', '_', 'bear.v', '_', '_', '_', '_', '_', '_', '_', '_', '_', '_', '_', '_', '_', '_', '_'], 
        ['_', '_', 'Giving_birth', '_', '_', '_', '_', '_', '_', '_', '_', '_', '_', '_', '_', '_', '_', '_'], 
        ['B-Child', 'O', 'O', 'B-Time', 'I-Time', 'I-Time', 'I-Time', 'B-Place', 'I-Place', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O']
    ], 
    ...
]
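The argument list in the conll format uses B-/I-/O tags, so argument spans can be recovered by walking the tokens and tags in parallel. The sketch below is illustrative only: bio_to_spans is a hypothetical helper, not a frameBERT function, and it assumes the token and tag lists have the same length.

```python
def bio_to_spans(tokens, tags):
    """Collect (role, text) spans from parallel token and B-/I-/O tag lists."""
    spans = []
    role, words = None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new span begins; close the previous one if it is still open.
            if role is not None:
                spans.append((role, " ".join(words)))
            role, words = tag[2:], [token]
        elif tag.startswith("I-") and role == tag[2:]:
            words.append(token)
        else:
            # 'O' (or an I- tag that does not continue the open span) ends the span.
            if role is not None:
                spans.append((role, " ".join(words)))
            role, words = None, []
    if role is not None:
        spans.append((role, " ".join(words)))
    return spans


tokens = ['Hemingway', 'was', 'born', 'on', 'July', '21,', '1899', 'in', 'Illinois,',
          'and', 'died', 'of', 'suicide', 'at', 'the', 'age', 'of', '62.']
tags = ['B-Child', 'O', 'O', 'B-Time', 'I-Time', 'I-Time', 'I-Time', 'B-Place', 'I-Place',
        'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O']
spans = bio_to_spans(tokens, tags)
print(spans)
```

For the first structure of the example sentence, the recovered spans match the argument triples of the Giving_birth frame shown in the triple format.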

Running REST API service

Running restApp.py starts a standalone REST service on your own server.

How to run REST API service

python restApp.py --port {port number} --language {en|ko} --model {model path}

Example

python restApp.py --port 8888 --language en --model ./models/en

You can then send POST requests to the URL XXX.XXX.XXX.XXX:8888/frameBERT, where XXX.XXX.XXX.XXX is your server's IP address.

Input format

# JSON format
{
 "text": "Hemingway was born on July 21, 1899 in Illinois, and died of suicide at the age of 62.",
 "result_format": "all"
}
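A client for the service could look roughly like the sketch below, which uses only the Python standard library. Several details are assumptions rather than documented behavior: the http scheme, the Content-Type header, the host 127.0.0.1, and that the response body is JSON. build_request is a hypothetical helper name.

```python
import json
import urllib.request


def build_request(host, port, text, result_format="all"):
    """Build a POST request carrying the JSON input format described above."""
    payload = json.dumps({"text": text, "result_format": result_format}).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{host}:{port}/frameBERT",          # assumed scheme: plain http
        data=payload,
        headers={"Content-Type": "application/json"},   # assumed to be accepted by the service
        method="POST",
    )


text = "Hemingway was born on July 21, 1899 in Illinois, and died of suicide at the age of 62."
req = build_request("127.0.0.1", 8888, text)

# Sending the request requires a running service, so it is left commented out:
# with urllib.request.urlopen(req) as response:
#     result = json.loads(response.read().decode("utf-8"))  # assumes a JSON response
```

Replace the host and port with the values used when starting restApp.py.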

How to train a model?

Prepare the FrameNet dataset

# such as
[
 [
  ['Greece', 'wildfires', 'force', 'thousands', 'to', '<tgt>', 'evacuate', '</tgt>'], # token list (target is indicated by the special tokens)
  ['_', '_', '_', '_', '_', '_', 'evacuate.v', '_'],                                  # lu list (lu for target, else '_')
  ['_', '_', '_', '_', '_', '_', 'Escaping', '_'],                                    # Frame list (frame for target, else '_')
  ['O', 'O', 'O', 'B-Escapee', 'O', 'X', 'O', 'X']                                    # FE list (IOB scheme, 'X' for the special tokens)
 ],
 ...
]
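Before training, it can help to check that each example follows the layout above. The following is a minimal sketch under stated assumptions: check_example is a hypothetical helper, not part of frameBERT, and it assumes exactly one target per example, which is what the sample shows but is not stated as a rule.

```python
def check_example(example):
    """Return True if a training example follows the four-list layout above."""
    if len(example) != 4:
        return False
    tokens, lus, frames, fes = example
    # All four lists must be aligned token by token.
    if not (len(tokens) == len(lus) == len(frames) == len(fes)):
        return False
    # The special tokens, and only they, carry the FE tag 'X'.
    for token, fe in zip(tokens, fes):
        is_marker = token in ("<tgt>", "</tgt>")
        if is_marker != (fe == "X"):
            return False
    # Assumption: one target per example, with LU and frame at the same position.
    lu_positions = [i for i, lu in enumerate(lus) if lu != "_"]
    frame_positions = [i for i, frame in enumerate(frames) if frame != "_"]
    return len(lu_positions) == 1 and lu_positions == frame_positions


example = [
    ['Greece', 'wildfires', 'force', 'thousands', 'to', '<tgt>', 'evacuate', '</tgt>'],
    ['_', '_', '_', '_', '_', '_', 'evacuate.v', '_'],
    ['_', '_', '_', '_', '_', '_', 'Escaping', '_'],
    ['O', 'O', 'O', 'B-Escapee', 'O', 'X', 'O', 'X'],
]
is_valid = check_example(example)
print(is_valid)
```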

Train the model

(reference: train.ipynb)

python train.py --train {TRAINING DATA, e.g., efn} --model_path {DIRECTORY TO SAVE YOUR MODEL} --pretrained_model {default="bert-base-multilingual-cased"} --early_stopping {default=TRUE} --epochs {default=20}

Evaluate the model

(reference: train.ipynb)

python evaluate.py --language {default='ko'} --model {DIRECTORY OF YOUR MODEL} --test {test_data} --result {DIRECTORY TO SAVE THE RESULT}

Licenses

CC BY-NC-SA Attribution-NonCommercial-ShareAlike
If you want to commercialize this resource, please contact us.

Publisher

Machine Reading Lab @ KAIST

Contact

Younggyun Hahm. hahmyg@kaist.ac.kr, hahmyg@gmail.com

Acknowledgement

This work was supported by the Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIT) (2013-0-00109, WiseKB: Big data based self-evolving knowledge base and reasoning platform).
