NER prediction #29

Closed
harpap opened this issue Jan 31, 2022 · 3 comments

Comments

@harpap

harpap commented Jan 31, 2022

Hello again,
I am trying to run a prediction.
Let's say I have a txt file with a paragraph that I want to annotate with entities (B-LOC, I-LOC, etc.). How can I do this?
I have already set up your pretrained model and successfully ran the test command:
CUDA_VISIBLE_DEVICES=0 python train.py --config config/conll_03_english.yaml --test
I am a little lost with the documentation. I appreciate any help.
Thanks in advance!

@wangxinyu0922
Member

wangxinyu0922 commented Jan 31, 2022

Hi,

See this part and this issue for instructions on prediction.

@harpap
Author

harpap commented Jan 31, 2022

Thank you for the reply. I still don't understand. I made a file called train.washington in a directory called 'toAnnotate', which contains some sentences:

If you had to sum up George Washington's life in one word, that word would have to be unforgettable.
George's story is one of travel and adventure, full of risks and, most of all, full of glory.
After all, in 1789, he was elected the first president of the United States, a country that was to become the most powerful in the world.
At the end of his life, in 1799, George was an international hero.

If I run:
CUDA_VISIBLE_DEVICES=0 python train.py --config config/conll_03_english.yaml --parse --target_dir 'toAnnotate/' --keep_order
I get an error:
RuntimeError: CUDA out of memory. Tried to allocate 16.00 MiB (GPU 0; 7.93 GiB total capacity; 7.33 GiB already allocated; 6.12 MiB free; 88.86 MiB cached)
I want a program that takes some sentences as input and returns them annotated. Sorry if I am missing something.

@wangxinyu0922
Member


You first need to tokenize the sentences and write them in CoNLL format (one token per line, with a blank line between sentences), then use 'O' as a dummy tag for every token before parsing.
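
A minimal sketch of that preprocessing step, in case it helps. Everything here is an assumption for illustration rather than the project's own tooling: the helper name `to_conll` is hypothetical, the regex tokenizer is deliberately crude (a proper tokenizer would handle cases like "Washington's" better), each input line is treated as one sentence, and the two-column "token tag" layout may need adjusting to whatever column format the config expects.

```python
import re

# Crude tokenizer (an assumption): runs of word characters, or single punctuation marks.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def to_conll(text, dummy_tag="O"):
    """Turn plain text (one sentence per line) into CoNLL-style lines.

    Each token goes on its own line followed by the dummy tag, and
    sentences are separated by a blank line.
    """
    blocks = []
    for line in text.splitlines():
        tokens = TOKEN_RE.findall(line)
        if tokens:  # skip empty lines
            blocks.append("\n".join(f"{tok} {dummy_tag}" for tok in tokens))
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    sample = "At the end of his life, in 1799, George was an international hero."
    print(to_conll(sample), end="")
```

The output could then be saved over the plain-text file in the target directory (for example, 'toAnnotate/train.washington' from the comment above) before running the `--parse` command again.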
