Using the NeuroNLP2 in a different data format #11
Comments
Hi,
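I see. Isn't assigning "None" to the pos, chunk and ner variables in create_alphabets enough? That way there won't be any real assignment to those values. Otherwise, I guess I will insert "_" characters in my data, so I can create two more columns to match the current code.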
I am not sure whether assigning None to them will raise errors. I read the
POS and chunk information so that they could be used in the future.
I guess inserting '_' is a good idea :)
…On Tue, Apr 24, 2018 at 3:06 PM, Ayrton Denner ***@***.***> wrote:
I see. Isn't assigning "None" to the pos, chunk and ner variables in
create_alphabets enough? That way there won't be any real assignment to
those values. Otherwise, I guess I will insert "_" characters in my data,
so I can create two more columns to match the current code.
--
------------------
Best regards,
Ma,Xuezhe
Language Technologies Institute,
School of Computer Science,
Carnegie Mellon University
Tel: +1 206-512-5977
Ok, so I will try that. Thanks!
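For anyone landing here with the same two-column data, the "_" workaround discussed above could look roughly like the sketch below. It is not part of NeuroNLP2: the function name `pad_two_column_conll` and its parameters are made up for illustration. It assumes the reader expects whitespace-separated columns in the order word, POS, chunk, NER tag, optionally preceded by a token index (which would explain why the word is read from tokens[1]). Check the reader in your version and adjust the layout if it differs.

```python
def pad_two_column_conll(lines, placeholder="_", with_index=True):
    """Turn 'word tag' lines into 'word _ _ tag' lines.

    Hypothetical helper, not part of NeuroNLP2. The column layout
    (optional index, word, POS, chunk, NER tag) is an assumption.
    """
    out = []
    index = 0
    for raw in lines:
        line = raw.strip()
        if not line:
            # A blank line marks a sentence boundary; restart the token index.
            out.append("")
            index = 0
            continue
        parts = line.split()
        if len(parts) != 2:
            raise ValueError(f"expected 2 columns, got {len(parts)}: {raw!r}")
        word, tag = parts
        index += 1
        # Fill the missing POS and chunk columns with the placeholder.
        cols = [word, placeholder, placeholder, tag]
        if with_index:
            cols.insert(0, str(index))
        out.append(" ".join(cols))
    return out


sample = ["John B-PER", "lives O", "", "Paris B-LOC"]
converted = pad_two_column_conll(sample)
print("\n".join(converted))
```

Writing the converted lines to a new file keeps the original corpus untouched, and the POS and chunk alphabets then contain only the placeholder symbol.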
Hello, I saw in #9 that you used data with 4 columns for NER. I am trying to run it on a corpus with 2 columns, like in this pic:
So, my text base consists of one column with a word and another column with a tag only. Is there any way to parameterize the script to support this kind of data, or will I have to adapt the code specifically for my use? For instance, I would have to change conll03_data to read tokens[0] instead of tokens[1] as the word, and deal with the pos, chunk and ner alphabets. Anything else I should know? Thanks.