ValueError: [E177] Ill-formed IOB input detected #5753
-
When I try to convert a simple conllu file (link to project page, link to full corpus, 2 GB) with ner, the following error appears:
The simplified source file is:
NER tags are given in the form
It would be really helpful if:
If the first is most probably CoNLL-U, it would be helpful to state so explicitly, with a link to the format definition. The second needs an explicit specification. Any help with this error is appreciated.
Which page or section is this issue related to?
Replies: 6 comments
-
I think you're accidentally using the wrong converter here. For CoNLL-U you can leave off the
Let's see, I think the simplest option with spaCy v2 is to remove the
If you need support for
-
Thanks a lot for a quick response!
Any chance to get help with a related issue while training?
(Strangely enough, I'm just training
) Is there a way to get a list of the misbehaving sentences (ids, or just the starting text, as in 2.3.1), so that sentences with cycles can be removed all at once? Or could you suggest a different way of treating cycles, such as ignoring them?
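For reference, such a list can be produced outside spaCy by following the HEAD column of the CoNLL-U file directly. The following is only a sketch: `find_cyclic_sentences` and `has_cycle` are hypothetical helper names, not part of spaCy, and it assumes standard CoNLL-U with the token ID in column 1 and HEAD in column 7.

```python
import re


def has_cycle(heads):
    """heads maps token id -> head id (0 = root). True if any chain loops."""
    for start in heads:
        seen = set()
        node = start
        # Walk up towards the root; revisiting a node means there is a cycle.
        while node != 0 and node in heads:
            if node in seen:
                return True
            seen.add(node)
            node = heads[node]
    return False


def find_cyclic_sentences(conllu_text):
    """Return a label (sent_id, or the first few words) per cyclic sentence."""
    bad = []
    for block in re.split(r"\n\s*\n", conllu_text.strip()):
        sent_id, words, heads = None, [], {}
        for line in block.splitlines():
            if line.startswith("#"):
                if line.startswith("# sent_id") and "=" in line:
                    sent_id = line.split("=", 1)[1].strip()
                continue
            cols = line.split("\t")
            # Skip multiword ranges (1-2), empty nodes (1.1) and broken lines.
            if len(cols) < 7 or not cols[0].isdigit() or not cols[6].isdigit():
                continue
            heads[int(cols[0])] = int(cols[6])
            words.append(cols[1])
        if has_cycle(heads):
            bad.append(sent_id if sent_id is not None else " ".join(words[:5]))
    return bad
```

The returned labels can then be used to drop those sentences from the corpus before converting.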
-
That is pretty unexpected behavior for the
No, not directly with
Hmm, I think it tries to read in all the available annotation in case you might be training a parser later. I would just remove the "dep" and "head" values for all the tokens from the corpus in the JSON files to get around this if you're only training an NER model.
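Removing those values could look roughly like the sketch below. It assumes the converted files are ordinary JSON in which only token entries carry "dep" and "head" keys; `strip_parse_annotations` is a hypothetical helper, not a spaCy function, and the file names in the usage comment are placeholders.

```python
import json


def strip_parse_annotations(node, keys=("dep", "head")):
    """Return a copy of the JSON structure without the given keys."""
    if isinstance(node, dict):
        return {
            k: strip_parse_annotations(v, keys)
            for k, v in node.items()
            if k not in keys  # drop "dep" and "head" wherever they appear
        }
    if isinstance(node, list):
        return [strip_parse_annotations(item, keys) for item in node]
    return node  # strings, numbers, None are kept unchanged


# Usage (placeholder file names):
# with open("train.json", encoding="utf8") as f:
#     docs = json.load(f)
# with open("train_ner_only.json", "w", encoding="utf8") as f:
#     json.dump(strip_parse_annotations(docs), f)
```

Writing to a new file instead of overwriting keeps the original parse annotation around in case a parser is trained later.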
-
Source file:
Convert command:
Output:
Thanks in advance!
-
I'm not sure what version of spaCy you're using. You don't have to use it for training, but try the latest v2 (currently v2.3.1) for converting. I get the error:
This line is malformed:
I suspect your best option is to use the tools from Universal Dependencies to validate the CoNLL-U format.
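As a quick first pass before running a full validator, a few lines of Python can flag lines that don't have the 10 tab-separated columns CoNLL-U requires. This is only a rough sketch (`find_malformed_lines` is a hypothetical helper) and it checks far less than the Universal Dependencies tools do.

```python
def find_malformed_lines(conllu_text):
    """Return (line_number, line) for token lines without exactly 10 columns."""
    problems = []
    for line_no, line in enumerate(conllu_text.splitlines(), start=1):
        # Blank lines separate sentences; lines starting with '#' are comments.
        if not line.strip() or line.startswith("#"):
            continue
        if len(line.split("\t")) != 10:
            problems.append((line_no, line))
    return problems
```

A common cause of such lines is spaces being used instead of tabs between columns, which this check will report.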
-
The empty file I mentioned was due to spaCy 2.1.8.