Description of Problem:
which caused the necessity to handle non-existent /
Since simply filtering out those training samples would break their order and cause follow-on problems, we need to find a more flexible solution.
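To illustrate the ordering concern, here is a minimal sketch in plain Python. It is not the implementation in this PR: `process_keeping_order` and the `process` callback are hypothetical names, and `str.split` merely stands in for whatever actually produces a document object. The idea is only that a placeholder at each empty position keeps results aligned with the original samples, whereas filtering shifts every later index.

```python
from typing import Callable, List, Optional, Sequence, TypeVar

T = TypeVar("T")


def process_keeping_order(
    texts: Sequence[Optional[str]],
    process: Callable[[str], T],
) -> List[Optional[T]]:
    """Apply `process` to non-empty texts only (hypothetical helper).

    Missing or empty texts get a None placeholder, so result i always
    belongs to input i.
    """
    results: List[Optional[T]] = []
    for text in texts:
        if text:  # False for both None and ""
            results.append(process(text))
        else:
            results.append(None)
    return results


texts = ["hello there", "", None, "book a table"]

# Filtering drops entries, so positions no longer match the inputs.
filtered = [process for process in (t.split() for t in texts if t)]

# Placeholders keep one result per input, in the original order.
aligned = process_keeping_order(texts, str.split)
```

With filtering, the result for the fourth sample ends up at index 1; with placeholders it stays at index 3, and downstream code has to handle `None` explicitly.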
Overview of the Solution:
since an empty string is valid for
Hi @dakshvar22 ,
I racked my brain over a solution at this point. I tried several scenarios, and I can now understand your problems with the architectural decision in this situation.
At least for now I'd say that it is more or less impossible to change things on the
The simple conclusion is: they are doing things with their Doc object that simply can't be done with an empty Doc at the moment, e.g. because there are actually no
I then came back to the idea to question this part:
The more I thought about it, the more I could sympathize with your struggle. One idea was to change
I understand your thoughts about obeying the order of the
What should/could we do with them?
I am running out of ideas.
Hey @JulianGerhard21 , thanks for giving this a detailed look. I think, going by your observations, we can't rely on spacy or spacy-pytorch-transformers to help us out here.
I think it makes sense to do this because people can build custom components based on pre-trained BERT using spacy docs or integrating any other library that comes up and relies on spacy docs. Since we already have a
What do you think?