Add example to Cookbook to export training data for spaCy v3 #420
Comments
Hi guys! Thought I might give some issues a try and go back to good old Rubrix. It also seemed reasonable to start with some documentation 😁. I have several questions regarding this one.
I hope everything's going fine in sunny, warm Spain 💛.
Hi @ignacioct! Thanks so much for coming back! I think that it would be better to work on extending the current
Hi @ignacioct! Great to hear from you! Yes, I think it makes more sense to extend the I think we should start by implementing the extension for one task only, and show how to use it in the documentation.
Yes! I would say that the most widely used task by far is NER, so we should focus on that one first.
Hi! I've looked at that method a little bit, and I think this is the way to go. I will need to investigate spaCy's DocBin further, but that should not be a problem. Regarding my time, I will be using some spare time that I have, so as long as it's not critical I can take both code and documentation. And yeah, we can communicate via Slack, this discussion (or the forums, however you guys usually proceed), and we could have a brief talk. Whichever method steals the least amount of time from you guys 😊.
Hey, just left you a message on the Rubrix Slack channel, I think this will be the easiest way to get started.
Hi! I've added a PR with the first draft of what the method would look like. It is of course a first version, and further coding/testing must be done, but I just want to be sure we are on the same page with the direction of this @dcfidalgo @dvsrepo
After reviewing this, I've improved the cookbook description and removed the cell output. Once this change is merged this is ready to go in
Similar to the following example:
https://rubrix.readthedocs.io/en/stable/guides/cookbook.html#Training
Add an example to transform a Rubrix dataset into spaCy's DocBin format and save it to disk (usable with the spacy train command for training).
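To make the request concrete, here is a minimal sketch of what such a cookbook example could look like for NER. It is not the method from the PR: the helper name `records_to_docbin` and the record layout (a dict with `text` and a list of `(label, start, end)` character-offset tuples under `annotation`) are assumptions made for illustration, and loading the records from Rubrix is left out. Only spaCy v3 calls are used.

```python
import os
import tempfile

import spacy
from spacy.tokens import DocBin


def records_to_docbin(records, lang="en"):
    """Build a DocBin from NER records (hypothetical helper, not a Rubrix API).

    Assumes each record is a dict with a "text" string and an "annotation"
    list of (label, start_char, end_char) tuples with non-overlapping spans.
    """
    nlp = spacy.blank(lang)  # blank pipeline: tokenizer only, no model download
    doc_bin = DocBin()
    for record in records:
        doc = nlp.make_doc(record["text"])
        spans = []
        for label, start, end in record["annotation"]:
            # char_span returns None if the offsets do not align with token boundaries
            span = doc.char_span(start, end, label=label)
            if span is not None:
                spans.append(span)
        doc.ents = spans  # raises ValueError if spans overlap
        doc_bin.add(doc)
    return doc_bin


# Toy record standing in for an annotated Rubrix dataset (made-up data)
records = [
    {
        "text": "Rubrix was built in Spain.",
        "annotation": [("ORG", 0, 6), ("LOC", 20, 25)],
    }
]

output_path = os.path.join(tempfile.mkdtemp(), "train.spacy")
records_to_docbin(records).to_disk(output_path)
```

The resulting file can then be passed to spaCy's training CLI, for example `python -m spacy train config.cfg --paths.train train.spacy --paths.dev dev.spacy`, where `config.cfg` and the dev file are placeholders you would need to provide.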