Creating and loading custom pipeline components example #2947
Comments
Yes, that's definitely an option! Alternatively, you could also add an entry to `Language.factories`:

    from spacy.language import Language
    Language.factories['entity_matcher'] = lambda nlp, **cfg: EntityMatcher(nlp)

The factory receives the shared `nlp` object.

    nlp = spacy.load('some_custom_model', entity_matcher_label='SOME_LABEL')

For a more detailed end-to-end example of how to package custom models or components in a model, you might find my comment on this thread helpful. We're also hoping that we can make this process easier in the upcoming version(s).

Btw, another related thing to check out: If you're working with rule-based NER (or a combination of rule-based and statistical NER), you can also try my new
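To make the factory idea above concrete, here is a minimal sketch in plain Python. It does not use spaCy: the `factories` dict and the `load_pipeline` helper are hypothetical stand-ins for `Language.factories` and the model loader, and the `EntityMatcher` class and its `label` argument are assumptions for illustration. It only shows the shape of the pattern, where a factory is looked up by name and called with the shared `nlp` object plus keyword settings.

```python
class EntityMatcher:
    """Illustrative component; not the class from the thread."""
    name = "entity_matcher"

    def __init__(self, nlp, label="ENTITY"):
        self.nlp = nlp
        self.label = label


# Hypothetical stand-in for Language.factories: component name -> factory.
factories = {
    "entity_matcher": lambda nlp, **cfg: EntityMatcher(
        nlp, label=cfg.get("entity_matcher_label", "ENTITY")
    ),
}


def load_pipeline(nlp, names, **cfg):
    """Hypothetical loader: builds each named component via its factory."""
    return [factories[name](nlp, **cfg) for name in names]


shared_nlp = object()  # placeholder for the shared nlp object
pipeline = load_pipeline(
    shared_nlp, ["entity_matcher"], entity_matcher_label="SOME_LABEL"
)
```

In this sketch the factory, unlike the one-liner above, forwards a setting from `cfg` to the component; whether the real loader passes such settings through is an assumption here.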
Hi Ines, Thanks, I had already added the entity_matcher to the `Language.factories` and that is where it required the `nlp` object just to initialise the entity matcher and run the terms through the `nlp` object. The problem is this: In #2682 you stated that the weights are not loaded yet and that only after everything is initialised the weights would be loaded. If I just added the entity_matcher to the `Language.factories` then I would get a "bool has no tok2vec" error or something similar. This is not a problem when using the `nlp.add_pipe()` method as the `nlp` object already has the weights loaded. I will definitely look into 2.1 and the EntityRuler - thanks for the tip!
Ahh, sorry, I think I missed that. (That I wonder what happens if you just use
Thanks, `nlp.make_doc(text)` seems to do the trick! In general, how stable is the nightly? Would it be OK for development purposes until 2.1 is officially released? Edit: OK, I can confirm that 2.1 solves many issues... I will continue developing with it and am looking forward to the stable release! I have just followed the notes at https://github.com/explosion/spaCy/releases/tag/v2.1.0a1 and I assume those are the latest changes published?
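For readers wondering why `nlp.make_doc(text)` helps here, the following is a rough sketch in plain Python, not spaCy itself. `FakeNLP` is a hypothetical stand-in that mimics the situation described in this thread: calling the full pipeline fails while the weights are not loaded, whereas tokenizing alone works. The `EntityMatcher` class and its `terms` argument are likewise assumptions for illustration.

```python
class FakeNLP:
    """Hypothetical stand-in for the nlp object (not the real spaCy API)."""

    def __init__(self):
        self.weights_loaded = False

    def make_doc(self, text):
        # Tokenizer only: needs no statistical model weights.
        return text.split()

    def __call__(self, text):
        # Full pipeline: fails while the weights are still missing.
        if not self.weights_loaded:
            raise RuntimeError("pipeline weights are not loaded yet")
        return self.make_doc(text)


class EntityMatcher:
    name = "entity_matcher"

    def __init__(self, nlp, terms):
        # make_doc only tokenizes, so the component can be constructed
        # inside a factory before the weights have been loaded.
        self.patterns = [nlp.make_doc(term) for term in terms]


nlp = FakeNLP()
matcher = EntityMatcher(nlp, ["Barack Obama", "Angela Merkel"])
```

The trade-off is that the patterns only carry tokenizer output, which is enough for matching on token text but not for anything that depends on model predictions.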
Yes, the nightly should be good in development. In fact, it's always super helpful to have more people test it in "real life" so we can make sure there are no bugs or regressions. The latest alpha version is
## Description

The new website is implemented using [Gatsby](https://www.gatsbyjs.org) with [Remark](https://github.com/remarkjs/remark) and [MDX](https://mdxjs.com/). This allows authoring content in **straightforward Markdown** without the usual limitations. Standard elements can be overwritten with powerful [React](http://reactjs.org/) components and wherever Markdown syntax isn't enough, JSX components can be used. Hopefully, this update will also make it much easier to contribute to the docs.

Once this PR is merged, I'll implement auto-deployment via [Netlify](https://netlify.com) on a specific branch (to avoid building the website on every PR). There's a bunch of other cool stuff that the new setup will allow us to do, including writing front-end tests, service workers, offline support, implementing a search and so on.

This PR also includes various new docs pages and content. Resolves #3270. Resolves #3222. Resolves #2947. Resolves #2837.

### Types of change

enhancement

## Checklist

- [x] I have submitted the spaCy Contributor Agreement.
- [x] I ran the tests, and all new and existing tests passed.
- [x] My changes don't require a change to the documentation, or if they do, I've added all required information.
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Hi, I have been struggling to deal with the loading of a custom pipeline component after saving the model. I found this solution here: #2682 (comment)
Would it be a good idea to update the docs to suggest users do something like this to prevent the issue of requiring the "nlp" object to initialize the pipe when loading from disk? (Kind of a chicken and egg situation) Perhaps I am overlooking something and there is already a better solution.
Therefore this:
The source can be found here:
https://spacy.io/usage/processing-pipelines#custom-components
Edit: `self.initialized = True` needs to be set directly after the `if` statement, before `nlp` is called.
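The code from the original post is not preserved above, so the following is only a sketch of the lazy-initialization pattern the edit describes, written in plain Python rather than against spaCy. `FakePipeline`, the dict-based doc, the `LazyEntityMatcher` name and the `"PERSON"` label are all hypothetical. The point it illustrates: if the component calls `nlp` on its first run, and `nlp` in turn runs the component, the flag has to be set before that call, or the component would re-enter its own initialization.

```python
class FakePipeline:
    """Hypothetical stand-in for the nlp object (not the real spaCy API)."""

    def __init__(self):
        self.components = []

    def add_pipe(self, component):
        self.components.append(component)

    def __call__(self, text):
        doc = {"tokens": text.split(), "ents": []}
        for component in self.components:
            doc = component(doc)
        return doc


class LazyEntityMatcher:
    name = "entity_matcher"

    def __init__(self, nlp, terms, label):
        # Nothing here calls nlp, so construction is safe at load time.
        self.nlp = nlp
        self.terms = terms
        self.label = label
        self.patterns = []
        self.initialized = False

    def __call__(self, doc):
        if not self.initialized:
            # Set the flag first: nlp(term) runs this component again,
            # and without the flag that would recurse indefinitely.
            self.initialized = True
            self.patterns = [self.nlp(term)["tokens"] for term in self.terms]
        tokens = doc["tokens"]
        for pattern in self.patterns:
            size = len(pattern)
            for start in range(len(tokens) - size + 1):
                if tokens[start:start + size] == pattern:
                    doc["ents"].append((start, start + size, self.label))
        return doc


nlp = FakePipeline()
matcher = LazyEntityMatcher(nlp, ["Barack Obama"], "PERSON")
nlp.add_pipe(matcher)
doc = nlp("I met Barack Obama today")
```

The patterns are built on the first document that passes through the component, by which point a real pipeline would already have its weights loaded.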