Adding to an existing language model #34

Open
mbachtell opened this issue Oct 1, 2023 · 2 comments

@mbachtell

If I wanted to add to an existing model, how would I do that? I have topic-specific language from scientific domains that I would like to add.

I didn't see anything in the open or closed tickets.

Thanks!

@mayeulk

mayeulk commented Nov 4, 2023

I have the same question. I do not know whether this is currently achievable, but the question seems to be a duplicate of #12 (which says it is not straightforward). From a mathematical point of view, this is doable, since it is a transformer model. From a software point of view, it is already done in some fields; see for instance:

I believe there is at least one strategy:

  1. taking the initial aligned bilingual corpus of the existing model
  2. adding your aligned documents to that corpus
  3. retraining the model on the new corpus.
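
Steps 1 and 2 above could be sketched roughly as follows. This is only an illustration under assumptions, not this project's API: the function name `merge_aligned_corpora` and the sample sentence pairs are hypothetical, and the corpus is assumed to be a list of (source, target) sentence pairs. Step 3 (retraining) depends on the project's own training pipeline and is not shown.

```python
from typing import Iterable, List, Tuple

# One aligned sentence pair: (source sentence, target sentence).
Pair = Tuple[str, str]


def merge_aligned_corpora(base: Iterable[Pair], extra: Iterable[Pair]) -> List[Pair]:
    """Concatenate two aligned corpora, dropping empty and duplicate pairs.

    Hypothetical helper: order is preserved, base pairs come first.
    """
    seen = set()
    merged: List[Pair] = []
    for src, tgt in list(base) + list(extra):
        pair = (src.strip(), tgt.strip())
        # Skip pairs where either side is empty; they would break alignment.
        if not pair[0] or not pair[1]:
            continue
        # Skip exact duplicates so repeated sentences are not over-weighted.
        if pair in seen:
            continue
        seen.add(pair)
        merged.append(pair)
    return merged


# Made-up example data: the existing model's corpus and new domain-specific pairs.
base_corpus = [("Hello.", "Bonjour."), ("Thank you.", "Merci.")]
domain_corpus = [
    ("The enzyme binds the substrate.", "L'enzyme se lie au substrat."),
    ("Thank you.", "Merci."),  # already present in the base corpus
    ("", "Phrase sans source."),  # misaligned pair, dropped
]

merged = merge_aligned_corpora(base_corpus, domain_corpus)
```

The merged list would then be written out in whatever format the training step expects. Because the domain data is usually much smaller than the original corpus, it may need to be up-weighted for the retrained model to pick it up; that is a training-side choice and not something this sketch addresses.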

This would require adding local data; see #24.

@mayeulk

mayeulk commented Nov 4, 2023

Duplicate of #12
