Modifying the weights of words in the models #335

Closed
SeonggwanAhn opened this issue Jul 24, 2023 · 3 comments

Comments

@SeonggwanAhn

SeonggwanAhn commented Jul 24, 2023

Hi. Your work on MedCAT is very impressive.
I want to ask you a question.

Are the weights of words in the model changeable?
If so, please let me know how to modify the weights of words in the model.

Thanks

@mart-r
Collaborator

mart-r commented Jul 25, 2023

Hi!

I'm not entirely sure what you're asking.
Are you trying to add more weight to a specific meaning of an ambiguous word (name)?
Are you trying to avoid recognition of certain words (names) altogether?
Something else?

But in general, there is no way to add more weight to any specific concept or to a specific name of a concept.
With that said, the training set will have a significant impact on which concepts and/or names the model is able to effectively identify.

Then again, if you wish to limit the concepts your model is working with, you can always filter out the CUIs you don't need (CDB.filter_by_cui).
Or you could add the CUIs to a filter in the config (config.linking.filters.cuis).
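
As a rough illustration of the two options above, here is a minimal sketch. It assumes a MedCAT v1.x-style `CAT` object whose `cdb` exposes `filter_by_cui` and whose config exposes `linking.filters.cuis` (the two names mentioned above). The helper function names are made up for illustration, and the exact behaviour may differ between MedCAT versions.

```python
def restrict_via_config(cat, cuis_to_keep):
    """Soft filter: leave the CDB untouched, but only link the given CUIs."""
    # Assumption: the config filter is a set-like collection of CUI strings.
    cat.config.linking.filters.cuis = set(cuis_to_keep)
    return cat


def restrict_via_cdb(cat, cuis_to_keep):
    """Hard filter: drop every other concept from the concept database."""
    # Assumption: filter_by_cui modifies the CDB in place, so work on a copy
    # of the model if the full CDB might be needed again later.
    cat.cdb.filter_by_cui(set(cuis_to_keep))
    return cat
```

The config route is the easier one to undo, since nothing is removed from the model itself. The `cat` object would typically come from loading a downloaded model pack, for example with `CAT.load_model_pack` from `medcat.cat` (assuming a v1.x install).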

@SeonggwanAhn
Author

Thanks for your reply.
What I mean is whether I can 'modify the concept vector' of a specific word in the vocabulary.

Or can I further train an already trained (downloaded) model with my additional document?
I want to transfer this model and adapt it for my experiment.

@mart-r
Collaborator

mart-r commented Aug 21, 2023

Yes, you are more than welcome to further train and/or fine-tune an existing model. The additional training data can significantly change what the model can recognise. But it all depends on the training route you're taking (unsupervised or supervised) as well as on the specific training set.

So all in all, by using your own dataset to further train the model, you can probably achieve what you're trying to do.
But it almost certainly won't be possible with a single document.
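
For the unsupervised route, a minimal sketch could look like the following. It assumes a MedCAT v1.x-style `CAT` object whose `train` method accepts an iterable of raw text strings and an `nepochs` argument. The helper name is made up for illustration, and the supervised route would need annotated data instead of raw text.

```python
def further_train_unsupervised(cat, texts, nepochs=1):
    """Continue unsupervised training of an already loaded model on raw texts.

    Returns the number of documents actually passed to training.
    """
    # Skip empty or whitespace-only documents; they carry no context to learn from.
    docs = [t for t in texts if t and t.strip()]
    if not docs:
        raise ValueError("no non-empty documents to train on")
    # Assumption: CAT.train takes an iterable of strings plus `nepochs`.
    cat.train(docs, nepochs=nepochs)
    return len(docs)
```

In practice `texts` would be a reasonably large collection of documents from the target domain. Assuming a v1.x install, the model would be loaded beforehand with `CAT.load_model_pack` and could be saved afterwards with `cat.create_model_pack`.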
