Pretrained BERT model? #197

Closed
priamai opened this issue Sep 4, 2023 · 3 comments

Comments

priamai commented Sep 4, 2023

Hi there,
I have just deployed the latest version via Docker and noticed that there are only 2 pretrained models.

image

It would be useful to know how to:
a) train SciBERT on an annotated dataset (the link is broken; I guess it points to a private repo)
b) download a pretrained SciBERT

Cheers!

Contributor

mehaase commented Sep 5, 2023

Hi @priamai, that screen is a bit misleading. It only shows stats for the models that were trained inside the container; the SciBERT model is trained outside the container (by us, on high-end GPUs) and downloaded into the Docker container. If you want to fine-tune the model on your own data, we have some Jupyter notebooks to facilitate that: https://github.com/center-for-threat-informed-defense/tram/wiki/Large-Language-Models#jupyter-notebooks

(I also fixed the broken link that you were looking at: https://github.com/center-for-threat-informed-defense/tram/wiki/Data-Annotation)

Author

priamai commented Sep 5, 2023

Hi @mehaase, but when I upload a report it doesn't let me choose the model, so does it default to SciBERT?
Thanks for fixing the link!
I love the Colab notebooks, so we can fine-tune for free on Colab!

Contributor

mehaase commented Sep 5, 2023

Yes, it defaults to SciBERT. The choice of model is specified in entrypoint.sh.

mehaase closed this as completed Sep 23, 2023