Obtain embedding vectors #1

Closed
rominaappierdo opened this issue Mar 16, 2022 · 4 comments

@rominaappierdo

Hello and thank you for sharing your work!
I would like to know if there is a way to obtain embedding vectors of one (or more) sentences fed into the model.
I hope you can help me.
Thank you in any case.

@MichalBrzozowski91
Collaborator

Hi, thanks for the feedback. Can you elaborate on your use case? So far we have implemented support for longer sequences for the binary classification problem.

@rominaappierdo
Author

Thank you for your reply. I just wanted to know whether there is a way to simply obtain embedding vectors. I would like a representation vector for a given (longer) sequence; there is no need to perform downstream tasks for now. Is it possible to obtain such vectors with your work?

@MichalBrzozowski91
Collaborator

Hi, I have been working on this, and I think it is possible to obtain embedding vectors using our code. I created a PR with an example notebook.
Feel free to ask any questions.
I hope it will help you.

MichalBrzozowski91 self-assigned this on Apr 3, 2022

@MichalBrzozowski91
Collaborator

Hi, we made major changes to this repo. To obtain embedding vectors, we now suggest getting the tokenized chunks from the _tokenize method of the class BertClassifierWithPooling. The resulting tokens can then be fed directly to the pre-trained BERT to obtain an embedding of each chunk.
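
The chunk-then-embed idea described above can be sketched roughly as follows. This is a minimal, library-free illustration and not the repo's code: split_into_chunks, mean_pool, embed_long_sequence and toy_encoder are hypothetical names, the chunk_size and stride defaults are illustrative, and averaging the chunk vectors into a single vector is an assumption (the comment above only mentions per-chunk embeddings). In practice the encode_chunk callable would wrap the pre-trained BERT, and the chunks would come from _tokenize rather than from the splitter shown here.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def split_into_chunks(token_ids: Sequence[int], chunk_size: int, stride: int) -> List[List[int]]:
    """Split a long token sequence into windows that start `stride` tokens apart.

    Hypothetical stand-in for the repo's chunking step. Chunks overlap by
    chunk_size - stride tokens whenever stride < chunk_size.
    """
    if chunk_size <= 0 or stride <= 0:
        raise ValueError("chunk_size and stride must be positive")
    if not token_ids:
        return []
    chunks = []
    start = 0
    while True:
        chunks.append(list(token_ids[start:start + chunk_size]))
        if start + chunk_size >= len(token_ids):
            break  # this chunk already reaches the end of the sequence
        start += stride
    return chunks


def mean_pool(vectors: Sequence[Vector]) -> Vector:
    """Average chunk vectors element-wise (an assumed pooling choice)."""
    if not vectors:
        raise ValueError("need at least one vector to pool")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def embed_long_sequence(
    token_ids: Sequence[int],
    encode_chunk: Callable[[List[int]], Vector],
    chunk_size: int = 510,
    stride: int = 255,
) -> Tuple[List[Vector], Vector]:
    """Return the per-chunk vectors and their mean as one sequence-level vector."""
    chunks = split_into_chunks(token_ids, chunk_size, stride)
    chunk_vectors = [encode_chunk(chunk) for chunk in chunks]
    return chunk_vectors, mean_pool(chunk_vectors)


def toy_encoder(chunk: List[int]) -> Vector:
    """Placeholder encoder, NOT BERT: returns [chunk length, mean token id]."""
    return [float(len(chunk)), sum(chunk) / len(chunk)]


if __name__ == "__main__":
    per_chunk, pooled = embed_long_sequence(list(range(10)), toy_encoder, chunk_size=4, stride=3)
    print(len(per_chunk), pooled)
```

Passing the encoder as a callable keeps the chunking and pooling logic independent of any particular model, so the toy encoder can be swapped for a function that runs one chunk through BERT and returns, for example, its [CLS] vector.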
