add stacked embedding in LanguageModel model #3165

Closed
datanerdo opened this issue Mar 28, 2023 · 11 comments
Labels
question Further information is requested

Comments

@datanerdo commented Mar 28, 2023

Question

How can I add stacked embedding features to our LanguageModel model? For my text generation project, I have customised the embeddings and trained two models, a forward and a backward model, but I haven't yet found the correct way to add the stacked embedding to the LanguageModel model.

@alanakbik (Collaborator)

Hello @datanerdo, you can combine your Flair language models with StackedEmbeddings into a single embedding. But this embedding can then only be used for downstream tasks like sequence tagging or text classification. See for instance the tutorial here.

We do not yet have text generation as a trainable downstream task and use the LanguageModel only for pre-training a single LM. So an embedding stack cannot be added here. Perhaps we could add such functionality in the future. Could you describe your text generation task in more detail?
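
For illustration, a minimal sketch of such a stack for downstream use (the model paths are placeholders for your own trained LMs):

from flair.data import Sentence
from flair.embeddings import FlairEmbeddings, StackedEmbeddings

# load your trained forward and backward character LMs as embeddings
# (placeholder paths -- point these at your own best-lm.pt files)
forward_embedding = FlairEmbeddings("resources/LMs/forward/best-lm.pt")
backward_embedding = FlairEmbeddings("resources/LMs/backward/best-lm.pt")

# combine both into a single stacked embedding
stacked_embeddings = StackedEmbeddings([forward_embedding, backward_embedding])

# embed an example sentence with the stack
sentence = Sentence("The grass is green .")
stacked_embeddings.embed(sentence)
for token in sentence:
    print(token, token.embedding.size())

Such a stacked embedding can then be passed to a downstream model like a SequenceTagger or TextClassifier, but not back into a LanguageModel.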

@datanerdo (Author)

Our main project objective is to automate the text generation of a report.
One of our attempts to achieve this uses a character-based model with the text generation method in Flair.

We tried to customise our own embedding by following the tutorials in Chapters 5 and 7 from the book, "Natural Language Processing with Flair (by Tadej Magajna)".

We have trained the forward and backward models separately. The issue is that we are unsure how to stack the forward and backward Flair embeddings together.

So far we have used the LanguageModel class in our model. However, we found that there is no parameter available in the class to allow us to use StackedEmbeddings.

Do you have any suggestions on how to approach this matter?

Also, as an alternative, we are thinking of using the Seq2SeqGenerator class, and we would like to ask your advice on this. Is it available in Flair, and is it suitable for our project? If so, do you have a tutorial for this that you can kindly share with us?

If not, is it possible to add a parameter to the LanguageModel class that would allow us to use StackedEmbeddings?

Thank you for your kind help. We really appreciate it.

@alanakbik (Collaborator)

Hello @datanerdo, the two language models can each generate text, but they cannot be combined.

The forward LM can generate likely continuations, which is probably what you are looking for. (The backward LM can only generate "previous" text in inverted form, so it is unlikely to be useful for text generation.)

Have you tried using the generate_text function of the forward model you trained? You can sample some generated text like this:

from flair.models import LanguageModel

language_model = LanguageModel.load_language_model("resources/LMs/1BW-1024/best-lm.pt")
print(language_model)

# set a prefix you want to continue
prefix = 'The meaning of life is'

# generate 10 samples of text continuations
for i in range(10):
    # print generated text (50 generated characters)
    generated_continuation = language_model.generate_text(prefix, number_of_characters=50)
    print(generated_continuation)

Regarding Seq2SeqGenerator: we don't have good support for Seq2Seq in Flair at the moment, so I cannot say how well this would work.

@datanerdo (Author)

Yes, that works for me! One final question: what is the best technique to evaluate a language model in terms of its perplexity and to see how well our model generalizes? Thank you :))

@alanakbik (Collaborator)

Hello @datanerdo, we typically do an "intrinsic" and an "extrinsic" evaluation. For the intrinsic evaluation, we compute perplexity on holdout data; this happens automatically when you train the model. In our case, the extrinsic evaluation is done on downstream NLP tasks like NER and PoS tagging: we use the language model as an embedding and see how well the downstream task is solved.

In your case of text generation, you could have a look at BLEU or other metrics for generated text, provided you have suitable evaluation data.
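
For the intrinsic part, a minimal sketch (this assumes your Flair version exposes LanguageModel.calculate_perplexity; the model path reuses the example from above, and the held-out sentence is a placeholder):

from flair.models import LanguageModel

# load the trained forward model (example path from above)
language_model = LanguageModel.load_language_model("resources/LMs/1BW-1024/best-lm.pt")

# perplexity of held-out text; lower values suggest better generalization
holdout_text = "The quarterly report was submitted on time ."
print(language_model.calculate_perplexity(holdout_text))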

@datanerdo (Author)

thanks Alan! will do

@datanerdo (Author) commented Apr 17, 2023

Hi, I tried the generate_text method, and the result now contains the generated text followed by a number. What does that number indicate?

@alanakbik (Collaborator)

The number is the average log probability of the generated sample text. The higher the number, the more "unusual" the sample (i.e., more characters were sampled that were less likely given the preceding text).
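
In code you can capture the two parts separately (a small sketch, reusing language_model and prefix from the snippet above, and assuming generate_text returns the text and this number as a pair, as your printed output suggests):

text, log_prob = language_model.generate_text(prefix, number_of_characters=50)
print(text)      # the generated continuation
print(log_prob)  # average log probability of the sampled characters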

@datanerdo (Author)

Hi, I have a question regarding the generate_text method, and I was hoping you could provide me with some guidance.

[screenshot: code calling generate_text]

Specifically, I set the parameters number_of_characters=200 and prefix='cybersecurity' as follows, so the model should predict 200 characters after the prefix. However, the result turns out like this:

[screenshot: generated output shorter than 200 characters]

Could you please explain the reason behind this? Why is the model not generating 200 characters?

I appreciate your expertise and insight on this matter and I am excited to learn more from you. Thank you very much for your time and assistance.

@helpmefindaname (Collaborator)

Hi @datanerdo,
since you have set break_on_suffix="\n" in your code, text generation stops as soon as the \n symbol is generated, and therefore produces fewer characters.
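
For illustration, a small sketch of the difference (reusing the language_model loaded earlier in this thread, and assuming generate_text returns a text/score pair, as discussed above):

# stops at the first generated "\n", so the text can be shorter than 200 characters
text, _ = language_model.generate_text("cybersecurity", number_of_characters=200, break_on_suffix="\n")
print(len(text))

# without break_on_suffix, sampling continues for the full 200 characters
text, _ = language_model.generate_text("cybersecurity", number_of_characters=200)
print(len(text))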
