
Question Regarding LLMs Used in Training and Inference #8

@wz-wz111

Description

First, I would like to express my sincere appreciation for your excellent work. Your proposed method is highly innovative, the research approach is clear, and it addresses important challenges. I found your paper very insightful and was deeply impressed by your contributions.

While studying your code implementation, I came across a question about the model configuration that I hope you could help clarify, concerning the large language models used in training and inference:

  • I noticed that the Gemini API is used during inference. However, an API-based model exposes only generated text, not internal states, so gradient backpropagation through it would not be possible during training.
  • If different models are used for training and inference, does this discrepancy affect the performance of the knowledge adapter?
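To make the concern in the first bullet concrete, here is a minimal toy sketch (all names and numbers are hypothetical, not from the repository): a small trainable "adapter" parameter can be optimized only when the downstream model is white-box and its loss can be evaluated as a differentiable function, whereas a black-box API returns text alone, through which no gradient can flow.

```python
def local_llm_loss(adapter_weight, x, target):
    """Toy white-box LLM: a differentiable function, so the adapter is trainable."""
    hidden = adapter_weight * x          # adapter transforms the input
    output = 2.0 * hidden                # stand-in for frozen local LLM layers
    return (output - target) ** 2        # loss depends on adapter_weight

def grad_wrt_adapter(adapter_weight, x, target, eps=1e-6):
    """Numerical gradient: only possible because the full function is evaluable."""
    up = local_llm_loss(adapter_weight + eps, x, target)
    down = local_llm_loss(adapter_weight - eps, x, target)
    return (up - down) / (2 * eps)

def api_llm(prompt: str) -> str:
    """Black-box API (e.g. a hosted model): returns text only, no gradients."""
    return "generated text"              # backpropagation stops here

# Training the adapter against the local white-box model works:
w = 0.5
for _ in range(100):
    w -= 0.05 * grad_wrt_adapter(w, x=1.0, target=4.0)
# w converges toward 2.0, since 2.0 * (w * 1.0) == 4.0
```

If training uses a local model like this while inference swaps in the API model, the adapter was never optimized against the model it serves at test time, which is exactly the potential mismatch the second bullet asks about.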

Understanding this aspect is crucial for me to fully grasp the working mechanism of your model. I would be extremely grateful if you could spare some time to address these questions.
Thank you once again for your outstanding work and for making your code publicly available!
