
Question Regarding LLMs Used in Training and Inference #8

@wz-wz111

Description

First, I would like to express my sincere appreciation for your excellent work. Your proposed method is highly innovative, the research approach is clear, and it addresses important challenges. I found your paper very insightful and was deeply impressed by your contributions.

While studying your code implementation, I came across a question about the model configuration that I hope you could help clarify, concerning the large language models used in training and inference:

  • I noticed that the Gemini API is used during inference. However, an API-based model exposes only generated text, not internal states, so gradient backpropagation through it would not be possible during training.
  • If different models are used for training and inference, does this discrepancy affect the performance of the knowledge adapter?
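To make the concern in the first bullet concrete, here is a minimal toy sketch (all names and numbers are hypothetical, not from the repository): a small trainable "adapter" parameter can be optimized only when the downstream model is white-box and its loss can be evaluated as a differentiable function, whereas a black-box API returns text alone, through which no gradient can flow.

```python
def local_llm_loss(adapter_weight, x, target):
    """Toy white-box LLM: a differentiable function, so the adapter is trainable."""
    hidden = adapter_weight * x          # adapter transforms the input
    output = 2.0 * hidden                # stand-in for frozen local LLM layers
    return (output - target) ** 2        # loss depends on adapter_weight

def grad_wrt_adapter(adapter_weight, x, target, eps=1e-6):
    """Numerical gradient: only possible because the full function is evaluable."""
    up = local_llm_loss(adapter_weight + eps, x, target)
    down = local_llm_loss(adapter_weight - eps, x, target)
    return (up - down) / (2 * eps)

def api_llm(prompt: str) -> str:
    """Black-box API (e.g. a hosted model): returns text only, no gradients."""
    return "generated text"              # backpropagation stops here

# Training the adapter against the local white-box model works:
w = 0.5
for _ in range(100):
    w -= 0.05 * grad_wrt_adapter(w, x=1.0, target=4.0)
# w converges toward 2.0, since 2.0 * (w * 1.0) == 4.0
```

If training uses a local model like this while inference swaps in the API model, the adapter was never optimized against the model it serves at test time, which is exactly the potential mismatch the second bullet asks about.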

Understanding this aspect is crucial for me to fully grasp the working mechanism of your model. I would be extremely grateful if you could spare some time to address these questions.
Thank you once again for your outstanding work and for making your code publicly available!
