In certain datasets (such as EDTSUM, finQA,...), the input size might exceed the default maximum context length of the language models. I am curious to know the methodologies and considerations employed by the PIXIU team when dealing with such situations. How do you handle evaluations for large input sizes, and what strategies or techniques are implemented to ensure accurate and meaningful results?
Thank you for raising this issue and for your interest in how we handle evaluations with large input sizes. Here's how we approach this challenge:
Context Length Limitation: Context length is a crucial limitation for large language models (LLMs), especially for smaller models in the 7B-parameter range. This limitation becomes particularly significant for datasets with inherently large input sizes, such as EDTSUM and finQA.
Truncation for Fair Comparison: To manage this issue and ensure a fair comparison across models, we truncate the input to fit within the maximum context length each model can handle. Although this approach may not capture the full context of the data, it allows for consistent evaluation metrics across models.
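In practice, this kind of truncation can be as simple as clipping the tokenized input to the model's context window while reserving room for the generated output. Here is a minimal sketch of the idea; the function name, the reserve value, and the 2048-token window are illustrative assumptions, not PIXIU's actual code:

```python
def truncate_to_context(tokens, max_context_len, reserved_for_output=256):
    """Clip a token sequence so that the prompt plus the model's
    generated output fit within max_context_len tokens."""
    budget = max_context_len - reserved_for_output
    if budget <= 0:
        raise ValueError("reserved_for_output exceeds max_context_len")
    return tokens[:budget]

# Example: a long document against a 2048-token context window
# (a common default for smaller 7B-class models).
tokens = list(range(5000))  # stand-in for a tokenized input
kept = truncate_to_context(tokens, max_context_len=2048)
print(len(kept))  # 1792 input tokens remain; 256 are reserved for output
```

A real pipeline would truncate at the tokenizer level (e.g. a tokenizer's built-in `truncation`/`max_length` options) rather than slicing a list, but the budgeting logic is the same.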
Impact on Model Performance: For specific datasets like fintrade, the impact of the context length limitation is particularly evident. Smaller models, constrained by their shorter context windows, often fail to generate a trading action at all. This demonstrates how critical context length is to LLM performance, especially on tasks that require analyzing large volumes of data.
We're continuously exploring ways to mitigate these limitations and improve our models' ability to handle large input sizes more effectively. Your interest and inquiries are invaluable to our ongoing efforts and discussions on this front.