My own summary of the top solutions from the Kaggle LLM competition #1173

AkihikoWatanabe · 2023-12-04T04:08:52Z

https://note.com/japan_d2/n/na873dd82de6a?sub_rt=share_h

AkihikoWatanabe · 2023-12-04T04:10:30Z

A comprehensive summary of practical techniques (e.g., tricks for chunk generation and for query generation); very useful.

AkihikoWatanabe · 2023-12-04T06:15:50Z

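Personally, I found it interesting that the competition organizers provided very little data, so most of the top teams used ChatGPT (3.5, 4) to generate QA data. An example prompt is shown below:
(quoted from 5th-place-solution)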

from string import Template  # needed for user_content_template_qa below

system_content = """
Forget all the previous instruction and rigorously follow the rule specified by the user.
You are a professional scientist's assistant.
"""

user_content_template_qa = Template(
    """
Please consider 5 choices question and answer of the following TEXT.
The purpose of this question is to check respondent's deep science understanding of the TEXT.
We assume this question is for professional scientists, so consider super difficult question.
You can ask very detailed question, for example check specific sentence's understanding.
It is good practice to randomly choose specific sentence from given TEXT, and make QA based on this specific sentence.
You must make QA based on the fact written in the TEXT.
You may create wrong answers based on the correct answer's information, by modifying some parts of the correct answer.
Your response must be in following format, don't write any other information. 
You must not include "new line" in each Q), 1), 2), 3), 4), 5), and A):
Q) `question text comes here`
1) `answer candidate 1`
2) `answer candidate 2`
3) `answer candidate 3`
4) `answer candidate 4`
5) `answer candidate 5`
A) `answer`

where only 1 `answer candidate` is the correct answer and other 4 choices must be wrong answer.
Note1: I want to make the question very difficult, so please make wrong answer to be not trivial incorrect.
Note2: The answer candidates should be long sentences around 30 words, not the single word.
Note3: `answer` must be 1, 2, 3, 4 or 5. `answer` must not contain any other words.
Note4: Example of the question are "What is ...", "Which of the following statements ...", "What did `the person` do",
and "What was ...".
Note5: Question should be science, technology, engineering and mathematics related topic. 
If the given TEXT is completely difference from science, then just output "skip" instead of QA.


Here is an example of your response, please consider this kind of difficulty when you create Q&A:
Q) Which of the following statements accurately describes the impact of Modified Newtonian Dynamics (MOND) on the observed "missing baryonic mass" discrepancy in galaxy clusters?"
1) MOND is a theory that reduces the observed missing baryonic mass in galaxy clusters by postulating the existence of a new form of matter called "fuzzy dark matter."
2) MOND is a theory that increases the discrepancy between the observed missing baryonic mass in galaxy clusters and the measured velocity dispersions from a factor of around 10 to a factor of about 20.
3) MOND is a theory that explains the missing baryonic mass in galaxy clusters that was previously considered dark matter by demonstrating that the mass is in the form of neutrinos and axions.
4) MOND is a theory that reduces the discrepancy between the observed missing baryonic mass in galaxy clusters and the measured velocity dispersions from a factor of around 10 to a factor of about 2.
5) MOND is a theory that eliminates the observed missing baryonic mass in galaxy clusters by imposing a new mathematical formulation of gravity that does not require the existence of dark matter.
A) 4

Let's start. Here is TEXT: $title\n$text
"""
)
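
The prompt above fixes a strict line-based response format (`Q)`, `1)` to `5)`, `A)`, or the single word `skip`), so the generated text can be turned into structured QA records with a small parser. The following is a minimal sketch of that idea, not the 5th-place team's actual code: `demo_template`, `build_user_content`, and `parse_qa` are hypothetical names introduced here for illustration, and the short `demo_template` stands in for the full template quoted above.

```python
import re
from string import Template
from typing import Optional

# Hypothetical stand-in for the much longer template quoted above.
demo_template = Template("Let's start. Here is TEXT: $title\n$text")

# One labelled field per line: "Q) ...", "1) ..." to "5) ...", "A) ...".
LINE_RE = re.compile(r"^(Q|[1-5]|A)\)\s*(.*)$")


def build_user_content(title: str, text: str) -> str:
    """Fill the $title and $text placeholders of the template."""
    return demo_template.substitute(title=title, text=text)


def parse_qa(response: str) -> Optional[dict]:
    """Parse one response; return None for "skip" or malformed output."""
    if response.strip().lower() == "skip":
        return None
    fields = {}
    for line in response.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            # Drop backticks in case the model copies them from the format spec.
            fields[match.group(1)] = match.group(2).strip().strip("`").strip()
    expected = {"Q", "1", "2", "3", "4", "5", "A"}
    if set(fields) != expected or fields["A"] not in {"1", "2", "3", "4", "5"}:
        return None
    return {
        "question": fields["Q"],
        "choices": [fields[str(i)] for i in range(1, 6)],
        "answer": int(fields["A"]),
    }
```

Returning `None` for both `skip` and malformed responses is a design choice made for this sketch: it lets a caller drop unusable generations with a single check before adding the rest to a training set.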

AkihikoWatanabe added InformationRetrieval NLP RetrievalAugmentedGeneration Article LanguageModel labels Dec 4, 2023
