Use LLMs to answer difficult science questions
There are already some excellent notebooks demonstrating the use of HuggingFace's `AutoModelForMultipleChoice` for multiple-choice tasks in the Kaggle - LLM Science Exam competition. However, it is hard to see what is actually happening inside the model. This led me to create this notebook, which centers on building a multiple-choice model from the ground up, using the standard classifier from KerasNLP. In this notebook, I also use the multi-backend KerasCore alongside KerasNLP.
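The core idea behind building multiple choice from a standard classifier can be sketched in a few lines: score each (question, option) pair independently, then softmax the per-option scores across the option axis. This is a minimal NumPy sketch under that assumption; the logit values are illustrative, not from the actual model.

```python
import numpy as np

# Hypothetical per-option scores: one scalar logit for each
# (question, option) pair, for a batch of 2 questions with 5 options (A-E).
option_logits = np.array([
    [2.1, 0.3, -1.0, 0.5, 0.2],   # question 1
    [0.1, 0.1, 3.2, -0.5, 0.0],   # question 2
])

# Softmax across the option axis turns the scores into a probability
# distribution over answer choices for each question.
exp = np.exp(option_logits - option_logits.max(axis=-1, keepdims=True))
probs = exp / exp.sum(axis=-1, keepdims=True)

predictions = probs.argmax(axis=-1)  # index of the most likely option
print(predictions)  # [0 2] -> options "A" and "C"
```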
Furthermore, as time progresses, larger datasets are likely to become available, in which case TPUs will be invaluable for training large models on them.
- training: LLM Science Exam: KerasCore + KerasNLP [TPU]
- inference: LLM Science Exam: KerasCore + Keras [Infer]
Note: Train and Inference notebooks are also available in the `notebooks` folder.
In the image below, you'll find `token_ids` on the left and the corresponding `padding_masks` on the right:
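The relationship between the two arrays can be shown with a small sketch. This assumes a fixed sequence length and toy token ids (KerasNLP preprocessors produce the same pair of arrays using real tokenizers):

```python
SEQ_LEN = 8
PAD_ID = 0  # id reserved for padding in this toy example

def pad_and_mask(token_ids, seq_len=SEQ_LEN):
    """Pad token ids to seq_len; mask is 1 for real tokens, 0 for padding."""
    token_ids = token_ids[:seq_len]
    pad_len = seq_len - len(token_ids)
    mask = [1] * len(token_ids) + [0] * pad_len
    return token_ids + [PAD_ID] * pad_len, mask

# e.g. a short tokenized sequence like "[CLS] hello world [SEP]"
ids, mask = pad_and_mask([101, 7592, 2088, 102])
print(ids)   # [101, 7592, 2088, 102, 0, 0, 0, 0]
print(mask)  # [1, 1, 1, 1, 0, 0, 0, 0]
```

The mask lets the model's attention layers ignore the padded positions.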
I also tried a fun augmentation, `ShuffleOptions`. This approach involves shuffling the answer options of each question. For instance, options `[A, B, C]` would be transformed into `[C, A, B]`. The purpose behind this augmentation is to ensure that the model doesn't focus on the positions of the options.
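A `ShuffleOptions`-style augmentation can be sketched as follows (a minimal sketch; the function name and signature are illustrative, not the notebook's actual implementation). The key detail is remapping the answer index so the label still points at the correct option after shuffling:

```python
import random

def shuffle_options(options, answer_index, rng=random):
    """Shuffle answer options and remap the correct-answer index."""
    order = list(range(len(options)))
    rng.shuffle(order)
    shuffled = [options[i] for i in order]
    # The correct option moved to wherever its old index landed.
    new_answer = order.index(answer_index)
    return shuffled, new_answer

rng = random.Random(0)  # seeded for reproducibility
opts, ans = shuffle_options(["A-text", "B-text", "C-text"], answer_index=0, rng=rng)
# opts[ans] is still "A-text", whatever the new order is
```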
You can track all the experiments here.
- Setting the backend to `tensorflow` leads to OOM in RAM, which is very odd. You can solve it by either using the `jax` backend or using `tf.keras` instead of `keras`.
- Currently, `TPU` throws an error with `tensorflow`. You can use the `jax` backend with `keras_core` to resolve this issue.
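Switching backends is done through the `KERAS_BACKEND` environment variable, which must be set before `keras_core` is imported, since the backend is picked up at import time:

```python
import os

# Select the JAX backend; this must happen BEFORE importing keras_core,
# as the backend choice is read once at import time.
os.environ["KERAS_BACKEND"] = "jax"

# import keras_core as keras  # now uses the jax backend
```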