Kaggle - LLM Science Exam

Use LLMs to answer difficult science questions

Background

There are already some excellent notebooks demonstrating the use of HuggingFace's AutoModelForMultipleChoice for multiple-choice tasks in the Kaggle - LLM Science Exam competition. However, that class makes it hard to see what is actually happening inside the model. This led me to create this notebook, which builds a multiple-choice model from the ground up using the standard classifier from KerasNLP. The notebook also uses the multi-backend KerasCore alongside KerasNLP.

Furthermore, as time progresses, larger datasets are likely to become available, in which case TPUs will be invaluable for training large models on them.

Kaggle Notebooks

Note: Train and Inference notebooks are also available in the notebooks folder.

Model Architecture

In the diagram below, token_ids are shown on the left and the corresponding padding_masks on the right:

[Model Architecture diagram]
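To make the wiring concrete, here is a minimal sketch of this kind of multiple-choice head (the preset name, sequence length, and pooling choice are illustrative assumptions, not the notebook's exact configuration):

```python
import keras_core as keras
from keras_core import ops
import keras_nlp

NUM_OPTIONS = 5  # the competition's questions offer options A-E
SEQ_LEN = 200    # illustrative max sequence length

# Each sample packs every (question, option) pair: (num_options, seq_len).
token_ids = keras.Input(shape=(NUM_OPTIONS, SEQ_LEN), dtype="int32", name="token_ids")
padding_mask = keras.Input(shape=(NUM_OPTIONS, SEQ_LEN), dtype="int32", name="padding_mask")

# Fold the option axis into the batch axis so the backbone sees
# ordinary (batch * num_options, seq_len) sequences.
flat_ids = ops.reshape(token_ids, (-1, SEQ_LEN))
flat_mask = ops.reshape(padding_mask, (-1, SEQ_LEN))

backbone = keras_nlp.models.DebertaV3Backbone.from_preset("deberta_v3_small_en")
sequence_output = backbone({"token_ids": flat_ids, "padding_mask": flat_mask})

# Score each (question, option) pair with one logit, regroup the logits
# per question, and normalize across the options.
pooled = keras.layers.GlobalAveragePooling1D()(sequence_output)
logits = ops.reshape(keras.layers.Dense(1)(pooled), (-1, NUM_OPTIONS))
outputs = keras.layers.Softmax()(logits)

model = keras.Model(
    inputs={"token_ids": token_ids, "padding_mask": padding_mask},
    outputs=outputs,
)
model.compile(optimizer="adam", loss="categorical_crossentropy")
```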

Augmentation

I also tried a fun augmentation, ShuffleOptions. It shuffles the answer options of each question; for instance, options [A, B, C] would become [C, A, B]. The goal is to ensure that the model attends to the content of the options rather than their positions.
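Here is a minimal sketch of the idea (shuffle_options is a hypothetical helper, not necessarily the notebook's exact implementation):

```python
import random

def shuffle_options(options, answer_idx, rng=random):
    """Shuffle the answer options and remap the correct-answer index."""
    order = list(range(len(options)))
    rng.shuffle(order)
    shuffled = [options[i] for i in order]
    return shuffled, order.index(answer_idx)

# For example, ["A", "B", "C"] with answer index 0 may come back as
# (["C", "A", "B"], 1): same content, new positions.
options, answer = shuffle_options(["A", "B", "C"], answer_idx=0)
```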

Tracking with WandB

You can track all the experiments here.

Known Issues

  • Setting the backend to tensorflow leads to an out-of-memory (OOM) error in RAM, which is very odd. You can work around it by either using the jax backend or using tf.keras instead of keras.
  • Currently, TPU throws an error with tensorflow. You can use the jax backend with keras_core to resolve this issue; see the snippet below.
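The backend is selected through the KERAS_BACKEND environment variable, which must be set before keras_core is imported. A small sketch (the printed check via keras.backend.backend() is my assumption of the usual Keras Core API):

```python
import os

# Choose the backend before importing keras_core; it cannot be
# switched later in the same process.
os.environ["KERAS_BACKEND"] = "jax"

import keras_core as keras
print(keras.backend.backend())  # -> "jax"
```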
