Implementation notes

Base
300k total samples
8k single digit * single digit
Randomly initialised GPT-2
Trained for 300 epochs
Use # as =

Improvements
Product contains 2n digits and numbers have n digits, with 0 padding
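
The padding scheme noted above can be illustrated with a minimal sketch. The notes do not specify the exact sample layout, so the function name `format_sample` and the `a*b#product` string layout are assumptions made for illustration; only the n-digit operands, the 2n-digit product, the 0 padding, and `#` standing in for `=` come from the notes.

```python
def format_sample(a: int, b: int, n: int) -> str:
    """Hypothetical formatter for one multiplication sample.

    Operands are zero-padded to n digits and the product to 2n digits.
    '#' is used in place of '=' as the notes describe.
    The 'a*b#product' layout itself is an assumption.
    """
    limit = 10 ** n
    if not (0 <= a < limit and 0 <= b < limit):
        raise ValueError(f"operands must be non-negative and fit in {n} digits")
    # The product of two n-digit numbers is below 10**(2n), so 2n digits always suffice.
    return f"{a:0{n}d}*{b:0{n}d}#{a * b:0{2 * n}d}"


if __name__ == "__main__":
    print(format_sample(12, 34, 3))  # 012*034#000408
    print(format_sample(3, 2, 1))    # 3*2#06
```

With fixed widths, every sample for a given n has the same length, and each digit of the product always sits at the same position relative to `#`.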