Reimplementation of OpenAI's "Learning to summarize from human feedback" (blog, paper, original code).
I'm doing this to spin up on PyTorch and on some of OpenAI's safety/alignment ideas. As much as possible, I'm trying not to look at OpenAI's code. I'll only do that if I get very stuck, since it kinda hurts my learning experience. The goal is to replicate their results, working from the paper and knowing that they use PyTorch.