This repo showcases a simple (and silly?) LLM eval:
Can the LLM generate a four-line poem that follows the ABBA rhyme scheme?
Read the blog post for more context.
An ABBA poem is a four-line poem in which the first and last lines rhyme with each other, and the second and third lines rhyme with each other. For example:
In the realm of code and collaboration, (A)
Where developers unite and innovation thrives, (B)
GitHub stands tall, a platform that survives, (B)
Fostering creativity and inspiration. (A)
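To make the definition concrete, here is a rough sketch of a check in Python. It is only a spelling-based heuristic (it compares the last few letters of each line's final word), not a real rhyme detector, and it is not how this repo evaluates poems; the results here are labeled by hand. The names `line_ending`, `looks_abba`, and the default of 3 letters are made up for illustration.

```python
import string

def line_ending(line: str, n: int = 3) -> str:
    """Return the last n letters of the line's final word, lowercased."""
    words = line.split()
    if not words:
        return ""
    # Strip trailing punctuation such as "," or "." before comparing.
    last_word = words[-1].strip(string.punctuation).lower()
    return last_word[-n:]

def looks_abba(poem: str, n: int = 3) -> bool:
    """Crude check: lines 1 and 4 share an ending, lines 2 and 3 share a different one."""
    lines = [ln for ln in poem.splitlines() if ln.strip()]
    if len(lines) != 4:
        return False
    a1, b1, b2, a2 = (line_ending(ln, n) for ln in lines)
    # Require A != B so that an AAAA poem does not count as ABBA.
    return bool(a1) and bool(b1) and a1 == a2 and b1 == b2 and a1 != b1

EXAMPLE_POEM = """In the realm of code and collaboration,
Where developers unite and innovation thrives,
GitHub stands tall, a platform that survives,
Fostering creativity and inspiration."""

print(looks_abba(EXAMPLE_POEM))  # True
```

Spelling is a poor proxy for sound ("though" and "tough" end alike but do not rhyme), so treat this as an illustration of the scheme rather than a replacement for manual labeling.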
Can you use LLMs to generate ABBA poems? If so, how? Today, only GPT-4o does an excellent job out of the box, but there are prompting tricks you can apply to make smaller models generate ABBA poems as well.
Notebooks:
- Lets you generate the poems in various ways
- Lets you label the results
- Lets you compare the results
The package only contains the labeling code from PigeonXT, which was copied in to make a minor change to how the results are printed.
Contributions are welcome, for example adding more:
- Models
- Evaluation Criteria
- Prompting Strategies
Feel free to create an issue so we can discuss it!