alignment-tax

Here is 1 public repository matching this topic...

MidnightDarling / when-better-means-less

When Better Means Less: Quantifying What Benchmarks Miss Between Model Generations. 2,310 controlled comparisons show that the GPT-5 series lost 6.7x creativity and gained 4.4x false refusals vs. chatgpt-4o-latest, a shift invisible to standard benchmarks.

ai-safety model-comparison gpt-5 llm-evaluation benchmark-evaluation chatgpt-4o-latest keep4o alignment-tax false-refusal-rate creativity-measurement

Updated Feb 23, 2026
Python

