taskdemon/code-vs-codex

Claude Code vs OpenAI Codex

This repo contains 5 tasks that I gave to Claude Code and OpenAI Codex. In each case the agent was asked to create a detailed plan for achieving the task, without actually implementing it.

The tasks are in the tasks directory, and the responses from Claude Code and OpenAI Codex are in the agent1 and agent2 directories, respectively. The neutral directory names are my attempt to be scientific: they avoid giving the agents any hint about who wrote which plans when they did their comparison.

agent1 = "Claude Code"
agent2 = "OpenAI Codex"

Judging

Given that Claude Code and OpenAI Codex were responsible for creating the plans, who better to ask for a verdict on which one did a better job? I wrote a detailed PROCESS.md file that contains a prompt asking the AI to compare the 2 sets of generated plans in various ways and give its verdict.

The verdicts are in the verdicts directory. To create from-claude.md I ran claude on the CLI and then gave it the prompt:

please read PROCESS.md and answer what it asks you

To create from-codex.md I ran codex on the CLI and then gave it the same prompt.

Verdicts

Neither agent knew whether agent1 was Claude Code or OpenAI Codex. Both gave a slight edge in their verdicts to agent2, which is OpenAI Codex. You can read the full verdict text yourself, but the general vibe was that Codex produced more focused plans, without what the judges perceived as over-engineering. That over-engineering may well help the LLM during implementation, so YMMV; this eval method is far from gospel.
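For anyone repeating the experiment, the un-blinding step can be sketched in a few lines of Python. This is only an illustration, not something that exists in the repo: the function names `unblind` and `tally` are hypothetical, and it assumes you have already read each verdict file by hand and noted which neutral label it preferred, since the verdicts themselves are free-form prose.

```python
# Mapping taken from the README; the judges only ever saw the neutral labels.
LABELS = {
    "agent1": "Claude Code",
    "agent2": "OpenAI Codex",
}


def unblind(label: str) -> str:
    """Translate a neutral directory label back to the real agent name."""
    return LABELS[label.strip().lower()]


def tally(verdicts: dict[str, str]) -> dict[str, int]:
    """Count how many judges preferred each agent.

    `verdicts` maps a verdict file name to the neutral label that the
    judge preferred (entered by hand after reading the verdict).
    """
    counts = {name: 0 for name in LABELS.values()}
    for preferred in verdicts.values():
        counts[unblind(preferred)] += 1
    return counts


if __name__ == "__main__":
    # Both judges in this repo leaned towards agent2.
    print(tally({"from-claude.md": "agent2", "from-codex.md": "agent2"}))
```

Keeping the label mapping in one place means the plans and the judging prompt never need to mention either product by name.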

Limitations

Obviously, this is a sample set of 5, done one time, in a fairly non-scientific way. OpenAI Codex was released yesterday, so expect all this to change.

Other Impressions

Codex was considerably slower, both when generating the plans and when comparing them. It seemed to want to spend a lot more time reading than Claude Code does.
