This data is a collation of several chess opening databases, identified as follows:
- eco_tsv: Source: eco. This is the authoritative database; where it conflicts with the databases listed below (for example on move order or ECO code), eco_tsv takes precedence.
- eco_js: The original eco.json data from several years ago, which contains some openings not in eco_tsv.
- scid: A database that's part of a SourceForge project, pulled via Waterford Chess Club's website. SCID codes extend ECO, and opening names vary.
- eco_wikip: Opening data from the Wikipedia page at https://en.wikipedia.org/wiki/List_of_chess_openings (Aug. 2024)
There is a JSON file for each of the ECO categories A, B, C, D, and E; e.g. ecoB.json. In addition, there is a /tooling folder with scripts for manipulating the data in the JSON files. They have a Node.js-compatible ".mjs" extension so that they can be run standalone from the command line. For example:
node tooling/ecoConjoin.mjs
{
"fen": "rnbqkb1r/pppppppp/8/3nP3/3P4/8/PPP2PPP/RNBQKBNR b KQkq",
"src": "eco_tsv",
"eco": "B03",
"moves": "1. e4 Nf6 2. e5 Nd5 3. d4",
"name": "Alekhine Defense",
"aliases": {
"eco_js": "Alekhine Defense, 2. e5 Nd5 3. d4",
"scid": "Alekhine: 3.d4"
},
"scid": "B03a",
"isEcoRoot": true
}
fen
The Forsyth-Edwards Notation (FEN) of the position on the board after all opening moves are played. The FEN uniquely identifies each opening.
src
Identifies the source of the opening data; normally this will be eco_tsv, but could be eco_js or scid if no eco_tsv opening corresponds to the fen.
eco
The ECO code of the opening; multiple openings can share the same ECO (it is a category, not an identifier)
moves
The "standard" move sequence of the opening. Some openings can be arrived at by transposition, so opening moves are not identifiers.
name
The common English name of an opening. Origin of the name is determined by src, but there can be aliases from other sources
aliases
These are variations of what the opening is called. For example, the Ruy Lopez opening is sometimes called the Spanish Opening, depending on source
scid
Since SCID codes extend ECO codes, the SCID code is included where applicable.
isEcoRoot
If true, this variation's moves appear in the Encyclopedia of Chess Openings as the root variation for the ECO code given in the eco field.
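The field descriptions above suggest two natural ways to organize records: an index keyed by fen (the unique identifier) and a grouping keyed by eco (a category). The following is a minimal sketch under the assumption that records are available as an array of objects shaped like the example; the helper names indexByFen, groupByEco, and describe are hypothetical and not part of the tooling.

```javascript
// One record, copied from the example above.
const records = [
  {
    fen: "rnbqkb1r/pppppppp/8/3nP3/3P4/8/PPP2PPP/RNBQKBNR b KQkq",
    src: "eco_tsv",
    eco: "B03",
    moves: "1. e4 Nf6 2. e5 Nd5 3. d4",
    name: "Alekhine Defense",
    aliases: {
      eco_js: "Alekhine Defense, 2. e5 Nd5 3. d4",
      scid: "Alekhine: 3.d4",
    },
    scid: "B03a",
    isEcoRoot: true,
  },
];

// FEN is the unique identifier, so it is the lookup key.
function indexByFen(list) {
  return new Map(list.map((r) => [r.fen, r]));
}

// ECO is a category: several records may share one code.
function groupByEco(list) {
  const groups = new Map();
  for (const r of list) {
    if (!groups.has(r.eco)) groups.set(r.eco, []);
    groups.get(r.eco).push(r);
  }
  return groups;
}

// Primary name first, then any alternative names from other sources.
function describe(record) {
  const aliases = Object.values(record.aliases ?? {});
  return aliases.length
    ? `${record.name} (also: ${aliases.join("; ")})`
    : record.name;
}

console.log(describe(indexByFen(records).get(records[0].fen)));
```

Looking up by moves or name instead would be unreliable, since neither is an identifier.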
In eco.json there are 1811 "orphan" variations. An orphan variation has no from field, indicating that there is no preceding named variation. There are moves that precede the last move of the orphan variation (unless it's a first move, of course). Opening records can be created for these preceding move sequences, filling in the gaps in the eco.json data structure.
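As a rough illustration of the orphan definition, the sketch below filters an object keyed by FEN for records that lack a from field. The sample data is entirely hypothetical: the keys are placeholders rather than real FEN strings, and the value format of from (here, an array of predecessor keys) is an assumption made for this example, as is the helper name findOrphans.

```javascript
// Hypothetical stand-in for eco.json; keys are placeholders, not real FENs.
const sample = {
  "fen-A": { name: "Variation A", moves: "1. e4 e5" }, // no `from`: orphan
  "fen-B": { name: "Variation B", moves: "1. e4 e5 2. Nf3", from: ["fen-A"] },
};

// An orphan is any record without a `from` field.
function findOrphans(openings) {
  return Object.entries(openings)
    .filter(([, opening]) => !("from" in opening))
    .map(([fen, opening]) => ({ fen, ...opening }));
}

console.log(findOrphans(sample).map((o) => o.name));
```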
Let's take a look at one case. One of the opening books at eco_tsv contains these four variations of the Alekhine Defense:
There are two entries for the Brooklyn Variation of the Alekhine Defense on lines 132 and 133. However, there is no named variation for the move sequence 1. e4 Nf6 2. e5 Ng8 3. d4. This makes the variation on line 133 "orphaned", in that it has no preceding named variation that leads to it.
Every orphan variation (like the Alekhine Defense: Brooklyn Variation, Everglades Variation) has a move sequence. By moving backwards from the end of the move sequence, we eventually arrive at a named variation, which is called the root variation. Along the way, a record is made of each FEN position and the remaining moves in the sequence. Then interpolated opening objects are created, from the root to the orphan, to bridge the gap between them.
Interpolated opening variations may have a name that simply wasn't found in our sources. This can be corrected over time, and freshly named interpolated openings can be inserted into the eco.json file as they are discovered. In the meantime, the names assigned to interpolated openings are the root name plus " (i)". There may be several of these, but opening names are not required to be unique (only FENs are).
For the example above, the interpolated opening object would be:
{
"rnbqkbnr/pppppppp/8/4P3/3P4/8/PPP2PPP/RNBQKBNR b KQkq - 0 3": {
"src": "interpolated",
"eco": "B02",
"moves": "1. e4 Nf6 2. e5 Ng8 3. d4",
"name": "Alekhine Defense: Brooklyn Variation (i)",
"scid": "B02l",
"aliases": {
"scid": "Alekhine: Brooklyn Defence (Retreat Variation)"
}
}
}
Note that src is labeled "interpolated", meaning it wasn't derived directly from any of the three originating sources: eco_tsv, eco_js, or scid.
It is often desirable for every move subsequence that appears within an opening to have its own entry in the database. For example, the move sequence for the "Queen's Gambit Declined: Exchange Variation" is "1. d4 Nf6 2. c4 e6 3. Nc3 d5 4. cxd5"; however, there is no opening book entry for the subsequence "1. d4 Nf6 2. c4 e6 3. Nc3 d5". This leaves a gap between "1. d4 Nf6 2. c4 e6 3. Nc3" ("Queen's Pawn: Neo-Indian") and the current variation. To fix this, an interpolation is created for the missing subsequence, "Queen's Pawn: Neo-Indian (i)".
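The backward walk from an orphan to its root can be sketched as follows, using the Queen's Gambit Declined example. This is a simplified illustration, not the project's tooling: it matches on move strings rather than FEN positions (computing a FEN requires a chess engine), so it will not detect roots reached by transposition. The function names toPlies, toMoveString, and interpolate are hypothetical, and fields such as eco, scid, and the FEN key are left out.

```javascript
// Split "1. d4 Nf6 2. c4" into plies: ["d4", "Nf6", "c4"].
function toPlies(moves) {
  return moves.split(/\s+/).filter((tok) => tok && !/^\d+\.+$/.test(tok));
}

// Rebuild a numbered move string from a list of plies.
function toMoveString(plies) {
  return plies
    .map((ply, i) => (i % 2 === 0 ? `${i / 2 + 1}. ${ply}` : ply))
    .join(" ");
}

// Walk back from the orphan one ply at a time until a named sequence (the
// root) is found, then return one interpolated record per unnamed sequence,
// ordered from root to orphan.
function interpolate(orphanMoves, namedByMoves) {
  const plies = toPlies(orphanMoves);
  const gaps = [];
  for (let n = plies.length - 1; n > 0; n--) {
    const prefix = toMoveString(plies.slice(0, n));
    const root = namedByMoves.get(prefix);
    if (root) {
      return gaps.reverse().map((moves) => ({
        src: "interpolated",
        moves,
        name: `${root.name} (i)`,
      }));
    }
    gaps.push(prefix);
  }
  return []; // no named root was found
}

// Named sequences known to the database (only the one from the text above).
const named = new Map([
  ["1. d4 Nf6 2. c4 e6 3. Nc3", { name: "Queen's Pawn: Neo-Indian" }],
]);

const filled = interpolate("1. d4 Nf6 2. c4 e6 3. Nc3 d5 4. cxd5", named);
console.log(filled);
```

For this input the result is a single record for "1. d4 Nf6 2. c4 e6 3. Nc3 d5", named "Queen's Pawn: Neo-Indian (i)".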
It's a simple matter to merge eco_interpolated.json with eco.json, once the two files are read into a program as JSON objects. In JavaScript, one way is:
const complete_openings = {...ecoJson, ...interpolatedJson}
(There are also other ways of doing this.)
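One of those other ways is Object.assign, which behaves like the spread form: when the same key appears in more than one source object, the later one wins. The sketch below uses placeholder keys and hypothetical names in place of real FENs and openings; it only illustrates how operand order decides which record survives, should the two files ever share a FEN.

```javascript
// Placeholder keys stand in for real FEN strings; the names are hypothetical.
const ecoJson = {
  "fen-1": { src: "eco_tsv", name: "Named opening" },
};
const interpolatedJson = {
  "fen-1": { src: "interpolated", name: "Named opening (i)" },
  "fen-2": { src: "interpolated", name: "Another opening (i)" },
};

// Same result as {...ecoJson, ...interpolatedJson}: interpolated wins on a shared key.
const interpolatedWins = Object.assign({}, ecoJson, interpolatedJson);

// Reversed order: the sourced record is kept when a key appears in both.
const sourcedWins = { ...interpolatedJson, ...ecoJson };

console.log(interpolatedWins["fen-1"].src, sourcedWins["fen-1"].src);
```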
Credit goes to Shane Hudson for the original SCID opening data.
Original eco.json data was compiled by Ömür Yanıkoğlu.