Mappings between the headwords of various NT Greek lexicons, the lemmas of MorphGNT and Nestle 1904, and Strong's and GK numbers.
Note that the focus of this repo is on data (and soon code) for mapping between lemmas and other identifiers for lexical items. It is not intended to be the home of glosses, definitions, morphological information, etc. Rather, it is a Rosetta Stone of sorts to help integrate the various resources that do carry that information.
This data is made available under a CC-BY-SA 4.0 License.
Back in 2006, Ulrik Sandborg-Petersen and I spent time integrating various Greek New Testament analyses we'd separately worked on. One output of that work was our joint paper A New Numbering System for Greek New Testament Lexemes. A few years later, as part of my larger work on a Morphological Lexicon of New Testament Greek, I started a broader integration of various lexical resources including BDAG, Danker's Concise Lexicon and even an old word list Bill Mounce had shared with me in 1997. A lot of the data and scripts for that work are in the morphgnt/morphological-lexicon GitHub repo. But when it recently became clear that many people were not aware of the existing work, I decided it was worth extracting the lemma mapping data and code out of that repo (where it's easy for it to be lost) and starting a new repo with a much tighter focus. This repo is the in-progress result.
What's Here Now
The main file is lexemes.yaml. The keys in this file are the lemmas used by the SBLGNT edition of the MorphGNT. The properties under each key are:
full-citation-form: the canonical full citation form (with genitive and article for nouns, etc.)
bdag-headword: the corresponding headword in BDAG
danker-entry: the corresponding full citation form from Danker's Concise Lexicon
dodson-entry: the corresponding full citation form from Dodson's lexicon
mounce-headword: the corresponding headword from the list Bill Mounce gave me in 1997 (I need to update this with his much more up-to-date data)
abbott-smith-header: the corresponding headword from Abbott-Smith (new)
abbott-smith-entry: the corresponding full citation form from Abbott-Smith (new; probably has some extraneous info I need to manually remove)
strongs: the corresponding Strong's number
gk: the corresponding G/K number
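As a minimal sketch of how these properties might be read, the following Python assumes that lexemes.yaml parses to a mapping from lemma to a mapping of the properties listed above. The helper names (`load_lexemes`, `describe`) and the `SAMPLE` entry are hypothetical: the sample is hand-written for illustration and not copied from the file, and the exact value types in the real data are an assumption.

```python
def load_lexemes(path="lexemes.yaml"):
    """Parse the file with PyYAML (third-party: pip install pyyaml)."""
    import yaml  # imported lazily so the rest of the sketch has no dependencies

    with open(path, encoding="utf-8") as f:
        return yaml.safe_load(f)


# Hand-written illustrative entry (assumption: not copied from lexemes.yaml).
SAMPLE = {
    "λόγος": {
        "full-citation-form": "λόγος, ου, ὁ",
        "bdag-headword": "λόγος",
        "strongs": 3056,
        "gk": 3364,
    },
}


def describe(lexemes, lemma):
    """Return a small summary for one lemma, or None if the lemma is not a key."""
    entry = lexemes.get(lemma)
    if entry is None:
        return None
    return {
        "lemma": lemma,
        # Fall back to the bare lemma when no citation form is recorded.
        "citation": entry.get("full-citation-form", lemma),
        "strongs": entry.get("strongs"),
        "gk": entry.get("gk"),
    }
```

With the real file, `describe(load_lexemes(), "λόγος")` would be the equivalent call; missing properties simply come back as `None`.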
There are also some additional mapping files (all derived from
alt_mapping.yaml: a mapping from alternative spellings to the key found in
gk_mapping.yaml: a mapping from G/K number to the key found in
strongs_mapping.yaml: a mapping from Strong's number to the key found in
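Reverse mappings of this kind can be produced by inverting a lemma-keyed mapping. The sketch below is not the script used in this repo; it assumes a property may hold either a single identifier or a list of identifiers, and it keeps a list of lemmas per identifier because more than one lemma may share a number. The function names and the placeholder data are hypothetical.

```python
from collections import defaultdict


def as_list(value):
    """Normalise a property that may be missing, a scalar, or a list."""
    if value is None:
        return []
    if isinstance(value, (list, tuple)):
        return list(value)
    return [value]


def invert(lexemes, field):
    """Map each identifier found under `field` to the lemmas that carry it."""
    mapping = defaultdict(list)
    for lemma, props in lexemes.items():
        for ident in as_list(props.get(field)):
            mapping[ident].append(lemma)
    return dict(mapping)


# Placeholder data (assumption): made-up lemmas and numbers, not real entries.
EXAMPLE = {
    "lemma-a": {"strongs": 1, "gk": 10},
    "lemma-b": {"strongs": [1, 2], "gk": 11},
    "lemma-c": {"gk": 12},
}
```

Here `invert(EXAMPLE, "strongs")` groups `lemma-a` and `lemma-b` under identifier `1`, which is the many-to-one case a reverse mapping has to handle.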
There is also an initial mapping for Nestle 1904 lemmas (via Strong's numbers) at
NOTES.md contains old notes from my work on this back in 2013. I need to work through them and update them.
A newly added file, conflation_sets.txt, lists all the sets of conflated words. Not every pair of words within a set has been directly conflated; rather, the file partitions the conflated words so that any two words that have been conflated with each other always appear in the same set.
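The partitioning described here is what you get by treating each conflation as a link between two words and collecting the connected groups. The following is a sketch using a small union-find, not the code that generated conflation_sets.txt; the input format (a list of word pairs) and the function name `partition` are assumptions made for illustration.

```python
def partition(pairs):
    """Group words into sets so that every conflated pair shares a set."""
    parent = {}

    def find(word):
        # Register unseen words as their own root, then walk up to the root,
        # shortening the path as we go.
        parent.setdefault(word, word)
        while parent[word] != word:
            parent[word] = parent[parent[word]]
            word = parent[word]
        return word

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    groups = {}
    for word in parent:
        groups.setdefault(find(word), set()).add(word)
    # Sort for a stable, readable result.
    return sorted(sorted(group) for group in groups.values())
```

For example, `partition([("a", "b"), ("b", "c"), ("d", "e")])` puts `a` and `c` in the same set even though that pair was never conflated directly.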
I need to look at more recent data from Mounce, sort out some of the mismatch issues with Nestle 1904, clean up the Abbott-Smith entries more, reintroduce a lot of the validation and utility code from before, integrate other headword lists such as LSJ and possibly Brill, and eventually expand things to more GNT editions and a broader corpus of texts.
Ulrik Sandborg-Petersen was my original collaborator on projects that ultimately led to this work. Jonathan Robie is responsible for building the community which has reinvigorated my interest in continuing this and much other work. Eliran Wong encouraged me immensely with his use of earlier versions of this work and the feedback he has given.
For more of my work on linguistics and Ancient Greek, see http://jktauber.com/.