VXGL is Vocabulary eXpected Grade Level
VXGL is a graded word list:
a list of English words, each labeled with the grade level to which it belongs.
The mapping is to the US educational-system grade levels.
The levels go from 0 (kindergarten) to 16 (fourth year of college).
The initial release is numbered version 1.4.
It was released on GitHub in March 2024.
It has mappings for 126,423 words (including inflectional variants).
The paper describing this project, "Mapping of American English vocabulary by grade levels",
was published in 2024 in ITL - International Journal of Applied Linguistics:
https://www.jbe-platform.com/content/journals/10.1075/itl.22025.flo
A prepublication draft is available here.
First, multiple vocabulary lists were collected from schools, educational organizations, and other sources.
Grade-level designations for each word were averaged.
This provided the initial mapped collection.
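The averaging step can be illustrated with a minimal sketch. The word lists, their contents, and the function name `average_grade_levels` are hypothetical and used only for illustration; the actual sources and any weighting or cleaning applied are described in the paper, not here.

```python
from collections import defaultdict
from statistics import mean


def average_grade_levels(source_lists):
    """Average the grade-level designations each word receives across lists.

    source_lists: iterable of dicts mapping word -> grade level (0..16).
    Returns a dict mapping word -> mean grade level.
    """
    grades_by_word = defaultdict(list)
    for word_list in source_lists:
        for word, grade in word_list.items():
            # Lowercasing is an assumption made for this sketch.
            grades_by_word[word.lower()].append(grade)
    return {word: mean(grades) for word, grades in grades_by_word.items()}


# Hypothetical toy lists, not real VXGL source data.
list_a = {"cat": 1, "escapade": 8}
list_b = {"cat": 0, "escapade": 9, "ubiquitous": 11}

initial_collection = average_grade_levels([list_a, list_b])
```

A word that appears in only one list simply keeps the grade that list assigns to it.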
Then, a statistical model (GBM, i.e. gradient boosting machine, regression) was trained to predict the grade levels of the words in the initial collection.
The model was then applied to extrapolate the grade-level scores for thousands of additional words.
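To give a rough idea of what gradient boosting regression does, here is a small self-contained sketch. It is not the project's implementation: it uses one-split "stumps" instead of full regression trees, and the two features (word length and a log-frequency value), the training values, and the hyperparameters are all assumptions invented for this example. The features and settings actually used are described in the paper.

```python
def fit_stump(X, residuals):
    """Find the one-feature threshold split that minimises squared error."""
    overall = sum(residuals) / len(residuals)
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            lmean = sum(left) / len(left)
            rmean = sum(right) / len(right)
            sse = (sum((r - lmean) ** 2 for r in left)
                   + sum((r - rmean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, thr, lmean, rmean)
    if best is None:
        # No feature varies: fall back to a constant prediction.
        return (0, float("inf"), overall, overall)
    return best[1:]


def stump_predict(stump, row):
    j, thr, left_value, right_value = stump
    return left_value if row[j] <= thr else right_value


def fit_gbm(X, y, n_rounds=50, learning_rate=0.1):
    """Gradient boosting for squared error: each stump fits the residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [t - p for t, p in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, row)
                 for p, row in zip(preds, X)]
    return base, stumps, learning_rate


def predict(model, row):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(stump_predict(s, row) for s in stumps)


# Hypothetical features: [word length, log frequency]; targets: mean grade.
train_X = [[3, 5.0], [5, 4.0], [8, 2.0], [10, 1.0]]
train_y = [0.5, 3.0, 8.5, 11.0]

model = fit_gbm(train_X, train_y)
# Extrapolate a grade level for a word outside the initial collection.
new_word_score = predict(model, [9, 1.5])
```

In practice one would normally rely on an established gradient boosting library rather than hand-written stumps; the sketch only shows the idea of fitting successive small models to the remaining error.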
The designations in the VXGL list represent the grade levels at which the words might typically be expected to be known by students.
For example, if a word such as escapade is marked with grade level 8.6,
it implies that 'educators' (i.e. teachers, publishers, test developers, etc.)
may expect this word to be known in the 8th grade (after the mid-year? 😄).
That means that at least the most familiar sense of a word would be expected to be known at the listed level.
This is not the same as describing which words are actually known (on average) at a given grade level. See the research paper for an explanation of the differences.
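As a small illustration of reading a score, the sketch below splits a fractional grade level into its whole grade and the remaining fraction, using the 0 (kindergarten) to 16 (fourth year of college) scale. The function name `describe_level` is hypothetical, labeling levels 13 to 16 as college years 1 to 4 is inferred from the scale's endpoints, and treating the fraction as a position within the school year is only the informal reading suggested above, not a documented definition.

```python
import math


def describe_level(score):
    """Split a fractional grade-level score into (grade, fraction, label)."""
    if not 0 <= score <= 16:
        raise ValueError("grade level must be between 0 and 16")
    grade = math.floor(score)
    fraction = round(score - grade, 2)
    if grade == 0:
        label = "kindergarten"
    elif grade <= 12:
        label = f"grade {grade}"
    else:
        # Levels 13..16 correspond to college years 1..4.
        label = f"college year {grade - 12}"
    return grade, fraction, label


escapade_level = describe_level(8.6)
```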
A tradition of publishing grade-level vocabulary lists existed in the USA for many years (see references in the paper). The current list might be considered an upgraded list in this tradition, providing a mapping for about 126K English words.
Word complexity is usually considered to be a property of the word itself (its structure, meaning, etc.). Difficulty involves the interaction between the word, the student (e.g. a reader), the context, and the task. Overall, though, words from higher grade levels are expected to be more complex or more difficult for most students.