maafiah/VXGL

VXGL project

VXGL stands for Vocabulary eXpected Grade Level.

VXGL is a graded word list:
a list of English words, each marked with the grade level to which it belongs.
The mapping is to the US educational-system grade levels.
The levels go from 0 (kindergarten) to 16 (fourth year of college).
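
As an illustration of the scale, here is a minimal sketch of a helper that turns an integer level into a readable label. The function name `level_label` is hypothetical and not part of VXGL. The labels for the intermediate levels assume the usual US sequence (grades 1 to 12, then four college years); the text above states only the two endpoints.

```python
def level_label(level: int) -> str:
    """Return a readable label for an integer VXGL level (0 to 16)."""
    if not 0 <= level <= 16:
        raise ValueError(f"level must be between 0 and 16, got {level}")
    if level == 0:
        return "kindergarten"
    if level <= 12:
        return f"grade {level}"
    # Assumption: levels 13 to 16 correspond to college years 1 to 4.
    return f"college year {level - 12}"
```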

Versions

The initial release is version 1.4.
It was released on GitHub in March 2024.
It has mappings for 126,423 words (including inflectional variants).

Publication

The paper describing this project,
"Mapping of American English vocabulary by grade levels",
was published in 2024 in
ITL - International Journal of Applied Linguistics:
https://www.jbe-platform.com/content/journals/10.1075/itl.22025.flo
A prepublication draft is available here.

How it was developed

First, multiple vocabulary lists were collected from schools, educational organizations, and other sources.
Grade-level designations for each word were averaged.
This provided the initial mapped collection.
Then, a statistical model (GBM, i.e. gradient boosting machine, regression) was trained to predict the grade levels of words in the initial collection.
The model was then applied to extrapolate the grade-level scores for thousands of additional words.
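
The averaging step described above might look roughly like the sketch below. This is not the project's actual code: the function name `average_grade_levels`, the dictionary layout, and the toy word lists are all assumptions made for illustration. The regression step is left out, since this README does not describe the model's features or settings.

```python
from collections import defaultdict
from statistics import mean


def average_grade_levels(source_lists):
    """Average per-word grade designations across several source lists.

    source_lists: iterable of dicts mapping a word to its grade level
    in one source list (hypothetical layout).
    """
    collected = defaultdict(list)
    for word_list in source_lists:
        for word, grade in word_list.items():
            collected[word].append(grade)
    # A word that appears in only one list keeps that list's grade.
    return {word: mean(grades) for word, grades in collected.items()}


# Toy data, invented for this example only.
list_a = {"cat": 0, "escapade": 8}
list_b = {"cat": 1, "escapade": 9, "ubiquitous": 11}
initial = average_grade_levels([list_a, list_b])
```

Here `initial` maps each word to the mean of the grades it received, which is how fractional grade levels can arise from integer designations.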

What does the VXGL score mean?

The designations in the VXGL list represent the grade levels at which the words might typically be expected to be known by students.
For example, if a word such as escapade is marked with grade level 8.6, it implies that 'educators' (i.e. teachers, publishers, test developers, etc.) may expect this word to be known in the 8th grade (after mid-year? 😄).
That means that at least the most familiar sense of a word would be expected to be known at the listed level.
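
One possible way to use the scores, sketched under stated assumptions: select the words expected to be known by the end of a given grade. The function name `expected_by_end_of_grade` is hypothetical, and reading a score such as 8.6 as "during grade 8" (that is, below 9) is an interpretation of the example above, not a rule stated by the project. Only the escapade score comes from this README; the other entries are invented.

```python
def expected_by_end_of_grade(scores, grade):
    """Return words whose score is below grade + 1, ordered by score."""
    selected = [word for word, score in scores.items() if score < grade + 1]
    return sorted(selected, key=scores.get)


# escapade: 8.6 is the README's example; the other scores are made up.
scores = {"escapade": 8.6, "cat": 0.3, "ubiquitous": 11.2}

by_grade_7 = expected_by_end_of_grade(scores, 7)
by_grade_8 = expected_by_end_of_grade(scores, 8)
```

With this toy data, escapade is included for grade 8 but not for grade 7.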

This is not the same as describing which words are actually known (on average) at a given grade level. See the research paper for an explanation of the differences.

History

A tradition of publishing grade-level vocabulary lists existed in the USA for many years (see references in the paper). The current list might be considered an upgraded list in this tradition, providing a mapping for about 126K English words.

How is that related to word difficulty or complexity?

Word complexity is usually considered a property of the word itself (its structure, meaning, etc.). Difficulty involves the interaction between the word, the student (e.g. a reader), the context, and the task. Overall, though, words from higher grade levels are expected to be more complex or more difficult for most students.
