This code supports analysis of the data from the vocabulary crowdsourcing project. It allows you to easily subset and aggregate the data.
Compatible with the dataset from the English Crowdsourcing Project (other datasets in preparation).
git clone git@github.com:pmandera/vocab-crowd.git
cd vocab-crowd
python3 -m pip install -r requirements.txt --user
python3 setup.py install
To load the dataset:
from vocabtest.vocabtest import VocabTest
vt = VocabTest.from_dir('./english-vocabtest-20180919-native.lang.en/')
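As a quick check that the data loaded correctly, you can inspect the participant profiles table. This is a minimal sketch assuming vt.profiles behaves like a pandas DataFrame (as the .head() calls below suggest):
# number of participants and the available profile columns
print(len(vt.profiles))
print(list(vt.profiles.columns))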
To compute lexical statistics based on a subset of data:
print(vt.profiles.head())
# use only data from female participants younger than 25
vt_female = vt.query_by_profile('gender == "Female" and age < 25')
print(vt_female.profiles.head())
# calculate average statistics for all words but skip trials 0-9
# and use only those not filtered out by the
# adjusted boxplot method (Hubert & Vandervieren, 2008)
w_stats = vt_female.spelling_stats(
    query='lexicality == "W" and trial_order > 9 and rt_adjbox == True')
print(w_stats.head())
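To keep the computed statistics for later use, the resulting table can be written to disk. This is a minimal sketch assuming w_stats is a pandas DataFrame (as its .head() method above suggests); the output filename is only an example:
# write the per-word statistics to a CSV file (example filename)
w_stats.to_csv('word-stats.csv')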
The tool was developed by Paweł Mandera.
If you are using this code for scientific purposes, please cite:
Mandera, P., Keuleers, E., & Brysbaert, M. (submitted). Recognition times for 62 thousand English words: Data from the English Crowdsourcing Project.
The project is licensed under the Apache License 2.0.