COWSL2H

The UC Davis Corpus of Written Spanish, L2 and Heritage Speakers

The UC Davis Corpus of Written Spanish, L2 and Heritage Speakers (COWSL2H) consists of short essays collected from students enrolled in university-level Spanish courses. Courses SPA 1-24 are L2 Learner courses. Courses SPA 31-33 are Heritage Learner courses.

All essays, annotations, and corrections are available both as individual text files and as comma-separated value (CSV) files.

Essays are grouped by the prompt used to collect them:

famous: Write a text in Spanish about the following subject: "a famous person"

vacation: Write a text in Spanish about the following subject: "your perfect vacation plan"

special: Write a text in Spanish about the following subject: "a special person in your life"

terrible: Write a text about the following subject: "a terrible story"

Each essay prompt is further divided by the quarter in which the data was collected.

Annotations: We have annotated a subset of essays for gender/number agreement and usage of "a personal." These annotation targets were chosen based on specific research questions. We encourage fellow researchers to add to our annotations. Please see the included annotation scheme for further information.

Corrections: We have also included corrected essays for S17_vacation, S17_famous, and F17_famous. We are in the process of correcting additional essays and will update the corpus as they become ready for release.
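Subset names such as S17_vacation appear to combine a quarter code with a prompt name. The following is a minimal sketch for splitting such a name into its parts. The pattern (letters, two digits, underscore, prompt) is inferred from the three names above and is an assumption, as is the helper name `split_subset_name`. Reading the letters as an academic term and the digits as a year is also an assumption that this README does not confirm.

```python
import re

# Prompt names as listed in this README.
PROMPTS = {"famous", "vacation", "special", "terrible"}

# Assumed pattern: term letters, two digits, underscore, prompt name.
SUBSET_RE = re.compile(r"^([A-Za-z]+)(\d{2})_([a-z]+)$")


def split_subset_name(name):
    """Split a name like 'S17_vacation' into (term_code, digits, prompt).

    Hypothetical helper: the codes are returned uninterpreted because
    their exact meaning is not documented here.
    """
    match = SUBSET_RE.match(name)
    if match is None or match.group(3) not in PROMPTS:
        raise ValueError(f"unrecognized subset name: {name!r}")
    return match.group(1), match.group(2), match.group(3)


for subset in ["S17_vacation", "S17_famous", "F17_famous"]:
    print(split_subset_name(subset))
```

Names that do not fit the assumed pattern raise a `ValueError` instead of being silently misparsed, which makes it easier to notice if the actual directory layout differs.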

Metadata: Metadata files consist of the following data items separated by "|||":

  1. Course enrolled
  2. Age
  3. Sex
  4. L1 language
  5. Other L1 language(s)
  6. Language(s) spoken at home
  7. Language(s) studied
  8. Listening comprehension *
  9. Reading comprehension *
  10. Speaking ability **
  11. Writing ability **
  12. Have you ever lived in a Spanish-speaking country?

* Comprehension is self-described on the following scale:

  • 1 (not confident at all)
  • 2 (not extremely confident, but I am sometimes able to understand)
  • 3 (somewhat confident but it depends a lot on the context and on my degree of focus on the task)
  • 4 (quite confident: I understand written messages most of the time)
  • 5 (extremely confident: I can understand any written message in Spanish)

** Speaking/writing ability is self-described on the following scale:

  • 1 (not confident at all)
  • 2 (not extremely confident)
  • 3 (somewhat confident)
  • 4 (quite confident)
  • 5 (extremely confident)
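As an illustration of the metadata format, here is a minimal sketch that splits one metadata record on the "|||" separator and labels the twelve fields in the order listed above. The key names in `FIELDS`, the function name `parse_metadata`, and the sample record are all hypothetical and chosen for this example. The sketch also assumes each record sits on a single line, which this README does not state.

```python
# Illustrative keys for the twelve metadata items, in the documented order.
# These names are assumptions, not names used by the corpus itself.
FIELDS = [
    "course", "age", "sex", "l1", "other_l1", "home_languages",
    "languages_studied", "listening", "reading", "speaking", "writing",
    "lived_in_spanish_speaking_country",
]

SEPARATOR = "|||"


def parse_metadata(line):
    """Split one metadata record into a dict keyed by FIELDS."""
    values = [value.strip() for value in line.strip().split(SEPARATOR)]
    if len(values) != len(FIELDS):
        raise ValueError(
            f"expected {len(FIELDS)} fields, found {len(values)}"
        )
    return dict(zip(FIELDS, values))


# Made-up record for demonstration only; not taken from the corpus.
sample = "SPA 3|||19|||Female|||English||||||English|||French|||3|||4|||2|||3|||No"

record = parse_metadata(sample)
print(record["course"], record["listening"], record["writing"])
```

Values are kept as strings, so the self-rated scores would still need converting with `int()` before numeric analysis, and empty items (such as "Other L1 language(s)" in the sample) come back as empty strings.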
