Getting some data

.. toctree::
   :maxdepth: 4

You may have data of your own that you would like to use the package on, or you may know of existing datasets already in this format that you would like to reuse.

It may be easier to start with an existing dataset. The datasets we know of are listed below. Please note that the large majority of them are NOT public; if you cannot retrieve one, you will need to get in touch with its data managers.

Public data sets

We have prepared a public dataset for testing purposes, based on the VanDam Public Daylong HomeBank Corpus: VanDam, Mark (2018). VanDam Public Daylong HomeBank Corpus. doi:10.21415/T5388S.

From the LAAC team

EL1000

The EL1000 dataset contains several corpora accessible upon request.

Other private datasets

We know of no other private datasets at present, but we hope one day to be able to use datalad’s search feature to discover them.