CM_supplementary

Rob Truswell, University of Edinburgh

This repository contains corpus queries for investigating word order in Old and Middle English texts, and R scripts for producing figures based on the outputs of those corpus queries. The queries are designed to be used in conjunction with the York-Toronto-Helsinki Parsed Corpus of Old English Prose (YCOE), the Penn-Helsinki Parsed Corpus of Middle English, 2nd edition (PPCME2), and the Parsed Linguistic Atlas of Early Middle English (PLAEME). I distribute the query files, rather than the results of those queries, because not all the corpora have licenses which would permit distribution of the results.

Included in this repository are:

Competition_mat_OE_full.c, Competition_sub_OE_full.c : coding query files for use with YCOE
Competition_mat_PPCME_full.c, Competition_sub_PPCME_full.c : coding query files for use with PPCME2
Competition_mat_PLAEME_full.c, Competition_sub_PLAEME_full.c, Competition_mat_PLAEME_full_v3.c, Competition_sub_PLAEME_full_v3.c, V2.c : coding query files for use with PLAEME
WhRel.def : Definitions file referred to by the coding queries
OoosIds.q : generic query file for extracting codes and IDs from the output of coding queries
CM_maps_final.R, Competition_maps_final.R, Competition_plots_final.R : R scripts for generating maps and figures.
PLAEME_more_info.csv : metadata for PLAEME texts.
CM_grammar_comparison.csv : CSV file created by manual triage and summarization of the outputs of the first six coding queries. It would be desirable, and possible in principle, to automate the manual triage, but this research was performed in lockdown over a flaky SSH connection, and automation wasn't practical under those circumstances.

.c and .q files should be run using CorpusSearch.

The workflow is as follows (assuming all files are in the same directory):

Run coding queries on relevant corpora.
Run OoosIds.q on .cod files output by coding queries.
Perform minor edits on .cod.ooo files output by OoosIds.q (globally replace @ symbol with :; globally delete token IDs while retaining text IDs — for YCOE and PPCME2 queries this involves globally deleting the regex string ,.*$; for PLAEME queries, delete \..*$).
Run R scripts (scripts assume that .cod.ooo files are accessible in the working directory).
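
The minor edits in the third step can be done by hand in any editor with regex search and replace. For anyone who prefers to script them, the following is a minimal Python sketch, not part of this repository. The function names, the corpus labels used as dictionary keys, the UTF-8 encoding, and the sample lines are all assumptions made for illustration; only the @ replacement and the two regex strings come from the step above.

```python
import re
from pathlib import Path

# Regexes for deleting token IDs while retaining text IDs, as described in
# the workflow. The dictionary keys are labels chosen for this sketch.
TOKEN_ID_PATTERNS = {
    "YCOE": re.compile(r",.*$", re.MULTILINE),
    "PPCME2": re.compile(r",.*$", re.MULTILINE),
    "PLAEME": re.compile(r"\..*$", re.MULTILINE),
}


def clean_ooo_text(text, corpus):
    """Replace every @ with : and strip the token ID from each line."""
    text = text.replace("@", ":")
    return TOKEN_ID_PATTERNS[corpus].sub("", text)


def clean_ooo_file(path, corpus, encoding="utf-8"):
    """Apply clean_ooo_text to one .cod.ooo file, overwriting it in place.

    The encoding is an assumption; adjust it if your files differ.
    """
    p = Path(path)
    p.write_text(clean_ooo_text(p.read_text(encoding=encoding), corpus),
                 encoding=encoding)


if __name__ == "__main__":
    # Made-up lines, only to show the effect of the two substitutions.
    print(clean_ooo_text("1:2@sometext,12.3", "YCOE"))   # 1:2:sometext
    print(clean_ooo_text("1:2@sometext.45", "PLAEME"))   # 1:2:sometext
```

Because clean_ooo_file overwrites its input, it is worth keeping a copy of the original .cod.ooo files until the R scripts have run successfully.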

NB the outputs of the first six coding queries are not called by any R script. I have included these queries because they are the basis for the summary counts in CM_grammar_comparison.csv.
