Contains survey instruments, data collected and related documents from a survey issued in November 2012 by Michelle Dalmau and Kevin S. Hawkins to determine encoding practices in libraries.

Historically, libraries, especially academic libraries, have contributed to the development of the TEI Guidelines, largely in response to mandates to provide access to and preserve electronic texts. These institutions leveraged standards such as the TEI Guidelines and traditional library expertise (authority control, subject analysis, and bibliographic description) to positively impact publishing and academic research. But the advent of mass digitization efforts involving scanning of pages called into question such a role for libraries in text encoding. Still, with the rise of library involvement in digital humanities initiatives and renewed interest in supporting text analysis, it is unclear how these events relate to the evolution of text encoding projects in libraries.

In an attempt to better understand text encoding practices in libraries, we surveyed employees of libraries around the world between November 2012 and January 2013. This repository contains the survey questions and the data collected, both raw and coded.

Survey Authors

Michelle Dalmau and Kevin S. Hawkins

Survey Questions and Data

The survey questions were exported from SurveyMonkey and are available in a PDF.

The data is in an Excel spreadsheet with multiple tabs:

  • Q_KEY contains a mapping of the prose questions to an identifier scheme (Q1, Q2, etc.) for statistical processing. The key also lists questions that were excluded from the analysis (see the "Notes" column for the rationale) and maps the open-ended questions that were coded. Some of these coded questions were excluded once coding was complete (again, see the "Notes" column). The qualitative questions included in our analysis are available in the tabs labeled Q4, Q9, Q16, Q25, Q118, and Q119.

  • Data contains the 112 valid responses from our data set; 26 responses were disqualified because they did not meet the sole survey requirement of working in libraries. A subset of questions marked as “invalid” was disqualified based on errors uncovered in SurveyMonkey’s skip logic. We retained the valid responses and included them in our analysis.

  • Likert_Key reflects the values assigned to Likert questions. All questions were normalized to this scale.

  • Q4, Q9, Q16, Q25, Q118, and Q119 contain the qualitative questions we used in our analysis. Each tab reflects the original responses and our coding of those responses.
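
For readers who want to work with the spreadsheet programmatically, the tab layout described above could be loaded roughly as follows. This is only a sketch, not part of the original analysis: it assumes pandas and openpyxl are installed, that the tab names in the workbook are spelled exactly as in this README, and that the workbook is saved under the hypothetical name survey_data.xlsx. The helper names are illustrative.

```python
from pathlib import Path

# Tab names as listed in this README; their exact spelling in the
# workbook is an assumption.
EXPECTED_TABS = [
    "Q_KEY", "Data", "Likert_Key",
    "Q4", "Q9", "Q16", "Q25", "Q118", "Q119",
]


def missing_tabs(sheet_names):
    """Return the expected tabs absent from sheet_names, in README order."""
    present = set(sheet_names)
    return [name for name in EXPECTED_TABS if name not in present]


def load_survey_tabs(path):
    """Load every tab into a dict of DataFrames keyed by tab name.

    Requires pandas and openpyxl; raises ValueError if a tab named in
    the README is not found in the workbook.
    """
    import pandas as pd  # imported here so missing_tabs works without pandas

    # sheet_name=None asks pandas to read all sheets at once.
    tabs = pd.read_excel(Path(path), sheet_name=None)
    absent = missing_tabs(tabs.keys())
    if absent:
        raise ValueError(f"Workbook is missing expected tabs: {absent}")
    return tabs
```

With the workbook in place, `tabs = load_survey_tabs("survey_data.xlsx")` would give access to the responses as `tabs["Data"]` and to the question mapping as `tabs["Q_KEY"]`. Checking the tab names up front makes a renamed or missing tab fail loudly rather than surface later as a KeyError.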

Terms of Use

This work is licensed under a Creative Commons Attribution 4.0 International License.