A tweak on the <emph> to <said> disambiguation #20
Comments
Great idea! I think this'll be an improvement.
The DTD validation is complaining that
Ah. I never checked the Travis CI. Good catch. I didn’t realize handed him <lb n="161336"/>his silk hat when it was knocked off and he said <said who="Parnell" direct="false" rend="italics">Thank you</said>, excited as he <lb n="161337"/>undoubtedly was I added an Would that mean we revert <lb n="070295"/><said who="Ned Lambert">―<said who="Dan Dawson">Or again if we but climb the serried mountain peaks.</said></said> Do we need an
Good questions. I think we can keep the nested As for italics, you're right--it might be a good idea to add
OK. Sorry, I should have raised this as an issue instead of marching ahead and making a load of changes! So our convention is, ultimately, to disambiguate inherited
(1) If a character quotes direct speech within her speech, we’re encoding it like this:
<said who="Stephen Dedalus">―You said,</said> Stephen answered, <said who="Stephen Dedalus"><said who="Buck Mulligan" rend="italics">O, it's only Dedalus whose mother is beastly dead</said>.</said>
(2) If direct speech is recalled in interior monologue or (occasionally) represented in the third-person narrative using italics, we’re encoding it like this:
she was one of those good souls who had always to be told twice <said who="Father Conmee" direct="false" rend="italics">bless you, my child,</said> that they have been absolved, <said who="Father Conmee" direct="false" rend="italics">pray for me</said>.
Does that sound right?
Sounds great! I'll go ahead and add this to our conventions document.
Great! I'm going to quickly go through all the
With the addition to our conventions document, I feel like this issue is now closed. (Always happy for it to be reopened if needs be.) R
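For anyone wanting to spot-check the corpus against the two cases agreed above, here is a minimal sketch rather than a definitive tool. It assumes the markup parses without a TEI namespace, and it takes "nested inside another <said>" to mean case (1) and `direct="false"` outside any <said> to mean case (2). The function names, the `kind` labels and the wrapping <p> are hypothetical, not part of the project.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the two examples from this thread, wrapped in a <p>
# so they parse as one document. No TEI namespace is assumed.
SAMPLE = """<p>
<said who="Stephen Dedalus">―You said,</said> Stephen answered, <said who="Stephen Dedalus"><said who="Buck Mulligan" rend="italics">O, it's only Dedalus whose mother is beastly dead</said>.</said>
she was one of those good souls who had always to be told twice <said who="Father Conmee" direct="false" rend="italics">bless you, my child,</said> that they have been absolved, <said who="Father Conmee" direct="false" rend="italics">pray for me</said>.
</p>"""


def has_said_ancestor(el, parents):
    """Return True if any ancestor of el is a <said> element."""
    node = parents.get(el)
    while node is not None:
        if node.tag == "said":
            return True
        node = parents.get(node)
    return False


def classify_said(root):
    """Return (who, kind) for every <said rend="italics"> under root."""
    # ElementTree keeps no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    results = []
    for el in root.iter("said"):
        if el.get("rend") != "italics":
            continue  # ordinary dialogue, outside this disambiguation
        nested = has_said_ancestor(el, parents)
        indirect = el.get("direct") == "false"
        if nested and not indirect:
            kind = "quoted-in-dialogue"  # case (1)
        elif indirect and not nested:
            kind = "recalled"  # case (2)
        else:
            kind = "check-by-hand"  # matches neither case cleanly
        results.append((el.get("who"), kind))
    return results


if __name__ == "__main__":
    for who, kind in classify_said(ET.fromstring(SAMPLE)):
        print(who, kind)
```

Anything reported as `check-by-hand` is only a prompt to look at the passage, since an italic <said> that fits neither pattern may still be encoded correctly.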
We have a lot of quoted direct speech within character dialogue in our corpus. An early instance:
Initially this was all tagged as <emph> on account of the italics. We had been tackling the <emph> to <said> disambiguation by just tagging the direct quoted speech the same way that we treat dialogue: <said who=""> etc. We were trusting to the nesting to indicate when direct speech was being quoted within character dialogue without any additional markup.
But it turns out that there are plenty of exceptions to this loose rule. So I went through the corpus and added a @type="reported" on all instances of <emph> that we had retagged as <said>. It took a while, but I think we’ve teased out a potential ambiguity in the process. Some examples: