A tweak on the <emph> to <said> disambiguation #20

Closed
yellwork opened this issue Feb 13, 2017 · 8 comments

Comments

@yellwork
Collaborator

We have a *lot* of quoted direct speech within character dialogue in our corpus. An early instance:

—You said, Stephen answered, O, it’s only Dedalus whose mother is beastly dead. (U 1.198–99)

Initially this was all tagged as <emph> on account of the italics. We had been tackling the <emph> to <said> disambiguation by just tagging the direct quoted speech the same way that we treat dialogue: <said who=""> etc. We were trusting to the nesting to indicate when direct speech was being quoted within character dialogue without any additional markup.

But it turns out that there are plenty of exceptions to this loose rule. So I went through the corpus and added @type="reported" to every instance of <emph> that we had retagged as <said>. It took a while, but I think we’ve teased out a potential ambiguity in the process. Some examples:

she was one of those good souls who had always <lb n="100139"/>to be told twice <said who="Father Conmee" type="reported">bless you, my child,</said> that they have been absolved, <said who="Father Conmee" type="reported">pray for <lb n="100140"/>me</said>.
a shrill <lb n="131174"/>voice went crying, wailing: <said who="shrill voice" type="reported"><title type="newspaper">Evening Telegraph</title>, stop press edition! Result of <lb n="131175"/>the Gold Cup races!</said>
handed him <lb n="161336"/>his silk hat when it was knocked off and he said <said who="Parnell" type="reported">Thank you</said>, excited as he <lb n="161337"/>undoubtedly was
yellwork added a commit that referenced this issue Feb 13, 2017
@JonathanReeve
Member

Great idea! I think this'll be an improvement.

@JonathanReeve
Member

The DTD validation is complaining that @type isn't valid for <said>. But it seems like there's an attribute for this: @direct. The TEI docs have it that indirect speech would be <said direct="false">. We can assume that otherwise it's direct speech (or thought), and so we don't need direct="true". I'll go ahead and make this global change, if that's OK, just to get the validation working.
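
A minimal sketch of what such a global change could look like, assuming the chapter files parse as well-formed XML and using Python's standard `xml.etree.ElementTree`. The helper names `local_name` and `reported_to_indirect` are hypothetical, and the sample is one of the snippets quoted above, not the project's actual script:

```python
import xml.etree.ElementTree as ET

def local_name(tag):
    """Strip a '{namespace}' prefix, if any, from an element tag."""
    return tag.rsplit("}", 1)[-1]

def reported_to_indirect(root):
    """Swap type="reported" for direct="false" on every <said>; return the count."""
    changed = 0
    for el in root.iter():
        # Skip comments and processing instructions, whose tag is not a string.
        if not isinstance(el.tag, str):
            continue
        if local_name(el.tag) == "said" and el.get("type") == "reported":
            del el.attrib["type"]
            el.set("direct", "false")
            changed += 1
    return changed

# Sample taken from the examples earlier in this thread (no TEI namespace here).
sample = (
    '<p>he said <said who="Parnell" type="reported">Thank you</said>, '
    'excited as he <lb n="161337"/>undoubtedly was</p>'
)
root = ET.fromstring(sample)
count = reported_to_indirect(root)
print(count, ET.tostring(root, encoding="unicode"))
```

Matching on the local name means the same function should work whether or not the files declare the TEI namespace. Note that re-serialising with ElementTree may alter whitespace and namespace prefixes, so a diff review before committing would be sensible.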

@yellwork
Collaborator Author

yellwork commented Feb 15, 2017

Ah. I never checked the Travis CI. Good catch.

I didn’t realize @type wasn’t valid, but I had read through the <said> description and looked longingly at @direct. Would it be a slight tag abuse to use @direct for direct speech that is quoted within direct speech? Or should we limit it to the few cases of recalled (italicised) direct speech that I highlighted above?

handed him <lb n="161336"/>his silk hat when it was knocked off and he said <said who="Parnell" direct="false" rend="italics">Thank you</said>, excited as he <lb n="161337"/>undoubtedly was

I added @rend="italics" to preserve the rendering. Or is <said direct="false"> enough to indicate it?

Would that mean we revert <said> within <said> to the way it was before I added the @type? e.g. Ned Lambert reading Dawson’s inflated prose from the newspaper:

<lb n="070295"/><said who="Ned Lambert">―<said who="Dan Dawson">Or again if we but climb the serried mountain peaks.</said></said>

Do we need an @direct and/or an @rend here?

@JonathanReeve
Member

Good questions. I think we can keep the nested <said> structure as-is, but adding direct="false" where appropriate would be a good idea. That way we can distinguish between a character's actual speech (as reported by Joyce, at least) and his speech as reported by some other, potentially less reliable, character.

As for italics, you're right--it might be a good idea to add rend="italics" here, since I think the standard TEI renderers don't automatically render <said direct="false"> as italicized.
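
To illustrate why the explicit @rend helps, here is a rough sketch of a renderer that italicises only what is marked rend="italics". It is a hypothetical toy (the function name `render` and the HTML output are assumptions, not any standard TEI stylesheet), and it simply drops `<lb/>` milestones:

```python
import xml.etree.ElementTree as ET
from html import escape

def render(el):
    """Render an element's content as HTML, italicising <said rend="italics">."""
    parts = [escape(el.text or "")]
    for child in el:
        inner = render(child)
        # Only the explicit rend attribute triggers italics; direct="false" alone does not.
        if child.tag == "said" and child.get("rend") == "italics":
            inner = "<i>" + inner + "</i>"
        parts.append(inner)
        parts.append(escape(child.tail or ""))
    return "".join(parts)

sample = (
    '<p>he said <said who="Parnell" direct="false" rend="italics">Thank you</said>, '
    'excited as he <lb n="161337"/>undoubtedly was</p>'
)
rendered = render(ET.fromstring(sample))
print(rendered)
```

A renderer written this way never has to guess from @direct whether the source text was italicised, which is the point of recording the rendering separately.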

@yellwork
Collaborator Author

OK. Sorry, I should have raised this as an issue instead of marching ahead and making a load of changes!

So our convention is, ultimately, to disambiguate inherited <emph> into <said> in two different ways, right?

(1) If a character quotes direct speech within her speech, we’re encoding it like this:

<said who="Stephen Dedalus">―You said,</said> Stephen answered, <said who="Stephen Dedalus"><said who="Buck Mulligan" rend="italics">O, it's only Dedalus whose mother is beastly dead</said>.</said>

(2) If direct speech is recalled in interior monologue or (occasionally) represented in the third-person narrative using italics, we’re encoding it like this:

she was one of those good souls who had always to be told twice <said who="Father Conmee" direct="false" rend="italics">bless you, my child,</said> that they have been absolved, <said who="Father Conmee" direct="false" rend="italics">pray for me</said>.

Does that sound right?
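
The two cases above could also be checked mechanically. The following is a sketch only, assuming namespace-free fragments like the ones quoted in this thread; the function name `classify_said` and the labels "quoted", "recalled", and "unclassified" are hypothetical:

```python
import xml.etree.ElementTree as ET

def classify_said(root):
    """Return (who, label) pairs for each italic <said>, in document order."""
    results = []

    def walk(el, inside_said):
        is_said = el.tag == "said"
        if is_said and el.get("rend") == "italics":
            if el.get("direct") == "false":
                label = "recalled"      # case (2): recalled or narrated speech
            elif inside_said:
                label = "quoted"        # case (1): speech quoted within speech
            else:
                label = "unclassified"  # italic <said> matching neither case
            results.append((el.get("who"), label))
        for child in el:
            walk(child, inside_said or is_said)

    walk(root, False)
    return results

sample = (
    '<p><said who="Stephen Dedalus"><said who="Buck Mulligan" rend="italics">'
    "O, it's only Dedalus whose mother is beastly dead</said>.</said> "
    'to be told twice <said who="Father Conmee" direct="false" rend="italics">'
    "bless you, my child,</said> that they have been absolved</p>"
)
labels = classify_said(ET.fromstring(sample))
print(labels)
```

Anything that comes back as "unclassified" would be an italic <said> that fits neither (1) nor (2) and probably deserves a second look. Checking direct="false" before nesting is a choice made for this sketch; the thread does not settle which should win when both apply.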

@JonathanReeve
Member

Sounds great! I'll go ahead and add this to our conventions document.

@yellwork
Collaborator Author

Great! I'm going to quickly go through all the @direct and turn them into types (1) or (2) above. Shouldn't take long. (Unless you're already working on it?!)

yellwork added a commit that referenced this issue Feb 16, 2017
@yellwork
Collaborator Author

yellwork commented Feb 23, 2017

With the addition to our conventions document, I feel like this issue is now closed. (Always happy for it to be reopened if needs be.) R


2 participants