Citation should treat authors with the same given name and initials the same (even if one has given their name spelled out in the bib entry) #1274

Closed
mapfiable opened this issue Mar 17, 2023 · 10 comments

Comments

@mapfiable

mapfiable commented Mar 17, 2023

(Issue copied from here; it was recommended to me to make it an official biblatex issue.)

Similar to this post, I would like to limit the author list in a citation to one + et al. (even if the author teams may be different). The proposed solution (uniquelist=false) essentially also works, but there is a problem when using it in combination with citestyle=authoryear-comp. If in the bibliography, the same (first) author once appears with their given name written out, and once only with their initials, they are not treated as the same author. So in a list of citations, even though citestyle=authoryear-comp, the citations will be separated. Here's an example:

\documentclass{article}

\usepackage[
  citestyle=authoryear-comp,
  maxcitenames=1,
  giveninits=true,
  uniquename=init,
  uniquelist=false,
  ibidtracker=context,
  %
  bibstyle=authoryear,
  dashed=false, % dashed: substitute rep. author with ---
  date=year,
  sorting=nyt,
  minbibnames=3,
  hyperref=true,
  backref=true,
  citecounter=true,
  citetracker=true,
  natbib=true, % natbib compatibility mode (\citep and \citet still work)
  backend=biber, % Compile the bibliography with biber
]{biblatex}

\usepackage{filecontents}
\usepackage{hyperref}

\begin{filecontents}{\jobname.bib}
@misc{ABC01,
  author = {Author, A. and Buthor, B. and C},
  year = {2001},
  title = {Alpha},
}
@misc{ABC02,
  author = {Author, A. and Buthor, B. and Cuthor, C. and Duthor, D.},
  year = {2001},
  title = {Beta},
}
@misc{ADE01,
  author = {Author, Andrew and Duthor, D. and E},
  year = {2001},
  title = {And now for something completely different},
}
\end{filecontents}

\addbibresource{\jobname.bib}

\begin{document}

Some text \citep{ABC01, ABC02, ADE01}.

Some text \autocite{ADE01}.

\printbibliography

\end{document}

I tried to fix this behavior by following this post and setting giveninits=true and uniquename=init, but that did not solve the issue. How can I get biblatex to treat Author, A. and Author, Andrew as the same author?

With the settings that I chose, I honestly think it should work the way I expect it. It doesn't really make sense to me that in the citation, Author et al. 2001c should be listed separately. I think if there are two works by a (first) author with the same family name and initials, it's usually the same person. Two authors sharing the same family name and initials (or even their given name) is an absolute exception. So when the even rarer case happens where you reference them in the same citation you can just address that in the text.

In the answer to the original post it was suggested that I manually edit the bib entries, but that is not feasible because I have a large bibliography that is also constantly changing (mainly importing from zotero).

@plk
Owner

plk commented Mar 17, 2023

The options to do with initials control the display of names, not how they are internally represented. There is no way for biber to know that two forms of the same name are the same person (Author, A and Author, Andrew) and it will generate different hashes for the names. The way to do this is with a sourcemap to dynamically change the data in the .bib before biber processes it. You can see examples of this in the biblatex PDF manual and here:

https://tex.stackexchange.com/questions/675744/normalize-given-names-from-various-sources/

You could do this for specific names or generally change all first names to initials only.
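
For the specific-name case, a minimal sketch of such a sourcemap might look like the following. It assumes the spelled-out form in the .bib is exactly "Author, Andrew"; the regular expression and the replacement string are illustrative only and would need adapting to the real data.

```latex
% Preamble sketch: normalise one specific name as biber reads the .bib data.
% The pattern and replacement below are placeholders for the example name.
\DeclareSourcemap{
  \maps[datatype=bibtex]{
    \map{
      % Rewrite "Author, Andrew" to "Author, A." in the author field
      \step[fieldsource=author,
            match=\regexp{Author,\s+Andrew},
            replace={Author, A.}]
    }
  }
}
```

The map only changes the data as biber processes it; the .bib file on disk stays as it is, so the file can keep being regenerated from a reference manager.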

@mapfiable
Author

Hey, thanks a lot for the quick reply! I thought that this might be the case. I will give it a try.

@moewew
Collaborator

moewew commented Mar 17, 2023

I think an option to have the name hashing take into account only the given name initial instead of the full given name might be useful here. We already have configuration options for most of uniquename, to the degree that we can control what is used to make a name unique, but at the moment we can't fully control what is used to compare names.

@plk
Owner

plk commented Mar 17, 2023

I was thinking too about this. One option would be to allow users to pass an id for a name in the extended name format which would solve a whole class of issues with people whose name changes etc.

We can also add an option specifically for this case which may make sense as it's come up a few times. Shouldn't be too difficult to do.

@moewew
Collaborator

moewew commented Mar 17, 2023

Yeah, the ID sounds cool and would also be useful for #1094.

Independent of that I think making the hash configurable a la \DeclareUniquenameTemplate/\DeclareSortingNamekeyTemplate could also be useful for people who want to use the extended name format more extensively. Because they can then control which bit goes into the hash and which doesn't.

@plk
Owner

plk commented Mar 17, 2023

Ah yes, that was the issue I was thinking about re IDs. I'll get something in DEV soon.

@mapfiable
Author

Wow, thanks a lot guys for being so active! I guess for now I will just try to frankenstein a solution.

plk added a commit to plk/biber that referenced this issue Mar 19, 2023
plk added a commit that referenced this issue Mar 19, 2023
@plk
Owner

plk commented Mar 19, 2023

Two approaches are now in biblatex 3.20 DEV (requiring biber 2.20 DEV, both on Sourceforge).

This was a fairly major new feature as it meant extending the reference contexts but was required to generalise hash customisation.

Option 1 (syntactic)

The hash generation algorithm is now customisable with the new command \DeclareNamehashTemplate. The (backwards compatible) default is:

\DeclareNamehashTemplate{
  \namepart[hashscope=full]{family}
  \namepart[hashscope=full]{given}
  \namepart[hashscope=full]{prefix}
  \namepart[hashscope=full]{suffix}
}

As with other templates like this, you can define as many as you want and use them at different scopes, all the way from refcontexts to individual names. So, to do this globally for your example, you would put this in your preamble:

\DeclareNamehashTemplate{
  \namepart[hashscope=full]{family}
  \namepart[hashscope=init]{given}
  \namepart[hashscope=full]{prefix}
  \namepart[hashscope=full]{suffix}
}

This makes the hashing algorithm only use the given name initials when creating the hash. So, all of the "Author" names would have the same hash.

Option 2 (semantic)

A more general feature is also offered but this is for more "semantic" differences. The extended name format now allows a per-name "id" which will be used to construct the hash for the name, overriding and ignoring any name hash template. So, to get the same results in your case, you could do:

@misc{ABC01,
  author = {id=A1, family=Author, given=A, given-i=A and Buthor, B. and C},
  year = {2001},
  title = {Alpha},
}
@misc{ABC02,
  author = {id=A1, family=Author, given=A, given-i=A and Buthor, B. and Cuthor, C. and Duthor, D.},
  year = {2001},
  title = {Beta},
}
@misc{ADE01,
  author = {id=A1, family=Author, given=Andrew, given-i=A and Duthor, D. and E},
  year = {2001},
  title = {And now for something completely different},
}

and since all the "Author" names have the same id, the hash will be the same. Same result as before. This option is really intended for more extreme cases where people change names but you want biblatex to consider the names as the same person. It's a bit contrived to use option 2 in your example as if you were using the extended name format, you'd likely just spell out the given names and initials fully anyway and that would avoid the problem in the first place.
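
To illustrate the name-change case that option 2 is aimed at, here is a small sketch with made-up data. The entry keys, names, titles and the id value P1 are all hypothetical; only the id=, family= and given= keys of the extended name format are taken from the example above.

```latex
% Hypothetical data: one person publishing under two family names.
% Both names carry id=P1, so both get the same name hash.
\begin{filecontents}{\jobname.bib}
@misc{smith2001,
  author = {id=P1, family=Smith, given=Jane},
  year   = {2001},
  title  = {Earlier work},
}
@misc{jones2005,
  author = {id=P1, family=Jones, given=Jane},
  year   = {2005},
  title  = {Later work},
}
\end{filecontents}
```

The id only feeds into the hash. The names themselves are still printed as they are written in each entry.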

@plk
Owner

plk commented Mar 19, 2023

I also added a new field, fullhashraw, to retain the distinction between the new hash logic and the previous behaviour, where the fullhash was always built from all of the full nameparts. This is in case that is needed when someone uses the new functionality and changes the hashes. fullhashraw will always contain the hash of all full nameparts and hence the complete glyphs of the name.
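
One way to inspect these values while experimenting is a small debugging cite command. This is only a sketch: the command name \showhashes is made up for illustration, and it assumes that fullhashraw can be read with \thefield in the same way as namehash and fullhash (biblatex 3.20 with biber 2.20 or later).

```latex
% Debugging sketch: print the hash fields of each cited entry.
% \showhashes is a hypothetical helper name, not part of biblatex.
\DeclareCiteCommand{\showhashes}
  {}% precode
  {\texttt{\thefield{entrykey}}:
   namehash=\texttt{\thefield{namehash}},
   fullhash=\texttt{\thefield{fullhash}},
   fullhashraw=\texttt{\thefield{fullhashraw}}}% loopcode, run once per key
  {\multicitedelim}% separator between keys
  {}% postcode
```

Calling \showhashes{ABC01, ADE01} in the document body, once with and once without a custom \DeclareNamehashTemplate, shows which hashes the template changes.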

plk added the fixedindev (Fixed in current DEV version), biber, and needs-feedback (needs feedback from reporting user) labels Mar 23, 2023
plk added this to the v3.20 milestone Mar 23, 2023
@moewew
Collaborator

moewew commented Mar 28, 2024

biblatex v3.20 with these changes is out now.

If you want only initials to be used for hashing, go with

\documentclass{article}

\usepackage[
  backend=biber,
  style=authoryear-comp,
  maxcitenames=1,
  giveninits=true,
  uniquename=init,
  uniquelist=false,
]{biblatex}

\usepackage{hyperref}

\DeclareNamehashTemplate{
  \namepart[hashscope=full]{family}
  \namepart[hashscope=init]{given}
  \namepart[hashscope=full]{prefix}
  \namepart[hashscope=full]{suffix}
}

\begin{filecontents}{\jobname.bib}
@misc{ABC01,
  author = {Author, A. and Buthor, B. and C},
  year   = {2001},
  title  = {Alpha},
}
@misc{ABC02,
  author = {Author, A. and Buthor, B. and Cuthor, C. and Duthor, D.},
  year   = {2001},
  title  = {Beta},
}
@misc{ADE01,
  author = {Author, Andrew and Duthor, D. and E},
  year   = {2001},
  title  = {And now for something completely different},
}
\end{filecontents}
\addbibresource{\jobname.bib}

\begin{document}
Some text \autocite{ABC01, ABC02, ADE01}.

Some text \autocite{ADE01}.

\printbibliography
\end{document}

If you want to override hashing, go with

\documentclass{article}

\usepackage[
  backend=biber,
  style=authoryear-comp,
  maxcitenames=1,
  giveninits=true,
  uniquename=init,
  uniquelist=false,
]{biblatex}

\usepackage{hyperref}

\begin{filecontents}{\jobname.bib}
@misc{ABC01,
  author = {id=A1, family=Author, given=A. and Buthor, B. and C},
  year   = {2001},
  title  = {Alpha},
}
@misc{ABC02,
  author = {id=A1, family=Author, given=A. and Buthor, B. and Cuthor, C. and Duthor, D.},
  year   = {2001},
  title  = {Beta},
}
@misc{ADE01,
  author = {id=A1, family=Author, given=Andrew and Duthor, D. and E},
  year   = {2001},
  title  = {And now for something completely different},
}
\end{filecontents}
\addbibresource{\jobname.bib}

\begin{document}
Some text \autocite{ABC01, ABC02, ADE01}.

Some text \autocite{ADE01}.

\printbibliography
\end{document}

moewew closed this as completed Mar 28, 2024
moewew removed the fixedindev (Fixed in current DEV version) and needs-feedback (needs feedback from reporting user) labels Mar 28, 2024