Feature request - option in utf8tolatex to maintain capitalisation for BibTeX #21
Comments
Hi, I don't understand your point. You want an option to surround/protect some parts by braces? My understanding is that the BibTeX entry itself should be protected... |
Sorry if this wasn't completely clear. I am trying to create BibTeX files programmatically. Depending on the bibliography style, BibTeX may render titles in "titlecaps" or "sentence case", irrespective of the capitalisation used in the BibTeX file. To avoid interfering with mandatory capitalisation, title parts that must stay capitalised (e.g. acronyms) should be protected by braces in the BibTeX file. Let's consider this example title
In the bibliography of a compiled LaTeX document, this might be displayed as
The corresponding part of the bibtex file should be
Note that using double braces would prevent bibtex from using the title capitalisation from the bibliography style and is thus not wanted.
would lead to
Below is a small Python script that makes this question easier to explore:
#!/usr/bin/env python3
## Import statements
import sys
import re
from pylatexenc.latexencode import utf8tolatex

## Function to surround acronyms with braces
def capitalize_title(title):
    capitalization_regex = re.compile('[A-Z]{2,}')
    # Split on non-word characters, keeping the separators
    words = re.split(r'(\W)', title)
    for idx, word in enumerate(words):
        m = capitalization_regex.search(word)
        if m:
            new_word = '{' + word[m.start():m.end()] + '}'
            words[idx] = words[idx].replace(word[m.start():m.end()], new_word)
    return ''.join(words)

def utf8tobibtex_title(title):
    # Encode to LaTeX first, then brace-protect the acronyms
    return capitalize_title(utf8tolatex(title))

orig_titles = [ "AET: An Exposé of Titles", "AET: An exposé of titles" ]
# Extra titles can be passed on the command line
for cmd_line_arg in sys.argv[1:]:
    orig_titles.append(cmd_line_arg)

for orig_title in orig_titles:
    print("===")
    print("orig_title\n" + orig_title + "\n")
    print("utf8tolatex(orig_title)\n" + utf8tolatex(orig_title) + "\n")
    print("utf8tobibtex_title(orig_title)\n" + utf8tobibtex_title(orig_title) + "\n")
    print("Title in bibtex context")
    print("title={" + utf8tobibtex_title(orig_title) + "},\n")
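To make the effect of the braces concrete, here is a rough sketch of what a sentence-case bibliography style does to a title. It is only an approximation written for illustration: `sentence_case` is a hypothetical helper, not BibTeX's actual algorithm (the real one also has special handling for text after a colon and for accented-character groups such as `{\'e}`). It assumes the string passed in is the field value without the outer field delimiters.

```python
def sentence_case(title):
    """Roughly mimic a sentence-case style: lowercase everything except
    the first letter of the title and any text inside braces.
    Simplified illustration only, not BibTeX's real algorithm."""
    out = []
    depth = 0            # current brace nesting depth
    seen_first = False   # becomes True once the first letter has been passed
    for ch in title:
        if ch == '{':
            depth += 1
            out.append(ch)
        elif ch == '}':
            depth = max(depth - 1, 0)
            out.append(ch)
        elif depth > 0 or not seen_first:
            out.append(ch)           # brace-protected text or first letter: keep as-is
        else:
            out.append(ch.lower())   # unprotected text is lowercased
        if ch.isalpha():
            seen_first = True
    return ''.join(out)


# Unprotected acronym loses its capitals
print(sentence_case("AET: An Exposé of Titles"))
# Acronym protected by a single pair of braces survives
print(sentence_case("{AET}: An Exposé of Titles"))
# Whole title wrapped in an extra pair of braces: the style cannot change anything
print(sentence_case("{AET: An Exposé of Titles}"))
```

The third call is the "double braces" case: since the whole value is protected, the bibliography style can no longer apply its own capitalisation, which is why only the acronym should be braced.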
Thanks for your feedback. My impression is that the functionality that you're suggesting is a bit orthogonal to the purpose of However, I've been meaning to improve Hopefully I'll be able to get to this soon. |
OK thanks. As this is not on the roadmap, I will close this ticket. As an off-topic side note as well, I also tried |
Hi again. I'm working on a
import re
from pylatexenc import latexencode
from pylatexenc.latexencode import UnicodeToLatexEncoder

u = UnicodeToLatexEncoder(
    conversion_rules=[
        latexencode.UnicodeToLatexConversionRule(
            latexencode.RULE_REGEX,
            [ (re.compile(r'([{}])'), r'\1'), # keep existing braces
              (re.compile(r'\b([A-Z]{2,}\w*)\b'), r'{\1}'), ] # wrap acronym-like words in braces
        ),
    ] + latexencode.get_builtin_conversion_rules('defaults')
)
result = u.unicode_to_latex(input_string)
See updated doc: https://pylatexenc.readthedocs.io/en/latest/latexencode/
To install the development version, clone the git repo, then in the cloned directory run the commands:
|
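For anyone who wants to check what the acronym pattern `\b([A-Z]{2,}\w*)\b` from the rule above picks up before wiring it into an encoder, here is a small stand-alone sketch using only the standard `re` module. The helper name `brace_acronyms` is made up for this example, and plain `re.sub` is only an approximation of how pylatexenc applies regex rules, so treat the output as indicative.

```python
import re

# Same pattern as in the conversion rule: a word starting with two or more
# uppercase ASCII letters, optionally followed by more word characters.
ACRONYM = re.compile(r'\b([A-Z]{2,}\w*)\b')

def brace_acronyms(text):
    """Wrap every acronym-like word in a single pair of braces (hypothetical helper)."""
    return ACRONYM.sub(r'{\1}', text)


print(brace_acronyms("AET: An Exposé of Titles"))
print(brace_acronyms("USAir flies to NASA"))
# Caveat of this sketch: text that is already braced gets a second pair
print(brace_acronyms("{AET}"))
```

Note that the `\w*` part extends the match to the whole word, so mixed words such as "USAir" are braced entirely, and that input which already contains protected acronyms ends up double-braced with this simple substitution.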
It would be great to have an option to keep custom capitalisation for bibtex.
For example
would be encoded as
For now, I am using code borrowed from https://openreview-py.readthedocs.io/en/latest/_modules/tools.html#get_bibtex in combination with utf8encode: