As per request, BBT charmap table #1073

Open
wants to merge 1 commit into
base: master

Conversation

@retorquere
Contributor
commented Jun 4, 2016

I'm not entirely sure whether

  1. you want all of these characters, or
  2. my assumption that you do math mode by wrapping each individual construct in '$...$' is correct,

but here it is (in response to https://forums.zotero.org/discussion/59190/accented-character-u/#Item_12).
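
For readers unfamiliar with the idea, here is a minimal sketch of what such a character map and the per-construct '$...$' wrapping could look like. It is written in Python for illustration only (the translator itself is not Python), and the three table entries, the `CHARMAP` name, and the `to_latex` helper are all hypothetical; they are not taken from the PR.

```python
# Hypothetical three-entry excerpt; the table in the PR is far larger.
# Each value is (latex_command, needs_math_mode). The entries are
# illustrative assumptions, not copied from the actual charmap.
CHARMAP = {
    "\u00fc": ('{\\"u}', False),   # LATIN SMALL LETTER U WITH DIAERESIS
    "\u03b1": ("\\alpha", True),   # GREEK SMALL LETTER ALPHA
    "\u2264": ("\\leq", True),     # LESS-THAN OR EQUAL TO
}


def to_latex(text: str) -> str:
    """Replace mapped characters, wrapping each math construct in its own $...$."""
    out = []
    for ch in text:
        if ch in CHARMAP:
            latex, needs_math = CHARMAP[ch]
            out.append(f"${latex}$" if needs_math else latex)
        else:
            # Unmapped characters pass through unchanged.
            out.append(ch)
    return "".join(out)
```

Under these assumptions, `to_latex("Müller")` yields `M{\"u}ller`, and a math symbol such as `≤` is emitted as its own `$\leq$` group rather than being merged with neighbouring constructs.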

@adam3smith
Collaborator

@dstillman your opinion on this would be useful -- this would definitely improve import/export, but we're talking about 9.5k additional lines in the BibTeX translator, making it about 5 times its current size. Is that a concern?

@dstillman
Member

But this is still only relevant for non-UTF-8 import/export, right? I just continue to care very little about that. People should use an implementation that supports UTF-8. Reproducing all of Unicode seems incredibly dumb.

(As for the impact of this, I don't know. It might be fine, or it might slow down loading of the translator, and it's certainly a lot of data to send down to every Zotero client whenever the translator is updated and to the connector for any non-client save that uses BibTeX (once every 24 hours, though we should do better caching). And it certainly will make the translator more annoying to edit.)

@adam3smith
Collaborator

That's right. But BibTeX is really a non- (or pre-) UTF-8 format, which is why the majority of straight BibTeX implementations don't work with UTF-8 and certainly don't assume it. Should people be using biblatex instead? sure. But I'm not seeing that as widely as one might expect.

@dstillman
Member

> Should people be using biblatex instead? sure. But I'm not seeing that as widely as one might expect.

More people might, though, if tools stopped catering to the now-ridiculous demands of a pre-Unicode format…

@retorquere
Contributor Author

I wouldn't fault Zotero for skipping this table, because I think Dan's assumption that this is a solved problem is a valid argument. But a) I put the pull request here because I was asked to, and b) BBT users don't seem to think it's dumb. As @adam3smith points out, a lot of places still demand BibTeX and the author gets zero say. When people submit error reports I also get their BBT settings, and even biblatex users quite often have this translation active.

@zuphilip
Contributor

How about taking only a portion of this list? I think that the correct spelling of author names is more important than all the mathematical symbols, which can then be added manually. Here are some rough statistics about this PR:

  • LATIN -> 777 lines
  • GREEK -> 268 lines
  • CYRILLIC -> 762 lines
  • Rest -> 7645 lines

(The numbers are simply search results for these terms and may be only rough estimates.)
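
The split suggested above can be sketched as a filter on Unicode character names. This is only an illustration in Python: it assumes the table is keyed by single code points, that a substring match on the standard Unicode name is an acceptable (rough) way to assign a script, and the names `script_of`, `count_by_script`, and `keep_scripts` are hypothetical helpers, not part of the PR or of Zotero.

```python
import unicodedata
from collections import Counter

# Script keywords searched for in the Unicode character name.
SCRIPTS = ("LATIN", "GREEK", "CYRILLIC")


def script_of(ch: str) -> str:
    """Return the first script keyword found in the character's name, else 'Rest'."""
    name = unicodedata.name(ch, "")  # "" for characters without a name
    for script in SCRIPTS:
        if script in name:
            return script
    return "Rest"


def count_by_script(chars) -> Counter:
    """Count how many characters fall under each script keyword."""
    return Counter(script_of(ch) for ch in chars)


def keep_scripts(charmap: dict, keep=SCRIPTS) -> dict:
    """Keep only entries whose key belongs to one of the given scripts."""
    return {ch: tex for ch, tex in charmap.items() if script_of(ch) in keep}
```

Applied to a full table, `count_by_script` would give a breakdown of the same kind as the list above, and `keep_scripts` would drop the "Rest" group. Because it matches on name substrings, it shares the caveat that the result is an estimate rather than an exact script classification.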

@dstillman
Member

That seems reasonable.
