Adding Hungarian support #717

Closed
marczellm opened this issue Feb 24, 2018 · 104 comments

4 participants
@marczellm
Contributor

commented Feb 24, 2018

I have forked biblatex at https://github.com/marczellm/biblatex and started working on Hungarian support with the help of @priestoferis / Bence Ferdinándy.

This issue is to track our progress.

@moewew moewew added the localisation label Feb 24, 2018

@moewew

Collaborator

commented Feb 24, 2018

Great! Let us know if you need help or anything fancy.

The new language file checklist is at https://github.com/plk/biblatex/wiki/Checklist-for-submitting-a-new-localisation-file-(.lbx)
Dates can do much more than they used to, so we need to know about 'long' and 'short' date formats (for dates with year, month, day; as well as for those with no day or no month; and for dates with only a season component) as well as date ranges and compact date ranges.

You can have a look at #708 and #710 for an example of how this went with Latvian support.

@moewew moewew added this to To Do in biblatex 3.12 Feb 24, 2018

@priestoferis

Contributor

commented Feb 24, 2018

Thanks very much! I'm adding this here, we'll edit this as we go along.

Bibliography Extras

Adding bibliography extras requires more in-depth knowledge of Biblatex's localization modules. It is usually best if you go over the checklist below and send the answers to the Biblatex developers along with the translations; i.e., you translate the \DeclareBibliographyStrings part but omit \DeclareBibliographyExtras.

  1. What kind of dash do you use in number ranges like "35-50"?

    • \textendash? \textemdash? hyphen?
    • Are there any spaces around the dash?
  2. What kind of dash do you use in date ranges like "1st January-3rd February 2009"?

    • \textendash? \textemdash? hyphen?
    • Are there any spaces around the dash?
  3. In lists and enumerations, do you add a comma in addition to connectors like "and"? Examples:

    • UK English: "apples, oranges and cherries": no comma, like UK English
  4. How do you format short ordinals? Are ordinals gender/number sensitive? Examples:

    • German: 1. 2. 3. 4. 5. ... [= gender/number insensitive]: insensitive, like German
  5. What's the gender of the word "edition"?

    • No genders in Hungarian
  6. What's the gender of the word "series" (of a journal)?

    • No genders.
  7. What's the long date format? Hun. Academy of Sci. date spelling reference

    1. What happens if the day is missing from the date?
    2. What happens if both day and month are missing?
    3. Does this day format use leading zeros?
    4. Please be specific about the spaces to be used (non-breakable, thin spaces).
    5. What happens for date ranges with compact (1.--6. February 2010, 1. February--6. March 2010) and non-compact ranges (1. February 2010--6. February 2010)?

    Examples:

    • US English: [month name] [day as number], [year] = January 1, 2009
    • UK English: [day as ordinal] [month name] [year] = 1st January 2009

    Reduced precision examples:

    • UK English: March 2010
    • UK English: 2010

    Compact date range examples:

    • US English: January 1--6, 2009; January 1--February 2, 2009
  8. What's the short date format?

    1. What happens if the day is missing from the date?
    2. What happens if both day and month are missing?
    3. Does this day format use leading zeros?
    4. Please be specific about the spaces to be used (non-breakable, thin spaces).
    5. What happens for date ranges with compact (1.--6. February 2010, 1. February--6. March 2010) and non-compact ranges (1. February 2010--6. February 2010)?

    Examples:

    • US English: month/day/year = 01/31/2009
    • UK English: day/month/year = 31/01/2009
    • German: day.month.year = 31.01.2009
    • Swedish: year-month-day = 2009-01-31
  9. After which punctuation marks do you capitalize a word? Examples:

    • after ". ! ?" (English = package default): like English
  10. What's the 12h time format?/What's the 24h time format?

  11. What's the format for dates with seasons only, such as 'Summer 2010'?
    • This might get tricky if the word Summer has to be used both on its own and with a date. E.g. Summer is 'nyár', but in 'Summer 2010' you have to inflect it: '2010 nyara' (literally, 'summer of', but it's always used like this).

@moewew

Collaborator

commented Feb 24, 2018

Since I'm not sure the checklist covers every eventuality any more, get ready to get bombarded with more questions later.

And prepare to explain the basics of Hungarian grammar to me should it be necessary.

There are a few things the localisation simply can't deal with at the moment because it was written from a Germanic point of view. You can always ask if something is not there, but we may have to resort to workarounds in some cases.

@priestoferis

Contributor

commented Feb 24, 2018

Ok, sure, I'm prepared that this might take some time :) And thanks for helping! I'm hoping that most quirks of Hungarian will not come up during a bibliography, but we'll try to recruit someone who's more humanities inclined than we are, just in case.

@moewew

Collaborator

commented Feb 25, 2018

@marczellm

Contributor Author

commented Mar 2, 2018

Here are my answers to some of the questions

  1. What kind of dash do you use in number ranges like "35-50"?

  2. What kind of dash do you use in date ranges like "1st January-3rd February 2009"?

    • \textendash with spaces
  3. What's the long date format? [year]. [month name] [day]. = 2018. március 2.

    1. What happens if the day is missing from the date? [year]. [month] = 2018. március
    2. What happens if both day and month are missing? [year].
    3. Does this day format use leading zeros? No
    4. Please be specific about the spaces to be used (non-breakable, thin spaces). Non-breakable, although the spelling dictionary doesn't specify.
    5. What happens for date ranges with compact and non-compact ranges?
    • 2009. január 1--6.
    • 2009. január 1. -- február 2.
    • 2009. január 1. -- 2009. február 2.
  4. What's the short date format? [year]. [month with leading zeros]. [day with leading zeros].

    1. What happens if the day is missing from the date? [year]. [month with leading zeros].
    2. What happens if both day and month are missing? [year].
    3. Non-breakable spaces are used
    4. What happens for date ranges with compact and non-compact ranges?
    • 2009. 01. 01--06.
    • 2009. 01. 01. -- 02. 02.
    • 2009. 01. 01. -- 2009. 02. 02.
  5. What's the 12h time format?/What's the 24h time format?

    • I'd really suggest always using the 24h time format which is simply [hour with leading zeros]:[minutes with leading zeros] e. g. 01:35
    • We don't have AM and PM but we have hajnali 5, reggel 6, reggel 9, délelőtt 10, 12, délután 1, délután 5, este 6, este 11 where hajnal=dawn, reggel=morning (6-10), délelőtt=10-12, délután=afternoon (1-5), este=evening (6-)
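
The long date format described above ([year]. [month name] [day]., no leading zeros, non-breakable spaces) could be written roughly as follows in the \DeclareBibliographyExtras part of a language file. This is only a sketch under assumptions: the macro names (\mkbibdatelong, \mkbibmonth, \stripzeros, \addnbspace, \adddot) are biblatex's own, but the body is an illustration written for this thread and is not the code of the released magyar.lbx.

```latex
% Hypothetical fragment for a Hungarian .lbx file (illustration only).
\DeclareBibliographyExtras{%
  % Long date: [year]. [month name] [day].   e.g. 2018. március 2.
  % Arguments: #1 = year field, #2 = month field, #3 = day field.
  \protected\def\mkbibdatelong#1#2#3{%
    \iffieldundef{#1}
      {}
      {\stripzeros{\thefield{#1}}\adddot      % year followed by a dot
       \iffieldundef{#2}{}{\addnbspace}}%     % non-breakable space before the month
    \iffieldundef{#2}
      {}
      {\mkbibmonth{\thefield{#2}}%            % month name taken from the bibstrings
       \iffieldundef{#3}{}{\addnbspace}}%     % non-breakable space before the day
    \iffieldundef{#3}
      {}
      {\stripzeros{\thefield{#3}}\adddot}}%   % day without leading zeros, plus dot
}
```

With the day missing this falls back to "[year]. [month]", and with both day and month missing to "[year].", matching the answers given above. Date ranges and the short format would need their own definitions.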
@marczellm

Contributor Author

commented Mar 2, 2018

What are the rules for capitalizing/not capitalizing bibliography strings in the lbx file? Is it relevant?

@moewew

Collaborator

commented Mar 2, 2018

Capitalise the strings as if they were to appear mid-sentence.

@marczellm

Contributor Author

commented Mar 2, 2018

What to do in case of strings that apparently have no translation? "continuator" for example. The concept just didn't warrant a word.

@moewew

Collaborator

commented Mar 2, 2018

If you don't have a good translation, leave it untranslated. Do not leave the string empty as in continuator = {{}{}}. You can go with a commented-out entry:

%continuator = {{}{}},%FIXME missing

If it is about prepositions that are normally rendered with declension of the following noun phrase, we will have to examine that on a case-by-case basis.

@priestoferis

Contributor

commented Mar 2, 2018

Months can also be written with roman numerals

2009. január 1. -- 2009. február 2. -> 2009. I. 1. -- 2009. II. 2.

I think it can be substituted anywhere, so we might need a roman date option.

@moewew

Collaborator

commented Mar 2, 2018

Sorry, I messed up my last comment. If you don't know how to translate a string, comment it out

%continuator = {{}{}},% FIXME missing
@marczellm

Contributor Author

commented Mar 29, 2018

What are \lbx@lfromlang and \lbx@sfromlang?

@moewew

Collaborator

commented Mar 29, 2018

These macros store the string "from the <language>" so that bit can be inserted at the correct place in strings such as "translated from the <language> by". \lbx@lfromlang is the long version, \lbx@sfromlang is the short version.

See for example

  byeditortranin   = {{edited, translated \lbx@lfromlang\finalandcomma\ and annotated, with an introduction, by}%
                      {ed.,\addabbrvspace trans\adddot\ \lbx@sfromlang\finalandcomma\ and annot, with an introd., by}},

in english.lbx.

@marczellm

Contributor Author

commented Mar 29, 2018

The word order will have to be switched in the case of withforeword and similar keys. In Hungarian it's "<name> előszavával" etc., the name comes first.

@moewew

Collaborator

commented Mar 29, 2018

Unfortunately, the localisation interface does not support different word orders at the moment.

It would in theory be possible to change the word order by redefining the macros that use these strings. That would mean quite a few redefinitions, though. The issue with redefining macros in the .lbx is that these redefinitions overwrite definitions the user makes in the preamble and are reset every time the language changes.

@marczellm marczellm closed this Mar 31, 2018

@marczellm marczellm reopened this Mar 31, 2018

@moewew

Collaborator

commented Mar 31, 2018

Seems as though your last comment accidentally got deleted. Your solution for withforeword is exactly right.
The English version is

withforeword     = {{with a foreword by}{with a forew\adddot\ by}},

and the German is

withforeword     = {{mit einem Vorwort von}{mit einem Vorw\adddot\ von}},

so a translation that comes down to "[with a] foreword (written) by" seems good.

Translations for certain phrases with prepositions can be very challenging in languages that use case instead. The translations for those can be guided by example use. Check out https://github.com/plk/biblatex/blob/dev/doc/latex/biblatex/examples/03-localization-keys.tex and what it has to say about these strings. If you can't find an example that uses the strings, you can ask here again. Maybe we can come up with a typical example.

@marczellm

Contributor Author

commented Mar 31, 2018

Do the plurals such as "pages", "volumes" only ever get printed in conjunction with numbers? Because if they do, there's no need for plurals in Hungarian. It's "1. oldal" and "1-100. oldal", the singular form is used. Or should I add the plural forms to the bib strings anyway and then we'll redefine the field formats to use the singular form?

@moewew

Collaborator

commented Mar 31, 2018

Good question. I'm always amazed by the grammatical features of different languages.

pages is exclusively used for page references: pp. 20-28 etc. totalpages is used for 200 pages when it signifies the length of a book.

I prefer the solution that involves fewer redefinitions of things that are not bibstrings (field formats, bibmacros etc.), so in this case you would probably be fine going for the singular form in both variants. If, however, we need to put the bibstring after the page number, we have an entirely different and more serious problem anyway.

@marczellm

Contributor Author

commented Mar 31, 2018

Why? \DeclareFieldFormat[article,periodical,inproceedings]{pages}{#1\adddotspace\bibstring{pages}} worked just fine when I was writing my thesis.

@moewew

Collaborator

commented Mar 31, 2018

Two things.

Firstly, the standard definitions use a more intricate definition

\DeclareFieldFormat{pages}{\mkpageprefix[bookpagination]{#1}}

\mkpageprefix can do a little bit more than a simple #1\adddotspace\bibstring{pages} (it suppresses the pagination bibstring if it detects that the argument is not a page range, it can deal with \psq correctly, ...).

Secondly, a solution would either involve redefining field formats or internal macros for Hungarian. As I stated above I would like to stay clear of that as much as possible, since it can lead to undesired effects. It would probably be acceptable to locally redefine \mkpageprefix...
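
To see the difference between the two approaches, a small test document along the following lines can help. It is a sketch only: it assumes Biber and the biblatex-examples.bib file that ships with biblatex, and it reuses the number-first field format quoted earlier in this thread, so it shows the preamble-level workaround rather than anything that ended up in magyar.lbx.

```latex
\documentclass{article}
\usepackage[backend=biber]{biblatex}
\addbibresource{biblatex-examples.bib}

% Standard definition, shown here for comparison only:
%   \DeclareFieldFormat{pages}{\mkpageprefix[bookpagination]{#1}}
%
% Number-first variant from this thread. It always prints the bibstring,
% even when the field is not a page range, which \mkpageprefix would avoid.
\DeclareFieldFormat[article,periodical,inproceedings]{pages}{%
  #1\adddotspace\bibstring{pages}}

\begin{document}
\nocite{sigfridsson}% an @article entry with a pages field
\printbibliography
\end{document}
```

Because the redefinition lives in the preamble, it only affects this one document, which is the kind of user-level change a language file should avoid overwriting.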

@marczellm

Contributor Author

commented Mar 31, 2018

The alternative is: leave it as p. and pp. and use the English word order. Some, maybe most Hungarian journals are OK with that.

@moewew

Collaborator

commented Mar 31, 2018

I'll see what I can do to \mkpageprefix. I can't promise anything, though. I'd only go for 'p.' and 'pp.' as a last resort and if it is indeed widely acceptable.

@moewew

Collaborator

commented Mar 31, 2018

OK, for page numbers I can offer an acceptable solution that only involves redefining lower-level commands that should not be touched by the average user. I consider that more acceptable than redefining higher-level things that a user could conceivably modify themselves.

@moewew

Collaborator

commented Mar 31, 2018

I have submitted a pull request with the necessary changes to the low-level page macro (marczellm#2). I will decline write access to your repository since I prefer to work with pull requests, so you can check the changes I make and I don't accidentally overwrite something important on your end.

@marczellm

Contributor Author

commented Mar 31, 2018

Meanwhile, could you explain in a nutshell (or point me to documentation if I missed it) the purpose of the various test files in the testfiles folder? With the bibliography extras that we're adding, we're going to need tests for all these cases, but I don't know if I'll have to write it or just haven't found it yet.

(Or doc/latex/biblatex/examples)

@marczellm

Contributor Author

commented Aug 13, 2018

@priestoferis @moewew I pushed a few fixes. We can proceed to pull request.

@priestoferis

Contributor

commented Aug 13, 2018

Great! @marczellm Will you do it or should I? And to clarify compared to your first post, I actually spell my name with an a instead of an á :D

But before we actually do make a request I just noticed something:
@moewew there are many language-specific keys, like fromenglish, countryeu. Hungarian is missing from both magyar.lbx and english.lbx. I thought that the hungarian branch should also include adding these to a few languages. I can safely do English and German I guess, which would also be the two most important ones anyway, and could also confirm several other languages with native speakers. What do you think?

@moewew

Collaborator

commented Aug 13, 2018

langhungarian and fromhungarian should be added, yes. Ideally you'd add them to all .lbx files and comment them out if you don't know the translation, cf.

% langlatvian = {{}{}},% FIXME: missing

Please add the strings in a sensible place, i.e. fromhungarian between fromgreek and fromitalian, and langhungarian between langgreek and langitalian.

If you can confirm translations for other languages from native speakers, that would be great, but it is not a must, they can and hopefully will be picked up when a native speaker reviews the .lbx file and suggests improvements. I have already added the strings for English, German and Dutch, so anything else would be welcome. Cf. dev...marczellm:hungarian

The country... and patent... strings are only for patents and are for illustrative purposes only, so we don't add new languages there.
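
As an illustration of the placement described above, the new keys would sit alphabetically between the Greek and Italian entries. The snippet below is a sketch of what that could look like inside the \DeclareBibliographyStrings block of english.lbx; the English wording shown is an assumption modelled on the neighbouring entries, and the spacing is only cosmetic.

```latex
  % ... inside \DeclareBibliographyStrings{ ... } of english.lbx
  langgreek        = {{Greek}{Greek}},
  langhungarian    = {{Hungarian}{Hungarian}},
  langitalian      = {{Italian}{Italian}},
  % ...
  fromgreek        = {{from the Greek}{from the Greek}},
  fromhungarian    = {{from the Hungarian}{from the Hungarian}},
  fromitalian      = {{from the Italian}{from the Italian}},
```

In a language file where the translation is not known, the same position would hold a commented-out line instead:

```latex
% langhungarian    = {{}{}},% FIXME: missing
% fromhungarian    = {{}{}},% FIXME: missing
```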

@moewew

Collaborator

commented Aug 13, 2018

You may also want to remove the % -*- coding: utf-8 -*- in https://github.com/marczellm/biblatex/blob/9b162b765f672c1494fdc3766a45af89e3cb81d8/tex/latex/biblatex/lbx/magyar.lbx#L1 before the pull request.

@priestoferis

Contributor

commented Aug 13, 2018

I'm on it. It's been called to my attention that in the Bulgarian lbx the Greek abbreviation is wrong: fromgreek = {{от гръцки}{от\addabbrvspace греч\adddot}} should use гръч (as it is in langgreek), but I don't feel comfortable changing it in this pull request. What should I do?

@moewew

Collaborator

commented Aug 13, 2018

Just open a different pull request if you have this on good authority or open an issue if it needs to be discussed first. Don't include it in the Hungarian changes, in case we need to roll it back later.

@priestoferis

Contributor

commented Aug 13, 2018

Made a pull for the Bulgarian fix. I also added empty strings for Hungarian, and confirmed a couple of translations with native speakers. I am waiting for a few more languages, so I'd wait a few days with making the pull request. @marczellm If it's fine with you, I'll make the pull request once I have all the translations in that I could find a native speaker for.

@priestoferis

Contributor

commented Aug 14, 2018

I will get some translations tomorrow, but I decided to stop with those. If I do come across translations later I can always do a pull request later. So if @marczellm is okay with it, I'll do the pull request sometime late Wednesday.

@marczellm

Contributor Author

commented Aug 14, 2018

@priestoferis

Contributor

commented Aug 15, 2018

@moewew This is a technical question (I haven't done something like this before, so I'm trying to learn and not mess up).

I squashed the commits and did a rebase with dev using this as a template.

Now when I make the pull request it says that the request can't be automatically merged, I'm guessing because of this:
[image not preserved]

Did I do anything wrong? Shouldn't the whole point of the rebase be to avoid conflicts like this? I already force-pushed the rebase to the repo, but I made a backup before, so I can restore it if needed.

@moewew

Collaborator

commented Aug 15, 2018

Mhhh, that change looks OK. Normally we get merge conflicts with the docs or the CHANGELOG if it changed in between. I would have thought a rebase would have alleviated those problems, but I'm no Git expert and I rarely do a rebase, so what do I know. Just submit the PR and I'll hopefully sort it out.

@priestoferis

Contributor

commented Aug 15, 2018

Ok. I am also not exactly happy with the fact that this process seems to have hidden your contribution somehow. But anyway, I'll go forward with the request.

@moewew

Collaborator

commented Aug 15, 2018

#780 is merged. The Hungarian localisation should be included in the next regular release. There is no ETA on that as of yet, it might take a while until it comes out unless there is a critical bugfix. PLK is a bit busy right now.

@moewew moewew moved this from In Progress to Done in biblatex 3.12 Aug 23, 2018

@priestoferis

Contributor

commented Oct 11, 2018

shouldn't this issue be closed? (sorry for spamming if not)

@moewew

Collaborator

commented Oct 12, 2018

I normally only close issues formally when the relevant version is released, so if it does not offend your sense of aesthetics I would leave this open for now.

We should release a new version soon (people are already complaining #821), but there are still some things to sort out.

@priestoferis

Contributor

commented Oct 12, 2018

Ok, it's actually logical not to, now that I think of it.

@priestoferis

Contributor

commented Oct 18, 2018

I tried using this dev branch on my computer, but the lbx on its own is incompatible with texlive2017, and if I install the whole thing locally to ~/texmf, then biber complains: ERROR - Error: Found biblatex control file version 3.5, expected version 3.4. (biber is 2.10). What would be the easiest way to start using the Hungarian support?

@moewew

Collaborator

commented Oct 18, 2018

The .lbx files you find in the repository here are meant to be used with biblatex 3.12 (as of now unreleased, dev versions available on https://sourceforge.net/projects/biblatex/files/development/; the dev versions are only meant for testing, not for production use). They will need tweaking if you want to use them with older versions of biblatex (even with the current version 3.11).

You will need a dev/beta version of Biber 2.12 to use biblatex 3.12 dev https://sourceforge.net/projects/biblatex-biber/files/biblatex-biber/development/ (current Biber version is 2.11, so 2.10 is outdated even by today's standards).

The easiest way is probably to wait a week or two until we have released biblatex 3.12 and Biber 2.12. You can then get the update via your TeX distribution and things should work out of the box.

edit FWIW https://github.com/plk/biblatex/blob/1df4a512b4c804afccd3bd015554c1a5ee6f62d8/tex/latex/biblatex/lbx/magyar.lbx can be tweaked to work with biblatex 3.11/Biber 2.11 simply by commenting out langhungarian and fromhungarian

langhungarian = {{magyar}{magyar}},

and

fromhungarian = {{magyarb\'ol}{magyarb\'ol}},

or conversely by adding \NewBibliographyString{langhungarian} and \NewBibliographyString{fromhungarian} to each document that uses the file. For older versions like 3.10 or 3.8 you may have to comment out or define even more unknown strings.
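
The second option, declaring the two keys per document, might look like the sketch below. It assumes biblatex 3.11 with Biber 2.11, a copy of magyar.lbx that TeX can find, and the biblatex-examples.bib shipped with biblatex; the cited entry is only a placeholder. From biblatex 3.12 onwards the two declarations should no longer be needed.

```latex
\documentclass[hungarian]{article}
\usepackage{babel}
\usepackage{csquotes}
\usepackage[backend=biber]{biblatex}

% Declare the keys that biblatex 3.11 does not know yet, so that
% magyar.lbx can assign translations to them without an error.
\NewBibliographyString{langhungarian}
\NewBibliographyString{fromhungarian}

\addbibresource{biblatex-examples.bib}

\begin{document}
\cite{sigfridsson}
\printbibliography
\end{document}
```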

@priestoferis

Contributor

commented Oct 18, 2018

Thanks, I ended up installing the dev versions for now. But good to know it will be solved soon.

@moewew moewew added this to the v3.12 milestone Oct 21, 2018

@moewew

Collaborator

commented Nov 6, 2018

biblatex 3.12 has been released and is available in MiKTeX and TeX Live 2018 as of this morning (UTC).

The Hungarian language module is included in that release and should work as expected.

I'm closing this issue for now, please open a new issue if you want to report any issues about magyar.lbx.

@moewew moewew closed this Nov 6, 2018

moewew added a commit that referenced this issue Nov 21, 2018

@moewew

Collaborator

commented Apr 19, 2019

I'm looking into ordinal page ranges again for #882.

Is the current output

\documentclass[hungarian]{article}
\usepackage{babel}
\usepackage{csquotes}

\usepackage[style=authoryear, backend=biber]{biblatex}

\addbibresource{biblatex-examples.bib}

\begin{document}
\cite{sigfridsson}

\cite[lorem]{sigfridsson}

\cite[34-36]{sigfridsson}

\cite[34a-36a]{sigfridsson}

\cite[34-]{sigfridsson}

\cite[-]{sigfridsson}

\cite[-36]{sigfridsson}

\cite[34-36,40-56]{sigfridsson}
\end{document}

34–36. old.

as expected? Or would

34.–36. old.

be more natural?

@priestoferis

Contributor

commented Apr 19, 2019

34-36. old. in the current output is good.

But I built your example and there are several bad outputs.
In cases where the number after the hyphen is absent, there is still a dot, e.g. 34–. old. I'm quite sure that is not right. What I'm not sure of is whether it should be 34- old. or 34.- old. I will try to figure it out.

@priestoferis

Contributor

commented Apr 19, 2019

Here's what we came up with:

  • \cite[34-]{sigfridsson} should be rendered 34.– old.
  • \cite[-]{sigfridsson} makes no sense, so I think it should be treated as a simple \cite{sigfridsson}
  • \cite[34a-36a]{sigfridsson} should use the same dash as the others
@moewew

Collaborator

commented Apr 19, 2019

Thanks for the quick answer.

  1. Not sure if I can get \cite[34-]{sigfridsson} to say 34.-- old when usual ranges say 34--36. old. (I mean, it might be possible, but the amount of work might not be proportional to the good it would do).
  2. Yup, \cite[-]{sigfridsson} is pretty much nonsensical. I think we can live with the status quo (or with almost anything else we get), so I don't think it makes sense to invest time to discard it or do something else about it.
  3. \cite[34a-36a]{sigfridsson} does not pass the \ifpages test and consequently does not get the same formatting as real page ranges. It just gets treated as normal text, hence we don't change the dash.

So only the 34.-- old thingy is relevant, but I'm not sure we can do anything there.

@priestoferis

Contributor

commented Apr 19, 2019

> Thanks for the quick answer.
>
>   1. Not sure if I can get \cite[34-]{sigfridsson} to say 34.-- old when usual ranges say 34--36. old. (I mean, it might be possible, but the amount of work might not be proportional to the good it would do).

Well, I certainly have no idea how :) Manually writing \cite[34.--~old.]{sigfridsson} solves the issue currently, just not too elegantly.

>   2. Yup, \cite[-]{sigfridsson} is pretty much nonsensical. So I think we can live with the status quo (or with almost anything else we get), so I don't think it makes sense to invest time to discard it or do something else about it.

Ok, one can pay attention to just not do this.

>   3. \cite[34a-36a]{sigfridsson} does not pass the \ifpages test and consequently does not get the same formatting as real page ranges. It just gets treated as normal text, hence we don't change the dash.

For the record, putting \cite[34a--36a.~old.]{sigfridsson} gets the desired result.

> So only the 34.-- old thingy is relevant, but I'm not sure we can do anything there.

Yes.
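
Collected in one place, the manual workarounds mentioned in this exchange could be used as in the following sketch. It assumes biblatex 3.12 or later with the Hungarian module, Biber, and the biblatex-examples.bib shipped with biblatex. The hand-typed postnotes are treated as plain text, so the dots, dashes and "old." are written out explicitly.

```latex
\documentclass[hungarian]{article}
\usepackage{babel}
\usepackage{csquotes}
\usepackage[style=authoryear, backend=biber]{biblatex}
\addbibresource{biblatex-examples.bib}

\begin{document}
% Regular page range: formatted automatically by the language module.
\cite[34-36]{sigfridsson}

% Open-ended range: typed by hand, since the automatic output
% places the dot in the wrong position.
\cite[34.--~old.]{sigfridsson}

% Pages with letter suffixes fail the \ifpages test, so the dash and
% the abbreviation are also typed by hand.
\cite[34a--36a.~old.]{sigfridsson}
\end{document}
```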
