Adding Hungarian support #717

Closed
marczellm opened this issue Feb 24, 2018 · 104 comments

@marczellm marczellm commented Feb 24, 2018

I have forked biblatex at https://github.com/marczellm/biblatex and started working on Hungarian support with the help of @priestoferis / Bence Ferdinándy.

This issue is to track our progress.

@moewew moewew commented Feb 24, 2018

Great! Let us know if you need help or anything fancy.

The new language file checklist is at https://github.com/plk/biblatex/wiki/Checklist-for-submitting-a-new-localisation-file-(.lbx)
Dates can do much more than they used to, so we need to know about 'long' and 'short' date formats (for dates with year, month, day; as well as for those with no day or no month; and for dates with only a season component) as well as date ranges and compact date ranges.

You can have a look at #708 and #710 for an example of how this went with Latvian support.

@moewew moewew added this to To Do in biblatex 3.12 Feb 24, 2018

@ferdinandyb ferdinandyb commented Feb 24, 2018

Thanks very much! I'm adding this here, we'll edit this as we go along.

Bibliography Extras

Adding bibliography extras requires more in-depth knowledge of Biblatex's localization modules. It is usually best if you go over the checklist below and send the answers to the Biblatex developers along with the translations; i.e., you translate the \DeclareBibliographyStrings part but omit \DeclareBibliographyExtras.

  1. What kind of dash do you use in number ranges like "35-50"?

    • \textendash? \textemdash? hyphen?
    • Are there any spaces around the dash?
  2. What kind of dash do you use in date ranges like "1st January-3rd February 2009"?

    • \textendash? \textemdash? hyphen?
    • Are there any spaces around the dash?
  3. In lists and enumerations, do you add a comma in addition to connectors like "and"? Examples:

    • UK English: "apples, oranges and cherries": no comma, like UK English
  4. How do you format short ordinals? Are ordinals gender/number sensitive? Examples:

    • German: 1. 2. 3. 4. 5. ... [= gender/number insensitive]: insensitive, like German
  5. What's the gender of the word "edition"?

    • No genders in Hungarian
  6. What's the gender of the word "series" (of a journal)?

    • No genders.
  7. What's the long date format? (Reference: Hungarian Academy of Sciences date spelling rules.)

    1. What happens if the day is missing from the date?
    2. What happens if both day and month are missing?
    3. Does this day format use leading zeros?
    4. Please be specific about the spaces to be used (non-breakable, thin spaces).
    5. What happens for date ranges with compact (1.--6. February 2010, 1. February--6. March 2010) and non-compact ranges (1. February 2010--6. February 2010)?

    Examples:

    • US English: [month name] [day as number], [year] = January 1, 2009
    • UK English: [day as ordinal] [month name] [year] = 1st January 2009

    Reduced precision examples:

    • UK English: March 2010
    • UK English: 2010

    Comp date range examples:

    • US English: January 1--6, 2009; January 1--February 2, 2009
  8. What's the short date format?

    1. What happens if the day is missing from the date?
    2. What happens if both day and month are missing?
    3. Does this day format use leading zeros?
    4. Please be specific about the spaces to be used (non-breakable, thin spaces).
    5. What happens for date ranges with compact (1.--6. February 2010, 1. February--6. March 2010) and non-compact ranges (1. February 2010--6. February 2010)?

    Examples:

    • US English: month/day/year = 01/31/2009
    • UK English: day/month/year = 31/01/2009
    • German: day.month.year = 31.01.2009
    • Swedish: year-month-day = 2009-01-31
  9. After which punctuation marks do you capitalize a word? Examples:

    • after ". ! ?" (English = package default): like English
  10. What's the 12h time format?/What's the 24h time format?

  11. What's the format for dates with seasons only, such as 'Summer 2010'?
    • This might get tricky if the word "summer" has to be used both on its own and with a date. E.g. "summer" is 'nyár', but in 'Summer 2010' it has to be inflected: '2010 nyara' (literally 'summer of', but it's always used like this).

@moewew moewew commented Feb 24, 2018

Since I'm not sure the checklist covers every eventuality any more, get ready to get bombarded with more questions later.

And prepare to explain the basics of Hungarian grammar to me should it be necessary.

There are a few things the localisation simply can't deal with at the moment because it was written from a Germanic point of view. You can always ask if something is not there, but we may have to resort to workarounds in some cases.

@ferdinandyb ferdinandyb commented Feb 24, 2018

Ok, sure, I'm prepared that this might take some time :) And thanks for helping! I'm hoping that most quirks of Hungarian will not come up during a bibliography, but we'll try to recruit someone who's more humanities inclined than we are, just in case.

@moewew moewew commented Feb 25, 2018

@marczellm marczellm commented Mar 2, 2018

Here are my answers to some of the questions

  1. What kind of dash do you use in number ranges like "35-50"?

  2. What kind of dash do you use in date ranges like "1st January-3rd February 2009"?

    • \textendash with spaces
  3. What's the long date format? [year]. [month name] [day]. = 2018. március 2.

    1. What happens if the day is missing from the date? [year]. [month] = 2018. március
    2. What happens if both day and month are missing? [year].
    3. Does this day format use leading zeros? No
    4. Please be specific about the spaces to be used (non-breakable, thin spaces). Non-breakable, although the spelling dictionary doesn't specify.
    5. What happens for date ranges with compact and non-compact ranges?
    • 2009. január 1--6.
    • 2009. január 1. -- február 2.
    • 2009. január 1. -- 2009. február 2.
  4. What's the short date format? [year]. [month with leading zeros]. [day with leading zeros].

    1. What happens if the day is missing from the date? [year]. [month with leading zeros].
    2. What happens if both day and month are missing? [year].
    3. Non-breakable spaces are used
    4. What happens for date ranges with compact and non-compact ranges?
    • 2009. 01. 01--06.
    • 2009. 01. 01. -- 02. 02.
    • 2009. 01. 01. -- 2009. 02. 02.
  5. What's the 12h time format?/What's the 24h time format?

    • I'd really suggest always using the 24h time format which is simply [hour with leading zeros]:[minutes with leading zeros] e. g. 01:35
    • We don't have AM and PM but we have hajnali 5, reggel 6, reggel 9, délelőtt 10, 12, délután 1, délután 5, este 6, este 11 where hajnal=dawn, reggel=morning (6-10), délelőtt=10-12, délután=afternoon (1-5), este=evening (6-)
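
A minimal sketch of how the long date format described above ([year]. [month name] [day].) could be written as a preamble-level redefinition. It assumes a biblatex version in which \mkbibdatelong receives the names of the year, month and day fields, and it uses magyar as the language name, which is an assumption (with babel's hungarian option a different name may be needed). Era handling and date ranges are left out, so this is not the definition shipped in any .lbx file.

```latex
% Sketch only: long date as "2018. március 2." with non-breakable spaces.
% Assumption: the language is known to biblatex as "magyar".
\DefineBibliographyExtras{magyar}{%
  \protected\def\mkbibdatelong#1#2#3{%
    % year, followed by a dot; add a space only if a month follows
    \iffieldundef{#1}
      {}
      {\stripzeros{\thefield{#1}}\adddot
       \iffieldundef{#2}{}{\nobreakspace}}%
    % month name; add a space only if a day follows
    \iffieldundef{#2}
      {}
      {\mkbibmonth{\thefield{#2}}%
       \iffieldundef{#3}{}{\nobreakspace}}%
    % day without leading zeros, followed by a dot
    \iffieldundef{#3}
      {}
      {\stripzeros{\thefield{#3}}\adddot}}%
}
```

With the day missing this gives "[year]. [month]", and with both day and month missing just "[year].", matching the answers above.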

@marczellm marczellm commented Mar 2, 2018

What are the rules for capitalizing/not capitalizing bibliography strings in the .lbx file? Is it relevant?

@moewew moewew commented Mar 2, 2018

Capitalise the strings as if they were to appear mid-sentence.

@marczellm marczellm commented Mar 2, 2018

What to do in case of strings that apparently have no translation? "continuator" for example. The concept just didn't warrant a word.

@moewew moewew commented Mar 2, 2018

If you don't have a good translation leave it untranslated. Do not leave the string empty as in continuator = {{}{}}. You can go with a commented-out line:

%continuator = {{}{}},%FIXME missing

If it is about prepositions that are normally rendered with declension of the following noun phrase, we will have to examine that on a case-by-case basis.

@ferdinandyb ferdinandyb commented Mar 2, 2018

Months can also be written with roman numerals

2009. január 1. -- 2009. február 2. -> 2009. I. 1. -- 2009. II. 2.

I think it can be substituted anywhere, so we might need a roman date option.

@moewew moewew commented Mar 2, 2018

Sorry, I messed up my last comment. If you don't know how to translate a string, comment it out

%continuator = {{}{}},% FIXME missing
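
A short sketch of how a commented-out key sits next to translated ones in a language file. The editor translation is only illustrative and not taken from any shipped file; \DeclareBibliographyStrings is valid only inside an .lbx file.

```latex
% Fragment of a hypothetical .lbx file (not a complete file).
\DeclareBibliographyStrings{%
  % translated string: {long form}{short form}, capitalised as if mid-sentence
  editor       = {{szerkeszt\H{o}}{szerk\adddot}},% illustrative translation
  % untranslated string: left commented out, never defined as empty
  %continuator = {{}{}},% FIXME missing
}
```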

@marczellm marczellm commented Mar 29, 2018

What are \lbx@lfromlang and \lbx@sfromlang?

@moewew moewew commented Mar 29, 2018

These macros store the string "from the <language>" so that bit can be inserted at the correct place in strings such as "translated from the <language> by". \lbx@lfromlang is the long version, \lbx@sfromlang is the short version.

See for example

  byeditortranin   = {{edited, translated \lbx@lfromlang\finalandcomma\ and annotated, with an introduction, by}%
                      {ed.,\addabbrvspace trans\adddot\ \lbx@sfromlang\finalandcomma\ and annot, with an introd., by}},

in english.lbx.

@marczellm marczellm commented Mar 29, 2018

The word order will have to be switched in the case of withforeword and similar keys. In Hungarian it's "<name> előszavával" etc., the name comes first.

@moewew moewew commented Mar 29, 2018

Unfortunately, the localisation interface does not support different word orders at the moment.

It would in theory be possible to change the word order by redefining the macros that use these strings. That would mean quite a few redefinitions, though. The issue with redefining macros in the .lbx is that these redefinitions overwrite definitions the user makes in the preamble and are reset every time the language changes.

@marczellm marczellm closed this Mar 31, 2018
@marczellm marczellm reopened this Mar 31, 2018

@moewew moewew commented Mar 31, 2018

Seems as though your last comment accidentally got deleted. Your solution for withforeword is exactly right.
The English version is

withforeword     = {{with a foreword by}{with a forew\adddot\ by}},

and the German is

withforeword     = {{mit einem Vorwort von}{mit einem Vorw\adddot\ von}},

so a translation that comes down to "[with a] foreword (written) by" seems good.

Translations for certain phrases with prepositions can be very challenging in languages that use case instead. The translations for those can be guided by example use. Check out https://github.com/plk/biblatex/blob/dev/doc/latex/biblatex/examples/03-localization-keys.tex and what it has to say about these strings. If you can't find an example that uses the strings, you can ask here again. Maybe we can come up with a typical example.

@marczellm marczellm commented Mar 31, 2018

Do the plurals such as "pages" and "volumes" only ever get printed in conjunction with numbers? Because if they do, there's no need for plurals in Hungarian. It's "1. oldal" and "1-100. oldal"; the singular form is used. Or should I add the plural forms to the bib strings anyway and then we'll redefine the field formats to use the singular form?

@moewew moewew commented Mar 31, 2018

Good question. I'm always amazed by the grammatical features of different languages.

pages is exclusively used for page references: pp. 20-28 etc. totalpages is used for 200 pages when it signifies the length of a book.

I prefer the solution that involves fewer redefinitions of things that are not bibstrings (field formats, bibmacros etc.), so in this case you would probably be fine going for the singular form in both variants. If, however, we need to put the bibstring after the page number we have an entirely different and more serious problem anyway.
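
As a sketch of "the singular form in both variants": the long form oldal and the abbreviation old. appear elsewhere in this thread, but the exact entries in the final file may differ.

```latex
% Fragment of a hypothetical .lbx file: singular and plural keys share one form.
\DeclareBibliographyStrings{%
  page  = {{oldal}{old\adddot}},
  pages = {{oldal}{old\adddot}},% same as the singular on purpose
}
```

This keeps the change inside the bibstrings and leaves field formats and bibmacros untouched.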

@marczellm marczellm commented Mar 31, 2018

Why? \DeclareFieldFormat[article,periodical,inproceedings]{pages}{#1\adddotspace\bibstring{pages}} worked just fine when I was writing my thesis.

@moewew moewew commented Mar 31, 2018

Two things.

Firstly, the standard definitions use a more intricate definition

\DeclareFieldFormat{pages}{\mkpageprefix[bookpagination]{#1}}

\mkpageprefix can do a little bit more than a simple #1\adddotspace\bibstring{pages} (it suppresses the pagination bibstring if it detects that the argument is not a page range, it can deal with \psq correctly, ...).

Secondly, a solution would either involve redefining field formats or internal macros for Hungarian. As I stated above I would like to stay clear of that as much as possible, since it can lead to undesired effects. It would probably be acceptable to locally redefine \mkpageprefix...

@marczellm marczellm commented Mar 31, 2018

The alternative is: leave it as p. and pp. and use the English word order. Some, maybe most Hungarian journals are OK with that.

@moewew moewew commented Mar 31, 2018

I'll see what I can do to \mkpageprefix. I can't promise anything, though. I'd only go for 'p.' and 'pp.' as a last resort and if it is indeed widely acceptable.

@moewew moewew commented Mar 31, 2018

OK, for page numbers I can offer an acceptable solution that only involves redefining lower-level commands that should not be touched by the average user. I consider that more acceptable than redefining higher-level things that a user could conceivably modify themselves.

@moewew moewew commented Mar 31, 2018

I have submitted a pull request with the necessary changes to the low-level page macro (marczellm#2). I will decline write access to your repository since I prefer to work with pull requests so you can check the changes I make and I don't accidentally overwrite something important on your end.

@marczellm marczellm commented Mar 31, 2018

Meanwhile, could you explain in a nutshell (or point me to documentation if I missed it) the purpose of the various test files in the testfiles folder? With the bibliography extras that we're adding, we're going to need tests for all these cases, but I don't know if I'll have to write it or just haven't found it yet.

(Or doc/latex/biblatex/examples)

@moewew moewew commented Aug 13, 2018

You may also want to remove the % -*- coding: utf-8 -*- in https://github.com/marczellm/biblatex/blob/9b162b765f672c1494fdc3766a45af89e3cb81d8/tex/latex/biblatex/lbx/magyar.lbx#L1 before the pull request.

@ferdinandyb ferdinandyb commented Aug 13, 2018

I'm on it. It's been called to my attention that in the Bulgarian lbx the Greek abbreviation is wrong: fromgreek = {{от гръцки}{от\addabbrvspace греч\adddot}}, where it should be гръч (as it is in langgreek), but I don't feel comfortable changing it in this pull request. What should I do?

@moewew moewew commented Aug 13, 2018

Just open a different pull request if you have this on good authority or open an issue if it needs to be discussed first. Don't include it in the Hungarian changes, in case we need to roll it back later.

@ferdinandyb ferdinandyb commented Aug 13, 2018

Made a pull for the Bulgarian fix. I also added empty strings for Hungarian, and confirmed a couple of translations with native speakers. I am waiting for a few more languages, so I'd wait a few days with making the pull request. @marczellm If it's fine with you, I'll make the pull request once I have all the translations in that I could find a native speaker for.

@ferdinandyb ferdinandyb commented Aug 14, 2018

I will get some translations tomorrow, but I decided to stop with those. If I do come across translations later I can always do a pull request later. So if @marczellm is okay with it, I'll do the pull request sometime late Wednesday.

@marczellm marczellm commented Aug 14, 2018

@ferdinandyb ferdinandyb commented Aug 15, 2018

@moewew This is a technical question (I haven't done something like this before, so I'm trying to learn and not mess up).

I squashed the commits and did a rebase with dev using this as a template.

Now when I make the pull request it says that the request can't be automatically merged, I'm guessing because of this:
[image]

Did I do anything wrong? Shouldn't the whole rebase thing be for not having conflicts like this? I already force pushed the rebase to the repo, but I made a backup before so I can restore it if needed.

@moewew moewew commented Aug 15, 2018

Mhhh, that change looks OK. Normally we get merge conflicts with the docs or the CHANGELOG if it changed in between. I would have thought a rebase would have alleviated those problems, but I'm no Git expert and I rarely do a rebase, so what do I know. Just submit the PR and I'll hopefully sort it out.

@ferdinandyb ferdinandyb commented Aug 15, 2018

Ok. I am also not exactly happy with the fact that this process seems to have hidden your contribution somehow. But anyway, I'll go forward with the request.

@moewew moewew commented Aug 15, 2018

#780 is merged. The Hungarian localisation should be included in the next regular release. There is no ETA on that as of yet, it might take a while until it comes out unless there is a critical bugfix. PLK is a bit busy right now.

@moewew moewew moved this from In Progress to Done in biblatex 3.12 Aug 23, 2018

@ferdinandyb ferdinandyb commented Oct 11, 2018

Shouldn't this issue be closed? (Sorry for spamming if not.)

@moewew moewew commented Oct 12, 2018

I normally only close issues formally when the relevant version is released, so if it does not offend your sense of aesthetics I would leave this open for now.

We should release a new version soon (people are already complaining #821), but there are still some things to sort out.

@ferdinandyb ferdinandyb commented Oct 12, 2018

Ok, it's actually logical not to, now that I think of it.

@ferdinandyb ferdinandyb commented Oct 18, 2018

I tried using this dev branch on my computer, but just the lbx is incompatible with texlive2017, and if I install this locally to ~/texmf, then biber complains that ERROR - Error: Found biblatex control file version 3.5, expected version 3.4. (biber is 2.10). What would be the easiest way to start using the Hungarian support?

@moewew moewew commented Oct 18, 2018

The .lbx files you find in the repository here are meant to be used with biblatex 3.12 (as of now unreleased, dev versions available on https://sourceforge.net/projects/biblatex/files/development/; the dev versions are only meant for testing, not for production use). They will need tweaking if you want to use them with older versions of biblatex (even with the current version 3.11).

You will need a dev/beta version of Biber 2.12 to use biblatex 3.12 dev https://sourceforge.net/projects/biblatex-biber/files/biblatex-biber/development/ (current Biber version is 2.11, so 2.10 is outdated even by today's standards).

The easiest way is probably to wait a week or two until we have released biblatex 3.12 and Biber 2.12. You can then get the update via your TeX distribution and things should work out of the box.

Edit: FWIW, https://github.com/plk/biblatex/blob/1df4a512b4c804afccd3bd015554c1a5ee6f62d8/tex/latex/biblatex/lbx/magyar.lbx can be tweaked to work with biblatex 3.11/Biber 2.11 simply by commenting out langhungarian and fromhungarian

langhungarian = {{magyar}{magyar}},

and

fromhungarian = {{magyarb\'ol}{magyarb\'ol}},

or conversely by adding \NewBibliographyString{langhungarian} and \NewBibliographyString{fromhungarian} to each document that uses the file. For older versions like 3.10 or 3.8 you may have to comment out or define even more unknown strings.
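
The second workaround can be sketched as a preamble fragment. The package options are copied from the example document later in this thread and are otherwise arbitrary.

```latex
% Preamble fragment for biblatex 3.11 used with the newer magyar.lbx.
\usepackage[style=authoryear, backend=biber]{biblatex}
% Declare the two string keys that 3.11 does not know yet,
% so that the definitions in magyar.lbx do not raise errors.
\NewBibliographyString{langhungarian}
\NewBibliographyString{fromhungarian}
```

Unlike commenting out the strings, this leaves the .lbx file itself unmodified, but it has to be repeated in every document.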

@ferdinandyb ferdinandyb commented Oct 18, 2018

Thanks, I ended up installing the dev versions for now. But good to know it will be solved soon.

@moewew moewew added this to the v3.12 milestone Oct 21, 2018

@moewew moewew commented Nov 6, 2018

biblatex 3.12 has been released and is available in MiKTeX and TeX Live 2018 as of this morning (UTC).

The Hungarian language module is included in that release and should work as expected.

I'm closing this issue for now, please open a new issue if you want to report any issues about magyar.lbx.

@moewew moewew closed this Nov 6, 2018
moewew added a commit that referenced this issue Nov 21, 2018

@moewew moewew commented Apr 19, 2019

I'm looking into ordinal page ranges again for #882.

Is the current output

\documentclass[hungarian]{article}
\usepackage{babel}
\usepackage{csquotes}

\usepackage[style=authoryear, backend=biber]{biblatex}

\addbibresource{biblatex-examples.bib}

\begin{document}
\cite{sigfridsson}

\cite[lorem]{sigfridsson}

\cite[34-36]{sigfridsson}

\cite[34a-36a]{sigfridsson}

\cite[34-]{sigfridsson}

\cite[-]{sigfridsson}

\cite[-36]{sigfridsson}

\cite[34-36,40-56]{sigfridsson}
\end{document}

34–36. old.

as expected? Or would

34.–36. old.

be more natural?

@ferdinandyb ferdinandyb commented Apr 19, 2019

34-36. old. in the current output is good.

But I built your example and there are several bad outputs.
In cases where the number is absent after the hyphen, there is a dot, e.g. 34–. old. I'm quite sure that is not right. What I'm not sure of is whether it should be 34- old. or 34.- old. I will try to figure it out.

@ferdinandyb ferdinandyb commented Apr 19, 2019

Here's what we came up with:

  • \cite[34-]{sigfridsson} should be rendered 34.– old.
  • \cite[-]{sigfridsson} makes no sense, so I think it should be treated as a simple \cite{sigfridsson}
  • \cite[34a-36a]{sigfridsson} should use the same dash as the others

@moewew moewew commented Apr 19, 2019

Thanks for the quick answer.

  1. Not sure if I can get \cite[34-]{sigfridsson} to say 34.-- old when usual ranges say 34--36. old. (I mean, it might be possible, but the amount of work might not be proportional to the good it would do).
  2. Yup, \cite[-]{sigfridsson} is pretty much nonsensical. I think we can live with the status quo (or with almost anything else we get), so I don't think it makes sense to invest time to discard it or do something else about it.
  3. \cite[34a-36a]{sigfridsson} does not pass the \ifpages test and consequently does not get the same formatting as real page ranges. It just gets treated as normal text, hence we don't change the dash.

So only the 34.-- old thingy is relevant, but I'm not sure we can do anything there.
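
A small sketch for checking how a given postnote is classified, using biblatex's \ifpages test directly. The labels "page range" and "plain text" are only illustrative output.

```latex
\documentclass{article}
\usepackage{biblatex}
\begin{document}
% \ifpages{<string>}{<true>}{<false>}: is the string a page number or page range?
\ifpages{34-36}{page range}{plain text}

% With letter suffixes the test fails, as described above,
% so such a postnote is printed unchanged.
\ifpages{34a-36a}{page range}{plain text}
\end{document}
```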

@ferdinandyb ferdinandyb commented Apr 19, 2019

> Thanks for the quick answer.
>
>   1. Not sure if I can get \cite[34-]{sigfridsson} to say 34.-- old when usual ranges say 34--36. old. (I mean, it might be possible, but the amount of work might not be proportional to the good it would do).

Well, I certainly have no idea how :) Manually writing \cite[34.--~old.]{sigfridsson} solves the issue currently, just not too elegantly.

>   2. Yup, \cite[-]{sigfridsson} is pretty much nonsensical. I think we can live with the status quo (or with almost anything else we get), so I don't think it makes sense to invest time to discard it or do something else about it.

OK, one can simply take care not to do this.

>   3. \cite[34a-36a]{sigfridsson} does not pass the \ifpages test and consequently does not get the same formatting as real page ranges. It just gets treated as normal text, hence we don't change the dash.

For the record, putting \cite[34a--36a.~old.]{sigfridsson} gets the desired result.

> So only the 34.-- old thingy is relevant, but I'm not sure we can do anything there.

Yes.
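
The two manual workarounds mentioned above, put into the test document from earlier in this thread as a minimal sketch. It assumes biblatex 3.12 or later with Biber, and that the hand-written postnotes are passed through as plain text.

```latex
\documentclass[hungarian]{article}
\usepackage{babel}
\usepackage{csquotes}

\usepackage[style=authoryear, backend=biber]{biblatex}

\addbibresource{biblatex-examples.bib}

\begin{document}
% Open-ended range: dot, dash and abbreviation written out by hand
\cite[34.--~old.]{sigfridsson}

% Range with letter suffixes: not detected as pages, so formatted by hand
\cite[34a--36a.~old.]{sigfridsson}
\end{document}
```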
