Asciidoctor/Include issue, when using Hugo shortcode #352

Closed
dbaio opened this issue Jan 22, 2022 · 13 comments · Fixed by #355

@dbaio
Contributor

dbaio commented Jan 22, 2022

Source file.
PO file.
po4a version 0.66 and prior releases have the same issue.

po4a-gettextize \
        --format asciidoc \
        --option compat=asciidoctor \
        --option yfm_keys=title,part,description \
        --master "po4a-asciidoctor-includes.adoc" \
        --master-charset "UTF-8" \
        --copyright-holder "The FreeBSD Project" \
        --package-name "FreeBSD Documentation" \
        --po "po4a-asciidoctor-includes.po"

-->

#. type: Plain text
#: po4a-asciidoctor-includes.adoc:35
msgid ""
"include::shared/attributes/attributes-{{% lang %}}.adoc[] "
"include::shared/{{% lang %}}/teams.adoc[] include::shared/{{% lang "
"%}}/mailing-lists.adoc[] include::shared/{{% lang %}}/urls.adoc[]"
msgstr ""

Includes without {{% lang %}} work fine; they are skipped.

I'm reporting this so others can help, but the plan is to dig/debug po4a later.

https://docs.asciidoctor.org/asciidoc/latest/directives/include/

@jnavila
Collaborator

jnavila commented Jan 22, 2022

Ah, yes... Variables in include statements are not supported. Macros are matched with this regex:

} elsif ( not defined $self->{verbatim}
and ( $line =~ m/^([\w\d][\w\d-]*)(::)(\S*)\[(.*)\]$/ ) )

@jnavila
Collaborator

jnavila commented Jan 22, 2022

In fact, it should work if you remove the space characters around lang. Managing these ones may be quite tricky...
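
The effect of the spaces can be checked outside po4a. The sketch below transcribes the macro pattern quoted above into Python's `re` syntax (po4a itself is Perl) and tries it on three include lines. The helper name `is_macro` and the sample lines are illustrative assumptions, and whether Hugo accepts the space-free shortcode form is not verified here.

```python
import re

# The macro pattern quoted above, transcribed from Perl to Python's `re` syntax.
MACRO = re.compile(r"^([\w\d][\w\d-]*)(::)(\S*)\[(.*)\]$")


def is_macro(line: str) -> bool:
    """Return True when the whole line is recognized as a block macro."""
    return MACRO.match(line) is not None


# Sample lines (assumed for illustration).
with_spaces = "include::shared/{{% lang %}}/teams.adoc[]"
without_spaces = "include::shared/{{%lang%}}/teams.adoc[]"
plain = "include::shared/en/teams.adoc[]"

print(is_macro(with_spaces))     # False: \S* stops at the first space
print(is_macro(without_spaces))  # True
print(is_macro(plain))           # True
```

Because `\S*` cannot cross a space, the target with `{{% lang %}}` never reaches the `[`, so the line is not treated as a macro and ends up as translatable plain text.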

@dbaio
Contributor Author

dbaio commented Jan 22, 2022

> In fact, it should work if you remove the space characters around lang. Managing these ones may be quite tricky...

I'll try this, thanks

@dbaio dbaio changed the title Asciidoctor/Include issue Asciidoctor/Include issue, when using Hugo shortcode Jan 22, 2022
@dbaio
Contributor Author

dbaio commented Jan 22, 2022

This {{% lang %}} is a Hugo shortcode in the Asciidoctor include.

https://github.com/mquinson/po4a/blob/master/lib/Locale/Po4a/AsciiDoc.pm#L793

It seems that changing the macro regex to accept any character before [] also fixes this and does not break anything.

and ( $line =~ m/^([\w\d][\w\d-]*)(::)(\S*)\[(.*)\]$/ ) )

to

and ( $line =~ m/^([\w\d][\w\d-]*)(::)(.*)\[(.*)\]$/ ) )

I tested it with all includes from the Asciidoctor Documentation examples and others I could find.

What do you think about this change?

@jnavila
Collaborator

jnavila commented Jan 23, 2022

I have two objections:

  • I do not like this catch-all regex. I'm afraid it could start to wrongly match description lists. To rule out this mismatch, you should match a non-space character just after the colons.
  • More generally, what we are trying to do here is to match some part of a format of an external templating engine (Hugo). There are plenty of templating engines out there, and we surely don't want to tweak our hand-made parser to accept all their formats. In the end, our parser should be used on the same content as asciidoc, that is the output of the templating engine.
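
The description-list concern in the first point can be illustrated with a small comparison. This is a sketch in Python's `re` syntax rather than po4a's Perl, and the description-list entry is a made-up example chosen to end with a bracketed note.

```python
import re

# Pattern currently in the parser, and the proposed catch-all variant.
ORIGINAL = re.compile(r"^([\w\d][\w\d-]*)(::)(\S*)\[(.*)\]$")
CATCH_ALL = re.compile(r"^([\w\d][\w\d-]*)(::)(.*)\[(.*)\]$")

# Hypothetical description-list entry that happens to end with a bracketed note.
desc_item = "CPU:: The processor [see the glossary]"
hugo_include = "include::shared/{{% lang %}}/teams.adoc[]"

for name, pattern in (("original", ORIGINAL), ("catch-all", CATCH_ALL)):
    print(name, bool(pattern.match(desc_item)), bool(pattern.match(hugo_include)))
# original False False
# catch-all True True
```

The catch-all pattern accepts the Hugo include, but it also treats the description-list entry as a macro, which would hide its text from translators.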

Our parser must be given plain asciidoc for translatable content. Part of internationalizing the content is using the features of asciidoc to separate template-specific code from plain asciidoc. You could use asciidoc document attributes to do so:

:hugolang: {{% lang %}}

include::shared/attributes/attributes-{hugolang}.adoc[]

In fact, the lang attribute is already in use in asciidoctor, in order to select the language for naming figures, chapters, etc.
Also note that document attributes can be passed to asciidoctor at invocation time, on the command line with the -a option or in code, thus eliminating the need to define them in the document and run the templating engine on them, which may make the whole processing lighter.

@jnavila
Collaborator

jnavila commented Jan 23, 2022

@mquinson What do you think?

@mquinson
Owner

I have never written a line of asciidoc in my life and would prefer to be able to blindly trust someone here...

If someone like @dbaio needs this asciidoc+hugo thing and is willing to contribute to it, why not? Hugo may be just one templating solution among many, but it's not the least used one either. But again, if there is no user to contribute to that code, it may not happen (no matter how desirable): that's free software and we mostly fix the code we use and are familiar with. Please @dbaio jump into the code :)

There is a slight challenge to ensure that the new features do not clutter the code too much, but I'm not familiar with this code at all, so it's hard for me to comment.

If things go seriously wrong, you should consider writing a specific po4a formatter for asciidoc+hugo, alongside the original formatter (starting by copying the file). But that should only be done as a last resort, as the maintenance of the asciidoc parts will be doubled if you duplicate the code. This is really not a good design for the long term, but it may be something to consider if code reuse gets hard. We already have 2 YAML parsers (one for the standalone formatter and one for the front matter), and the main reason I think that's OK is that this code is so simple and small.

@dbaio
Contributor Author

dbaio commented Jan 23, 2022

The lang attribute (with -a option) is an issue with Hugo and Asciidoctor; we can't define it for each language when building the project.
I tried to talk with Hugo developers about that in the past; anyway, I'll send more messages in their forum about it again.

The example with asciidoc document attributes could be a way out as well, thanks for the tip. In our project, though, we would need to change more than a thousand files to use it.

@dbaio
Contributor Author

dbaio commented Jan 23, 2022

About the catch-all regex, we can work around it with (\S*|\S*\{.*\}\S*).
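
A quick check of this alternation, as a sketch: it assumes the expression simply replaces the third group of the macro pattern quoted earlier (the exact pattern in the PR is not shown in this thread), and it uses Python's `re` rather than Perl. The description-list line is a made-up example.

```python
import re

# Assumed combination: the proposed alternation plugged in as group 3.
NARROWED = re.compile(r"^([\w\d][\w\d-]*)(::)(\S*|\S*\{.*\}\S*)\[(.*)\]$")

hugo_include = "include::shared/{{% lang %}}/teams.adoc[]"
desc_item = "CPU:: The processor [see the glossary]"  # hypothetical entry

m = NARROWED.match(hugo_include)
print(m.group(3))                 # shared/{{% lang %}}/teams.adoc
print(NARROWED.match(desc_item))  # None
```

Spaces are only tolerated between a `{` and a `}`, so a description-list entry with a space right after the colons is still rejected.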

I've opened PR #355 for review.

@jnavila
Collaborator

jnavila commented Jan 23, 2022

> The lang attribute (with -a option) is an issue with Hugo and Asciidoctor; we can't define it for each language when building the project. I tried to talk with Hugo developers about that in the past; anyway, I'll send more messages in their forum about it again.

I don't understand this remark. From a quick review of your project, you call ./tools/asciidoctor.sh books ${_lang} pdf for each language, as detected in the Makefile; the correct language is already passed to asciidoctor with -a lang="$doc_lang", making the attribute already available from asciidoc.

I know this is not really my business, but it seems that Hugo is unnecessary for your use case, as you already have all the needed transclusion facilities in asciidoctor.

@jnavila
Collaborator

jnavila commented Jan 23, 2022

As for thousands of files, well it's just a simple sed command.
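
As a rough illustration of such a bulk rewrite, here is a Python sketch equivalent in spirit to a sed one-liner. The function name `rewrite_includes` and the sample text are assumptions; the attribute name `hugolang` comes from the earlier suggestion, and each document would still need the `:hugolang:` definition (or the attribute passed with -a) for the rewritten includes to resolve.

```python
import re

# Hugo shortcode to replace, tolerating optional spaces around `lang`.
SHORTCODE = re.compile(r"\{\{%\s*lang\s*%\}\}")
# po4a's existing macro pattern, transcribed to Python, used to check the result.
ORIGINAL = re.compile(r"^([\w\d][\w\d-]*)(::)(\S*)\[(.*)\]$")


def rewrite_includes(text: str, attribute: str = "hugolang") -> str:
    """Replace the shortcode with an attribute reference, on include lines only."""
    out = []
    for line in text.splitlines(keepends=True):
        if line.startswith("include::"):
            line = SHORTCODE.sub("{" + attribute + "}", line)
        out.append(line)
    return "".join(out)


sample = (
    "include::shared/{{% lang %}}/urls.adoc[]\n"
    "Body text mentioning {{% lang %}} is left alone.\n"
)
result = rewrite_includes(sample)
print(result)
# Every rewritten include line is accepted by the unmodified pattern.
print(all(ORIGINAL.match(l) for l in result.splitlines() if l.startswith("include::")))
```

Restricting the substitution to lines starting with `include::` keeps any other use of the shortcode untouched.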

@dbaio
Contributor Author

dbaio commented Jan 23, 2022

The FreeBSD docs contain two Hugo projects, website and documentation. Both use Asciidoctor, but it's all driven by Hugo. That script generates PDFs for the documentation articles/books; it's the only place we use asciidoctor standalone.

As long as that issue between Hugo and Asciidoctor exists, we will need an if statement somewhere to handle the difference between them.

IMHO, this change won't harm po4a and the asciidoc format, but I will respect it if you don't want to mix Hugo format here.

@mquinson
Owner

I tend to agree that we can support Hugo without cluttering the code too much (or at least I hope so), but I haven't read the code yet. I think that at the end of the day, that's @jnavila's decision. He's the one who did most of the work on the Asciidoc formatter, so he decides.

If we need to split the implementations, I'm confident that the BSD community will manage to maintain a fork of that formatter.

Again, I'm not saying that we must fork the formatter, because I haven't read the patch. And I'd prefer not to have to, so that the bus factor of that project continues to grow :)

Thanks for your time, guys.
