Allow Inclusion of non-UTF-8 Files #3248
Comments
This request does seem reasonable to me. FYI, you can accomplish this today using a custom include processor. In Asciidoctor 2, we already isolated the read mode, so technically it is very feasible. For now, this would not change the fact that the AsciiDoc document itself has to be encoded in UTF-8. We're still discussing making that configurable, but that would be a separate issue.
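For readers who want to try the include processor route, here is a minimal sketch, not the maintainers' implementation. It assumes the `asciidoctor` gem is installed; the class name `LegacyEncodingIncludeProcessor`, the helper `read_as_utf8`, the file extensions it matches, and the `encoding` attribute it reads are all illustrative choices. It also resolves paths naively and skips Asciidoctor's safe-mode checks.

```ruby
# Read a file stored in a legacy encoding and return its content as UTF-8.
# The "r:EXTERNAL:INTERNAL" mode asks Ruby to transcode while reading.
def read_as_utf8(path, source_encoding = 'ISO-8859-1')
  File.read(path, mode: "r:#{source_encoding}:UTF-8")
end

begin
  require 'asciidoctor'
  require 'asciidoctor/extensions'
rescue LoadError
  # asciidoctor gem not available: only the helper above is defined.
  nil
end

if defined?(Asciidoctor::Extensions::IncludeProcessor)
  # Hypothetical processor: claims includes whose target ends in .alan or .i
  # and hands their content to the reader already transcoded to UTF-8.
  class LegacyEncodingIncludeProcessor < Asciidoctor::Extensions::IncludeProcessor
    def handles?(target)
      target.end_with?('.alan', '.i')
    end

    def process(doc, reader, target, attributes)
      # Assumption: resolve the target relative to the including file's
      # directory, with no safe-mode or path-jail checks.
      path = File.expand_path(target, reader.dir)
      # Assumption: an optional encoding=... attribute on the directive,
      # falling back to ISO-8859-1 when it is absent.
      content = read_as_utf8(path, attributes.fetch('encoding', 'ISO-8859-1'))
      reader.push_include(content, path, target, 1, attributes)
    end
  end

  Asciidoctor::Extensions.register do
    include_processor LegacyEncodingIncludeProcessor
  end
end
```

Saved as a file and loaded with `asciidoctor -r ./that_file.rb`, a processor along these lines would remove the need for intermediate UTF-8 copies, at the cost of one more custom extension to maintain.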
Thanks @mojavelinux, I'm looking forward to it.
I'm sure there are many ways to work around this issue, and using a custom include processor would definitely be more elegant than my current solution (and wouldn't require creating copies of the sources). It's just that I think keeping things simple, by finding a solution within the native functionality of Asciidoctor, is always preferable: in many projects the documentation is a subproject on the side, managed by specific users, and not all contributors to the main project have experience with Asciidoctor (some have none at all). In quite a few projects I'm the one who looks after the documentation, and I always try to leave behind something that is easy to use and understand, in case someone else has to take over its maintenance in the future. Right now, the Bash script solution is fine (even Windows contributors are expected to have Bash as part of Git for Windows, which includes iconv), but as the saying goes, "less is more": the proposed feature would take some burden off the project's complexity (there are already enough complications with custom extensions to handle Highlight and a few other third-party tools that generate documentation from sources).
I understand that. By suggesting the custom include processor, I was not arguing against the idea. I was simply offering you a path forward in the short term. So there's no need to provide further justification. I get it.
Don't get me wrong, I understood perfectly that you were both welcoming my suggestion and offering a better workaround. My intention was just to share personal experience and thoughts about Asciidoctor as a form of feedback, because I know that so many people use Asciidoctor in different ways, each with their own needs and goals. So I thought that some context about the scenario I'm working in might provide additional insight, i.e. illustrate that solutions which are easy and natural for everyday Asciidoctor users can be seen as an obstacle by collaborators who aren't involved in the documentation side of a project. Often, by reading users' comments in issues, I learn how others are using Asciidoctor in ways I had never considered, which broadens my view of the contexts the tool is used in.
👍
In Gitter, you mentioned that you are thinking about scoping a feature for 2.0.11 or 2.0.12 that allows the importer to set the encoding of the file to be included. Any news about this? :)
First Glossary draft with an initial entry (*stropping*) and some commented-out pending entries TBD later on (Closes #54). Update contents of "§4.2. Words, Identifiers and Names": * Add "Stropping" sub-section. * Add `stropping` anchor. * Add `stropping` Index entry. * Revise and improve contents of this section: * More examples. * Extra admonitions. * Polish text. Clean-up, polish and update README files in Alan Manual directory. Referenced Issues: #36, #50, #54, asciidoctor/asciidoctor#3248.
…ied using encoding attribute
I've submitted a PR. See #3419
Thanks! I work on many large documentation projects for old software tools from the '80s and '90s, and I have to handle lots of source code in ISO and other legacy encodings!
Stop converting ALAN sources and transcripts to UTF-8 and directly include the original ISO-8859-1 files in AsciiDoc sources (fixes #126). This huge commit entirely removes from the repo all assets that dealt with creating UTF-8 intermediate versions of the ISO-8859-1 ALAN sources and transcripts, using instead the new (undocumented) `encoding` option of Asciidoctor's `include::` directives, which was kindly added by @mojavelinux at our request for the ALAN-IF projects: - asciidoctor/asciidoctor#3248 The build toolchain is now much faster than before. For the full details of the changes, see the task list of #126.
Rationale:
I'm currently working on a documentation project that involves source code files encoded in ISO-8859-1, which can't be directly included in AsciiDoc documents.
In order to include the source files (or parts of them) in the documents via the `include::` directive, I first need to run a script that converts them to UTF-8 via iconv, and then run the Asciidoctor toolchain and include the UTF-8 version instead. In a real-case example, the script creates a copy of each original source file (e.g. `mysource.alan` / `.i`) converted to UTF-8 (`mysource.utf8_alan` / `.utf8_i`).

This introduces an extra layer of complexity and dependencies (especially on Windows, which doesn't have a native tool like iconv), adds extra files, and complicates managing any watch scripts.
If Asciidoctor were to allow an extra attribute to control the encoding of the included file, e.g. `include::path[encoding=iso-8859-1]`, it would be much simpler and more elegant.

I know that today most source files are expected to be in UTF-8, but some legacy tools still cling to ISO encodings and, besides, many other encodings are still in use today. Being an optional extra feature that doesn't break backward compatibility, this would be an added benefit to Asciidoctor. My guess is that there should be Ruby libraries to handle encoding conversion.
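On the guess about Ruby libraries: Ruby's core `String` class can already transcode between encodings, so no extra gem is needed for the conversion itself. The sketch below only illustrates that point. The helper name `transcode_to_utf8` is made up for the example, and this is not a description of how Asciidoctor handles it internally.

```ruby
# Transcode raw bytes from a known legacy encoding to UTF-8 using only
# Ruby's core String API (hypothetical helper name).
def transcode_to_utf8(bytes, source_encoding)
  bytes.encode(Encoding::UTF_8, source_encoding)
end

# "café" as stored in an ISO-8859-1 file: the é is the single byte 0xE9.
latin1_bytes = "caf\xE9".b

# Interpreted as UTF-8 without conversion, those bytes are invalid.
puts latin1_bytes.dup.force_encoding(Encoding::UTF_8).valid_encoding?  # prints "false"

# After transcoding, the text is valid UTF-8 (é becomes two bytes).
utf8_text = transcode_to_utf8(latin1_bytes, 'ISO-8859-1')
puts utf8_text            # prints "café"
puts utf8_text.bytesize   # prints "5"
```

This is the same job the iconv step performs in the current workaround, done in-process instead of through intermediate files.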