Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fetch includes asynchronously (in-browser / client-side) #630

Open
tigregalis opened this issue Dec 14, 2018 · 14 comments
Open

fetch includes asynchronously (in-browser / client-side) #630

tigregalis opened this issue Dec 14, 2018 · 14 comments

Comments

@tigregalis
Copy link

I'd like to fetch documents and includes asynchronously (in-browser), using fetch.

By default, it uses synchronous XMLHttpRequest, which locks the browser and also give us a warning:

Synchronous XMLHttpRequest on the main thread is deprecated because of its detrimental effects to the end user's experience. For more help http://xhr.spec.whatwg.org/    asciidoctor.js:23436:8 

That's why I'd like to suggest a loadAsync / convertAsync API for Asciidoctor.js.

Perhaps a related suggestion is a way for the user to specify their own http client, e.g. if someone wishes to use fetch, axios, or node-fetch.

As a really convoluted work-around / proof-of-concept, I've managed to use two include processor extensions and Promises to progressively "load" the document. Right now this will only go down to one level of includes. There are a number of other issues of course, but primarily is that it's loading/parsing the document more than once.

const asciidoctor = Asciidoctor();

const root = document.getElementById('root');

function promiseObject(object) {

  let promisedProperties = [];

  const objectKeys = Object.keys(object);

  objectKeys.forEach((key) => promisedProperties.push(object[key]));

  return Promise.all(promisedProperties)
    .then((resolvedValues) => {
      return resolvedValues.reduce((resolvedObject, property, index) => {
        resolvedObject[objectKeys[index]] = property;
        return resolvedObject;
      }, object);
    });

}

fetch('index.adoc').then(res => res.text()).then(content => {

  const includesMap = {};

  const registry1 = asciidoctor.Extensions.create();
  registry1.includeProcessor(function () {
    var self = this;
    self.handles(_ => true);
    self.process(function (doc, reader, target, attrs) {
      if (!includesMap[target]) includesMap[target] = fetch(target).then(res => res.text());
      return reader.pushInclude('pending...', target, target, 1, attrs);
    });
  });

  asciidoctor.load(content, {'safe': 'safe', extension_registry: registry1});

  promiseObject(includesMap).then(includes => {

    const registry2 = asciidoctor.Extensions.create();
    registry2.includeProcessor(function () {
      var self = this;
      self.handles(_ => true);
      self.process(function (doc, reader, target, attrs) {
        return reader.pushInclude(includes[target], target, target, 1, attrs);
      });
    });

    const doc = asciidoctor.load(content, {'safe': 'safe', extension_registry: registry2});
    const html = doc.convert();
    root.innerHTML = html;

  });

});
@ggrossetie
Copy link
Member

Hello @tigregalis

I had this discussion with @mojavelinux but it's not an easy task 😅

As you saw the includes need to be resolved sequentially. In fact Asciidoctor parses the document from top to bottom and will resolve includes along the way.
An AsciiDoc document can contain attributes and attributes can change how the document is processed. So if we don't process includes sequentially the output will be different.

For instance let's take this simple example:

main.adoc

include::attrs.adoc[]
include::installation.adoc[]

attrs.adoc

:program-version: 1.2.3

installation.adoc

Install program version {program-version}

In this example attrs.adoc must be processed before installation.adoc otherwise {program-version} won't be resolved.
It's a simple example but of course you can have more complex situation where the attribute is defined in an include of include of include... or where we use condition to include or not a file.

However I think we can safely resolve includes asynchronously when the included file is not an AsciiDoc document:

[source,ruby]
----
include::app.rb[]
----

Or (embedded) images when :data-uri: is defined:

image::sunset.jpg[Sunset,300,200]

We could hack something in Asciidoctor.js but I think it would be better if Asciidoctor core knows that the function can be async...

@mojavelinux
Copy link
Member

If we think outside of the include processor (so to speak), the solution to this is a lot easier than we realize. What we want to do is start processing the document once all the includes have been resolved. In order to do that, we flip the include processing inside-out. First, we grab all the include URLs, then toss them into a Promise.all to build a map of URL to content. When that promise resolves, we invoke the processor. The include processor can reach back into this resolved map to grab the content for a URL when it comes across an include. (In fact, you don't have to limit it to URLs since file IO is also async).

This is actually the right way to do async includes. If we designed the processor to understand an async include processor, all we would end up doing is turning it into a sync operation because we can't advance the line until we have the content.

The benefit of the approach I'm proposing is that it opens the door to managing the cache of the content you downloaded to avoid having to download it every time. And if you're processing a lot of documents, you could even avoid repeat visits to the same URL.

In order to make this work, we'd need an API for extracting the include targets. (Keep in mind this could be recursive). It's actually pretty simple since includes aren't context sensitive (they don't see blocks). A very naive approach would just be to look for any lines that match an include directive. A slightly better approach is to handle all preprocessor conditionals too.

There might be some sharp edges we'll find, but I think we can get most of the way there, then think about how to take it the rest of the way after that.

@mojavelinux
Copy link
Member

Btw, in order to support URL includes in Antora, I'd really like to do it this way instead of invoking a sync subprocess. (Though we could to the latter first, then improve on it).

Related issue in Antora: https://gitlab.com/antora/antora/issues/246

@tigregalis
Copy link
Author

After reading @Mogztter's comment I was about to suggest, but @mojavelinux already beat me to it: Can you defer the resolution of attributes until all includes have been retrieved?

In the case of recursion, recursion is already an issue as it is:

recursion-1.adoc:

== this is from recursion 1

hello

include::recursion-2.adoc[]

recursion-2.adoc:

== this is from recursion 2

world

include::recursion-1.adoc[]

renders:

this is from recursion 1

hello

this is from recursion 2

world

this is from recursion 1

hello

this is from recursion 2

world

...

this is from recursion 1

hello

include::recursion-2.adoc[]

To deal with this, it would be appropriate to generate a tree of includes, and at each include, check along the current path from root-to-branch.

Given the following documents:

  • "root"
  • "example"
  • "example2"
  • "example3"

And the following includes, in this order:

  • "root":
    • include "example"
    • include "example2"
    • include "example"
    • include "root"
  • "example":
    • include "example2"
    • include "example3"
  • "example2":
    • include "example"

The tree generated would be:

  • "root" ["root"]
    • "example" ["root", "example"]
      • "example2" ["root", "example", "example2"]
        • "example" ["root", "example", "example2", "example"] ("example" has already appeared along the path to this branch, so don't include, and warn)
      • "example3" ["root", "example", "example3"]
    • "example2" ["root", "example2"]
      • "example" ["root", "example2", "example"]
        • "example2" ["root", "example2", "example", "example2"] ("example2" has already appeared along the path to this branch, so don't include, and warn)
    • "example" ["root", "example"]
      • "example2" ["root", "example", "example2"]
        • "example" ["root", "example", "example2", "example"] ("example" has already appeared along the path to this branch, so don't include, and warn)
      • "example3" ["root", "example", "example3"]
    • "root" ["root", "root"] ("root" has already appeared along the path to this branch, so don't include, and warn)

@ggrossetie
Copy link
Member

The benefit of the approach I'm proposing is that it opens the door to managing the cache of the content you downloaded to avoid having to download it every time. And if you're processing a lot of documents, you could even avoid repeat visits to the same URL.

👍

In order to make this work, we'd need an API for extracting the include targets. (Keep in mind this could be recursive). It's actually pretty simple since includes aren't context sensitive (they don't see blocks).

It might be an edge case but what about:

attrs.adoc

:include-all:

main.adoc

include::attrs.adoc[]

ifdef::include-all[]
include::a.adoc[]
include::b.adoc[]
include::c.adoc[]
include::d.adoc[]
include::e.adoc[]
include::f.adoc[]
// ...
endif::[]

If we took the naive approach we will fetch a.adoc, b.adoc, c.adoc, d.adoc, e.adoc, f.adoc... for nothing 😞
The other approach is to wait for attrs.adoc to resolve but then it's just a sequential asynchronous include resolver. We could of course try to do things in parallel if the include target does not contains any attribute nor include directive.
It means that this API should be able to resolve attributes and evaluate conditions.

A very naive approach would just be to look for any lines that match an include directive. A slightly better approach is to handle all preprocessor conditionals too.

What do you mean by conditionals ? We should execute all the registered preprocessor but I don't get the conditionals ? Maybe you are referring to ifdef ?

In the case of recursion, recursion is already an issue as it is:

If there's a infinite loop, we should use the max-include-depth attribute.
That's what Asciidoctor core Ruby is using to "abort" and avoid an endless recursive loop.

The tree generated would be:

Indeed if we have the complete tree we could do the resolution and detect loop early 👍

@tigregalis
Copy link
Author

A very naive approach would just be to look for any lines that match an include directive. A slightly better approach is to handle all preprocessor conditionals too.

What do you mean by conditionals ? We should execute all the registered preprocessor but I don't get the conditionals ? Maybe you are referring to ifdef ?

Attributes can also be in the target itself, e.g, https://docs.antora.org/antora/1.0/asciidoc/include-content/#include-partial

include::{partialsdir}/log-definition.adoc[]

In order to make this work, we'd need an API for extracting the include targets. (Keep in mind this could be recursive). It's actually pretty simple since includes aren't context sensitive (they don't see blocks).

It might be an edge case but what about:

attrs.adoc

:include-all:

main.adoc

include::attrs.adoc[]

ifdef::include-all[]
include::a.adoc[]
include::b.adoc[]
include::c.adoc[]
include::d.adoc[]
include::e.adoc[]
include::f.adoc[]
// ...
endif::[]

If we took the naive approach we will fetch a.adoc, b.adoc, c.adoc, d.adoc, e.adoc, f.adoc... for nothing 😞
The other approach is to wait for attrs.adoc to resolve but then it's just a sequential asynchronous include resolver. We could of course try to do things in parallel if the include target does not contains any attribute nor include directive.
It means that this API should be able to resolve attributes and evaluate conditions.

You could do it greedily, but iteratively.

As far as I'm aware, there are four types of includes:

  • unconditional, not within an ifdef block and no text-based attribute(s) in the target,
  • conditional within an ifdef block and no text-based attribute(s) in the target,
  • conditional with text-based attribute(s) in the target, and
  • conditional within an ifdef block and with text-based attribute(s) in the target.

The flow would be:

  • 1 Retrieve document if document has not yet been retrieved and does not already appear in the path from root-to-branch.

  • 2 Parse document to find unconditional includes and conditional includes, leaving a stub for each.

  • 3 For each unconditional stub (or conditional stub where the precedent attribute(s) is locked),

    • 1 Retrieve document if document has not yet been retrieved and does not already appear in the path from root-to-branch.
    • 2 Parse document to find unconditional includes and conditional includes, leaving a stub for each.
    • 3 For each unconditional stub (or conditional stub where the precedent attribute(s) is locked),
      • 1 Retrieve document if document has not yet been retrieved and does not already appear in the path from root-to-branch.
      • 2 Parse document to find unconditional includes and conditional includes, leaving a stub for each.
      • 3 For each unconditional stub (or conditional stub where the precedent attribute(s) is locked),
        • ...
  • 4 When there are no more unconditional stubs AND there are conditional stubs, resolve the attributes in all documents up to the first conditional.

  • 5 If attribute is part of an ifdef and attribute is defined and true, upgrade all dependent stubs to unconditional. If within a target and attribute is defined, resolve the attribute in the target.

  • 3 For each unconditional stub (or conditional stub where the precedent attribute(s) is locked),

    • 1 Retrieve document if document has not yet been retrieved and does not already appear in the path from root-to-branch.
    • 2 Parse document to find unconditional includes and conditional includes, leaving a stub for each.
    • 3 For each unconditional stub (or conditional stub where the precedent attribute(s) is locked),
      • ...
  • 4 When there are no more unconditional stubs AND there are conditional stubs, resolve the attributes in all documents up to the first conditional.

  • 5 If attribute is part of an ifdef and attribute is defined and true, upgrade all dependent stubs to unconditional. If within a target and attribute is defined, resolve the attribute in the target.

  • 3 For each unconditional stub (or conditional stub where the precedent attribute(s) is locked),

    • 1 Retrieve document if document has not yet been retrieved and does not already appear in the path from root-to-branch.
    • 2 Parse document to find unconditional includes and conditional includes, leaving a stub for each.
    • 3 For each unconditional stub (or conditional stub where the precedent attribute(s) is locked),
      • ...
  • ...

  • 6 When there are no more unconditional stubs AND there are no more conditionals, begin inserting document content and preparing the document for processing.

If all includes were unconditional, this is the best case scenario, and the lowest time spent waiting for files.

If all includes were conditional, this is the worst case scenario, and a lot of time would be spent waiting for files.

Example

Retrieve the root document

  • root main.adoc ["main.adoc"] RETRIEVED

Parse the root document to find unconditional includes and conditional includes, and start building up the tree one-level deep, leaving a stub for each.

  • main.adoc ["main.adoc"] RETRIEVED PARSED
    • attrs.adoc ["main.adoc", "attrs.adoc"] STUB
    • if(include-all) PENDING
      • a.adoc ["main.adoc", "a.adoc"] STUB
      • b.adoc ["main.adoc", "b.adoc"] STUB
      • c.adoc ["main.adoc", "c.adoc"] STUB
      • d.adoc ["main.adoc", "d.adoc"] STUB
      • e.adoc ["main.adoc", "e.adoc"] STUB
      • f.adoc ["main.adoc", "f.adoc"] STUB

Retrieve all unconditional stubs. Don't retrieve the conditional stubs.

  • main.adoc ["main.adoc"] RETRIEVED PARSED
    • attrs.adoc ["main.adoc", "attrs.adoc"] RETRIEVED
    • if(include-all) PENDING
      • a.adoc ["main.adoc", "a.adoc"] STUB
      • b.adoc ["main.adoc", "b.adoc"] STUB
      • c.adoc ["main.adoc", "c.adoc"] STUB
      • d.adoc ["main.adoc", "d.adoc"] STUB
      • e.adoc ["main.adoc", "e.adoc"] STUB
      • f.adoc ["main.adoc", "f.adoc"] STUB

Parse the retrieved documents to find unconditional includes and conditional includes.

  • main.adoc ["main.adoc"] RETRIEVED PARSED
    • attrs.adoc ["main.adoc", "attrs.adoc"] RETRIEVED PARSED (no includes)
    • if(include-all) PENDING
      • a.adoc ["main.adoc", "a.adoc"] STUB
      • b.adoc ["main.adoc", "b.adoc"] STUB
      • c.adoc ["main.adoc", "c.adoc"] STUB
      • d.adoc ["main.adoc", "d.adoc"] STUB
      • e.adoc ["main.adoc", "e.adoc"] STUB
      • f.adoc ["main.adoc", "f.adoc"] STUB

At this point, there are two conditions which are true: there are no unconditional stubs, and there are conditional stubs. So resolve the attributes up to the first conditional.

  • main.adoc ["main.adoc"] RETRIEVED PARSED RESOLVED (no attributes)
    • attrs.adoc ["main.adoc", "attrs.adoc"] RETRIEVED PARSED (no includes) RESOLVED - (include-all: true)
    • if(include-all) TRUE
      • a.adoc ["main.adoc", "a.adoc"] STUB
      • b.adoc ["main.adoc", "b.adoc"] STUB
      • c.adoc ["main.adoc", "c.adoc"] STUB
      • d.adoc ["main.adoc", "d.adoc"] STUB
      • e.adoc ["main.adoc", "e.adoc"] STUB
      • f.adoc ["main.adoc", "f.adoc"] STUB

Even if the include-all attribute was changed within c.adoc, it was true at the start of the ifdef block. "Upgrade" the conditional stubs to unconditional:

  • main.adoc ["main.adoc"] RETRIEVED PARSED RESOLVED (no attributes)
    • attrs.adoc ["main.adoc", "attrs.adoc"] RETRIEVED PARSED (no includes) RESOLVED (include-all: true)
    • a.adoc ["main.adoc", "a.adoc"] STUB
    • b.adoc ["main.adoc", "b.adoc"] STUB
    • c.adoc ["main.adoc", "c.adoc"] STUB
    • d.adoc ["main.adoc", "d.adoc"] STUB
    • e.adoc ["main.adoc", "e.adoc"] STUB
    • f.adoc ["main.adoc", "f.adoc"] STUB

Retrieve all unconditional stubs. Don't retrieve the conditional stubs.

  • main.adoc ["main.adoc"] RETRIEVED PARSED RESOLVED (no attributes)
    • attrs.adoc ["main.adoc", "attrs.adoc"] RETRIEVED PARSED (no includes) RESOLVED (include-all: true)
    • a.adoc ["main.adoc", "a.adoc"] RETRIEVED
    • b.adoc ["main.adoc", "b.adoc"] RETRIEVED
    • c.adoc ["main.adoc", "c.adoc"] RETRIEVED
    • d.adoc ["main.adoc", "d.adoc"] RETRIEVED
    • e.adoc ["main.adoc", "e.adoc"] RETRIEVED
    • f.adoc ["main.adoc", "f.adoc"] RETRIEVED

Parse the retrieved documents to find unconditional includes and conditional includes.

  • main.adoc ["main.adoc"] RETRIEVED PARSED RESOLVED (no attributes)
    • attrs.adoc ["main.adoc", "attrs.adoc"] RETRIEVED PARSED (no includes) RESOLVED (include-all: true)
    • a.adoc ["main.adoc", "a.adoc"] RETRIEVED PARSED (no includes)
    • b.adoc ["main.adoc", "b.adoc"] RETRIEVED PARSED (no includes)
    • c.adoc ["main.adoc", "c.adoc"] RETRIEVED PARSED (no includes)
    • d.adoc ["main.adoc", "d.adoc"] RETRIEVED PARSED (no includes)
    • e.adoc ["main.adoc", "e.adoc"] RETRIEVED PARSED (no includes)
    • f.adoc ["main.adoc", "f.adoc"] RETRIEVED PARSED (no includes)

At this point, there are no unconditional stubs, and there are no conditional stubs. So start processing the overall document.

@ggrossetie
Copy link
Member

Thanks @tigregalis for the detailed example.
I gave it more thoughts and it's basically what I had in mind 👍

@mojavelinux I'm wondering how we should proceed ? Maybe we could implement something in Asciidoctor.js and introduce an experimental API so we can test it against various documents and fine tune the solution (before committing to it in Asciidoctor core) ?

Speaking of attributes and resolution, it's reminding me of another issue: asciidoctor/asciidoctor#1911 (comment)
Maybe we could also address this issue since we will need to parse the included document. Just an idea to keep in mind as it might be better to stay focus on the initial goal.

@ggrossetie
Copy link
Member

@tigregalis Are you working on it ? I want to give it a try but I don't want to step on your toes 😉

@tigregalis
Copy link
Author

@Mogztter
Sorry, I've been on site, far from civilisation.

I've seen that you've started making some changes in #633 and read a bit of the discussion there.

I don't understand the codebase well enough to contribute, and the syntax has a lot of quirks (discussed further below), but I have been playing around with building a wrapper/extension as a proof-of-concept, with my plan as follows:

  • Set up a global "virtual file system" which simulates the network/file system and "attribute store" which simulates the asciidoctor CLI/API: the "attribute store" should intercept and "lock" attributes that are provided by the CLI or API.
  • Fetch the root file and populate the "virtual file system".
  • When it comes back, parse the attribute definitions, if passages, code listings (code listings don't resolve attributes), and includes in the fetched file.
  • Determine the dependencies within the file and which includes have no dependencies, building up a "dependency tree".
  • Fetch files without dependencies and populate the "virtual file system".
  • Simultaneously build up the "include tree" (might be the same as the "dependency tree") so as to determine "root-to-branch" and prevent recursion entirely (I can't think of a single use-case for recursive text inclusion, especially without exit conditions).
  • Parse the attribute definitions, if passages, code listings (code listings don't resolve attributes), and includes in the fetched files.
  • Iterate until all files without dependencies are retrieved.
  • Resolve attributes from top to bottom ("stepping into" and "stepping out of" included files in the "dependency tree"), updating the "attribute store" (if not "locked") as we go, until encountering an include with a dependency.
  • Resolve the dependency (for ifn?def, determine whether defined, for ifeval and {attribute-use}, retrieve the value from the "attribute store"). If the file is to be included, it can effectively be treated as a new root file.
  • Repeat until all files have been retrieved and all dependencies have been resolved.
  • At this stage, the "virtual file system" should be fully populated. Set up an include processor that when encountering an include, will intercept it and grab the files from the "virtual file system".
  • Register the extension.
  • Finally, process the file(s) as normal.

Other than asciidoctor.js itself processing the file(s) at the end, this should be completely async and should be quite efficient.

I've been going through the asciidoctor user manual and testing a number of scenarios, and I'm finding there's a lot of flexibility, a lot of exceptions and a lot of strange behaviour that is hard to account for. Asynchronously fetching includes is relatively easy, even doing nested includes: as long as it's not conditional. Doing conditional resolution of includes though... a lot of these edge-cases can be pretty nightmarish.

  • You can have attributes defined anywhere, except a code listing block afaict.
  • You can put content inside the square brackets of an if: ifdef::should-include-partial[include::{partial-name}.adoc[\]]. This one also has an attribute in the square brackets.
  • You can do nested ifs, but how these work is unclear.
  • When there is a "malformed" if (e.g. ifdef::some-attribute[] followed later by endif::a-different-attribute[]), I can't follow the logic at all.
  • What about this (if partials):
    image

I guess it stems from the fact that includes are just points in the target file at which you can substitute the include text with the contents of the referenced file, i.e. partials. There are no syntactical rules for when an include is valid or not. All includes are valid. Some of this unfortunate flexibility is also found with the ifs and attributes.

This is an aside, but are syntactical changes within the scope of Asciidoctor 2.x? If so, there are a few ideas I have to make it more logical and tree-like:

  • In large part, they involve adding syntactical limitations: some of the things we do with "hacks" should be first-class-citizens in the syntax (e.g. via macros), but the "hacks" themselves should be removed.
  • What might make things easier is also to add some "closing tags" so to speak: the treatment of how different blocks / inline structures treats "opening tags" without corresponding "closing" tags would have to be documented.
  • This should probably be its own issue in core, but: Any time leading and trailing spaces (i.e. indentation) on lines have meaning that isn't tied to structure severely limits how the writer can format the source text, so I would remove the first method of defining a literal block described in https://asciidoctor.org/docs/user-manual/#literal-text-and-blocks (Indenting the first line of a paragraph by one or more spaces): there are two other methods that I think are reasonable and so I don't think it adds value, considering the restriction it places on the writer. This would be a breaking change of course. For migration, you can simply place [literal,indent=0] before any literal paragraph.
  • This should probably be its own issue in core, but: Why can attributes be defined everywhere? Ideally they should only be at the top of a file (a syntactical rule) and ignored elsewhere, or should be "hoisted" (and the CLI should warn when there are duplicate attributes). Attributes should be file-scoped (basically, defined only for the file itself), and when including files, if you want to pass attributes in or out, the attributes need to be explicitly handled, i.e. when getting attributes from a file use a macro e.g. importattrs::attrs.adoc[attribute1,attribute2] (also opens it up to importing JSON, TOML and other formats), and when processing an included file use options e.g. include::author.adoc[attrs=firstname,surname,picture]. These would be breaking changes of course.

I'd also suggest building it async-first.

A JSON AST as an output format would be excellent as well: I've dabbled with writing an extension to do this, but haven't gotten very far yet.

@ggrossetie
Copy link
Member

@tigregalis No worries! Again thanks for taking the time to write your thoughts.

I think we should iterate on this feature and take one step at a time. While I agree with a lot of things you said I think we should leave attributes for now and focus on unconditional includes that don't rely on attributes.
I think it set an achievable goal in a near future and we will still learn a lot in the process:

  • Introduce a virtual file system
  • Fetch files and populate the "virtual file system" (file://, http://, https://)
  • Introduce a cache mechanism to avoid fetching the same file again

Even with this limited scope we have a few things to decide.
For instance, should we use an include extension to resolve files from the "virtual file system" or should we add a "native" support in Asciidoctor core ? If we add a native support, Asciidoctor core could also ask the VFS to resolve images when using the data-uri attribute.
We could create an API to manipulate the VFS (could be useful if the user wants to contribute to it):

vfs.addFile({path: '/path/to/file.adoc', contents: Buffer()})

or create/provide one:

asciidoctor.convert(input, {vfs: vfs})

At some point we will also need to decide if we should replace the include directive with the file content when it's "safe". Might be a premature optimization but we don't need to parse an include twice with this simple example:

main.adoc

include::a.adoc[]

a.adoc

a

In this case we could resolve the include and call Asciidoctor with the following input:

main.adoc

a

Anyway first thing first, let's work on the virtual file system and on the process to fetch files efficiently.

@tigregalis
Copy link
Author

Haven't been keeping up with this for a while. Has this moved at all?

If I want to contribute to the codebase, where would be the best place to start (reading)?

@ggrossetie
Copy link
Member

Haven't been keeping up with this for a while. Has this moved at all?

It's still a work in progress: #644
But given the fact that more and more JavaScript libraries are "async first" and that browsers might disallow synchronous XMLHttpRequest in a mid-term future I think it's important to find a solution.

If I want to contribute to the codebase, where would be the best place to start (reading)?

If you want to understand how Asciidoctor resolves the include directive then you should read the Asciidoctor Ruby code base. In fact, the code generated from Opal is harder to read.

The Asciidoctor.js code base is now in the packages/core directory. Please note that most of the code is generated from the Asciidoctor Ruby library.

If you want to override a method from Asciidoctor Ruby you should create a Ruby file. For instance we override the Document#fill_datetime_attributes method in lib/asciidoctor/js/asciidoctor_ext/document.rb (and the file is required in asciidoctor_ext.rb).

@mojavelinux
Copy link
Member

@mojavelinux I'm wondering how we should proceed ?

To be honest, I just don't know. The language was designed in such a way that includes are synchronous and trying to decouple that in any way has side effects that may cause the parser to produce a different result. (Yes, I said what I said above, but in reality it's just more complex).

One possible solution is to do something like what Asciidoctor Reducer does, which is to expand all the includes before parsing (i.e., preprocess the document from top to bottom). The benefit there is that you can leverage the existing parser without having to resort to using what amounts to an alternate parser. Of course, that means Asciidoctor Reducer would need to be able to work with async operations itself...so changes may still be needed, but it's perhaps less invasive.

In the AsciiDoc Language, what we really need to do is be able to walk a document to identify preprocessor directives without parsing...which may even end up being a separate step in the parsing of AsciiDoc. It's a major problem for defining a grammar as well.

@mojavelinux
Copy link
Member

browsers might disallow synchronous XMLHttpRequest

IMHO, this is just narrow minded of them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants