Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce an async convert #633

Closed
wants to merge 10 commits into from
Closed

Introduce an async convert #633

wants to merge 10 commits into from

Conversation

ggrossetie
Copy link
Member

@ggrossetie ggrossetie commented Dec 21, 2018

The idea is to introduce an experimental API to resolve unconditional includes (first level only). All other includes will be resolved synchronously by Asciidoctor core.

In the benchmark directory I've added a "best case scenario" with 20 unconditional includes.

Result

Method Time (ms)
convert 780
convertAsync 180

Ref: #630

TODO

  • Trigger preprocessor ?
  • Handle or ignore include directive with attributes (leveloffset, lines, tags or indent)
  • Handle or ignore target URI (remote or local include)
  • Ignore target that contains { (ie. contains a variable that need to be substituted)
  • Trigger include processor or delegate to the processor (ie. if there's an include processor registered for the target, ignore the include directive)
    • ⚠️ we do not have a reader object!
    • Use a special kind of include processor (that returns a content asynchronously) ?
  • Use the Asciidoctor Logger to print an error when the include is not readable
  • Skip block comments

@ggrossetie
Copy link
Member Author

This could work if we limit the scope but it has many limitations.
Since we are resolving includes before the parsing the extension registry is not yet activated, attributes are not resolved, the Document and Reader object are not initialized...

Currently if one the following is true we will skip the include directive (ie. Asciidoctor core will resolve the file synchronously):

  • If there's a Preprocessor extension
  • If there's an Include Processor extension for the target
  • If the target contains an attribute
  • If the include directive contains leveloffset, tag, tag or lines
  • ...

@mojavelinux
Copy link
Member

We desperately need to split out the logic that selects lines by tags and line numbers. I really don't like to see this logic repeated here (and I don't like that it is repeated in Antora either). This should be a top goal for Asciidoctor 2.0.x, and I think we should ✋ on adding it to Asciidoctor.js until that's available.

@mojavelinux
Copy link
Member

On the other hand, I don't even think processing line numbers and tags is right here. We should be taking and caching the whole file. All we're really doing is populating a virtual file system so that those lines can be read later by the real include processor. So the filtering should be deferred until then. Isn't that correct?

@mojavelinux
Copy link
Member

This really opens some fundamental questions about how includes are processed, and what limits there should be. This is definitely an area I want to focus on in the spec, because right now there are so many sharp edges.

@mojavelinux
Copy link
Member

One thing is for certain in my mind. This should be marked as highly experimental because it has important consequences we still need to understand ourselves.

@ggrossetie
Copy link
Member Author

ggrossetie commented Dec 23, 2018

We desperately need to split out the logic that selects lines by tags and line numbers. I really don't like to see this logic repeated here (and I don't like that it is repeated in Antora either). This should be a top goal for Asciidoctor 2.0.x, and I think we should ✋ on adding it to Asciidoctor.js until that's available.

👍
I was just experimenting to see how far I can get 😄

On the other hand, I don't even think processing line numbers and tags is right here. We should be taking and caching the whole file. All we're really doing is populating a virtual file system so that those lines can be read later by the real include processor. So the filtering should be deferred until then. Isn't that correct?

Sure it's the plan.
I was poking around to see if there's a notable improvement in performance.
Do you think we should use an include processor ? there can only be one at a time right ? and if we do that we will need to implement the tags/lines filtering.
We've talked about an "include resolver" extension, maybe we could also have an "include reader" (that returns the content for a given target) ? Or at least a method that can be overridden (like the resolve_include_path method)

This really opens some fundamental questions about how includes are processed, and what limits there should be. This is definitely an area I want to focus on in the spec, because right now there are so many sharp edges.

Indeed, I read the code over and over again and I'm still not sure what happens if you include a remote file and then use a relative include inside this remote file 🤔
And If you mix that with a browser environment where an URI can be in fact a "local resource" 💥

One thing is for certain in my mind. This should be marked as highly experimental because it has important consequences we still need to understand ourselves.

💯
Potentially it can degraded performance and the current implementation is just a proof of concept so it will be a "use it at your own risk" API 💀

@ggrossetie
Copy link
Member Author

I will extract the new Registry APIs (hasPreprocessors, getPreprocessors...) and the conversion from Opal.nil to undefined in two pull requests.

@ggrossetie ggrossetie changed the title (wip) Introduce an async convert Introduce an async convert Dec 29, 2018
@ggrossetie ggrossetie closed this Dec 29, 2018
@ggrossetie
Copy link
Member Author

Superseded by #644

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants