Fallback to RegEx based parser when using custom transformers or extractors #11335
Right now the Rust based parser can't work with custom `transformers` or `extractors`.
In a perfect world we could implement as many custom parsers/extractors in Rust as needed, so that we wouldn't need this at all. In an almost perfect world we could pass the transformer and extractor to the Rust based parser and have it call the callback functions to handle all of this. This is probably what we are going to do in the future, but it requires more work to make sure that:
Since it currently doesn't work with the Rust based parser, we can implement a fix for this in the meantime before we reach the "perfect" solution.
One solution to this problem is to check if we do have a custom transformer or a custom extractor and if we do, then we can bail on the Rust parser completely and just use the current regex based parser.
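As a rough illustration of this all-or-nothing check, here is a minimal sketch. The `ContentOptions` shape and the `canUseRustParser` name are hypothetical and only stand in for however the configuration is actually represented; this is not the code from this PR.

```typescript
// Hypothetical option shape for illustration only: custom transformers and
// extractors keyed by file extension.
interface ContentOptions {
  transform: Record<string, (content: string) => string>;
  extract: Record<string, (content: string) => string[]>;
}

// All-or-nothing: a single custom transformer or extractor disables the
// Rust based parser for every file.
function canUseRustParser(options: ContentOptions): boolean {
  return (
    Object.keys(options.transform).length === 0 &&
    Object.keys(options.extract).length === 0
  );
}
```

The drawback of this approach is that one custom extractor for a single extension would move every file onto the slower regex based parser.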
An alternative solution, the one implemented here, is to group the `changedContent` into 2 buckets: one where we rely on the default transformer and extractor, and one where a custom transformer or extractor is used. The first bucket can still rely on the much faster Rust based parser. For the other bucket we fall back to the regex based parser.
The nice part about this is that we can use both parsers at the same time, and the majority of the use cases should use the faster Rust based parser.
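The bucketing idea can be sketched roughly as follows. The `ChangedContent` and `CustomHandlers` shapes and the `partitionChangedContent` name are assumptions made for this example, not the actual types or functions in this PR; the sketch only shows how entries might be split by extension.

```typescript
// Hypothetical shapes for illustration only.
interface ChangedContent {
  content: string;
  extension: string;
}

interface CustomHandlers {
  transform: Record<string, (content: string) => string>;
  extract: Record<string, (content: string) => string[]>;
}

const hasOwn = (obj: object, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(obj, key);

// Split changed content into two buckets: entries whose extension has no
// custom transformer or extractor (eligible for the Rust based parser) and
// entries that need the regex based fallback.
function partitionChangedContent(
  changed: ChangedContent[],
  handlers: CustomHandlers
): { rustBucket: ChangedContent[]; regexBucket: ChangedContent[] } {
  const rustBucket: ChangedContent[] = [];
  const regexBucket: ChangedContent[] = [];

  for (const item of changed) {
    const isCustom =
      hasOwn(handlers.transform, item.extension) ||
      hasOwn(handlers.extract, item.extension);
    (isCustom ? regexBucket : rustBucket).push(item);
  }

  return { rustBucket, regexBucket };
}
```

With a custom extractor registered only for, say, `md`, all other files would land in `rustBucket`, and only the `md` entries would go through the regex based parser.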