Subtext recursive descent parser#3
Otherwise we backtrack, because this is just prose.
| markup: markup, | ||
| range: selection | ||
| ).renderMarkup(url: Slashlink.slashlinkToURLString) | ||
| Subtext(markup: markup) |
| var span: Substring | ||
| } | ||
|
|
||
| struct Bracketlink { |
I only know about the hyperlink and slashlink link forms - what is the format/semantics of bracketlink?
@cdata Covered here in the specification: https://github.com/gordonbrander/subtext/blob/main/specification.md#bracketed-urls. TLDR, they are a syntax form that allows linking to non-http/https protocols.
This won't link:
ipfs://asdfasdfasfasdf
dat://asdfasdfasfasdf
This will link:
<ipfs://asdfasdfasfasdf>
<dat://asdfasdfasfasdf>
You can write HTTP urls either way:
This will link: https://example.com
So will this: <https://example.com>
URL syntax is very very open-ended, so it is difficult to auto-link the general form of URLs. Bracket links mean we can autolink http urls, and support other protocols without forcing parsers to maintain a whitelist of exotic protocols. Easy things are easy, difficult things are possible.
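To make the linking rule above concrete, here is a minimal sketch. It is not the parser from this PR: `linkTarget(for:)` is a hypothetical helper that looks at one whitespace-delimited word and assumes a bracket link is simply `<`, something containing `://`, then `>`. The specification has the authoritative rules.

```swift
import Foundation

/// Hypothetical helper (not part of the PR): returns the URL string that a
/// single word would link to, or nil if the word is treated as plain prose.
func linkTarget(for word: String) -> String? {
    // Bracketed form: any protocol links when wrapped in angle brackets.
    if word.hasPrefix("<") && word.hasSuffix(">") && word.count > 2 {
        let inner = word.dropFirst().dropLast()
        return inner.contains("://") ? String(inner) : nil
    }
    // Bare form: only http and https URLs are auto-linked.
    if word.hasPrefix("http://") || word.hasPrefix("https://") {
        return word
    }
    // Everything else, including bare ipfs:// or dat://, stays prose.
    return nil
}

let bare = linkTarget(for: "ipfs://asdfasdfasfasdf")
let bracketed = linkTarget(for: "<ipfs://asdfasdfasfasdf>")
```

With this sketch, `bare` is nil and `bracketed` holds the inner URL, which mirrors the "won't link" and "will link" cases above without the parser needing a list of known protocols.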
|
|
||
| import Foundation | ||
|
|
||
| struct Tape<T> |
| } | ||
|
|
||
| /// Move forward one element | ||
| @discardableResult mutating func consume() -> T.SubSequence { |
What does @discardableResult imply?
(my intuition is that this has something to do with reference counting)
@cdata Ordinarily the Swift compiler (as surfaced in Xcode) will warn if you don't use a function's return value, reasoning that this is a mistake. The language provides the `@discardableResult` attribute to say "it's ok to discard the return value, sometimes I just use this function for its side effects".
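A small illustration of the attribute, using a hypothetical `Counter` type rather than the PR's `Tape`. The attribute only affects the compile-time "unused result" warning; it has nothing to do with reference counting.

```swift
struct Counter {
    private(set) var count = 0

    /// Without @discardableResult, calling `increment()` and ignoring the
    /// returned value would produce an "unused result" compiler warning.
    @discardableResult
    mutating func increment() -> Int {
        count += 1
        return count
    }
}

var counter = Counter()
counter.increment()               // result ignored, no warning
let second = counter.increment()  // result used as a normal return value
```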
|
|
||
| /// Move forward one element | ||
| @discardableResult mutating func consume() -> T.SubSequence { | ||
| let subsequence = collection[currentIndex...currentIndex] |
Is a range operator (...) required here?
@cdata It is! It causes us to get a `T.SubSequence` instead of a `T.Element`. We use SubSequences everywhere in Tape because they maintain index references to the underlying collection.
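The difference can be seen with a plain `String`, used here as a stand-in for the generic `T` in `Tape`. This is a sketch, not code from the PR.

```swift
let text = "abc"
let index = text.index(after: text.startIndex)

// Subscripting with an index yields a single element.
let element: Character = text[index]

// Subscripting with a one-index closed range yields a one-element SubSequence.
let slice: Substring = text[index...index]

// The slice shares indices with its base collection, so its bounds can be
// used to index back into `text`.
let rest = text[slice.endIndex...]
```

Because `slice.startIndex` and `slice.endIndex` are valid indices into `text`, a parser can keep slices around and later widen or join them without copying.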
| } | ||
|
|
||
| /// Get current subsequence | ||
| var subsequence: T.SubSequence { |
IIUC this is the form of an accessor (a "getter" in this case) in Swift 📝
Yes, this is shorthand for a get-only accessor.
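For comparison, a sketch of the shorthand next to the explicit form, on a hypothetical `Span` type that is unrelated to the PR's own types:

```swift
struct Span {
    var start: Int
    var end: Int

    // Shorthand: a body with no `get` keyword is a read-only computed property.
    var length: Int {
        end - start
    }

    // Equivalent long form with an explicit getter.
    var lengthExplicit: Int {
        get { return end - start }
    }
}

let span = Span(start: 2, end: 5)
```

Both properties are recomputed on every access and cannot be assigned to.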
| } | ||
|
|
||
| /// Peek forward, and consume if match | ||
| mutating func consumeMatch(_ subsequence: T.SubSequence) -> Bool { |
@cdata Yeah, _ here means "don't use argument label on call-side".
Swift is strange in that arguments are both positional at call-site, AND labeled. I think this is an Objective-C holdover? Anyway, you can avoid having to use the argument label on call side this way. I am of two minds about this. Some swift code uses this form for first argument, and labels for extra arguments. In more recent code, I've defaulted to keeping labels more of the time, even though it is verbose.
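The two styles look like this at the call site. `startsWith` is a hypothetical function written for illustration, not something from the PR.

```swift
/// `_` drops the label for the first argument; `in` is the external label
/// for the second parameter, whose internal name is `input`.
func startsWith(_ prefix: String, in input: String) -> Bool {
    return input.hasPrefix(prefix)
}

/// Fully labeled variant: every argument is named at the call site.
func startsWith(prefix: String, input: String) -> Bool {
    return input.hasPrefix(prefix)
}

let unlabeled = startsWith("/", in: "/slashlink")
let labeled = startsWith(prefix: "/", input: "/slashlink")
```

Labels are part of the function's name, so the two declarations above are distinct overloads and can coexist.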
| return self.subsequence | ||
| } | ||
|
|
||
| /// Get a single-item SubSequence offset by `forward` of the `currentStartIndex`. |
Maybe this comment is out of date - forward does not appear in the method implementation (perhaps you meant offset).
|
|
||
| /// Capture word-boundary-delimited forms at beginning of line. | ||
| tape.start() | ||
| if let inline = consumeInlineWordBoundaryForm(tape: &tape) { |
It may be possible to satisfy this condition as part of the loop to reduce the method's complexity a little.
Yet another riff on #1104

## Design

This PR changes the design by removing the concept of an orchestrator. Instead, we have:

- Classifiers
  - May be composed together into a single classifier.
- Routes
  - Receive a request struct which contains the input and classifications, and a `process(:)` function.
  - Return a string or nil.
  - May safely recursively call the parent router using the `process(:)` function.
- Router
  - Has a classifier, which it runs for each request. The classifier may be composed of multiple classifiers.
  - Has an array of routes. Routes are run top-to-bottom for each request.
  - Exits on the first route match.

This gives routes a high degree of expressivity, since a route may rewrite the input using information from the classifications, and then recurse back into the router. Routes may also call out to specialized sub-routers, allowing us to construct trees of routing, with each router able to recurse on itself.

This is similar in principle to many rule-based NLP systems such as AIML or ChatScript that use hierarchy and recursion to pick apart an input and dispatch parts of it to different subsystems, before returning a result.

The previous orchestrator model only had weights to work with to make a choice between results. This approach has much more nuanced control over the result, since the result can be returned from specific and specialized branches within the tree of routers.

## Concepts

- `PromptClassification`: an individual classification with weight
- `PromptClassifierProtocol`: given an input, produces classifications
- `PromptClassifier`: composes classifiers together
- `PromptRouteRequest`: a request struct containing the context for a given route request, including original input, classifications, and a function to recursively call into the route's router.
- `PromptRouteProtocol`: a route
- `PromptRoute`: a concrete implementation of `PromptRouteProtocol` taking a closure as a definition
- `PromptRouter`: given an array of routes and a classifier, produces a result with `process(:) -> String?`
- `PromptRouterRequest`: an ephemeral actor that exists for the duration of a `PromptRouter.process(:)` request. This actor tracks the recursion depth and makes sure requests don't recurse too deeply.
  - Aside: I was pleased to learn that actors do not cost any more than classes
- `PromptService`: configures classifiers, routes, and routers.

---------

Co-authored-by: Ben Follington <5009316+bfollington@users.noreply.github.com>
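The routing design described above could be sketched roughly as follows. This is a simplified, synchronous illustration under stated assumptions: every name here (`Classification`, `Classifier`, `RouteRequest`, `Route`, `Router`, `maxDepth`) is hypothetical, closures stand in for the protocols, and a plain depth counter stands in for the `PromptRouterRequest` actor.

```swift
/// Hypothetical, simplified stand-ins for the concepts listed above.
struct Classification {
    var tag: String
    var weight: Double
}

typealias Classifier = (String) -> [Classification]

/// Compose several classifiers into one by concatenating their results.
func compose(_ classifiers: [Classifier]) -> Classifier {
    return { input in classifiers.flatMap { $0(input) } }
}

struct RouteRequest {
    var input: String
    var classifications: [Classification]
    /// Lets a route recurse back into the router that invoked it.
    var process: (String) -> String?
}

typealias Route = (RouteRequest) -> String?

struct Router {
    var classifier: Classifier
    var routes: [Route]
    var maxDepth = 4

    func process(_ input: String) -> String? {
        return process(input, depth: 0)
    }

    private func process(_ input: String, depth: Int) -> String? {
        // Stop routes from recursing too deeply.
        guard depth < maxDepth else { return nil }
        let request = RouteRequest(
            input: input,
            classifications: classifier(input),
            process: { next in self.process(next, depth: depth + 1) }
        )
        // Routes run top-to-bottom; the first non-nil result wins.
        for route in routes {
            if let result = route(request) { return result }
        }
        return nil
    }
}

let router = Router(
    classifier: compose([
        { input in input.hasPrefix("/") ? [Classification(tag: "command", weight: 1.0)] : [] }
    ]),
    routes: [
        // Rewrite a "/command" input and recurse into the router.
        { request in
            guard request.classifications.contains(where: { $0.tag == "command" }) else {
                return nil
            }
            return request.process(String(request.input.dropFirst()))
        },
        // Fallback route that always matches.
        { request in "echo: \(request.input)" }
    ]
)

let result = router.process("/hello")
```

Here the first route strips the leading slash and recurses. The second pass no longer classifies as a command, so the fallback route answers, which is the rewrite-then-recurse pattern the description attributes to routes.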
Re-implementing Subtext via recursive descent.