I thought the whole

```swift
buffer.next()
while buffer.current != nil {
    if let unicode = buffer.current {
        // ... somewhere, buffer.next() is called
    }
}
```

dance was kind of ugly: you pay the overhead of using a generator but get none of the benefits it provides (e.g. for-in loops). Using a struct for your `BufferedGenerator` also seems odd: you end up using a class as a backing store anyway, and making it a struct means passing `inout` parameters all over the place. There's a discussion on the dev forums arguing that `GeneratorType`s should, in general, just be reference types.
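To make the struct-versus-class point concrete, here is a small sketch in current Swift syntax (`IteratorProtocol` is the later name for `GeneratorType`). `ValueCounter`, `RefCounter`, and `consumeOne` are hypothetical names made up for illustration; they are not part of the library.

```swift
// Hypothetical value-type iterator: every assignment or plain argument pass copies it.
struct ValueCounter: IteratorProtocol {
    var n = 0
    mutating func next() -> Int? { n += 1; return n }
}

// Hypothetical reference-type iterator: all holders share one position.
final class RefCounter: IteratorProtocol {
    var n = 0
    func next() -> Int? { n += 1; return n }
}

// A helper that advances a struct iterator has to take it `inout`...
func consumeOne(_ it: inout ValueCounter) { _ = it.next() }
// ...while a class instance can simply be passed around.
func consumeOne(_ it: RefCounter) { _ = it.next() }

var v = ValueCounter()
var copy = v            // the struct is copied here
_ = copy.next()         // advances only the copy; v has not moved
consumeOne(&v)          // v advances by one

let r = RefCounter()
let alias = r           // same object, no copy
_ = alias.next()        // advances the shared iterator
consumeOne(r)           // advances it again
```

With the struct, progress made through `copy` is invisible to `v`, which is why the parsing helpers would need `inout` everywhere; with the class, every sub-parser sees the same position.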
Anyway, I took a different view on the parsing issue -- it's not that you need access to the "current" element so much as it is that determining what sub-parser to use forces the generator to consume one too many characters. Really, if you could just rewind the generator back a character, everything would be simple. So I wrote a class that lets you do that:
```swift
/// Creates a `GeneratorType`/`SequenceType` that allows rewinding and can be passed around.
final public class RewindableGenerator<S : CollectionType where S.Index : BidirectionalIndexType> : GeneratorType, SequenceType {
    typealias Sequence = S

    private var currentIndex: Sequence.Index
    private let prestartIndex: Sequence.Index
    private let seq: Sequence

    // store the `current` element, but it's not really necessary
    private(set) var current: Sequence.Generator.Element?

    /// Initializes a new `RewindableGenerator<S>` with an underlying `CollectionType`.
    /// Requires that `CollectionType.Index` be a `BidirectionalIndexType`.
    ///
    /// :param: sequence the sequence that will be used to traverse the content.
    public init(_ sequence: Sequence) {
        self.seq = sequence
        // Start one position before the first element (consistent with `current == nil`),
        // so that calling `next()` directly, without `generate()`, yields the first element.
        let prestart = sequence.startIndex.predecessor()
        self.prestartIndex = prestart
        self.currentIndex = prestart
    }

    /// Moves the `current` element to the next element if one exists.
    ///
    /// :returns: The `current` element or `nil` if the element does not exist.
    public func next() -> Sequence.Generator.Element? {
        if currentIndex == seq.endIndex {
            return nil
        }
        currentIndex = currentIndex.successor()
        if currentIndex != seq.endIndex {
            self.current = seq[currentIndex]
        } else {
            self.current = nil
        }
        return self.current
    }

    /// Moves the `current` element to the previous element if one exists.
    ///
    /// :returns: The `current` element or `nil` if the element does not exist.
    public func previous() -> Sequence.Generator.Element? {
        if currentIndex == self.prestartIndex {
            return nil
        }
        currentIndex = currentIndex.predecessor()
        if currentIndex != self.prestartIndex {
            self.current = seq[currentIndex]
        } else {
            self.current = nil
        }
        return self.current
    }

    /// Called implicitly at the start of every for-in loop.
    public func generate() -> Self {
        // Rewind by one so the most recently consumed element is yielded first.
        self.previous()
        return self
    }
}
```

I don't have time to do a full pull request, but I believe this technique should let you use `for scalar in buffer` loops in all the various parsing functions. Each time you start iterating through the JSON in one of the sub-parsers, there is an implicit call to `generate()`, which automatically rewinds the parser by one character, so the lost character reappears as the first item of the iteration.
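As a rough illustration of that implicit rewind, here is a minimal sketch in current Swift syntax. It assumes the later standard-library renames (`IteratorProtocol` for `GeneratorType`, `makeIterator()` for `generate()`, `index(before:)` for `predecessor()`). `RewindableIterator` and `rewind()` are hypothetical names for a simplified variant, not the class above.

```swift
// Hypothetical, simplified port of the rewinding idea to current Swift.
final class RewindableIterator<C: BidirectionalCollection>: IteratorProtocol, Sequence {
    private let base: C
    // Index of the element that the next call to `next()` will return.
    private var nextIndex: C.Index

    init(_ base: C) {
        self.base = base
        self.nextIndex = base.startIndex
    }

    func next() -> C.Element? {
        guard nextIndex != base.endIndex else { return nil }
        defer { nextIndex = base.index(after: nextIndex) }
        return base[nextIndex]
    }

    // Step back one element (if possible) so the last consumed element is yielded again.
    func rewind() {
        if nextIndex != base.startIndex {
            nextIndex = base.index(before: nextIndex)
        }
    }

    // Every for-in loop calls this, which is where the implicit rewind happens.
    func makeIterator() -> RewindableIterator {
        rewind()
        return self
    }
}

let text = "ab[cd]"
let it = RewindableIterator(text)

var prefix = ""
for ch in it {
    if ch == "[" { break }   // the "[" has already been consumed at this point
    prefix.append(ch)
}

var rest = ""
for ch in it {               // makeIterator() rewinds, so "[" is seen again
    rest.append(ch)
}
// prefix == "ab", rest == "[cd]"
```

The second loop stands in for a sub-parser: it starts with the character that the dispatching loop had to consume in order to decide which sub-parser to call.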
Here's a goofy example use from my testing, that just grabs the characters between paired brackets and adds them to an array:
```swift
var words: [String] = []
var unmatched: [String] = []

func parseBrackets(gen: RewindableGenerator<String.UnicodeScalarView>) {
    var sbuf = ""
    // enumerate(gen) calls gen.generate(), which rewinds by one character,
    // so the "[" that triggered this call is seen again at idx == 0.
    for (idx, c) in enumerate(gen) {
        if idx == 0 {
            if c == "[" {
                continue
            } else {
                // error or something
                return
            }
        }
        switch c {
        case "[":
            // nested brackets
            parseBrackets(gen)
        case "]":
            words.append(sbuf)
            return
        default:
            sbuf.append(c)
        }
    }
    // ran out of input before finding the closing bracket
    unmatched.append(sbuf)
}

let s = "[This] [is ][a] weird [ne[st]e[d]] [\u{aaef}string!"
let g = RewindableGenerator(s.unicodeScalars)
for c in g {
    if c == "[" {
        parseBrackets(g)
    }
}
// words == ["This", "is ", "a", "st", "d", "nee"]
// unmatched == ["ꫯstring!"]
```