Alternative to BufferedGenerator #12

@Westacular

I thought the whole

buffer.next()
while buffer.current != nil {
    if let unicode = buffer.current { // ... somewhere, buffer.next() is called

dance was kind of ugly: you're dealing with the overhead of using a generator, but receiving none of the benefits it provides (e.g. for-in loops). Also, using a struct for your BufferedGenerator seems odd -- you end up using a class as a backing store anyway, and having it as a struct means using inout parameters all over the place. There's a discussion on the dev forums that makes the case that GeneratorTypes should, in general, just be reference types.
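To make the struct-versus-class point concrete, here is a small sketch written in current Swift syntax, where GeneratorType is spelled IteratorProtocol. The names `skipOneInout`, `skipOneShared` and `CountdownIterator` are made up for this illustration and do not come from the project.

```swift
// A struct iterator is copied when passed to a function, so a helper
// has to take it `inout` for the caller to see the consumed element.
func skipOneInout<I: IteratorProtocol>(_ iterator: inout I) {
    _ = iterator.next()
}

// Hypothetical class-based iterator: every holder shares the same state.
final class CountdownIterator: IteratorProtocol {
    private var remaining: Int

    init(from start: Int) {
        remaining = start
    }

    // Returns start, start - 1, ..., 1 and then nil.
    func next() -> Int? {
        guard remaining > 0 else { return nil }
        defer { remaining -= 1 }
        return remaining
    }
}

// No `inout` needed: the reference itself is enough to share progress.
func skipOneShared(_ iterator: CountdownIterator) {
    _ = iterator.next()
}

var structIterator = [3, 2, 1].makeIterator()
skipOneInout(&structIterator)          // must be passed with `&`
let afterStructSkip = structIterator.next()

let classIterator = CountdownIterator(from: 3)
skipOneShared(classIterator)           // plain argument, state is shared
let afterClassSkip = classIterator.next()
```

Both variants end up on the second element after the helper runs, but only the struct version forces `inout` and `&` through every parsing function.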

Anyway, I took a different view on the parsing issue -- it's not that you need access to the "current" element so much as it is that determining what sub-parser to use forces the generator to consume one too many characters. Really, if you could just rewind the generator back a character, everything would be simple. So I wrote a class that lets you do that:

/// Creates a `GeneratorType`/`SequenceType` that allows rewinding and can be passed around.
final public class RewindableGenerator<S : CollectionType where S.Index : BidirectionalIndexType> : GeneratorType, SequenceType {
    typealias Sequence = S

    private var currentIndex: Sequence.Index
    private let prestartIndex: Sequence.Index

    private let seq: Sequence

    // store the `current` element, but it's not really necessary
    private(set) var current: Sequence.Generator.Element?

    /// Initializes a new `RewindableGenerator<S>` with an underlying `CollectionType`.
    /// Requires that `CollectionType.Index` be a `BidirectionalIndexType`.
    ///
    /// :param: sequence the sequence that will be used to traverse the content.
    public init(_ sequence: Sequence) {
        self.seq = sequence
        self.currentIndex = sequence.startIndex
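        // Note: relies on the index type tolerating a position one step before
        // `startIndex`, which `CollectionType` does not guarantee in general.
        self.prestartIndex = sequence.startIndex.predecessor()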
    }

    /// Moves the `current` element to the next element if one exists.
    ///
    /// :returns: The `current` element or `nil` if the element does not exist.
    public func next() -> Sequence.Generator.Element? {

        if currentIndex == seq.endIndex {
            return nil
        }

        currentIndex = currentIndex.successor()

        if currentIndex != seq.endIndex {
            self.current = seq[currentIndex]
        } else {
            self.current = nil
        }
        return self.current
    }

    /// Moves the `current` element to the previous element if one exists.
    ///
    /// :returns: The `current` element or `nil` if the element does not exist.
    public func previous() -> Sequence.Generator.Element? {

        if currentIndex == self.prestartIndex {
            return nil
        }

        currentIndex = currentIndex.predecessor()

        if currentIndex != self.prestartIndex {
            self.current = seq[currentIndex]
        } else {
            self.current = nil
        }
        return self.current
    }

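    /// Rewinds by one element and returns `self`, so each new `for`-`in` loop
    /// begins at the current element rather than the one after it.
    public func generate() -> Self {
        self.previous()
        return self
    }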
}

I don't have time to do a full pull request, but I believe this technique should let you use for scalar in buffer loops in all the various parsing functions. Each time one of the sub-parsers starts iterating through the JSON, the for loop implicitly calls generate(), which rewinds the parsing by one character, so the character that was consumed to choose the sub-parser shows up again as the first item of the iteration.
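For anyone reading this with a newer toolchain, here is a rough sketch of the same idea in current Swift, where GeneratorType, SequenceType and generate() are spelled IteratorProtocol, Sequence and makeIterator(). It is not a drop-in port of the class above: `RewindableIterator`, `rewind()` and `parseNumber` are names invented for this sketch, and it tracks the index of the next element instead of using a pre-start index, on the assumption that stepping before `startIndex` is not safe for every collection.

```swift
/// Hypothetical rewindable iterator over any bidirectional collection.
final class RewindableIterator<C: BidirectionalCollection>: IteratorProtocol, Sequence {
    private let base: C
    private var position: C.Index   // index of the element `next()` returns next
    private var pastEnd = false     // true once `next()` has returned nil

    init(_ base: C) {
        self.base = base
        self.position = base.startIndex
    }

    func next() -> C.Element? {
        guard position != base.endIndex else {
            pastEnd = true
            return nil
        }
        defer { position = base.index(after: position) }
        return base[position]
    }

    /// Un-consumes the most recent result of `next()` (an element or the final nil).
    func rewind() {
        if pastEnd {
            pastEnd = false
        } else if position != base.startIndex {
            position = base.index(before: position)
        }
    }

    /// Every new `for` loop starts by rewinding one step.
    func makeIterator() -> RewindableIterator {
        rewind()
        return self
    }
}

let digits: ClosedRange<Unicode.Scalar> = "0"..."9"

// Sub-parser: its `for` loop triggers `makeIterator()`, so the digit that
// caused the dispatch is the first scalar it sees.
func parseNumber(_ input: RewindableIterator<String.UnicodeScalarView>) -> Int {
    var value = 0
    for scalar in input {
        guard digits.contains(scalar) else {
            input.rewind()          // hand the terminator back to the caller
            break
        }
        value = value * 10 + Int(scalar.value) - 48   // 48 is the code of "0"
    }
    return value
}

let input = RewindableIterator("12+345".unicodeScalars)
var numbers: [Int] = []
var operators: [String] = []

for scalar in input {
    if digits.contains(scalar) {
        numbers.append(parseNumber(input))   // the digit is already consumed here
    } else {
        operators.append(String(scalar))
    }
}
```

The explicit `rewind()` call in `parseNumber` is a choice made for this sketch: it lets a sub-parser give back a terminator it does not own, whereas the bracket example below simply lets the closing bracket stay consumed.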

Here's a goofy example use from my testing, that just grabs the characters between paired brackets and adds them to an array:

var words: [String] = []
var unmatched: [String] = []

func parseBrackets(gen: RewindableGenerator<String.UnicodeScalarView>) {
    var sbuf = ""

    for (idx, c) in enumerate(gen) {
        if idx == 0 {
            if c == "[" {
                continue
            } else {
                // error or something
                return
            }
        }

        switch c {
        case "[":
            // nested brackets
            parseBrackets(gen)
        case "]":
            words.append(sbuf)
            return
        default:
            sbuf.append(c)
        }
    }
    unmatched.append(sbuf)
}


let s = "[This] [is ][a] weird [ne[st]e[d]] [\u{aaef}string!"
let g = RewindableGenerator(s.unicodeScalars)

for c in g {
    if c == "[" {
        parseBrackets(g)
    }
}

// words == ["This", "is ", "a", "st", "d", "nee"]
// unmatched == ["ꫯstring!"]
