Extract from buffer in edge case of multi-character delimiter #342

Closed

@johnkchow

The BufferedTokenizer works well with single-character delimiters, but with a multi-character delimiter it doesn't handle the edge case of extracting the delimiter from the internal buffer. The original discussion, along with code examples of the edge case, can be found in another pull request:

#338
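
For readers without #338 open, here is a minimal sketch of the kind of edge case involved, assuming it is a multi-character delimiter that arrives split across two chunks. `SketchTokenizer` is a hypothetical class written for illustration; it is not EventMachine's `BufferedTokenizer` and not the patch in this pull request. The idea is to search the whole internal buffer rather than only the newest chunk, so a delimiter half that is already buffered can still be matched.

```ruby
# Hypothetical illustration only; not the EventMachine implementation.
class SketchTokenizer
  def initialize(delimiter)
    @delimiter = delimiter # assumed to be a non-empty String other than " "
    @buffer = +""          # unfrozen internal buffer
  end

  # Append a chunk and return every complete token seen so far.
  def extract(data)
    @buffer << data
    # Split the whole buffer, not just the new chunk, so a delimiter that
    # straddles two chunks is still found. The -1 limit keeps a trailing
    # empty string when the buffer ends exactly on a delimiter.
    parts = @buffer.split(@delimiter, -1)
    # The last element is the incomplete remainder; keep it buffered.
    @buffer = parts.pop || +""
    parts
  end

  # Return whatever is left in the buffer and reset it.
  def flush
    rest = @buffer
    @buffer = +""
    rest
  end
end

tok = SketchTokenizer.new("<>")
tok.extract("foo<")     # nothing complete yet; "foo<" stays buffered
tok.extract(">bar<>ba") # completes "foo" and "bar"; "ba" stays buffered
```

Re-splitting the full buffer on every call is simple but rescans buffered data, so a real implementation would likely track how far it has already searched.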

@stakach

Good stuff!
I'd recommend adding a test too.

@rud

Yeah, the test described in #338 would do nicely here. Thoughts?

@stakach

@rud @johnkchow I ended up building an EventMachine replacement; however, you could port the tokeniser I wrote to EventMachine.

https://github.com/cotag/uv-rays
https://github.com/cotag/uv-rays/blob/master/lib/uv-rays/buffered_tokenizer.rb
with tests: https://github.com/cotag/uv-rays/blob/master/spec/buffered_tokenizer_spec.rb

It supports delimiters, optional start-of-stream indicators, and regular expressions, and it handles common edge cases such as multibyte delimiters.
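
As a rough illustration of what a start-of-stream indicator could mean, here is a small sketch that assumes a message is the text between the last indicator and the next delimiter, with anything before the indicator discarded. `extract_framed` and the `<<` / `>>` markers are hypothetical and are not taken from the uv-rays API; see the linked source for the actual behaviour.

```ruby
# Hypothetical helper, not part of uv-rays or EventMachine.
# Consumes complete frames from `buffer` (mutated in place) and returns
# the message bodies found between `indicator` and `delimiter`.
def extract_framed(buffer, indicator, delimiter)
  messages = []
  while (stop = buffer.index(delimiter))
    frame = buffer.slice!(0, stop)      # everything before the delimiter
    buffer.slice!(0, delimiter.length)  # drop the delimiter itself
    start = frame.rindex(indicator)     # last indicator wins
    # Frames with no indicator are treated as noise and dropped.
    messages << frame[(start + indicator.length)..-1] if start
  end
  messages
end

buffer = +"noise<<hello>>  <<wor"
messages = extract_framed(buffer, "<<", ">>")
# messages holds the one complete frame; the partial frame stays in buffer
```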

@rud

@stakach ah, brilliant. Thanks for sharing, looks neat.

@sodabrew referenced this pull request Feb 2, 2015: Update buftok.rb #547 (Merged)

@sodabrew

Resolved by #547

@sodabrew closed this Feb 3, 2015
@sodabrew added this to the v1.0.6 milestone Feb 3, 2015