Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add example of swapping from linereader's next_batch() method #1

Open
dimo414 opened this issue Oct 5, 2022 · 2 comments
Open

Add example of swapping from linereader's next_batch() method #1

dimo414 opened this issue Oct 5, 2022 · 2 comments

Comments

@dimo414
Copy link

dimo414 commented Oct 5, 2022

I'd like to swap from linereader to this library due to the silently capped line lengths issue you mention in the README, but it's not obvious to me how to replicate that function using the APIs in ripline. It'd be helpful if you could add an example of processing batches of lines at a time. Thanks!

@sstadick
Copy link
Owner

sstadick commented Oct 5, 2022

HI @dimo414! Thanks for making an issue! I think that the Example in the README does actually do what you want. It looks like .next_batch() from linereader returns a slice that is guaranteed end with a newline. the LineBufferReader.fill() method fills the a buffer and does the same thing. So LineBufferReader.buffer() will return the same thing as next_buffer().

Does that answer the question?

(annotated example)

use grep_cli::stdout;
use ripline::{
    line_buffer::{LineBufferBuilder, LineBufferReader},
    lines::LineIter,
    LineTerminator,
};
use std::{env, error::Error, fs::File, io::Write, path::PathBuf};
use termcolor::ColorChoice;

fn main() -> Result<(), Box<dyn Error>> {
    let path = PathBuf::from(env::args().nth(1).expect("Failed to provide input file"));

    let mut out = stdout(ColorChoice::Never);

    let reader = File::open(&path)?;
    let terminator = LineTerminator::byte(b'\n');
    let mut line_buffer = LineBufferBuilder::new().build();
    let mut lb_reader = LineBufferReader::new(reader, &mut line_buffer);

    while lb_reader.fill()? {
       // It's the lb_reader.buffer() here that returns &[u8]. LineIter is just an optimized zero-copy iterator for iterating over those lines. You could use the returned buffer in the same way you were using the buffer from `next_batch`
        let lines = LineIter::new(terminator.as_byte(), lb_reader.buffer());
        for line in lines {
            out.write_all(line)?;
        }
        lb_reader.consume_all();
    }

    Ok(())
}

@dimo414
Copy link
Author

dimo414 commented Oct 15, 2022

Thanks for the help! I've migrated over to ripline and LineBufferReader is working as you describe :) I was confused because the example uses a LineTerminator that is only referenced by LineIter, so it wasn't clear that the LineBufferReader was similarly EOL-aware.

Some thoughts:

  • The use of LineTerminator in the example is distracting, since it's basically just holding onto the b'\n' until it gets passed directly to LineIter (vs. passing in terminator itself). It would be better to use it in both places or neither.
    • In that vein, perhaps LineIter instances should be provided by the LineBufferReader so that they can share the same terminator, rather than risking them being out of sync.
    • It seems like the LineTerminator type is minimally-used in the library itself, it only shows up in ripline::lines::without_terminator() - maybe it would be cleaner to just get rid of that type?
  • The docs on LineBufferReader.fill() are somewhat confusing, since it sounds like false is returned as soon as EOF is reached in the underlying source, but actually it returns false only once the buffer is fully drained.
  • It would be nice if we didn't need to call both fill() and consume_all(); linereader does both operations via next_batch()/next_line().

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants