It is possible to use a simple streaming input to make rust-csv allocate an unbounded amount of memory while parsing a single CSV line:
extern crate csv;
extern crate serde;

#[cfg(test)]
mod tests {
    use csv::Reader;
    use std::io::repeat;

    #[test]
    fn test_csv_bomb() {
        let mut rdr = Reader::from_reader(repeat(b','));
        // This line runs forever and keeps allocating memory
        rdr.deserialize::<f64>().next();
    }
}
As a user, I had expected that rust-csv would know that each line is supposed to be deserialized into a single f64 and therefore only parse a single field before returning an error.
This makes it risky to use rust-csv to parse data from untrusted streaming sources, even if the user is expecting a finite number of records where each record has a finite size. I can imagine someone writing a server that parses a CSV as it is uploaded, which would be vulnerable to this kind of issue.
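As a stopgap on the caller's side, the total number of bytes handed to the parser can be capped with the standard library's `Read::take`. The sketch below uses only `std`; the helper name `bounded` and the 1024-byte limit are made up for illustration. It bounds the whole input rather than a single record, so it does not address the expectation above that deserializing into one `f64` should stop after one field.

```rust
use std::io::{Read, Take};

/// Hypothetical helper (not part of rust-csv): cap the total number of
/// bytes that can ever be pulled from `source`.
fn bounded<R: Read>(source: R, max_bytes: u64) -> Take<R> {
    source.take(max_bytes)
}

/// Drain an endless stream of commas through the cap and report how many
/// bytes actually came out.
fn drain_comma_bomb(max_bytes: u64) -> usize {
    let mut input = bounded(std::io::repeat(b','), max_bytes);
    let mut buf = Vec::new();
    input
        .read_to_end(&mut buf)
        .expect("reading from an in-memory source should not fail")
}
```

Because `Take<R>` is itself an `io::Read`, the wrapped stream can be passed to `Reader::from_reader` in place of the raw one, at the cost of rejecting legitimately large uploads.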
I originally thought of this while working on ndarray-csv; it's pretty easy to deal with arrays that have too many rows, but I'm not sure what to do about arrays with too many columns.
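One possible way to guard against rows with too many columns, sketched under assumptions: wrap the input in a reader that fails once a single physical line exceeds a byte budget. `LineLimitReader` and its fields are hypothetical names, not anything rust-csv provides. It counts bytes between `\n` characters, so a newline inside a quoted field resets the counter; it limits line length, not record size.

```rust
use std::io::{self, Read};

/// Hypothetical adapter (not part of rust-csv): returns an error once a
/// single physical line grows past `max_line_bytes`.
struct LineLimitReader<R> {
    inner: R,
    max_line_bytes: usize,
    current_line_bytes: usize,
}

impl<R: Read> LineLimitReader<R> {
    fn new(inner: R, max_line_bytes: usize) -> Self {
        LineLimitReader { inner, max_line_bytes, current_line_bytes: 0 }
    }
}

impl<R: Read> Read for LineLimitReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = self.inner.read(buf)?;
        for &byte in &buf[..n] {
            if byte == b'\n' {
                // A newline ends the current line, so start counting afresh.
                self.current_line_bytes = 0;
            } else {
                self.current_line_bytes += 1;
                if self.current_line_bytes > self.max_line_bytes {
                    return Err(io::Error::new(
                        io::ErrorKind::InvalidData,
                        "line exceeds configured byte limit",
                    ));
                }
            }
        }
        Ok(n)
    }
}
```

Since the adapter implements `io::Read`, it could be handed to `Reader::from_reader` like any other stream. With the `repeat(b',')` input from the test above, the read fails after roughly `max_line_bytes` bytes instead of growing forever. How rust-csv reports the underlying I/O error to the caller is not covered here.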
By the way, the performance of this is awesome! I am able to fill up about a gigabyte of memory per second on my laptop! 🤣