The High-Performance Zero-Allocation CSV Parser for .NET
ParseZero is a specialized CSV parsing library focused on zero-allocation parsing. It uses `Span<T>` and `Memory<T>` to read data without creating strings until absolutely necessary, keeping GC pressure near zero.
- Zero Allocation - Uses `Span<T>`, `Memory<T>`, and `ArrayPool<T>` to minimize heap allocations
- SIMD Acceleration - Hardware-accelerated delimiter scanning with AVX2/SSE2 on .NET 8+
- IDataReader Support - Plug directly into `SqlBulkCopy.WriteToServer()` for fast database inserts
- Streaming - Process files of any size using `System.IO.Pipelines`
- Multi-Targeted - Supports both .NET 8+ and .NET Framework 4.7.2+ (via .NET Standard 2.0)
- Robust - Handles quoted fields, escaped quotes, BOM, and mixed line endings

Install the package:

```bash
dotnet add package ParseZero
```

Read a file row by row:

```csharp
using ParseZero;

// Simple row-by-row parsing
await foreach (var row in CsvReader.ReadAsync("data.csv"))
{
    int id = row[0].ParseInt32();
    string name = row[1].ToString();
    decimal price = row[2].ParseDecimal();
}
```

Map rows to a typed record with a schema:

```csharp
using ParseZero;
using ParseZero.Schema;

var schema = Schema.For<Trade>()
    .Column(t => t.Id)
    .Column(t => t.Symbol)
    .Column(t => t.Price, format: "C2")
    .Column(t => t.Timestamp, format: "yyyy-MM-dd HH:mm:ss");

await foreach (var trade in CsvReader.ReadAsync<Trade>("trades.csv", schema))
{
    Console.WriteLine($"{trade.Symbol}: {trade.Price}");
}

// In a top-level program, type declarations must come after the statements.
public record Trade(int Id, string Symbol, decimal Price, DateTime Timestamp);
```

Bulk-load a file into SQL Server with `SqlBulkCopy`:

```csharp
using ParseZero;
using ParseZero.Data;
using Microsoft.Data.SqlClient;

// connectionString and options (a CsvOptions instance) are defined elsewhere.
await using var connection = new SqlConnection(connectionString);
await connection.OpenAsync();

await using var reader = CsvDataReader.Create("large-dataset.csv", options);

using var bulkCopy = new SqlBulkCopy(connection);
bulkCopy.DestinationTableName = "Trades";
await bulkCopy.WriteToServerAsync(reader);
```

Configure parsing with `CsvOptions`:

```csharp
using System.Text;
using ParseZero;

var options = new CsvOptions
{
    Delimiter = ',',
    HasHeader = true,
    Encoding = Encoding.UTF8,
    BufferSize = 4096,
    MaxLineLength = 64 * 1024, // DoS protection
    TrimFields = true,
    AllowQuotedFields = true
};

await foreach (var row in CsvReader.ReadAsync("data.csv", options))
{
    // Process rows
}
```

ParseZero achieves zero allocation by:
- Buffer Pooling - Reuses byte arrays from `ArrayPool<T>`
- Span Slicing - Returns `ReadOnlySpan<char>` slices instead of allocating strings
- SIMD Scanning - Scans for delimiters 32 bytes at a time with AVX2 (16 bytes at a time with SSE2)
- Pipelines - Streams data through `System.IO.Pipelines` for optimal I/O

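
The pooling and slicing steps above can be illustrated with a minimal sketch that uses only the .NET base class library. It is not ParseZero's implementation: the class and method names (`SpanSlicingSketch`, `CountFields`, `GetField`, `ParseInt32Field`) are hypothetical, and quoted fields are ignored for brevity.

```csharp
using System;
using System.Buffers;
using System.Globalization;
using System.Text;

// Hypothetical names for illustration only; not part of the ParseZero API.
public static class SpanSlicingSketch
{
    // Counts the fields on one line by re-slicing the span at each delimiter.
    // No string or array is allocated. Quoted fields are not handled.
    public static int CountFields(ReadOnlySpan<char> line, char delimiter = ',')
    {
        int count = 1;
        int index;
        while ((index = line.IndexOf(delimiter)) >= 0)
        {
            count++;
            line = line.Slice(index + 1); // move past the delimiter without copying
        }
        return count;
    }

    // Returns the field at fieldIndex as a slice of the original line,
    // or an empty span when the line has fewer fields.
    public static ReadOnlySpan<char> GetField(ReadOnlySpan<char> line, int fieldIndex, char delimiter = ',')
    {
        for (int i = 0; i < fieldIndex; i++)
        {
            int index = line.IndexOf(delimiter);
            if (index < 0) return ReadOnlySpan<char>.Empty;
            line = line.Slice(index + 1);
        }
        int end = line.IndexOf(delimiter);
        return end < 0 ? line : line.Slice(0, end);
    }

    // Decodes UTF-8 bytes into a rented char buffer, parses one integer field
    // directly from the span, and returns the buffer to the pool.
    public static int ParseInt32Field(ReadOnlySpan<byte> utf8Line, int fieldIndex)
    {
        char[] rented = ArrayPool<char>.Shared.Rent(Encoding.UTF8.GetMaxCharCount(utf8Line.Length));
        try
        {
            int written = Encoding.UTF8.GetChars(utf8Line, rented);
            ReadOnlySpan<char> field = GetField(rented.AsSpan(0, written), fieldIndex);
            return int.Parse(field, NumberStyles.Integer, CultureInfo.InvariantCulture);
        }
        finally
        {
            ArrayPool<char>.Shared.Return(rented);
        }
    }
}
```

A string is created only if the caller asks for one, for example `SpanSlicingSketch.GetField("1,AAPL,189.50", 1).ToString()`. The span overload of `int.Parse` used here requires .NET Core 2.1 or later.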
Parsing 100,000 rows (10 columns each) on .NET 8.0 with AVX2:
| Method | Mean | Allocated | Alloc Ratio |
|---|---|---|---|
| ParseZero (ForEach) | 22.1 ms | 2.9 MB | 0.27x |
| ParseZero (File) | 23.1 ms | 2.9 MB | 0.27x |
| Sylvan.Data.Csv (File) | 25.8 ms | 4.3 MB | 0.42x |
| Sylvan.Data.Csv (Stream) | 38.6 ms | 15.4 MB | 1.51x |
| ParseZero (Stream) | 67.4 ms | 10.2 MB | 1.00x (baseline) |
Key takeaways:
- ParseZero's file-based parsing is about 10% faster than Sylvan.Data.Csv's (23.1 ms vs 25.8 ms) and allocates about 33% less memory (2.9 MB vs 4.3 MB)
- The `FieldCountOnly` mode achieves near-zero allocation (280 bytes for 100K rows)
- SIMD-accelerated delimiter scanning on .NET 8+ provides additional throughput

Benchmark: BenchmarkDotNet, Intel Core i7-10870H, .NET 8.0, Release build

Supported encodings:

- UTF-8 (with and without BOM)
- UTF-16 LE/BE (with BOM detection)
- ISO-8859-1 (Latin-1)
- Windows-1252
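
BOM detection for the Unicode encodings listed above can be sketched as follows. This is an illustrative example built on `System.Text.Encoding`, not ParseZero's own code; `BomSketch` and `DetectEncoding` are hypothetical names, and the caller supplies the fallback encoding used when no BOM is found.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration only; not part of the ParseZero API.
public static class BomSketch
{
    // Inspects the first bytes of a file and returns the encoding named by its
    // byte order mark, or the caller's fallback when there is none.
    // bomLength receives the number of bytes to skip before the first field.
    // UTF-32 BOMs are deliberately not checked in this sketch.
    public static Encoding DetectEncoding(ReadOnlySpan<byte> head, Encoding fallback, out int bomLength)
    {
        if (head.Length >= 3 && head[0] == 0xEF && head[1] == 0xBB && head[2] == 0xBF)
        {
            bomLength = 3;
            return Encoding.UTF8;
        }
        if (head.Length >= 2 && head[0] == 0xFF && head[1] == 0xFE)
        {
            bomLength = 2;
            return Encoding.Unicode; // UTF-16 LE
        }
        if (head.Length >= 2 && head[0] == 0xFE && head[1] == 0xFF)
        {
            bomLength = 2;
            return Encoding.BigEndianUnicode; // UTF-16 BE
        }
        bomLength = 0;
        return fallback;
    }
}
```

ISO-8859-1 and Windows-1252 carry no BOM, so they cannot be detected this way and have to be chosen explicitly. On .NET Core and later, `Encoding.GetEncoding(1252)` works only after `Encoding.RegisterProvider(CodePagesEncodingProvider.Instance)` has been called.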

Framework compatibility:

| Framework | SIMD Support |
|---|---|
| .NET 8.0+ | AVX2/SSE2 |
| .NET Standard 2.0 | Scalar only |
| .NET Framework 4.7.2+ | Scalar only |

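
The split between a SIMD path on .NET 8+ and a scalar path elsewhere could look roughly like the sketch below. It is an assumption about the general technique, not ParseZero's source: it uses the portable `Vector256` API instead of explicit AVX2/SSE2 intrinsics, and `DelimiterScanSketch.IndexOfDelimiter` is a hypothetical name.

```csharp
using System;
#if NET8_0_OR_GREATER
using System.Numerics;
using System.Runtime.Intrinsics;
#endif

// Hypothetical helper for illustration only; not part of the ParseZero API.
public static class DelimiterScanSketch
{
    // Returns the index of the first delimiter byte, or -1 if there is none.
    public static int IndexOfDelimiter(ReadOnlySpan<byte> data, byte delimiter)
    {
        int i = 0;
#if NET8_0_OR_GREATER
        if (Vector256.IsHardwareAccelerated)
        {
            Vector256<byte> needle = Vector256.Create(delimiter);
            // Compare 32 bytes per iteration while a full vector still fits.
            for (; i + Vector256<byte>.Count <= data.Length; i += Vector256<byte>.Count)
            {
                Vector256<byte> chunk = Vector256.Create(data.Slice(i, Vector256<byte>.Count));
                // One mask bit per byte lane; a set bit marks a delimiter.
                uint mask = Vector256.Equals(chunk, needle).ExtractMostSignificantBits();
                if (mask != 0)
                {
                    return i + BitOperations.TrailingZeroCount(mask);
                }
            }
        }
#endif
        // Scalar path: the tail of the buffer, or the whole buffer on targets
        // without the SIMD path (.NET Standard 2.0, .NET Framework).
        for (; i < data.Length; i++)
        {
            if (data[i] == delimiter) return i;
        }
        return -1;
    }
}
```

Both paths return the same result, so callers do not need to know which one ran. In application code, `ReadOnlySpan<byte>.IndexOf` is already vectorized by the runtime on modern .NET and is usually the simpler choice.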
MIT License - see LICENSE for details.
Contributions are welcome! Please read our Contributing Guide for details.