Skip to content

akeit0/Csv-CSharp

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

64 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Csv-CSharp

NuGet Releases

img

Csv-CSharp is a highly performant CSV (TSV) parser for .NET and Unity. It is designed to parse UTF-8 binaries directly and leverage Source Generators to enable serialization/deserialization between CSV (TSV) and object arrays with zero (or very low) allocation.

Installation

NuGet packages

Csv-CSharp requires .NET Standard 2.1 or higher. The package can be obtained from NuGet.

.NET CLI

dotnet add package CsvCSharp

Package Manager

Install-Package CsvCSharp

Unity

You can install Csv-CSharp in Unity by using NugetForUnity. For details, refer to the NugetForUnity README.

Quick Start

Csv-CSharp serializes/deserializes CSV data to and from arrays of classes/structs.

Define a class/struct and add the [CsvObject] attribute and the partial keyword.

[CsvObject]
public partial class Person
{
    [Column(0)]
    public string Name { get; set; }

    [Column(1)]
    public int Age { get; set; }
}

All public fields/properties of a type marked with [CsvObject] must have either the [Column] or [IgnoreMember] attribute. (An analyzer will output a compile error if it does not find either attribute on public members.)

The [Column] attribute can specify a column index as an int or a header name as a string.

To serialize this type to CSV or deserialize it from CSV, use CsvSerializer.

var array = new Person[]
{
    new() { Name = "Alice", Age = 18 },
    new() { Name = "Bob", Age = 23 },
    new() { Name = "Carol", Age = 31 },
}

// Person[] -> CSV (UTF-8)
byte[] csv = CsvSerializer.Serialize(array);

// Person[] -> CSV (UTF-16)
string csvText = CsvSerializer.SerializeToString(array);

// CSV (UTF-8) -> Person[]
array = CsvSerializer.Deserialize<Person>(csv);

// CSV (UTF-16) -> Person[]
array = CsvSerializer.Deserialize<Person>(csvText);

Serialize has an overload that returns a UTF-8 encoded byte[], and you can also pass a Stream or IBufferWriter<byte> for writing. Deserialize accepts UTF-8 byte arrays as byte[] and also supports string, Stream, and ReadOnlySequence<byte>.

The default supported types for fields are sbyte, byte, short, ushort, int, uint, long, ulong, char, string, Enum, Nullable<T>, DateTime, TimeSpan, and Guid. To support other types, refer to the Extensions section.

Serialization

The class/struct passed to CsvSerializer should have the [CsvObject] attribute and the partial keyword.

By default, fields and properties with the [Column] attribute are the targets for serialization/deserialization. The [Column] attribute is mandatory for public members, but you can target private members by adding the [Column] attribute.

[CsvObject]
public partial class Person
{
    [Column(0)]
    public string Name { get; set; }

    [Column(1)]
    int age;

    [IgnoreMember]
    public int Age => age;
}

To specify header names instead of indices, use a string key.

[CsvObject]
public partial class Person
{
    [Column("name")]
    public string Name { get; set; }

    [Column("age")]
    public int Age { get; set; }
}

To use member names as keys, specify [CsvObject(keyAsPropertyName: true)]. In this case, the [Column] attribute is not required.

[CsvObject(keyAsPropertyName: true)]
public partial class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

Note

Currently, deserialization with a specified constructor is not implemented. Types marked with [CsvObject] require a parameterless constructor. This feature is expected to be implemented by version 1.0.

CsvDocument

If you need to directly parse CSV fields, you can use CsvDocument.

var array = new Person[]
{
    new() { Name = "Alice", Age = 18 },
    new() { Name = "Bob", Age = 23 },
    new() { Name = "Carol", Age = 31 },
};

byte[] csv = CsvSerializer.Serialize(array);

// CSV (UTF-8) -> CsvDocument
var document = CsvSerializer.ConvertToDocument(csv);

foreach (var row in document.Rows)
{
    var name = row["Name"].GetValue<string>();
    var age = row["Age"].GetValue<int>();
}

Options

You can change CSV settings by passing CsvOptions to Serialize/Deserialize.

CsvSerializer.Serialize(array, new CsvOptions()
{
    HasHeader = true, // Include header row
    AllowComments = true, // Allow comments starting with '#''
    NewLine = NewLineType.LF, // Newline type
    Separator = SeparatorType.Comma, // Separator character
    QuoteMode = QuoteMode.Minimal, // Conditions for quoting fields (Minimal quotes only strings containing escape characters)
    FormatterProvider = StandardFormatterProvider.Instance, // ICsvFormatterProvider to use
});

CSV Specifications

The default settings of Csv-CSharp generally follow the specifications outlined in RFC 4180. However, please note that for performance and practicality reasons, some specifications may be disregarded.

  • The default newline character is LF instead of CRLF.
  • Records with a mismatch in the number of fields can be read without errors being output; missing fields will be set to their default values.

Extensions

Interfaces ICsvFormatter<T> and ICsvFormatterProvider are provided to customize field serialization/deserialization.

Use ICsvFormatter<T> for type serialization/deserialization. Here is an example of implementing a formatter for a struct wrapping an int.

public struct Foo
{
    public int Value;

    public Foo(int value)
    {
        this.Value = value;
    }
}

public sealed class FooFormatter : ICsvFormatter<Foo>
{
    public Foo Deserialize(ref CsvReader reader)
    {
        var value = reader.ReadInt32();
        return new Foo(value);
    }

    public void Serialize(ref CsvWriter writer, Foo value)
    {
        writer.WriteInt32(value.Value);
    }
}

Next, implement a formatter provider to retrieve the formatter.

public class CustomFormatterProvider : ICsvFormatterProvider
{
    public static readonly ICsvFormatterProvider Instance = new CustomFormatterProvider();

    CustomFormatterProvider()
    {
    }

    static CustomFormatterProvider()
    {
        FormatterCache<Foo>.Formatter = new FooFormatter();
    }

    public ICsvFormatter<T>? GetFormatter<T>()
    {
        return FormatterCache<T>.Formatter;
    }

    static class FormatterCache<T>
    {
        public static readonly ICsvFormatter<T> Formatter;
    }
}

You can set the created formatter provider in CsvOptions. The above CustomFormatterProvider only supports the Foo struct, so combine it with the standard formatter provider StandardFormatterProvider.

var array = new Foo[10];

// Create a composite formatter provider combining multiple formatter providers
var provider = CompositeFormatterProvider.Create(
    CustomFormatterProvider.Instance,
    StandardFormatterProvider.Instance
);

CsvSerializer.Serialize(array, new CsvOptions()
{
    FormatterProvider = provider
});

License

This library is released under the MIT license.

About

Fast CSV Serializer for .NET and Unity

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • C# 100.0%