b-maslennikov/CsvSorter

CsvSorter

CsvSorter is a small package that helps you sort large CSV files.

Installation

PM> Install-Package CsvSorter

Dependencies

| Package name | Version |
| --- | --- |
| CsvHelper | >=30.0.1 |

Available methods

| Method name | Parameter type | Description |
| --- | --- | --- |
| OrderBy<T> | string or int | Sets the type (T) of the CSV field to sort by and identifies that field by name (string) or by index (int). Sort direction: ascending. |
| OrderByDescending<T> | string or int | Sets the type (T) of the CSV field to sort by and identifies that field by name (string) or by index (int). Sort direction: descending. |
| Using | CsvConfiguration | Sets CsvConfiguration. See CsvHelper documentation |
| Using | TypeConverterOptions | Sets TypeConverterOptions. See CsvHelper documentation |
| Using | IIndexProvider<T> | Sets the index provider. Default: MemoryIndexProvider |
| ToFile | string | Saves sorted data to a file |
| ToWriter | TextWriter | Saves sorted data using the provided writer |

Usage example

using System.IO;
using CsvSorter;

new StreamReader(@"C:\my_large_file.csv")
    .OrderBy<int>("id")
    .ToFile(@"C:\my_large_file_sorted_by_id.csv");

Index provider

The default index provider is MemoryIndexProvider<T>, which stores index data in memory. You can create your own provider (a DatabaseIndexProvider, for example) by implementing the IIndexProvider<T> interface:

public interface IIndexProvider<T> where T: IComparable
{
    void Add(CsvSorterIndex<T> record);
    IEnumerable<CsvSorterIndex<T>> GetSorted(bool descending);
    void Clear();
}
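
As a rough illustration of what a custom provider could look like, the sketch below keeps records in a list and sorts them on demand. It is not taken from the library: ListIndexProvider<T> is a hypothetical name, and because this README does not show the members of CsvSorterIndex<T>, the stand-in type with its Value and Position properties is an assumption made only so the sketch compiles on its own. In a real project you would drop the stand-in type and the copied interface and reference the ones shipped with CsvSorter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// ASSUMPTION: stand-in for the library's CsvSorterIndex<T>. The real type's
// members are not shown in this README; Value and Position are illustrative.
public class CsvSorterIndex<T> where T : IComparable
{
    public T Value { get; set; } = default!;   // the sort key read from the CSV field
    public long Position { get; set; }         // where the record lives in the source
}

// Copied from the README so that the sketch is self-contained.
public interface IIndexProvider<T> where T : IComparable
{
    void Add(CsvSorterIndex<T> record);
    IEnumerable<CsvSorterIndex<T>> GetSorted(bool descending);
    void Clear();
}

// Hypothetical provider: collects index records in a list, sorts when asked.
public class ListIndexProvider<T> : IIndexProvider<T> where T : IComparable
{
    private readonly List<CsvSorterIndex<T>> _records = new List<CsvSorterIndex<T>>();

    public void Add(CsvSorterIndex<T> record) => _records.Add(record);

    // Enumerable.OrderBy is a stable sort, so records with equal keys
    // keep the order in which they were added.
    public IEnumerable<CsvSorterIndex<T>> GetSorted(bool descending) =>
        descending
            ? _records.OrderByDescending(r => r.Value)
            : _records.OrderBy(r => r.Value);

    public void Clear() => _records.Clear();
}
```

A provider like this would be passed to the sorter through Using, for example .Using(new ListIndexProvider<int>()). A database-backed or disk-backed provider would follow the same shape, replacing the list with its own storage.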

Events

You can attach handlers to four events: OnIndexCreationStarted, OnIndexCreationFinished, OnSortingStarted and OnSortingFinished:

// logger and writer are assumed to be defined elsewhere:
// any logging object and any TextWriter (for example a StreamWriter).
new StreamReader(@"C:\my_large_file.csv")
    .OrderBy<int>(0)
    .OnIndexCreationStarted(() => { logger.Info("Index creation has started"); })
    .OnIndexCreationFinished(() => { logger.Info("Index creation completed"); })
    .OnSortingStarted(() => { logger.Info("Sorting has started"); })
    .OnSortingFinished(() => { logger.Info("Sorting completed"); })
    .ToWriter(writer);

A few more examples

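using System.Globalization;
using System.IO;
using CsvHelper.Configuration;
using CsvHelper.TypeConversion;
using CsvSorter;

// The file has no header row, so the sort field is addressed by index.
var csvConfig = new CsvConfiguration(CultureInfo.InvariantCulture)
{
    HasHeaderRecord = false
};

// Dates in the sort field are parsed with the custom format dd_MM_yyyy.
var dateTimeConverterOptions = new TypeConverterOptions
{
    Formats = new[] { "dd_MM_yyyy" }
};

new StreamReader(@"C:\my_large_file.csv")
    .OrderByDescending<DateTime>(3)
    .Using(csvConfig)
    .Using(dateTimeConverterOptions)
    .ToFile(@"C:\my_large_file_sorted_by_date.csv");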
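
// Separate example: pipe-delimited input sorted by the "email" field.
// AzureIndexProvider<string> is assumed here to be a user-defined
// IIndexProvider<string> implementation; writer is any TextWriter defined elsewhere.
var pipeCsvConfig = new CsvConfiguration(CultureInfo.InvariantCulture)
{
    Delimiter = "|"
};

new StreamReader(@"C:\my_large_file.csv")
    .OrderBy<string>("email")
    .Using(pipeCsvConfig)
    .Using(new AzureIndexProvider<string>())
    .ToWriter(writer);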