😈 Lean mean SQL Server upsert machine
Switch branches/tags
Nothing to show
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
.github
Debaser.Tests
Debaser
artwork
scripts
tools
.gitignore
CHANGELOG.md
Debaser.sln
LICENSE.md
README.md
appveyor.yml

README.md

Debaser

Have you ever had the need to insert/update many rows in SQL Server? (i.e. "upsert" them)

Did you try to use the MERGE INTO ... syntax then? (and did you enjoy it?)

Did you know that ADO.NET has an API that allows for STREAMING rows to a temporary table in tempdb making SQL Server call a stored procedure for each row very quickly?

Did you know that Debaser can do these things for you?

How to do it?

Create a class that looks the way you want it to look:

class SomeDataRow
{
    public SomeDataRow(int id, decimal number, string text)
    {
        Id = id;
        Number = number;
        Text = text;
    }

    public int Id { get; }
    public decimal Number { get; }
    public string Text { get; }
}

(Debaser supports using a default constructor and properties with setters, or using a constructor with parameters matching the properties like this example shows)

and then you create the UpsertHelper for it:

var upsertHelper = new UpsertHelper<SomeDataRow>("db");

(where db in this case is the name of a connection string in the current app.config)

and then you do this once:

upsertHelper.CreateSchema();

(which will create a table, a table data type, and a stored procedure)

and then you insert 100k rows like this:

var rows = Enumerable.Range(1, 100000)
	.Select(i => new SomeDataRow(i, i*2.3m, $"This is row {i}"))
	.ToList();

var stopwatch = Stopwatch.StartNew();

await upsertHelper.Upsert(rows);

var elapsedSeconds = stopwatch.Elapsed.TotalSeconds;

Console.WriteLine($"Upserting {rows.Count} rows took {elapsedSeconds} - that's {rows.Count / elapsedSeconds:0.0} rows/s");

which on my machine yields this:

Upserting 100000 rows took 0.753394 - that's 132732.7 rows/s

which I think is fair.

Merging with existing data

By default each row is identified by its ID property. This means that any IDs matching existing rows will lead to an update instead of an insert.

Let's hurl 100k rows into the database again:

var rows = Enumerable.Range(1, 100000)
	.Select(i => new SomeDataRow(i, i*2.3m, $"This is row {i}"))
	.ToList();

await upsertHelper.Upsert(rows);

and then let's perform some iterations where we pick 10k random rows, mutate them, and upsert them:

var stopwatch = Stopwatch.StartNew();

var updatedRows = rows.InRandomOrder().Take(rowsToChange)
    .Select(row => new SomeDataRow(row.Id, row.Number + 1, string.Concat(row.Text, "-HEJ")));

await upsertHelper.Upsert(updatedRows);

which on my machine yields this:

Updating 10000 random rows in dataset of 100000 took average of 0.2 s - that's 57191.1 rows/s

which (again) I think is OK.

Maturity

Debaser is not mature. For now, it's just an experiment, and there's probably a bunch of data types that cannot be handled. Please use with caution.