@alfeg
Contributor

@alfeg alfeg commented May 14, 2025

Hi,

Thanks for the great library.

We use DuckDB for in-process analysis of large amounts of data from MSSQL. This involves moving billions of rows into DuckDB using the Appender.

Rider reported a large number of allocations from AppendValue().

I've rewritten GuidConverter to use direct byte manipulation instead of strings when converting DuckDBHugeInt from/to Guid. Under .NET 9.0 there are no allocations; under .NET 8.0 they are somewhat reduced. The speed gains are less noticeable.

BenchmarkDotNet v0.14.0, Windows 11 (10.0.26100.3915)
AMD Ryzen 7 7800X3D, 1 CPU, 16 logical and 8 physical cores
.NET SDK 9.0.300
  [Host]               : .NET 9.0.5 (9.0.525.21509), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  .NET 8.0             : .NET 8.0.16 (8.0.1625.21506), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  .NET 9.0             : .NET 9.0.5 (9.0.525.21509), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  .NET Framework 4.8.1 : .NET Framework 4.8.1 (4.8.9310.0), X64 RyuJIT VectorSize=256
| Method | Runtime | data | Mean | Error | StdDev | Median | Gen0 | Allocated |
|---|---|---|---|---|---|---|---|---|
| FromHugeInt | .NET 8.0 | DuckDBHugeInt[100] | 7.904 μs | 0.2789 μs | 0.8179 μs | 7.561 μs | 0.3815 | 19200 B |
| FromHugeIntNew | .NET 8.0 | DuckDBHugeInt[100] | 7.334 μs | 0.1393 μs | 0.2651 μs | 7.217 μs | 0.3815 | 19200 B |
| FromHugeInt | .NET 9.0 | DuckDBHugeInt[100] | 1.526 μs | 0.0302 μs | 0.0589 μs | 1.524 μs | - | - |
| FromHugeIntNew | .NET 9.0 | DuckDBHugeInt[100] | 1.412 μs | 0.0144 μs | 0.0113 μs | 1.411 μs | - | - |
| FromHugeInt | .NET Framework 4.8.1 | DuckDBHugeInt[100] | 26.506 μs | 0.4553 μs | 0.6953 μs | 26.564 μs | 3.1738 | 20059 B |
| FromHugeIntNew | .NET Framework 4.8.1 | DuckDBHugeInt[100] | 27.066 μs | 0.5328 μs | 0.9471 μs | 27.037 μs | 3.1738 | 20059 B |
| ToHugeInt | .NET 8.0 | Guid[100] | 3.693 μs | 0.0294 μs | 0.0246 μs | 3.679 μs | 0.1717 | 8800 B |
| ToHugeIntNew | .NET 8.0 | Guid[100] | 3.290 μs | 0.1059 μs | 0.3123 μs | 3.306 μs | 0.0763 | 4000 B |
| ToHugeInt | .NET 9.0 | Guid[100] | 2.884 μs | 0.0699 μs | 0.2050 μs | 2.871 μs | 0.1717 | 8800 B |
| ToHugeIntNew | .NET 9.0 | Guid[100] | 1.734 μs | 0.0468 μs | 0.1379 μs | 1.735 μs | - | - |
| ToHugeInt | .NET Framework 4.8.1 | Guid[100] | 13.161 μs | 0.2592 μs | 0.2774 μs | 13.037 μs | 1.5259 | 9628 B |
| ToHugeIntNew | .NET Framework 4.8.1 | Guid[100] | 4.632 μs | 0.0904 μs | 0.1110 μs | 4.591 μs | 0.6332 | 4012 B |

Source code of benchmark: DuckDbNet.GuidConverterBench.zip
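
For readers who want to see what "direct byte manipulation" can look like, here is a minimal sketch of the Guid-to-halves direction. It is not the code from this PR. The class name `GuidToHugeIntSketch` and the `(long Upper, ulong Lower)` tuple are hypothetical stand-ins for the library's `DuckDBHugeInt`, and the top-bit flip is an assumption about how DuckDB stores UUID values.

```csharp
using System;
using System.Buffers.Binary;

public static class GuidToHugeIntSketch
{
    // Splits a Guid into two 64-bit halves without going through a string.
    // The tuple is a hypothetical stand-in for DuckDBHugeInt's Upper/Lower pair.
    public static (long Upper, ulong Lower) ToParts(Guid guid)
    {
        Span<byte> bytes = stackalloc byte[16]; // stack buffer, no heap allocation
        guid.TryWriteBytes(bytes);

        // Guid.TryWriteBytes stores the first three fields little-endian
        // (4 + 2 + 2 bytes); the last 8 bytes are already in text order.
        ulong a = BinaryPrimitives.ReadUInt32LittleEndian(bytes);
        ulong b = BinaryPrimitives.ReadUInt16LittleEndian(bytes.Slice(4));
        ulong c = BinaryPrimitives.ReadUInt16LittleEndian(bytes.Slice(6));
        ulong upper = (a << 32) | (b << 16) | c;
        ulong lower = BinaryPrimitives.ReadUInt64BigEndian(bytes.Slice(8));

        // Assumption: DuckDB flips the top bit of the upper half for UUIDs.
        return (unchecked((long)(upper ^ 0x8000_0000_0000_0000UL)), lower);
    }
}
```

Reading the fields with `BinaryPrimitives` avoids both the intermediate hex string and any explicit byte swapping, which is one way the string allocations seen in the profiler could be removed.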

@coveralls
coveralls commented May 14, 2025

Pull Request Test Coverage Report for Build 15107804566

Details

  • 49 of 49 (100.0%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.2%) to 90.036%

Totals Coverage Status
Change from base Build 15029732985: 0.2%
Covered Lines: 2124
Relevant Lines: 2324

💛 - Coveralls

@Giorgi Giorgi requested a review from Copilot May 14, 2025 20:08
Copilot AI left a comment

Pull Request Overview

This PR updates the Guid conversion routines to reduce allocations by using direct byte manipulation for converting between DuckDBHugeInt and Guid types under .NET 6.0 or greater. Key changes include:

  • Using ArrayPool with direct byte conversion in ConvertToGuid.
  • Implementing a new ToHugeInt method that reconstructs a Guid's byte order via direct memory manipulation.
  • Maintaining backward compatibility for earlier .NET versions with the original string-based conversion.
Comments suppressed due to low confidence (1)

DuckDB.NET.Data/Extensions/GuidConverter.cs:59

  • [nitpick] Consider renaming 'ConvertToGuid' to 'ToGuid' to align with the naming convention used in 'ToHugeInt', enhancing consistency.
public static Guid ConvertToGuid(this DuckDBHugeInt input)

Copilot AI left a comment

Pull Request Overview

The PR replaces string-based GUID conversion with direct byte manipulation and ArrayPool usage to reduce allocations in GuidConverter.

  • Introduces an allocation-free ConvertToGuid implementation for .NET 6+ using ArrayPool<byte> and bit reordering.
  • Adds a new ToHugeInt method that writes GUID bytes into a 32-byte buffer and reorders them for DuckDB.
  • Retains the original fallback logic for earlier frameworks under !NET6_0_OR_GREATER.
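
The split between a byte-based path and a string-based fallback described in the overview can be sketched roughly as follows. This is an illustration only, not the merged code. `GuidFromHugeIntSketch` and `FromParts` are hypothetical names, the two parameters stand in for `DuckDBHugeInt`'s halves, and the top-bit flip is again an assumption about DuckDB's UUID storage.

```csharp
using System;
using System.Buffers.Binary;

public static class GuidFromHugeIntSketch
{
    // Rebuilds a Guid from two 64-bit halves (hypothetical stand-in for DuckDBHugeInt).
    public static Guid FromParts(long upper, ulong lower)
    {
        // Assumption: undo the top-bit flip DuckDB applies to UUID values.
        ulong raw = unchecked((ulong)upper) ^ 0x8000_0000_0000_0000UL;

#if NET6_0_OR_GREATER
        Span<byte> bytes = stackalloc byte[16]; // stack buffer, no heap allocation
        BinaryPrimitives.WriteUInt64BigEndian(bytes, raw);
        BinaryPrimitives.WriteUInt64BigEndian(bytes.Slice(8), lower);

        // The Guid constructor expects its first three fields little-endian,
        // so reverse the 4-, 2- and 2-byte groups in place.
        bytes.Slice(0, 4).Reverse();
        bytes.Slice(4, 2).Reverse();
        bytes.Slice(6, 2).Reverse();
        return new Guid(bytes);
#else
        // Fallback for older targets: build the 32-digit hex form and parse it.
        return new Guid(raw.ToString("x16") + lower.ToString("x16"));
#endif
    }
}
```

Both branches should yield the same Guid for the same input; the fallback simply pays for two temporary strings and a concatenation.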

@Giorgi Giorgi changed the base branch from develop to Guid-Converter May 29, 2025 08:46
@Giorgi Giorgi merged commit 585aea1 into Giorgi:Guid-Converter May 29, 2025
9 of 10 checks passed
@Giorgi
Owner

Giorgi commented May 29, 2025

@alfeg
Contributor Author

alfeg commented May 30, 2025

@Giorgi the links look to be the same, but I do like this implementation.

It seems to work nicely and fast.

  • ToHugeInt - the original
  • ToHugeIntNew - from this PR
  • ToHugeIntNewAlloc - the implementation you provided

| Method | Runtime | Mean | StdDev | Median | Ratio | Gen0 | Allocated | Alloc Ratio |
|---|---|---|---|---|---|---|---|---|
| ToHugeInt | .NET 8.0 | 63.66 us | 4.544 us | 62.99 us | baseline | 1.7090 | 88000 B | |
| ToHugeIntNew | .NET 8.0 | 31.43 us | 1.708 us | 30.79 us | -50% | 0.7935 | 40000 B | -55% |
| ToHugeIntNewAlloc | .NET 8.0 | 21.44 us | 1.908 us | 20.66 us | -66% | 1.8921 | 96000 B | +9% |
| ToHugeInt | .NET 9.0 | 24.13 us | 0.508 us | 23.92 us | -62% | 1.7395 | 88000 B | +0% |
| ToHugeIntNew | .NET 9.0 | 15.03 us | 0.069 us | 14.99 us | -76% | - | - | -100% |
| ToHugeIntNewAlloc | .NET 9.0 | 14.77 us | 0.392 us | 14.93 us | -77% | - | - | -100% |
| ToHugeInt | .NET Framework 4.8.1 | 185.40 us | 2.842 us | 185.72 us | +193% | 15.1367 | 96284 B | +9% |
| ToHugeIntNew | .NET Framework 4.8.1 | 48.72 us | 0.478 us | 48.83 us | -23% | 6.3477 | 40118 B | -54% |
| ToHugeIntNewAlloc | .NET Framework 4.8.1 | 49.19 us | 0.212 us | 49.19 us | -22% | 15.2588 | 96283 B | +9% |

I'm very surprised by how much of a difference .NET 9 makes compared to .NET 8.

@Giorgi
Owner

Giorgi commented May 30, 2025

What we could also do is allocate only 16 bytes and do in-place swapping instead of copying 16 bytes to another 16 bytes. Not sure if that will have any impact though.
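
To make the two options in this comment concrete, here is a small sketch comparing a copy into a second 16-byte buffer with an in-place swap on a single buffer. `GuidReorderSketch`, `ReorderCopy` and `ReorderInPlace` are hypothetical names, not methods from DuckDB.NET, and the byte mapping assumes the standard `Guid.ToByteArray`/`TryWriteBytes` layout (first three fields little-endian).

```csharp
using System;

public static class GuidReorderSketch
{
    // Source index for each destination byte when converting the Guid
    // byte layout into big-endian (text) order.
    private static ReadOnlySpan<byte> Map =>
        new byte[] { 3, 2, 1, 0, 5, 4, 7, 6, 8, 9, 10, 11, 12, 13, 14, 15 };

    // Variant 1: copy 16 source bytes into a separate 16-byte destination.
    public static void ReorderCopy(ReadOnlySpan<byte> source, Span<byte> destination)
    {
        for (int i = 0; i < 16; i++)
        {
            destination[i] = source[Map[i]];
        }
    }

    // Variant 2: one 16-byte buffer, reversing the three leading groups in place.
    public static void ReorderInPlace(Span<byte> buffer)
    {
        buffer.Slice(0, 4).Reverse();
        buffer.Slice(4, 2).Reverse();
        buffer.Slice(6, 2).Reverse();
    }
}
```

Both variants produce the same 16 bytes, so any difference would be in buffer size and copy cost only; whether that is measurable would have to be checked with a benchmark.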

@alfeg
Contributor Author

alfeg commented May 30, 2025

@Giorgi on .NET 8/9 the buffers are allocated on the stack, so I don't think we will be able to find any measurable difference there. My guess is that the compiler is currently able to unroll the loop and produce highly optimized code. I'm not sure we will be able to do better.

@Giorgi
Owner

Giorgi commented May 30, 2025

That's right, but I wonder why there is more memory allocation on .NET 8 with the new implementation, considering that the string is no longer allocated.

@alfeg
Contributor Author

alfeg commented May 30, 2025

@Giorgi it seems to be an issue in guid.TryWriteBytes(bytes) that was fixed in .NET 9, but I cannot confirm this.

On the bright side, it's now 3x faster.

@Giorgi
Owner

Giorgi commented May 30, 2025

@alfeg That's a great improvement, but it's kind of funny that we started with a memory allocation improvement and ended with a bit of a regression 😀

@Giorgi
Owner

Giorgi commented May 30, 2025

By the way, did you benchmark the other method as well?

@alfeg
Contributor Author

alfeg commented May 30, 2025

@Giorgi no I didn't.

By the way, thanks for this project!
