Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

.NET ElasticLowLevelClient 7.4.1 serializer extremely slow #4260

Closed
ewolfman opened this issue Dec 10, 2019 · 8 comments · Fixed by #4263
Closed

.NET ElasticLowLevelClient 7.4.1 serializer extremely slow #4260

ewolfman opened this issue Dec 10, 2019 · 8 comments · Fixed by #4263

Comments

@ewolfman
Copy link

NEST/Elasticsearch.Net version: 7.4.1

Elasticsearch version: 7.4.2

Description of the problem including expected versus actual behavior:
I recently switched from ES 6.8.5 to 7.4.2. I noticed that bulk loading became extremely slow. I narrowed down the problem to the .NET's ElasticLowLevelClient serialization.

For example, serializing the following code 100K times in ES 6 takes ~1 second where as in ES 7 takes about 70 seconds (!!). As I have millions of records, this becomes a real problem.

var o = es.Serializer.SerializeToString(new
{
    index = new
    {
        _index = indexName,
        _id = session.SESSION_ID
    }
}, SerializationFormatting.None);

I also tried using MemoryStreamFactory.Default for the serialization but results are the same.

When the serialized class is bigger, the performance is even worse.

Currently I temporarily (?) switched to Newtonsoft JSON and the results are satisfying.
Can you please advise? Is there anyway to get the previous serializer or its performance?

Steps to reproduce:
Run the code below with Elasticsearch.Net 7.4.1 and with Elasticsearch.Net 6.8.3.

With 6.8.3 the results are:

ES: 75ms

With 7.4.1 the results are:

ES: 6824ms

        static void Main(string[] args)
        {
            var settings = new ConnectionConfiguration(new Uri("http://localhost:9200"));

            var es = new ElasticLowLevelClient(settings);
            var sw = Stopwatch.StartNew();
            for (int i = 0; i < 10000; i++)
            {
                var o = es.Serializer.SerializeToString(new
                {
                    index = new
                    {
                        _index = "test",
                        _id = i
                    }
                }, SerializationFormatting.None);
            }
            sw.Stop();

            Console.WriteLine($"ES: {sw.ElapsedMilliseconds}ms");
            Console.ReadKey();
        }
@russcam
Copy link
Contributor

russcam commented Dec 10, 2019

Thanks for reporting @ewolfman, I can replicate your findings.

We'll need to investigate this further.

@russcam
Copy link
Contributor

russcam commented Dec 10, 2019

This looks to be isolated to the low level serializer implementation; the high level client serializer is unaffected. The output from

var settings = new ConnectionSettings(new Uri("http://localhost:9200"));
var es = new ElasticClient(settings);
var sw = Stopwatch.StartNew();
for (int i = 0; i < 10000; i++)
{
	es.SourceSerializer.SerializeToString(new
	{
		index = new
		{
			_index = "test",
			_id = i
		}
	}, MemoryStreamFactory.Default);
}

sw.Stop();

Console.WriteLine($"ES: {sw.ElapsedMilliseconds}ms");

is around 5ms on my machine. As a workaround for now, you may want to consider using the high level client serializer.

@Mpdreamz
Copy link
Member

On master

var es = new ElasticLowLevelClient(new ConnectionSettings(new Uri("http://localhost:9200")));
es.Serializer.SerializeToString(new { index = new { _index = "test", _id = 0 } }, MemoryStreamFactory.Default);
var sw = Stopwatch.StartNew();
for (var i = 0; i < 10000; i++)
{
	es.Serializer.SerializeToString(new
	{
		index = new
		{
			_index = "test",
			_id = i
		}
	}, MemoryStreamFactory.Default);
}

sw.Stop();

Console.WriteLine($"ES: {sw.ElapsedMilliseconds}ms");

Yields ES: 18ms

If i change from the high level ConnectionSettings to the lowlevel base ConnectionConfiguration

var es = new ElasticLowLevelClient(new ConnectionConfiguration(new Uri("http://localhost:9200")));
es.Serializer.SerializeToString(new { index = new { _index = "test", _id = 0 } }, MemoryStreamFactory.Default);
var sw = Stopwatch.StartNew();
for (var i = 0; i < 1000; i++)
{
	es.Serializer.SerializeToString(new
	{
		index = new
		{
			_index = "test",
			_id = i
		}
	}, MemoryStreamFactory.Default);
}

sw.Stop();

Console.WriteLine($"ES: {sw.ElapsedMilliseconds}ms");

And lower the iteration to 1000 I get 4588ms which is pretty abysmal.

In most of our tests we pass a high level subclass of ConnectionSettings called TestConnectionSettings to both the low and high level client which explains why this went undetected.

We'll look into getting this rectified asap thanks for reporting @ewolfman

@Mpdreamz
Copy link
Member

This is now fixed on our master and 7.x branches.

The low level client's serializer was new'ing a Formatter on each invocation which thus acted as a cache buster.

#4263 addresses this.

I've self pulled this in so the fix is available on our nightlies on the https://ci.appveyor.com/nuget/elasticsearch-net nuget feed.

Will wait for my Australian colleagues to review and release a patch when their timezone hits morning again.

@russcam
Copy link
Contributor

russcam commented Dec 10, 2019

I've tested the change in #4263 locally and concur that it resolves the issue. I'll release a 7.4.2 version to nuget now, with this fix.

@niemyjski
Copy link
Contributor

Nice find! I wonder if it's worth taking a second look over all the Expression Bodied Members

@russcam
Copy link
Contributor

russcam commented Dec 11, 2019

7.4.2 is now released with this fix. Thanks again, @ewolfman 👍

I wonder if it's worth taking a second look over all the Expression Bodied Members

I think that's a worthwhile venture for static properties 👍 Wonder if someone has written a Roslyn Analyzer for this already?

@ewolfman
Copy link
Author

Thanks for the quick fix. I confirm that it works well now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants