Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FileTarget - Faster async processing of LogEvents for the same file #1683

Merged
merged 4 commits into from
Nov 7, 2016

Conversation

snakefoot
Copy link
Contributor

@snakefoot snakefoot commented Oct 5, 2016

When performing async batch writing, then LogEvents destination (file or database) can be decided by a Layout.

The default situation is that all LogEvents should go to the same destination. Have changed BucketSort() to optimize for this case.

This skips allocation of Dictionary and inserting all LogEvents into a new List of this dictionary


This change is Reviewable

@304NotModified
Copy link
Member

Cool!

Could you send a test program for this to https://github.com/NLog/NLog-Performance-tests ?

@@ -61,14 +62,14 @@ internal static class SortHelpers
/// <returns>
/// Dictionary where keys are unique input keys, and values are lists of <see cref="AsyncLogEventInfo"/>.
/// </returns>
public static Dictionary<TKey, List<TValue>> BucketSort<TValue, TKey>(this IEnumerable<TValue> inputs, KeySelector<TValue, TKey> keySelector)
public static IDictionary<TKey, IList<TValue>> BucketSort<TValue, TKey>(this IEnumerable<TValue> inputs, KeySelector<TValue, TKey> keySelector)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK the usage of interfaces is always slower than the direct type. Not sure if this is also for these cases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any idea about this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes the IList prevents us from using the List-struct-enumerator. This is also why I have changed the iteration of a single bucket into a for-loop.

@codecov-io
Copy link

codecov-io commented Oct 5, 2016

Current coverage is 80% (diff: 84%)

Sunburst

No coverage report found for master at 97d6bc5.

Powered by Codecov. Last update 97d6bc5...c5c6425

@snakefoot
Copy link
Contributor Author

snakefoot commented Oct 5, 2016

When using the following setup with Tester1, then the new BucketSort gives 1.200.000 msg/sec (Before 1.000.000 msgs/sec). Together with less garbage allocation.

  <target xsi:type="BufferingWrapper"
      name="mipLoggertarget"
      slidingTimeout="false"
      bufferSize="10000"
      flushTimeout="0">
    <target name ="mipLoggertargetWrapped" xsi:type="File"
            fileName="c:/temp/Log/test1.log"
            layout="${message}"
            keepFileOpen="true" AutoFlush="true" ConcurrentWrites="false" CleanupFileName="false" />
  </target>

@304NotModified
Copy link
Member

That would be an very nice improvement!

@304NotModified
Copy link
Member

This also counts for the aysnc usage, isn't it?

@304NotModified
Copy link
Member

Reviewed 7 of 11 files at r1, 2 of 2 files at r2, 2 of 2 files at r3.
Review status: all files reviewed at latest revision, all discussions resolved.


Comments from Reviewable

@304NotModified 304NotModified added this to the 4.3.10 milestone Oct 5, 2016
@304NotModified
Copy link
Member

will try to test this, this week

@snakefoot
Copy link
Contributor Author

snakefoot commented Oct 5, 2016

Yes it will also improve the async-wrapper-performance. But because of the timer-throttle, then it is difficult to see this improvement (Requires very big batch-sizes). But together with #1669 (with sleeptime=0) then it should give a significant performance boost.

@304NotModified
Copy link
Member

Luckily #1669 has been merged :)

@snakefoot
Copy link
Contributor Author

snakefoot commented Oct 6, 2016

Just merged with master to try the async-wrapper. This PR now improves from 950.000 msgs/sec to 1.050.000 msgs/sec.

  <target name="mipLoggertarget" xsi:type="AsyncWrapper" overflowAction="Block" queueLimit="40000" batchSize="20000" timeToSleepBetweenBatches="0">
    <target name ="mipLoggertargetWrapped" xsi:type="File"
            fileName="c:/temp/Log/test1.log"
            layout="${message}"
            keepFileOpen="true" AutoFlush="true" ConcurrentWrites="false" CleanupFileName="false" />
  </target>

When using timeToSleepBetweenBatches="1" then this PR improves performance from 800.000 msgs/sec. to 850.000 msgs/sec.

…by skipping dictionary allocation and insert
@304NotModified
Copy link
Member

304NotModified commented Oct 10, 2016

Just merged with master to try the async-wrapper. This PR now improves from 950.000 msgs/sec to 1.050.000 msgs/sec.

When using timeToSleepBetweenBatches="1" then this PR improves performance from 800.000 msgs/sec. to 850.000 msgs/sec.

I have tested this case with master (639b40f) and pr/1683 (49d3e3d) and don't see any performance difference. Both will get around 240-260K/s. Any idea why?

start test with 1.000.000 messages
1.000.000,00 messages. Time taken: 4.096,90ms. 244.086,89 / sec

Release build, https://github.com/NLog/NLog-Performance-tests/tree/7d3bbe83db1c42195c78cb313203b8f792e6019d

update:
timeToSleepBetweenBatches="1" gives ~200-210K/s (also no significant difference).

@snakefoot
Copy link
Contributor Author

snakefoot commented Oct 10, 2016

I have modified the Tester1 program to just do logger.Debug("Line : 99999999"); , instead of adding extra allocations to the test.

I performed a test with the above async-wrapper config, and it improved from 1.000.000 msgs/sec to 1.100.000 msg/sec.

When using the buffer-wrapper-config then it improved from 1.000.000 msgs/sec to 1.200.000 msgs/sec.

If I add {threadname} into the layout (so the Logger-thread performs the rendering), then the buffer-wrapper improves from 900.000 msgs/sec to 1.000.000 msgs/sec.

When comparing in the profiler then the old BucketSort takes 3% of the total time, where the new takes 0.5% of the total time.

@304NotModified
Copy link
Member

I have modified the Tester1 program to just do logger.Debug("Line : 99999999"); , instead of adding extra allocations to the test.

So there is only a difference when using fixed values?

Which is a valid case, as long as the non-fixed values (the one I have tested), won't suffer - (and that is not the case, no performance impact, also not lower)

@snakefoot
Copy link
Contributor Author

snakefoot commented Oct 10, 2016

There is also an improvement when adding {threadname} to the layout, so not just fixed values (See previous comment)

@304NotModified 304NotModified modified the milestones: 4.3.10, 4.3.11 Oct 11, 2016
@snakefoot snakefoot force-pushed the SortBucketHelperDictionary branch 2 times, most recently from 8ec38dd to d5886b1 Compare October 16, 2016 21:07
@snakefoot
Copy link
Contributor Author

Removed code regarding ArraySegment, as this logic is for another PR

@304NotModified
Copy link
Member

Reviewed 11 of 11 files at r5.
Review status: all files reviewed at latest revision, 3 unresolved discussions.


src/NLog/Internal/SortHelpers.cs, line 95 at r5 (raw file):

        /// Dictionary where keys are unique input keys, and values are lists of <see cref="AsyncLogEventInfo"/>.
        /// </returns>
        public static ReadOnlySingleBucketDictionary<TKey, IList<TValue>> BucketSort<TValue, TKey>(this IList<TValue> inputs, KeySelector<TValue, TKey> keySelector)

Are you sure this code is used? I only see it in the tests? (reviewing in the webinterface)


src/NLog/Targets/FileTarget.cs, line 1019 at r5 (raw file):

                    LogEventInfo firstLogEvent = null;

                    for (int i = 0; i < bucket.Value.Count; ++i)

why ++i and not i++?


Comments from Reviewable

@snakefoot
Copy link
Contributor Author

src/NLog/Internal/SortHelpers.cs, line 95 at r5 (raw file):

Previously, 304NotModified (Julian Verdurmen) wrote…

Are you sure this code is used? I only see it in the tests? (reviewing in the webinterface)

It should have higher priority as it takes an IList (array) instead of IEnumerable. When debugging MultiFileWrite-test then it hits this method from FileTarget.

Comments from Reviewable

@snakefoot
Copy link
Contributor Author

src/NLog/Targets/FileTarget.cs, line 1019 at r5 (raw file):

Previously, 304NotModified (Julian Verdurmen) wrote…

why ++i and not i++?

Old C++ habits are hard to kill :)

Comments from Reviewable

@304NotModified
Copy link
Member

Review status: all files reviewed at latest revision, 2 unresolved discussions.


src/NLog/Internal/SortHelpers.cs, line 95 at r5 (raw file):

Previously, snakefoot (Rolf Kristensen) wrote…

It should have higher priority as it takes an IList (array) instead of IEnumerable. When debugging MultiFileWrite-test then it hits this method from FileTarget.

A yes, that sounds logical

src/NLog/Targets/FileTarget.cs, line 1019 at r5 (raw file):

Previously, snakefoot (Rolf Kristensen) wrote…

Old C++ habits are hard to kill :)

LOL. Please change it :)

Comments from Reviewable

@snakefoot
Copy link
Contributor Author

src/NLog/Targets/FileTarget.cs, line 1019 at r5 (raw file):

Previously, 304NotModified (Julian Verdurmen) wrote…

LOL. Please change it :)

Guess you should make a file search for "++i") in the NLog-source (with your favorite index-variable-name). Apparently not the first :)

But will change it.


Comments from Reviewable

@304NotModified
Copy link
Member

Review status: all files reviewed at latest revision, 1 unresolved discussion.


src/NLog/Targets/FileTarget.cs, line 1019 at r5 (raw file):

Previously, snakefoot (Rolf Kristensen) wrote…

Guess you should make a file search for "++i") in the NLog-source (with your favorite index-variable-name). Apparently not the first :)

But will change it.

Will do if im bored :p.

Comments from Reviewable

@snakefoot
Copy link
Contributor Author

src/NLog/Targets/FileTarget.cs, line 1019 at r5 (raw file):

Previously, 304NotModified (Julian Verdurmen) wrote…

Will do if im bored :p.

Done.

Comments from Reviewable

@snakefoot snakefoot force-pushed the SortBucketHelperDictionary branch 2 times, most recently from 4fdd7e1 to 8d01583 Compare October 31, 2016 22:15
…by skipping dictionary allocation and insert (cleanup)
@304NotModified
Copy link
Member

Reviewed 3 of 3 files at r6.
Review status: all files reviewed at latest revision, all discussions resolved.


Comments from Reviewable

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants