[C#] Could Disk Reads be FASTER? #238

Closed
ThiagoT1 opened this issue Feb 1, 2020 · 5 comments
@ThiagoT1
Contributor

ThiagoT1 commented Feb 1, 2020

I've got pretty much every basic operation covered in my POC.

Repo: https://github.com/ThiagoT1/FasterDictionary/tree/unsafe-0.0.3-fastread

Iterating through the whole KV has OK-ish performance.

public IAsyncEnumerator<KeyValuePair<TKey, TValue>> GetAsyncEnumerator(CancellationToken cancellationToken = default)
{
	return new AsyncEnumerator(this);
}
class AsyncEnumerator : IAsyncEnumerator<KeyValuePair<TKey, TValue>>
{
	FasterDictionary<TKey, TValue> _fasterDictionary;

	public ValueTask StartTask { get; }

	public AsyncEnumerator(FasterDictionary<TKey, TValue> fasterDictionary)
	{
		_fasterDictionary = fasterDictionary;
		StartTask = _fasterDictionary.AquireIterator();
	}

	KeyValuePair<TKey, TValue> _current;
	public KeyValuePair<TKey, TValue> Current => _current;

	public ValueTask DisposeAsync()
	{
		return _fasterDictionary.ReleaseIterator();
	}

	public async ValueTask<bool> MoveNextAsync()
	{
		await StartTask;

		_current = default;
		var readResult = await _fasterDictionary.IteratePair();
		if (!readResult.Found)
			return false;

		_current = new KeyValuePair<TKey, TValue>(readResult.Key, readResult.Value);
		return true;
	}
}

https://github.com/ThiagoT1/FasterDictionary/blob/96407cf3e476f612eb2eb72e895af4c6f9358850/src/FasterDictionary/FasterDictionary.Iterator.cs#L68-L71

Issue

What's killing me now is random-key read performance. I'm talking "mono-threaded" access to the KV, where a single reader has to retrieve values for thousands of random keys.

Adding, updating once, and iterating over 1M keys takes less than 2 minutes:

[InlineData(1_000_000)]
public async Task AddUpdateIterate(int loops)
{
	FasterDictionary<string[], string>.ReadResult result;
	using (var dictionary = new FasterDictionary<string[], string>(TestHelper.GetKeyComparer<string[]>(), GetOptions($"{nameof(AddIterate)}-{loops}")))
	{
		var keys = new List<string[]>();

		for (var i = 0; i < loops; i++)
		{
			keys.Add(new[] { i.ToString() });
			dictionary.Upsert(keys[i], (i + 1).ToString()).Dismiss();
		}

		await dictionary.Ping();

		for (var i = 0; i < loops; i++)
			dictionary.Upsert(keys[i], (i + 10).ToString()).Dismiss();

		await dictionary.Ping();

		var count = 0;
		await foreach (var entry in dictionary)
		{
			count++;
			Assert.Equal((int.Parse(entry.Key[0]) + 10).ToString(), entry.Value);
		}

		result = await dictionary.TryGet(new[] { loops.ToString() });
		Assert.False(result.Found);

		Assert.Equal(loops, count);
	}
}

https://github.com/ThiagoT1/FasterDictionary/blob/026248db7bfca8f5b72c22481f732babd8237c8e/tests/FasterDictionary.Tests/UseCases/ArrayAsKeyTests.cs#L157-L190

However, doing the same while replacing the iteration with a 1M key lookup takes 30 minutes:

		for (var i = 0; i < loops; i += step)
		{
			result = await dictionary.TryGet(keys[i]);
			Assert.True(result.Found);
			Assert.Equal((i + 10).ToString(), result.Value);
		}

https://github.com/ThiagoT1/FasterDictionary/blob/026248db7bfca8f5b72c22481f732babd8237c8e/tests/FasterDictionary.Tests/UseCases/ArrayAsKeyTests.cs#L136-L141

This is a regular SSD, with roughly 500 MB/s read/write.

Looking at the top CPU consumers in the VS 2019 profiler, the culprit is ClientSession<,,,,,>.CompletePending, which in turn points the finger at the Thread.Yield() inside FasterKV<,,,,,>.InternalCompletePending.


Replacing the Thread.Yield() with a SpinWait.SpinOnce() maintains functionality while greatly reducing CPU usage.
Throughput, however, stays low, since the real bottleneck is the granularity of disk reads (the 30-minute figure above is with this change in place).
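For context, the change under discussion boils down to the following pattern. This is only an illustrative sketch of the wait loop, not the actual FASTER source; the class, method, and parameter names here are made up.

using System;
using System.Threading;

static class CompletePendingSketch
{
    // Illustrative wait loop only; not the real FasterKV.InternalCompletePending code.
    public static void WaitForPendingReads(Func<bool> pendingDrained)
    {
        var spinner = new SpinWait();
        while (!pendingDrained())
        {
            // Original behavior: Thread.Yield() on every pass, which is where
            // the profiler attributes the CPU time.
            // Thread.Yield();

            // Suggested alternative: SpinOnce() backs off progressively and
            // eventually yields, so CPU usage drops while waiting on the I/O.
            spinner.SpinOnce();
        }
    }
}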


Use Cases

There are two scenarios where one of these strategies (mono-threaded iteration or massive random key lookups) comes into play:

1) Iterate through a subset of the key set (Read Only)

This assumes a function to compute all keys to iterate through is already in place (given a query or something).

In this case, every key in the subset has to be passed to KVSession.Read, which performs terribly. The KV user could instead fall back to iterating the whole KV and dismissing keys outside the subset, roughly as sketched below. Not great, though.
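For illustration, the full-scan workaround looks roughly like this. It's a sketch against the FasterDictionary wrapper and the string[] keys used in the tests above; the helper name and the keySubset parameter are assumptions.

using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

static class SubsetScanSketch
{
    // Workaround sketch: scan the whole KV and keep only the keys we care
    // about. Correct, but it still reads every record in the log.
    public static async Task<List<KeyValuePair<string[], string>>> ReadSubsetByFullScan(
        FasterDictionary<string[], string> dictionary,
        IEnumerable<string[]> keySubset)
    {
        var wanted = new HashSet<string>(keySubset.Select(k => k[0]));
        var results = new List<KeyValuePair<string[], string>>();

        await foreach (var entry in dictionary)
        {
            if (wanted.Contains(entry.Key[0]))
                results.Add(entry);
        }

        return results;
    }
}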

Proposal

The knowledge that multiple keys need to be retrieved should be passed to the KV, by means of a Read method that accepts multiple keys and returns an IAsyncEnumerable<ReadResult> that can be used to process all recovered values.

Such a KV method would use the key array to optimize disk access (leveraging address and segment locality), reading larger chunks less often whenever possible.
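To make the proposal concrete, the surface could look something like this. It is a sketch only; ReadMany and BatchReadResult are hypothetical names, not existing FASTER members.

using System.Collections.Generic;
using System.Threading;

// The point of this shape: the session sees all keys up front, so it can sort
// them by log address/segment and issue fewer, larger disk reads.
public interface IBatchReadSession<TKey, TValue>
{
    IAsyncEnumerable<BatchReadResult<TKey, TValue>> ReadMany(
        IReadOnlyList<TKey> keys,
        CancellationToken cancellationToken = default);
}

public readonly struct BatchReadResult<TKey, TValue>
{
    public BatchReadResult(TKey key, TValue value, bool found)
    {
        Key = key;
        Value = value;
        Found = found;
    }

    public TKey Key { get; }
    public TValue Value { get; }
    public bool Found { get; }
}

The caller would then drain the result with await foreach, processing hits and misses as they stream back while the session reorders the underlying I/O.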

2) Massive RMW of Single Keys Already on Disk

This assumes that as soon as an RMW operation starts, all reads of the same key should either retrieve the new value or wait for it (see below). It's much like the tombstone mark, which tells us the record is gone.

In this case, currently both the RMW issuer and the reader are slow, and (AFAIK) the reader may get an old value while we wait for the disk access (and the RMW continuation) to complete.

Proposal

RMWs should atomically mark the record as dirty and cause any Read of that key to yield a status meaning we are waiting on an RMW to complete. The reader could then either go ahead and read the (possibly) old value, or register a continuation for when the RMW completes, which would guarantee a read that is fresh up to that point.

This would also let the RMW issuer quickly keep issuing RMW commands for the next keys, since a continuation for the actual RMW operation can be registered and readers are guaranteed the option of waiting for the RMW to complete.

The code that actually does the RMW after the disk read could be provided by a new Functions instance, as per #230.
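One way the read side of this could look, purely as a sketch of the idea (the status value and the outcome type below are hypothetical, not existing FASTER types):

using System.Threading.Tasks;

// Sketch of the proposed contract; none of these members exist in FASTER today.
public enum ProposedReadStatus
{
    Ok,          // value read normally
    NotFound,    // key absent (or tombstoned)
    PendingRmw   // an RMW for this key was issued but has not been applied yet
}

public class ProposedReadOutcome<TValue>
{
    public ProposedReadStatus Status;

    // Available immediately, but possibly stale when Status == PendingRmw.
    public TValue Value;

    // Completes with the fresh value once the pending RMW lands; readers that
    // need read-your-write semantics await this instead of using Value.
    public Task<TValue> FreshValue;
}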

It should also be noted that FASTER was originally written without the TPL in mind, and full adherence to TPL semantics would see the Functions class callbacks gone. But that's probably for another day, isn't it? :)


Also, please note that the same tests without the updates complete in under 40 seconds. The detail here is that the updates change the value length on purpose, which forces the ConcurrentWriter callback to return false and in turn requires a new slot on the log.
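The mechanism, roughly: when the incoming value is longer than the one already in the log, the in-place update callback declines, and a fresh record is appended at the tail instead. The sketch below only approximates that callback; the VarLenValue type and its members are hypothetical stand-ins, not FASTER's or this repo's actual types.

public struct VarLenValue
{
    public int Length;
    public byte[] Bytes;

    public void CopyTo(ref VarLenValue dst)
    {
        dst.Length = Length;
        Bytes.CopyTo(dst.Bytes, 0);
    }
}

public class VarLenFunctionsSketch
{
    // Returning false signals that the update does not fit in place, so a new
    // slot is allocated at the tail of the log for the updated record.
    public bool ConcurrentWriter(ref VarLenValue src, ref VarLenValue dst)
    {
        if (src.Length > dst.Length)
            return false;        // longer value: cannot overwrite in place

        src.CopyTo(ref dst);     // same or shorter: update in place
        return true;
    }
}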

BTW, all of this is without serializers or the object log, using only the unsafe approach with variable-length types.

Thanks a lot!

@ThiagoT1 ThiagoT1 changed the title [C#] Could Cold Reads be FASTER? [C#] Could Disk Reads be FASTER? Feb 1, 2020
@badrishc
Contributor

There is a tension between sync and async code here. For sync, the better alternative is Thread.Yield (benchmarked in the past). But for async, the better alternative might be what you suggest (spin), although I have not benchmarked this yet. Is the observed cost (throughput) with async degraded enough to justify a change or a config option here?

@ThiagoT1
Contributor Author

Actually, async code already avoids the Thread.Yield().
So this could be left as it is, as I predict the async API will be far more popular, possibly even making all the sync code obsolete.

The other issues, though, like batching reads and chaining RMWs, are still alive.

@badrishc
Contributor

Then is this PR still relevant: #241?

Will consider the other comments separately.

@ThiagoT1
Contributor Author

I closed the PR.

@badrishc
Contributor

Closing, as we have optimized reads, which was the main goal here. We can create new issues if we plan to work on any of the other issues/ideas mentioned here.
