Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issues around IScanIteratorFunctions #829

Closed
Tornhoof opened this issue May 2, 2023 · 1 comment · Fixed by #826
Closed

Issues around IScanIteratorFunctions #829

Tornhoof opened this issue May 2, 2023 · 1 comment · Fixed by #826

Comments

@Tornhoof
Copy link
Contributor

Tornhoof commented May 2, 2023

PR #817 added the functionality of PushIterators.
I apologize in advance for the lengthy text, but I think I ran into a blocker, at least from my understanding, which prevents me from using the current version of FASTER.

I tried to migrate my Wrapper around Faster Cache, which used the pull iterator to iterate through the key/value pairs.
The wrapper type has a custom enumerator struct which basically looks like the example below, this is a typical .NET style Enumerator implementation, nothing special.
Now with the current build of Faster, the exception Cannot run pull Iterate() in the mutable region when locking is enabled; is thrown.
So far so good, let's convert the code at the end to to a Push Iterator.
Let's look at the example in the basics doc, https://github.com/microsoft/FASTER/blob/main/docs/_docs/20-fasterkv-basics.md:

struct MyScanFunctions : IScanIteratorFunctions<MyKey, MyValue>
{
    // ... Implementation ...
}

MyScanFunctions myScanFunctions = new();
using var iter = session.Iterate(ref myScanFunctions);

The above example does not work, the Iterate method, taking the ref to the IteratorFunctions returns bool:

public bool Iterate<Input, Output, Context, Functions, TScanFunctions>(Functions functions, ref TScanFunctions scanFunctions, long untilAddress = -1)
where Functions : IFunctions<Key, Value, Input, Output, Context>
where TScanFunctions : IScanIteratorFunctions<Key, Value>

Now, the enumeration method in the iterator is this code:

public bool Iterate<Input, Output, Context, Functions, TScanFunctions>(Functions functions, ref TScanFunctions scanFunctions, long untilAddress = -1)
where Functions : IFunctions<Key, Value, Input, Output, Context>
where TScanFunctions : IScanIteratorFunctions<Key, Value>
{
if (untilAddress == -1)
untilAddress = Log.TailAddress;
using FasterKVIterator<Key, Value, Input, Output, Context, Functions> iter = new(this, functions, untilAddress, isPull: false, loggerFactory: loggerFactory);
if (!scanFunctions.OnStart(iter.BeginAddress, iter.EndAddress))
return false;
long numRecords = 1;
bool stop = false;
for ( ; !stop && iter.PushNext(ref scanFunctions, numRecords, out stop); ++numRecords)
;
scanFunctions.OnStop(!stop, numRecords);
return !stop;
}

If you look closely at the code, this is basically more or less a .NET style Enumerator, but since the for loop is already integrated into that piece of code above, it is not possible to wrap the Iterate method in anything like a typical .NET enumerator as seen below, which is in your logic of pull/push iterators a pull iterator.

How to solve it?
We need access to the internal type FasterKVIterator<Key, Value, Input, Output, Context, Functions>, e.g. via another Iterate method IterateWithScanFunction, with a signature similar to this:

 public IFasterScanIterator<Key, Value> IterateWithScanFunction<Input, Output, Context, Functions>(Functions functions,  ref TScanFunctions scanFunctions, long untilAddress = -1)
            where Functions : IFunctions<Key, Value, Input, Output, Context>
           where TScanFunctions : IScanIteratorFunctions<Key, Value>

which then returns the Iterator, with isPull: false, then we can basically just copy the iteration loop and wrap it into a .NET style Enumerator. From a developer perspective this is far from ideal as internal logic then needs to be copied around into the dev's implementation, but developers using FASTER should be already aware that breaking changes are common enough.

Without such a method, conversion to a .NET style Enumerator requires either some Queueing mechanism (e.g. Channels) or blocking via a signalling mechanism, e.g. a Semaphore to allow the "current" value to be read from the outer enumerator, the first kinda defeats the idea of having iterators, because it just will push all the values into a temp list and then the outer enumerator consumes those values. The second one, via blocking is equally bad.

The examples for the push iterator, https://github.com/microsoft/FASTER/blob/main/cs/samples/StoreVarLenTypes/SpanByteSample.cs
and https://github.com/microsoft/FASTER/blob/main/cs/samples/ReadAddress/VersionedReadApp.cs
aren't really doing anything useful with iterated values, they basically just iterate the data and compare the count, they are more tests than an example on how to use Iterators.

Enumerator Example for the old pull iterator:

// please ignore DisposableKeyValuePair and IMemoryCacheKeyValue, they are just wrappes for the types to the stored in the cache.
public struct Enumerator : IEnumerator<DisposableKeyValuePair>, IEnumerator<IMemoryCacheKeyValue>
{
    private readonly FasterCache m_Cache;
    private readonly ClientSession m_Session;
    private IFasterScanIterator<SpanByte, SpanByte> m_Iterator;

    public Enumerator(FasterCache cache)
    {
        m_Cache = cache;
        m_Session = cache.GetSession();
        m_Iterator = m_Session.Iterate();
    }

    public bool MoveNext()
    {
        return m_Iterator.GetNext(out _);
    }

    public void Reset()
    {
        m_Iterator.Dispose();
        m_Iterator = m_Session.Iterate();
    }

    IMemoryCacheKeyValue IEnumerator<IMemoryCacheKeyValue>.Current => Current;

    public DisposableKeyValuePair Current
    {
        get
        {
            ref var key = ref m_Iterator.GetKey();
            ref var value = ref m_Iterator.GetValue();
            return new DisposableKeyValuePair(ref key, ref value);
        }
    }

    object IEnumerator.Current => Current;

    public void Dispose()
    {
        m_Iterator.Dispose();
        m_Cache.ReleaseSession(m_Session);
    }
}
@TedHartMS
Copy link
Contributor

Thanks for the detailed explanation. There is a pending PR that removes this restriction on pull iterators.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants