Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Theory MemberData to allow awaitable method #1698

Closed
simonetino opened this issue May 7, 2018 · 6 comments
Closed

Theory MemberData to allow awaitable method #1698

simonetino opened this issue May 7, 2018 · 6 comments

Comments

@simonetino
Copy link

Hello,
I have a typical test case scenario where I need to test each instance in a collection returned by an asynchronous call to my service class, which looks like the following:

public class DataService {
    ...
    public async Task<List<DataEntry>> GenerateData() {
        ...
    }
    ...
}

To do so, I am writing my test as a Theory with a MemberData using a method to retrieve all occurrences I need to test.
My test looks like the following:

[Theory]
[MemberData(nameof(GetDataEntries))]
public void Test_DataEntry(DataEntry entry) {
    // my assertions
    Assert.NotNull(entry);
    ...
}

public static IEnumerable<object[]> GetDataEntries() {
    var service = new DataService();
    List<DataEntry> entries = service.GenerateData().Result;
    return entries.Select(e => new object[] { e });
}

This is working fine, however in my GetDataEntries() static method, I am using .Result on my asynchronous service call.
Considering this service call can be, resource wise, quite expensive and that I have several other very similar test cases, I think it would be beneficial to allow usage of async/await to avoid blocking calls in my tests, like when my GetDataEntries() method gets called.
Ideally, my code would look like the following:

[Theory]
[MemberData(nameof(GetDataEntries))]
public void Test_DataEntry(DataEntry entry) {
    // my assertions
    Assert.NotNull(entry);
    ...
}

public static async Task<IEnumerable<object[]>> GetDataEntries() {
    var service = new DataService();
    List<DataEntry> entries = await service.GenerateData().ConfigureAwait(false);
    return entries.Select(e => new object[] { e });
}

However, this causes a compile time error as it looks such setup not to be allowed:

MemberData must reference a data type assignable to 'System.Collections.Generic.IEnumerable<object[]>'. The referenced type 'System.Threading.Tasks.Task<System.Collections.Generic.IEnumerable<object[]>>' is not valid.

Would not it be beneficial to allow usage of awaitable methods in MemberData?

For reference, I have opened a question on StackOverflow about this very same case:
StackOverflow - xUnit Theory with async MemberData

Thanks.

@simonetino simonetino changed the title Theory with async MemberData Theory MemberData to allow async/await on method May 7, 2018
@simonetino simonetino changed the title Theory MemberData to allow async/await on method Theory MemberData to allow awaitable method May 7, 2018
@bradwilson bradwilson mentioned this issue Aug 15, 2020
@Ilchert
Copy link

Ilchert commented Aug 25, 2020

Because you are going to use IAsyncDisposable it will be great to use IAsyncEnumerable<object[]> in this case.

bradwilson added a commit that referenced this issue Aug 28, 2021
@R0boC0p
Copy link

R0boC0p commented May 17, 2024

@bradwilson Hi, it's an old/closed issue I know, but I just stumbled on it and wanted to know, if there is any possibility in xunit v2 (2.7.0 for me) to have a Task awaited by the DataAttribute.GetData() ... somehow.
There is an answer in the SO link posted above, but looking at the v2 implementation of xunit, it is not apparent to me, how this would ever be possible.
Is this just a myth in v2 and only be avail in v3?

Thanks in advance for the clarification.

Cheers R0b

@bradwilson
Copy link
Member

@R0boC0p In v3, the change was fairly significant, because it encompasses a few new features. The real change exists in IDataDiscoverer.

v2:

IEnumerable<object[]> GetData(
    IAttributeInfo dataAttribute,
    IMethodInfo testMethod
);

v3:

ValueTask<IReadOnlyCollection<ITheoryDataRow>?> GetData(
    _IAttributeInfo dataAttribute,
    _IMethodInfo testMethod,
    DisposalTracker disposalTracker
);

I was comfortable make that kind of breaking change across major versions.

I obviously can't just change things like this in v2 without breaking everybody, so making this work would involve adding a second interface and second base class, and then updating the discovery logic to look for and support both. We couldn't make the two signatures identical, either, because the target frameworks for v2 don't support ValueTask (and I don't intend to back-port the new ITheoryDataRow interface). I've been reluctant to make these kinds of changes in v2, though obviously it would be possible to do them if needed.

Alternatively, they could be done via extensibility. It is the TheoryDiscoverer (and XunitTheoryTestCaseRunner) that finds and uses the IDataDiscoverer instances, so replacing [Theory] and it discoverer (and the runner, for non-serializable data) would allow someone to have purely async data sources. Let me see how difficult that would be, and maybe I can provide a sample that you could basically lift out and use as-is.

@bradwilson
Copy link
Member

@R0boC0p It wasn't as difficult as I thought it would be, because I didn't actually need to override as much as I expected to. I could make it work within the existing API surface area, mostly by "cheating" on the await needed for the data by forcing it off onto a thread pool thread and then blocking the synchronous data discovery system.

Sample lives here: https://github.com/xunit/samples.xunit/tree/main/AsyncDataExample

There's a lot of copied code in the AsyncMemberDataAttribute class, mostly because the existing attributes aren't good extensibility candidates, so I mostly had to steal code from the existing MemberDataAttribute and MemberDataAtributeBase classes. I also overrode the discoverer to turn off discovery data enumeration, on the assumption that anything that needs to be async is too heavy weight (and perhaps too much in need of initialization work) for discovery time.

Let me know if you have any questions.

@R0boC0p
Copy link

R0boC0p commented May 21, 2024

@bradwilson Thank you so much for your time and consideration! Taking a look at your code, it indeed is now able to get async member data, but as you already mentioned it is going to block the data discovery system.

What I am trying to solve:
I am working with x-unit and Autofixture. My Autofixture does a table setup for a local database, which is indirectly handled by the DataAttributes.GetData() method (The main reason I do it this way is the ease of use. I know this could be handled differently but also more boiler-plate involved which I am currently against). I think, that this might be a massive slowdown when starting the tests, as the fixtures setup is run synchronously and blocking. I assumed that in my case a fully async-capable system (like in v3 now), would bring a massive cut in execution time.
There is already a GetWaiter().GetResult() in my code and I think it doesn't really matter if its in the fixture setup (called by GetData()) or the a few calls earlier in the GetData() itself.

Autofixture is not compatible with v3 atm.

I hope I could make sense of myself.

Thanks again for your support!!

@bradwilson
Copy link
Member

I don't necessarily know whether an in-built async data retrieval would be any faster in v3, because we don't generally do data retrieval in parallel. We assume data retrieval done during discovery is effectively instant, and that any long-duration data retrieval will get marked by the user as delayed to execution time so as not to slow down the discovery process.

Parallel data discovery would also only help you here if you're doing this database setup process many times (as opposed to once), and even then the parallelism would likely be gated by how parallel the database engine itself is. Merely reading data from existing database tables would obviously be something that could probably parallelize okay, but if you're actually doing data setup where you're manipulating the tables to put them into a consistent state, then parallelization would potentially run afoul where you might be trying to mutate the same table(s) at the same time and end up colliding with yourself.

I'd like to ponder on the idea of parallelizing data discovery before committing to it, because I can see there are at least as many potential pitfalls as benefits, and for the vast majority of users who are creating synthetic data (the majority via [InlineData], I suspect), such parallelization might not actually provide any benefit, and instead cost in terms of absolute performance due to the overhead. Parallelization really requires something truly asynchronous (like your database example) and/or things which are exceptionally compute intensive. Neither of those describe the typical data sources in unit tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants