Create analyzer for non-serializable data in Theory
#2866
Hi Andrew.
Problem
Could you post a code snippet to demonstrate the issue? Did your test method have a dictionary-type parameter declaration? Or did it have an
For context, could you clarify why this problem was a source of pain and how a solution would be valuable? I'm guessing it has to do with the fact that, if a theory test method has any non-serializable test case value, then none of its test cases will be enumerated in the test runner, collapsing them into a single item for the test method as a whole. As a result, you will need to run all of the method's test cases together every time, and you won't be able to run a single test case or subset of test cases in isolation, which might be especially problematic for longer-running tests or test methods with many test cases.
Solution
Could you clarify what you had in mind for a diagnostic produced by such an analyzer? Here are a few alternatives I imagine you might have in mind.
Here are my thoughts on each of these. Am I missing anything? If anyone has any additional thoughts or suggestions, please let me know. |
1. Diagnostic on a theory method parameter
Relatively straightforward to implement, but sometimes would produce less useful diagnostics.
Expected diagnostics
[Theory]
[InlineData(...)]
[MemberData(...)]
[ClassData(...)]
public void IntegerTestMethod(int number) // No diagnostic
{ ... }
[Theory]
[MemberData(...)]
[ClassData(...)]
public void DictionaryTestMethod(Dictionary<int, string> dictionary) // DIAGNOSTIC
{ ... }
[Theory]
[InlineData(...)] // Test cases might actually be serializable
[MemberData(...)] // Test cases might actually be serializable
[ClassData(...)] // Test cases might actually be serializable
public void ObjectTestMethod(object? parameter) // DIAGNOSTIC, with possible false positives
{ ... }
[Theory]
[InlineData(...)] // Test cases might actually be serializable
[MemberData(...)] // Test cases might actually be serializable
[ClassData(...)] // Test cases might actually be serializable
public void EnumerableTestMethod(IEnumerable<int> numbers) // DIAGNOSTIC, with possible false positives
{ ... } |
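To illustrate why a diagnostic based only on the declared parameter type can produce false positives, here is a minimal sketch in plain C# with no xUnit.net dependency. The names `ParameterVersusValueSketch` and `IsAssumedSerializable` are hypothetical, and treating only `null`, `int`, `bool`, and `string` as serializable is a simplifying assumption for illustration, not xUnit.net's actual rule. The point is that a parameter declared as `object?` can receive both kinds of values, so the declared type alone cannot settle the question.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

public static class ParameterVersusValueSketch
{
    // Assumption for illustration only: null, int, bool, and string count as serializable.
    // The set of types xUnit.net can actually serialize is larger.
    public static bool IsAssumedSerializable(object? value) =>
        value is null or int or bool or string;

    // Test cases as they might be supplied to a theory whose parameter is declared as object?.
    public static IEnumerable<object?[]> TestCases()
    {
        yield return new object?[] { 1 };
        yield return new object?[] { "Text" };
        yield return new object?[] { new Dictionary<int, string>() };
    }

    // Returns true only when every argument of every test case passes the assumed check.
    public static bool AllSerializable(IEnumerable<object?[]> testCases)
    {
        foreach (var testCase in testCases)
            foreach (var argument in testCase)
                if (!IsAssumedSerializable(argument))
                    return false;
        return true;
    }
}
```

Under these assumptions, the first two test cases pass the check and only the dictionary fails it, even though all three are valid arguments for the same `object?` parameter.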
2. Diagnostic on a theory data attribute
Very ambitious and complex to implement, but would generally produce more useful diagnostics.
public static IEnumerable<object[]> TestCases => GetTestCases();
private static IEnumerable<object[]> GetTestCases() => new object[][] { GetTestCase1(), GetTestCase2() };
private static object[] GetTestCase1() => new object[] { GetValue1() };
private static object[] GetTestCase2() { var array = new object[1]; array[0] = GetValue2(); return array; }
private static object GetValue1() => 1;
private static object GetValue2() => GetCollection(array: false);
private static object GetCollection(bool array) => array ? new int[0] : new Dictionary<int, string>();
Alternative suggestion
Maybe such an analyzer could have more modest ambitions?
Expected diagnostics
public enum NonSerializableEnum { Zero } // Enumeration from a non-local assembly
public class SerializableClass : IEnumerable<object[]>
{
public IEnumerator<object[]> GetEnumerator()
{
yield return new object[] { 1 };
yield return new object[] { false };
yield return new object[] { "Text" };
}
IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
public class NonSerializableClass : IEnumerable<object[]>
{
public IEnumerator<object[]> GetEnumerator()
{
yield return new object[] { 1 };
yield return new object[] { false };
yield return new object[] { "Text" };
yield return new object[] { NonSerializableEnum.Zero };
yield return new object[] { new Dictionary<int, string> { { 1, "1" }, { 2, "2" } } };
}
IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
public class TestClass
{
public static IEnumerable<object[]> SerializableMember =>
[
[1],
[false],
["Text"]
];
public static IEnumerable<object[]> NonSerializableMember =>
[
[1],
[false],
["Text"],
[NonSerializableEnum.Zero],
[new Dictionary<int, string> { { 1, "1" }, { 2, "2" } }]
];
[Theory]
[InlineData(1)]
[InlineData(false)]
[InlineData("Text")]
[MemberData(nameof(SerializableMember))]
[ClassData(typeof(SerializableClass))]
public void SerializableTestMethod(object? parameter)
{ ... }
[Theory]
[InlineData(NonSerializableEnum.Zero)] // DIAGNOSTIC
[InlineData(new NonSerializableEnum[] { NonSerializableEnum.Zero })] // DIAGNOSTIC
[MemberData(nameof(NonSerializableMember))] // DIAGNOSTIC
[ClassData(typeof(NonSerializableClass))] // DIAGNOSTIC
public void NonSerializableTestMethod(object? parameter)
{ ... }
} |
3. Diagnostic on a test case argument expression
Similar assessment as for section 2.
Alternative suggestion
Maybe such an analyzer could have more modest ambitions?
Expected diagnostics
public enum NonSerializableEnum { Zero } // Enumeration from a non-local assembly
public class SerializableClass : IEnumerable<object[]>
{
public IEnumerator<object[]> GetEnumerator()
{
yield return new object[] { 1 };
yield return new object[] { false };
yield return new object[] { "Text" };
}
IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
public class NonSerializableClass : IEnumerable<object[]>
{
public IEnumerator<object[]> GetEnumerator()
{
yield return new object[] { 1 };
yield return new object[] { false };
yield return new object[] { "Text" };
yield return new object[]
{
NonSerializableEnum.Zero // DIAGNOSTIC
};
yield return new object[]
{
new Dictionary<int, string> { { 1, "1" }, { 2, "2" } } // DIAGNOSTIC
};
}
IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
public class TestClass
{
public static IEnumerable<object[]> SerializableMember =>
[
[1],
[false],
["Text"]
];
public static IEnumerable<object[]> NonSerializableMember =>
[
[1],
[false],
["Text"],
[
NonSerializableEnum.Zero // DIAGNOSTIC
],
[
new Dictionary<int, string> { { 1, "1" }, { 2, "2" } } // DIAGNOSTIC
]
];
[Theory]
[InlineData(1)]
[InlineData(false)]
[InlineData("Text")]
[MemberData(nameof(SerializableMember))]
[ClassData(typeof(SerializableClass))]
public void SerializableTestMethod(object? parameter)
{ ... }
[Theory]
[InlineData(
NonSerializableEnum.Zero // DIAGNOSTIC
)]
[InlineData(
new NonSerializableEnum[] { NonSerializableEnum.Zero } // DIAGNOSTIC
)]
[MemberData(nameof(NonSerializableMember))]
[ClassData(typeof(NonSerializableClass))]
public void NonSerializableTestMethod(object? parameter)
{ ... }
} |
@andrewlock wrote:
It's possible, but the difficulty depends heavily on the code in question. Here are just two scenarios:
You use
The simplest version is when you use
public static IEnumerable<object[]> MyDataSource()
{
    yield return new object[] { new object() };
}
We can see the type from
Now add a layer of indirection:
public static IEnumerable<object[]> MyDataSource()
{
    var data = new object();
    yield return new object[] { data };
}
I think Roslyn would be able to tell us the type of
I think this is doable here. Next up...
You use
Let's convert the second sample:
public static IEnumerable<object[]> MyDataSource()
{
    var result = new List<object[]>();
    var data = new object();
    result.Add(new[] { data });
    return result;
}
Here's where I'm not sure how complex everything gets. We start by looking at
You can see how this becomes complicated very quickly. We can consider adding some of the simpler scenarios, but it would be helpful to have real-world examples of situations you're running into, and then it would be easier to understand whether that's a scenario that's easily covered by an analyzer or not. It's not easy to solve this hypothetically without knowing the actual code that you're using and hoping we can catch. The lowest-hanging fruit (easiest for us to identify) would be if you are using
Interested in hearing thoughts. |
I think this is a good idea. Doing this, if nothing else, would strike a good balance between value provided and effort required to implement. Here are some of my thoughts. Let me know what you think, and please point out anything I might have gotten wrong.
Possibly serializable types vs. definitely non-serializable types
Yes, you're right. But it doesn't stop there. Any interfaces, unsealed classes, and unsealed records that are not themselves serializable might be used as
Here are a few examples of possibly serializable types:
On the other hand, any structs, record structs, sealed classes, and sealed records that are non-serializable are definitely non-serializable. Plus various other type kinds are never serializable, such as delegates and pointers (if I'm not mistaken), etc.
Definitely a problem vs. possibly a problem
For a large number of possible
It seems to me that completely ignoring these cases, and providing a diagnostic only for definitely non-serializable types, would significantly take away from the value that this analyzer would provide. On the other hand, I'm reluctant to annoy developers with excessive warnings that may include many false positives. Some developers will know what they are doing, and will intentionally use only serializable values in test cases when using polymorphic
But of course, if the diagnostics are obtrusive, developers can manually suppress or configure them according to their preferences. I'd suggest providing a diagnostic for both non-serializable and possibly serializable
Diagnostic severity
Do you think
No diagnostics when discovery enumeration is disabled
One key point to consider is that, if a developer has explicitly set
Statically analyzing serializability of enumerations
Can static analysis determine whether an
But analyzers are banned from using reflection, so another approach would be needed. I did not find any way of determining this through static analysis. Is there some way of doing this? If not, then since we can never know whether they're local or from the GAC, would it be best to ignore enumeration types and not provide diagnostics for them? |
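The distinction drawn in this comment between possibly serializable and definitely non-serializable types can be sketched roughly as follows. This is only an illustration of the classification logic, under loud assumptions: it uses run-time reflection on `System.Type`, which an analyzer could not do (an analyzer would have to consult compiler symbol information instead); the names `Serializability` and `SerializabilitySketch` are hypothetical; and the `AssumedSerializable` set is a tiny placeholder, not xUnit.net's actual list. Arrays and enumerations are deliberately not handled here.

```csharp
using System;
using System.Collections.Generic;

public enum Serializability { Serializable, PossiblySerializable, DefinitelyNonSerializable }

public static class SerializabilitySketch
{
    // Placeholder set, assumed for illustration; the real list of serializable types is larger.
    private static readonly HashSet<Type> AssumedSerializable =
        new() { typeof(int), typeof(bool), typeof(string) };

    public static Serializability Classify(Type type)
    {
        if (AssumedSerializable.Contains(type))
            return Serializability.Serializable;

        // Pointers and delegates are treated as never serializable.
        if (type.IsPointer || typeof(Delegate).IsAssignableFrom(type))
            return Serializability.DefinitelyNonSerializable;

        // An interface or unsealed class might hold a serializable value at run time,
        // for example an instance of a derived or implementing type.
        if (type.IsInterface || (type.IsClass && !type.IsSealed))
            return Serializability.PossiblySerializable;

        // Structs and sealed classes outside the assumed set cannot be anything else.
        return Serializability.DefinitelyNonSerializable;
    }
}
```

Under these assumptions, `Classify(typeof(object))` and `Classify(typeof(IEnumerable<int>))` land in the possibly serializable group, while `Classify(typeof(Action))` is definitely non-serializable. Having two separate outcomes is what would let a "possibly" rule be configured independently of a "definitely" rule.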
Sorry for the delay in responding to this. I will look at the PR today.
This sounds reasonable. These should be reported as two different diagnostics so that developers can easily disable the rule about "possibly not serializable value" without having to give up the "definitely not serializable value" rule. The rule text should also include some language about why, so maybe the wording could be something like:
Since the issue is one of usability (in Test Explorer) and not one of correctness (since the tests run correctly), I think
Definitely agree with this (and a
A quick and obvious optimization: there is only a GAC with .NET Framework, so this test is unnecessary unless the test project targets .NET Framework. As for a definitive determination? I'm not aware of one. While checking an enum, we could try to see if Roslyn has source-level access to the enum's definition (assuming that's doable and cheap), in which case we'd know it was safe since it came from the test project itself or one of the projects it references. Otherwise, everything else would fall into the "not sure" category. |
No problem!
Sounds good.
Yes, I noticed that and wondered why
Alright, I'll work on adding logic to ignore methods with one or more
Though what about
Alright, I'll look into this some more. |
I agree, though file I/O is supposedly prohibited in analyzers, so I'm not sure how you figure this out (not to mention the logic for configuration files, while not super complex, is also not exactly trivial). |
Fixed in xunit/xunit.analyzers#183 |
We just discovered that someone had used non-serializable data (a dictionary) in a Theory test case, because we happened to look at the CI logs.
It would be nice if there was an analyzer that could warn you about this ahead of time. I had a look around but couldn't find any mention of one.
Thanks!