[UnitTests] Parametrized Unit Tests #7

Lunderberg · 2021-06-09T17:16:40Z

Migrating earlier conversation from https://discuss.tvm.apache.org/t/rfc-parametrized-unit-tests/9946. RFC should be up to date with all changes/conversations that occurred in that thread.

@areusch Let's continue any discussion here, to avoid having multiple places to check for the same RFC.

areusch · 2021-06-10T18:53:54Z

@Lunderberg I think there were two open items from the discuss post:

1. shadowing local vars with module-level vars containing their fixtures
2. caching non-serializable data

on 1, it sounds like you had found a solution, so I think we're resolved there.

on 2, you'd suggested to table the discussion of how to cache non-serializable objects and just cache serializable ones. that sounds great to me.

so then, the real reason I was having this debate with you: i'd like to propose that in all cases (cache hot or cold), we always provide test functions with a brand-new deserialized copy of the data. i think that solves all of my concerns around tests poisoning downstream tests with cached data, and hopefully has not-too-significant performance impact. what are your thoughts?

Lunderberg · 2021-06-11T15:33:57Z

@areusch

I think that makes sense for now, and is a good way to maintain separation between unit tests. Running through copy.deepcopy would be the easiest way to do the serialize/deserialize steps on arbitrary data. It throws a TypeError if a non-serializable value (e.g. tvm.nd.array) is passed, so that gives a good way to have the serializable-only restriction be checked by the CI.
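A minimal sketch of the copy-on-read behavior discussed above, using only the standard library. The `cached_with_copy` decorator and both fixture functions are hypothetical names for illustration, not the API in the PR, and a `threading.Lock` stands in for a non-copyable value such as tvm.nd.array.

```python
import copy
import functools
import threading


def cached_with_copy(func):
    """Hypothetical helper: compute each result once, hand out deep copies."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        # Copy on every read, whether the cache was hot or cold.
        return copy.deepcopy(cache[args])

    return wrapper


setup_calls = []


@cached_with_copy
def expensive_setup(n):
    setup_calls.append(n)  # record how often the real work runs
    return {"values": list(range(n))}


@cached_with_copy
def uncopyable_setup():
    return threading.Lock()  # locks cannot be deep-copied


first = expensive_setup(3)
first["values"].append(99)   # one test mutates its own copy
second = expensive_setup(3)  # the next test still receives clean data

try:
    uncopyable_setup()
    copy_error = None
except TypeError as err:
    copy_error = err  # surfaces as a failed setup rather than shared state
```

The mutation made through `first` never reaches `second`, and the expensive function body runs only once.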

I ran a few quick benchmarks for numpy arrays, since I think those would be the most frequently cached result. copy.deepcopy is a little bit slower than calling dedicated copying functions (e.g. my_numpy_array.copy()), but wouldn't require any type-checking on our part.

areusch · 2021-06-11T15:45:11Z

@Lunderberg I was thinking you could just require tests provide bytes to be cached.

Lunderberg · 2021-06-11T22:33:07Z

How would we deserialize the objects in that case? I thought that __bytes__ was a one-way conversion from an object to a sequence of bytes, and didn't require objects to be deserializable.
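To illustrate the question, here is a small sketch with a hypothetical `Shape` class. Python defines `__bytes__` for the object-to-bytes direction only, so the reverse direction (`shape_from_bytes` here, also hypothetical) has to be written by hand for each type.

```python
class Shape:
    """Hypothetical example type, used only for illustration."""

    def __init__(self, rows, cols):
        self.rows = rows
        self.cols = cols

    def __bytes__(self):
        # One-way hook: there is no standard inverse of __bytes__.
        return f"{self.rows}x{self.cols}".encode("ascii")

    def __eq__(self, other):
        return (
            isinstance(other, Shape)
            and (self.rows, self.cols) == (other.rows, other.cols)
        )


def shape_from_bytes(data):
    """Hand-written decoder that a bytes-only cache would need per type."""
    rows, cols = data.decode("ascii").split("x")
    return Shape(int(rows), int(cols))


shape = Shape(2, 3)
raw = bytes(shape)
restored = shape_from_bytes(raw)
```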

Lunderberg · 2021-06-18T21:33:02Z

@areusch I've added verbiage to the RFC to indicate use of copy.deepcopy for the results of cached fixtures, and have updated the PR with the copying. The error message if a non-serializable type is cached gives a link to this RFC, to avoid any confusion in the future.

areusch · 2021-06-25T15:34:42Z

.gitignore

@@ -0,0 +1,2 @@
+# Emacs temporary files


i think these should go in your global .gitignore not the project-specific one

Fair enough, and I can remove it if that is preferable. I try to avoid using the global .gitignore in most cases, so that git add . is safe to use in general, and not just on my environment. The tvm repo already has this line in its .gitignore, so I figured it was good to add here. (Though in retrospect it probably should have been a separate PR.)

okay, please remove this from the change then. we can debate it on another RFC

Lunderberg · 2021-06-29T18:49:51Z

Copying some notes from a 1:1, along with commentary at the bottom.

Caching options listed below, arranged from most flexibility at the top to least flexibility at the bottom.

1. All caching allowed. A fixture tagged with cache_return_value=True can return any python object. All tests that use the fixture receive the same python object as input.
   - Pros
     - Most potential to speed up running of CI.
     - Most potential to speed up by cross-function/file use of shared setup (e.g. np.random.uniform)
     - Works for any type.
   - Cons
     - Most potential to have interacting unit tests.
     - Could make somebody think that interactions are allowed between unit tests.
2. Cache only copy-able types. A fixture tagged with cache_return_value=True has cached return values, but each parametrized unit test receives a copy made by copy.deepcopy prior to returning the object. Any types that raise an error during copying result in a failed setup for the test.
   - Pros
     - Very simple to add caching to existing functions.
   - Cons
     - May unintentionally copy global state that is pointed to by local state. Doesn't impact correctness, but would increase runtime.
     - Less flexibility: an object that cannot be deepcopy-ed results in a type error.
3. Cache only copy-able types, but with a custom implementation of copy.deepcopy. Would follow the same rules as copy.deepcopy if a __deepcopy__ method or a __getstate__/__setstate__ pair exists, but would otherwise require the type to be in a list of allowed types. List of types
   - Pros
     - Opt-in both by fixture and by data-type.
   - Cons
     - Re-implements a common utility.
4. Cache using explicitly specified serialize/deserialize functions. A fixture tagged with cache_return_value=True must return three values. The first is the fixture value itself. The second/third are serialize/deserialize functions for that value.
   - Pros
   - Cons
5. Only caching of bytes type is allowed. A fixture tagged with cache_return_value=True must return a bytes object. This object is deserialized at the start of the test function into the desired type.
   - Pros
   - Cons
6. No caching. All fixtures are recomputed for each parametrized value.
   - Pros
     - All unit tests are independent by design. Zero interaction possible.
   - Cons
     - Most potential impact to CI runtime.
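As a rough illustration of the explicit serialize/deserialize option in the list above (the fixture returns three values), the sketch below keeps only the serialized bytes and hands each caller a freshly deserialized object. `cache_with_codec` and `weights_fixture` are hypothetical names, and JSON is an arbitrary choice of encoding for the example.

```python
import json


def cache_with_codec(fixture_func):
    """Hypothetical helper: the fixture returns (value, serialize,
    deserialize), and only the serialized bytes are kept in the cache."""
    stored = {}

    def get():
        if "payload" not in stored:
            value, serialize, deserialize = fixture_func()
            stored["payload"] = serialize(value)
            stored["deserialize"] = deserialize
        # Every caller receives a freshly deserialized object.
        return stored["deserialize"](stored["payload"])

    return get


def weights_fixture():
    value = {"layer0": [0.5, 1.5], "layer1": [2.0]}

    def serialize(obj):
        return json.dumps(obj).encode("utf-8")

    def deserialize(data):
        return json.loads(data.decode("utf-8"))

    return value, serialize, deserialize


get_weights = cache_with_codec(weights_fixture)
a = get_weights()
a["layer0"].clear()  # the mutation stays local to this caller
b = get_weights()
```

The cost of this explicitness is that every cached fixture has to supply its own pair of functions.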

Overall, @areusch 's concern is avoiding hidden failure modes, especially where there could be tests that only appear to pass due to cross-talk. As such, he is more comfortable with more explicit usage, and would be comfortable with any of implementations 3-5. @Lunderberg 's main concern is usability and ease of enabling, such that a slow test can have caching enabled with minimal effort where it is needed, and would be comfortable with any of implementations 2-3. One key point that we came up with is that we expect fixtures to be uncached by default (@tvm.testing.fixture), and to have caching enabled on a per-fixture basis (@tvm.testing.fixture(cache_return_value=True)) where there are performance benefits to be gained.

Since #3 is the overlap between our positions, with caching that is opt-in both per fixture and per return type, I've put together a test implementation. It calls copy.deepcopy, but passes in a custom memo argument that performs type-checking of the copied type. Types can be explicitly listed as being safe to copy, but the use of the default object.__reduce__ is disabled. This implementation minimizes the amount of re-implementing that would need to be done, but does rely on ctypes to identify the object being checked.
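A simplified sketch of the memo-based type check described above, not the code in the PR. It assumes CPython, where `id()` is the object's address and `copy.deepcopy` consults `memo.get(id(obj), ...)` before copying. The names `TypeCheckingMemo`, `ALLOWED`, and `checked_deepcopy` are hypothetical, and unlike the description above, this version checks only the allow-list and does not honor `__deepcopy__` or `__getstate__`/`__setstate__`.

```python
import copy
import ctypes


class TypeCheckingMemo(dict):
    """Hypothetical memo dict that rejects types outside an allow-list."""

    def __init__(self, allowed_types):
        super().__init__()
        self.allowed_types = frozenset(allowed_types)

    def get(self, key, default=None):
        # deepcopy looks up id(obj) in the memo before copying obj.
        # CPython-specific: recover the object from its address.
        obj = ctypes.cast(key, ctypes.py_object).value
        if type(obj) not in self.allowed_types:
            raise TypeError(
                f"{type(obj).__name__} is not in the allow-list "
                "for cached fixtures"
            )
        return super().get(key, default)


# Assumed allow-list for this example only.
ALLOWED = {type(None), bool, int, float, str, bytes, list, tuple, dict}


def checked_deepcopy(value):
    return copy.deepcopy(value, TypeCheckingMemo(ALLOWED))


class Opaque:
    """A type that was never listed as safe to copy."""


original = {"shape": (2, 3), "data": [1.0, 2.0]}
duplicate = checked_deepcopy(original)

try:
    checked_deepcopy([Opaque()])
    rejected = False
except TypeError:
    rejected = True
```

Because the check lives in the memo, the copying itself is still done by `copy.deepcopy`, which is what keeps the amount of re-implementation small.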

@areusch Can you check this over to make sure I've given an accurate description of your comments?

areusch · 2021-07-07T18:30:39Z

@Lunderberg i think that sounds right. i agree #3 is the overlap between our positions. i'd like for others to comment whether it may be too complex to understand in debugging; otherwise i think we can proceed here. to be explicit: we intend to copy only whitelisted types, and not allow copy.deepcopy to work on arbitrary objects.

Lunderberg · 2021-07-30T15:20:20Z

@areusch Any status changes on this?

areusch

hi @Lunderberg sorry for the long delay here. I reread this and just have a few minor clarifications I'd like you to make on the wording. Other than that, looks great! Feel free to ping aggressively when these are addressed and we can merge right away.

areusch · 2021-08-28T14:28:14Z

areusch · 2021-08-28T14:35:11Z