
[TEST] Conv2dGrad Takes long time to finish #8579

Closed
tqchen opened this issue Jul 28, 2021 · 5 comments · Fixed by #8581

@tqchen
Member

tqchen commented Jul 28, 2021

Looking at the test time log, https://ci.tlcpack.ai/job/tvm/job/main/1356/testReport/ctypes.tests.python.relay/test_op_grad_level2/

conv2d grad takes a long time to finish (~1 hour).

Possible cause: this is likely because our numerical gradient checker computes a numerical gradient for every direction, which scales linearly with the number of tensor entries.

Possible resolutions:

  • K0: Use memoization to cache the numerical gradient computations.
  • K1: Run a different kind of test that randomizes the direction and checks consistency; see page 8 of this slide.

I would suggest we go with K1, checking a few random directions instead of every direction.
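The K1 idea can be sketched in plain NumPy (a minimal sketch, not TVM's actual `check_grad` API; the function name, `num_dirs`, and the tolerances are hypothetical choices for illustration): instead of perturbing every tensor entry, compare the analytical directional derivative ⟨∇f, v⟩ against a central finite difference of f along a few random unit directions v.

```python
import numpy as np

def check_grad_random_dirs(f, grad_f, x, num_dirs=3, eps=1e-6, rtol=1e-3):
    """Gradient check along a few random directions (the K1 idea).

    Compares the analytical directional derivative <grad f(x), v>
    against a central finite difference of f along each random unit
    direction v. Cost is O(num_dirs) forward evaluations instead of
    O(x.size) for a full per-entry numerical gradient.
    """
    g = grad_f(x)
    for _ in range(num_dirs):
        v = np.random.randn(*x.shape)
        v /= np.linalg.norm(v)  # unit direction
        numeric = (f(x + eps * v) - f(x - eps * v)) / (2 * eps)
        analytic = float(np.dot(g.ravel(), v.ravel()))
        np.testing.assert_allclose(numeric, analytic, rtol=rtol, atol=1e-6)

# Example: f(x) = sum(x**2) has gradient 2*x, so the check passes.
check_grad_random_dirs(lambda x: np.sum(x ** 2), lambda x: 2 * x,
                       np.random.randn(8, 8))
```

Note that a full per-entry check on this 8×8 input would need 128 forward evaluations, while the randomized check needs only `2 * num_dirs` regardless of tensor size.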

@tqchen changed the title from [TEST] cImprove Numerical Gradient Checker to [TEST] Improve Numerical Gradient Checker on Jul 28, 2021
@tqchen
Member Author

tqchen commented Jul 28, 2021

Follow-up with @altanh suggests that this might be related to a perf regression; we will follow up on this thread.

@tqchen changed the title from [TEST] Improve Numerical Gradient Checker to [TEST] Conv2dGrad Takes long time to finish on Jul 28, 2021
@altanh
Contributor

altanh commented Jul 28, 2021

After analysis, we believe this regression is due to a combination of the changes in #8486 and an inefficient loop in the check_grad function:

  1. [RELAY]Switch from CompileEngine to TECompiler in Interpreter #8486 replaced the previous global (or thread-local) CompileEngine with a fresh TECompiler per Interpreter instance. This means that the previous behavior of caching lowered functions (for given input types) globally across all interpreters no longer holds.

  2. In check_grad, at

    fwd_plus = intrp.evaluate(fwd_func)(*inputs).numpy().astype("float64")

    notice that the forward function is recompiled by the interpreter executor for every element of the input tensor. It is important to note that the interpreter executor actually creates a new Interpreter object for every evaluated expression (and hence loses the cache), causing a complete recompile on each evaluation.

Proposed Solution

The second point can easily be rectified by hoisting the forward function evaluation out of the hot loop, since the function doesn't change. In fact, it only needs to be evaluated once per device and target combination. This will immediately fix the extreme regression. A PR will be posted soon.
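As a sketch of the hoisting fix (hypothetical names throughout; `evaluate` here is only a stand-in for `intrp.evaluate`, with a counter replacing the expensive recompile), the compiled forward function is obtained once and then reused inside the per-entry finite-difference loop:

```python
import numpy as np

compile_count = 0  # stands in for the cost of an interpreter recompile

def evaluate(fwd_func):
    """Hypothetical stand-in for intrp.evaluate: each call represents
    one expensive recompilation of fwd_func."""
    global compile_count
    compile_count += 1
    return fwd_func

def numeric_grad(fwd_func, x, eps=1e-6):
    # The fix: evaluate (compile) the forward function once, outside
    # the hot loop, since fwd_func never changes between iterations.
    f = evaluate(fwd_func)
    grad = np.zeros_like(x)
    flat, g = x.ravel(), grad.ravel()  # views into x and grad
    for i in range(flat.size):
        old = flat[i]
        flat[i] = old + eps
        fwd_plus = f(x)
        flat[i] = old - eps
        fwd_minus = f(x)
        flat[i] = old
        g[i] = (fwd_plus - fwd_minus) / (2 * eps)
    return grad

grad = numeric_grad(lambda t: float(np.sum(t ** 2)), np.ones((4, 4)))
assert compile_count == 1  # one compile for all 16 gradient entries
```

With the buggy shape, the `evaluate` call sits inside the loop body and `compile_count` would reach `2 * x.size` instead of 1.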

Regarding the first point, we would like to move away from relying on hidden global caching for performance as much as possible as this has caused confusion in the past. Thus we will not modify the new interpreter behavior.

@altanh
Contributor

altanh commented Jul 29, 2021

Update: the problem goes a bit deeper with the Interpreter executor. Specifically, the function returned by intrp.evaluate(fwd_func) is actually a Python closure which itself creates a new interpreter each time it is invoked. This means we can't get around the caching problem just by calling .evaluate once.

Possible solutions:

  1. Just give up on using the interpreter executor for now.
  2. Rewrite the evaluate function so that it creates only one interpreter (and has the closure call it) rather than a new interpreter each time the closure is invoked.

I'll go ahead with option 2 in the hotfix PR for now.
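The difference between the two closure shapes can be sketched with a toy `Interpreter` (a hypothetical class, not the real Relay Interpreter; it just counts constructions in place of building a real compile cache):

```python
class Interpreter:
    """Toy stand-in for the Relay Interpreter: constructing one is
    expensive, so we count how many get built."""
    instances = 0

    def __init__(self):
        Interpreter.instances += 1
        self.cache = {}  # the compile cache that is lost on each rebuild

    def run(self, expr, args):
        return expr(*args)

def evaluate_per_call(expr):
    # Current behavior: the returned closure builds a *new* Interpreter
    # (with an empty cache) every time it is invoked.
    def closure(*args):
        return Interpreter().run(expr, args)
    return closure

def evaluate_once(expr):
    # Option 2: build the Interpreter once; the closure reuses it, so
    # its cache survives across invocations.
    intrp = Interpreter()
    def closure(*args):
        return intrp.run(expr, args)
    return closure

square = lambda x: x * x

f = evaluate_per_call(square)
for i in range(3):
    f(i)
after_buggy = Interpreter.instances  # 3 interpreters for 3 calls

Interpreter.instances = 0
g = evaluate_once(square)
for i in range(3):
    g(i)
after_fixed = Interpreter.instances  # 1 interpreter for 3 calls
```

The fixed shape moves the construction from the closure body to the enclosing scope, so the closure captures one long-lived interpreter instead of rebuilding it per invocation.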

@tqchen
Member Author

tqchen commented Jul 29, 2021

Let us go with option 1 and just skip the interpreter executor in the grad evaluation.

@tqchen
Member Author

tqchen commented Jul 30, 2021

Closing this for now; opened #8601 to track the follow-up item.
