
Investigate TVM playing nice with Docker #163

Closed
ninehusky opened this issue Jan 24, 2022 · 2 comments · Fixed by #164
Labels
continuous integration Issues of Continuous Integration (aka Github Actions)

Comments

@ninehusky
Collaborator

ninehusky commented Jan 24, 2022

docker build --tag glenside .
docker run glenside cargo test --no-default-features --features tvm

On a clean copy of the repository, running the commands above works on the first iteration.

However, subsequent runs of the test suite produce the following output:

failures:

---- codegen::tests::relay_op_softmax stdout ----
thread 'codegen::tests::relay_op_softmax' panicked at 'Running Relay code failed with code Some(1).
stdout:

stderr:
Traceback (most recent call last):
  File "/root/glenside/src/language/from_relay/run_relay.py", line 42, in <module>
    output = relay.create_executor(mod=expr, kind="graph").evaluate()(*inputs)
  File "/root/tvm/python/tvm/relay/backend/interpreter.py", line 172, in evaluate
    return self._make_executor()
  File "/root/tvm/python/tvm/relay/build_module.py", line 395, in _make_executor
    mod = build(self.mod, target=self.target)
  File "/root/tvm/python/tvm/relay/build_module.py", line 277, in build
    tophub_context = autotvm.tophub.context(list(target.values()))
  File "/root/tvm/python/tvm/autotvm/tophub.py", line 116, in context
    if not check_backend(tophub_location, name):
  File "/root/tvm/python/tvm/autotvm/tophub.py", line 158, in check_backend
    download_package(tophub_location, package_name)
  File "/root/tvm/python/tvm/autotvm/tophub.py", line 184, in download_package
    os.mkdir(path)
FileExistsError: [Errno 17] File exists: '/root/.tvm'
', src/codegen.rs:1836:9

---- codegen::tests::relay_op_relu stdout ----
thread 'codegen::tests::relay_op_relu' panicked at 'Running Relay code failed with code Some(1).
stdout:

stderr:
Traceback (most recent call last):
  File "/root/glenside/src/language/from_relay/run_relay.py", line 42, in <module>
    output = relay.create_executor(mod=expr, kind="graph").evaluate()(*inputs)
  File "/root/tvm/python/tvm/relay/backend/interpreter.py", line 172, in evaluate
    return self._make_executor()
  File "/root/tvm/python/tvm/relay/build_module.py", line 395, in _make_executor
    mod = build(self.mod, target=self.target)
  File "/root/tvm/python/tvm/relay/build_module.py", line 277, in build
    tophub_context = autotvm.tophub.context(list(target.values()))
  File "/root/tvm/python/tvm/autotvm/tophub.py", line 116, in context
    if not check_backend(tophub_location, name):
  File "/root/tvm/python/tvm/autotvm/tophub.py", line 158, in check_backend
    download_package(tophub_location, package_name)
  File "/root/tvm/python/tvm/autotvm/tophub.py", line 184, in download_package
    os.mkdir(path)
FileExistsError: [Errno 17] File exists: '/root/.tvm/tophub'
', src/codegen.rs:1836:9

---- codegen::tests::relay_op_batchflatten stdout ----
thread 'codegen::tests::relay_op_batchflatten' panicked at 'Running Relay code failed with code Some(1).
stdout:

stderr:
Traceback (most recent call last):
  File "/root/glenside/src/language/from_relay/run_relay.py", line 42, in <module>
    output = relay.create_executor(mod=expr, kind="graph").evaluate()(*inputs)
  File "/root/tvm/python/tvm/relay/backend/interpreter.py", line 172, in evaluate
    return self._make_executor()
  File "/root/tvm/python/tvm/relay/build_module.py", line 395, in _make_executor
    mod = build(self.mod, target=self.target)
  File "/root/tvm/python/tvm/relay/build_module.py", line 277, in build
    tophub_context = autotvm.tophub.context(list(target.values()))
  File "/root/tvm/python/tvm/autotvm/tophub.py", line 116, in context
    if not check_backend(tophub_location, name):
  File "/root/tvm/python/tvm/autotvm/tophub.py", line 158, in check_backend
    download_package(tophub_location, package_name)
  File "/root/tvm/python/tvm/autotvm/tophub.py", line 184, in download_package
    os.mkdir(path)
FileExistsError: [Errno 17] File exists: '/root/.tvm/tophub'
', src/codegen.rs:1836:9


failures:
    codegen::tests::relay_op_batchflatten
    codegen::tests::relay_op_relu
    codegen::tests::relay_op_softmax

test result: FAILED. 300 passed; 3 failed; 8 ignored; 0 measured; 0 filtered out; finished in 33.86s

Sometimes clearing the Docker cache and rebuilding the image fixes this, but not reliably; it's unclear why.

We should look into this!

@ninehusky ninehusky added the continuous integration Issues of Continuous Integration (aka Github Actions) label Jan 24, 2022
@gussmith23
Owner

@ninehusky can you see what happens when you run the tests on a single thread (cargo test -- --test-threads=1)? See: https://doc.rust-lang.org/book/ch11-02-running-tests.html#running-tests-in-parallel-or-consecutively

I suspect what is happening is this:
cargo test runs tests in parallel, so multiple tests that use TVM start at roughly the same time. The first use of TVM performs some one-time initialization that creates the /root/.tvm/tophub directory. When several tests trigger that initialization concurrently, they race to create the directory, and every thread but the winner fails with FileExistsError.
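The check-then-create race described above can be reproduced in a few lines of Python. This is a sketch with placeholder directory names, not TVM's actual code: the "racy" worker mimics the os.path.exists + os.mkdir pattern from the traceback, while the "safe" worker uses os.makedirs(..., exist_ok=True), the usual race-free fix for this pattern (I haven't verified that's exactly what the upstream TVM fix does).

```python
import os
import tempfile
import threading

def racy_init(path, barrier, errors):
    # Check-then-create is not atomic: all threads can see the
    # directory as missing before any of them has created it.
    exists = os.path.exists(path)
    barrier.wait()  # force every thread to act on the same stale check
    if not exists:
        try:
            os.mkdir(path)
        except FileExistsError as e:
            errors.append(e)

def safe_init(path, barrier, errors):
    # Race-free: creating an already-existing directory is a no-op.
    barrier.wait()
    try:
        os.makedirs(path, exist_ok=True)
    except FileExistsError as e:
        errors.append(e)

def run(worker, path, n=8):
    barrier = threading.Barrier(n)
    errors = []
    threads = [threading.Thread(target=worker, args=(path, barrier, errors))
               for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors

tmp = tempfile.mkdtemp()
racy_errors = run(racy_init, os.path.join(tmp, "tophub-racy"))
safe_errors = run(safe_init, os.path.join(tmp, "tophub-safe"))
print(len(racy_errors), len(safe_errors))  # 7 of 8 racy threads fail; 0 safe failures
```

The barrier makes the race deterministic for the demo: all eight threads observe "directory missing" before any of them calls os.mkdir, so exactly one succeeds and the rest raise FileExistsError, just like the failing tests above.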

If that's the case, we'll probably need to find a way to trigger that setup before running the tests.

@gussmith23
Owner

Oh, lol, this has already been fixed:
apache/tvm@bf20107

I was looking at the tophub.py file that raises the error, and it seemed like this failure had already been anticipated and handled, so I checked the git blame and found the commit above, in which someone fixed the issue.

So fixing this issue should just be a matter of updating TVM. It may be an easy fix; I'll give it a go right now.
