[BUG] lack of feedbacks to debug failed submission #450

Closed
Deathn0t opened this issue Jun 30, 2020 · 6 comments

@Deathn0t

Deathn0t commented Jun 30, 2020

Hello,

I am currently trying to run a competition (link to competition). The bundle (zip) loaded properly and the competition was created. I then made a sample submission to test the competition. The submission failed, but when I look at the logs through the website I can't see any error (see attached screenshots). I am having difficulty debugging this because I am not even sure of the root cause of the failure.

Screenshot 2020-06-29 at 16 30 20

Screenshot 2020-06-29 at 16 30 56

Screenshot 2020-06-29 at 16 31 10

@madclam madclam changed the title [BUG] lake of feedbacks to debug failed submission [BUG] lack of feedbacks to debug failed submission Jun 30, 2020
@acletournel
Collaborator

In case it helps, for Eric: I have extracted the docker logs from the v2-production compute worker at the timestamp of the described issue:
20200626-Dockerlogs-wk1.txt

@ckcollab
Contributor

ckcollab commented Jul 2, 2020

Can you attach the bundle + submission here?

@Deathn0t
Author

Deathn0t commented Jul 3, 2020

The bundle:
bundle-0.zip

The submission:
ZeroWorflow.zip

@ckcollab ckcollab added this to the Codabench Sprint #1 milestone Jul 3, 2020
@ckcollab ckcollab added this to Icebox in Codabench via automation Jul 3, 2020
@ckcollab
Contributor

I believe this produces more output now. I see this in the ingestion stderr:

2020-07-27 23:28:39.366515: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1 
2020-07-27 23:28:39.366586: E tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: UNKNOWN ERROR (-1) 
2020-07-27 23:28:39.366622: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (dc39eb8f039f): /proc/driver/nvidia/version does not exist 
2020-07-27 23:28:39.709189: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA 
2020-07-27 23:28:39.867807: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 2299995000 Hz 
2020-07-27 23:28:39.869088: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7f97c8000b20 initialized for platform Host (this does not guarantee that XLA will be used). Devices: 
2020-07-27 23:28:39.869124: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
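
In the output above, only the `cuInit` line is logged at `E` severity; the rest are informational. As a minimal sketch for picking such lines out of a longer stderr dump, the snippet below filters on the severity letter. The line layout it assumes (timestamp, one-letter level, source location, message) is inferred from the lines shown here, and `LOG_LINE` and `error_lines` are hypothetical names, not part of Codabench or TensorFlow.

```python
import re

# Assumed layout, inferred from the log above:
#   "<date> <time>: <LEVEL> <file>:<line>] <message>"
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): "
    r"(?P<level>[IWEF]) (?P<src>[^\]]+)\] (?P<msg>.*)$"
)


def error_lines(stderr_text):
    """Return (source, message) pairs for lines logged at E or F severity."""
    found = []
    for line in stderr_text.splitlines():
        m = LOG_LINE.match(line.strip())
        if m and m.group("level") in ("E", "F"):
            found.append((m.group("src"), m.group("msg")))
    return found


# Three lines copied from the ingestion stderr above.
sample = """\
2020-07-27 23:28:39.366515: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-07-27 23:28:39.366586: E tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: UNKNOWN ERROR (-1)
2020-07-27 23:28:39.709189: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
"""

for src, msg in error_lines(sample):
    print(f"{src}: {msg}")
```

Lines that do not match the assumed layout (for example Python tracebacks) are skipped by this filter, so it is a complement to reading the full stderr, not a replacement.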

@dde6khkg
Contributor

@Deathn0t Can you close this guy if it's good to close?

@Deathn0t
Author

@dde6khkg I just verified on the platform. I am closing this issue, thanks!

Codabench automation moved this from Icebox to Completed Jul 30, 2020