-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
wandb: Network error (ConnectionError), entering retry loop. #83
Comments
#34 here is same issue but not able to solve |
solved the issue by adding it inside my .sh file Kindly use the below proxy ip addrss for wandb connection. export http_proxy=http://dgx-proxy-mn.mgmt.siddhi.param:9090/ export ftp_proxy=http://dgx-proxy-mn.mgmt.siddhi.param:9090/ export https_proxy=http://dgx-proxy-mn.mgmt.siddhi.param:9090/ |
WandB Internal User commented: |
im using NVIDIA A100 40GB GPU to train my object detection model , im using ubantu 20.04 server machine & https://github.com/WongKinYiu/yolov7 this repo / but not able to do trainig of model because of wamdb issue
im using batch scripting to launch my job
error log
YOLOR 🚀 v0.1-43-g8b72ac7 torch 1.9.0+cu111 CUDA:0 (A100-SXM4-40GB, 40537.1875MB)
Retry attempt failed:
Traceback (most recent call last):
File "/nlsasfs/home/reflexion/chandnip/Conda/envs/yolov7/lib/python3.8/site-packages/urllib3/connection.py", line 174, in _new_conn
conn = connection.create_connection(
File "/nlsasfs/home/reflexion/chandnip/Conda/envs/yolov7/lib/python3.8/site-packages/urllib3/util/connection.py", line 72, in create_connection
for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
File "/nlsasfs/home/reflexion/chandnip/Conda/envs/yolov7/lib/python3.8/socket.py", line 918, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/nlsasfs/home/reflexion/chandnip/Conda/envs/yolov7/lib/python3.8/site-packages/urllib3/connectionpool.py", line 703, in urlopen
httplib_response = self._make_request(
File "/nlsasfs/home/reflexion/chandnip/Conda/envs/yolov7/lib/python3.8/site-packages/urllib3/connectionpool.py", line 386, in _make_request
self._validate_conn(conn)
File "/nlsasfs/home/reflexion/chandnip/Conda/envs/yolov7/lib/python3.8/site-packages/urllib3/connectionpool.py", line 1042, in _validate_conn
conn.connect()
File "/nlsasfs/home/reflexion/chandnip/Conda/envs/yolov7/lib/python3.8/site-packages/urllib3/connection.py", line 358, in connect
self.sock = conn = self._new_conn()
File "/nlsasfs/home/reflexion/chandnip/Conda/envs/yolov7/lib/python3.8/site-packages/urllib3/connection.py", line 186, in _new_conn
raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7f93b64ab520>: Failed to establish a new connection: [Errno -2] Name or service not known
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/nlsasfs/home/reflexion/chandnip/.local/lib/python3.8/site-packages/requests/adapters.py", line 489, in send
resp = conn.urlopen(
File "/nlsasfs/home/reflexion/chandnip/Conda/envs/yolov7/lib/python3.8/site-packages/urllib3/connectionpool.py", line 787, in urlopen
retries = retries.increment(
File "/nlsasfs/home/reflexion/chandnip/Conda/envs/yolov7/lib/python3.8/site-packages/urllib3/util/retry.py", line 592, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='api.wandb.ai', port=443): Max retries exceeded with url: /graphql (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f93b64ab520>: Failed to establish a new connection: [Errno -2] Name or service not known'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/nlsasfs/home/reflexion/chandnip/Conda/envs/yolov7/lib/python3.8/site-packages/wandb/sdk/lib/retry.py", line 108, in call
result = self._call_fn(*args, **kwargs)
File "/nlsasfs/home/reflexion/chandnip/Conda/envs/yolov7/lib/python3.8/site-packages/wandb/sdk/internal/internal_api.py", line 158, in execute
return self.client.execute(*args, **kwargs)
File "/nlsasfs/home/reflexion/chandnip/Conda/envs/yolov7/lib/python3.8/site-packages/wandb/vendor/gql-0.2.0/wandb_gql/client.py", line 52, in execute
result = self._get_result(document, *args, **kwargs)
File "/nlsasfs/home/reflexion/chandnip/Conda/envs/yolov7/lib/python3.8/site-packages/wandb/vendor/gql-0.2.0/wandb_gql/client.py", line 60, in _get_result
return self.transport.execute(document, *args, **kwargs)
File "/nlsasfs/home/reflexion/chandnip/Conda/envs/yolov7/lib/python3.8/site-packages/wandb/vendor/gql-0.2.0/wandb_gql/transport/requests.py", line 38, in execute
request = requests.post(self.url, **post_args)
File "/nlsasfs/home/reflexion/chandnip/.local/lib/python3.8/site-packages/requests/api.py", line 115, in post
return request("post", url, data=data, json=json, **kwargs)
File "/nlsasfs/home/reflexion/chandnip/.local/lib/python3.8/site-packages/requests/api.py", line 59, in request
return session.request(method=method, url=url, **kwargs)
File "/nlsasfs/home/reflexion/chandnip/.local/lib/python3.8/site-packages/requests/sessions.py", line 587, in request
resp = self.send(prep, **send_kwargs)
File "/nlsasfs/home/reflexion/chandnip/.local/lib/python3.8/site-packages/requests/sessions.py", line 701, in send
r = adapter.send(request, **kwargs)
File "/nlsasfs/home/reflexion/chandnip/.local/lib/python3.8/site-packages/requests/adapters.py", line 565, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='api.wandb.ai', port=443): Max retries exceeded with url: /graphql (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f93b64ab520>: Failed to establish a new connection: [Errno -2] Name or service not known'))
wandb: Network error (ConnectionError), entering retry loop.
wandb: W&B API key is configured. Use
wandb login --relogin
to force reloginwandb: Network error (ConnectionError), entering retry loop.
The text was updated successfully, but these errors were encountered: