requests.exceptions.HTTPError: 503 Server Error: Service Unavailable for url: http://localhost:8080/get_cookie #262

Closed
HuntZhaozq opened this issue Nov 27, 2023 · 15 comments
Labels
bug Something isn't working

Comments

@HuntZhaozq

HuntZhaozq commented Nov 27, 2023

Issue Description

requests.exceptions.HTTPError: 503 Server Error: Service Unavailable for url: http://localhost:8080/get_cookie

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "project/XAgent/XAgentServer/server.py", line 93, in interact
raise XAgentRunningError(str(e)) from e
XAgentServer.exts.exception_ext.XAgentRunningError: 503 Server Error: Service Unavailable for url: http://localhost:8080/get_cookie

During handling of the above exception, another exception occurred:
AttributeError: 'ToolServerInterface' object has no attribute 'cookies'

Steps to Reproduce

docker compose up
Start XAgentGen with docker run
python run.py --task "find all the prime numbers <=100" --model "xagentllm" --config-file "assets/xagentllama.yml"
This produces the error above.

Environment

  • Operating System: Ubuntu 22.04
  • Python Version: 3.10
  • Other Relevant Information: The Docker services run inside another Docker container, which is reachable only through 5 ports.
HuntZhaozq added the bug label Nov 27, 2023
@nlpbin
Contributor

nlpbin commented Nov 27, 2023

Attaching some logs from ToolServerManager-1.

2023-11-27 11:31:47 [2023-11-27 03:31:47 +0000] [1] [INFO] Starting gunicorn 21.2.0
2023-11-27 11:31:47 [2023-11-27 03:31:47 +0000] [1] [INFO] Listening at: http://0.0.0.0:8080 (1)
2023-11-27 11:31:47 [2023-11-27 03:31:47 +0000] [1] [INFO] Using worker: uvicorn.workers.UvicornWorker
2023-11-27 11:31:47 [2023-11-27 03:31:47 +0000] [6] [INFO] Booting worker with pid: 6
2023-11-27 11:31:47 [2023-11-27 03:31:47 +0000] [7] [INFO] Booting worker with pid: 7
2023-11-27 11:31:48 [2023-11-27 03:31:48 +0000] [7] [INFO] Database connected
2023-11-27 11:31:48 [2023-11-27 03:31:48 +0000] [6] [INFO] Database connected
2023-11-27 11:31:48 [2023-11-27 03:31:48 +0000] [7] [INFO] Docker client connected
2023-11-27 11:31:48 [2023-11-27 03:31:48 +0000] [6] [INFO] Docker client connected
2023-11-27 11:31:49 [2023-11-27 03:31:49 +0000] [7] [INFO] Started server process [7]
2023-11-27 11:31:49 [2023-11-27 03:31:49 +0000] [7] [INFO] Waiting for application startup.
2023-11-27 11:31:49 [2023-11-27 03:31:49 +0000] [6] [INFO] Started server process [6]
2023-11-27 11:31:49 [2023-11-27 03:31:49 +0000] [6] [INFO] Waiting for application startup.
2023-11-27 11:31:49 [2023-11-27 03:31:49 +0000] [7] [ERROR] Traceback (most recent call last):
2023-11-27 11:31:49   File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 677, in lifespan
2023-11-27 11:31:49     async with self.lifespan_context(app) as maybe_state:
2023-11-27 11:31:49   File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 566, in __aenter__
2023-11-27 11:31:49     await self._router.startup()
2023-11-27 11:31:49   File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 654, in startup
2023-11-27 11:31:49     await handler()
2023-11-27 11:31:49   File "/app/main.py", line 35, in startup
2023-11-27 11:31:49     async for checker in NodeChecker.find_all():
2023-11-27 11:31:49   File "/usr/local/lib/python3.10/site-packages/beanie/odm/queries/cursor.py", line 53, in __anext__
2023-11-27 11:31:49     return parse_obj(projection, next_item, lazy_parse=self.lazy_parse)  # type: ignore
2023-11-27 11:31:49   File "/usr/local/lib/python3.10/site-packages/beanie/odm/utils/parsing.py", line 110, in parse_obj
2023-11-27 11:31:49     result = parse_model(model, data)
2023-11-27 11:31:49   File "/usr/local/lib/python3.10/site-packages/beanie/odm/utils/pydantic.py", line 37, in parse_model
2023-11-27 11:31:49     return model_type.model_validate(data)
2023-11-27 11:31:49   File "/usr/local/lib/python3.10/site-packages/pydantic/main.py", line 503, in model_validate
2023-11-27 11:31:49     return cls.__pydantic_validator__.validate_python(
2023-11-27 11:31:49 pydantic_core._pydantic_core.ValidationError: 1 validation error for NodeChecker
2023-11-27 11:31:49 pid
2023-11-27 11:31:49   Field required [type=missing, input_value={'_id': ObjectId('6563558...ed62b', 'interval': 1.0}, input_type=dict]
2023-11-27 11:31:49     For further information visit https://errors.pydantic.dev/2.5/v/missing
2023-11-27 11:31:49 
2023-11-27 11:31:49 [2023-11-27 03:31:49 +0000] [7] [ERROR] Application startup failed. Exiting.
2023-11-27 11:31:49 [2023-11-27 03:31:49 +0000] [7] [INFO] Worker exiting (pid: 7)
2023-11-27 11:31:49 [2023-11-27 03:31:49 +0000] [6] [ERROR] Traceback (most recent call last):
2023-11-27 11:31:49   File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 677, in lifespan
2023-11-27 11:31:49     async with self.lifespan_context(app) as maybe_state:
2023-11-27 11:31:49   File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 566, in __aenter__
2023-11-27 11:31:49     await self._router.startup()
2023-11-27 11:31:49   File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 654, in startup
2023-11-27 11:31:49     await handler()
2023-11-27 11:31:49   File "/app/main.py", line 35, in startup
2023-11-27 11:31:49     async for checker in NodeChecker.find_all():
2023-11-27 11:31:49   File "/usr/local/lib/python3.10/site-packages/beanie/odm/queries/cursor.py", line 53, in __anext__
2023-11-27 11:31:49     return parse_obj(projection, next_item, lazy_parse=self.lazy_parse)  # type: ignore
2023-11-27 11:31:49   File "/usr/local/lib/python3.10/site-packages/beanie/odm/utils/parsing.py", line 110, in parse_obj
2023-11-27 11:31:49     result = parse_model(model, data)
2023-11-27 11:31:49   File "/usr/local/lib/python3.10/site-packages/beanie/odm/utils/pydantic.py", line 37, in parse_model
2023-11-27 11:31:49     return model_type.model_validate(data)
2023-11-27 11:31:49   File "/usr/local/lib/python3.10/site-packages/pydantic/main.py", line 503, in model_validate
2023-11-27 11:31:49     return cls.__pydantic_validator__.validate_python(
2023-11-27 11:31:49 pydantic_core._pydantic_core.ValidationError: 1 validation error for NodeChecker
2023-11-27 11:31:49 pid
2023-11-27 11:31:49   Field required [type=missing, input_value={'_id': ObjectId('6563558...ed62b', 'interval': 1.0}, input_type=dict]
2023-11-27 11:31:49     For further information visit https://errors.pydantic.dev/2.5/v/missing
2023-11-27 11:31:49 
2023-11-27 11:31:49 [2023-11-27 03:31:49 +0000] [6] [ERROR] Application startup failed. Exiting.
2023-11-27 11:31:49 [2023-11-27 03:31:49 +0000] [6] [INFO] Worker exiting (pid: 6)
2023-11-27 11:31:50 [2023-11-27 03:31:50 +0000] [1] [ERROR] Worker (pid:7) exited with code 3
2023-11-27 11:31:50 [2023-11-27 03:31:50 +0000] [1] [ERROR] Worker (pid:6) was sent SIGTERM!
2023-11-27 11:31:50 [2023-11-27 03:31:50 +0000] [1] [ERROR] Shutting down: Master
2023-11-27 11:31:50 [2023-11-27 03:31:50 +0000] [1] [ERROR] Reason: Worker failed to boot.

@luyaxi
Collaborator

luyaxi commented Nov 27, 2023

Try removing the NodeChecker collection from the TSM database with mongosh, or simply remove the entire database volume with docker volume rm xagent_xagentmongodb. (The exact volume name may vary; check with docker volume ls.)
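
For context, the ValidationError in the log above reports that a stored NodeChecker document has no pid field, which is presumably why dropping the stale data is suggested. The sketch below mimics that mismatch with plain Python. It is only an illustration: the real model lives in ToolServerManager and is validated by pydantic/Beanie, the required field names are an assumption inferred from the error message, and the _id value is a placeholder.

```python
# Illustrative only: mimics the "Field required" check with plain Python.
# REQUIRED_FIELDS is an assumption inferred from the log, not the real schema.
REQUIRED_FIELDS = {"pid", "interval"}


def missing_fields(document: dict, required: set = REQUIRED_FIELDS) -> list:
    """Return the required field names absent from a stored document, sorted."""
    return sorted(required - set(document))


# A stale record shaped like the one in the traceback (placeholder _id).
stale = {"_id": "placeholder-id", "interval": 1.0}
fresh = {"_id": "placeholder-id", "interval": 1.0, "pid": 1234}

print(missing_fields(stale))  # ['pid']
print(missing_fields(fresh))  # []
```

A record written before the pid field existed fails this check every time the service starts, so restarting alone would not be expected to help until the old record is gone.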

@HuntZhaozq
Author

I tried docker volume rm xagent_xagentmongodb, but got the following error:
Error response from daemon: remove xagent_xagentmongodb: volume is in use - [a82620b1952480fc06e503654f9b198cfbe331155a164cad9351a8609fa5cce3]

@Umpire2018
Collaborator

Please stop the container that is using the volume first.

@HuntZhaozq
Author

HuntZhaozq commented Nov 27, 2023

@Umpire2018 @luyaxi I stopped the container and removed the entire database volume with docker volume rm xagent_xagentmongodb. Then I restarted the services and ran python run.py, but I hit the same error as at the beginning:

requests.exceptions.HTTPError: 503 Server Error: Service Unavailable for url: http://localhost:8080/get_cookie

@luyaxi
Collaborator

luyaxi commented Nov 27, 2023

Please try the latest images (run docker compose pull to fetch them).

@HuntZhaozq
Author

Hello, I tried the latest images but hit the same problem. It still doesn't work.

@luyaxi
Collaborator

luyaxi commented Nov 27, 2023

Are the Docker logs still the same? Can you try removing all containers with docker compose down and then restarting?

@HuntZhaozq
Author

HuntZhaozq commented Nov 27, 2023

How can I confirm whether the Docker logs are the same? I restarted all the containers, but the same error still occurs.

I also found some errors in the console output of docker compose up:

XAgent-Server:
ConnectionRefusedError: [Errno 111] Connection refused.
......
pymysql.err.OperationalError: (2003, "Can't connect to MySQL server on 'xagent-mysql' ([Errno 111] Connection refused)")
......
sqlalchemy.exc.OperationalError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on 'xagent-mysql' ([Errno 111] Connection refused)")

@luyaxi
Collaborator

luyaxi commented Nov 27, 2023

Please check the ToolServerManager logs again; completely deleting the database volume should have fixed the first issue. The pymysql error is caused by a temporary connection failure while the database is still being set up, so you can ignore it.
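
A "Connection refused" during startup usually just means the server process is not listening yet. As a rough sketch (not XAgent's own startup logic), a client can poll the TCP port until it accepts connections. The function name wait_for_port and its parameters are hypothetical; the demo uses a throwaway local listener so it runs without any database.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll until a TCP connection to host:port succeeds or timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError (e.g. ConnectionRefusedError) on failure.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Demo against a throwaway local listener, so no database is needed.
with socket.socket() as server:
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen()
    demo_port = server.getsockname()[1]
    print(wait_for_port("127.0.0.1", demo_port, timeout=2.0))  # True
```

Against the setup in this thread the call would look like wait_for_port("xagent-mysql", 3306), where the host name comes from the error message and 3306 is the MySQL default port, which is an assumption about this deployment.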

@HuntZhaozq
Author

The logs seem the same as at the beginning, and the first issue still exists.
The ToolServerManager logs are shown below:

2023-11-27 11:31:47 [2023-11-27 03:31:47 +0000] [1] [INFO] Starting gunicorn 21.2.0
2023-11-27 11:31:47 [2023-11-27 03:31:47 +0000] [1] [INFO] Listening at: http://0.0.0.0:8080 (1)
2023-11-27 11:31:47 [2023-11-27 03:31:47 +0000] [1] [INFO] Using worker: uvicorn.workers.UvicornWorker
2023-11-27 11:31:47 [2023-11-27 03:31:47 +0000] [7] [INFO] Booting worker with pid: 7
2023-11-27 11:31:47 [2023-11-27 03:31:47 +0000] [8] [INFO] Booting worker with pid: 8
2023-11-27 11:31:47 [2023-11-27 03:31:47 +0000] [9] [INFO] Booting worker with pid: 9
2023-11-27 11:31:47 [2023-11-27 03:31:47 +0000] [10] [INFO] Booting worker with pid: 10
2023-11-27 11:31:48 [2023-11-27 03:31:48 +0000] [7] [INFO] Started server process [7]
2023-11-27 11:31:48 [2023-11-27 03:31:48 +0000] [8] [INFO] Started server process [8]
2023-11-27 11:31:49 [2023-11-27 03:31:49 +0000] [7] [INFO] Waiting for application startup.
2023-11-27 11:31:49 [2023-11-27 03:31:49 +0000] [8] [INFO] Waiting for application startup.
2023-11-27 11:31:49 [2023-11-27 03:31:49 +0000] [8] [INFO] Application startup complete.
2023-11-27 11:31:49 [2023-11-27 03:31:49 +0000] [7] [INFO] Application startup complete.
2023-11-27 11:31:49 [2023-11-27 03:31:49 +0000] [9] [INFO] Started server process [9]
2023-11-27 11:31:49 [2023-11-27 03:31:49 +0000] [9] [INFO] Waiting for application startup.
2023-11-27 11:31:49 [2023-11-27 03:31:49 +0000] [9] [INFO] Application startup complete.
2023-11-27 11:31:49 [2023-11-27 03:31:49 +0000] [10] [INFO] Started server process [10]
2023-11-27 11:31:49 [2023-11-27 03:31:49 +0000] [10] [INFO] Waiting for application startup.
2023-11-27 11:31:49 [2023-11-27 03:31:49 +0000] [10] [INFO] Application startup complete.
Node status detection timeout: 4055783f714aedca8734bb278b75be46c3f49ad2e78e8d6cdf86a399f274c571

@luyaxi
Collaborator

luyaxi commented Nov 27, 2023

The "Node status detection timeout" message is mainly caused by a failure in the node checker, which does not seem to be working properly in your setup. Please make sure you have the latest code and set builtin_monitor: True in assets/config/manager.yml.

@HuntZhaozq
Author

HuntZhaozq commented Nov 27, 2023

The first issue was fixed after setting builtin_monitor: True in assets/config/manager.yml.
But another error now appears when running python run.py:

requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
.............
XAgentServer.exts.exception_ext.XAgentRunningError: Expecting value: line 1 column 1 (char 0)
Error in task_handler of 51d8853abc: Expecting value: line 1 column 1 (char 0)

@luyaxi
Collaborator

luyaxi commented Nov 27, 2023

That may be a problem with XAgentGen. Please pull the latest XAgentGen image and run it again.
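
For reference, "Expecting value: line 1 column 1 (char 0)" is what Python's JSON parser reports when the text it is given does not start with a JSON value at all, for example an empty body or an HTML error page. The stdlib sketch below reproduces the message; describe_body is a hypothetical helper, and the sample bodies are made up rather than taken from XAgentGen.

```python
import json


def describe_body(body: str) -> str:
    """Classify a response body as JSON or explain why it failed to parse."""
    try:
        json.loads(body)
    except json.JSONDecodeError as exc:
        preview = body[:40] or "<empty body>"
        return f"not JSON ({exc}); body starts with: {preview!r}"
    return "valid JSON"


print(describe_body(""))                              # empty reply
print(describe_body("<html>502 Bad Gateway</html>"))  # HTML error page
print(describe_body('{"ok": true}'))                  # well-formed JSON
```

Logging the status code and the first few characters of the raw reply before decoding it tends to show quickly whether the model server returned an error page instead of JSON.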

@HuntZhaozq
Author

Thanks. It works.
