Create an endpoint that shows agora internals (for debugging) #1934

Open
ferencdg opened this issue Apr 14, 2021 · 5 comments

@ferencdg
Contributor

ferencdg commented Apr 14, 2021

Currently it is very hard to debug complex problems such as #1552 in our production environment.

It would be helpful to have an HTTP endpoint that exposes the most important internals of the Agora instance to help with debugging. This includes:

  • last 10 known preimages for each validator and their corresponding heights
  • last 5-10 SCP messages received/sent
  • latest block header/maybe transaction summary
  • summary of transactions in the transaction pool
  • what candidate a node would propose, if it was its turn
  • what nodes we are connected to
  • which nodes are banned
  • ...

Some of this information is already available through other endpoints, but it would be good to see it all on the same endpoint.

Some information, such as the known preimages, can also be obtained by SSHing into the machine and reading the SQLite database, but that is very cumbersome to do.

Some information, such as SCP messages, is only available in the logs, and only if we set the level to Trace.

Some information, such as the currently banned nodes, is kept in its own file.
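
To make the idea concrete, here is a minimal sketch of what one aggregated debug snapshot could look like. Everything in it is hypothetical: `DebugSnapshot`, its field names, and `trimScp` are illustrative only and are not taken from Agora's code base. It covers just a few of the items listed above.

```d
import std.format : format;

/// Hypothetical aggregate of node internals; none of these names exist in Agora
struct DebugSnapshot
{
    ulong latest_height;          // height of the latest known block header
    size_t tx_pool_size;          // number of transactions in the transaction pool
    string[] connected_peers;     // addresses of the nodes we are connected to
    string[] banned_peers;        // addresses of the nodes that are banned
    string[] recent_scp_messages; // one-line SCP message summaries, oldest first

    /// Keep only the last `count` SCP message summaries
    void trimScp (size_t count)
    {
        const excess = recent_scp_messages.length > count
            ? recent_scp_messages.length - count : 0;
        recent_scp_messages = recent_scp_messages[excess .. $];
    }

    /// One-line overview, suitable as the first line of a debug page
    string summary () const
    {
        return format!"height=%s txpool=%s peers=%s banned=%s scp=%s"(
            latest_height, tx_pool_size, connected_peers.length,
            banned_peers.length, recent_scp_messages.length);
    }
}
```

A single endpoint returning such a value would let us compare several validators side by side without touching the logs or the database.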

@ferencdg ferencdg added the type-feature label (An addition to the system introducing new functionalities), and added then removed the difficulty-easy label (An issue which is estimated to be relatively easy to resolve) Apr 14, 2021
@Geod24
Collaborator

Geod24 commented Apr 18, 2021

What I was actually thinking about is to expose a UNIX socket that would allow one to interact directly with the nodes. This would also allow us to control nodes, e.g. by setting the log level and so on. But all things considered, going the HTTP route, with a control interface, might be much simpler to implement.

@Geod24 Geod24 self-assigned this May 4, 2021
@Geod24 Geod24 added this to the 4. Tool Integration milestone May 4, 2021
@Geod24
Collaborator

Geod24 commented May 31, 2021

For the moment this should be added to the admin interface.
One example is that we can now dynamically reconfigure loggers:

/***************************************************************************

    Set the configuration of a Logger

***************************************************************************/

public void postLogger (
    @viaQuery("name") string name,
    @viaQuery("propagate") bool propagate = true,
    @viaQuery("level") Nullable!(ILogger.Level) level = Nullable!(ILogger.Level).init,
    @viaQuery("additive") Nullable!bool additive = Nullable!(bool).init,
    @viaQuery("console") Nullable!bool console = Nullable!(bool).init,
    @viaQuery("file") Nullable!string file = Nullable!(string).init);
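
As a rough illustration of how this could be called, the sketch below builds the request target for that endpoint. It rests on assumptions: that the interface is registered with vibe.d's default REST naming (which would map `postLogger` to `POST /logger`), and that the level is passed by its enum name. The helper `loggerTarget` and the logger name in the usage note are hypothetical.

```d
import std.array : join;
import std.uri : encodeComponent;

/// Build the request target for a logger reconfiguration call.
/// ASSUMPTION: `postLogger` is exposed as `POST /logger`; only `name` and
/// `level` are handled here, the other query parameters keep their defaults.
string loggerTarget (string name, string level = null)
{
    string[] params = [ "name=" ~ encodeComponent(name) ];
    if (level.length)
        params ~= "level=" ~ encodeComponent(level);
    return "/logger?" ~ params.join("&");
}
```

For example, `loggerTarget("agora.consensus", "Trace")` yields `/logger?name=agora.consensus&level=Trace`, which would then be sent as a POST request to wherever the admin interface is listening.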

@Geod24
Collaborator

Geod24 commented May 31, 2021

> last 10 known preimages for each validator and their corresponding heights

Already part of the API:

public PreImageInfo[] getPreimages (ulong start_height, ulong end_height);

> last 5-10 SCP messages received/sent

Currently not very useful, but will keep it in mind for the future.

> latest block header/maybe transaction summary
> summary of transactions in the transaction pool

We already have getBlocksFrom, but for transactions, we need an endpoint to dump the tx pool / enrollment pool.

> what candidate a node would propose, if it was its turn

I think we're better off just logging it for now.

> what nodes we are connected to
> which nodes are banned

Yup.

> Some information, such as SCP messages, is only available in the logs, and only if we set the level to Trace.

I think the logger reconfigure mentioned above helps with this.

@ferencdg
Contributor Author

> > last 10 known preimages for each validator and their corresponding heights
>
> Already part of the API:
>
> public PreImageInfo[] getPreimages (ulong start_height, ulong end_height);
>
> > last 5-10 SCP messages received/sent
>
> Currently not very useful, but will keep it in mind for the future.
>
> > latest block header/maybe transaction summary
> > summary of transactions in the transaction pool
>
> We already have getBlocksFrom, but for transactions, we need an endpoint to dump the tx pool / enrollment pool.
>
> > what candidate a node would propose, if it was its turn
>
> I think we're better off just logging it for now.
>
> > what nodes we are connected to
> > which nodes are banned
>
> Yup.
>
> > Some information, such as SCP messages, is only available in the logs, and only if we set the level to Trace.
>
> I think the logger reconfigure mentioned above helps with this.

I think it would be nice to have all that information show up as a result of 1 HTTP/REST call, so we don't have to reconfigure/scavenge the logfiles. We currently have 5 validators, and logging into each of those machines to hunt for the relevant log messages will still take much longer than just opening 5 browser tabs. We might not find the exact cause of the problem we are trying to debug, but we would hopefully at least know whether it lies in SCP, the network, or something else.

@Geod24
Collaborator

Geod24 commented May 31, 2021

> I think it would be nice to have all that information show up as a result of 1 HTTP/REST call, so we don't have to reconfigure/scavenge the logfiles.

For that, maybe an ELK stack would be better suited?
