Create an endpoint that shows agora internals (for debugging) #1934

ferencdg · 2021-04-14T00:47:47Z

Currently it is very hard to debug complex problems such as #1552 in our production environment.

It would be helpful to have an HTTP endpoint that exposes the most important internals of the Agora instance to help with debugging. This includes:

last 10 known preimages for each validator and their corresponding heights
last 5-10 SCP messages received/sent
latest block header/maybe transaction summary
summary of transactions in the transaction pool
what candidate a node would propose, if it was its turn
what nodes we are connected to
which nodes are banned
...

Some of this information is already available through other endpoints, but it would be good to see it all on the same endpoint.

Some information, like the known preimages, can also be obtained by SSHing into the machine and reading the SQLite DB, but that is very cumbersome.

Some information, like SCP messages, is only available in the logs, and only if we set the level to Trace.

Some information, like the currently banned nodes, is kept in its own file.

Geod24 · 2021-04-18T12:47:04Z

What I was actually thinking about is exposing a UNIX socket that would allow one to interact directly with the nodes. This would also allow us to control nodes, e.g. by setting the log level and so on. But all things considered, going the HTTP route, with a control interface, might be much simpler to implement.

Geod24 · 2021-05-31T00:55:52Z

For the moment this should be added to the admin interface.
One example is that we can now dynamically reconfigure loggers:

agora/source/agora/api/Admin.d

Lines 30 to 42 in 2809791

    /***************************************************************************

        Set the configuration of a Logger

    ***************************************************************************/

    public void postLogger (
        @viaQuery("name") string name,
        @viaQuery("propagate") bool propagate = true,
        @viaQuery("level") Nullable!(ILogger.Level) level = Nullable!(ILogger.Level).init,
        @viaQuery("additive") Nullable!bool additive = Nullable!(bool).init,
        @viaQuery("console") Nullable!bool console = Nullable!(bool).init,
        @viaQuery("file") Nullable!string file = Nullable!(string).init);
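To illustrate how such a call might look from the client side, here is a minimal sketch that only builds the request for `postLogger`. It assumes the REST generator maps `postLogger` to `POST /logger` and reads the arguments from the query string, as the `@viaQuery` attributes suggest. The base URL, the port, and the logger name `agora.consensus` are placeholders, not values taken from the project.

```python
from urllib.parse import urlencode


def logger_request(base_url, name, level=None, propagate=None, console=None):
    """Build (method, url) for a hypothetical call to the admin `postLogger` endpoint.

    Optional arguments are left out of the query string when unset, on the
    assumption that the server then falls back to its declared defaults.
    """
    params = {"name": name}
    if level is not None:
        params["level"] = level
    if propagate is not None:
        params["propagate"] = "true" if propagate else "false"
    if console is not None:
        params["console"] = "true" if console else "false"
    # Assumed route: the `post` prefix selects the verb, `Logger` becomes the path.
    return "POST", f"{base_url.rstrip('/')}/logger?{urlencode(params)}"


# Placeholder address and logger name, for illustration only.
method, url = logger_request("http://127.0.0.1:8080", "agora.consensus", level="Trace")
print(method, url)
```

The resulting URL can then be sent with any HTTP client against the admin interface of the node being debugged.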

Geod24 · 2021-05-31T01:04:11Z

last 10 known preimages for each validators and their corresponding heights

Already part of the API:

agora/source/agora/api/FullNode.d

Line 246 in 2809791

public PreImageInfo[] getPreimages (ulong start_height, ulong end_height);

last 5-10 SCP messages received/sent

Currently not very useful, but will keep it in mind for the future.

latest block header/maybe transaction summary
summary of transactions in the transaction pool

We already have getBlocksFrom, but for transactions, we need an endpoint to dump the tx pool / enrollment pool.

what candidate a node would propose, if it was its turn

I think we're better off just logging it for now.

what nodes we are connected to
which nodes are banned

Yup.

Some information like SCP messages are only available in the logs, if we set the level to Trace.

I think the logger reconfigure mentioned above helps with this.

ferencdg · 2021-05-31T01:10:02Z

last 10 known preimages for each validators and their corresponding heights

Already part of the API:

agora/source/agora/api/FullNode.d

Line 246 in 2809791

public PreImageInfo[] getPreimages (ulong start_height, ulong end_height);

last 5-10 SCP messages received/sent

Currently not very useful, but will keep it in mind for the future.

latest block header/maybe transaction summary
summary of transactions in the transaction pool

We already have getBlocksFrom, but for transactions, we need an endpoint to dump the tx pool / enrollment pool.

what candidate a node would propose, if it was its turn

I think we're better off just logging it for now.

what nodes we are connected to
which nodes are banned

Yup.

Some information like SCP messages are only available in the logs, if we set the level to Trace.

I think the logger reconfigure mentioned above helps with this.

I think it would be nice to have all that information show up as the result of a single HTTP/REST call, so we don't have to reconfigure loggers or scavenge the log files. We currently have 5 validators. Logging into each of those machines and trying to find the relevant log messages will still take a lot more time than just opening 5 browser tabs. We might not be able to find the exact cause of the problem we are trying to debug, but we would hopefully at least know whether it is SCP, network, or something else.
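As a rough sketch of that workflow, the snippet below polls every validator in parallel and reports which ones are behind. The `/debug` path, the `height` field, and the `node-N.example` addresses are all hypothetical, since no such endpoint exists yet. The demo uses a stub in place of real HTTP calls so it runs offline.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch_json(url, timeout=5):
    """Fetch and decode a JSON document over HTTP."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def collect(validators, fetch=fetch_json):
    """Query the (hypothetical) /debug endpoint of every validator concurrently."""
    def one(base):
        try:
            return base, fetch(base + "/debug")
        except Exception as exc:  # an unreachable node is itself a useful signal
            return base, {"error": str(exc)}

    with ThreadPoolExecutor(max_workers=max(1, len(validators))) as pool:
        return dict(pool.map(one, validators))


def lagging(snapshots):
    """Return the validators whose reported height is below the highest one seen."""
    heights = {b: s["height"] for b, s in snapshots.items() if "height" in s}
    if not heights:
        return []
    top = max(heights.values())
    return sorted(b for b, h in heights.items() if h < top)


def fake_fetch(url):
    # Stand-in for real nodes: node-3 is one block behind the others.
    return {"height": 41 if "node-3" in url else 42, "banned": []}


nodes = [f"http://node-{i}.example" for i in range(1, 6)]
snapshots = collect(nodes, fetch=fake_fetch)
print(lagging(snapshots))
```

Even a coarse comparison like this would show whether a node is stuck or unreachable before anyone has to read its logs.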

Geod24 · 2021-05-31T02:44:31Z

I think it would be nice to have all that information show up as a result of 1 HTTP/REST call, so we don't have to reconfigure/scavenge the logfiles.

For that, maybe an ELK stack would be better suited?

ferencdg added the type-feature and difficulty-easy labels, then removed the difficulty-easy label, Apr 14, 2021

Geod24 self-assigned this May 4, 2021

Geod24 added this to the 4. Tool Integration milestone May 4, 2021

ferencdg mentioned this issue Jun 9, 2021

Node to node encryption #266

Closed

Geod24 modified the milestones: 4. Tool Integration, 8. MainNet Jul 27, 2021
