
Add http endpoint to check query history #5514

Closed
zhztheplayer wants to merge 1 commit into apache:master from zhztheplayer:query-history

Conversation

zhztheplayer (Member) commented Mar 22, 2018

This is for proposal #5503.

With this feature:

Users can check the IDs of executed queries via the endpoint:

http://{broker-location}/druid/v2/history

As a specific case, users can use this endpoint to check executed SQL queries:

http://{broker-location}/druid/v2/history?sql

A detail endpoint is also provided:

http://{broker-location}/druid/v2/history/{queryId}
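For illustration, a small client sketch for the three endpoints above. The broker host/port and the helper names are assumptions for the example, not part of the PR:

```python
# Hypothetical client for the proposed history endpoints. The URL paths
# follow the PR description; BROKER and the function names are assumptions.
import json
import urllib.request

BROKER = "http://localhost:8082"  # assumed broker location


def history_url(query_id=None, sql_only=False):
    """Build the URL for one of the proposed /druid/v2/history endpoints."""
    base = f"{BROKER}/druid/v2/history"
    if query_id is not None:
        return f"{base}/{query_id}"  # detail endpoint
    if sql_only:
        return f"{base}?sql"         # SQL-only listing
    return base                      # full listing


def fetch_history(query_id=None, sql_only=False):
    """Fetch and decode a history response (requires a running broker)."""
    with urllib.request.urlopen(history_url(query_id, sql_only)) as resp:
        return json.load(resp)
```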

Related data is stored in a new table in the metadata storage:

+--------------+--------------+------+-----+---------+----------------+
| Field        | Type         | Null | Key | Default | Extra          |
+--------------+--------------+------+-----+---------+----------------+
| id           | bigint(20)   | NO   | PRI | NULL    | auto_increment |
| query_id     | varchar(255) | NO   |     | NULL    |                |
| created_date | varchar(255) | NO   |     | NULL    |                |
| type         | varchar(255) | NO   |     | NULL    |                |
| payload      | longblob     | NO   |     | NULL    |                |
+--------------+--------------+------+-----+---------+----------------+
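The schema above can be sketched as follows, using SQLite for illustration (the PR shows MySQL column types; the table name `druid_queryhistory` is an assumption, only the column names come from the table above):

```python
# Minimal sketch of the proposed history table, in SQLite for portability.
# Column names follow the PR; the table name is an assumption.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE druid_queryhistory (
        id           INTEGER PRIMARY KEY AUTOINCREMENT,
        query_id     TEXT NOT NULL,
        created_date TEXT NOT NULL,
        type         TEXT NOT NULL,
        payload      BLOB NOT NULL
    )
""")
conn.execute(
    "INSERT INTO druid_queryhistory (query_id, created_date, type, payload) "
    "VALUES (?, ?, ?, ?)",
    ("44349763-b3c1-4a8d-a9f1-5013420b46d0", "2018-03-19T03:11:08.795Z",
     "broker_time", b"45"),
)
rows = conn.execute(
    "SELECT type, payload FROM druid_queryhistory WHERE query_id = ?",
    ("44349763-b3c1-4a8d-a9f1-5013420b46d0",),
).fetchall()
```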

Different aspects of an executed query are stored under different values of the type field. The types are:

  • broker_time
    the query time metric of the query on the broker
  • node_time
    the ttfb (time to first byte) metric of subqueries run on different nodes
  • sql_query_text
    the SQL text, if the query is a SQL query
  • datasources
    the datasources used by the query

QueryHistoryResource.createDetail(List<QueryHistoryEntry> entries) is responsible for merging all aspects of a queryId into a DetailedEntry object, which is then serialized to JSON and returned as the HTTP response.

The serialized detail data looks like this:

{
  "queryID": "44349763-b3c1-4a8d-a9f1-5013420b46d0",
  "profile": {
    "broker": 45,
    "druid-middlemanager-1:8102": 3,
    "druid-historical-1:8102": 40
  },
  "createDate": "2018-03-19T03:11:08.795Z",
  "status": "FINISHED",
  "datasources": ["table-1"],
  "sql": "SELECT * FROM table-1 LIMIT 10"
}
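A rough sketch of the merging step described above, folding the per-type rows for one queryId into a single detail object. The field names follow the JSON example; the merging logic and payload encodings are assumptions for illustration:

```python
# Hypothetical sketch of what createDetail might do: merge (type, payload)
# rows for one queryId into one detail dict. Payload formats are assumed.
import json


def create_detail(query_id, entries):
    """entries: list of (type, payload) rows stored for one query_id."""
    detail = {"queryID": query_id, "profile": {}}
    for entry_type, payload in entries:
        if entry_type == "broker_time":
            # assumed: payload is the broker query time in ms
            detail["profile"]["broker"] = int(payload)
        elif entry_type == "node_time":
            # assumed: payload is a one-entry JSON object {node: ttfb}
            node, ttfb = json.loads(payload).popitem()
            detail["profile"][node] = ttfb
        elif entry_type == "sql_query_text":
            detail["sql"] = payload
        elif entry_type == "datasources":
            detail["datasources"] = json.loads(payload)
    return detail
```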

This feature is disabled by default. To enable it, set druid.broker.history.enable=true in the broker runtime configuration.

drcrallen (Contributor) commented Mar 23, 2018

zhztheplayer (Member, Author) commented Mar 26, 2018

@drcrallen

Thanks for the link; it is quite useful for me.

At first I simply wanted the Druid broker to be a query platform that provides good query-tracing ability by itself, without ELK or any other similar tools helping. In my circumstances, building a whole ELK stack would cost much more than just adding this feature to Druid itself. I also planned to add a simple UI to the Druid broker to present query history, just like the way the Overlord presents tasks, which would be more convenient.

Since the RequestLogger.java interface is a logging component and does not provide any way to read or aggregate the logs back into Druid, I am afraid it could be somewhat hard to manage request logs this way.

Any good ideas?

drcrallen (Contributor) commented:

A simple example could include using io.druid.server.log.LoggingRequestLogger or similar with a custom logging config, as per https://logging.apache.org/log4j/2.x/manual/appenders.html, that only captures that class's logs. Then you can do whatever you want with the request logs, including piping them to syslog, into a special rolling-file location, or into a JDBCAppender. You can do whatever you want with the results: ship them out, keep them local, put a simple HTTP servlet on the log directory, or whatever.
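The suggestion above could look roughly like this log4j2 snippet. Only the LoggingRequestLogger class and the appenders manual come from the comment; the appender name, file paths, and levels are assumptions for the sketch:

```xml
<Configuration>
  <Appenders>
    <RollingFile name="RequestLog" fileName="log/requests.log"
                 filePattern="log/requests-%d{yyyy-MM-dd}.log">
      <PatternLayout pattern="%m%n"/>
      <TimeBasedTriggeringPolicy/>
    </RollingFile>
  </Appenders>
  <Loggers>
    <!-- Capture only this class's logs, routed to a dedicated file -->
    <Logger name="io.druid.server.log.LoggingRequestLogger"
            level="info" additivity="false">
      <AppenderRef ref="RequestLog"/>
    </Logger>
    <Root level="warn"/>
  </Loggers>
</Configuration>
```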

Usually the request log stream itself includes a bunch of stuff that is not useful, because it is metadata queries used by clients to build up their state of the cluster.

"I want to be able to collect distributed logs but don't want to manage an ELK cluster" is fine, but your options as I see it are as follows:

  1. Suck it up and deploy a logging service
  2. Use a turnkey logging service (gcp-logging, sumologic, or splunk come to mind, but there are many others)
  3. Just log locally and have a way to inspect logs when you need to

Would any of those, or a combination thereof, work?

zhztheplayer (Member, Author) commented Apr 11, 2018

@drcrallen

Thanks for the suggestions. In my current circumstances, introducing a new service to manage logs would be somewhat expensive, which is also why I made these changes. But I could try that approach when deploying a logging service becomes feasible. Thank you.

And should I close this for now?

drcrallen (Contributor) commented:

We can leave it open for now to see if there is other desire for some sort of native logging solution. There is already one for the middle managers, so it is not totally unprecedented. But it is not clear what others would want from such a solution. For example: is a "file reader" like the peon one sufficient? Does the request logger need to be expanded to have more built-in export types?


stale bot commented Feb 28, 2019

This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@druid.apache.org list. Thank you for your contributions.

stale bot added the stale label Feb 28, 2019

stale bot commented Mar 7, 2019

This pull request has been closed due to lack of activity. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.


kxbin commented Mar 19, 2021

Yes, this idea is excellent!
