Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement nginx's $request_time logging parameter #3547

Open
0x777 opened this issue Dec 16, 2019 · 8 comments
Open

implement nginx's $request_time logging parameter #3547

0x777 opened this issue Dec 16, 2019 · 8 comments
Assignees
Labels
a/performance c/server Related to server k/enhancement New feature or improve an existing feature

Comments

@0x777
Copy link
Member

0x777 commented Dec 16, 2019

Currently, we only log the execution time for a given request. It would be nice to measure the time from 'the arrival of first byte of the request' to the 'last byte sent as response'. This can help people debug network issues easily.

From nginx's docs:

$request_time
request processing time in seconds with a milliseconds resolution; time elapsed between the first bytes were read from the client and the log write after the last bytes were sent to the client

@0x777 0x777 added k/enhancement New feature or improve an existing feature c/server Related to server p/urgent Immediate action required labels Dec 16, 2019
@0x777 0x777 changed the title log response time as observed for a request implement nginx's $request_time logging parameter Dec 16, 2019
@jberryman
Copy link
Collaborator

This makes sense to me at a high level. Could you summarize the network issue and how you ended up debugging it?

@0x777
Copy link
Member Author

0x777 commented Dec 17, 2019

The network issue was that the clients connected to graphql-engine (deployed on k8s with istio) were noticing that the requests take a long time to finish(1-2 seconds). However in the logs of graphql-engine, the execution times were in the order of few milliseconds. The network connection between graphql-engine and the client was the issue. If $request_time was also present in the log, it would've been easier to pin point the issue to the network layer.

@0x777
Copy link
Member Author

0x777 commented Dec 17, 2019

how you ended up debugging it?

By adding logging on the clients and tracing those requests in graphql-engine (through a session variable x-hasura-trace-id).

jberryman added a commit to jberryman/graphql-engine that referenced this issue Jan 6, 2020
Add a new 'query_io_time' to the logging record.

We also use withTiming to get proper monotonic clock for
'query_execution_time' as well.
jberryman added a commit to jberryman/graphql-engine that referenced this issue Jan 17, 2020
Add a new 'query_io_time' to the logging record.

We also use 'withElapsedTime' to get proper monotonic clock for
'query_execution_time' as well.
jberryman added a commit to jberryman/graphql-engine that referenced this issue Jan 21, 2020
Add a new 'query_io_time' to the logging record.

We also use 'withElapsedTime' to get proper monotonic clock for
'query_execution_time' as well.
jberryman added a commit to jberryman/graphql-engine that referenced this issue Jan 23, 2020
Add a new 'request_read_time' to the logging record.

We also use 'withElapsedTime' to get proper monotonic clock for
'query_execution_time' as well.
jberryman added a commit to jberryman/graphql-engine that referenced this issue Jan 30, 2020
Add a new 'request_read_time' to the logging record.

We also use 'withElapsedTime' to get proper monotonic clock for
'query_execution_time' as well.
polRk pushed a commit to polRk/graphql-engine that referenced this issue Feb 12, 2020
Add a new 'request_read_time' to the logging record.

We also use 'withElapsedTime' to get proper monotonic clock for
'query_execution_time' as well.
@0x777 0x777 reopened this Mar 28, 2020
@jberryman jberryman removed their assignment Mar 30, 2020
@lexi-lambda
Copy link
Contributor

@0x777 Why is this reopened?

@tirumaraiselvan tirumaraiselvan removed the p/urgent Immediate action required label Apr 3, 2020
@0x777
Copy link
Member Author

0x777 commented Apr 3, 2020

@lexi-lambda Because @jberryman added the time taken to read the request body, but request_time as described in the issue is not implemented. Looks like this may not be possible with spock currently.

@lexi-lambda
Copy link
Contributor

Got it, thank you! Maybe we need to add it at the WAI level, then?

@Naveenaidu
Copy link
Contributor

Naveenaidu commented Dec 24, 2020

@0x777 IIUC request_time is defined as follows:

request_time = time taken to read the request body +
                         time taken to execute that request and send the response back to client

While I was working on Server Metrics issue - I noticed that in our mkSpockAction function, we have ioWaitTime as well as serviceTime which can be used for calculating the request_time mentioned in this issue.

Referencing from the code:

(ioWaitTime, reqBody) <- withElapsedTime $ liftIO $ Wai.strictRequestBody req
(serviceTime, (result, q)) <- withElapsedTime $ case apiHandler of
        AHGet handler -> do
          res <- lift $ runHandler handler
          return (res, Nothing)
        AHPost handler -> do
          parsedReqE <- runExceptT $ parseBody reqBody
          parsedReq  <- onLeft parsedReqE (logErrorAndResp (Just userInfo) requestId req (reqBody, Nothing) includeInternal headers . qErrModifier)
          res <- lift $ runHandler $ handler parsedReq
          return (res, Just parsedReq)
  • withElapsedTime is defined here
-- | Time an IO action, returning the time with microsecond precision. The
-- result of the input action will be evaluated to WHNF.
--
-- The result 'DiffTime' is guarenteed to be >= 0.
withElapsedTime :: MonadIO m => m a -> m (DiffTime, a)

If I understand all of the above correctly, will the following be a proper definition for $request parameter:

request = time taken to read the request body (ioWaitTime) + 
          time taken to execute that request and send the response back (serviceTime)

where

serviceTime = Service time, not including request IO wait time.
iotWaitTime = Request IO wait time, i.e. time spent reading the full request from the socket.

If the definition of request is correct as mentioned above, does this mean that this issue can now be progressed and we can finally log the $request_time logging parameter?

@andrewloux
Copy link
Contributor

andrewloux commented Oct 30, 2021

@0x777 Just wanted to check in on this - I'm just an invested user, wondering if there is anything I could do to contribute to push this along?

It seems like the data is there, just not exposed in the logging output at this time

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
a/performance c/server Related to server k/enhancement New feature or improve an existing feature
Projects
None yet
Development

No branches or pull requests

7 participants