Query timeouts caused by interplay between compression, sampling strategy, and underlying data

## TL;DR
Queries against a constant underlying data stream (set points, very fast sample rates) will time out when using the N_QUERIES sampling strategy if the client requests compression and the total amount of data requested is too large to be processed within 60 seconds.

## Details
It appears that the response timeouts are caused by a complicated interaction of query parameters, client settings, and underlying data.  I've noticed that queries using the N_QUERIES strategy time out after 60 seconds, while identical queries using the STREAM strategy return even after processing for a very long time (several minutes at least).  Below is a pathological N_QUERIES example that times out in Firefox, plus its equivalent STREAM query.

Times out in Firefox (N_QUERIES) (502)
https://epicsweb.jlab.org/myquery/mysampler?c=R121GMES%2CR122GMES&b=2026-05-10T00%3A00%3A00&n=50000&m=history&s=10&f=0&v=6&x=n

Works in Firefox (STREAM) (< 1 second)
https://epicsweb.jlab.org/myquery/mysampler?c=R121GMES%2CR122GMES&b=2026-05-10T00%3A00%3A00&n=50000&m=history&s=10&f=0&v=6&x=s

At first I thought this was just because the stream query completes before the 60 second timeout, but the timeout does not apply to the transfer as a whole.  It is the maximum amount of time that the HTTP proxy can go without seeing data from the Tomcat server.  The mysampler endpoint is set up to stream the results out, so both strategies should easily perform well enough to stream at least one sample every 60 seconds.

Running this in curl shows that to be true (`time_starttransfer` is 1.35 seconds) and allows the n_queries query to complete successfully after 104 seconds (longer than the 60 second timeout).
```
> curl -w '%{time_starttransfer} %{time_total}\n' --no-buffer 'https://epicsweb.jlab.org/myquery/mysampler?c=R121GMES%2CR122GMES&b=2026-05-10T00%3A00%3A00&n=50000&m=history&s=10&f=0&v=6&x=n' -o /tmp/test.json
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 3906k    0 3906k    0     0  38119      0 --:--:--  0:01:44 --:--:-- 47294
1.350663 104.943293
```

However, adding the `--compressed` flag causes the query to time out.
```
> curl --compressed -w '%{time_starttransfer} %{time_total}\n' --no-buffer 'https://epicsweb.jlab.org/myquery/mysampler?c=R121GMES%2CR122GMES&b=2026-05-10T00%3A00%3A00&n=50000&m=history&s=10&f=0&v=6&x=n' -o /tmp/test.json
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   443  100   443    0     0      7      0  0:01:03  0:01:00  0:00:03   115
60.090399 60.091799
```

Running a stream query with 100 times the data and compression enabled does not time out.  The transfer starts after 0.5 seconds.
```
> curl --compressed -w '%{time_starttransfer} %{time_total}\n' --no-buffer 'https://epicsweb.jlab.org/myquery/mysampler?c=R121GMES%2CR122GMES&b=2026-05-10T00%3A00%3A00&n=5000000&m=history&s=10&f=0&v=6&x=s' -o /tmp/test.json
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1311k    0 1311k    0     0   118k      0 --:--:--  0:00:11 --:--:--  144k
0.518108 11.087084
```

My guess (suggested by Claude AI) is that the compression buffer doesn't fill up fast enough, given that the data is practically identical (10 ms intervals) and that the n_queries strategy is about 150-200x slower than the stream strategy for this constant-valued data.  Given that the transfer started after 0.5 s in the stream scenario, the math is roughly consistent (200 x 0.5 s = 100 s until the first n_queries transfer, which is > the 60 s timeout).
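
The buffering effect behind this guess can be reproduced in isolation. The following is a minimal Python sketch, not the service's actual code path: the row format produced by `make_rows` is made up for illustration, and it uses Python's `zlib` rather than whichever layer (Java app, Tomcat, or Apache) really does the compression. It feeds a few kilobytes of near-identical rows to a gzip compressor and counts how many bytes come out before the stream is finished.

```python
import zlib

def make_rows(n, value=1.5):
    """Build n JSON-like rows 10 ms apart with a constant value (made-up format)."""
    return [('{"t":%d,"v":%s},' % (i * 10, value)).encode() for i in range(n)]

def compressed_bytes_before_finish(rows):
    """Return (bytes emitted while feeding rows, bytes emitted by the final flush)."""
    comp = zlib.compressobj(wbits=31)  # wbits=31 selects gzip framing
    # compress() only returns what the compressor decides to release;
    # for small, redundant input that is little more than the gzip header.
    during = sum(len(comp.compress(row)) for row in rows)
    # flush() with the default Z_FINISH mode releases everything still buffered.
    at_end = len(comp.flush())
    return during, at_end

rows = make_rows(200)
during, at_end = compressed_bytes_before_finish(rows)
print("input bytes:", sum(map(len, rows)))
print("emitted while feeding:", during)
print("emitted at finish:", at_end)
```

If the real compression layer behaves similarly, a slow producer of highly compressible data could go a long time without the proxy seeing any bytes, even though samples are being written steadily upstream.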

Compression is a useful feature here since the uncompressed stream is around 25 Mbps, but having users' queries time out if they include a low-variance PV (i.e., a set point) in their channel list is a non-starter.  I think a workaround is to periodically trigger a flush.  I will have to look into the details some more since I'm not sure where the compression is performed (Java app, Tomcat server, or Apache ReverseProxy).  A fix like that will probably be needed in each function like the one linked below.
https://github.com/JeffersonLab/myquery/blob/1b5e14b9e15eeb7f041db4b92309dfcf73ebb7f3/src/main/java/org/jlab/myquery/QueryController.java#L459
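
As a rough illustration of the periodic-flush idea, here is a Python sketch under stated assumptions. `FlushingGzipWriter`, `max_idle`, and the injectable `clock` are hypothetical names invented for this example and are not part of myquery. The real fix would live in Java or in server configuration, depending on where compression turns out to happen. The writer forces a sync flush whenever no compressed bytes have reached the sink for `max_idle` seconds.

```python
import io
import time
import zlib

class FlushingGzipWriter:
    """Gzip chunks into `sink`, forcing a sync flush if nothing has been
    written to the sink for `max_idle` seconds (hypothetical helper)."""

    def __init__(self, sink, max_idle=10.0, clock=time.monotonic):
        self._sink = sink
        self._max_idle = max_idle
        self._clock = clock
        self._comp = zlib.compressobj(wbits=31)  # gzip framing
        self._last_output = clock()

    def _emit(self, data):
        # Only count real output as activity on the wire.
        if data:
            self._sink.write(data)
            self._last_output = self._clock()

    def write(self, chunk):
        self._emit(self._comp.compress(chunk))
        if self._clock() - self._last_output >= self._max_idle:
            # Z_SYNC_FLUSH releases everything buffered so far
            # without ending the gzip stream.
            self._emit(self._comp.flush(zlib.Z_SYNC_FLUSH))

    def close(self):
        self._emit(self._comp.flush(zlib.Z_FINISH))

# Usage: rows trickle in slowly; output still appears at least every max_idle seconds.
sink = io.BytesIO()
writer = FlushingGzipWriter(sink, max_idle=10.0)
writer.write(b'{"v":1.5},')
writer.close()
```

Each sync flush costs a few bytes of compression ratio, so flushing on an idle timer rather than after every sample keeps most of the benefit of compression while bounding the gap the proxy sees.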
