
very long ResponseContinuation on certain query #61

Closed
Geetarman opened this issue Oct 13, 2015 · 25 comments


@Geetarman

If I run this query
SELECT * FROM c WHERE c.ObjectType="Document"
I get a
ResponseContinuation = "+RID:16gQALRRBgM-AAAAAAAAAA==#RT:1"

If I run the following query (on the same collection):
SELECT * FROM c WHERE c.CustomerAreaId = "1"
the ResponseContinuation is 7,755 bytes long:

ResponseContinuation = "+RID:16gQALRRBgPrggsAAAAAAA==#RT:1#FPC:AgEuPC6KBu8CAPDd9/3e9/f9//f3+3v1/v7/+//v9/e/f3//fn+rRsDf7//ff/W9///96Pt9b/u+f3///v/f7//7+//3/t/vv3///v/+//3v3/v+///Bfb/f37+///++/77fu7uv3e+3vW/+//+/v+2+//vf//7/t/b97/9v2/v+9/v79/f7/f7v7zvt/f37/d9/v/v9ff+/3+v+/vf3//f7/...

Is this ResponseContinuation correct, or should it look like the one in the first example?
If I edit it to remove everything after the #RT:1, it appears to work, but it is very difficult to be sure with the amount of data in the collection. If the ResponseContinuation is correct, I will have to re-engineer the paging mechanism in my web site, as this is far too much data to transfer.

@ghost

ghost commented Oct 13, 2015

The continuation token helps avoid redundant work on future roundtrips. We persist information in this token so we know exactly where to resume, without needing to repeat any work on later roundtrips. Overall, this significantly reduces the cumulative RUs across the query's roundtrips.
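For context, a minimal sketch of how this token is round-tripped with the .NET SDK, inside an async method (client, collectionUri, and the query are illustrative):

    using Microsoft.Azure.Documents.Client;
    using Microsoft.Azure.Documents.Linq;

    string continuation = null;
    do
    {
        var options = new FeedOptions
        {
            MaxItemCount = 100,                // illustrative page size
            RequestContinuation = continuation // resume where the previous page stopped
        };

        var query = client.CreateDocumentQuery<dynamic>(
                collectionUri, "SELECT * FROM c WHERE c.ObjectType = \"Document\"", options)
            .AsDocumentQuery();

        var page = await query.ExecuteNextAsync<dynamic>();
        continuation = page.ResponseContinuation; // the token discussed above
        // ... consume the page here ...
    } while (!string.IsNullOrEmpty(continuation));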

May I ask, what is the concern about 7 KB? It is within the supported size limit for HTTP headers.

@Geetarman
Author

Just that I am building a highly scalable website and I want to minimize the amount of data transferred. The request continuation is being transferred in the URL, as I assumed it was always small... that was obviously a mistaken assumption on my part.

@ghost

ghost commented Oct 13, 2015

Fair enough. Perhaps you could keep the token on your server, in session state, instead of round-tripping it to the client? But yes, if you need to send this to the client, then not putting it in the URL would be best.
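A minimal sketch of that idea in a classic ASP.NET MVC controller (the session key and the feedResponse variable are illustrative):

    // Keep the bulky token server-side; only the session cookie travels to the client.
    Session["docdb-continuation"] = feedResponse.ResponseContinuation;

    // On the next page request, read it back and pass it as FeedOptions.RequestContinuation.
    string continuation = (string)Session["docdb-continuation"];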

@Geetarman
Author

How big could that continuation token get?
Knowing that might help me decide on the best solution. I might decide to store it in an unindexed collection.

@Kevin-TokyWoky

Kevin-TokyWoky commented Feb 15, 2017

Same problem here.
We have a problem transferring this token from the client machine to our server, which then makes the DocDB request.

We first passed it to our server in a query string. But it was sometimes already too long, and the request would fail. So we passed the continuation token in a custom header instead. That worked for a while, probably more than 6 months or so.

And today... I just stumbled on a 12.8 KB token.
Now the request is just too long for the browser, even using headers. See the attached file.
continuationToken.txt

We also Base64-encode this token so it can be transferred anywhere safely (since it's JSON with some {} and such), and it becomes 17.1 KB in Base64.
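For reference, a sketch of that encode/decode round-trip; the roughly 33% inflation comes from Base64 representing every 3 bytes of input as 4 output characters (rawToken is assumed to hold the JSON continuation):

    using System;
    using System.Text;

    // Encode the raw JSON token so it travels safely in a header or query string.
    string encoded = Convert.ToBase64String(Encoding.UTF8.GetBytes(rawToken));

    // Decode it again server-side before handing it back to DocumentDB.
    string decoded = Encoding.UTF8.GetString(Convert.FromBase64String(encoded));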

I could create a 'short' token on our side, which would be associated with the real DocDB token. But that means that for each DocDB request, I'd have to query another system to get the real token...
And to me it's not my job, but Microsoft's job, to give us users a token that we can easily handle.

Shall this issue be reopened?

@Geetarman
Author

For what it is worth, I decided (really, I had no option) to store the continuation token in a cache object if it was greater than a certain length, and to replace it with a GUID. When the request comes in again, I can determine whether it is a real continuation token or a cached one, in which case I retrieve the 'true' continuation token. Currently the cache I use is DocumentDB itself, but you could use a table or Redis Cache.
I guess the problem is how long to keep it. I use a dedicated, unindexed collection for this caching purpose; it (auto) deletes entries after a day (plenty of time), and it works fine for me at the moment.
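A minimal sketch of that pattern, assuming an illustrative ITokenCache abstraction with a one-day TTL (the interface, the size threshold, and the "cached:" prefix are all assumptions; the backing store could be DocumentDB, Table storage, or Redis):

    using System;

    // Illustrative cache abstraction; back it with DocumentDB, Table storage, or Redis.
    public interface ITokenCache
    {
        void Put(string key, string value, TimeSpan ttl);
        string Get(string key); // returns null when the key is missing or expired
    }

    public class ContinuationTokenShortener
    {
        private const int MaxInlineLength = 512; // illustrative size threshold
        private readonly ITokenCache _cache;

        public ContinuationTokenShortener(ITokenCache cache) => _cache = cache;

        // Outbound: replace oversized tokens with a prefixed GUID key.
        public string Shorten(string continuationToken)
        {
            if (string.IsNullOrEmpty(continuationToken) ||
                continuationToken.Length <= MaxInlineLength)
            {
                return continuationToken;
            }

            string key = Guid.NewGuid().ToString("N");
            _cache.Put(key, continuationToken, TimeSpan.FromDays(1));
            return "cached:" + key;
        }

        // Inbound: resolve a prefixed key back to the real token; pass real tokens through.
        public string Resolve(string value)
        {
            if (value != null && value.StartsWith("cached:"))
            {
                return _cache.Get(value.Substring("cached:".Length));
            }

            return value;
        }
    }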

It would be nice, though, not to have to worry about this and to remove the need to cache it myself...

@Kevin-TokyWoky

Yes, I plan on using Redis. But first I'm trying to see if I can trim this token and keep everything except the #FPC part... I'm giving myself 30 minutes of hacking to try this :D

@Geetarman
Author

Good luck. I think the term Continuation Token is a misnomer in this case - it is Continuation Data!

@Geetarman
Author

Put a request on https://feedback.azure.com/forums/263030-documentdb
I'd vote for it....

@Kevin-TokyWoky

Kevin-TokyWoky commented Feb 15, 2017

Ahah, yes.
Well, at first glance, the shorter token I just created by removing the long #FPC part works.
I've got to compare with what we have on our production server and make sure it gives the same results as this hacked token.

@Geetarman
Author

Hmmm, I don't like the sound of that. If it isn't needed, it must be a bug in DocumentDB, in which case it should be reported. I have a vague feeling I tried something like that and had problems (it was a while ago).

@Kevin-TokyWoky

Kevin-TokyWoky commented Feb 15, 2017

Same... I don't like it. Especially since I would be relying on their token format... which could change any day.
:/
But ryancrawcour mentioned that there is persisted data inside this token to save them work on future roundtrips, which makes us do more work instead. And that's probably inside this #FPC part.

Well... I think I'll go the Redis way and generate our own token. Not sure yet.

@rnagpal
Contributor

rnagpal commented Feb 26, 2017

@zfang will follow up on this issue.

@Kevin-TokyWoky

Kevin-TokyWoky commented Feb 27, 2017

Hello, thanks for reopening this.

So, a bit of follow-up, since I had almost forgotten about this GitHub issue:
following my conversation with Geetarman, I contacted Azure support, since we have a subscription.
I raised my concerns about the token and received a reply from Microsoft. See below:

  • For the query continuation token, its length could go up to 16 KB. The query engine uses the token to serialize its state so that it can resume execution correctly. Along with the resume state, the query engine also serializes some of the index lookup work onto the continuation token, to avoid repeating the same work on each continuation.
    If this is really a blocking issue for you, then I could give you some hints on trimming the continuation token before sending it back. By all means, we do not recommend this unless it is an absolute must, and it is meant to be a temporary solution.
    From our side, we're considering allowing the user to specify a maximum continuation token length, with the caveat that if serializing the resume state does not fit in the specified max size, the query execution will fail with an error. We don't have a timeline for this work yet, though.

  • For the short term, you could trim the token by removing #FPC. Please keep in mind that in some cases you might get #FPP instead (i.e. either #FPC or #FPP).
    We'll be sure to prioritize this work item, and hopefully we can get to it soon.
    Best Regards,

Very nice to see things moving forward; +1 to Microsoft. They listen.

As for us, we are indeed trimming the token right now, but we only remove the #FPC part, as at the time I didn't know about the #FPP variant. So far it seems to work great, but I suspect it must cost us a little more on our DocDB subscription, since we remove some optimization from the token. Probably.

@ansario

ansario commented Aug 3, 2017

@rnagpal it appears that our token is null, even though our query has more results and we would need the continuation token to get them. We were using the method of stripping FPC and FPP.

"{"token":null,"range":{"min":"05C1E5D191B78A083134303331323800","max":"FF"}}"

Did anything change recently?

@Kevin-TokyWoky

Kevin-TokyWoky commented Aug 3, 2017

Yep. It changed like... one month ago or something.
I changed my token regex on June 1st. My commit message reads:

Instead of FPC at the end there is now FPP

Here is the regex we are now using:

private static readonly Regex ContinuationTokenDataRegex = new Regex(@"(\+RID:.*#RT:.*#TRC:.*#RTD:.*)#(?:FPC|FPP).*", RegexOptions.Compiled | RegexOptions.Singleline);

BTW as a bonus, here is our code to shorten the tokens:

        private static string ShortenToken(string phatToken)
        {
            try
            {
                // The continuation is a JSON envelope; the bulky part lives in its "token" field.
                dynamic jsonToken = JsonConvert.DeserializeObject(phatToken);
                Match match = ContinuationTokenDataRegex.Match((string)jsonToken.token);
                if (!match.Success)
                {
                    // Unexpected format: leave the token untouched.
                    return phatToken;
                }

                // Keep the resume state, drop the #FPC/#FPP index data.
                jsonToken.token = match.Groups[1].Value;
                return jsonToken.ToString();
            }
            catch
            {
                // If anything goes wrong, fall back to the full token.
                return phatToken;
            }
        }

EDIT: However, we still have a valid token field.

@ansario

ansario commented Aug 3, 2017

@Kevin-TokyWoky that's fine, but the actual token is still coming back as null even though the response continuation itself is not null. So we can't do any regex on a null token.

@Kevin-TokyWoky

Kevin-TokyWoky commented Aug 3, 2017

Yep, I don't know. I just checked on our side and everything works properly.
I can paginate stuff, the token is not null for us.

@kirankumarkolli
Member

kirankumarkolli commented Aug 14, 2017

@ansario could you please share the 'activity id', and we will take a look?
It would also be good if you could create a new issue to track it.

@kirankumarkolli
Member

@ansario I am closing this issue, as there is a lot of other context in it as well. In case you are still blocked, feel free to raise a new issue.

@joopscheer

joopscheer commented Aug 29, 2017

@Kevin-TokyWoky Thanks for your code example. I've had to change it a bit to get it to work in my code.

// Requires: using System.Text.RegularExpressions; and using Newtonsoft.Json;
private static readonly Regex ContinuationTokenDataRegex = new Regex(@"(\+RID:.*#RT:.*#TRC:.*#RTD:.*)#(?:FPC|FPP).*", RegexOptions.Compiled | RegexOptions.Singleline);

private static string ShortenToken(string phatToken)
{
    if (string.IsNullOrEmpty(phatToken))
    {
        return phatToken;
    }

    try
    {
        // The continuation is a JSON envelope; the bulky part lives in its "token" field.
        dynamic jsonToken = JsonConvert.DeserializeObject(phatToken);
        var match = ContinuationTokenDataRegex.Match((string)jsonToken.token);
        if (!match.Success)
        {
            return phatToken;
        }

        // Keep the resume state, drop the #FPC/#FPP index data.
        jsonToken.token = match.Groups[1].Value;
        return JsonConvert.SerializeObject(jsonToken);
    }
    catch
    {
        return phatToken;
    }
}

@jamesthurley

jamesthurley commented Sep 4, 2019

From our side, we’re considering allowing the user to specify maximum continuation token length.

This has since been implemented as the ResponseContinuationTokenLimitInKb on the FeedOptions object.
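A minimal sketch of setting it (the limit and page size values are illustrative):

    using Microsoft.Azure.Documents.Client;

    var options = new FeedOptions
    {
        MaxItemCount = 25,                      // illustrative page size
        ResponseContinuationTokenLimitInKb = 1  // cap the continuation token at ~1 KB
    };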

@thomaslevesque

This has since been implemented as the ResponseContinuationTokenLimitInKb on the FeedOptions object.

The original quote said (emphasis mine):

  • From our side, we're considering allowing the user to specify a maximum continuation token length, with the caveat that if serializing the resume state does not fit in the specified max size, the query execution will fail with an error. We don't have a timeline for this work yet, though.

So it's not really a solution. You can specify a max length for the token, but it will cause requests to fail...

@jamesthurley

@thomaslevesque Happily, the way they have implemented it, they simply prune the continuation token to keep it under the desired limit, rather than failing with an error.

The caveat is that resuming the query may take a bit more work (and therefore RUs) if the continuation token has been pruned.

There is a bit more information which I found useful here: https://stackoverflow.com/a/54242859/37725

@thomaslevesque

@jamesthurley good to know, thanks!
Too bad that the max length is expressed in KB, so we can't say e.g. "no more than 128 bytes". The "minimal" continuation token is only a few bytes, so there's still no easy way to get that...
