Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

QS library query parsing limits arrays to 20 in length #16745

Closed
nodeworks opened this issue Dec 7, 2022 · 8 comments · Fixed by #20356
Closed

QS library query parsing limits arrays to 20 in length #16745

nodeworks opened this issue Dec 7, 2022 · 8 comments · Fixed by #20356
Assignees
Labels

Comments

@nodeworks
Copy link
Contributor

nodeworks commented Dec 7, 2022

Describe the Bug

When you send a GET request such as:

fields[]=id&fields[]=project_id&fields[]=name&fields[]=long_name&fields[]=default_sub_type.*&fields[]=funding_years.funding_year&fields[]=project_sub_types.project_sub_type.*&fields[]=regions.region.*&fields[]=categories.category.*.*&filter[project_id][_in][0]=1179901&filter[project_id][_in][1]=117991&filter[project_id][_in][2]=117992&filter[project_id][_in][3]=126244&filter[project_id][_in][4]=1223051&filter[project_id][_in][5]=122305&filter[project_id][_in][6]=129388&filter[project_id][_in][7]=131082&filter[project_id][_in][8]=130988&filter[project_id][_in][9]=130989&filter[project_id][_in][10]=130990&filter[project_id][_in][11]=131055&filter[project_id][_in][12]=131221&filter[project_id][_in][13]=131232&filter[project_id][_in][14]=131233&filter[project_id][_in][15]=131234&filter[project_id][_in][16]=131235&filter[project_id][_in][17]=131063&filter[project_id][_in][18]=131095&filter[project_id][_in][19]=2022+NODE+SPLITS+CAP+LABOR&filter[project_id][_in][20]=133863&filter[project_id][_in][21]=133864

If the filter array is longer than 20 items it will convert the array to an object and cause an error that returns no data. This is because the QS library is used without the arrayLimit config:

in /api/src/app.ts

app.set('query parser', (str: string) => qs.parse(str, { depth: 10 }));

where as this fixes the issue:

app.set('query parser', (str: string) => qs.parse(str, { depth: 10, arrayLimit: Infinity }));

To Reproduce

Run a query with an "_in" filter with an array of more than 20 items. OR look at the code sandbox to see the behavior:

https://codesandbox.io/s/node-playground-forked-vxu8zh?file=/src/index.js

Remove arrayLimit param and see how it changes the filter _in to an object.

Errors Shown

No response

What version of Directus are you using?

9.21.0

What version of Node.js are you using?

16.14.2

What database are you using?

SQL Server

What browser are you using?

Chrome

How are you deploying Directus?

Docker

@azrikahar
Copy link
Contributor

Great detective work! Seems like the 20 here isn't that arbitrary after all, as it turns out to be explicitly documented by the qs library (reference) as the default when arrayLimit is not specified:

qs will also limit specifying indices in an array to a maximum index of 20. Any array members with an index of greater than 20 will instead be converted to an object with the index as the key. This is needed to handle cases when someone sent, for example, a[999999999] and it will take significant time to iterate over this huge array.

However with the performance issue described by qs and with #11845 in mind (which added MAX_RELATIONAL_DEPTH to prevent similar performance issue with nested filters), I think a potential solution should probably be allowing this to be configurable with an environment variable rather than a blanket Infinity (but understandably you may have only used Infinity just to point out it works, rather than suggesting to use it directly!).


That being said, even though I don't really have an answer off the top of my head per se, it might also be worth re-visiting the current filtering you are performing to see whether there are other approaches, if any.

@paescuj
Copy link
Member

paescuj commented Dec 8, 2022

However with the performance issue described by qs and with #11845 in mind (which added MAX_RELATIONAL_DEPTH to prevent similar performance issue with nested filters), I think a potential solution should probably be allowing this to be configurable with an environment variable rather than a blanket Infinity (but understandably you may have only used Infinity just to point out it works, rather than suggesting to use it directly!).

A environment variable to make it configurable might be an acceptable solution. I think the most important thing would be to document this limit somewhere.

That being said, even though I don't really have an answer off the top of my head per se, it might also be worth re-visiting the current filtering you are performing to see whether there are other approaches, if any.

If there's no way around using so many filters you could use the HTTP SEARCH method instead of passing them as params.
(Also something which we should add to the docs?)

@azrikahar
Copy link
Contributor

A environment variable to make it configurable might be an acceptable solution. I think the most important thing would be to document this limit somewhere.

Maybe it can be documented around here once there is a dedicated environment variable: https://docs.directus.io/self-hosted/config-options.html#limits-optimizations, but perhaps there can be an additional call out somewhere around https://docs.directus.io/reference/query.html if users are coming across this limit more than anticipated. 🤔

If there's no way around using so many filters you could use the HTTP SEARCH method instead of passing them as params.
(Also something which we should add to the docs?)

It should be! I believe it's already documented here: https://docs.directus.io/reference/introduction.html#search-http-method

@paescuj
Copy link
Member

paescuj commented Dec 8, 2022

Maybe it can be documented around here once there is a dedicated environment variable: https://docs.directus.io/self-hosted/config-options.html#limits-optimizations, but perhaps there can be an additional call out somewhere around https://docs.directus.io/reference/query.html if users are coming across this limit more than anticipated. 🤔

That would be great 👍 After thinking about it, I'm not quite sure whether it's the best solution to make this limit configurable as it might be hard to find a suitable value... Maybe it would be possible instead to catch the case when the array is converted to an object and then process this object?

It should be! I believe it's already documented here: https://docs.directus.io/reference/introduction.html#search-http-method

Nice! 😃 (Sometimes I struggle a bit using the search in the docs. With term "search" alone it doesn't show up, but of course it does when searching for "http search" 😄)

@nodeworks
Copy link
Contributor Author

@paescuj I think there definitely are a couple of options. I know for my sake right now the workaround is to use the SEARCH method, however that get's a little tricky when you're using the directus SDK since you have to create a separate adapter and then pass a flag on the query specifying you want to use the SEARCH method instead of GET. While it's not terrible I don't think it's the cleanest solution. Out of the solutions about I think the best thing would be to handle this transparently on the API side:

  1. configurable environment variable
  2. process / handle when it is in object form

Maybe both? I think the quickest solution would be to have a configurable environment variable.

@paescuj
Copy link
Member

paescuj commented Dec 9, 2022

I'm glad the workaround is working for you so far. I agree that it is not the cleanest solution at this time...

IMO, we should be somewhat cautious when it comes to the introduction of new environment variables. Once the variable is there, theoretically it requires a breaking change if we want to deprecate it at a later time.
Therefore I'd prefer evaluating other solutions first.

@rijkvanzanten
Copy link
Member

Linear: ENG-112

@rijkvanzanten
Copy link
Member

rijkvanzanten commented Jan 10, 2023

Also a quicker workaround here would be to use a CSV of values, rather than multiple query params:

?filter[project_id][_in]=1,2,3,4,5

instead of

?filter[project_id][_in][0]=1
&filter[project_id][_in][1]=2
&filter[project_id][_in][2]=3
&filter[project_id][_in][3]=4
&filter[project_id][_in][4]=5

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
Archived in project
5 participants