Skip to content
This repository has been archived by the owner on Oct 15, 2022. It is now read-only.

Request: Remove the filter for queries longer than 100 characters. #86

Merged
merged 2 commits into from
Sep 11, 2014

Conversation

jagtalon
Copy link
Member

This number set looks arbitrary, and it's breaking the behavior of some IAs such as Calculator in duckduckgo/zeroclickinfo-goodies#283

I ran the tests in Goodies and Spice in both 5.16 and 5.18, and all passed.

CC @moollaza @nilnilnil

This number set looks arbitrary, and it's breaking the behavior of some IAs such as Calculator in duckduckgo/zeroclickinfo-goodies#283
@moollaza
Copy link
Member

@jagtalon it seems like the purpose of that line is to prevent arbitrarily long queries from being handled by default -- my only concern is if this change has any implications on the performance of our IA's, i.e. is it going to make them slower? On the plus, it would mean the entire query would be passed along to the handle function.

@nilnilnil do we often receive queries that are longer than 100 characters?

I think this deserves more testing, but in the interim calculator can just be updated to handle query_raw or query_raw_parts

@mwmiller
Copy link
Contributor

I don't think this needs to be entirely unbounded, but 100 chars is clearly too short. I think moving to a larger, but still arbitrary, limit (I'm thinking 1,000) is in order.

cc: @mintsoft

@mintsoft
Copy link
Collaborator

Hmm, conceptually this does seem like a really odd thing to do; I'm happy to workaround it within URLDecode if there's a hard-limit here however

@nilnilnil
Copy link
Contributor

Thanks for noticing this!

Internally we limit to 500. It should match that. 1000 is as random a number as 100, and it has negative side-effects. Changes like this should always be based on what we know.

Here are some statistics on query length (based on character count).

MEAN: 21
STD: 14
99 Percentile: 64
MAX: 1344

@moollaza to answer your question more directly: no, they're less than 1%.

@nilnilnil
Copy link
Contributor

Also, side question: are those nested greps a stream or do they happen on the whole array after? @mwmiller, do you know? @mintsoft?

@nilnilnil
Copy link
Contributor

@jagtalon would also like a better branch name next time :).

@mwmiller
Copy link
Contributor

are those nested greps a stream or do they happen on the whole array after?

The second argument to grep is evaluated in list context, so the output of the right-hand grep is passed to the left-hand grep as a complete array.

@mwmiller
Copy link
Contributor

Changes like this should always be based on what we know.

No worries, I wasn't actually making the change, just suggesting a number which seemed reasonable absent any other facts. Thanks for arriving with those facts, though! 😁

@nilnilnil
Copy link
Contributor

bummer -- but thanks 😄

@mwmiller
Copy link
Contributor

bummer

Yeah, it kinda sucks for the needle in a haystack sorts of things of which you were likely thinking. I always try to chop off as much as possible as early as possible.. but sometimes you can't know.

@nilnilnil
Copy link
Contributor

👍

nilnilnil added a commit that referenced this pull request Sep 11, 2014
Request: Remove the filter for queries longer than 100 characters.
@nilnilnil nilnilnil merged commit ac7a9a7 into master Sep 11, 2014
@nilnilnil nilnilnil deleted the jag/100 branch September 11, 2014 21:19
@nilnilnil
Copy link
Contributor

129 released. Thanks all!

@nilnilnil
Copy link
Contributor

Actually, weird; this page doesn't show 129 as the latest release. Any idea @mwmiller @jagtalon?https://github.com/duckduckgo/duckduckgo/releases

@mwmiller
Copy link
Contributor

@nilnilnil You have to go in and edit the tag. https://github.com/duckduckgo/duckduckgo/releases/new?tag=0.129

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants