Skip to content


Subversion checkout URL

You can clone with
Download ZIP


Parentheses in search don't work with alternate search terms #261

traviscb opened this Issue · 9 comments

4 participants


Originally on 2010-08-26

Consider the search

author:"dixon, lance j" or (author:"dixon, la j" and year:0->2011)

which is, admittedly a contrived example. Note that it must be equivalent to

author:"dixon, lance j" or author:"dixon, la j"

given that we have no papers with years outside that range.

Yet, on inspirebeta, the first gives 255, while the second 121.

This is because author:"dixon, la j" triggers the:

No exact match found for dixon, la j, using dixon la j instead...

search modification inline when it gets no results. This is just wrong, and is very dangerous with SPIRES-style author searches.

Similar annoyance is caused by the nearest terms box, though that doesn't generally change the search result. (e.g.:


Where there is a bunch of noise (again exacerbated with SPIRES-style author expansion) that might be helpful, but might just be confusing.

I'm submitting a trivial patch that fixes this by setting ap=0 and nearest_terms_box=False when search_pattern_parenthesized calls search_pattern on the subclauses.

Note that this means that we no longer give any "help" to users if a part or all of their parenthesized query gives no results. This seems reasonable to me, but this will extend to authors in the SPIRES syntax, which might be less reasonable.

A better long term solution might be to refactor all of this so that, at the end of the search, one can check on whether 0 results were found, and give suggestions then, rather than changing/suggesting things in the middle of a compound search, with a possible enhancement to let people know, in a subtler way, if one of their subclauses failed to return results.


Originally on 2010-08-26

Please take my branch:


for the fix described above

closes #261


Originally on 2010-09-08

Your point about checking for number of results and making suggestions at the end is well taken. I agree that this is the right thing to do.

But the specific set of search discrepancies you mention are fixed in a branch with master+131...


Originally on 2010-10-04

Replying to [comment:2 jblayloc]:

But the specific set of search discrepancies you mention are fixed in a branch with master+131...

Indeed they are, however the similar search terms box fires anytime we do a search like:

find a brooks, travis

since there are a lot of "ors" in that search expansion of the first name.

This is very bad because the "no results" pushes the actual search result off of the page and a user could easily think that there are no results for the search at all.

At the very least I'd take my patch below (wrapped in CFG_INSPIRE_SITE == 1) and deploy it soon, to fix this immediate problem.

Another option is to move the "no results" to the bottom, below the search results that were found.


Originally on 2010-10-05

1) WRT current behaviour of find a brooks, travis, this query gets expanded by the SPIRES syntax compatibility parser into:

['+', 'author:"brooks, travis*"', '|', 'exactauthor:"brooks, t"', '|',
'exactauthor:"brooks, tr"', '|', 'exactauthor:"brooks, tra"', '|',
'exactauthor:"brooks, trav"', '|', 'exactauthor:"brooks, travi"']

I wonder how common is the need to iterate through all the various travi-trav-tra-tr forms; maybe a simpler expansion like:

author:"brooks, travis*" OR exactauthor:"brooks, t"

would cover the most common use cases well already?

If we manage to simplify the query expansion, it would help in removing the most weird stuff like brooks, tra as if by passing. Although it would not solve the longer-term task fully, of course, e.g. the internal parsing logic may still get exposed for unsuccessful queries like find a brooks, zelda.

2) WRT CFG_INSPIRE_SITE changes, I'll wait for the final updates then, but please check how things work for Invenio syntax vs SPIRES syntax too, because you are addressing search_pattern_parenthesised() globally, which is called_also_ for Invenio syntax. So we may get into troubles for queries like reportnumber:cern/th if they happen to live inside parens.

In longer term, if SPIRES syntax parser does query expansion in the pre-search times, while the Invenio syntax parser leaves this for the later post-search times, so to speak, then indeed the ap behaviour may be harmful in one case, while useful in the other case, and so it would seem necessary to distinguish the two for the whole duration of the search, if we want to offer both alternatives at run time in parallel. This could be done for example by enriching basic search units (opft) to include an information whether ap is wanted for a particular unit or not. If we set ap only globally, then we will loose one way or another. If we set ap differently for paren vs non-paren expressions, we may get into (foo) vs foo differences. Moreover, on a similar note, people may get confused by SPIRES mode vs Invenio mode differences, find a brooks, travis vs brooks, travis in author field.


Originally on 2010-10-06

I've cleaned up the temporary fix to 261 a little more so the SPIRES syntax converter can report whether it's actually responsive to a given search or not; this makes knowing whether the search had started life as a SPIRES syntax query easy. It's pushed to my repository in branch 261-parens-and-alternate-search.

(Travis you should delete your copy (which after all is a proper subset of mine) to avoid confusion.)


Originally on 2010-10-11

Thanks. Since the second commit fully reverted the first "temporary" commit, I preferred to squash them together, keeping only Joe's one, so that we don't have to wonder about the commit history if we happen to have to look at the code in some time from now. I hope this is OK with you.

(It is kind of like when I'm merging and there are some ultra simple issues such as today's syntax ones (#296#comment:4) or datemodified ones (#283#comment:11); then I'm often doing in-situ fixes in the name of the author during merge so that the repo history is clean.)


Originally by Joe Blaylock on 2010-10-11

In [ef470a6]:

#CommitTicketReference repository="" revision="ef470a6aec5f619319d29819dbc0e2edbbaa76b7"
WebSearch: less noisy `ap' for SPIRES syntax

- SPIRES syntax queries are pre-analyzed and may lead to many internal Boolean
  OR queries for various search units that were not present in the original
  query.  Post-search combination of queries for these units at the level of
  hitsets with alternate or no results are therefore not suitable for these
  situations.  ap=0 and display_nearest_terms_box=False is therefore set
  to work around these issues.

- Simplifies semantics around detecting presence of SPIRES syntax,
  improves kwalitee slightly.  Also adds FIXME so we don't lose history.

- This commit addresses #261.

Originally on 2010-10-11

Deployed on TEST and PROD; putting the ticket back to the assigned status for the long-term solution. Though we may perhaps want to close it here and create a new proper ticket for the long-term solution. (e.g. this ticket's subject originated mostly about paren related issues anyway)


Originally on 2010-10-26

Created new ticket, #327

@traviscb traviscb closed this
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.