New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#2052 search reliability #2061

Merged
merged 10 commits into from Aug 6, 2018

Conversation

Projects
None yet
4 participants
@tbaxter-18f
Member

tbaxter-18f commented Jul 20, 2018

Refactors api.views to produce more predictable and safer results.

@tbaxter-18f tbaxter-18f changed the title from 2052 search reliability to #2052 search reliability Jul 20, 2018

@hbillings

This looks good to me! Way cleaner and more logical than it was (and big thumbs-up to the liberal use of bleach).

@@ -556,6 +558,9 @@ def test_filter_by_max_experience(self):
'business_size': None}])
def test_filter_by_zero_experience(self):
"""
Should filter out any results that require greater than 0 experience
"""

This comment has been minimized.

@hbillings

hbillings Jul 24, 2018

Member

💯 on more comments!

* PES
* MOBIS
* Consolidated
* IT Schedule 70

This comment has been minimized.

@hbillings

hbillings Jul 24, 2018

Member

Is this still true now that data admins have the ability to add/modify schedules?

This comment has been minimized.

@tbaxter-18f

tbaxter-18f Jul 24, 2018

Member

I think I need to double-check this change. It's not mine, and I'm not 100% sure where it was picked up from. I THINK it came in with a merge from develop, but I need to track it down and be sure.

This comment has been minimized.

@hbillings

hbillings Jul 24, 2018

Member

Ah, okay! It might be better to strike it, since the list has the potential to be more dynamic now.

This comment has been minimized.

@tbaxter-18f

tbaxter-18f Jul 25, 2018

Member

On further inspection, this looks like a reversion of Atul's work. Correcting now.

# wouldn't match id__in anyway.
# TO-DO: take a list of phrases, pass them through
# `clean_search` and then exclude those phrases

This comment has been minimized.

@hbillings

hbillings Jul 24, 2018

Member

YES YAY

if min_experience:
max_experience = years[1]
# Ensure the input matches expected numeric format to avoid injections
if min_experience and min_experience.isdigit():

This comment has been minimized.

@hbillings
contracts = contracts.filter(business_size__istartswith='o')
# WE NEED TO DOUBLE CHECK SIN AND PRICE.
# THEY DO NOT APPEAR TO BE ON THE SEARCH PAGE.

This comment has been minimized.

@hbillings

hbillings Jul 24, 2018

Member

That's... really weird. Looks like you can issue an API query for price and SIN, even though I don't think we're using them anywhere?
https://calc.gsa.gov/api/docs/#rates

This comment has been minimized.

@tbaxter-18f

tbaxter-18f Jul 24, 2018

Member

Yeah, exactly. Is it some cruft, or is there something I don't know about?

This comment has been minimized.

@hbillings

hbillings Jul 24, 2018

Member

I'm not sure what the original rationale for including it was, but I can see at least price coming in handy if we do anything with the proposed price as a search constraint.

This comment has been minimized.

@tbaxter-18f

tbaxter-18f Jul 25, 2018

Member

I'll leave them in. Wondering if we want to log an issue to fully determine if they should be in there or not.

>>> convert_to_tsquery_union(['foo', '$@#%#@!', 'bar'])
'foo:* | bar:*'
>>> clean_search('sr. frog, sme, disco chicken')
['senior frog', 'subject matter expert', 'disco chicken']

This comment has been minimized.

@hbillings

hbillings Jul 24, 2018

Member

Or, as some call the disco chicken, :party_parrot:

# So "business manager" finds results with "business" AND "manager"
# anywhere in the labor category.
# Note this is relatively hard on the DB, producing one query per word,
# but given our number of users, small queries and indexing, we should be OK.

This comment has been minimized.

@hbillings

hbillings Jul 24, 2018

Member

Not particularly relevant now, but I wonder if this is an argument for "match exact" being checked by default.

This comment has been minimized.

@tbaxter-18f

tbaxter-18f Jul 24, 2018

Member

I don't think so. I think most people are doing small phrases, and we have a small DB with a small userbase, so we should be able to absorb these extra queries with no issue. As a user, I think 'match exact' would be Not What I Want in most cases.

@@ -381,6 +395,12 @@ def normalize_labor_category(val):
return val
def get_readable_business_size(self):
if 's' in self.business_size.lower():

This comment has been minimized.

@tadhg-ohiggins

tadhg-ohiggins Jul 31, 2018

Collaborator

Should this be if self.business_size.lower().startswith('s')? Or do we know the the presence of an s anywhere in the string is always sufficient? I ask in part because the model gives this field a length of 128.

This comment has been minimized.

@tbaxter-18f

tbaxter-18f Aug 1, 2018

Member

I think that's a really good question I don't really have the answer to. This is unchanged from L342 in the old code, but I think you're right, something feels off here.

Looking at the model and the view -- and the front end -- I think we should be able to match more accurately than any sort of in check. But at the very least, I think you are correct that startswith would be better. I would check the form HTML to look at the possible submitted values and ask @KellyBailey or @tram to confirm, but yeah, I think this check could be tightened up.

This comment has been minimized.

@tadhg-ohiggins

tadhg-ohiggins Aug 3, 2018

Collaborator

The HTML appears to submit values of either s or o, so this isn't going to cause problems currently but should likely either use an equality comparison or startswith.

This comment has been minimized.

@tbaxter-18f

tbaxter-18f Aug 3, 2018

Member

Thanks for checking! Do you want to pull the branch and make that change or would you like me to?

This comment has been minimized.

@tadhg-ohiggins

tadhg-ohiggins Aug 3, 2018

Collaborator

@tbaxter-18f If you already have it locally, would you mind doing it? I'm currently tethering from a place with bad connectivity.

This comment has been minimized.

@tbaxter-18f

tbaxter-18f Aug 3, 2018

Member

it'll be Monday AM before I get to it, but I'm happy to.

@hbillings

This comment has been minimized.

Member

hbillings commented Aug 6, 2018

Is this good to merge?

@tbaxter-18f

This comment has been minimized.

Member

tbaxter-18f commented Aug 6, 2018

If @tadhg-ohiggins has no objections, I'd say yes!

@tadhg-ohiggins

This comment has been minimized.

Collaborator

tadhg-ohiggins commented Aug 6, 2018

Looks good to me!

@tram tram merged commit a7a4071 into develop Aug 6, 2018

3 checks passed

ci/circleci: build Your tests passed on CircleCI!
Details
security/snyk - package.json (CALC) No manifest changes detected
security/snyk - requirements.txt (CALC) No manifest changes detected

@hbillings hbillings deleted the 2052-search-reliability branch Sep 29, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment