Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Search tests #150

Closed
wants to merge 9 commits into from
Closed

Search tests #150

wants to merge 9 commits into from

Conversation

scragraham
Copy link
Contributor

@scragraham scragraham commented Jan 7, 2019

Initial search api tests based off of initial work from Anton.

The following tests will fail since functionality is missing on the search side:

tests.search_tests.GetTest.test_search_get,FAIL
tests.search_tests.SearchTest.test_search_query_AND_two_results,FAIL
tests.search_tests.SearchTest.test_search_query_OR,FAIL
tests.search_tests.SearchTest.test_search_query_date_less_than,FAIL

The baseline has them as 'ok' so they will be flagged as failures, this is on purpose as those tests should pass before we release the feature.

Missing tests: range, multivalue fields.

jeanleonov and others added 6 commits November 16, 2018 16:57
 - modified get_sdk.sh as the original URL is now defunct
 - configure /search/ url to use main.app
 - modified search_tests to use Application object
 - PutTest - passes
 - GetTest - fails with 500 error
 - Search
   - Simple word query - passes
   - AND query with no results - passes
   - AND query - fails (returns no results)
   - OR query - fails (returns all results)
   - Date equality query - passes
   - Date less than - fails (returns as if equal)
…document ids.

- Gets around initial issue of get_indexes() not being implemented.
Copy link
Contributor

@jeanleonov jeanleonov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the work you've done. These tests should speed up and make more consistent our progress on SearchService changes.
Though it still requires more accurate model of document before being merged.

from google.appengine.ext import webapp


def render_doc(document):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though these functions (render_doc and parse_doc) are parts of initial branch, they need to be reworked before going to master.
As field type is crucial in Search API logic, it need to be present in data model used in tests.

For handling types and multivalued fields, I'd suggest to use structure like one is used in search.Document:

{
    "id": "doc_1",
    "rank": 32165498,
    "language": "en",
    "fields": [
        {"name": "field1", "type": "text", "language": "fr", "value": "2018-12-15"},
        {"name": "field1", "type": "date", "language": None, "value": "2018-12-15"},
        {"name": "field1", "type": "number", "language": None, "value": 2018},
        {"name": "field1", "type": "number", "language": None, "value": 12},
        {"name": "field1", "type": "number", "language": None, "value": 15},
        {"name": "fieldX", "type": "atom", "language": None, "value": "foo-bar"},
    ]
}

rank and language can be skipped for now.
Having field with name field1 5 times with 3 different types is allowed by Search API so using list of fields instead of dict is only way we can go.

},
{
"id": "seconddate",
"date_entered": "2019-04-04",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like an idea of having predefined documents list.
Though documents on its own should be updated to use list of field definitions like was explained in a comment under render_doc function.

Currently date_entered fields will be interpreted like a regular text, so it won't be searchable by 'date_entered>2017-08-09' or even 'date_entered="2019-04-04"', but it will show up in results for '2019' because as a text field it will be indexed like:

TOKEN     DOCS
2019      seconddate:1
04        seconddate:2

test-suite/tests/search_tests.py Show resolved Hide resolved
 - Modified search.py to construct the proper field types
   instead of just search.TextField.
 - When returning content from the app, pass the document id
   also since it isn't part of the fields.
 - minor tweek of hawkeye_test_runner.py to handle spaces for expected
   results.
 - Need datetime for search.DateField, also ensured that lang is
   two characters.
 - Changed the search_tests.py file to have a type for the fields
   of the document.
 - added additional verification of documents returned by queries.
 - modified date queries to return data by passing just the year instead
   of year-month-day
 - adjusted baseline.csv file to account for searchapi tests.
@scragraham
Copy link
Contributor Author

Hi @jeanleonov

thanks for the feedback, I've implemented the changes. I still need to get some multi-valued tests in, and probably some more cleanup/documentation. I'll take a look at that tomorrow. And some tests on the other field types: Atom, Html, Number

Copy link
Contributor

@jeanleonov jeanleonov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool. It's already good enough to use it, so I'll start playing with this branch for SearchService2.
Though date-field tests doesn't check documented behavior, so it should be updated before going to master. It's better to leave some tests failing with appscale, but none tests should fail with GAE.

Another minor note: imho it's nice when entire project or at least every file uses one style of quotes (' or ").

# response = self.http_post(
# '/search/get-range',
# json.dumps({'index': 'index-1', 'start_id': '1'}))
# self.assertEquals(response.status, 200)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be this commented fragment removed?

self.assertEquals(len(documents), 1)
doc = documents[0]
# Make sure we got the right document back with that date.
self.assertEquals(doc['date_entered'],"2018-12-12T00:00:00")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.assertEquals(len(documents), 1)
doc = documents[0]
# Make sure we got the right document back with that date.
self.assertEquals(doc['date_entered'], '2019-04-04T00:00:00')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same here. Date field shouldn't match "2019".

 - added removal of index schema (should help keep index size down)
Copy link
Contributor

@jeanleonov jeanleonov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look good and mainly works)

  • Though duplicated test name is important issue.
  • Sleep might be not needed with GAE most of the times. But Google guaranties only eventual consistency, so adding short sleep after put should be reasonable.
  • PyCharm highlights number of PEP8 issues (like two lines before class definition, space after comma, multiline dict formatting). I don't like to be annoying with styling comments, so I wouldn't block PR from being merged.

class GetTest(SearchTestCase):
def setUp(self):
self.app.post("/python/search/put",
json=default_documents)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to GAE docs ("Eventual consistency" section), Search API guaranties only eventual consistency. Taking it in account, SearchService2 was implemented to guarantee that document becomes available at most in X milliseconds (300ms currently).
Could you add time.sleep(0.5) after post request?

test-suite/tests/search_tests.py Show resolved Hide resolved
docs = [i["id"] for i in documents]
self.assertIn("numbers", docs)

def test_search_query_number_greater_than_equal(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicated test name

docs = [i["id"] for i in documents]
self.assertIn("numbers", docs)

def test_search_query_number_greater_than_equal(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicated test name

self.assertEquals(response.status_code, 200)
self.assertIn("documents", response.json())
documents = response.json()["documents"]
# Should only be one document
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be two documents

 - Fixed duplicate function name issue.
 - Fixed formatting in several places to address PEP8 issues.
 - Removed comments for most test cases.
 - Added doc strings to functions to describe the test.
 - Fixed GetRangeTest class to work properly
 - Added index.get_range() test cases.
 - Added multi-value field test cases.
@scragraham
Copy link
Contributor Author

Hi @jeanleonov,

Sorry about the duplicate function names, that was sloppy on my part.
I fixed most of the formatting issues that show up regarding PEP8 issues. I didn't fix the 'function name should be lowercase'. I did some overall cleanup, I think it looks better now.

I fixed the GetRangeTest class to correctly run the range tests and added a few test cases.
Also added a few simple multivalue field tests.

For the eventual consistency issue I added a global variable: CONSISTENCY_WAIT_TIME
The current value is 0.5, I actually ran into some consistency issues when running the tests
against GAE with that setting. In any case we can adjust higher easily now.

This should be a good baseline for the SearchAPI tests.

These tests all pass against GAE:

Search API Test Suite
=====================
test_search_put (tests.search_tests.PutTest) ... ok
test_search_get (tests.search_tests.GetTest) ... ok
test_get_range_all_documents (tests.search_tests.GetRangeTest) ... ok
test_get_range_three_documents_from_a (tests.search_tests.GetRangeTest) ... ok
test_get_range_three_documents_from_c (tests.search_tests.GetRangeTest) ... ok
test_search_query_AND_no_results (tests.search_tests.SearchTest) ... ok
test_search_query_AND_two_results (tests.search_tests.SearchTest) ... ok
test_search_query_OR (tests.search_tests.SearchTest) ... ok
test_search_query_date_equal_by_field (tests.search_tests.SearchTest) ... ok
test_search_query_date_global_equal (tests.search_tests.SearchTest) ... ok
test_search_query_date_less_than (tests.search_tests.SearchTest) ... ok
test_search_query_implicit_AND_two_results (tests.search_tests.SearchTest) ... ok
test_search_query_multivalue_fields_first_value (tests.search_tests.SearchTest) ... ok
test_search_query_multivalue_fields_number_no_results (tests.search_tests.SearchTest) ... ok
test_search_query_multivalue_fields_second_value (tests.search_tests.SearchTest) ... ok
test_search_query_multivalue_fields_string_no_results (tests.search_tests.SearchTest) ... ok
test_search_query_multivalue_fields_third_value (tests.search_tests.SearchTest) ... ok
test_search_query_number_by_field_equal (tests.search_tests.SearchTest) ... ok
test_search_query_number_by_field_equal_colon (tests.search_tests.SearchTest) ... ok
test_search_query_number_greater_than_equal_equality (tests.search_tests.SearchTest) ... ok
test_search_query_number_greater_than_equal_one (tests.search_tests.SearchTest) ... ok
test_search_query_number_greater_than_equal_two (tests.search_tests.SearchTest) ... ok
test_search_query_number_less_than (tests.search_tests.SearchTest) ... ok
test_search_query_number_less_than_equal (tests.search_tests.SearchTest) ... ok
test_search_query_number_less_than_no_results (tests.search_tests.SearchTest) ... ok
test_search_simple_query (tests.search_tests.SearchTest) ... ok

----------------------------------------------------------------------
Ran 26 tests in 34.815s

OK

Copy link
Contributor

@jeanleonov jeanleonov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent!

@jeanleonov
Copy link
Contributor

jeanleonov commented Jan 31, 2019

I ran the tests against my local deployment with SearchService2 - all 26 tests passed Ok.
For running it in OCD I pushed your branch to AppScale/hawkeye repo and prepared branch in appscale-testing which includes search2 role in hawkeye AppScalefile.

Daily build 5880 is run against:

All these PRs should go together

@jeanleonov
Copy link
Contributor

@jeanleonov
Copy link
Contributor

@scragraham
Copy link
Contributor Author

These tests were pulled in via PR #168 . Closing this PR

@scragraham scragraham closed this Sep 13, 2019
@scragraham scragraham deleted the search-tests branch September 13, 2019 18:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants