-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Search tests #150
Search tests #150
Conversation
- modified get_sdk.sh as the original URL is now defunct - configure /search/ url to use main.app - modified search_tests to use Application object - PutTest - passes - GetTest - fails with 500 error - Search - Simple word query - passes - AND query with no results - passes - AND query - fails (returns no results) - OR query - fails (returns all results) - Date equality query - passes - Date less than - fails (returns as if equal)
the earlier dated document.
…document ids. - Gets around initial issue of get_indexes() not being implemented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the work you've done. These tests should speed up and make more consistent our progress on SearchService changes.
Though it still requires more accurate model of document before being merged.
from google.appengine.ext import webapp | ||
|
||
|
||
def render_doc(document): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though these functions (render_doc
and parse_doc
) are parts of initial branch, they need to be reworked before going to master.
As field type is crucial in Search API logic, it need to be present in data model used in tests.
For handling types and multivalued fields, I'd suggest to use structure like one is used in search.Document:
{
"id": "doc_1",
"rank": 32165498,
"language": "en",
"fields": [
{"name": "field1", "type": "text", "language": "fr", "value": "2018-12-15"},
{"name": "field1", "type": "date", "language": None, "value": "2018-12-15"},
{"name": "field1", "type": "number", "language": None, "value": 2018},
{"name": "field1", "type": "number", "language": None, "value": 12},
{"name": "field1", "type": "number", "language": None, "value": 15},
{"name": "fieldX", "type": "atom", "language": None, "value": "foo-bar"},
]
}
rank
and language
can be skipped for now.
Having field with name field1
5 times with 3 different types is allowed by Search API so using list of fields instead of dict is only way we can go.
test-suite/tests/search_tests.py
Outdated
}, | ||
{ | ||
"id": "seconddate", | ||
"date_entered": "2019-04-04", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like an idea of having predefined documents list.
Though documents on its own should be updated to use list of field definitions like was explained in a comment under render_doc
function.
Currently date_entered
fields will be interpreted like a regular text, so it won't be searchable by 'date_entered>2017-08-09'
or even 'date_entered="2019-04-04"'
, but it will show up in results for '2019'
because as a text field it will be indexed like:
TOKEN DOCS
2019 seconddate:1
04 seconddate:2
- Modified search.py to construct the proper field types instead of just search.TextField. - When returning content from the app, pass the document id also since it isn't part of the fields. - minor tweek of hawkeye_test_runner.py to handle spaces for expected results. - Need datetime for search.DateField, also ensured that lang is two characters. - Changed the search_tests.py file to have a type for the fields of the document. - added additional verification of documents returned by queries. - modified date queries to return data by passing just the year instead of year-month-day - adjusted baseline.csv file to account for searchapi tests.
Hi @jeanleonov thanks for the feedback, I've implemented the changes. I still need to get some multi-valued tests in, and probably some more cleanup/documentation. I'll take a look at that tomorrow. And some tests on the other field types: Atom, Html, Number |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool. It's already good enough to use it, so I'll start playing with this branch for SearchService2.
Though date-field tests doesn't check documented behavior, so it should be updated before going to master. It's better to leave some tests failing with appscale, but none tests should fail with GAE.
Another minor note: imho it's nice when entire project or at least every file uses one style of quotes ('
or "
).
test-suite/tests/search_tests.py
Outdated
# response = self.http_post( | ||
# '/search/get-range', | ||
# json.dumps({'index': 'index-1', 'start_id': '1'})) | ||
# self.assertEquals(response.status, 200) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be this commented fragment removed?
test-suite/tests/search_tests.py
Outdated
self.assertEquals(len(documents), 1) | ||
doc = documents[0] | ||
# Make sure we got the right document back with that date. | ||
self.assertEquals(doc['date_entered'],"2018-12-12T00:00:00") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this document should be matched by "2018"
.
https://cloud.google.com/appengine/docs/standard/python/search/query_strings#Python_Queries_on_date_fields
test-suite/tests/search_tests.py
Outdated
self.assertEquals(len(documents), 1) | ||
doc = documents[0] | ||
# Make sure we got the right document back with that date. | ||
self.assertEquals(doc['date_entered'], '2019-04-04T00:00:00') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same here. Date field shouldn't match "2019".
- added removal of index schema (should help keep index size down)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Look good and mainly works)
- Though duplicated test name is important issue.
- Sleep might be not needed with GAE most of the times. But Google guaranties only eventual consistency, so adding short sleep after put should be reasonable.
- PyCharm highlights number of PEP8 issues (like two lines before class definition, space after comma, multiline dict formatting). I don't like to be annoying with styling comments, so I wouldn't block PR from being merged.
test-suite/tests/search_tests.py
Outdated
class GetTest(SearchTestCase): | ||
def setUp(self): | ||
self.app.post("/python/search/put", | ||
json=default_documents) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to GAE docs ("Eventual consistency" section), Search API guaranties only eventual consistency. Taking it in account, SearchService2 was implemented to guarantee that document becomes available at most in X milliseconds (300ms currently).
Could you add time.sleep(0.5)
after post request?
test-suite/tests/search_tests.py
Outdated
docs = [i["id"] for i in documents] | ||
self.assertIn("numbers", docs) | ||
|
||
def test_search_query_number_greater_than_equal(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Duplicated test name
test-suite/tests/search_tests.py
Outdated
docs = [i["id"] for i in documents] | ||
self.assertIn("numbers", docs) | ||
|
||
def test_search_query_number_greater_than_equal(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Duplicated test name
test-suite/tests/search_tests.py
Outdated
self.assertEquals(response.status_code, 200) | ||
self.assertIn("documents", response.json()) | ||
documents = response.json()["documents"] | ||
# Should only be one document |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be two documents
- Fixed duplicate function name issue. - Fixed formatting in several places to address PEP8 issues. - Removed comments for most test cases. - Added doc strings to functions to describe the test. - Fixed GetRangeTest class to work properly - Added index.get_range() test cases. - Added multi-value field test cases.
Hi @jeanleonov, Sorry about the duplicate function names, that was sloppy on my part. I fixed the GetRangeTest class to correctly run the range tests and added a few test cases. For the eventual consistency issue I added a global variable: CONSISTENCY_WAIT_TIME This should be a good baseline for the SearchAPI tests. These tests all pass against GAE:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent!
I ran the tests against my local deployment with SearchService2 - all 26 tests passed Ok. Daily build 5880 is run against:
All these PRs should go together |
These tests were pulled in via PR #168 . Closing this PR |
Initial search api tests based off of initial work from Anton.
The following tests will fail since functionality is missing on the search side:
tests.search_tests.GetTest.test_search_get,FAIL
tests.search_tests.SearchTest.test_search_query_AND_two_results,FAIL
tests.search_tests.SearchTest.test_search_query_OR,FAIL
tests.search_tests.SearchTest.test_search_query_date_less_than,FAIL
The baseline has them as 'ok' so they will be flagged as failures, this is on purpose as those tests should pass before we release the feature.
Missing tests: range, multivalue fields.