-
Notifications
You must be signed in to change notification settings - Fork 59
Uses the macros to replace the common queries. #127
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Advice is needed for line 213 and line 214 in run_tests.py. It is quite ugly to replace TABLE_NAME twice. |
Pull Request Test Coverage Report for Build 379
💛 - Coveralls |
| - Replaces TABLE_NAME with the table associated for the query. | ||
| """ | ||
| return (' ').join(query).format(TABLE_NAME=self._table_name) | ||
| return (' ').join(query).format(NUM_ROWS=self.NUM_ROWS, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider breaking this up into two helper methods: _replace_macros and _replace_variables.
so, it would be return _replace_variables(_replace_macros(' '.join(query))
you can optionally create meaningful variable names for the intermediate strings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
| Replaces keyword TABLE_NAME and macros in the query. | ||
| """ | ||
|
|
||
| NUM_ROWS = 'SELECT COUNT(0) AS num_rows FROM {TABLE_NAME}' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: to make these distinct from the rest of the variables (e.g. TABLE_NAME), what do you think of adding a _QUERY suffix to all of them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
| Replaces keyword TABLE_NAME and macros in the query. | ||
| """ | ||
|
|
||
| NUM_ROWS = 'SELECT COUNT(0) AS num_rows FROM {TABLE_NAME}' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's cleaner to create an enum of these macros. It both provides a natural grouping of these constants and also is easier to loop through them (e.g. adding a new macro is just as easy as adding a new entry in that Enum).
So, it would be something like:
import enum
class _QueryMacros(enum.Enum): # nested under the QueryFormatter class
NUM_ROWS_QUERY = 'SELECT ...'
SUM_START_QUERY = 'SELECT ...'
def _replace_macros(query):
for macro in _QueryMacros:
if macro.name == query:
return macro.value
else:
return query
This also means that you'd no longer need the surrounding {} brackets in the test def, which further distinguishes macros from regular variable names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for such a detailed comment! DONE.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually got a warning on "for macro in _QueryMacros", says "Expected expected collections.iterable got Type[_QueryMacros] instead. Any idea how to fix this? I have found one way which loops through the members.items, but I don't think it is as readable as current code.
| " SUM(start_position) AS sum_start, ", | ||
| " SUM(end_position) AS sum_end ", | ||
| "FROM {TABLE_NAME}" | ||
| "{NUM_ROWS}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I think it's cleaner to just put these on one line now that you don't have multiple lines.
e.g.
{
"query": [ "NUM_ROWS_QUERY" ],
"expected_result": { "num_rows": 5 }
},
{
"query": [ "SUM_START_QUERY" ],
"expected_result": { "sum_start": 55 }
},
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point!
|
|
||
| def format_query(self, query): | ||
| # type: (List[str]) -> str | ||
| # type: (list[str]) -> str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. Is the correct format list instead of List?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so, List actually gives a warning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bashir2 FYI, looks like we should be using list instead of List
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really; I mean, I think IntelliJ is smart enough to understand list but the right choice is List according to PEP 484, AFAIK. The reason that you have been getting warnings for List is because you have not imported it from the typing module, check processed_variant.py for an example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. I will correct it.
|
|
||
| def format_query(self, query): | ||
| # type: (List[str]) -> str | ||
| # type: (list[str]) -> str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really; I mean, I think IntelliJ is smart enough to understand list but the right choice is List according to PEP 484, AFAIK. The reason that you have been getting warnings for List is because you have not imported it from the typing module, check processed_variant.py for an example.
| Formatting logic is as follows: | ||
| - Concatenates ``query`` parts into one string. | ||
| - Replaces macro NUM_ROWS_QUERY/SUM_START_QUERY/SUM_END_QUERY with the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider not listing the members of _QueryMacros here such that adding a new query or changing the name of one in _QueryMacros does not an update of this documentation (which will probably not happen making the documentation obsolete in future).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
PTAL. |
arostamianfar
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Thanks! Just some minor nits.
|
|
||
| class _QueryMacros(enum.Enum): | ||
| NUM_ROWS_QUERY = 'SELECT COUNT(0) AS num_rows FROM {TABLE_NAME}' | ||
| SUM_START_QUERY = ('SELECT SUM(start_position) AS sum_start FROM {' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: consider the following format as it's more readable (in general, try to avoid breaking lines as much as possible).
SUM_START_QUERY = (
'SELECT SUM(start_position) AS sum_start FROM {TABLE_NAME}')
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, done.
| - Replaces TABLE_NAME with the table associated for the query. | ||
| """ | ||
| return (' ').join(query).format(TABLE_NAME=self._table_name) | ||
| return self._replace_variables(self._replace_macros((' ').join(query))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: do you need the brackets around ' '? i.e. does ..._macros(' '.join(query))) work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
| """Formats a query. | ||
| Replaces keyword TABLE_NAME and eventually macros in the query. | ||
| Replaces macros and variable TABLE_NAME in the query. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: by the same token as Bashir's comment, please remove 'table_name' from here as well. (just "Replaces macros and variables in the query").
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
|
Thanks. PTAL :) |
arostamianfar
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
* Add multiple query support to integration tests googlegenomics#120. update all tests in integration small_tests to support multiple query. update the validate_table in run_tests, add a loop for all test cases in the test file. changed the required keys for the .json file. Remove "validation_query" and "expected_query_result", and add "test_cases". Ran ./deploy_and_run_tests.sh and all integration tests passed. Update the development guide doc (googlegenomics#124) Update the development guide doc. Add IntelliJ IDE setup. Add more details. Added an INFO message for the full command. Tested: Ran manually and checked the new log message. Uses the macros to replace the common queries. (googlegenomics#127) Define NUM_ROWS, SUM_START, SUM_END in QueryFormatter, and replaces them in the query to avoid duplicate code. TESTED: deploy_and_run_tests. Define SchemaDescriptor class. Provides serialization and lookup API for type/mode of schema fields. Tested: unit test
Define NUM_ROWS, SUM_START, SUM_END in QueryFormatter, and replaces them in the query to avoid duplicate code. TESTED: deploy_and_run_tests.
Define NUM_ROWS, SUM_START, SUM_END in QueryFormatter, and replaces them in the query to avoid duplicate code. TESTED: deploy_and_run_tests.
Define NUM_ROWS, SUM_START, SUM_END in QueryFormatter, and replaces them in the query to avoid duplicate code.
TESTED:
deploy_and_run_tests.