Updated esquery.py file. #58

aswanipranjal · 2018-05-19T17:16:30Z

Closes #57.

With this PR, all the string queries in esquery.py file have been converted into elasticsearch_dsl objects such as Aggregation, Query and Search.
I've also tried to update the searches and queries to the latest norms that elasticsearch follows by using match_phrase rather than using type: phrase to search for queries.

get_agg returns a dictionary containing the search object with the necessary fields.

Removed all the manually created string queries in all the functions and replaced them with elasticsearch_dsl objects like A(aggregation), Q(query) and Search. Updated queries to use the current methods to search and aggregate. Closes chaoss#57

aswanipranjal · 2018-05-21T12:52:47Z

@jgbarah, @valeriocos, @acs can you please review this PR?

valeriocos

LGTM, thank you @aswanipranjal. If it is possible, I would suggest to add some tests.

jgbarah

Thanks a lot. Very good job. Could you please try to address the comments? In addition, we need some tests. Could you write at least one test function per function? I think that producing some queries, and comparing that they are the same (by comparing the resulting DSL dicts, for example) would be good enogugh for a start.

jgbarah · 2018-05-23T07:21:23Z

manuscripts/esquery.py


        :param filters: dict with the filters to be applied
        :param inverse: if True include all the inverse filters (the one starting with *)
-        :return: a string with the DSL value for these filters
+        :return: a list of match object(dict)(s)


In all references to elasticsearch_dsl objects, I would expliticly mention the module. For example in this case, "list of elasticsearch_dsl match objects", or if you prefer, "es_dsl" for short. Please, add this to all this kind of references in other methods, so that it is crystal clear what they do.

Besides, what do you mean by (dict)(s)? It is better that you describe in detail the data structure that the method returns, including, if it is convenient, an example. Remember that this message should include enough information for a trained developer to be able of using the method.

By dicts i meant that the filters are dictionaries. I'll update the comments to be more clear.

jgbarah · 2018-05-23T07:25:39Z

manuscripts/esquery.py


 class ElasticQuery():
    """ Helper class for building Elastic queries """

    AGGREGATION_ID = 1  # min aggregation identifier
    AGG_SIZE = 100  # Default max number of buckets
-    ES_PRECISION = 3000
+    ES_PRECISION = 3000  # This is the default value for percision_threshold

    @classmethod
    def __get_query_filters(cls, filters=None, inverse=False):


I would use "filters=[]". Since it is expected to be a list, this way you don't need, in the implementation, to check for None, and it is even more explicit that you're expecting a list for that argument. I would only use None if you find a case where you need to make a difference between that "None" and an empty list as an argument.

jgbarah · 2018-05-23T07:26:49Z

manuscripts/esquery.py


        return query_filters

    @classmethod
    def __get_query_range(cls, date_field, start=None, end=None):
        """
-        Create a string with range DSL query for the date in date_field from start to end dates.
+        Create a filter dict with date_field from start to end dates.

        :param date_field: field with the date value


For dates, and other "complex" data types, please specify the actual format, if possible. For example, in this case, is it "datetime.datetime"? (besides: if it is not something such as datetime.datetime, I would rather consider changing the profile to really use something like that, so that dates are converted into a pythonic format before this method (and others below) are called. We can build some auxiliary function for that, if needed.

jgbarah · 2018-05-23T07:29:30Z

manuscripts/esquery.py

+        query_basic = Search()
+
+        # Not using this for now.
+        # query_basic = query_basic.query("query_string", analyse_wildcard="true", query="*")


Any reason for keeping the comment?

No, nothing special. I was just going to experiment with the wildcard flag.

In that case, maybe remove it now, and produce later a pr with the extra functionality, if you find it convenient.

jgbarah · 2018-05-23T07:31:11Z

manuscripts/esquery.py

+        if not agg_id:
+            agg_id = cls.AGGREGATION_ID
+        query_agg = A("terms", field=field, size=cls.AGG_SIZE, order={"_count": "desc"})
+        """


This is much better in the description of the method, above.

jgbarah · 2018-05-23T07:35:49Z

manuscripts/esquery.py

        return bounds

    @classmethod
    def __get_query_agg_ts(cls, field, time_field, interval=None,
                           time_zone=None, start=None, end=None,
                           agg_type='count', offset=None):
        """
-        Create a string with an aggregation DSL query for getting the time series
-        values for field.
+        Create an aggregation DSL object for getting the time series values for field.


Isn't it really a "es_dsl aggregation object" instead of a "aggregation DSL object"?

Yeah, my bad. Thanks!

jgbarah · 2018-05-23T07:38:09Z

manuscripts/esquery.py

-        """ % (query_basic)
-
+        # size=0 gives only the count and not the hits
+        query = query_basic.extra(size=0)
        return query

    @classmethod
    def get_agg(cls, field=None, date_field=None, start=None, end=None,


As commented above, in other comments, try to clean up the profile a bit, so that for example fields is a list, dates are datetime or similar, etc. Please remember to do so for all methods.

jgbarah · 2018-05-23T07:46:53Z

manuscripts/esquery.py

+                                                                offset=offset)
+        s.aggs.bucket(agg_id, query_agg)
+
+        return json.dumps(s.to_dict())


I'm not sure why this is returning the query as a JSON file. Shouldn't it better return a es_dsl object, so that we get rid of es DSL queries completely?

@jgbarah, that is because in metrics.py, get_metrics_data function requests elasticsearch for the data and it requires the data to be in JSON format other wise we will get a 500 error.

We can change the metrics.py file as well to use the Search().execute method to get the data but i thought that we are probably going to change all the methods and replace them with better more agile methods anyways hence I didn't change all the files.

@jgbarah, @valeriocos: Should I be looking at changing all the files so that we are only using elasticsearch_dsl module?

OK, I see. Not for now. Let's keep that for another pr, but let's finish this letting it produce JSON as the result.

aswanipranjal · 2018-05-23T09:06:03Z

I'll make another commit with the tests and cleaning up the comments a bit more.

Added details to the comments in esquery.py about the data sctuctures used and the format that the parameters should be in. Added tests for every function in esquery.py file.

aswanipranjal · 2018-05-26T09:38:02Z

@jgbarah @valeriocos @acs please review. I've updated esquery.py with proper comments and added tests for all the functions.

I am ready to make another PR which completes the cleaning process. Once this PR has been merged, then I'll make the second PR.

Or, if not, I can make that commit in that PR a part of this PR only.

valeriocos · 2018-05-31T07:19:05Z

tests/test_esquery.py

+        self.assertDictEqual(self.es._ElasticQuery__get_query_range(date_field=self.date_field,
+                             start=self.start, end=self.end), test_range_dict)
+
+    def test_get_query_basic(self):


I would split this test in two: test_no_filters and test_with_filters.

valeriocos · 2018-05-31T07:20:04Z

tests/test_esquery.py

+        }
+        self.assertDictEqual(self.es._ElasticQuery__get_bounds(self.start, self.end), test_bounds_dict)
+
+    def test_get_query_agg_ts(self):


I would split this test in two: tests that check exceptions and tests that do not

valeriocos · 2018-05-31T07:20:36Z

tests/test_esquery.py

+        self.assertDictEqual(test_ts_dict2.to_dict(), test_avg_ts_dict)
+
+    def test_get_agg(self):
+        """Test the aggregation function"""


Please, add a blank line after the method description

valeriocos

Thank you @aswanipranjal for this PR. Overall it looks good to me. I have added just some minor comments.

jgbarah

OK, let's go with this version, so that we can build on top of it.

@aswanipranjal please take note of @valeriocos suggestion, and split the tests in a separate pr.

Updated esquery.py

78d4eac

Removed all the manually created string queries in all the functions and replaced them with elasticsearch_dsl objects like A(aggregation), Q(query) and Search. Updated queries to use the current methods to search and aggregate. Closes chaoss#57

jgbarah requested review from acs, jgbarah and valeriocos May 23, 2018 07:15

valeriocos approved these changes May 23, 2018

View reviewed changes

jgbarah requested changes May 23, 2018

View reviewed changes

Updated esquery.py and created tests/test_esquery.py

9acc5c2

Added details to the comments in esquery.py about the data sctuctures used and the format that the parameters should be in. Added tests for every function in esquery.py file.

aswanipranjal force-pushed the cleanup-esquery branch from 08bfcd9 to 9acc5c2 Compare May 26, 2018 09:25

aswanipranjal mentioned this pull request May 30, 2018

Update metrics.py, esquery.py and test_esquery.py to use only elasticsearch_dsl module #60

Closed

valeriocos reviewed May 31, 2018

View reviewed changes

jgbarah approved these changes May 31, 2018

View reviewed changes

jgbarah merged commit 104b657 into chaoss:master May 31, 2018

aswanipranjal deleted the cleanup-esquery branch June 3, 2018 11:04

aswanipranjal restored the cleanup-esquery branch June 3, 2018 11:04

aswanipranjal deleted the cleanup-esquery branch July 8, 2018 07:29

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Updated esquery.py file. #58

Updated esquery.py file. #58

aswanipranjal commented May 19, 2018

aswanipranjal commented May 21, 2018

valeriocos left a comment •

edited

Loading

jgbarah left a comment

jgbarah May 23, 2018

aswanipranjal May 23, 2018

jgbarah May 23, 2018

jgbarah May 23, 2018

jgbarah May 23, 2018

jgbarah May 23, 2018

aswanipranjal May 23, 2018

jgbarah May 23, 2018

jgbarah May 23, 2018

jgbarah May 23, 2018

aswanipranjal May 23, 2018

jgbarah May 23, 2018

aswanipranjal May 23, 2018

jgbarah May 23, 2018

aswanipranjal May 23, 2018

aswanipranjal May 23, 2018 •

edited

Loading

jgbarah May 23, 2018

aswanipranjal commented May 23, 2018

aswanipranjal commented May 26, 2018 •

edited

Loading

valeriocos May 31, 2018

valeriocos May 31, 2018

valeriocos May 31, 2018

valeriocos left a comment

jgbarah left a comment

Updated esquery.py file. #58

Updated esquery.py file. #58

Conversation

aswanipranjal commented May 19, 2018

aswanipranjal commented May 21, 2018

valeriocos left a comment • edited Loading

Choose a reason for hiding this comment

jgbarah left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

aswanipranjal May 23, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

aswanipranjal commented May 23, 2018

aswanipranjal commented May 26, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

valeriocos left a comment

Choose a reason for hiding this comment

jgbarah left a comment

Choose a reason for hiding this comment

valeriocos left a comment •

edited

Loading

aswanipranjal May 23, 2018 •

edited

Loading

aswanipranjal commented May 26, 2018 •

edited

Loading