Initial working draft of aggregating serializer and viewset, see #HEA… #127

chrispreee · 2024-11-04T11:26:02Z

…-547

rhunwicks

As mentioned in the individual comments, i think it would benefit from some refactoring, but the overall approach looks good.

Mostly, I think it needs unit tests.

rhunwicks · 2025-01-09T04:38:37Z

apps/baseline/models.py

    quantity_other_uses = models.PositiveIntegerField(blank=True, null=True, verbose_name=_("Quantity Other Uses"))
    # Can normally be calculated / validated as `quantity_produced + quantity_purchased - quantity_sold - quantity_other_uses`  # NOQA: E501
-    # but there are exceptions, such as MilkProduction, where there is also an amount used for ButterProduction
+    # but there are exceptions, such as MilkProduction, where there is also an amount used for ButterProduction, is this captured quantity_other_uses?  # NOQA: E501


No there is a MilkProduction.quantity_butter_production that captures that separately. Please update this comment accordingly.

Will do thanks

rhunwicks · 2025-01-09T04:47:03Z

apps/baseline/viewsets.py

+        &min_kcals_consumed_percent=52
+        &max_kcals_consumed_percent=99
+
+    The strategy type codes are:


The strategy type codes are defined in metadata.models.LivelihoodStrategyType.

They probably should be referenced here rather than coded as a list?

I made this very verbose as the docstring is rendered in the API HTML GUI, so this provides end user guidance.

Do we validate the value of the parameter anywhere? Is that using the metadata.models.LivelihoodStrategyType? At a miminum I think we should put a comment in the code (i.e. outside the docstring) that explains the the allowable values for this parameter must be one of choices from LivelihoodStrategyType.

rhunwicks · 2025-01-09T05:19:53Z

apps/baseline/viewsets.py

+    serializer_class = LivelihoodZoneBaselineReportSerializer
+    filterset_class = LivelihoodZoneBaselineFilterSet
+
+    def get_queryset(self):


Is there a particular reason to put this code in get_queryset instead of filter_queryset? You are calling filter_queryset here, does it get called again later by whatever mechanism DRF normally uses to call it?

For example, the default list method has:

def list(self, request, *args, **kwargs): queryset = self.filter_queryset(self.get_queryset())

I think that perhaps it is applying the global filters to the already filtered queryset. That probably doesn't make any difference, but it is hard to be sure without any unit tests.

It also seems likely that this ViewSet will never work as a retrieve/post/push endpoint - logically it can only do list. It isn't clear whether pagination makes any sense either. Therefore, a simpler and more standard definition would probably be:

class LivelihoodZoneBaselineReportViewSet(GenericViewSet): def list(self, request, *args, **kwargs): queryset = self.filter_queryset(self.get_queryset()) # All the other code from here. # <snip> serializer = self.get_serializer(queryset, many=True) return Response(serializer.data)

Interesting comment thanks.

We could just remove the filter_queryset call from this method and leave it to list.

I think putting it in get_queryset is more flexible, and a nice division of responsibilities with a clear interface, nicely encapsulated in the single-purpose method. Doing this in list would also work. I think of list more as a surface-level rendering function for functionality specific to the HTTP list method, than something for data and cell calculation, but DRF's not strict. I agree it won't ever PUT/PATCH/DELETE, I think 'get the top 10' will be common, and accidentally very large resultsets will be possible (and not easily cached), and flexibility is helpful while the requirements are being hammered out.

If you remove filter_queryset from here, then you will be doing the aggregation on the unfiltered queryset. I know that you are applying filter_queryset to the subqueries, but it still changing way that the function works and without a unit test it is hard to see whether it will affect the result.

I think that putting everything in list is the basic DRF approach for the most simple API endpoint. All the stuff that comes on top with ModelViewSet is supposed to make it easier to produce CRUD endpoints for models, which isn't what we want here.

I think that if you want to keep it in get_queryset then you still need LivelihoodZoneBaselineReportViewSet(mixins.ListModelMixin, GenericViewSet) rather than ModelViewSet.

I think putting it all in list is the "normal" DRF way - I think get_queryset should really only be altering the base queryset, e.g. adding select_related, etc. and the docs says that filter_queryset shouldn't need to be altered by the user. Here we are changing the number of rows returned by the queryset - i.e. we are changing the structure of the data to create a response that meets our needs. I think that belongs in list.

Another way to think about it is that if this viewset had another action, for example if it did also allow a POST, then would this code be going in get_queryset or in list?

That works, will do.

you will be doing the aggregation on the unfiltered queryset

The aggregation is not performed until the queryset is evaluated. It is lazy until after the filter call in list.

putting it all in list is the "normal" DRF way

I see the DRF architecture as a number of horizontal slices, with data extraction done by the bottom slices, data shaping in the middle, and rendering at the top. My understanding is that the queryset is constructed by get_queryset so that the data is defined only once for the whole viewset. HTTP methods are mid-way between extracting the data 'below', and rendering it 'above'. The RETRIEVE URL currently works. (By mistake, I will remove it!) POST doesn't use get_queryset, as it just runs an INSERT, PATCH does as it gets the object before updating.

rhunwicks · 2025-01-09T05:44:29Z

apps/baseline/serializers.py

        return str(obj.wealth_group)
+
+
+class DictQuerySetField(rest_framework_fields.SerializerMethodField):


I think this needs a class docstring explaining why it is required / what it does.

Will do thanks

rhunwicks · 2025-01-09T05:50:02Z

apps/baseline/serializers.py

+            }
+
+        # Add the ordering field if specified
+        ordering = self.context["request"].query_params.get("ordering")


If we are hard-coding the ordering parameter rather than using https://www.django-rest-framework.org/api-guide/settings/#ordering_param we need to document it. It would be better to use settings.ORDERING_PARAM

Will do thanks

rhunwicks · 2025-01-09T05:54:10Z

apps/baseline/viewsets.py

+
+    The API currently only supports a single slice at a time. For combining please run multiple searches, and add
+    the desired results to the Compare tab.
+    """


I think it would be useful if the docstring included an example response, e.g. a JSON one.

rhunwicks · 2025-01-09T05:57:06Z

apps/baseline/serializers.py

+    # Filters are automatically created, eg, min_kcals_consumed_percent and max_kcals_consumed_percent.
+    # If no ordering is specified by the FilterSet, the results are ordered by percent descending in the order here.
+    aggregates = {
+        "kcals_consumed": Sum,


what is preventing us from adding quantity_produced, quantity_consumed, income, expenditure, etc. - i.e. any field from LivelihoodActivity?

Nothing at all, wanted to get the overall design verified before building out loads of functionality.

I do think we want to be selective though, as the bare figures are unlikely to be useful and likely to be misleading, because, eg, kcals_consumed will not work for those LS types such as income that don't store that number - we need to convert cash to a kcals equivalent to incorporate those and produce a useful statistic. I suspect the most useful aggregates will be more use-case driven, for example composite and normalized:

"kcal_income": Sum( ( F("livelihood_strategies__livelihoodactivity__quantity_purchased") + F("livelihood_strategies__livelihoodactivity__quantity_produced") ) * F("livelihood_strategies__product__kcals_per_unit"), ),

This will need a small enhancement to work, but again I wanted to get the fundamentals reviewed and prototyped in the UI before getting clever.

I think that based on all the charts, we will need kcals_consumed, income and expenditure for most purposes - and a way to do the "complete" graph that converts income and expenditure to kcals.

rhunwicks · 2025-01-09T14:20:57Z

Based on discussions in Chat I am merging this now to enable testing, and because it won't have any knock-on impacts, but we will continue to evolve this MR and expect a second MR off the same branch in due course.

chrispreee force-pushed the HEA-547/filter-by-contrib-to-income branch 3 times, most recently from 8a210f8 to 10d0f1d Compare November 4, 2024 15:58

chrispreee requested review from girumb and rhunwicks November 5, 2024 15:42

chrispreee force-pushed the HEA-547/filter-by-contrib-to-income branch from 10d0f1d to 20f3d12 Compare November 13, 2024 11:40

Initial working draft of aggregating serializer and viewset, see #HEA…

7734573

…-547

chrispreee force-pushed the HEA-547/filter-by-contrib-to-income branch from 20f3d12 to 7734573 Compare November 13, 2024 11:50

rhunwicks requested changes Jan 9, 2025

View reviewed changes

rhunwicks approved these changes Jan 9, 2025

View reviewed changes

rhunwicks merged commit 81ade5c into main Jan 9, 2025
17 checks passed

		return str(obj.wealth_group)


		class DictQuerySetField(rest_framework_fields.SerializerMethodField):

Initial working draft of aggregating serializer and viewset, see #HEA… #127

Initial working draft of aggregating serializer and viewset, see #HEA… #127

Uh oh!

Conversation

chrispreee commented Nov 4, 2024

Uh oh!

rhunwicks left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

chrispreee Jan 9, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

chrispreee Jan 10, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rhunwicks commented Jan 9, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

chrispreee Jan 9, 2025 •

edited

Loading

chrispreee Jan 10, 2025 •

edited

Loading