fix: Paragraph list unsorted #2411

shaohuzhang1 · 2025-02-26T03:52:19Z

fix: Paragraph list unsorted

shaohuzhang1 · 2025-02-26T03:52:42Z

apps/dataset/serializers/document_serializers.py

+            return native_search({
+                'document_custom_sql': query_set,
+                'order_by_query': QuerySet(Document).order_by('-create_time', 'id')
+            }, select_string=get_file_content(
                os.path.join(PROJECT_DIR, "apps", "dataset", 'sql', 'list_document.sql')),
                                 with_search_one=False), dataset_id



Some potential issues and optimizations:

Code Repeats: The code for one and batch_save methods is nearly identical, but different in their return values (single vs multiple results). Consider refactoring to reduce redundancy.

Order By Clause: Adding an order by clause when calling native_search can sometimes slow down the query, especially if not properly indexed. It's better to handle ordering after retrieving data if performance becomes a concern.

Parameter Naming: In edit method, it would be more descriptive to use parameter names like instance_dict.

Comments and Docstrings: The comments in your current code are quite short and don't cover all functionalities well. Consider expanding them to provide clear explanations of each part of the method.

Error Handling: Your methods do not include explicit error handling around database operations or custom SQL queries. This could lead to silent failures that might be difficult to diagnose later.

Here’s a revised version with some suggested improvements:

def one(self, instance_id: int, with_valid=False) -> Optional[Document]: """Retrieve a single document by ID.""" self.is_valid(raise_exception=True) # Fetch document using the provided filter queryset = Document.objects.get(id=instance_id) # Check if detailed content is required if with_valid: queryset = queryset.validated() # Perform search on fetched model return native_search({ 'document_data': queryset, }, select_string=os.path.join(PROJECT_DIR, "apps", "dataset", 'sql', 'list_document.sql')) def batch_save(self, instances: Iterable[Dict], with_valid=True) -> Tuple[List[int], str]: """Save one or more documents.""" document_model_list = list(instances) dataset_id = generate_dataset_id() # Implement as per your needs # Filter existing IDs from new models to avoid duplicate entry checks ids_to_check = set(d['id'] for d in document_models_list) # Get existing documents to determine modifications/insertions queryset = Document.objects.filter(id__in=ids_to_check) # If no document exists, update/create based on valid flag updated_ids, created_ids = self.update_or_insert_documents(queryset, instances, with_valid) # Return list of modified/new IDs alongside dataset ID return updated_ids + created_ids, dataset_id def batch_delete(self, instance_ids: list) -> None: """Delete multiple documents by their IDS""" queryset = Document.objects.filter(id__in=instance_ids) queryset.delete() # Helper functions...

This rework addresses some areas of inefficiency and improves readability. Always ensure you have proper unit tests covering these changes before deployment.

shaohuzhang1 · 2025-02-26T03:52:48Z

apps/dataset/serializers/paragraph_serializers.py

@@ -663,6 +663,7 @@ def get_query_set(self):
                    **{'title__icontains': self.data.get('title')})
            if 'content' in self.data:
                query_set = query_set.filter(**{'content__icontains': self.data.get('content')})
+            query_set.order_by('-create_time', 'id')
            return query_set

        def list(self):


There is one minor issue with the provided code. In Django's ORM, order methods generally use a '-' sign for descending order instead of '-', unless used within a subquery context where they would be interpreted differently. However, using -create_time,-id will ensure that both fields are ordered in reverse order (descending). There might be performance considerations depending on how frequently these queries run and how much data is involved.

Additionally, adding 'created_at' to filter or annotate operations can help improve relevance based on when each item was created:

def get_queryset(self): query_set = super().get_queryset() if 'title' in self.data: query_set = query_set.filter(title__icontains=self.data.get('title')) if 'content' in self.data: query_set = query_set.filter(content__icontains=self.data.get('content')) # Optional but recommended: Order by creation time first so older posts appear first. if 'sortby_date' in self.request.GET and self.request.GET['sortby_date'].lower() == 'asc': query_set = query_set.order_by('create_time') # Default ascending order, comment this out if not needed else: query_set = query_set.order_by('-create_time') return query_set

This ensures proper ordering and possibly adds an extra layer of flexibility if other sorting options are added later. Additionally, it checks for a specific sort request parameter to allow either asc or desc order.

fix: Paragraph list unsorted

9566471

shaohuzhang1 commented Feb 26, 2025

View reviewed changes

shaohuzhang1 merged commit 50dd8fa into main Feb 26, 2025
4 checks passed

shaohuzhang1 deleted the pr@main@fix_paragraph_sorted branch February 26, 2025 03:52

shaohuzhang1 commented Feb 26, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: Paragraph list unsorted #2411

fix: Paragraph list unsorted #2411

shaohuzhang1 commented Feb 26, 2025

shaohuzhang1 Feb 26, 2025

shaohuzhang1 Feb 26, 2025

fix: Paragraph list unsorted #2411

fix: Paragraph list unsorted #2411

Conversation

shaohuzhang1 commented Feb 26, 2025

shaohuzhang1 Feb 26, 2025

Choose a reason for hiding this comment

shaohuzhang1 Feb 26, 2025

Choose a reason for hiding this comment