-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Paragraph list unsorted #2411
Conversation
return native_search({ | ||
'document_custom_sql': query_set, | ||
'order_by_query': QuerySet(Document).order_by('-create_time', 'id') | ||
}, select_string=get_file_content( | ||
os.path.join(PROJECT_DIR, "apps", "dataset", 'sql', 'list_document.sql')), | ||
with_search_one=False), dataset_id | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some potential issues and optimizations:
-
Code Repeats: The code for
one
andbatch_save
methods is nearly identical, but different in their return values (single vs multiple results). Consider refactoring to reduce redundancy. -
Order By Clause: Adding an order by clause when calling
native_search
can sometimes slow down the query, especially if not properly indexed. It's better to handle ordering after retrieving data if performance becomes a concern. -
Parameter Naming: In
edit
method, it would be more descriptive to use parameter names likeinstance_dict
. -
Comments and Docstrings: The comments in your current code are quite short and don't cover all functionalities well. Consider expanding them to provide clear explanations of each part of the method.
-
Error Handling: Your methods do not include explicit error handling around database operations or custom SQL queries. This could lead to silent failures that might be difficult to diagnose later.
Here’s a revised version with some suggested improvements:
def one(self, instance_id: int, with_valid=False) -> Optional[Document]:
"""Retrieve a single document by ID."""
self.is_valid(raise_exception=True)
# Fetch document using the provided filter
queryset = Document.objects.get(id=instance_id)
# Check if detailed content is required
if with_valid:
queryset = queryset.validated()
# Perform search on fetched model
return native_search({
'document_data': queryset,
}, select_string=os.path.join(PROJECT_DIR, "apps", "dataset", 'sql', 'list_document.sql'))
def batch_save(self, instances: Iterable[Dict], with_valid=True) -> Tuple[List[int], str]:
"""Save one or more documents."""
document_model_list = list(instances)
dataset_id = generate_dataset_id() # Implement as per your needs
# Filter existing IDs from new models to avoid duplicate entry checks
ids_to_check = set(d['id'] for d in document_models_list)
# Get existing documents to determine modifications/insertions
queryset = Document.objects.filter(id__in=ids_to_check)
# If no document exists, update/create based on valid flag
updated_ids, created_ids = self.update_or_insert_documents(queryset, instances, with_valid)
# Return list of modified/new IDs alongside dataset ID
return updated_ids + created_ids, dataset_id
def batch_delete(self, instance_ids: list) -> None:
"""Delete multiple documents by their IDS"""
queryset = Document.objects.filter(id__in=instance_ids)
queryset.delete()
# Helper functions...
This rework addresses some areas of inefficiency and improves readability. Always ensure you have proper unit tests covering these changes before deployment.
@@ -663,6 +663,7 @@ def get_query_set(self): | |||
**{'title__icontains': self.data.get('title')}) | |||
if 'content' in self.data: | |||
query_set = query_set.filter(**{'content__icontains': self.data.get('content')}) | |||
query_set.order_by('-create_time', 'id') | |||
return query_set | |||
|
|||
def list(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is one minor issue with the provided code. In Django's ORM, order
methods generally use a '-' sign for descending order instead of '-', unless used within a subquery context where they would be interpreted differently. However, using -create_time,-id
will ensure that both fields are ordered in reverse order (descending). There might be performance considerations depending on how frequently these queries run and how much data is involved.
Additionally, adding 'created_at'
to filter or annotate operations can help improve relevance based on when each item was created:
def get_queryset(self):
query_set = super().get_queryset()
if 'title' in self.data:
query_set = query_set.filter(title__icontains=self.data.get('title'))
if 'content' in self.data:
query_set = query_set.filter(content__icontains=self.data.get('content'))
# Optional but recommended: Order by creation time first so older posts appear first.
if 'sortby_date' in self.request.GET and self.request.GET['sortby_date'].lower() == 'asc':
query_set = query_set.order_by('create_time') # Default ascending order, comment this out if not needed
else:
query_set = query_set.order_by('-create_time')
return query_set
This ensures proper ordering and possibly adds an extra layer of flexibility if other sorting options are added later. Additionally, it checks for a specific sort request parameter to allow either asc or desc order.
fix: Paragraph list unsorted