-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
process instance report refactor #977
Conversation
Warning Rate Limit Exceeded@jasquat has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 20 minutes and 36 seconds before requesting another review. How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. WalkthroughThe recent updates aim to refine process instance reporting functionality by enhancing task filtering based on user interaction status, improving metadata filtering, and upgrading query generation for reports. These changes collectively aim to boost the efficiency and accuracy of process instance reporting. Changes
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (invoked as PR comments)
Additionally, you can add CodeRabbit Configration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review Status
Actionable comments generated: 11
Configuration used: .coderabbit.yaml
Files selected for processing (1)
- spiffworkflow-backend/src/spiffworkflow_backend/services/process_instance_report_service.py (9 hunks)
spiffworkflow-backend/src/spiffworkflow_backend/services/process_instance_report_service.py
Outdated
Show resolved
Hide resolved
spiffworkflow-backend/src/spiffworkflow_backend/services/process_instance_report_service.py
Show resolved
Hide resolved
# FIXME: this may be problematic | ||
# if user_group_identifier filter is set to something you are not in | ||
process_instance_query = process_instance_query.filter(UserGroupAssignmentModel.user_id == user.id) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment on lines 438-441 indicates a potential issue with filtering by user_group_identifier
. This should be addressed or clarified to ensure the logic is correct and does not inadvertently exclude valid data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will fix in a future refactor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jasquat: Understood. If you need assistance with the refactor or have any questions in the future, feel free to reach out.
spiffworkflow-backend/src/spiffworkflow_backend/services/process_instance_report_service.py
Show resolved
Hide resolved
spiffworkflow-backend/src/spiffworkflow_backend/services/process_instance_report_service.py
Show resolved
Hide resolved
|
||
process_instances = ( | ||
process_instance_query.group_by(ProcessInstanceModel.id) | ||
process_instance_query.group_by(ProcessInstanceModel.id) # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The use of group_by(ProcessInstanceModel.id)
on line 718 without including all selected columns in the group_by
clause or using them in aggregate functions may lead to undefined behavior in some SQL databases. Ensure this is the intended usage and consider potential database compatibility issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We add the column to the select in the next line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jasquat: I understand your point about adding the column to the select statement. However, the concern raised was specifically about the SQL group_by
behavior, where all selected columns that are not part of an aggregate function should also be included in the group_by
clause to ensure defined behavior across different SQL databases. This is to prevent potential compatibility issues or unexpected results, as SQL standards require. If your query's logic is indeed correct and tested across the databases you intend to support, this might not be an issue. However, it's generally good practice to ensure that all non-aggregated selected columns are either included in the group_by
clause or used within aggregate functions to guarantee consistent behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is working as intended.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jasquat: Understood. If the implementation is working as intended and has been tested for your specific use case and database compatibility, then it should be fine. Always good to double-check these aspects for cross-database compatibility and to ensure the behavior remains consistent. If you have any more questions or need further assistance, feel free to ask.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review Status
Actionable comments generated: 8
Configuration used: .coderabbit.yaml
Files selected for processing (1)
- spiffworkflow-backend/src/spiffworkflow_backend/services/process_instance_report_service.py (9 hunks)
Additional comments: 3
spiffworkflow-backend/src/spiffworkflow_backend/services/process_instance_report_service.py (3)
- 436-439: The comment on lines 436-437 indicates a potential issue with filtering by
user_group_identifier
. This should be addressed or clarified to ensure the logic is correct and does not inadvertently exclude valid data.- 650-650: The use of
outerjoin
on line 650 may not be optimal if the intention is to filter process instances based on tasks related to the user. Consider if aninner join
would be more appropriate for the intended filter logic.- 716-716: The use of
group_by(ProcessInstanceModel.id)
on line 716 without including all selected columns in thegroup_by
clause or using them in aggregate functions may lead to undefined behavior in some SQL databases. Ensure this is the intended usage and consider potential database compatibility issues.
columns_for_metadata_query = db.session.query( | ||
ProcessInstanceMetadataModel.key.distinct() # type: ignore | ||
).order_by(ProcessInstanceMetadataModel.key) | ||
if process_model_identifier: | ||
columns_for_metadata_query = columns_for_metadata_query.join(ProcessInstanceModel) | ||
columns_for_metadata_query = columns_for_metadata_query.join(ProcessInstanceModel) # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding a comment explaining the purpose of joining ProcessInstanceModel
only when process_model_identifier
is provided. This helps maintain clarity and understand the conditional logic.
if filter_found is False: | ||
filters.append(new_filter) | ||
|
||
# When we say we want to filter by "waiting for group" or "waiting for specific user," what we probably assume is that | ||
# there are human tasks that those people can actually complete right now. | ||
# We don't exactly have that in the query directly, but if you pass a filter for user_group_identifier, it will get into | ||
# this function, and if you pass any statuses, and if they are all "active" then it will do what you want, which is | ||
# to look for only HumanTaskModel.completed.is_(False). So...we should probably make the widget add filters for | ||
# both user_group_identifier and status. And we should make a method that does a similar thing for waiting for users. | ||
@classmethod | ||
def filter_by_user_group_identifier( | ||
cls, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
📝 NOTE
This review was outside the diff hunks and was mapped to the diff hunk with the greatest overlap. Original lines [393-439]
The method filter_by_user_group_identifier
lacks documentation. Adding docstrings explaining the method's purpose, parameters, and return value would improve maintainability and readability.
Would you like me to generate the docstring for this method?
def filter_by_instances_with_tasks_waiting_for_me( | ||
cls, | ||
process_instance_query: Query, | ||
user: UserModel, | ||
) -> Query: | ||
process_instance_query = process_instance_query.filter(ProcessInstanceModel.process_initiator_id != user.id) | ||
process_instance_query = process_instance_query.join( | ||
HumanTaskModel, | ||
and_( | ||
HumanTaskModel.process_instance_id == ProcessInstanceModel.id, | ||
HumanTaskModel.completed.is_(False), # type: ignore | ||
), | ||
).join( | ||
HumanTaskUserModel, | ||
and_(HumanTaskUserModel.human_task_id == HumanTaskModel.id, HumanTaskUserModel.user_id == user.id), | ||
) | ||
|
||
user_group_assignment_for_lane_assignment = aliased(UserGroupAssignmentModel) | ||
process_instance_query = process_instance_query.outerjoin( # type: ignore | ||
user_group_assignment_for_lane_assignment, | ||
and_( | ||
user_group_assignment_for_lane_assignment.group_id == HumanTaskModel.lane_assignment_id, | ||
user_group_assignment_for_lane_assignment.user_id == user.id, | ||
), | ||
).filter( | ||
# it should show up in your "Waiting for me" list IF: | ||
# 1) task is not assigned to a group OR | ||
# 2) you are not in the group | ||
# In the case of number 2, it probably means you were added to the task individually by an admin | ||
or_( | ||
HumanTaskModel.lane_assignment_id.is_(None), # type: ignore | ||
user_group_assignment_for_lane_assignment.group_id.is_(None), | ||
) | ||
) | ||
|
||
return process_instance_query |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method filter_by_instances_with_tasks_waiting_for_me
lacks documentation. Adding docstrings explaining the method's purpose, parameters, and return value would improve maintainability and readability.
Would you like me to generate the docstring for this method?
def filter_by_instances_with_tasks_completed_by_me( | ||
cls, | ||
process_instance_query: Query, | ||
user: UserModel, | ||
) -> Query: | ||
process_instance_query = process_instance_query.filter(ProcessInstanceModel.process_initiator_id != user.id) | ||
process_instance_query = process_instance_query.join( # type: ignore | ||
HumanTaskModel, | ||
and_( | ||
HumanTaskModel.process_instance_id == ProcessInstanceModel.id, | ||
HumanTaskModel.completed_by_user_id == user.id, | ||
), | ||
) | ||
return process_instance_query |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method filter_by_instances_with_tasks_completed_by_me
lacks documentation. Adding docstrings explaining the method's purpose, parameters, and return value would improve maintainability and readability.
Would you like me to generate the docstring for this method?
def add_where_clauses_for_process_instance_metadata_filters( | ||
cls, | ||
process_instance_query: Query, | ||
report_metadata: ReportMetadata, | ||
user: UserModel | None = None, | ||
page: int = 1, | ||
per_page: int = 100, | ||
) -> dict: | ||
process_instance_query = ProcessInstanceModel.query | ||
instance_metadata_aliases: dict[str, Any], | ||
) -> Query: | ||
for column in report_metadata["columns"]: | ||
if column["accessor"] in cls.non_metadata_columns(): | ||
continue | ||
instance_metadata_alias = aliased(ProcessInstanceMetadataModel) | ||
instance_metadata_aliases[column["accessor"]] = instance_metadata_alias | ||
|
||
filters_for_column = [] | ||
if "filter_by" in report_metadata: | ||
filters_for_column = [f for f in report_metadata["filter_by"] if f["field_name"] == column["accessor"]] | ||
isouter = True | ||
join_conditions = [ | ||
ProcessInstanceModel.id == instance_metadata_alias.process_instance_id, | ||
instance_metadata_alias.key == column["accessor"], | ||
] | ||
if len(filters_for_column) > 0: | ||
for filter_for_column in filters_for_column: | ||
isouter = False | ||
if "operator" not in filter_for_column or filter_for_column["operator"] == "equals": | ||
join_conditions.append(instance_metadata_alias.value == filter_for_column["field_value"]) | ||
elif filter_for_column["operator"] == "not_equals": | ||
join_conditions.append(instance_metadata_alias.value != filter_for_column["field_value"]) | ||
elif filter_for_column["operator"] == "greater_than_or_equal_to": | ||
join_conditions.append(instance_metadata_alias.value >= filter_for_column["field_value"]) | ||
elif filter_for_column["operator"] == "less_than": | ||
join_conditions.append(instance_metadata_alias.value < filter_for_column["field_value"]) | ||
elif filter_for_column["operator"] == "contains": | ||
join_conditions.append(instance_metadata_alias.value.like(f"%{filter_for_column['field_value']}%")) | ||
elif filter_for_column["operator"] == "is_empty": | ||
# we still need to return results if the metadata value is null so make sure it's outer join | ||
isouter = True | ||
process_instance_query = process_instance_query.filter( | ||
or_(instance_metadata_alias.value.is_(None), instance_metadata_alias.value == "") | ||
) | ||
elif filter_for_column["operator"] == "is_not_empty": | ||
join_conditions.append( | ||
or_(instance_metadata_alias.value.is_not(None), instance_metadata_alias.value != "") | ||
) | ||
process_instance_query = process_instance_query.join( # type: ignore | ||
instance_metadata_alias, and_(*join_conditions), isouter=isouter | ||
).add_columns(func.max(instance_metadata_alias.value).label(column["accessor"])) | ||
return process_instance_query |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method add_where_clauses_for_process_instance_metadata_filters
lacks documentation. Adding docstrings explaining the method's purpose, parameters, and return value would improve maintainability and readability.
Would you like me to generate the docstring for this method?
def generate_order_by_query_array( | ||
cls, | ||
report_metadata: ReportMetadata, | ||
instance_metadata_aliases: dict[str, Any], | ||
) -> list: | ||
order_by_query_array = [] | ||
order_by_array = report_metadata["order_by"] | ||
if len(order_by_array) < 1: | ||
order_by_array = ProcessInstanceReportModel.default_order_by() | ||
for order_by_option in order_by_array: | ||
attribute = re.sub("^-", "", order_by_option) | ||
if attribute in cls.process_instance_stock_columns(): | ||
if order_by_option.startswith("-"): | ||
order_by_query_array.append(getattr(ProcessInstanceModel, attribute).desc()) | ||
else: | ||
order_by_query_array.append(getattr(ProcessInstanceModel, attribute).asc()) | ||
elif attribute in instance_metadata_aliases: | ||
if order_by_option.startswith("-"): | ||
order_by_query_array.append(func.max(instance_metadata_aliases[attribute].value).desc()) | ||
else: | ||
order_by_query_array.append(func.max(instance_metadata_aliases[attribute].value).asc()) | ||
return order_by_query_array |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method generate_order_by_query_array
lacks documentation. Adding docstrings explaining the method's purpose, parameters, and return value would improve maintainability and readability.
Would you like me to generate the docstring for this method?
process_instance_query = process_instance_query.join( # type: ignore | ||
HumanTaskModel, | ||
and_( | ||
HumanTaskModel.process_instance_id == ProcessInstanceModel.id, | ||
HumanTaskModel.completed_by_user_id == user.id, | ||
), | ||
) | ||
return process_instance_query | ||
|
||
@classmethod | ||
def add_where_clauses_for_process_instance_metadata_filters( | ||
cls, | ||
process_instance_query: Query, | ||
report_metadata: ReportMetadata, | ||
user: UserModel | None = None, | ||
page: int = 1, | ||
per_page: int = 100, | ||
) -> dict: | ||
process_instance_query = ProcessInstanceModel.query | ||
instance_metadata_aliases: dict[str, Any], | ||
) -> Query: | ||
for column in report_metadata["columns"]: | ||
if column["accessor"] in cls.non_metadata_columns(): | ||
continue | ||
instance_metadata_alias = aliased(ProcessInstanceMetadataModel) | ||
instance_metadata_aliases[column["accessor"]] = instance_metadata_alias | ||
|
||
filters_for_column = [] | ||
if "filter_by" in report_metadata: | ||
filters_for_column = [f for f in report_metadata["filter_by"] if f["field_name"] == column["accessor"]] | ||
isouter = True | ||
join_conditions = [ | ||
ProcessInstanceModel.id == instance_metadata_alias.process_instance_id, | ||
instance_metadata_alias.key == column["accessor"], | ||
] | ||
if len(filters_for_column) > 0: | ||
for filter_for_column in filters_for_column: | ||
isouter = False | ||
if "operator" not in filter_for_column or filter_for_column["operator"] == "equals": | ||
join_conditions.append(instance_metadata_alias.value == filter_for_column["field_value"]) | ||
elif filter_for_column["operator"] == "not_equals": | ||
join_conditions.append(instance_metadata_alias.value != filter_for_column["field_value"]) | ||
elif filter_for_column["operator"] == "greater_than_or_equal_to": | ||
join_conditions.append(instance_metadata_alias.value >= filter_for_column["field_value"]) | ||
elif filter_for_column["operator"] == "less_than": | ||
join_conditions.append(instance_metadata_alias.value < filter_for_column["field_value"]) | ||
elif filter_for_column["operator"] == "contains": | ||
join_conditions.append(instance_metadata_alias.value.like(f"%{filter_for_column['field_value']}%")) | ||
elif filter_for_column["operator"] == "is_empty": | ||
# we still need to return results if the metadata value is null so make sure it's outer join | ||
isouter = True | ||
process_instance_query = process_instance_query.filter( | ||
or_(instance_metadata_alias.value.is_(None), instance_metadata_alias.value == "") | ||
) | ||
elif filter_for_column["operator"] == "is_not_empty": | ||
join_conditions.append( | ||
or_(instance_metadata_alias.value.is_not(None), instance_metadata_alias.value != "") | ||
) | ||
process_instance_query = process_instance_query.join( # type: ignore | ||
instance_metadata_alias, and_(*join_conditions), isouter=isouter | ||
).add_columns(func.max(instance_metadata_alias.value).label(column["accessor"])) | ||
return process_instance_query | ||
|
||
@classmethod | ||
def generate_order_by_query_array( | ||
cls, | ||
report_metadata: ReportMetadata, | ||
instance_metadata_aliases: dict[str, Any], | ||
) -> list: | ||
order_by_query_array = [] | ||
order_by_array = report_metadata["order_by"] | ||
if len(order_by_array) < 1: | ||
order_by_array = ProcessInstanceReportModel.default_order_by() | ||
for order_by_option in order_by_array: | ||
attribute = re.sub("^-", "", order_by_option) | ||
if attribute in cls.process_instance_stock_columns(): | ||
if order_by_option.startswith("-"): | ||
order_by_query_array.append(getattr(ProcessInstanceModel, attribute).desc()) | ||
else: | ||
order_by_query_array.append(getattr(ProcessInstanceModel, attribute).asc()) | ||
elif attribute in instance_metadata_aliases: | ||
if order_by_option.startswith("-"): | ||
order_by_query_array.append(func.max(instance_metadata_aliases[attribute].value).desc()) | ||
else: | ||
order_by_query_array.append(func.max(instance_metadata_aliases[attribute].value).asc()) | ||
return order_by_query_array | ||
|
||
@classmethod | ||
def get_basic_query( | ||
cls, | ||
filters: list[FilterValue], | ||
) -> Query: | ||
process_instance_query: Query = ProcessInstanceModel.query | ||
# Always join that hot user table for good performance at serialization time. | ||
process_instance_query = process_instance_query.options(selectinload(ProcessInstanceModel.process_initiator)) | ||
filters = report_metadata["filter_by"] | ||
restrict_human_tasks_to_user = None | ||
|
||
for value in cls.check_filter_value(filters, "process_model_identifier"): | ||
process_model = ProcessModelService.get_process_model( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
📝 NOTE
This review was outside the diff hunks and was mapped to the diff hunk with the greatest overlap. Original lines [570-615]
The method get_basic_query
lacks documentation. Adding docstrings explaining the method's purpose, parameters, and return value would improve maintainability and readability.
Would you like me to generate the docstring for this method?
def run_process_instance_report( | ||
cls, | ||
report_metadata: ReportMetadata, | ||
user: UserModel | None = None, | ||
page: int = 1, | ||
per_page: int = 100, | ||
) -> dict: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method run_process_instance_report
lacks documentation. Adding docstrings explaining the method's purpose, parameters, and return value would improve maintainability and readability.
Would you like me to generate the docstring for this method?
Some refactoring to the process instance report service to move code to their own methods for better readability and maintainability.
Summary by CodeRabbit