Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

process instance report refactor #977

Merged
merged 5 commits into from
Feb 6, 2024
Merged

Conversation

jasquat
Copy link
Contributor

@jasquat jasquat commented Feb 6, 2024

Some refactoring to the process instance report service to move code to their own methods for better readability and maintainability.

Summary by CodeRabbit

  • Enhancements
    • Improved filtering logic for tasks waiting for or completed by a user.
    • Enhanced metadata filtering.
    • Optimized query generation for process instance reports.
    • Updated existing summary for better clarity.

Copy link
Contributor

coderabbitai bot commented Feb 6, 2024

Warning

Rate Limit Exceeded

@jasquat has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 20 minutes and 36 seconds before requesting another review.

How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.
Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.
Please see our FAQ for further information.

Commits Files that changed from the base of the PR and between 0dd8fa1 and 6163bc6.

Walkthrough

The recent updates aim to refine process instance reporting functionality by enhancing task filtering based on user interaction status, improving metadata filtering, and upgrading query generation for reports. These changes collectively aim to boost the efficiency and accuracy of process instance reporting.

Changes

File Path Change Summary
.../services/process_instance_report_service.py Adjusted task filtering logic, modified metadata filtering, and enhanced process instance report query generation.

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share

Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit-tests for this file.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit tests for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository from git and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit tests.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • The JSON schema for the configuration file is available here.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/coderabbit-overrides.v2.json

CodeRabbit Discord Community

Join our Discord Community to get help, request features, and share feedback.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 11

Configuration used: .coderabbit.yaml

Commits Files that changed from the base of the PR and between 4d976bf and 0e3dbeb.
Files selected for processing (1)
  • spiffworkflow-backend/src/spiffworkflow_backend/services/process_instance_report_service.py (9 hunks)

Comment on lines +438 to +441
# FIXME: this may be problematic
# if user_group_identifier filter is set to something you are not in
process_instance_query = process_instance_query.filter(UserGroupAssignmentModel.user_id == user.id)

Copy link
Contributor

@coderabbitai coderabbitai bot Feb 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment on lines 438-441 indicates a potential issue with filtering by user_group_identifier. This should be addressed or clarified to ensure the logic is correct and does not inadvertently exclude valid data.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix in a future refactor.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jasquat: Understood. If you need assistance with the refactor or have any questions in the future, feel free to reach out.


process_instances = (
process_instance_query.group_by(ProcessInstanceModel.id)
process_instance_query.group_by(ProcessInstanceModel.id) # type: ignore
Copy link
Contributor

@coderabbitai coderabbitai bot Feb 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use of group_by(ProcessInstanceModel.id) on line 718 without including all selected columns in the group_by clause or using them in aggregate functions may lead to undefined behavior in some SQL databases. Ensure this is the intended usage and consider potential database compatibility issues.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We add the column to the select in the next line.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jasquat: I understand your point about adding the column to the select statement. However, the concern raised was specifically about the SQL group_by behavior, where all selected columns that are not part of an aggregate function should also be included in the group_by clause to ensure defined behavior across different SQL databases. This is to prevent potential compatibility issues or unexpected results, as SQL standards require. If your query's logic is indeed correct and tested across the databases you intend to support, this might not be an issue. However, it's generally good practice to ensure that all non-aggregated selected columns are either included in the group_by clause or used within aggregate functions to guarantee consistent behavior.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is working as intended.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jasquat: Understood. If the implementation is working as intended and has been tested for your specific use case and database compatibility, then it should be fine. Always good to double-check these aspects for cross-database compatibility and to ensure the behavior remains consistent. If you have any more questions or need further assistance, feel free to ask.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Status

Actionable comments generated: 8

Configuration used: .coderabbit.yaml

Commits Files that changed from the base of the PR and between 0e3dbeb and 0dd8fa1.
Files selected for processing (1)
  • spiffworkflow-backend/src/spiffworkflow_backend/services/process_instance_report_service.py (9 hunks)
Additional comments: 3
spiffworkflow-backend/src/spiffworkflow_backend/services/process_instance_report_service.py (3)
  • 436-439: The comment on lines 436-437 indicates a potential issue with filtering by user_group_identifier. This should be addressed or clarified to ensure the logic is correct and does not inadvertently exclude valid data.
  • 650-650: The use of outerjoin on line 650 may not be optimal if the intention is to filter process instances based on tasks related to the user. Consider if an inner join would be more appropriate for the intended filter logic.
  • 716-716: The use of group_by(ProcessInstanceModel.id) on line 716 without including all selected columns in the group_by clause or using them in aggregate functions may lead to undefined behavior in some SQL databases. Ensure this is the intended usage and consider potential database compatibility issues.

Comment on lines 186 to 190
columns_for_metadata_query = db.session.query(
ProcessInstanceMetadataModel.key.distinct() # type: ignore
).order_by(ProcessInstanceMetadataModel.key)
if process_model_identifier:
columns_for_metadata_query = columns_for_metadata_query.join(ProcessInstanceModel)
columns_for_metadata_query = columns_for_metadata_query.join(ProcessInstanceModel) # type: ignore
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding a comment explaining the purpose of joining ProcessInstanceModel only when process_model_identifier is provided. This helps maintain clarity and understand the conditional logic.

Comment on lines 390 to 401
if filter_found is False:
filters.append(new_filter)

# When we say we want to filter by "waiting for group" or "waiting for specific user," what we probably assume is that
# there are human tasks that those people can actually complete right now.
# We don't exactly have that in the query directly, but if you pass a filter for user_group_identifier, it will get into
# this function, and if you pass any statuses, and if they are all "active" then it will do what you want, which is
# to look for only HumanTaskModel.completed.is_(False). So...we should probably make the widget add filters for
# both user_group_identifier and status. And we should make a method that does a similar thing for waiting for users.
@classmethod
def filter_by_user_group_identifier(
cls,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 NOTE
This review was outside the diff hunks and was mapped to the diff hunk with the greatest overlap. Original lines [393-439]

The method filter_by_user_group_identifier lacks documentation. Adding docstrings explaining the method's purpose, parameters, and return value would improve maintainability and readability.

Would you like me to generate the docstring for this method?

Comment on lines +443 to +478
def filter_by_instances_with_tasks_waiting_for_me(
cls,
process_instance_query: Query,
user: UserModel,
) -> Query:
process_instance_query = process_instance_query.filter(ProcessInstanceModel.process_initiator_id != user.id)
process_instance_query = process_instance_query.join(
HumanTaskModel,
and_(
HumanTaskModel.process_instance_id == ProcessInstanceModel.id,
HumanTaskModel.completed.is_(False), # type: ignore
),
).join(
HumanTaskUserModel,
and_(HumanTaskUserModel.human_task_id == HumanTaskModel.id, HumanTaskUserModel.user_id == user.id),
)

user_group_assignment_for_lane_assignment = aliased(UserGroupAssignmentModel)
process_instance_query = process_instance_query.outerjoin( # type: ignore
user_group_assignment_for_lane_assignment,
and_(
user_group_assignment_for_lane_assignment.group_id == HumanTaskModel.lane_assignment_id,
user_group_assignment_for_lane_assignment.user_id == user.id,
),
).filter(
# it should show up in your "Waiting for me" list IF:
# 1) task is not assigned to a group OR
# 2) you are not in the group
# In the case of number 2, it probably means you were added to the task individually by an admin
or_(
HumanTaskModel.lane_assignment_id.is_(None), # type: ignore
user_group_assignment_for_lane_assignment.group_id.is_(None),
)
)

return process_instance_query
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method filter_by_instances_with_tasks_waiting_for_me lacks documentation. Adding docstrings explaining the method's purpose, parameters, and return value would improve maintainability and readability.

Would you like me to generate the docstring for this method?

Comment on lines +481 to +494
def filter_by_instances_with_tasks_completed_by_me(
cls,
process_instance_query: Query,
user: UserModel,
) -> Query:
process_instance_query = process_instance_query.filter(ProcessInstanceModel.process_initiator_id != user.id)
process_instance_query = process_instance_query.join( # type: ignore
HumanTaskModel,
and_(
HumanTaskModel.process_instance_id == ProcessInstanceModel.id,
HumanTaskModel.completed_by_user_id == user.id,
),
)
return process_instance_query
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method filter_by_instances_with_tasks_completed_by_me lacks documentation. Adding docstrings explaining the method's purpose, parameters, and return value would improve maintainability and readability.

Would you like me to generate the docstring for this method?

Comment on lines +497 to +543
def add_where_clauses_for_process_instance_metadata_filters(
cls,
process_instance_query: Query,
report_metadata: ReportMetadata,
user: UserModel | None = None,
page: int = 1,
per_page: int = 100,
) -> dict:
process_instance_query = ProcessInstanceModel.query
instance_metadata_aliases: dict[str, Any],
) -> Query:
for column in report_metadata["columns"]:
if column["accessor"] in cls.non_metadata_columns():
continue
instance_metadata_alias = aliased(ProcessInstanceMetadataModel)
instance_metadata_aliases[column["accessor"]] = instance_metadata_alias

filters_for_column = []
if "filter_by" in report_metadata:
filters_for_column = [f for f in report_metadata["filter_by"] if f["field_name"] == column["accessor"]]
isouter = True
join_conditions = [
ProcessInstanceModel.id == instance_metadata_alias.process_instance_id,
instance_metadata_alias.key == column["accessor"],
]
if len(filters_for_column) > 0:
for filter_for_column in filters_for_column:
isouter = False
if "operator" not in filter_for_column or filter_for_column["operator"] == "equals":
join_conditions.append(instance_metadata_alias.value == filter_for_column["field_value"])
elif filter_for_column["operator"] == "not_equals":
join_conditions.append(instance_metadata_alias.value != filter_for_column["field_value"])
elif filter_for_column["operator"] == "greater_than_or_equal_to":
join_conditions.append(instance_metadata_alias.value >= filter_for_column["field_value"])
elif filter_for_column["operator"] == "less_than":
join_conditions.append(instance_metadata_alias.value < filter_for_column["field_value"])
elif filter_for_column["operator"] == "contains":
join_conditions.append(instance_metadata_alias.value.like(f"%{filter_for_column['field_value']}%"))
elif filter_for_column["operator"] == "is_empty":
# we still need to return results if the metadata value is null so make sure it's outer join
isouter = True
process_instance_query = process_instance_query.filter(
or_(instance_metadata_alias.value.is_(None), instance_metadata_alias.value == "")
)
elif filter_for_column["operator"] == "is_not_empty":
join_conditions.append(
or_(instance_metadata_alias.value.is_not(None), instance_metadata_alias.value != "")
)
process_instance_query = process_instance_query.join( # type: ignore
instance_metadata_alias, and_(*join_conditions), isouter=isouter
).add_columns(func.max(instance_metadata_alias.value).label(column["accessor"]))
return process_instance_query
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method add_where_clauses_for_process_instance_metadata_filters lacks documentation. Adding docstrings explaining the method's purpose, parameters, and return value would improve maintainability and readability.

Would you like me to generate the docstring for this method?

Comment on lines +546 to +567
def generate_order_by_query_array(
cls,
report_metadata: ReportMetadata,
instance_metadata_aliases: dict[str, Any],
) -> list:
order_by_query_array = []
order_by_array = report_metadata["order_by"]
if len(order_by_array) < 1:
order_by_array = ProcessInstanceReportModel.default_order_by()
for order_by_option in order_by_array:
attribute = re.sub("^-", "", order_by_option)
if attribute in cls.process_instance_stock_columns():
if order_by_option.startswith("-"):
order_by_query_array.append(getattr(ProcessInstanceModel, attribute).desc())
else:
order_by_query_array.append(getattr(ProcessInstanceModel, attribute).asc())
elif attribute in instance_metadata_aliases:
if order_by_option.startswith("-"):
order_by_query_array.append(func.max(instance_metadata_aliases[attribute].value).desc())
else:
order_by_query_array.append(func.max(instance_metadata_aliases[attribute].value).asc())
return order_by_query_array
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method generate_order_by_query_array lacks documentation. Adding docstrings explaining the method's purpose, parameters, and return value would improve maintainability and readability.

Would you like me to generate the docstring for this method?

Comment on lines 432 to 579
process_instance_query = process_instance_query.join( # type: ignore
HumanTaskModel,
and_(
HumanTaskModel.process_instance_id == ProcessInstanceModel.id,
HumanTaskModel.completed_by_user_id == user.id,
),
)
return process_instance_query

@classmethod
def add_where_clauses_for_process_instance_metadata_filters(
cls,
process_instance_query: Query,
report_metadata: ReportMetadata,
user: UserModel | None = None,
page: int = 1,
per_page: int = 100,
) -> dict:
process_instance_query = ProcessInstanceModel.query
instance_metadata_aliases: dict[str, Any],
) -> Query:
for column in report_metadata["columns"]:
if column["accessor"] in cls.non_metadata_columns():
continue
instance_metadata_alias = aliased(ProcessInstanceMetadataModel)
instance_metadata_aliases[column["accessor"]] = instance_metadata_alias

filters_for_column = []
if "filter_by" in report_metadata:
filters_for_column = [f for f in report_metadata["filter_by"] if f["field_name"] == column["accessor"]]
isouter = True
join_conditions = [
ProcessInstanceModel.id == instance_metadata_alias.process_instance_id,
instance_metadata_alias.key == column["accessor"],
]
if len(filters_for_column) > 0:
for filter_for_column in filters_for_column:
isouter = False
if "operator" not in filter_for_column or filter_for_column["operator"] == "equals":
join_conditions.append(instance_metadata_alias.value == filter_for_column["field_value"])
elif filter_for_column["operator"] == "not_equals":
join_conditions.append(instance_metadata_alias.value != filter_for_column["field_value"])
elif filter_for_column["operator"] == "greater_than_or_equal_to":
join_conditions.append(instance_metadata_alias.value >= filter_for_column["field_value"])
elif filter_for_column["operator"] == "less_than":
join_conditions.append(instance_metadata_alias.value < filter_for_column["field_value"])
elif filter_for_column["operator"] == "contains":
join_conditions.append(instance_metadata_alias.value.like(f"%{filter_for_column['field_value']}%"))
elif filter_for_column["operator"] == "is_empty":
# we still need to return results if the metadata value is null so make sure it's outer join
isouter = True
process_instance_query = process_instance_query.filter(
or_(instance_metadata_alias.value.is_(None), instance_metadata_alias.value == "")
)
elif filter_for_column["operator"] == "is_not_empty":
join_conditions.append(
or_(instance_metadata_alias.value.is_not(None), instance_metadata_alias.value != "")
)
process_instance_query = process_instance_query.join( # type: ignore
instance_metadata_alias, and_(*join_conditions), isouter=isouter
).add_columns(func.max(instance_metadata_alias.value).label(column["accessor"]))
return process_instance_query

@classmethod
def generate_order_by_query_array(
cls,
report_metadata: ReportMetadata,
instance_metadata_aliases: dict[str, Any],
) -> list:
order_by_query_array = []
order_by_array = report_metadata["order_by"]
if len(order_by_array) < 1:
order_by_array = ProcessInstanceReportModel.default_order_by()
for order_by_option in order_by_array:
attribute = re.sub("^-", "", order_by_option)
if attribute in cls.process_instance_stock_columns():
if order_by_option.startswith("-"):
order_by_query_array.append(getattr(ProcessInstanceModel, attribute).desc())
else:
order_by_query_array.append(getattr(ProcessInstanceModel, attribute).asc())
elif attribute in instance_metadata_aliases:
if order_by_option.startswith("-"):
order_by_query_array.append(func.max(instance_metadata_aliases[attribute].value).desc())
else:
order_by_query_array.append(func.max(instance_metadata_aliases[attribute].value).asc())
return order_by_query_array

@classmethod
def get_basic_query(
cls,
filters: list[FilterValue],
) -> Query:
process_instance_query: Query = ProcessInstanceModel.query
# Always join that hot user table for good performance at serialization time.
process_instance_query = process_instance_query.options(selectinload(ProcessInstanceModel.process_initiator))
filters = report_metadata["filter_by"]
restrict_human_tasks_to_user = None

for value in cls.check_filter_value(filters, "process_model_identifier"):
process_model = ProcessModelService.get_process_model(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 NOTE
This review was outside the diff hunks and was mapped to the diff hunk with the greatest overlap. Original lines [570-615]

The method get_basic_query lacks documentation. Adding docstrings explaining the method's purpose, parameters, and return value would improve maintainability and readability.

Would you like me to generate the docstring for this method?

Comment on lines +618 to +624
def run_process_instance_report(
cls,
report_metadata: ReportMetadata,
user: UserModel | None = None,
page: int = 1,
per_page: int = 100,
) -> dict:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method run_process_instance_report lacks documentation. Adding docstrings explaining the method's purpose, parameters, and return value would improve maintainability and readability.

Would you like me to generate the docstring for this method?

@jasquat jasquat merged commit 64d8805 into main Feb 6, 2024
21 of 22 checks passed
@jasquat jasquat deleted the feature/waiting-for-filter branch February 6, 2024 19:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant