-
Notifications
You must be signed in to change notification settings - Fork 16
Combine Execution and Audit Logs in Request Status Endpoint [#1024] #1068
Conversation
…s used and execution logs are embedded, also return audit logs. Execution Logs are created at the collection level while audit logs are for the overall privacy request level, so most fields returned for audit logs are None. Logs are also grouped at the dataset level here, so give the audit logs a fake dataset name for display purposes, for example, "Request approved".
…ded in a verbose request status response.
AuditLog.message, | ||
cast(AuditLog.action.label("status"), sqlalchemy.String).label("status"), | ||
AuditLog.privacy_request_id, | ||
null().label("dataset_name"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really cool, I didn't know you could do this!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Me either. Very cool. It's great that they handle this scenario so nicely
AuditLog.user_id, | ||
).filter(AuditLog.privacy_request_id == self.id) | ||
|
||
combined: Query = execution_log_query.union_all(audit_log_query) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like union all
keeps all of the records from both original tables, but union removes duplicates. union sorts and eliminates duplicates first. In early testing, union
was returning too many records, but in hindsight this may have had nothing to do with the union
and may have been related to the fact that I didn't have other parts of my query correct yet.
Regardless, I think union all makes sense here - we don't need to do the extra step of sorting and dupe-removal.
class ExecutionAndAuditLogResponse(BaseSchema): | ||
"""Schema for the combined ExecutionLogs and Audit Logs | ||
associated with a PrivacyRequest""" | ||
|
||
collection_name: Optional[str] | ||
fields_affected: Optional[List[FieldsAffectedResponse]] | ||
message: Optional[str] | ||
action_type: Optional[ActionType] | ||
status: Optional[Union[ExecutionLogStatus, AuditLogAction]] | ||
updated_at: Optional[datetime] | ||
user_id: Optional[str] | ||
|
||
class Config: | ||
"""Set orm_mode and allow population by field name""" | ||
|
||
use_enum_values = True | ||
allow_population_by_field_name = True | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I originally extended the ExecutionLogResponse schema above and just overrode fields to make them optional or change their type but mypy didn't like this, so I made a new one. I also allow_population_by_field_name
here because it is no longer ExecutionLogs that are getting serialized with this, it's a union query of audit logs and execution logs.
combined: Query = execution_log_query.union_all(audit_log_query) | ||
|
||
for log in combined.order_by(ExecutionLog.updated_at.asc()): | ||
dataset_name: str = log.dataset_name or f"Request {log.status}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This endpoint groups execution logs by dataset name, but audit logs don't have a dataset. I just create a name for the audit log, and then the single audit log will be associated with that name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! 🚀 union_all
is interesting.
AuditLog.message, | ||
cast(AuditLog.action.label("status"), sqlalchemy.String).label("status"), | ||
AuditLog.privacy_request_id, | ||
null().label("dataset_name"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Me either. Very cool. It's great that they handle this scenario so nicely
It's ready to merge once the changelog conflicts are fixed. |
…d_logs # Conflicts: # CHANGELOG.md
thanks for the review @TheAndrewJackson ! |
…1068) * Update the request status endpoint, so when the verbose query param is used and execution logs are embedded, also return audit logs. Execution Logs are created at the collection level while audit logs are for the overall privacy request level, so most fields returned for audit logs are None. Logs are also grouped at the dataset level here, so give the audit logs a fake dataset name for display purposes, for example, "Request approved". * Update CHANGELOG and update docs to reflect that audit logs are included in a verbose request status response.
Purpose
We have AuditLogs that store user actions at the PrivacyRequest level, and ExecutionLogs that describe system actions at the individual Collection level. There is information from both of these tables that is useful in establishing the timeline of the request.
Unify/combine this information into a single endpoint.
Changes
http://0.0.0.0:8080/api/v1/privacy-request?include_identities=true&request_id=<privacy_request_id>&verbose=true
, both execution logs and audit logs are included in the responseunion all
query. Both tables share some columns and have unique columns of their own, so if a column is missing from one table, we return null.Checklist
CHANGELOG.md
fileCHANGELOG.md
file is being appended toUnreleased
section in an appropriate category. Add a new category from the list at the top of the file if the needed one isn't already there.Run Unsafe PR Checks
label has been applied, and checks have passed, if this PR touches any external servicesTicket
Fixes #1024