RFD 0171: Database session playback #42082

gabrielcorado · 2024-05-28T14:58:34Z

Related to #5799 and #9019.

Rendered version

greedy52

First quick round. Thanks for drafting this!


greedy52 · 2024-05-28T20:21:07Z

rfd/0171-database-session-playback.md

+- `on`: Enables session recording, but only queries are recorded. This will
+  mimic the current behavior.
+- `full`: Enables session recording. Queries and responses are recorded.
+- `off`: Disables session recording. Audit events are kept unchanged (start,
+  end, and query events emitted).
+- `recording_only`: Enables recording, but events will not be emitted, they
+  will only be present on recordings. This mode will be the same format as SSH
+  sessions, where the commands and their results are only present on the
+  recording.


what mode is preferred in case of mode conflict?

For recording_only, I assume the "start" and "end" events will still emit? Should a toggle on audit-events event be part of record_session?

> what mode is preferred in case of mode conflict?

The most verbose one: `full` > `recording_only` > `on` > `off`.

> I assume the "start" and "end" events will still emit?

They will be included, as in SSH sessions (I'll update the text to clarify that).

> Should a toggle on audit-events event be part of record_session?

I'm still unsure about this too, because it doesn't look like a recording option. But moving it elsewhere (for example, to a cluster-level config) would reduce our ability to support more dynamic configurations (for example, per protocol).
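To illustrate the "most verbose wins" rule from the reply above, here is a minimal Go sketch. The `RecordingMode` type, the constant names, and `ResolveMode` are hypothetical and only cover the four modes as originally proposed; falling back to `off` when no role sets a mode is an assumption of this sketch, not something the RFD states.

```go
package main

import "fmt"

// RecordingMode is a hypothetical type for the modes discussed in this thread.
type RecordingMode string

const (
	ModeOff           RecordingMode = "off"
	ModeOn            RecordingMode = "on"
	ModeRecordingOnly RecordingMode = "recording_only"
	ModeFull          RecordingMode = "full"
)

// verbosity ranks the modes so that full > recording_only > on > off.
var verbosity = map[RecordingMode]int{
	ModeOff:           0,
	ModeOn:            1,
	ModeRecordingOnly: 2,
	ModeFull:          3,
}

// ResolveMode returns the most verbose mode among those set by a user's roles.
// Unknown values are ignored; with no usable value it returns ModeOff
// (an assumption made for this sketch).
func ResolveMode(modes []RecordingMode) RecordingMode {
	best := ModeOff
	for _, m := range modes {
		rank, ok := verbosity[m]
		if !ok {
			continue
		}
		if rank > verbosity[best] {
			best = m
		}
	}
	return best
}

func main() {
	fmt.Println(ResolveMode([]RecordingMode{ModeOn, ModeFull, ModeOff}))
}
```

A role set of `on`, `full`, and `off` resolves to `full` here, which is the kind of merged result the later comments worry users will find hard to predict.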

Configuring this at the role level leads to some awkwardness. Because a user can have multiple roles, you are forced to merge this setting across them, and few people can track the end result. You end up in a situation where the more finicky configurations are unstable (people will be surprised that a recording was or was not made), and operators ultimately enable or disable it for everyone.

Instead of roles, how about we scope it around individual databases? The default values may be put in a global settings object, but individual database objects can contain overrides. For auto-discovered databases we can use labels to guide these.

Ultimately, I think this setting is less about the people accessing the data and more about the nature of data itself.

Moving on, do we need this many levels for the configuration? I'd propose we keep the current capabilities without any changes and only treat the full session recording as an optional add-on. That collapses the full / recording / on / off enum down to session_recording_enable: true|false.

However, I do think we could use a more complex configuration, perhaps something like this:

    database_session_recording:
      enable: true | false
      response_max_rows: 100   # only meaningful for row-based protocols
      response_max_bytes: 2M   # meaningful for all protocols, including row-based
      anonymise: disabled | ellipsis | hash | stats_only (default)

With a strong default anonymisation we can make the feature enabled by default, which both prevents surprise capture of sensitive data as well as promotes the feature among the users.
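As a rough illustration of the database-scoped proposal above (global defaults, per-database overrides), the following Go sketch merges two configuration objects. `RecordingConfig`, `Merge`, and `ptr` are hypothetical names; only the field names are taken from the comment, and nothing here reflects an actual Teleport API.

```go
package main

import "fmt"

// RecordingConfig mirrors some of the fields proposed in the comment above.
// Pointer fields distinguish "not set" from a zero value, so that an
// override only replaces what it explicitly sets.
type RecordingConfig struct {
	Enable          *bool
	ResponseMaxRows *int
	Anonymise       *string
}

// Merge returns the defaults with every field set in override replacing
// the corresponding default.
func Merge(defaults, override RecordingConfig) RecordingConfig {
	out := defaults
	if override.Enable != nil {
		out.Enable = override.Enable
	}
	if override.ResponseMaxRows != nil {
		out.ResponseMaxRows = override.ResponseMaxRows
	}
	if override.Anonymise != nil {
		out.Anonymise = override.Anonymise
	}
	return out
}

// ptr is a small helper that returns a pointer to its argument.
func ptr[T any](v T) *T { return &v }

func main() {
	// Hypothetical global settings object.
	global := RecordingConfig{
		Enable:          ptr(true),
		ResponseMaxRows: ptr(100),
		Anonymise:       ptr("stats_only"),
	}
	// Hypothetical override on a single database object.
	db := RecordingConfig{Anonymise: ptr("hash")}

	eff := Merge(global, db)
	fmt.Println(*eff.Enable, *eff.ResponseMaxRows, *eff.Anonymise)
}
```

Because there is exactly one database object per connection, there is only one override to apply, which avoids the multi-role merging problem described earlier in this comment.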

I've removed the full (as we're no longer recording the data returned) and recording_only (considered out of this RFD scope as it is not directly related to session recording) options. We're left only with on|off.

greedy52 · 2024-05-28T20:21:55Z

rfd/0171-database-session-playback.md

+  the database name.
+- Data with fields metadata: Format the data into an ASCII table, where each
+  field is a column.
+- Data without field metadata: Print each row as a single line.


What does the player show when there is no data result but only queries?
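For reference, the two data-rendering rules in the hunk above (an ASCII table when field metadata is available, one line per row otherwise) could look roughly like this in Go. `RenderRows` is a hypothetical helper, not part of the RFD, and it measures width in bytes, so it assumes ASCII cell contents.

```go
package main

import (
	"fmt"
	"strings"
)

// RenderRows formats result rows for text playback. With column names it
// builds a simple ASCII table; without them it prints each row on one line.
func RenderRows(columns []string, rows [][]string) string {
	var sb strings.Builder
	if len(columns) == 0 {
		for _, row := range rows {
			sb.WriteString(strings.Join(row, " "))
			sb.WriteString("\n")
		}
		return sb.String()
	}

	// Each column is as wide as its longest header or cell.
	widths := make([]int, len(columns))
	for i, c := range columns {
		widths[i] = len(c)
	}
	for _, row := range rows {
		for i := 0; i < len(row) && i < len(widths); i++ {
			if len(row[i]) > widths[i] {
				widths[i] = len(row[i])
			}
		}
	}

	// writeRow pads every cell to its column width; missing cells are empty.
	writeRow := func(cells []string) {
		for i, w := range widths {
			cell := ""
			if i < len(cells) {
				cell = cells[i]
			}
			fmt.Fprintf(&sb, "| %-*s ", w, cell)
		}
		sb.WriteString("|\n")
	}

	writeRow(columns)
	for _, w := range widths {
		sb.WriteString("|" + strings.Repeat("-", w+2))
	}
	sb.WriteString("|\n")
	for _, row := range rows {
		writeRow(row)
	}
	return sb.String()
}

func main() {
	fmt.Print(RenderRows([]string{"id", "name"}, [][]string{{"1", "alice"}}))
}
```

With no rows at all, this sketch prints only the header (or nothing when there is no metadata), which is one possible answer to the question above, but the RFD would still need to state the intended behavior.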

greedy52 · 2024-05-28T20:23:42Z

rfd/0171-database-session-playback.md

+Web UI. We're going to convert database recording events into `SessionPrint`
+events, which can be rendered by the players.
+
+This conversion will be agnostic to the database protocol. This will make it


Will older db sessions before this feature become playable? What happens when the db sessions are not playable (e.g. protocol opensearch)?

> Will older db sessions before this feature become playable?

That's the initial idea.

> What happens when the db sessions are not playable (e.g. protocol opensearch)?

Protocol-specific events will require additional implementation (to convert them into `SessionPrint`). So the recording will not be available unless all of its events can be converted. We could always try to render sessions based on the generic events, but some protocols might never emit them (the OpenSearch protocol you mentioned is one of them).

I'll investigate further how we can improve this UX-wise so we don't show the play button for sessions that we cannot convert properly.

(I'll add all this info to the RFD)
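A minimal Go sketch of the "playable only if every event can be converted" rule described in the reply above. `Event`, `PrintEvent`, `translators`, and `Convert` are simplified stand-ins made up for this example, and the event type strings are illustrative rather than a statement of what each protocol emits.

```go
package main

import "fmt"

// Event is a simplified stand-in for a recorded database event.
type Event struct {
	Type  string
	Query string
}

// PrintEvent is a simplified stand-in for a SessionPrint event.
type PrintEvent struct {
	Data string
}

// translators maps event types to converters. Only generic events have a
// converter here; protocol-specific ones would need their own entries.
var translators = map[string]func(Event) PrintEvent{
	"db.session.start": func(Event) PrintEvent { return PrintEvent{Data: "Session started\r\n"} },
	"db.session.query": func(e Event) PrintEvent { return PrintEvent{Data: "> " + e.Query + "\r\n"} },
	"db.session.end":   func(Event) PrintEvent { return PrintEvent{Data: "Session ended\r\n"} },
}

// Convert translates every event into a print event. It reports false as
// soon as one event has no translator, meaning the session is not playable.
func Convert(events []Event) ([]PrintEvent, bool) {
	out := make([]PrintEvent, 0, len(events))
	for _, e := range events {
		translate, ok := translators[e.Type]
		if !ok {
			return nil, false
		}
		out = append(out, translate(e))
	}
	return out, true
}

func main() {
	_, ok := Convert([]Event{
		{Type: "db.session.start"},
		{Type: "db.session.opensearch.request"},
	})
	fmt.Println("playable:", ok)
}
```

The boolean result is the kind of signal the Web UI could use to decide whether to show the play button for a given session.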

I think it would be nice to add a non-textual playback option too. The database responses are inherently more structured than SSH terminal events. We can exploit that, for example to show the query response as an actual HTML table (sortable and with client-side filters!) or properly convey the request-response nature of some protocols, like opensearch.

Obviously this would require some UI work, but in the end I think it would be a great alternative to a purely text-based replay, which has numerous inherent limitations.

> I think it would be nice to add non-textual playback option too.

This can be implemented outside the session player (currently focused on text-based sessions). All the information will still be present in the recordings.

> I'll investigate better how we can improve this UX-wise so we don't show the play button for sessions that we cannot convert properly.
>
> (I'll add all this info to the RFD)

Did I miss this? I can't find a section on this.

I've added a section describing the general changes needed in the Web UI. I haven't included the full details, as we might discuss them on the implementation PR.

> I've added a section describing the general necessary changes on the WebUI. I haven't included the full details, as we might discuss them on the implementation PR.

I am looking more for where users can find the recordings. Before this feature, users had to grab an ID from an audit event. So it's safe to assume that we can now see the database sessions in the "Session Recordings" tab and in `tsh recordings ls`, right?

Right, we'll bring the database recordings to those listings.


Tener · 2024-06-04T06:26:58Z

rfd/0171-database-session-playback.md

+  oneof Result {
+    // Status of the query execution. It is used when the command doesn't return
+    // any data.
+    Status status = 5;


We may want to rename it to DatabaseResultStatus or similar. What will be the contents of the status?

It is a message already present in our events package. I've copied it to the RFD to make it easier to understand.

Tener · 2024-06-04T06:28:40Z

rfd/0171-database-session-playback.md

+{
+  ...
+  "status": {
+    "success": false,


An optional error_code might be nice: these are often present for both relational databases as well as for those based on HTTP requests.
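To make the suggestion concrete, here is a hedged Go sketch of a status payload with an optional error code. Only the `success` field appears in the hunk above; the `Status` struct shown here, its `error` field, the `error_code` field, and the `Encode` helper are assumptions for illustration and not the actual message from the events package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Status is a hypothetical shape for the result status. The omitempty tags
// keep the optional fields out of the JSON when they are not set.
type Status struct {
	Success   bool   `json:"success"`
	Error     string `json:"error,omitempty"`
	ErrorCode string `json:"error_code,omitempty"`
}

// Encode returns the JSON form of a status, or an empty string on failure.
func Encode(s Status) string {
	b, err := json.Marshal(s)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	// "42P01" is the PostgreSQL SQLSTATE for an undefined table; an
	// HTTP-based protocol could carry a status code such as "404" instead.
	fmt.Println(Encode(Status{Success: false, Error: "undefined table", ErrorCode: "42P01"}))
	fmt.Println(Encode(Status{Success: true}))
}
```

Keeping the code as a string lets one field hold both SQLSTATE-style codes and HTTP status codes, at the cost of leaving interpretation to the player.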

Tener · 2024-06-04T06:41:05Z

rfd/0171-database-session-playback.md

+- `on`: Enables session recording, but only queries are recorded. This will
+  mimic the current behavior.
+- `full`: Enables session recording. Queries and responses are recorded.
+- `off`: Disables session recording. Audit events are kept unchanged (start,
+  end, and query events emitted).
+- `recording_only`: Enables recording, but events will not be emitted, they
+  will only be present on recordings. This mode will be the same format as SSH
+  sessions, where the commands and their results are only present on the
+  recording.


Configuring this at the role level leads to some awkwardness. Because a user can have multiple roles, you are forced to merge this setting across them, and few people can track the end result. You end up in a situation where the more finicky configurations are unstable (people will be surprised that a recording was or was not made), and operators ultimately enable or disable it for everyone.

Instead of roles, how about we scope it around individual databases? The default values may be put in a global settings object, but individual database objects can contain overrides. For auto-discovered databases we can use labels to guide these.

Ultimately, I think this setting is less about the people accessing the data and more about the nature of data itself.

Moving on, do we need this many levels for the configuration? I'd propose we keep the current capabilities without any changes and only treat the full session recording as an optional add-on. That collapses the full / recording / on / off enum down to session_recording_enable: true|false.

However, I do think we could use a more complex configuration, perhaps something like this:

    database_session_recording:
      enable: true | false
      response_max_rows: 100   # only meaningful for row-based protocols
      response_max_bytes: 2M   # meaningful for all protocols, including row-based
      anonymise: disabled | ellipsis | hash | stats_only (default)

With a strong default anonymisation we can make the feature enabled by default, which both prevents surprise capture of sensitive data as well as promotes the feature among the users.

Tener · 2024-06-04T07:09:24Z

rfd/0171-database-session-playback.md

+Web UI. We're going to convert database recording events into `SessionPrint`
+events, which can be rendered by the players.
+
+This conversion will be agnostic to the database protocol. This will make it


I think it would be nice to add a non-textual playback option too. The database responses are inherently more structured than SSH terminal events. We can exploit that, for example to show the query response as an actual HTML table (sortable and with client-side filters!) or properly convey the request-response nature of some protocols, like opensearch.

Obviously this would require some UI work, but in the end I think it would be a great alternative to a purely text-based replay, which has numerous inherent limitations.


gabrielcorado · 2024-06-11T14:25:58Z

For now, we're removing the data recording from this RFD scope. I've updated all the sections and will resolve some related comments.

greedy52

Could you also add some info on how the user can find these recordings? like how recordings are listed, in both Web UI and tsh.


greedy52 · 2024-06-12T13:09:14Z

rfd/0171-database-session-playback.md

+}
+```
+
+### Recording options


The recording mode is optional for the initial MVP. And it might depend on how we proceed with #40170

greedy52 · 2024-06-12T13:13:14Z

rfd/0171-database-session-playback.md

+- The player will define if there is need to display the number of affected
+  records. Protocol-specific variations can be added for this. For example, if
+  the query was a `SELECT` the player might display it as "Returned X rows"
+  after the status.


The DatabaseSessionCommandResult only captures a number, affected_records. How does the player know whether to render "Returned xxx" or "xxx inserted"? If the logic is based on the query, how reliable is our logic?

It is the player translator's job to generate those descriptions. For example, it can be implemented as a state machine, similar to our PoC, so it knows the type of the command that generated the result. Databases that allow concurrent executions could also be supported; in that case, we would need a field that relates each result to its command (command -> result).

greedy52 · 2024-06-12T13:15:21Z

rfd/0171-database-session-playback.md

+Web UI. We're going to convert database recording events into `SessionPrint`
+events, which can be rendered by the players.
+
+This conversion will be agnostic to the database protocol. This will make it


> I'll investigate better how we can improve this UX-wise so we don't show the play button for sessions that we cannot convert properly.
>
> (I'll add all this info to the RFD)

Did I miss this? I don't find a section on this.


r0mant · 2024-06-12T21:52:17Z

rfd/0171-database-session-playback.md

+- `on`: Enables session recording.
+- `off`: Disables session recording. Audit events are kept unchanged (start,
+  end, and query events emitted).


As far as I recall, all database access events currently go to both the audit log and the session recording. So we're just adding the ability to replay them (and an extra event for the query result). Is that correct? Will the `off` option basically mean that we stop emitting events to session recordings?

> As far as I recall, all database access events currently go to both audit log and session recording. So we're just adding an ability to replay them (and an extra event for query result). Is that correct?

Yes.

> Will off option basically mean that we stop emitting events to session recordings?

Yes. This change will not affect audit events (similar to SSH enhanced session recordings). Currently, there is no way to do this other than completely disabling session recordings at the cluster level (`auth_service.session_recording: "off"`).
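The `on`/`off` semantics confirmed above (audit events always emitted, recording optional) can be sketched as follows. `Sink`, `MemSink`, and `Emitter` are hypothetical types written for this example and do not correspond to Teleport's actual emitter interfaces.

```go
package main

import "fmt"

// Sink is a hypothetical destination for events.
type Sink interface {
	Emit(event string)
}

// MemSink collects events in memory, for illustration only.
type MemSink struct {
	Events []string
}

// Emit appends the event to the in-memory list.
func (m *MemSink) Emit(event string) {
	m.Events = append(m.Events, event)
}

// Emitter always writes to the audit log and writes to the session
// recording only when recording is enabled.
type Emitter struct {
	Audit            Sink
	Recording        Sink
	RecordingEnabled bool
}

// Emit forwards the event according to the recording option.
func (e Emitter) Emit(event string) {
	e.Audit.Emit(event)
	if e.RecordingEnabled {
		e.Recording.Emit(event)
	}
}

func main() {
	audit, rec := &MemSink{}, &MemSink{}
	em := Emitter{Audit: audit, Recording: rec, RecordingEnabled: false}
	em.Emit("db.session.query")
	fmt.Println(len(audit.Events), len(rec.Events))
}
```

With recording disabled, the audit sink still receives the query event while the recording sink stays empty, which matches the "audit events are kept unchanged" wording in the hunk.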

rfd: database session playback

7829ab6

gabrielcorado added the `rfd` and `no-changelog` labels May 28, 2024

gabrielcorado requested review from Tener, greedy52 and GavinFrazar May 28, 2024 14:58

gabrielcorado self-assigned this May 28, 2024

github-actions bot requested review from justinas and ryanclark May 28, 2024 14:59

github-actions bot added the size/md label May 28, 2024

greedy52 requested review from r0mant and smallinsky May 28, 2024 19:55

greedy52 reviewed May 28, 2024


greedy52 reviewed May 29, 2024


r0mant reviewed Jun 3, 2024


Tener reviewed Jun 4, 2024


gabrielcorado added 3 commits June 11, 2024 10:46

rfd: update required approvers

9ae4f7a

rfd: remove storing query results

d6aa6da

rfd: review suggestions

6a61b6d

gabrielcorado requested review from Tener, r0mant and greedy52 June 11, 2024 14:31

greedy52 reviewed Jun 12, 2024


r0mant approved these changes Jun 12, 2024


gabrielcorado mentioned this pull request Jun 26, 2024

Record PostgreSQL queries/command results #43546

Merged

rfd: web ui information

1530e21

gabrielcorado requested a review from greedy52 June 26, 2024 15:10

greedy52 approved these changes Jun 26, 2024


public-teleport-github-review-bot bot removed request for justinas, ryanclark, GavinFrazar and smallinsky June 26, 2024 15:38

gabrielcorado added this pull request to the merge queue Jun 26, 2024

Merged via the queue into master with commit 8d102a3 Jun 26, 2024
37 checks passed

gabrielcorado deleted the rfd/0171-database-session-playback branch June 26, 2024 23:36

gabrielcorado restored the rfd/0171-database-session-playback branch June 27, 2024 15:15

gabrielcorado deleted the rfd/0171-database-session-playback branch June 27, 2024 17:17
