dolt diff does not display added and corresponding deleted rows next to each other #7808
Comments
I'm not sure this is even possible to fix without breaking history-independence. Commits store the current state of the table, but they don't track the commands that led to that state. This is on purpose: two tables with the same state should have the same hash regardless of the sequence of commands that were run. This means there's no way for the commit to know that the added row and the deleted row came from the same update operation.
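To make the history-independence point concrete, here is a minimal sketch. It is not Dolt's actual storage format or hashing scheme; the `table_hash` helper and the row layout are assumptions made up for illustration. It only shows that a hash computed from table state alone cannot distinguish an update from a delete followed by an insert.

```python
import hashlib


def table_hash(table):
    """Hash a table's current state only (hypothetical helper, not Dolt's scheme).

    `table` maps a primary key to a tuple of the remaining column values.
    Rows are hashed in primary-key order, so the result depends only on
    the final contents, never on how the table got there.
    """
    h = hashlib.sha256()
    for pk in sorted(table):
        h.update(repr((pk, table[pk])).encode("utf-8"))
    return h.hexdigest()


start = {1: ("alice",), 2: ("bob",)}

# History A: change the primary key of one row from 2 to 3 in a single "update".
table_a = dict(start)
table_a[3] = table_a.pop(2)

# History B: delete row 2, then insert an unrelated row that happens to look the same.
table_b = dict(start)
del table_b[2]
table_b[3] = ("bob",)

same_state = table_hash(table_a) == table_hash(table_b)
changed = table_hash(start) != table_hash(table_a)
```

Both histories end in the same state, so `same_state` is true, and anything computed from the two commits (including a diff) has to be identical for both.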
@nicktobey It seems this is not about the sequence of commands? I might be wrong, since I'm not familiar with the Dolt implementation... A modified row is displayed as a deleted row and an added row, and we hope to display them next to each other. But when a commit changes primary keys, it might mess up this order, since the diff rows are sorted by primary key. Here is another example that Tim found, if that's clearer: https://www.dolthub.com/repositories/dolthub/hospital-price-transparency-v3/pulls/169/compare, where the values of column
The reason I mentioned the sequence of commands is because, for instance, the diff you linked to could have happened in multiple ways:
The diff operation only sees the start and end state of the table, so it can't tell the difference between these two possibilities. It has to display the same diff for both. One thing we could do is attempt to detect when a change was likely an update to a primary key, match each row-add to its corresponding row-remove, and display them next to each other. That means we would do this even if the change wasn't the result of changing a primary key, as long as it could have been... but that's probably rare enough that we don't mind. The question is how we would detect these pairs of rows and whether we can do it efficiently.
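One possible shape for that pairing step, as a rough sketch rather than anything Dolt implements: bucket the added rows by their non-key column values, then look up each removed row in those buckets. The function name `pair_key_changes`, the dict-based row format, and the column names are all hypothetical. It assumes column values are hashable and only pairs rows whose non-key columns match exactly, so a change that touches both a key and a non-key column would stay unpaired.

```python
from collections import defaultdict


def pair_key_changes(removed, added, pk_cols):
    """Pair removed and added rows that differ only in primary-key columns.

    Hypothetical heuristic: rows are dicts of column name -> value.
    Runs in expected linear time in the number of diff rows.
    """
    def non_key(row):
        # Signature of a row with the primary-key columns left out.
        return tuple(sorted((c, v) for c, v in row.items() if c not in pk_cols))

    # Group added rows by their non-key signature.
    buckets = defaultdict(list)
    for row in added:
        buckets[non_key(row)].append(row)

    pairs, unmatched_removed = [], []
    for row in removed:
        candidates = buckets.get(non_key(row))
        if candidates:
            # Take the earliest added row with the same non-key values.
            pairs.append((row, candidates.pop(0)))
        else:
            unmatched_removed.append(row)

    # Whatever is left in the buckets had no removed counterpart.
    unmatched_added = [r for rows in buckets.values() for r in rows]
    return pairs, unmatched_removed, unmatched_added


removed = [
    {"pk1": 1, "pk2": 1, "c": "x"},
    {"pk1": 2, "pk2": 2, "c": "y"},
    {"pk1": 5, "pk2": 5, "c": "gone"},
]
added = [
    {"pk1": 1, "pk2": 10, "c": "x"},
    {"pk1": 2, "pk2": 20, "c": "y"},
    {"pk1": 9, "pk2": 9, "c": "new"},
]
pairs, only_removed, only_added = pair_key_changes(removed, added, {"pk1", "pk2"})
```

A display layer could then emit each pair as adjacent rows and fall back to the usual primary-key ordering for the unmatched ones. As noted above, this would also pair a genuine delete with an unrelated insert if their non-key values happen to coincide.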
I see now, thanks for the clarification :)
this affects how we display diff rows on dolthub
repro:
pk2 in 2 rows:
related dolthub pull: https://www.dolthub.com/repositories/liuliu/test1/pulls/11/compare