-
Notifications
You must be signed in to change notification settings - Fork 13
Add ability to Squash Changes when merging #370
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
jeremystretch
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've just started getting into the meat of this PR, but wanted to return some initial feedback in the interest of keeping this moving.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Building on Jeremy's suggestion to make this a CharField with strategy choices, we should consider taking this further and refactor the merge/revert logic to use the actual Strategy pattern.
Right now we've got the boolean check scattered through merge() and revert() with _merge_collapsed() vs _merge_iterative() methods. If we're going to support multiple strategies (and "Squash" strongly implies there could be others), we should pull this into proper strategy classes.
Something like:
# strategies.py
class MergeStrategy(ABC):
@abstractmethod
def merge(self, branch, changes, request, commit, logger): ...
@abstractmethod
def revert(self, branch, changes, request, commit, logger): ...
class IterativeMergeStrategy(MergeStrategy):
# current _merge_iterative / _revert_iterative logic
class SquashMergeStrategy(MergeStrategy):
# current _merge_collapsed / _revert_collapsed logicThen Branch.merge() becomes:
def merge(self, user, commit=True):
strategy = get_strategy(self.merge_strategy) # looks up from field/registry
# ... setup ...
strategy.merge(self, changes, request, commit, logger)
# ... cleanup ...This gets the 800+ lines of merge logic out of the Branch model, makes it trivial to add new strategies, and each strategy is independently testable. Also means CollapsedChange and all the dependency ordering helpers move into SquashMergeStrategy where they belong. They're implementation details of that specific strategy, not general Branch concerns.
netbox_branching/migrations/0007_branch_merged_using_collapsed.py
Outdated
Show resolved
Hide resolved
One big benefit on this approach is that it makes each of the strategies much more testable at a lower level (e.g. not black-box testing Test coverage is great for end-to-end scenarios, but I think we might want to expand verification of the internal dependency ordering logic. All the existing tests verify that merges succeed/fail correctly, which is good, but they don't inspect whether the dependency graph is actually being built correctly. |
Fixes: #198
This squashes (collapses) all Changelogs (for a given object) in a branch to a single Change:
The old (iterative) and new (collapse) merge strategy are both supported (the default is the old iterative approach). There is a select-box on merge to enable the squash merge:
Because ObjectChanges can reference other objects (FK) it has to do dependency ordering: If an object references something that is created, the create has to come first, etc..
This will not fix 100% of the issues, but the goal is that the end user can fairly easily fix the branch (by either deleting a problem Object or modifying it's data) and be able to get the branch to merge.
The biggest thing this fixes is Where you have an Object that causes a Constraint error. (Site with a slug that conflicts with one in main). Currently using the replay of changelogs you cannot fix this as you will always try to replay the ObjectcChange with the conflicting state. With collapse migrations, the intermediate state of that caused the conflict is skipped.
The major thing this can't fix is the swap case: You have a Module Bay with two modules at position 1 and 2 and you swap these (move 1 to 3, 2 to 1, 3 to 2) as with collapse you miss the move to position 3 and no matter which one you try to merge you will run into a constraint error. But, you can easily fix this by temporarily modifying the data before merge then merging.
Note: I looked at doing partial collapse (i.e. only collapsing some of the changes) but quickly ran into issues because of the constraint issues mentioned above and that Objects can reference other objects which means you have to do a dependency change and at that point it is just cleaner to collapse all the migrations as you have to do all the work anyways and it gets messy figuring out what you can partially collapse.
Step1: For collapse you want to take the initial objects prechange_data, and then you want to collapse all the objects postchange_data down into one so later field modifications overwrite older ones.
Note: because you are collapsing you sort of loose the time sequence (do you take it from the first change or the last - which actually depends). So barring dependencies we generally do DELETEs, UPDATEs then CREATEs (within those in time order)
Step 2: For dependency ordering it matters what type of operation you are doing. For a create you care about if postchange_data from another object refers to it, but for a delete you care if the prechange_data for another object refers to it. Note: we only care if these references are to other objects being modified in this branch. If it is a reference to something in main we don't need to track the dependency.
Step 3: For actually doing the dependency ordering a given Change can reference more then one FK so we use an algorithm that handles this and have to have cycle detection, which hopefully shouldn't happen (A references B, references C, references A) but need to guard against it.
Note: There is an edge case with nullable FK fields and having a created item refer to each other (Circuit / CircuitTermiantions apparently do this) If we detect this we split the create back into a create (with the nulled fields) and an update setting those fields.
Note: All of these new ObjectChange are in-memory, we don't actually modify the existing ObjectChange in the database. Because of this we can't use the existing apply / revert code as those work on actual ObjectChange records, so we have new functions similar to those.
_build_fk_dependency_graphhas very similar code for each type of Change, but it is just dissimilar enough I didn't see a good way of refactoring it to be more DRY without increasing the complexity. I also created some sub-functions which are only called once (like_build_fk_dependency_graph) but it encapsulates the given functionality and makes it far easier to understand IMHO.Right now there is a lot of logging that will appear in the Merge job log making it fairly big. This info is useful for debugging and it was decided to error on the side of too-much info rather than paring it down. We can do that later after it has been out and proven itself.