Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge frontend and Pyk again #4066

Closed
Baltoli opened this issue Feb 29, 2024 · 6 comments
Closed

Merge frontend and Pyk again #4066

Baltoli opened this issue Feb 29, 2024 · 6 comments
Assignees

Comments

@Baltoli
Copy link
Collaborator

Baltoli commented Feb 29, 2024

We dropped Pyk as a direct component of the K repository in 2022 so that it could exist as a more standalone tool. Since then, our model of the relationship between these components has changed (specifically, we hope in the long term to push more of the features currently implemented in the frontend into Pyk, and for the frontend to gradually become a much smaller set of internal tools).

As part of this process, we probably want to think about re-merging the two repositories. Doing so with minimal disruption to downstream projects will require some thought.

Part of: #4063

@Baltoli Baltoli mentioned this issue Feb 29, 2024
5 tasks
@Baltoli
Copy link
Collaborator Author

Baltoli commented Mar 7, 2024

@tothtamas28 began experimenting with public runners for Pyk, but doing so is blocked on #4061. We should think about fixing that before any merge begins.

@ehildenb questions the performance of these runners; there is a rate-limiting effect as we are only allocated 15? 60? x4vcpu runners for our open-source code. Beyond a point we will run out of compute, but not clear when that will be. Heavier jobs will benefit from being run on our own runners.

@Baltoli
Copy link
Collaborator Author

Baltoli commented Mar 15, 2024

We discussed this in the K meeting today and arrived at the following set of plans:
Important prerequisite:

  • Eliminate flakiness in Pyk CI: Fine-tune CI tests pyk#1017
  • Get the Pyk-based version of the regression test suite passing at an appropriate success rate: Pass regression-new test-suite #4194
  • Ensure that every downstream repository is correctly syncing its K version from Pyk
  • Freeze changes to the Pyk repository
  • ❗Copy Pyk into the K source tree directly such that K presents a Poetry project compatible with the existing Pyk installation
  • On each downstream repo, change their Poetry code to point at K rather than the old Pyk repository.
  • Once all clients are updated, remove the old Pyk repository and reopen K to changes.

Caution

When we copy Pyk into the K source tree, we should take care to think through how the K build and installation process works for downstream users, and that we've produced a working version of the merged tree that can be installed locally / in a docker container / via Nix, as appropriate.

@Baltoli
Copy link
Collaborator Author

Baltoli commented Mar 21, 2024

First step of the merge:

  • After copying the source code in, still keep K installing as it always has
  • Pyk will then just be a standard Poetry project that lives in a subdirectory

We should disable autoupdates from Pyk before freezing its code.

@Baltoli
Copy link
Collaborator Author

Baltoli commented Mar 21, 2024

Post-merge subtleties we should address:

  • Test suite filtering; we don't want to run the K tests when the Pyk code changes
  • but we do want the converse. There is flakiness in the Pyk CI pipeline - 3 or 4 reruns sometimes
  • Pyk also publishes a docker image. We should incorporate Pyk into how we do K Docker images
  • Major version bump
  • kup install k with a local poetry version of Pyk should not break. Only breakage would be a reverse dependency of a lower-level tool on Pyk - don't do this

rv-jenkins pushed a commit to runtimeverification/pyk that referenced this issue Mar 22, 2024
Related:
*
runtimeverification/k#4066 (comment)

---------

Co-authored-by: devops <devops@runtimeverification.com>
@Baltoli
Copy link
Collaborator Author

Baltoli commented Apr 9, 2024

@tothtamas28 and I ran through the merge process yesterday; a summary of the concrete steps we're taking is:

  1. Lock PRs to Pyk to keep the repository state consistent after the merge.
    a. We can probably achieve this by manually removing automerge from all open Pyk PRs; the volume of traffic to the repository is not sufficiently high that we need a stronger automated solution for this.
    b. Notify the #pyk channel that this is happening.
  2. Perform a git subtree merge of Pyk into the K repository.
    a. This will look something like git subtree add -P pyk <pyk git url> master
    b. The .github directory from Pyk should be removed.
  3. Perform a few manual cleanup steps once the Pyk tree has been merged into K.
    a. Fix CI invocations so that Pyk tests are executed.
    b. Invert problematic .gitignore stanzas inside the local Pyk directory.
    c. Add Pyk to the Docker image that we build and release for K.
    d. Update CI release jobs to bump the Pyk package version to match K's version.
  4. Do a major version bump to K with the appropriate changelog included.
  5. Lock K PRs so that the Pyk merge + version bump PR can go through cleanly.
  6. In the devops repo, move every repo that currently depends on Pyk over to depending on K instead.
    a. Changes to both Pyk and K are locked at this point, so no repos will actually be receiving updates until we're ready to do so.
    b. Assign @Baltoli and @tothtamas28 to these update jobs as reviewers (temporarily).
  7. Merge the big PR.
    a. This will produce a one-off cluster of "Update K" jobs for each repository that historically depended on Pyk.
    b. For each repo, update the scripts that bump versions such that they take all the relevant information from the new K repository structure.
    c. If the repo's K / Pyk dependency was out of date, perform a one-off manual edit such that this new "Update K" job doesn't actually bump the version, but just points at the newly-structured repository in the appropriate place.
    d. Once updated, we can un-assign ourselves and reassign the correct reviewer.
  8. The critical path of the merge is now complete and we can reopen PRs.

There will be a secondary process that needs to happen once the big PR gets merged; the steps here are less critical and have fewer dependencies, but should be done ASAP to minimise disruption.

  1. Add filtering to K CI so that if only files under pyk/ change, we don't need to rebuild K (instead, we should download an appropriate release build).
  2. Transfer issues; this can be automated with the gh tool.
  3. Transfer PRs; this is a bit more manual but can be done with git subtree pull on an appropriate branch, and a manual copy of the PR body.

@Baltoli
Copy link
Collaborator Author

Baltoli commented Apr 9, 2024

Log

9 April 2024 11:22 - PRs to Pyk closed

List of automerge-labelled PRs is here (should be empty): https://github.com/runtimeverification/pyk/pulls?q=is%3Aopen+is%3Apr+label%3Aautomerge

9 April 2024 - 12:16 - Pyk repo filtered

To merge the history of the two repositories cleanly, we need to do some destructive modification on a local copy of Pyk:

  • All commits in Pyk need to be edited such that they reference files relative to a pyk/ subdirectory, rather than relative to a root directory.
  • Commit messages that reference pull request IDs should be rewritten to reference a full Github URL (when the history is imported, unqualified #-links will point wrongly at K PRs instead of the old Pyk ones)

We can do both of these things easily with git filter-repo.

$ git clone git@github.com:runtimeverification/pyk.git
$ cd pyk
$ git filter-repo --to-subdirectory-filter pyk
$ cat ../expressions.txt
regex:(.*)\(#(\d+)\)==>\1(https://github.com/runtimeverification/pyk/pull/\2)
$ git filter-repo --replace-message ../expressions.txt

The local checkout of Pyk here should now have a single pyk subdirectory, and commit messages that merge a PR should have a full URL instead.

9 April 2024 - 12:33 - Merge Pyk repository into K

We can now merge the filtered Pyk repository into K following these instructions.

Presuming that K and Pyk are checked out in the same parent directory:

$ git clone git@github.com:runtimeverification/k.git
$ cd k
$ git branch remerge-pyk develop
$ git checkout remerge-pyk
$ git remote add pyk ../pyk
$ git fetch pyk --tags
$ git merge --allow-unrelated-histories pyk/master
$ git remote remove pyk

This branch can now be opened as an ordinary PR.

9 April 2024 - 13:00 - Ignore pyk directory for formatting

Fortunately, this is the only change required to get K's existing CI to pass with the Pyk repository included in it: f6e2848

9 April 2024 - 15:53 - Run Pyk CI

We need to run all of Pyk's tests as part of K's existing PR test workflow1; this means porting the test portions over from the Pyk repository into the K workflow files.

Changes that needed to be made here:

  • Instead of downloading a released K .deb file, use the Github artifact download step to get it from this test run.
  • Add a dependency on the Jammy packaging step for the Pyk tests that rely on a K installation.
  • Changes to paths etc. to reflect Pyk now being in a subdirectory relative to the root.
  • Add Pyk to the K Docker images that are built on a successful K release; this required a rebased-out manual change to make sure it was working properly.

10 April 2024 - 08:47 - Align .gitignore

The only actual semantic difference we needed to preserve between the two repositories is that Pyk checks in .kore files, while the old frontend does not. We therefore lift Pyk's ignore rules up to the top level, other than a local exclusion for .kore.

10 April 2024 - 09:07 - Align Pyk library version

Pyk will no longer have its own separate version, but will instead take its version from the top-level K installation. This isn't a big change to enforce; we already have top-level version bumping infrastructure used for the K debian packages etc., so we just need to:

  • Set Pyk's version to 6.3.0
  • Add pyproject.toml to the list of files that get bumped on pushes to develop, with an appropriate sed incantation to perform the bump.

10 April 2024 - 11:11 - Test PR reinstatement

PRs can't be automatically transferred between repos, so we need to manually recreate the changes on the new repo. We can do this with a process similar to the way that we merged the master branch of Pyk into K:

  • Add a filtered local checkout of Pyk as a git remote to your K repo (see above).
  • Fetch refs from that remote (see above).
  • For each existing PR to Pyk:
    • Check out a new branch in the K repo, based on develop, with the same name as the Pyk PR branch.2
    • Merge the Pyk branch from the local remote into the new K branch.
    • Open a PR as usual.
    • Copy any relevant metadata across from Pyk to K for the PR.

I went through this process manually once, and it seems to work pretty well (#4165). Once the main merge PR goes through, I'll take care of automating this for the set of open PRs against Pyk.

10 April 2024 - 11:26 - Label Pyk issues

Issues can be transferred between repos, in contrast to PRs. I went through the list of open Pyk issues and added the pyk label to them, so that they will be easily identifiable when transferred. Once the main PR merge goes through, I will automate the transfer of these issues using gh.

10 April 2024 - 11:57 - Begin to port dependency update jobs

Once we're ready to merge the big K PR, we need to add a new dependency job for each repo that used to depend on Pyk such that they now depend on K. This PR: https://github.com/runtimeverification/devops/pull/202 does this, and assigns me as a reviewer for each one.

When the big PR is merged, I'll get a bunch of notifications, and will then fix each repo's dependency update scripts such that they now depend on K rather than Pyk. I can then re-assign the usual maintainer for each repo.

10 April 2024 - 11:57 - Merge PR open for review

😅

Footnotes

  1. A caveat here is that once the repository merge PR goes through, we'll want to do something more sophisticated than just running all our tests, all the time.

  2. The naming is really just for clarity.

Baltoli pushed a commit that referenced this issue Apr 9, 2024
Related:
*
#4066 (comment)

---------

Co-authored-by: devops <devops@runtimeverification.com>
Baltoli pushed a commit that referenced this issue Apr 9, 2024
Related:
*
#4066 (comment)

---------

Co-authored-by: devops <devops@runtimeverification.com>
Baltoli pushed a commit that referenced this issue Apr 9, 2024
Related:
*
#4066 (comment)

---------

Co-authored-by: devops <devops@runtimeverification.com>
Baltoli pushed a commit that referenced this issue Apr 10, 2024
Related:
*
#4066 (comment)

---------

Co-authored-by: devops <devops@runtimeverification.com>
Baltoli pushed a commit that referenced this issue Apr 10, 2024
Related:
*
#4066 (comment)

---------

Co-authored-by: devops <devops@runtimeverification.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants