Sam/nix and conventional ami #1012
@darora just wanted to follow up that in this new PR, testinfra ami tests are now passing for the nix ami build: https://github.com/supabase/postgres/actions/runs/9672036422/job/26683662377?pr=1012#step:10:95 This resolves #953 (comment). I have moved the docker work to a new PR that should be coming up tomorrow (and will deprecate the old).
In my view we still need some automatic deduplication of config files, to be sure they don't differ now and will not differ in the future. E.g. ansible-nix/files/kong_config/kong.service.j2 and postgres/ansible/files/kong_config/kong.service.j2 look like copies, and there are many other files like that. If we have files derived from other ones, I think it's better to have one "master" file and derive the others from it automatically. Otherwise, we could easily get confused by the changes between them. This also makes the PR somewhat bulky and hard to review. Maybe a script that derives config files from the main Supabase repo is a suitable solution for this. Or symlinks to existing files.
@pashkinelfe that's reasonable to me, thanks.
The diff itself looks fine, though someone will need to test an actual upgrade to see if there are any other hurdles that are not immediately obvious. Agreed with Pavel if it's doable in a straightforward manner. Otherwise, I guess we'll have to review a diff of the two dirs locally as a one-time exercise.
Approving once #1025 gets merged in to clean things up
This reverts commit f719e66.
Sam/nix and conventional consolidate (#1025) * feat: consolidate ansible and use vars to toggle AMI builds * fix: resolving merge conflict * chore: merge conflict * Revert "chore: merge conflict" This reverts commit ddc6b1d. * fix: update ansible location for script * fix: ansible consolidated location * fix: set up modes on system-setup * fix: set vars * fix: python True and False in extra_vars * fix: adj vars * fix: set all ami vars * fix: args as json * fix: nixpkg_mode * fix: refining mode rules * fix: consolidate create dirs * fix: cleaning up modes * fix: systemd psql service reload targets * fix: starting postgres issues * fix: timing for pgsodium_getkey script * fix: packer file upload on stage 2 * fix: consolidation of ansible location * fix: stage2 fix hostname * fix: limit stage that tasks run on * fix: setting hosts only on stage 2 nix ami * fix: rewrite hosts in ansible to allow for re-use of playbook file * chore: trigger checks * fix: pgsodium getkey is different for deb vs nix builds * fix: consolidated files location * fix: on stage2 postgres server is already started at this point * fix: without env vars * fix: vars on the right mode * fix: dedupe * fix: locales * fix: locales * chore: try step with no env vars * fix: no need to start pg at this point stage2 * fix: yaml * fix: more cleanup of modes * fix: snapd already absent at this point + consolidate tasks * fix: already absent at this point * fix: service not present at this stage * fix: disable different services for first boot depending on mode * fix: pg already restarted at this point in stage 2 * fix: no start on stage2 * fix: try to start in stage2 * chore: include env vars for stage2 * fix: stop before starting * fix: debpkg mode only * fix: should use conventional path * fix: need to locale-gen prior to initdb * fix: nix build needs .env * fix: stage2 treatment of pgsodium_getket * chore: re-introduce permission checks via osquery * fix: correct the path to files --------- 
Co-authored-by: Sam Rose <samuel@supabase.io>
* fix: was using the wrong sha256 hash for version * chore: updating wrappers version * itests: make sure we run the current commit on psql bundle test --------- Co-authored-by: Sam Rose <samuel@supabase.io>
* fix: locale gen and ami deregister on any testinfra run * fix: use more manual approach --------- Co-authored-by: Sam Rose <samuel@supabase.io>
* feat: nix-ami-changes * chore: version bump * chore: remap branch for ami build * chore: bump version * chore: bump version to trigger build * feat: use /var/lib/postgresql as home for postgres user * fix: makre sure bashrc exists * fix: minor refactor * chore: moving to a different PR * chore: bump version and remove deprecated workflow * feat: parallel testinfra-nix just for ami test * chore: testing just testinfra-nix workflow * chore: re-run build * chore: re-trigger testinfra * fix: wait for AMI to reach available state * fix: use ami id in stage 3 testinfra ami-test * fix: env vars * chore: bump version * chore: restore packer build * chore: create a parallel test * chore: bump version * fix: capture and use ami name * fix: aws regions * chore: capture ami name * chore: force_deregister all ami prior to create new * fix: pass same ami name each time * fix: manage concurrency of testinfra builds * fix: no args on stage 2 * fix: re-intro original testinfra * Revert "fix: re-intro original testinfra" This reverts commit f719e66. * chore: push to re-trigger build * chore: update instance name * fix: location of pg_isready binary * fix: re-intro conventional ami infra test + more symlinks where expected * fix: dealing with symlink creation issues * fix: try concurrency rules on on all large builds * chore; try with no concurrency rules * chore: rerun * chore: rebasing on develop Sam/nix and conventional consolidate (supabase#1025) * feat: consolidate ansible and use vars to toggle AMI builds * fix: resolving merge conflict * chore: merge conflict * Revert "chore: merge conflict" This reverts commit ddc6b1d. 
* fix: update ansible location for script * fix: ansible consolidated location * fix: set up modes on system-setup * fix: set vars * fix: python True and False in extra_vars * fix: adj vars * fix: set all ami vars * fix: args as json * fix: nixpkg_mode * fix: refining mode rules * fix: consolidate create dirs * fix: cleaning up modes * fix: systemd psql service reload targets * fix: starting postgres issues * fix: timing for pgsodium_getkey script * fix: packer file upload on stage 2 * fix: consolidation of ansible location * fix: stage2 fix hostname * fix: limit stage that tasks run on * fix: setting hosts only on stage 2 nix ami * fix: rewrite hosts in ansible to allow for re-use of playbook file * chore: trigger checks * fix: pgsodium getkey is different for deb vs nix builds * fix: consolidated files location * fix: on stage2 postgres server is already started at this point * fix: without env vars * fix: vars on the right mode * fix: dedupe * fix: locales * fix: locales * chore: try step with no env vars * fix: no need to start pg at this point stage2 * fix: yaml * fix: more cleanup of modes * fix: snapd already absent at this point + consolidate tasks * fix: already absent at this point * fix: service not present at this stage * fix: disable different services for first boot depending on mode * fix: pg already restarted at this point in stage 2 * fix: no start on stage2 * fix: try to start in stage2 * chore: include env vars for stage2 * fix: stop before starting * fix: debpkg mode only * fix: should use conventional path * fix: need to locale-gen prior to initdb * fix: nix build needs .env * fix: stage2 treatment of pgsodium_getket * chore: re-introduce permission checks via osquery * fix: correct the path to files --------- Co-authored-by: Sam Rose <samuel@supabase.io> * Sam/timescale and wrappers (supabase#1052) * fix: was using the wrong sha256 hash for version * chore: updating wrappers version * itests: make sure we run the current commit on psql bundle 
test --------- Co-authored-by: Sam Rose <samuel@supabase.io> * fix: locale gen and ami deregister on any testinfra run (supabase#1055) * fix: locale gen and ami deregister on any testinfra run * fix: use more manual approach --------- Co-authored-by: Sam Rose <samuel@supabase.io> * chore: update pg_upgrade initiate.sh to support nix-based upgrades (supabase#1057) * chore: package nix flake revision in pg_upgrade binaries tarball when building the nix AMI (supabase#1058) * chore: activate release workflow * chore: bump version --------- Co-authored-by: Sam Rose <samuel@supabase.io> Co-authored-by: Paul Cioanca <paul.cioanca@supabase.io>
This PR will need minor follow up prior to approval/merge for github actions that are dedicated to specifically merging to develop. It supersedes pr #953
Documentation of changes in #1012
Conventional AMI approach
The existing/conventional AMI build approach installs postgres from the postgresql-common ubuntu/debian package at the time of the AMI build. In addition, it builds extensions and wrappers from source at the point of AMI build, and installs them as ‘.deb’ packages.
Nix packaged postgresql bundle approach
In the nix approach, we use the postgresql provided by nixpkgs (currently pinned at version 15.6 via the a76c4553d7e741e17f289224eda135423de0491d commit of the nixpkgs-unstable branch, locked via https://github.com/supabase/postgres/blob/sam/2-stage-ami-nix/flake.lock#L114 ).
Nixpkgs sources postgresql from the URL specified at https://github.com/NixOS/nixpkgs/blob/a76c4553d7e741e17f289224eda135423de0491d/pkgs/servers/sql/postgresql/generic.nix#L52
The nixpkgs package applies the following patches for aarch64-linux pg 15.6:
https://github.com/NixOS/nixpkgs/blob/a76c4553d7e741e17f289224eda135423de0491d/pkgs/servers/sql/postgresql/patches/disable-resolve_symlinks.patch
https://github.com/NixOS/nixpkgs/blob/a76c4553d7e741e17f289224eda135423de0491d/pkgs/servers/sql/postgresql/patches/less-is-more.patch
https://github.com/NixOS/nixpkgs/blob/a76c4553d7e741e17f289224eda135423de0491d/pkgs/servers/sql/postgresql/patches/hardcode-pgxs-path.patch
https://github.com/NixOS/nixpkgs/blob/a76c4553d7e741e17f289224eda135423de0491d/pkgs/servers/sql/postgresql/patches/specify_pkglibdir_at_runtime.patch
https://github.com/NixOS/nixpkgs/blob/a76c4553d7e741e17f289224eda135423de0491d/pkgs/servers/sql/postgresql/patches/findstring.patch
https://github.com/NixOS/nixpkgs/blob/a76c4553d7e741e17f289224eda135423de0491d/pkgs/servers/sql/postgresql/patches/locale-binary-path.patch
https://github.com/NixOS/nixpkgs/blob/a76c4553d7e741e17f289224eda135423de0491d/pkgs/servers/sql/postgresql/patches/socketdir-in-run-13.patch
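The nixpkgs pin mentioned above can be read back out of a flake.lock file. The following is a minimal sketch, assuming the usual flake.lock JSON layout (a top-level "nodes" map, a "root" node name, and a "locked.rev" field per input). The helper name locked_rev and the trimmed sample lock file are illustrative assumptions, not taken from the repo.

```python
import json


def locked_rev(flake_lock_text: str, input_name: str = "nixpkgs") -> str:
    """Return the git revision that a flake.lock pins for a direct root input."""
    lock = json.loads(flake_lock_text)
    nodes = lock["nodes"]
    root = nodes[lock["root"]]
    # The root node maps each input name to the name of the node that locks it.
    # Assumption: the input is a plain string reference, not a "follows" path.
    node_name = root["inputs"][input_name]
    return nodes[node_name]["locked"]["rev"]


# Hypothetical, trimmed-down lock file; a real flake.lock has more nodes and fields.
SAMPLE_LOCK = json.dumps({
    "nodes": {
        "nixpkgs": {
            "locked": {
                "owner": "NixOS",
                "repo": "nixpkgs",
                "rev": "a76c4553d7e741e17f289224eda135423de0491d",
                "type": "github",
            }
        },
        "root": {"inputs": {"nixpkgs": "nixpkgs"}},
    },
    "root": "root",
    "version": 7,
})

print(locked_rev(SAMPLE_LOCK))
```

A check like this could be used to confirm that the revision quoted in documentation still matches what the lock file actually pins.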
When a PR is submitted to the supabase/postgres repo updating any of the nix packages maintained there, a build of the entire bundle is triggered on supported systems (x86_64-linux and aarch64-linux as of this writing). When the nix ci workflow is initiated, nix is able to source from our binary cache (currently located in a publicly readable aws s3 bucket at https://nix-postgres-artifacts.s3.amazonaws.com ) and will check for any component dependency which has an exact match and has already been built successfully. Nix will source that built version from the cache, and only build the items that have changed. If nix cannot build a changed item, the build will fail. If the build succeeds, nix will perform flake “checks” (scripted tests with dependencies managed by nix). An example of a “check” is seen here: https://github.com/supabase/postgres/actions/runs/9468138054/job/26083806922?pr=953#step:6:813 This starts the database and enables several extensions post-build. If this test fails, the build will also fail. If the nix build and check succeed, the build will upload the artifacts to the nix cache for re-use prior to stopping this workflow.
Our CI implementation of nix has only 2 trusted public keys and 2 specified nix caches (ours and the upstream nixpkgs community cache). These are set at https://github.com/supabase/postgres/blob/sam/2-stage-ami-nix/docker/nix/Dockerfile#L5 and, on the AMI, at https://github.com/supabase/postgres/blob/sam/2-stage-ami-nix/scripts/nix-provision.sh#L20
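The cache reuse described above (fetch anything with an exact match, rebuild only what changed) can be illustrated with a toy model. This is a simplified sketch and not how Nix computes store paths. The package names, the "source" strings, and the functions store_key and plan_build are all hypothetical.

```python
import hashlib


def store_key(name: str, source: str, dep_keys: list[str]) -> str:
    """Derive a key from a package's own inputs and the keys of its dependencies."""
    digest = hashlib.sha256()
    for part in [name, source, *sorted(dep_keys)]:
        digest.update(part.encode())
        digest.update(b"\0")
    return digest.hexdigest()


def plan_build(packages: dict[str, tuple[str, list[str]]], cache: set[str]):
    """Split packages into those fetched from the cache and those to be built.

    `packages` maps name -> (source text, dependency names) and must be acyclic.
    """
    keys: dict[str, str] = {}

    def key_of(name: str) -> str:
        if name not in keys:
            source, deps = packages[name]
            keys[name] = store_key(name, source, [key_of(d) for d in deps])
        return keys[name]

    fetched, built = [], []
    for name in sorted(packages):
        (fetched if key_of(name) in cache else built).append(name)
    return fetched, built, keys


# Hypothetical bundle: two extensions that depend on postgresql.
before = {
    "postgresql": ("src-a", []),
    "pgsodium": ("src-b", ["postgresql"]),
    "wrappers": ("src-c", ["postgresql"]),
}
_, _, first_keys = plan_build(before, set())
cache = set(first_keys.values())  # pretend the first run uploaded everything

# Only the wrappers source changes, so only wrappers needs a rebuild.
after = dict(before, wrappers=("src-c2", ["postgresql"]))
fetched, built, _ = plan_build(after, cache)
print(fetched, built)
```

Because each key also covers the keys of the dependencies, a change to postgresql in this model would invalidate every package that depends on it, which is why a change low in the dependency graph triggers a larger rebuild.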
In the debian/ubuntu postgresql-common package, the “postgres” user is created, and postgres is installed to locations that are conventional for debian/ubuntu. In the nix approach, we explicitly create the “postgres” linux user, and then we use the nix profile install method to install the nix-built binaries for postgres into the nix profile for the “postgres” user (located at /home/postgres/.nixprofile on the ami machine). We then alias the installed file locations to the conventional debian/ubuntu locations for postgres installation. The nix profile command will give us an imperative way to install, uninstall, and upgrade packages that we build with nix going forward, allowing us to integrate our nix-built packages with debian/ubuntu distributions.
2 Stage AMI approach
The Ansible and Packer code has been forked in parallel in the same repo, so that both the nix-built approach and the existing ubuntu/debian package approach can be supported in parallel. This will allow continued production rollouts under the old method, while also allowing targeted rollouts with the nix-built AMI.
The existing build lives under the same ansible folder, and the companion packer hcl files have been retained. The parallel nix AMI build has parallel packer files with nix inserted into the name, and a new folder ansible-nix. Both of these builds use the same command line recipe to initiate them.
Description of 2 stage approach
The previous packer/ansible build used the ebssurrogate builder ( https://developer.hashicorp.com/packer/integrations/hashicorp/amazon/latest/components/builder/ebssurrogate ) exclusively. The new nix-based build retains the ebssurrogate approach to build and configure everything except for the postgres bundle.
The nature of nix builds is that they are already “sandboxed” and isolated at build time, and the results are stored in a read-only directory called the “nix store”. Nix has never needed to support building in chroot as the ebssurrogate packer build does, and so running nix in chroot has never been supported.
Therefore, a second stage of the AMI build was introduced. It securely sources the private “stage1” AMI built by the stage 1 ebssurrogate approach, and then installs the nix-built supabase postgres/extensions/wrappers bundle from the binary cache using the conventional github.com/hashicorp/amazon packer plugin. This stage is limited to installing, configuring and testing postgres from either files uploaded in the first stage or sourced from the nix cache (other than the stage 2 ansible playbook and unit test files). The workflow that performs these 2 stages is located here: https://github.com/supabase/postgres/blob/sam/2-stage-ami-nix/.github/workflows/ami-release-nix.yml As more Supabase projects are packaged in nix, they will be moved into this 2nd stage for installation and configuration.
In the 2nd stage we run migration and unit tests, and linux user/group assignment checks with a temporarily installed copy of osquery ( https://github.com/supabase/postgres/blob/sam/2-stage-ami-nix/ansible-nix/tasks/stage2/playbook.yml#L71 and https://github.com/supabase/postgres/blob/sam/2-stage-ami-nix/ansible-nix/files/permission_check.py ).
The 2nd stage also creates path aliases to the nix-installed binaries so that files and configurations are still where they are expected to be as much as possible. This allows post-AMI-build init scripts like https://github.com/supabase/infrastructure/blob/develop/init-scripts/project/00-init.sh to continue to succeed in running.
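The path aliasing described above can be pictured as creating symlinks from conventional locations to the nix-installed files. The sketch below is only an illustration under that assumption. The real build does this in its Ansible tasks, and the function alias_binaries and the directory names used here are hypothetical stand-ins that run inside a throwaway temporary directory.

```python
import tempfile
from pathlib import Path


def alias_binaries(profile_bin: Path, conventional_bin: Path) -> list[Path]:
    """Symlink every entry of profile_bin into conventional_bin; return new links."""
    conventional_bin.mkdir(parents=True, exist_ok=True)
    created = []
    for target in sorted(profile_bin.iterdir()):
        link = conventional_bin / target.name
        if link.is_symlink() or link.exists():
            continue  # never overwrite something that is already in place
        link.symlink_to(target)
        created.append(link)
    return created


# Demonstration in a temporary directory; both paths are stand-ins, not AMI paths.
with tempfile.TemporaryDirectory() as tmp:
    profile_bin = Path(tmp) / "profile" / "bin"
    profile_bin.mkdir(parents=True)
    (profile_bin / "pg_isready").write_text("#!/bin/sh\n")
    links = alias_binaries(profile_bin, Path(tmp) / "conventional" / "bin")
    print([link.name for link in links])
```

Skipping entries that already exist keeps the helper safe to run more than once, which matters when a playbook may be re-applied to the same machine.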
We are maintaining documentation on how to work with the nix portion of supabase/postgres at https://github.com/supabase/postgres/tree/sam/2-stage-ami-nix/nix/docs and will continue to expand that as much as possible.
Current Progress on adoption in https://github.com/supabase/postgres
There is an umbrella draft PR at #953 which includes the building of an aarch64-linux AMI
Docker image PR #986
Docker AIO Image PR #987