Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rpm-ostree jigdo ♲📦 initial code drop #1103

Closed
wants to merge 5 commits into from

Conversation

cgwalters
Copy link
Member

@cgwalters cgwalters commented Nov 14, 2017

@@ -41,6 +41,9 @@ static RpmOstreeCommand compose_subcommands[] = {
{ "commit", RPM_OSTREE_BUILTIN_FLAG_LOCAL_CMD | RPM_OSTREE_BUILTIN_FLAG_REQUIRES_ROOT,
"Commit a target path to an OSTree repository",
rpmostree_compose_builtin_commit },
{ "commit2jigdo", RPM_OSTREE_BUILTIN_FLAG_LOCAL_CMD | RPM_OSTREE_BUILTIN_FLAG_REQUIRES_ROOT,
"Convert an OSTree commit into an OIRPM (jigdo)",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: maybe let's just stick with OIRPM as the final term and keep jigdo in e.g. documentation material? E.g. just

rpm-ostree compose oirpm

?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd been thinking of the "whole thing" as "rpm-ostree jigdo", and "oirpm" is the special additional RPM. But that said I'm totally open to changing this.

@cgwalters cgwalters added the WIP label Nov 15, 2017
@rh-atomic-bot
Copy link

☔ The latest upstream changes (presumably 1240d8d) made this pull request unmergeable. Please resolve the merge conflicts.

@rh-atomic-bot
Copy link

☔ The latest upstream changes (presumably 7ab8869) made this pull request unmergeable. Please resolve the merge conflicts.

@rh-atomic-bot
Copy link

☔ The latest upstream changes (presumably 3668261) made this pull request unmergeable. Please resolve the merge conflicts.

@cgwalters
Copy link
Member Author

OK, so this works end-to-end now. I have a local test setup where I've cached the RPMs and have been using this script:

#!/usr/bin/bash
set -xeuo pipefail

rpm-ostree compose commit2jigdo --repo=repo-build --pkgcache-repo=cache/pkgcache-repo fedora/27/x86_64/atomic-host  /srv/walters/tmp/fedora-atomic-host-oirpm.spec  && mv /root/rpmbuild/RPMS/x86_64/fedora-atomic-host-* f27-jigdo-test/ && (cd f27-jigdo-test/&&createrepo_c .)
rm repo-jigdo-import/ -rf && ostree --repo=repo-jigdo-import init --mode=bare-user && echo 'fsync=false' >> repo-jigdo-import/config && env G_DEBUG=fatal-warnings rpm-ostree compose jigdo2commit --repo=repo-jigdo-import -d /srv/walters/src/pagure/fedora-atomic -e fedora-27-local -e fedora-27-jigdo --releasever 27 fedora-atomic-host

@cgwalters
Copy link
Member Author

@jlebon Can you take a high level look at this? Going forward from here is going to require a lot of changes to e.g. rpmostree-core.c and rpmostree-sysroot-upgrader.c etc. to start making use of all of this. Things like defining what an origin file looks like, moving the logic for "find best jigdo package" into core.c etc.

@rh-atomic-bot
Copy link

☔ The latest upstream changes (presumably 9c004e1) made this pull request unmergeable. Please resolve the merge conflicts.

Copy link
Member

@jlebon jlebon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, there's a lot here. I tried not to spend too much time looking at the implementation details, though the approach in general looks sane. Just some comments on things that caught my eye.

One feeling that's itching at me is how we're overloading some more the core and importer functionality (and soon origin and upgrader). All those different modes of operation will make things harder to hack on. But I don't have any suggestions right now to help with that.

Maybe let's work towards getting the commit2jigdo and jigdo2commit commands in first (e.g. prefix with ex-?) before we implement actual client-side workflows? I'd like to discuss some more what that will look like first (let's do that in the original issue!). Also it'll be easier for others to join in on the fun!

design/jigdo.md Outdated
expressions involved. However, RPM packages themselves don't contain labels;
instead the labeling is stored in the `selinux-policy-targeted` package, and
further complicating things is that there are also other packages that the
labeling configuration such as `container-selinux`. We go to great
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sentence is missing a couple of words. :)

GCancellable *cancellable,
GError **error)
{
g_autoptr(GOptionContext) context = g_option_context_new ("COMMIT OIRPM-SPEC");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we also require a spec file? Shouldn't we just auto-generate it since we already have all the semantics needed in place. E.g. in the fedora-atomic WIP you posted, you have %{ostree_version}. All the other fields can similarly be filled, except for the name, which we could just make a new key in the treefile? (We can fetch that from the commit, right?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, where are the Requires added to the spec? Is that part not in yet?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, there are things like %description...I dunno. Maybe support providing one but if not we autogenerate it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as Requires; I decided not to do this initially; the rationale is that we can easily get other synthetic Requires that RPM injects like rpmlib(PayloadIsXz) <= 5.2-1 and so we'd have to learn to ignore those. Another reason down the line is that I want the "jigdo set" to actually be metadata in the ostree commit, so it can be covered by that GPG signature.

guint64 epoch;
g_variant_get_child (pkgs, i, "(&st&s&s&s&s)",
&name, &epoch, &version, &release, &architecture,
&repodata_checksum);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see, so we're just storing the pkglist in the OIRPM rather than using Requires. It might be interesting to still do the latter though for the benefit of all those tools that understand them.

GCancellable *cancellable,
GError **error)
{
g_clear_pointer (&self->pkgs_to_download, (GDestroyNotify)g_ptr_array_unref);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add asserts here to make sure these aren't even set at all?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do use this function twice currently; the oirpm is downloaded and parsed, and among other things we get the jigdo set, which we then download. I guess we could clear the "pkgs to download" set after downloading is done? Mmm.

DnfContext *dnfctx = rpmostree_context_get_dnf (self->ctx);
g_autoptr(DnfPackage) oirpm_pkg = NULL;
{ hy_autoquery HyQuery query = hy_query_create (dnf_context_get_sack (dnfctx));
const char *at = strchr (oirpm_id, '@');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, can we make this a separate --version argument or even just a second positional arg?

@cgwalters
Copy link
Member Author

As far as a plan for landing this under an ex prefix, yep makes sense! I'll work on that and some tests.

@cgwalters
Copy link
Member Author

Running test-jigdo.sh...                            
-   ok jigdo ♲📦                                    
PASS: test-jigdo.sh
TOTAL: 1 PASS: 1 SKIP: 0 FAIL: 0

@cgwalters cgwalters changed the title WIP: rpm-ostree jigdo ♲📦 rpm-ostree jigdo ♲📦 initial code drop Dec 1, 2017
@cgwalters cgwalters removed the WIP label Dec 1, 2017
@cgwalters
Copy link
Member Author

Rebased, and now with improved testing: we do an incremental update.

@rh-atomic-bot
Copy link

☔ The latest upstream changes (presumably dafb3d6) made this pull request unmergeable. Please resolve the merge conflicts.

I've been running these a lot locally and this helps a bit.
`test -n` would always be true.  Also while I have the patient open
let's make writing the file atomic.  Maybe someday I'll be motivated
enough to write an `O_TMPFILE` patch for bash.
Tracking issue: coreos#1081

To briefly recap: Let's experiment with doing ostree-in-RPM, basically the
"compose" process injects additional data (SELinux labels for example) in an
"ostree image" RPM, like `fedora-atomic-host-27.8-1.x86_64.rpm`. That "ostree
image" RPM will contain the OSTree commit+metadata, and tell us what RPMs we
need need to download. For updates, like `yum update` we only download changed
RPMs, plus the new "oirpm". But SELinux labeling, depsolving, etc. are still
done server side, and we still have a reliable OSTree commit checksum.

This is a lot like [Jigdo](http://atterer.org/jigdo/)

Here we fully demonstrate the concept working end-to-end; we use the
"traditional" `compose tree` to commit a bunch of RPMs to an OSTree repo, which
has a checksum, version etc. Then the new `ex commit2jigdo` generates the
"oirpm". This is the "server side" operation. Next simulating the client side,
`jigdo2commit` takes the OIRPM and uses it and downloads the "jigdo set" RPMs,
fully regenerating *bit for bit* the final OSTree commit.

If you want to play with this, I'd take a look at the `test-jigdo.sh`; from
there you can find other useful bits like the example `fedora-atomic-host.spec`
file (though the canonical copy of this will likely land in the
[fedora-atomic](http://pagure.io/fedora-atomic) manifest git repo.
@@ -30,6 +30,10 @@ static RpmOstreeCommand ex_subcommands[] = {
"Assemble local unprivileged containers", rpmostree_builtin_container },
{ "kargs", 0,
"Query or Modify the kernel arguments", rpmostree_ex_builtin_kargs },
{ "commit2jigdo", RPM_OSTREE_BUILTIN_FLAG_LOCAL_CMD,
"Convert an OSTree commit into a rpm-ostree jigdo format", rpmostree_ex_builtin_commit2jigdo },
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: for funsies, let's just make this fully symmetrical? "Convert an OSTree commit into an rpm-ostree jigdo" ?


if (argc != 4)
{
rpmostree_usage_error (context, "COMMIT OIRPM-SPEC is required", error);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing OUTPUTDIR


/* Prepare a context - this does some generic pre-compose initialization from
* the arguments such as loading the treefile and any specified metadata.
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This no longer seems to be the case?

GCancellable *cancellable,
GError **error)
{
g_assert_cmpint (*outputdir, ==, '/');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We just checked for this, right? I'm fine with it, just mentioning in case it's leftover from before you made it an explicit error in the caller.

GCancellable *cancellable,
GError **error)
{
g_autoptr(GOptionContext) context = g_option_context_new ("COMMIT OIRPM-SPEC OUTPUTDIR");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/COMMIT/REV/ and s/provided_commit/rev/ in impl_commit2jigdo?

/* We'll account for new big objects later after more analysis */
}

g_print ("%u/%u packages contain objects; %u small, examining %u big objects more closely...\n",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is minor, though I'd word this like:

Found objects in 123/456 packages; found $small + $big objects unpackaged.

Also, it seems like the big objects are examined lower down after this next block, right?

g_print ("oirpm content size (big objs): %s\n", oirpm_bytes_formatted_big);
}

g_auto(GLnxTmpDir) oirpm_tmpd = { 0, };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's definitely a couple of spots where we could factor out clean self-contained functions. Though here at least seems like a good place to split this behemoth in two, WDYT? There's already a lot in scope to keep track of.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep done

buf, buflen, GLNX_FILE_REPLACE_NODATASYNC,
cancellable, error))
return FALSE;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems easier to just enhance write_one_new_object for these two, no?

#include "libglnx.h"
#include <rpm/rpmlib.h>
#include <libdnf/libdnf.h>

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use this file to document some more the file structure? E.g. 00commit/meta, 04new-content-ident/nn/01meta, etc...

g_hash_table_iter_remove (&it);
}

GLNX_HASH_TABLE_FOREACH_IT (new_big_content_identical, it, const char *, content_checksum,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole block looks like a duplicate of the block above.

// gpg sig?
DnfPackage *pkg = query_nevra (dnfctx, name, epoch, version, release, architecture, error);
if (!pkg)
return FALSE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't require a package to be in the repos if we already have it imported, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's a very good point. We want to support upgrading ancient systems.

@@ -837,3 +837,54 @@ rpmostree_diff_print (GPtrArray *removed,
g_print (" %s\n", nevra);
}
}

/* Copy of ot_variant_bsearch_str() from libostree
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not for now, though this looks like a good candidate for libglnx.

@cgwalters
Copy link
Member Author

Ok fixups ⬆️ though for some of them I just added // FIXME for now

@jlebon
Copy link
Member

jlebon commented Dec 4, 2017

Awesome, looks good!
@rh-atomic-bot r+ 555f333

@rh-atomic-bot
Copy link

⚡ Test exempted: pull fully rebased and already tested.

rh-atomic-bot pushed a commit that referenced this pull request Dec 4, 2017
`test -n` would always be true.  Also while I have the patient open
let's make writing the file atomic.  Maybe someday I'll be motivated
enough to write an `O_TMPFILE` patch for bash.

Closes: #1103
Approved by: jlebon
rh-atomic-bot pushed a commit that referenced this pull request Dec 4, 2017
Tracking issue: #1081

To briefly recap: Let's experiment with doing ostree-in-RPM, basically the
"compose" process injects additional data (SELinux labels for example) in an
"ostree image" RPM, like `fedora-atomic-host-27.8-1.x86_64.rpm`. That "ostree
image" RPM will contain the OSTree commit+metadata, and tell us what RPMs we
need need to download. For updates, like `yum update` we only download changed
RPMs, plus the new "oirpm". But SELinux labeling, depsolving, etc. are still
done server side, and we still have a reliable OSTree commit checksum.

This is a lot like [Jigdo](http://atterer.org/jigdo/)

Here we fully demonstrate the concept working end-to-end; we use the
"traditional" `compose tree` to commit a bunch of RPMs to an OSTree repo, which
has a checksum, version etc. Then the new `ex commit2jigdo` generates the
"oirpm". This is the "server side" operation. Next simulating the client side,
`jigdo2commit` takes the OIRPM and uses it and downloads the "jigdo set" RPMs,
fully regenerating *bit for bit* the final OSTree commit.

If you want to play with this, I'd take a look at the `test-jigdo.sh`; from
there you can find other useful bits like the example `fedora-atomic-host.spec`
file (though the canonical copy of this will likely land in the
[fedora-atomic](http://pagure.io/fedora-atomic) manifest git repo.

Closes: #1103
Approved by: jlebon
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants