Internally, items have always been categorized into either phase1 or
phase2 during commit, with phase2 being committed last. This is needed
to ensure, for example, that a repomd.xml is not committed until after
all other files referenced by it have been committed.
This change now documents that concept and exposes it via the API.
By default, clients get the same behavior as always, but they can now
also explicitly request a phase1 commit.
A phase1 commit writes phase1 items to the DB, same as always, but then
stops without proceeding to write phase2 items. It also leaves the
publish open for later modifications and a later phase2 commit.
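
As a rough illustration of the semantics described above, here is a minimal
in-memory sketch. It is not the exodus-gw implementation: the Publish class,
the phase_of helper, the "phase1"/"phase2" mode strings and the rule that
only a file named repomd.xml counts as phase2 are all assumptions made for
the example.

```python
from dataclasses import dataclass, field

# Assumption: entry points are recognized by basename; the real rules may differ.
PHASE2_BASENAMES = {"repomd.xml"}


def phase_of(path: str) -> int:
    """Return 2 for entry-point files, 1 for everything else."""
    return 2 if path.rsplit("/", 1)[-1] in PHASE2_BASENAMES else 1


@dataclass
class Publish:
    items: list = field(default_factory=list)  # paths added to the publish
    db: list = field(default_factory=list)     # paths written, in write order
    is_open: bool = True

    def add(self, path: str) -> None:
        if not self.is_open:
            raise RuntimeError("publish is already committed")
        self.items.append(path)

    def commit(self, mode: str = "phase2") -> None:
        if mode not in ("phase1", "phase2"):
            raise ValueError(f"unknown commit mode: {mode}")
        if not self.is_open:
            raise RuntimeError("publish is already committed")
        # Phase1 items are always written first, skipping any already written.
        self._write(1)
        if mode == "phase1":
            return  # stop here; the publish stays open for more items
        self._write(2)
        self.is_open = False

    def _write(self, phase: int) -> None:
        for path in self.items:
            if phase_of(path) == phase and path not in self.db:
                self.db.append(path)
```

In this sketch, calling commit("phase1") can be repeated as more items are
added, and a plain commit() behaves like the existing default: phase1 items
first, then phase2 items, then the publish is closed.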

=== Why? ===

This will be used to solve the following Pub/rhsm-pulp/exodus-gw
integration problem:
Imagine that you need to publish multiple Pulp yum repos. You want this
to be atomic, so you want them all to use the same exodus-gw publish.
You ask the repos to publish, and they succeed, but then something
goes wrong during the later exodus-gw commit, which is performed outside
of Pulp.
So you restart the whole process, republish the repos using a new
exodus-gw publish, and commit that successfully.
Problem: as far as Pulp is concerned, the publish tasks from the first
attempt were completely successful, as it does not "see" the later
failure to commit. Therefore, Pulp incorrectly thinks that the RPMs
processed by those tasks are fully published to the CDN, and so skips
publishing them during later tasks. This leads to missing content on the
CDN.
Solution: exodus-rsync, as invoked by Pulp during publish, will request
a phase1 commit during the publish of each repo. This ensures the processed
RPMs (or other non-entrypoint files) are fully published on the CDN by the
time the Pulp publish task succeeds, matching Pulp's expectations.
The publish of repomd.xml is still held back until a later phase2
commit, retaining the atomic semantics across multiple repos.
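
The intended flow can be simulated with a small sketch. It models the CDN as
a plain set and is only meant to show the ordering guarantees. The
publish_repos function, its fail_before_final_commit flag and the example
paths are hypothetical and do not correspond to any real Pulp, exodus-rsync
or exodus-gw interface.

```python
def publish_repos(repos, fail_before_final_commit=False):
    """Return the set of paths visible on the simulated CDN."""
    cdn = set()     # paths made visible by a commit
    held_back = []  # entry points waiting for the final phase2 commit

    for files in repos.values():
        # One Pulp publish task per repo; it ends with a phase1 commit.
        for path in files:
            if path.endswith("/repomd.xml"):
                held_back.append(path)
            else:
                cdn.add(path)  # visible once this repo's task succeeds

    if fail_before_final_commit:
        # RPMs are already on the CDN, so Pulp's view of them is correct,
        # but no repo has switched to its new repomd.xml yet.
        return cdn

    # Single phase2 commit: all entry points become visible together.
    cdn.update(held_back)
    return cdn
```

If the final commit fails in this model, every RPM is still present and no
repomd.xml has been exposed, so a retry only needs to deal with the entry
points and the repos never appear half-updated.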