Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add repository_version param as a building context #1687

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions CHANGES/479.feature
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
Replaced `artifacts` by `build_context` (i.e., a file plugin repository version HREF)
as the parameter to provide the build context for a Containerfile.
Added the `containerfile_name` as the parameter to provide the relative-path of a File Content
from the provided `build_context`.
1 change: 1 addition & 0 deletions CHANGES/479.removal
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Removed support for using raw artifacts as the build context and for the Containerfile.
65 changes: 50 additions & 15 deletions docs/admin/guides/build-image.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,46 +7,81 @@

Users can add new images to a container repository by uploading a Containerfile. The syntax for
Containerfile is the same as for a Dockerfile. The same REST API endpoint also accepts a JSON
string that maps artifacts in Pulp to a filename. Any artifacts passed in are available inside the
build container at `/pulp_working_directory`.
string that maps artifacts in Pulp to a filename.

## Create a Repository
To make a file available during the build process, we need to add it to a versioned file repository,
then pass it as the `build_context` query parameter of the build container API, and finally specify the
```
ADD/COPY <file relative-path> <location in container>
```
instruction in the Containerfile.

It is possible to define the Containerfile in two ways:
* from a [local file](site:pulp_container/docs/admin/guides/build-image#build-from-a-containerfile-uploaded-during-build-request) and pass it during build request
* from an [existing file](site:pulp_container/docs/admin/guides/build-image#upload-the-containerfile-as-a-file-content) in the `build_context`

## Create a Container Repository

```bash
REPO_HREF=$(pulp container repository create --name building | jq -r '.pulp_href')
CONTAINER_REPO=$(pulp container repository create --name building | jq -r '.pulp_href')
```

## Create an Artifact
## Create a File Repository and populate it

```bash
FILE_REPO=$(pulp file repository create --name bar --autopublish | jq -r '.pulp_href')

echo 'Hello world!' > example.txt

ARTIFACT_HREF=$(http --form POST http://localhost/pulp/api/v3/artifacts/ \
file@./example.txt \
| jq -r '.pulp_href')
pulp file content upload --relative-path foo/bar/example.txt \
--file ./example.txt --repository bar
```

## Create a Containerfile

```bash
echo 'FROM centos:7

# Copy a file using COPY statement. Use the relative path specified in the 'artifacts' parameter.
# Copy a file using COPY statement. Use the path specified in the '--relative-path' parameter.
COPY foo/bar/example.txt /inside-image.txt

# Print the content of the file when the container starts
CMD ["cat", "/inside-image.txt"]' >> Containerfile
```

## Build an OCI image

## Build from a Containerfile uploaded during build request

### Build an OCI image with the "local" Containerfile

```bash
TASK_HREF=$(http --form POST ${BASE_ADDR}${CONTAINER_REPO}'build_image/' "containerfile@./Containerfile" \
build_context=${FILE_REPO}versions/1/ | jq -r '.task')
```

!!! note

If `TMPFILE_PROTECTION_TIME` is set to 0 (default value), the automatic cleanup is disabled,
meaning all the uploaded Containerfile used for the builds will be kept in `FILE_UPLOAD_TEMP_DIR`.


## Upload the Containerfile to a File Repository and use it to build

### Upload the Containerfile as a File Content

```bash
pulp file content upload --relative-path MyContainerfile --file ./Containerfile --repository bar
```

### Build an OCI image from a Containerfile present in build_context

```bash
TASK_HREF=$(http --form POST :$REPO_HREF'build_image/' containerfile@./Containerfile \
artifacts="{\"$ARTIFACT_HREF\": \"foo/bar/example.txt\"}" | jq -r '.task')
TASK_HREF=$(http --form POST ${BASE_ADDR}${CONTAINER_REPO}'build_image/' containerfile_name=MyContainerfile \
build_context=${FILE_REPO}versions/2/ | jq -r '.task')
```


!!! warning

Non-staff users, lacking read access to the `artifacts` endpoint, may encounter restricted
functionality as they are prohibited from listing artifacts uploaded to Pulp and utilizing
them within the build process.
File repositories synced with on-demand policy will not automatically pull the missing artifacts.
Trying to build using a file that is not yet pulled will fail.
119 changes: 72 additions & 47 deletions pulp_container/app/serializers.py
Original file line number Diff line number Diff line change
@@ -1,13 +1,14 @@
from gettext import gettext as _
import os
import re

from django.core.validators import URLValidator
from rest_framework import serializers

from pulpcore.plugin.models import (
Artifact,
ContentArtifact,
ContentRedirectContentGuard,
PulpTemporaryFile,
Remote,
Repository,
RepositoryVersion,
Expand Down Expand Up @@ -758,13 +759,12 @@ class OCIBuildImageSerializer(ValidateFieldsMixin, serializers.Serializer):
A repository must be specified, to which the container image content will be added.
"""

containerfile_artifact = RelatedField(
many=False,
lookup_field="pk",
view_name="artifacts-detail",
queryset=Artifact.objects.all(),
containerfile_name = serializers.CharField(
required=False,
allow_blank=True,
help_text=_(
"Artifact representing the Containerfile that should be used to run podman-build."
"Name of the Containerfile, from build_context, that should be used to run "
"podman-build."
),
)
containerfile = serializers.FileField(
Expand All @@ -774,65 +774,90 @@ class OCIBuildImageSerializer(ValidateFieldsMixin, serializers.Serializer):
tag = serializers.CharField(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, i cannot comment on 10 lines above ...
I think, we should not keep the containerfile_artifact as is.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum... I could think of 2 ways to do that:

  • provide a new field to pass the pulp_href of the File Content with the Containerfile (not sure if this could be considered an advantage, but with this approach users could use a different repository to store the Containerfile or pass the same as the build_context)
  • use the build_context field and add a logic to validate if the Containerfile is present in the specified File Repository

What do you think of these ideas?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's valuable to be able to use the same build-context with different Container files.

That being said, we could add a containerfile name field to select one file from the build context. Or, yes, we just allow to specify a file_content item for now. I don't know exactly what workflows we need to cover. We still (already) have the option to upload a file on the go, right? So adding more options feels not too bad.

required=False, default="latest", help_text="A tag name for the new image being built."
)
artifacts = serializers.JSONField(
build_context = RepositoryVersionRelatedField(
required=False,
help_text="A JSON string where each key is an artifact href and the value is it's "
"relative path (name) inside the /pulp_working_directory of the build container "
"executing the Containerfile.",
help_text=_("RepositoryVersion to be used as the build context for container images."),
allow_null=True,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want this to be a FileRepositoryVersion, right? Can we do something about the queryset here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hum... I tried to find a FileRepositoryVersion model, but I couldn't.
I also tried to pass a FileRepository qs, but it didn't work:

    build_context = RepositoryVersionRelatedField(
        required=False,
        help_text=_("RepositoryVersion to be used as the build context for container images."),
        allow_null=True,
        queryset = FileRepository.objects.all()
    )


django.core.exceptions.FieldError: Cannot resolve keyword 'number' into field. Choices are: alternatecontentsourcepath, autopublish, content, core_pulp_exporter, description, distributions, group_roles, manifest, name, next_version, pulp_created, pulp_domain, pulp_domain_id, pulp_id, pulp_labels, pulp_last_updated, pulp_type, pulpimporterrepository, remote, remote_id, repository_ptr, repository_ptr_id, repositorycontent, retain_repo_versions, uploads, user_hidden, user_roles, versions

I also tried to "adapt" the qs like (to do a reverse lookup of the repository version from repository):

        queryset = FileRepository.objects.all().repositoryversion_set,

but django complained:

AttributeError: 'QuerySet' object has no attribute 'repositoryversion_set'

Am I doing something wrong?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's not possible. But maybe again RepositoryVersion.objects.filter(repository__pulp_type="file")?

queryset=RepositoryVersion.objects.filter(repository__pulp_type="file.file"),
)

def __init__(self, *args, **kwargs):
"""Initializer for OCIBuildImageSerializer."""
super().__init__(*args, **kwargs)
self.fields["containerfile_artifact"].required = False

def validate(self, data):
"""Validates that all the fields make sense."""
data = super().validate(data)

if "containerfile" in data:
if "containerfile_artifact" in data:
raise serializers.ValidationError(
_("Only one of 'containerfile' and 'containerfile_artifact' may be specified.")
)
data["containerfile_artifact"] = Artifact.init_and_validate(data.pop("containerfile"))
elif "containerfile_artifact" in data:
data["containerfile_artifact"].touch()
else:
if bool(data.get("containerfile", None)) == bool(data.get("containerfile_name", None)):
raise serializers.ValidationError(
_("Exactly one of 'containerfile' or 'containerfile_name' must be specified.")
)

if "containerfile_name" in data and "build_context" not in data:
raise serializers.ValidationError(
_("'containerfile' or 'containerfile_artifact' must " "be specified.")
_("A 'build_context' must be specified when 'containerfile_name' is present.")
)
artifacts = {}
if "artifacts" in data:
for url, relative_path in data["artifacts"].items():
if os.path.isabs(relative_path):

if self._is_permission_function_request():
# the "has_repo_or_repo_ver_param_model_or_obj_perms" permission condition function
# expects a "repo" or "repository_version" arguments, so we need to pass "build_context"
# as "repository_version" to be able to validate the permissions
if data.get("build_context", None):
data["repository_version"] = data["build_context"]

# return early, we don't need the remaining validations to check the permissions
return data

if containerfile := data.get("containerfile", None):
temp_file = PulpTemporaryFile.objects.update_or_create(file=containerfile)[0]
data["containerfile_tempfile_pk"] = str(temp_file.pk)

if build_context := data.get("build_context", None):
data["content_artifact_pks"] = []
repository_version = RepositoryVersion.objects.get(pk=build_context.pk)
content_artifacts = ContentArtifact.objects.filter(
content__pulp_type="file.file", content__in=repository_version.content
).order_by("-content__pulp_created")
for content_artifact in content_artifacts.select_related("artifact").iterator():
if not content_artifact.artifact:
raise serializers.ValidationError(
_("Relative path cannot start with '/'. " "{0}").format(relative_path)
"It is not possible to use File content synced with on-demand "
"policy without pulling the data first."
)
containerfile_name = data.get("containerfile_name", None)
if containerfile_name and content_artifact.relative_path == containerfile_name:
artifact = Artifact.objects.get(pk=content_artifact.artifact.pk)
data["containerfile_artifact_pk"] = str(artifact.pk)

data["content_artifact_pks"].append(content_artifact.pk)

if data.get("containerfile_name") and not data.get("containerfile_artifact_pk", None):
raise serializers.ValidationError(
_(
'Could not find the Containerfile "'
+ data["containerfile_name"]
+ '" in the build_context provided'
)
artifactfield = RelatedField(
view_name="artifacts-detail",
queryset=Artifact.objects.all(),
source="*",
initial=url,
)
try:
artifact = artifactfield.run_validation(data=url)
artifact.touch()
artifacts[str(artifact.pk)] = relative_path
except serializers.ValidationError as e:
# Append the URL of missing Artifact to the error message
e.detail[0] = "%s %s" % (e.detail[0], url)
raise e
data["artifacts"] = artifacts

data["build_context_repo"] = repository_version.repository

return data

def _is_permission_function_request(self):
import inspect

is_perm_function_request = [
i
for i in inspect.stack()
if "has_repo_or_repo_ver_param_model_or_obj_perms" in i.function
]
return is_perm_function_request

class Meta:
fields = (
"containerfile_artifact",
"containerfile_name",
"containerfile",
"repository",
"tag",
"artifacts",
"build_context",
)


Expand Down
64 changes: 42 additions & 22 deletions pulp_container/app/tasks/builder.py
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,12 @@
)
from pulp_container.constants import MEDIA_TYPE
from pulp_container.app.utils import calculate_digest
from pulpcore.plugin.models import Artifact, ContentArtifact, Content
from pulpcore.plugin.models import (
Artifact,
ContentArtifact,
Content,
PulpTemporaryFile,
)


def get_or_create_blob(layer_json, manifest, path):
Expand Down Expand Up @@ -43,7 +48,7 @@ def get_or_create_blob(layer_json, manifest, path):
artifact=layer_artifact, content=blob, relative_path=layer_json["digest"]
).save()
if layer_json["mediaType"] != MEDIA_TYPE.CONFIG_BLOB_OCI:
BlobManifest(manifest=manifest, manifest_blob=blob).save()
BlobManifest.objects.get_or_create(manifest=manifest, manifest_blob=blob)
return blob


Expand All @@ -68,15 +73,13 @@ def add_image_from_directory_to_repository(path, repository, tag):
manifest_digest = calculate_digest(bytes_data)
manifest_text_data = bytes_data.decode("utf-8")

manifest = Manifest(
manifest, _ = Manifest.objects.get_or_create(
digest=manifest_digest,
schema_version=2,
media_type=MEDIA_TYPE.MANIFEST_OCI,
data=manifest_text_data,
)
manifest.save()
tag = Tag(name=tag, tagged_manifest=manifest)
tag.save()
tag, _ = Tag.objects.get_or_create(name=tag, tagged_manifest=manifest)

with repository.new_version() as new_repo_version:
manifest_json = json.loads(manifest_text_data)
Expand All @@ -96,7 +99,11 @@ def add_image_from_directory_to_repository(path, repository, tag):


def build_image_from_containerfile(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may have an issue here.
You know tasks can be older than your current installation. We never required draining the task queue as a prerequisite for upgrading the codebase. So i wonder if we need to keep the artifacts parameter around. That however raises the next question, whether we still need to handle it properly or whether we use it to make the task fail with a good error message.
Since we are changing it almost completely, i guess the feature "image building" is in tech-preview. So the reach of this question is kind of limited.
@lubosmj your call.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion, the safest thing to do would be to leave the old API untouched (including the ViewSet action), introduce a new one, and then deprecate the old one after 2 releases. Otherwise, we would leave artifacts in place forever. Since this functionality is flagged as "tech-preview" and we completely overhauled the API, I call this concern unimportant.

Personally, I would rather implement a safety feature in pulpcore that will detect potential changes of the function signature after unpickling the function arguments: https://github.com/pulp/pulpcore/blob/e2f03715de68736b78942e5fa3aca98de2c318f7/pulpcore/tasking/tasks.py#L70-L75.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it's kind of safe, since an old task would just fail to start up on a new codebase. And we'd be like: "Hey yeah, tech preview. Try starting again." 🤷 .
More dangerous are changes, where the task didn't change the signature, but interprets it's parameters in a different way.

containerfile_pk=None, artifacts=None, repository_pk=None, tag=None
containerfile_artifact_pk=None,
containerfile_tempfile_pk=None,
content_artifact_pks=None,
repository_pk=None,
tag=None,
):
"""
Builds an OCI container image from a Containerfile.
Expand All @@ -105,10 +112,11 @@ def build_image_from_containerfile(
values. The Containerfile can make use of these files during build process.

Args:
containerfile_pk (str): The pk of an Artifact that contains the Containerfile
artifacts (dict): A dictionary where each key is an artifact PK and the value is it's
relative path (name) inside the /pulp_working_directory of the build
container executing the Containerfile.
containerfile_artifact_pk (str): The pk of an Artifact that contains the Containerfile
containerfile_tempfile_pk (str): The pk of a PulpTemporaryFile that contains
the Containerfile
content_artifact_pk (list): The list of pks of ContentArtifacts used in the build context
of the Containerfile
repository_pk (str): The pk of a Repository to add the OCI container image
tag (str): Tag name for the new image in the repository

Expand All @@ -117,26 +125,29 @@ def build_image_from_containerfile(
image and tag.

"""
containerfile = Artifact.objects.get(pk=containerfile_pk)
if containerfile_tempfile_pk:
containerfile = PulpTemporaryFile.objects.get(pk=containerfile_tempfile_pk)
else:
containerfile = Artifact.objects.get(pk=containerfile_artifact_pk)

repository = ContainerRepository.objects.get(pk=repository_pk)
name = str(uuid4())
with tempfile.TemporaryDirectory(dir=".") as working_directory:
working_directory = os.path.abspath(working_directory)
context_path = os.path.join(working_directory, "context")
os.makedirs(context_path, exist_ok=True)
for key, val in artifacts.items():
artifact = Artifact.objects.get(pk=key)
dest_path = os.path.join(context_path, val)
dirs = os.path.split(dest_path)[0]
if dirs:
os.makedirs(dirs, exist_ok=True)
with open(dest_path, "wb") as dest:
shutil.copyfileobj(artifact.file, dest)

containerfile_path = os.path.join(working_directory, "Containerfile")

containerfile_path = os.path.join(working_directory, "Containerfile")
with open(containerfile_path, "wb") as dest:
shutil.copyfileobj(containerfile.file, dest)

if content_artifact_pks:
content_artifacts = ContentArtifact.objects.filter(pk__in=content_artifact_pks)
for content_artifact in content_artifacts.select_related("artifact").iterator():
_copy_file_from_artifact(
context_path, content_artifact.relative_path, content_artifact.artifact.file
)

bud_cp = subprocess.run(
[
"podman",
Expand Down Expand Up @@ -166,3 +177,12 @@ def build_image_from_containerfile(
repository_version = add_image_from_directory_to_repository(image_dir, repository, tag)

return repository_version


def _copy_file_from_artifact(context_path, relative_path, artifact):
dest_path = os.path.join(context_path, relative_path)
dirs = os.path.dirname(dest_path)
if dirs:
os.makedirs(dirs, exist_ok=True)
with open(dest_path, "wb") as dest:
shutil.copyfileobj(artifact.file, dest)
Loading