Docker buildimage intermittently builds image #1243

Closed
richard-avelar opened this issue Jul 16, 2019 · 6 comments · Fixed by #1291
Labels
integrations Related to integrations with other services

Comments

@richard-avelar

Hello, I am playing around with Prefect and trying to use the BuildImage task for Docker:

from prefect import task, Flow
from prefect.tasks.docker import images


@task
def build_image():
    images.BuildImage(path='.').run()


with Flow("ETL") as flow:
    e = build_image()

flow.run()
[2019-07-16 19:36:37,341] INFO - prefect.FlowRunner | Beginning Flow run for 'ETL'
[2019-07-16 19:36:37,341] INFO - prefect.FlowRunner | Starting flow run.
[2019-07-16 19:36:37,342] INFO - prefect.TaskRunner | Task 'build_image': Starting task run...
[2019-07-16 19:36:41,274] INFO - prefect.TaskRunner | Task 'build_image': finished task run for task with final state: 'Success'
[2019-07-16 19:36:41,275] INFO - prefect.FlowRunner | Flow run SUCCESS: all reference tasks succeeded

I noticed that this only builds an image intermittently, about once every 4-5 times I run it.
Does anyone have an idea what is going on?

@cicdw
Member

cicdw commented Jul 17, 2019

Hi @richard-avelar !

It's possible this is related to Docker caching the images, especially if you haven't changed your Dockerfile at all in between runs. To test this, could you try passing the nocache keyword to BuildImage?

from prefect import task, Flow
from prefect.tasks.docker import images


build_image = images.BuildImage(path='.', nocache=True)


with Flow("ETL") as flow:
    e = build_image()

flow.run()

Also, just an FYI: your current implementation creates a Task which calls another task's run method, which could lead to unexpected issues for certain tasks. I'd recommend using the pattern I've included above, where you instead instantiate the BuildImage task as build_image directly, without needing to use the @task decorator or anything else.

@richard-avelar
Author

richard-avelar commented Jul 17, 2019

Hey @cicdw, thanks for the reply!
Wouldn't nocache only come into play after the image was created at least once and stored in my local registry? Sorry for the misunderstanding in the original post, I delete images before each run as well.
In any case, I tried to make the changes you mentioned to no avail unfortunately.
Here is the python file

from prefect import Flow
from prefect.tasks.docker import images


build_image = images.BuildImage(path='.', nocache=True)


with Flow("ETL") as flow:
    e = build_image()

flow.run()

and the output displayed

~ python3 prefect.py
[2019-07-17 02:20:19,433] INFO - prefect.FlowRunner | Beginning Flow run for 'ETL'
[2019-07-17 02:20:19,433] INFO - prefect.FlowRunner | Starting flow run.
[2019-07-17 02:20:19,434] INFO - prefect.TaskRunner | Task 'BuildImage': Starting task run...
[2019-07-17 02:20:22,891] INFO - prefect.TaskRunner | Task 'BuildImage': finished task run for task with final state: 'Success'
[2019-07-17 02:20:22,892] INFO - prefect.FlowRunner | Flow run SUCCESS: all reference tasks succeeded
~ docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
~ ls
Dockerfile main.py prefect.py

@cicdw
Member

cicdw commented Jul 17, 2019

Oooo I'm sorry, yea - so your image isn't even being built! So, if we strip out the Prefect logic, the BuildImage task only runs this code:

import docker


client = docker.APIClient(base_url="unix:///var/run/docker.sock", version="auto")
client.build(path=".", tag=None, nocache=False, rm=True, forcerm=False)

So I think we want to understand why this doesn't do what we think it should. Maybe try:

response = [line for line in client.build(path=".", tag=None, nocache=False, rm=True, forcerm=False)]
print(response)

and we can see if there's anything informative in that output.
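
To make that output easier to scan, here is a minimal sketch of a helper that separates ordinary build steps from errors. It assumes client.build is called with decode=True so each line arrives as a dict, and that the dicts carry the "stream" and "error" keys that Docker build output normally uses. summarize_build_output is a hypothetical name, not part of docker-py or Prefect, and the sample data is made up for illustration.

```python
def summarize_build_output(lines):
    """Split decoded build output into step messages and error messages."""
    steps, errors = [], []
    for line in lines:
        if "error" in line:
            # Build failures are reported under the "error" key.
            errors.append(line["error"])
        elif "stream" in line:
            # Regular progress text; skip lines that are only whitespace.
            text = line["stream"].strip()
            if text:
                steps.append(text)
    return steps, errors


# Hypothetical sample data standing in for real build output.
sample = [
    {"stream": "Step 1/2 : FROM python:3.7\n"},
    {"stream": "\n"},
    {"error": "pull access denied"},
]
steps, errors = summarize_build_output(sample)
```

With a real client this could be called as summarize_build_output(client.build(path=".", decode=True)), which also consumes the whole output stream as a side effect.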

@richard-avelar
Author

richard-avelar commented Jul 17, 2019

Interesting, that seems to work 😅
The file

import docker


client = docker.APIClient(base_url="unix:///var/run/docker.sock", version="auto")
client.build(path=".", tag=None, nocache=False, rm=True, forcerm=False)
response = [line for line in client.build(path=".", tag=None, nocache=False, rm=True, forcerm=False)]
print(response)

The output is a very long list of Docker build steps, progress messages, etc., and the resulting images show up in my docker images list.

Hmm, the Docker-specific code seems to work then, but I just can't get it to work in Prefect for some reason. The Docker-specific code took about a full 2 minutes to finish running (which is typical for building this image), but the Prefect code completes in a couple of seconds. That also seemed off to me, as if there was never a serious attempt to build, or the build attempt failed silently and exited.

@cicdw
Member

cicdw commented Jul 17, 2019

Great!

So I have a guess here -> it's possible that client.build is an asynchronous / deferred call of some kind, and whenever the Prefect Task exits it prevents the build from actually occurring. What we should do is update the Prefect Task to actually iterate over the client.build generator to ensure it completes before returning from the Task.

In the meantime, I recommend you implement a custom Task like so:

import docker
from prefect import task


@task
def build_image(path):
    client = docker.APIClient(base_url="unix:///var/run/docker.sock", version="auto")
    # Iterate over the client.build generator so the build runs to completion
    # before the task returns.
    response = [line for line in client.build(path=path, tag=None, nocache=False, rm=True, forcerm=False)]
    return

And I'd bet this works. Looks like I need to audit our Docker Tasks for this behavior!
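
The deferred-execution guess above can be illustrated without Docker. The sketch below shows only the general Python behavior that a generator function does no work until it is iterated. fake_build is a hypothetical stand-in and does not reproduce docker-py's internals, where the details of when a build starts may differ.

```python
def fake_build(log):
    """Hypothetical stand-in for a streaming build call (not docker-py)."""
    for step in ("pull base image", "run steps", "tag image"):
        log.append(step)  # side effect happens only while iterating
        yield step


log = []
fake_build(log)        # generator object created but never consumed
untouched = list(log)  # snapshot: nothing has run yet

# Iterating over the generator is what drives the work to completion.
consumed = [line for line in fake_build(log)]
```

This is why consuming the output in a list comprehension, as in the custom task above, is a reasonable way to make sure the work finishes before the task returns.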

@richard-avelar
Author

Sounds good to me!
Thanks for the prompt feedback!

@joshmeek joshmeek assigned joshmeek and unassigned joshmeek Jul 20, 2019
@joshmeek joshmeek added the integrations Related to integrations with other services label Jul 20, 2019
@cicdw cicdw added this to the v0.6.1 milestone Jul 28, 2019
@cicdw cicdw self-assigned this Jul 29, 2019
abrookins pushed a commit that referenced this issue Mar 15, 2022
Touchups for agent work queue by name