no active session for <id>: context deadline exceeded #456
Comments
My first hunch is that some context is initialised with a short deadline, and that deadline expires because the build takes so long. The error is very ambiguous, though: buildx/vendor/github.com/moby/buildkit/session/manager.go, lines 176 to 178 in fb7b670.
Also: buildx/vendor/github.com/moby/buildkit/util/progress/progressui/printer.go, lines 151 to 158 in fb7b670.
All I can tell is that this came from here: buildx/util/progress/printer.go Lines 47 to 53 in fb7b670
I am not entirely sure where to go from here, this is seems like some kind of generic processing queue thing and this can be coming from just about anywhere in either buildkit or buildx code... |
Not much to add, but I'm seeing the same behavior here today with a similar setup: building multiarch across three nodes (amd64 + aarch64 + armv7l). The armv7l build took close to eight hours and then hit this. I'm using plain TCP socket contexts to the two ARM nodes and issuing the build on the amd64 node (UNIX socket). |
Was that export supposed to end with a push? What version of buildkit? |
@tonistiigi yes, in my case it was. |
In my case everything was built in the same docker instance, with qemu for some of the Arm stages. |
@tonistiigi ping! |
It looks like maybe the session connection just dropped because your build took almost 5h. You should see that from the daemon logs. If that is the case then maybe we could add logic to redial. Although not really different from build request connection itself dropping or if this happens at the same time session is being used it would still fail. |
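For anyone unsure how to check the daemon logs mentioned above: on a systemd-based host, something like the following sketch works (the unit name, time window, and grep patterns are assumptions, not from this thread):

```shell
#!/usr/bin/env bash
# Sketch: look for evidence of a dropped session in the Docker daemon logs
# around the time the build failed. Unit name and window are assumptions.
check_daemon_logs() {
  if command -v journalctl >/dev/null 2>&1; then
    journalctl -u docker.service --since "6 hours ago" 2>/dev/null \
      | grep -iE 'session|grpc|deadline' || echo "no matching log lines"
  else
    echo "journalctl not available; check /var/log for the daemon logs"
  fi
}

check_daemon_logs
```

On non-systemd hosts the daemon log location depends on how dockerd was started.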
Would you mind pointing me to where the redial logic would have to be implemented? Also, how can we improve the error message? I'm happy to have a go at fixing this one, just need some pointers. I'm pretty sure it's some sort of a drop. |
The session request happens in https://github.com/moby/buildkit/blob/master/client/solve.go#L165, but first you should figure out whether it is dropping and whether there is any error message or condition that causes it. |
The same happens to me, very frequently. In some cases it is almost impossible to push images to Docker Hub. In my case I'm cross-building on arm to other arm versions or amd64.
|
We are getting the same error while pushing to GitHub Container Registry. We thought this was related to the 5 GB limit on a single image layer, but we were able to push images with layer sizes >5 GB.
The following log is copied from an image with layer sizes above 5 GB. According to this log, pushing the manifest data fails in the example above.
|
moby/buildkit#2019 is potentially related, as it improves how the underlying error-handling machinery is implemented. |
For anyone else running into this issue as part of their CI pipeline, we've decided to downgrade to legacy builds instead because reliability is more important than speed. |
This also happens with short build times, in this case less than 10 minutes. Often the same build passes with a build time of about 12 minutes. I'd say it fails in about 50% of the jobs.
|
It looks like it is related to concurrent builds on the same buildkit instance.
|
The same problem. |
Does anybody know a workaround for this issue? |
@Oliveirakun Add a retry mechanism:

    export DOCKER_BUILDKIT=1
    for i in $(seq 1 3); do
      if docker buildx build --platform linux/amd64,linux/arm64 -t test:dev . --push; then
        break
      fi
      sleep 3
      if [[ "${i}" == "3" ]]; then
        echo "[Error]: build error"
        exit 1
      fi
    done |
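The workaround above can be given a more reusable shape as a small retry helper; the attempt count and 3-second delay are arbitrary choices, not anything buildx itself requires:

```shell
#!/usr/bin/env bash
# retry N CMD... : run CMD up to N times, sleeping 3s between attempts.
retry() {
  local attempts=$1; shift
  local i
  for ((i = 1; i <= attempts; i++)); do
    "$@" && return 0                 # success: stop retrying
    (( i < attempts )) && sleep 3    # wait before the next attempt
  done
  echo "[Error]: command failed after ${attempts} attempts" >&2
  return 1
}
```

Usage would then be, e.g., `retry 3 docker buildx build --platform linux/amd64,linux/arm64 -t test:dev . --push`.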
I'd like to see daemon debug logs for this case. Possible cases are
|
Same problem :( |
Same problem here... has anybody got a solution or hack for this? Alternatively, can we store multiarch images locally in our local image store and then try to push them? |
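On storing the image locally first: buildx can write the result to a local OCI archive instead of pushing, and a tool such as skopeo can push the archive afterwards, making the push step retryable on its own. A sketch (the registry reference is made up, and I haven't verified this end to end):

```shell
#!/usr/bin/env bash
# Sketch: split "build" and "push" into two independent steps.
# registry.example.com/myapp is a made-up image reference.

# Export the multi-arch build to a local OCI tarball instead of pushing.
build_to_archive() {
  docker buildx build --platform linux/amd64,linux/arm64 \
    --output type=oci,dest=image.tar .
}

# Push all platforms from the archive with skopeo in a separate step.
push_archive() {
  skopeo copy --all oci-archive:image.tar \
    docker://registry.example.com/myapp:latest
}

echo "run build_to_archive, then push_archive"
```

If the push fails, only `push_archive` needs to be rerun; the build result is already on disk.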
Hi everyone! I added a step before the "Build and push" step.
or
It works for me. |
I stopped seeing the problem after updating to buildkit moby/buildkit:v0.9.0-rootless and increasing the cache size from the default to 50 GB (--oci-worker-gc-keepstorage=50000). |
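For those driving buildkit through buildx rather than running buildkitd directly, the same gc-keepstorage setting can, as far as I can tell, be passed through at builder creation via `--buildkitd-flags`; the builder name below is arbitrary:

```shell
#!/usr/bin/env bash
# Sketch: create a docker-container builder whose buildkitd keeps up to
# 50 GB of cache. "bigcache" is an arbitrary builder name.
create_builder() {
  docker buildx create --name bigcache \
    --driver docker-container \
    --buildkitd-flags '--oci-worker-gc-keepstorage=50000' \
    --use
}

echo "run create_builder on a host with a working docker daemon"
```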
Still have it on buildkit moby/buildkit:v0.9.0-rootless (Azure Kubernetes Service, Azure Container Service). In my case it was related to 2 concurrent cache exports: 2021-08-24T12:19:53.5930996Z #26 exporting cache
|
Still have it on buildkit moby/buildkit:v0.9.0-rootless (Azure Kubernetes Service, Azure Container Service). In my case it was related to 2 concurrent cache exports:
Maybe you can format the reply so that others can read it more easily. |
moby/buildkit#2369 was merged. Hopefully this is fixed now. |
@ekaterinadimitrova2 are you running the master buildkit build with the patch? |
I can confirm that this is still happening with the master buildkit build. All local, using qemu for the emulated Arm64 bits. |
I've been consistently running into this with Docker for Mac 4.4.2 on my M1 MacBook Air, specifically when building a Dockerfile with a COPY instruction. Downgrading to 4.3.2 seems to have fixed the problem. |
Do you use 'docker buildx create'? |
I tried it both ways with the same result, but when I did use buildx create, the command was |
I've been trying to figure out why I get this dreaded error when doing a few concurrent
Btw, I don't get the error when DOCKER_HOST is blank; I only get it when using DOCKER_HOST=ssh://docker.example.com. I'm on that server, docker.example.com (passwordless ssh is configured and working). Anyway, I saw this closed bug and figured it's been fixed for so long that I must have the fix, but my docker 20.10.14 on ubuntu 20 still uses |
@jamshid does the error occur right away in your case or as a timeout? This was originally reported as a timeout case, if you are seeing it right away it might be a different bug altogether. |
@errordeveloper thanks, yes, it's pretty immediate. Where are the useful logs? I haven't had much luck narrowing it down to something easily reproducible, but I can file a bug with the logs. I'd still like to upgrade buildx to see if this is already fixed. |
Are you using the official Docker package for Ubuntu? I'd recommend trying the latest official packages; it's probably a good idea to also remove |
I have to admit, what I said above is just a general point about upgrading, but having looked more specifically at the details, I can see that you don't have to upgrade. Firstly, buildx v0.8.1 is only two weeks old and one patch release behind the latest (v0.8.2). You should try this to get an instance of buildkitd running in a container:
Having done that, you should get the latest stable version out of the box. |
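The suggestion above presumably refers to the docker-container driver, which runs buildkitd from the stable moby/buildkit image instead of using the buildkit embedded in dockerd. Something like the following sketch (the builder name is arbitrary):

```shell
#!/usr/bin/env bash
# Sketch: run buildkitd in a container so the buildkit version is
# independent of the installed dockerd. "mybuilder" is an arbitrary name.
create_container_builder() {
  docker buildx create --name mybuilder \
    --driver docker-container --use \
  && docker buildx inspect --bootstrap  # starts the container, prints status
}

echo "run create_container_builder on a host with a working docker daemon"
```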
I am using the latest official Ubuntu Docker packages. OK, I will file a new bug about the SSH transport. Sorry, my confusion about buildx versions was that I was looking at the buildkit releases under https://github.com/moby/buildkit/releases. That versioning is apparently unrelated to the buildx CLI plugin versioning (https://github.com/docker/buildx). |
Getting this error for GitHub Action - has anyone encountered anything similar? |
I was getting this error in GitHub Actions. Solved by bumping action versions:
UPD: still getting the error sometimes 😔 UPD2: solved by running on a machine with more memory. |
I confirmed that this is still an issue in version 4.9.1, this time testing on x86 macOS. As before, downgrading to 4.3.2 is an effective workaround. |
I was seeing this error crop up but only when using |
I’m seeing a buildx error like this:
Things to note: