getcwd error on volume mount #1509

Closed
ccakes opened this issue Apr 5, 2017 · 65 comments

@ccakes

ccakes commented Apr 5, 2017

Expected behavior

Able to mount local directories (specifically, pwd) to containers and use them without issue

Actual behavior

Occasionally the container seems to lose access to the mountpoint. The shell doesn't have timestamps but this log is over a period of ~2mins

$ docker run -it --rm --link postgres --link redis --network test_net -v $(pwd):/app -w /app perl-dev
/app #
/app #
ash: getcwd: No such file or directory
(unknown) #
ash: getcwd: No such file or directory
(unknown) #
ash: getcwd: No such file or directory
(unknown) # cd /
/ # cd /app
ash: getcwd: No such file or directory
(unknown) #
ash: getcwd: No such file or directory
(unknown) #
ash: getcwd: No such file or directory
(unknown) #
ash: getcwd: No such file or directory
(unknown) #
/app #
/app #
/app # %

For reference, the Dockerfile used to create the perl-dev container is as follows

FROM alpine:edge

# Basic dependencies
RUN apk -U add --no-cache tzdata git curl wget perl perl-dev libressl libressl-dev make gcc libc-dev zlib-dev

# Extras
RUN apk add --no-cache postgresql-dev unzip

# Tidy up
RUN rm /var/cache/apk/*

RUN curl -L https://cpanmin.us/ -o /usr/local/bin/cpanm && \
    chmod +x /usr/local/bin/cpanm
RUN cpanm -i Carton Minilla Version::Next CPAN::Uploader

CMD ash

Information

  • Diagnostic ID is 976E2A45-9CDF-4CCD-BA75-AB1B0EAE7179
  • This has only started happening since the latest update (17.03.1-ce-mac5). I've been using Docker for Mac for months prior and not seen this error
  • I'm not sure what version I was running previously but I upgrade whenever the prompter suggests, so presumably I came from -mac4?

Forum thread for reference

Steps to reproduce the behavior

I haven't been able to reproduce this reliably, but it happens often enough to be frustrating. I'll keep poking at it on my end to see if I can make it break reliably; for me it happens just by creating a new container and working for a while. As in the example above, even sitting idle at the shell will eventually trigger it.

@jeanlaurent
Member

@ccakes thanks for the detailed report. We're checking whether we can reproduce your issue on our end and will let you know here.

@clintonium-119

I'm also seeing this issue on 17.03.1-ce-mac12

Seems transient. I've only recently started using Docker for Mac, so I don't know if downgrading would help (though it seems that is rather difficult: #1120)

@jmbowman

I and several of my coworkers have been hitting this pretty often on 17.06.0-ce-mac18 (and the previous version). In our case, calls to Python's os.path.abspath() fail with OSError: [Errno 2] No such file or directory. (That method calls os.getcwdu() internally, which is basically a pwd call.) But if we put that call in a loop, sleeping for a second and retrying each time, it suddenly starts working after 30 seconds or so. The loop isn't starting until the container has already been running for 15-30 seconds, so it seems like it's sometimes taking about a minute after container startup for pwd to start working in mounted volumes.
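
The retry loop described above could look roughly like the following minimal sketch. It assumes Python 3, where the failure surfaces as FileNotFoundError (errno 2) from os.getcwd() rather than as OSError from os.getcwdu(). The name wait_for_cwd and its parameters are hypothetical and are not taken from the edX code.

```python
import os
import time


def wait_for_cwd(attempts=60, delay=1.0, getcwd=os.getcwd, sleep=time.sleep):
    """Call getcwd() until it succeeds, sleeping between failed attempts.

    getcwd and sleep are injectable only so the loop can be exercised
    without a broken mount; by default the real functions are used.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return getcwd()
        except FileNotFoundError as exc:  # errno 2, "No such file or directory"
            last_error = exc
            sleep(delay)
    # Every attempt failed: re-raise the most recent error.
    raise last_error
```

This only papers over the symptom: it waits until the working directory is resolvable again and does nothing about why it was not.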

@gsong

gsong commented Jul 17, 2017

FYI, this is still happening in:

Version 17.06.0-ce-mac19 (18663)
Channel: stable
c98c1c25e0

@3axap4eHko

I can still reproduce that on my Mac, any updates?

@dsheets
Contributor

dsheets commented Aug 16, 2017

@3axap4eHko do you have a reliable way to reproduce it? I've tried the edX repro steps and couldn't get it to happen.

@3axap4eHko

3axap4eHko commented Aug 16, 2017

@dsheets I just build my container and run it with docker-compose, and the error appears randomly after each rebuild.

Ivans-MacBook-Pro:bundler ivan$ docker exec -it 6002f4c002e2 sh
sh: getcwd: No such file or directory
(unknown) #

But after a rebuild without any changes, it works:

Ivans-MacBook-Pro:bundler ivan$ docker exec -it 6002f4c002e2 sh
/appdir/webapps/bundler #

UPD: Just figured out it can start working even after restarting the containers.
UPD2: It works one directory level up, just with cd ../

@clintonium-119

I've stopped seeing this issue since I started mounting volumes with :cached as per https://docs.docker.com/compose/compose-file/#caching-options-for-volume-mounts-docker-for-mac

@dsheets
Contributor

dsheets commented Aug 17, 2017

@3axap4eHko could you share your reproduction case? It's very hard to track this down without a reliable way to reproduce it locally.

@dsheets
Contributor

dsheets commented Aug 21, 2017

It's occurred to me that you may be seeing this behavior if you (or software you run) remove the working directory of the shell. Even recreating the directory will not allow the shell to recover, which can cause this strange situation where the pwd of the shell exists but the shell's actual working directory does not. I believe this is the case regardless of file system or configuration (i.e. it does not require Docker for Mac). If you're having this difficulty with a shell, you can try cd `pwd`. If that fixes the issue, it's almost certainly due to a behind-the-scenes removal and recreation. If you're seeing this issue with other software, it may be due to removal/recreation and a race between different processes/threads.

It would still be fantastic to have a consistent reproduction for the issue as it may not be due to the above root cause. If you're interested in debugging it in situ, I recommend looking for places where directory removal or recursive removal occurs. I can imagine scenarios like make clean all where this could happen consistently and sometimes development containers could be unaware.
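
The removal-and-recreation scenario can be demonstrated without Docker. The following is a minimal sketch, assuming Python 3 on Linux; the function name demo_removed_cwd is hypothetical, and behaviour on other operating systems or file systems may differ.

```python
import os
import shutil
import tempfile


def demo_removed_cwd():
    """Remove and recreate the current directory, then try to recover.

    Returns (stale_errno, recovered): stale_errno is the errno raised by
    getcwd() while the process still sits in the removed directory (or
    None if getcwd() unexpectedly succeeded); recovered is True if
    changing into the recreated path makes getcwd() work again.
    """
    original = os.getcwd()
    base = tempfile.mkdtemp()
    work = os.path.join(base, "work")
    os.mkdir(work)
    os.chdir(work)
    try:
        shutil.rmtree(work)  # the directory we are standing in disappears
        os.mkdir(work)       # same path, but a brand-new directory
        try:
            os.getcwd()
            stale_errno = None
        except OSError as exc:
            stale_errno = exc.errno  # expected to be errno.ENOENT on Linux
        os.chdir(work)       # the equivalent of: cd `pwd`
        recovered = os.path.realpath(os.getcwd()) == os.path.realpath(work)
        return stale_errno, recovered
    finally:
        os.chdir(original)
        shutil.rmtree(base, ignore_errors=True)
```

If getcwd() fails in the stale directory and works again after the chdir, the symptoms match the removal/recreation explanation rather than a problem specific to the mount.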

@axot

axot commented Aug 31, 2017

Related to issue #2019

@gerhat

gerhat commented Sep 1, 2017

Issue appearing also in:

Version 17.06.1-ce-mac24 (18950)
Channel: stable
54dc09c3e3

The actual errors printed are:

sh: 0: getcwd() failed: No such file or directory
shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory

After docker restart <container> it works fine.

Cached volume mount didn't fix the issue. I am now in the process of testing the delegated option https://docs.docker.com/docker-for-mac/osxfs-caching/#delegated.

Update: delegated didn't work either

@tinder-jlee

I've been having the same issue all the time. I tried both cached and delegated, and neither worked.

@clintonium-119

I can confirm that cached and delegated did not actually fix it for me. I went quite a while without it occurring after making those changes, but it just happened again this morning, and I needed to restart macOS as usual to get it working again.

@astanciu

I'm also seeing this issue, fwiw

@mrsmuneton

mrsmuneton commented Sep 20, 2017

I also have this issue on my Mac; quitting and restarting Docker fixes it temporarily. I am running multiple containers, and when it occurs in any container, there is a chance that the next command I execute in any other container throws the same error:

(unknown) # ls
Dockerfile          config              lib  
sh: getcwd: No such file or directory

Docker Community Edition
Version 17.06.2-ce-mac27 (19124)
Channel: stable
428bd6ceae

@pvledoux

pvledoux commented Oct 10, 2017

Had the same issue with a redis container; restarting the container solved the problem.

Docker version 17.09.0-ce, build afdb6d4

@elmarbeckmann

I'm having this issue with Version 17.12.0-ce-mac47 (21805).
Any updates?

@orotemo

orotemo commented Jan 17, 2018

Via docker attach everything works fine. It is when you use docker exec that the problem manifests.

@raunofreiberg

Ditto, getting the same issue here as well @ Version 17.09.0-ce-mac35.

Can confirm that restarting the container helps, but is cumbersome.

@ghost

ghost commented Oct 23, 2019

My issue turned out to be that I was using docker-compose with multiple project names and then when I stopped specifying the project name and used the defaults, it looked like it was using the wrong image and the permission checks got screwed up in the volumes.

I deleted all images and rebuilt and it worked. I assume that once you start using the COMPOSE_PROJECT_NAME environment variable you have to keep using it or you would run into this issue.

@viktorianer

viktorianer commented Dec 13, 2019

I was facing the same issue and could fix it for now.

Setup
In my Rails app I use a multi-stage build, mainly for bundle:

1. Stage build

FROM ruby:2.4.1-slim AS build-bundle
...
# Copy the Gemfile as well as the Gemfile.lock and install the RubyGems. 
COPY Gemfile Gemfile.lock ./
RUN bundle install -j $(nproc) --retry 3 --without production

2. Stage copy it from stage 1

FROM ruby:2.4.1-slim
...
COPY --from=build-bundle /usr/local/bundle /usr/local/bundle

In my docker-compose.yml I add volume for bundle_data:

app:
  ...
  volumes:
      - .:/app:cached
      - rails_cache_data:/app/tmp/cache
      - bundle_data:/usr/local/bundle
...
volumes:
  rails_cache_data:
  bundle_data:

It runs perfectly until I change my Gemfile and add some gems, e.g. multipart-post-2.1.1.

docker-compose up app

This results in the following error message:

app       | Could not find multipart-post-2.1.1 in any of the sources
app       | Run `bundle install` to install missing gems.
app       | Could not find multipart-post-2.1.1 in any of the sources
app       | Run `bundle install` to install missing gems.
app exited with code 7

I try to start again...

docker-compose up app

... and now I get a different error message:

app       | /usr/local/bundle/gems/bundler-1.17.3/lib/bundler/vendor/thor/lib/thor.rb:486:in `class_eval': No such file or directory - getcwd (Errno::ENOENT)
...

How to fix:

docker-compose down
docker volume rm app_bundle_data
docker-compose up app

It works for me with:

Docker version 19.03.5, build 633a0ea
docker-compose version 1.24.1, build 4667896b
docker desktop for mac 2.1.0.5 (40693)

douglasnaphas added a commit to douglasnaphas/mljsapi that referenced this issue Jan 30, 2020
I was trying to work around a failure with a message about
Node/Express/body-parser that seems to actually have been environmental.
Docker seems to occasionally lose rights over a directory/file
(docker/for-mac#1509) on Mac.

@docker-robott
Collaborator

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale comment.
Stale issues will be closed after an additional 30d of inactivity.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so.

Send feedback to Docker Community Slack channels #docker-for-mac or #docker-for-windows.
/lifecycle stale

@douglasnaphas

/remove-lifecycle stale

@OmarIthawi

We have a resource-intensive project (Open edX Platform) with a Docker-based devstack that uses multiple containers and reuses volumes between containers. This getcwd issue happened only on Mac and often hindered our development experience.

The only workaround that we found is to use NFS with Docker on Mac: https://github.com/edx/devstack/pull/465

@thebaron

thebaron commented Apr 9, 2020

Hopefully this experience can add a bit more color to the issue as I am having a related problem, if not the same one.

A coworker recently added :delegated to a volume mount in a build script for a project we are working on, and this definitely triggered the inconsistency issue for me. Sometimes the whole build pipeline would work fine; sometimes it would fail on the last step. From a 50,000 ft view, the process looks like
$ docker build -v (code_root):/container/path:delegated ..... go build (blah) ./binary
and then
$ docker build -v (code_root):/container/path:delegated ..... go build (blah) ./binary2

Those work fine, or seem to as far as I can tell, and they definitely run faster, as expected, with :delegated applied. The last step in the pipeline is essentially a

$ docker build .

using a Dockerfile in (code_root) that takes the two container-built ELF binaries, deposited into the Mac filesystem by the previous build steps via the -v mount, and packages them into a final container for deployment.

It's that last step that fails to work consistently when :delegated is used in the first 2 steps. Sometimes it works; most times it doesn't. When it works, you see the whole docker-uploading-build-context output and everything is fine. When it fails, it gives the getcwd: can't find . thing and dies immediately.

If I remove the delegated semantics and go with default, it's slower, as expected, but works consistently each time.

It really feels like some sort of race condition in the shared volume support, where the writeback from the first 2 containers somehow confuses the last step. I tried, on a lark, adding a sync call to the end of each of the first 2 build containers to see whether forcing the writeback sooner would fix it, but it didn't.

Not sure if this is helpful or not, but at least it is reproducible.

@mghantous

mghantous commented May 18, 2020

I am also seeing the getcwd issue when using a bind mount volume (tried both cached and non-cached). I was seeing it occasionally on 2.2.0.5 but now I see it consistently since upgrading to 2.3.0.2. Containers seem to start up successfully the first time, but if I ctrl-c the container and try to bring it up again, I can recreate the getcwd issue. Restarting docker for mac clears up the issue, and so does doing docker-compose down on the container. Neither of these are very practical options, especially since I don't want to keep removing containers. I may try downgrading back to 2.2.0.5 where the issue was less frequent / more manageable. I've seen others mentioning a workaround of using NFS, but I would like to avoid that and use native docker for mac features if possible.

@asbjornu

I'm seeing this with the jekyll-plantuml image on macOS 10.14.6, Docker version 19.03.8, build afacb8b. In the container, WORKDIR and VOLUME point to the same path. Might that be the source of the problem, somehow?

@OmarIthawi

> I'm seeing this with the jekyll-plantuml image on macOS 10.14.6, Docker version 19.03.8, build afacb8b. In the container, WORKDIR and VOLUME point to the same path. Might that be the source of the problem, somehow?

Very interesting point! But having WORKDIR and VOLUME be the same path is a very common pattern for devstacks.

@Taytay

Taytay commented Jun 11, 2020

@asbjornu, @OmarIthawi: I just got done debugging this issue in our Docker container, and sure enough, our WORKDIR in our Dockerfile was set to the same folder (in the container) that is later used as a volume mount folder in docker-compose.yml. I added this line to the end of my Dockerfile:

WORKDIR /

and the problem went away. Whew!

So, @asbjornu : You were exactly right. This pattern appears to be error-prone:

Dockerfile:
WORKDIR /app/foo

Docker-compose:

<snip>
volumes:
  - ./foo:/app/foo
<snip>
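
A project could check for this pattern automatically. The following is a minimal sketch, assuming Python 3, absolute WORKDIR paths, and Compose short-syntax volume strings (SOURCE:TARGET[:MODE]) with POSIX-style paths. All function names are hypothetical, and the check does no real Dockerfile or YAML parsing.

```python
import posixpath


def workdirs(dockerfile_text):
    """Return every WORKDIR argument in the Dockerfile text, in order."""
    found = []
    for line in dockerfile_text.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) == 2 and parts[0].upper() == "WORKDIR":
            found.append(posixpath.normpath(parts[1].strip()))
    return found


def mount_target(volume_spec):
    """Return the container-side path of a short-syntax volume string."""
    fields = volume_spec.split(":")
    # "SOURCE:TARGET[:MODE]" has the target second; a bare "/path"
    # (anonymous volume) is itself the target.
    target = fields[1] if len(fields) >= 2 else fields[0]
    return posixpath.normpath(target)


def final_workdir_is_mounted(dockerfile_text, volume_specs):
    """True if the last WORKDIR equals the target of any listed volume."""
    dirs = workdirs(dockerfile_text)
    if not dirs:
        return False
    return any(dirs[-1] == mount_target(spec) for spec in volume_specs)
```

With the example above, final_workdir_is_mounted("WORKDIR /app/foo", ["./foo:/app/foo"]) is True, and appending a WORKDIR / line to the Dockerfile text makes it False.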

Incidentally, I was lucky enough to get a 100% repro to make it easy to test. I could docker-compose exec web bash, and be dropped into our previous default WORKDIR of /app/ynab_api folder with no issues. But then, if I ran git checkout some_other_branch on my host, that Docker session would go bad. For instance:

  • bundle install would fail with 'pwd': No such file or directory - getcwd (Errno::ENOENT)
  • pwd would fail with pwd: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory

Furthermore, from then on, subsequent calls to docker-compose exec web bash would take me into a bash prompt filled with lots of errors like:

docker-compose exec web bash -l -c "bash"
shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory
job-working-directory: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory

That would eventually get me a bash prompt, but from that prompt, calls to pwd would fail with the error above. If I ran cd ., the session would be partially repaired, but bundle install would still fail with the same error as before.

At this point, we're well over my head as far as Linux and Docker are concerned, but I sure hope this is fixable given how convenient and common it is.

@Taytay

Taytay commented Jun 12, 2020

@jeanlaurent : I wanted to give you a direct ping to let you know that I have a 100% repro, as described above, and I hope it will be helpful in tracking this down!

@scott-vsi

I came across this problem and was able to run strace on a simple C program that calls unistd.h:getcwd(). I ran this both from a directory that worked (/data) and the one that did not (/src). Both are directories mounted from the host. The only significant difference is this:

- getcwd("/data", 4096)                   = 6
+ getcwd(0x1ba4590, 4096)                 = -1 ENOENT (No such file or directory)

No idea if that is helpful.

macOS: 10.15.5
Engine: 19.03.8
Compose: 1.25.5
Docker Desktop: 2.3.0.3 (45519)
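
A similar per-directory check can be run without compiling a C program. The following is a minimal sketch, assuming Python 3 on a POSIX system; probe_getcwd is a hypothetical helper, and /data and /src in the usage note are simply the paths from the comment above.

```python
import errno
import os


def probe_getcwd(directories):
    """chdir into each directory and report whether getcwd() works there.

    Returns a dict mapping each directory to (True, cwd) on success or
    (False, errno_name) on failure, e.g. (False, "ENOENT").
    """
    results = {}
    # Keep a descriptor for the starting directory so it can be restored
    # even if its path can no longer be resolved.
    start = os.open(".", os.O_RDONLY)
    try:
        for directory in directories:
            try:
                os.chdir(directory)
                results[directory] = (True, os.getcwd())
            except OSError as exc:
                name = errno.errorcode.get(exc.errno, str(exc.errno))
                results[directory] = (False, name)
    finally:
        os.fchdir(start)
        os.close(start)
    return results
```

Running probe_getcwd(["/data", "/src"]) inside the container would show which mounts are affected at that moment.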

@Taytay

Taytay commented Aug 20, 2020

This is biting us again on a frequent basis. Our developers have to restart our Docker containers frequently due to this bug. I'm going to experiment with removing :delegated to see if that is the culprit.
@jeanlaurent, I noticed that this issue is still in "triage" status. Is that because the Docker team is not sure if it's worth fixing/investigating?

@Taytay

Taytay commented Aug 20, 2020

I've just upgraded my machine to Docker Edge (version 2.3.4.0 (46980) at the time of this post), which uses Mutagen to do syncing for :delegated volumes, and so far, switching between Git branches is not exposing the problem. It's only been a few minutes, but this might have solved the issue...

@booleanbetrayal

@Taytay, are you still seeing the CWD error resolved with Mutagen?

@brendonrapp

Mutagen was removed from Docker Edge in 2.3.5.0.

@ryechus

ryechus commented Sep 21, 2020

I am still seeing this error after removing :delegated declarations and setting WORKDIR to / in the Dockerfile. My use case is three volume mounts for one service and two volume mounts in a second service. The two volumes overlap with the three volumes. I am going to experiment with using bind mounts instead of volume mounts because it certainly seems like some sort of race condition. I only see the error when there are concurrent I/O-intensive operations.

@ryechus

ryechus commented Sep 25, 2020

I got to the bottom of the problems I was experiencing. I removed one of the volume mounts and the behavior stopped.

@docker-robott
Collaborator

Issues go stale after 90 days of inactivity.
Mark the issue as fresh with /remove-lifecycle stale comment.
Stale issues will be closed after an additional 30 days of inactivity.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so.

Send feedback to Docker Community Slack channels #docker-for-mac or #docker-for-windows.
/lifecycle stale

amcaplan pushed a commit to amcaplan/simple_jsonapi_client that referenced this issue Dec 30, 2020
Fix bundler issue in build, and getcwd error due to volumes (similar to
docker/for-mac#1509)

@docker-robott
Collaborator

Closed issues are locked after 30 days of inactivity.
This helps our team focus on active issues.

If you have found a problem that seems similar to this, please open a new issue.

Send feedback to Docker Community Slack channels #docker-for-mac or #docker-for-windows.
/lifecycle locked

@docker docker locked and limited conversation to collaborators Feb 22, 2021