docker-sync does not sync (again) #624

Closed
ohcibi opened this issue Nov 30, 2018 · 16 comments

Comments

@ohcibi

ohcibi commented Nov 30, 2018

Error

Using the following configuration:

version: "2"

options:
  verbose: true # make the whole sync verbose
syncs:
  app-sync: #tip: add -sync and you keep consistent names as a convention
    src: 'app/'
    sync_host_ip: 'auto'
    sync_host_port: 10872
    sync_userid: '501'
    watch_args: '-v' # make watching verbose
    sync_exclude: ['build']

The initial sync with docker-sync start after a docker-sync clean works as expected, but after that nothing is synced, not even when explicitly calling docker-sync sync. The container also uses CPU constantly. Unlike earlier variants of this issue, it does not simply start working after a few minutes.

Docker Driver

docker for mac

Sync strategy

native_osx

docker-sync.yml

version: "2"

options:
  verbose: true # make the whole sync verbose
syncs:
  app-sync: #tip: add -sync and you keep consistent names as a convention
    src: 'app/'
    sync_host_ip: 'auto'
    sync_host_port: 10872
    sync_userid: '501'
    watch_args: '-v' # make watching verbose
    sync_exclude: ['build']

OS

macOS 10.14.1

@EugenMayer
Owner

Please remove

    sync_host_ip: 'auto'
    sync_host_port: 10872

How many files are you watching?
I have not tested Mojave yet, so it could be related to that; you would need to find that out yourself.

@ohcibi
Author

ohcibi commented Nov 30, 2018

how many files are you watching?
200k

I will try again tomorrow with those settings removed, follow the debug guide, and report back.

@EugenMayer
Owner

Far too many. Read the guides on how to reduce that with excludes.
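
To see how much an exclude actually removes, the files under src can be counted with and without it. This is a rough sketch, not docker-sync's own logic: it assumes that an entry such as 'build' excludes every directory with that name, which may differ from the matching rules docker-sync and unison really apply. The function name is made up for illustration.

```python
import os


def count_watched_files(src, excluded_names=()):
    """Count file entries under src, skipping directories with an excluded name."""
    excluded = set(excluded_names)
    total = 0
    # os.walk does not follow symlinked directories by default, so circular
    # links cannot send it into an endless loop.
    for _root, dirs, files in os.walk(src):
        # Prune in place so os.walk never descends into excluded directories.
        dirs[:] = [d for d in dirs if d not in excluded]
        total += len(files)
    return total


# Example call, using src and sync_exclude from the docker-sync.yml above:
#   count_watched_files("app/", excluded_names=["build"])
```

Comparing the result with and without the exclude list gives a first idea of whether further excludes are worth the effort.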

@ohcibi
Author

ohcibi commented Dec 1, 2018

It is not possible to reduce that number. I’m using docker-sync for a reason. The number without the excludes is WAY higher.

I still made some progress, though. It turns out that the CPU usage eventually dropped to a normal rate. Once that happens, the sync is as fast as expected and does its job without permanent CPU usage. Additionally, I turned off sync during full builds, as a full build spits another order of magnitude of files into the mountpoint; this greatly reduced waiting times. I had even excluded the first pass of the build files, but I guess docker-sync still needs to decide for each of those whether it is excluded, which slowed the build down heavily when sync was active during a full build. (A full build takes 4-5 hours on a 2018 MacBook Pro.)

What would have helped avoid issues here is some kind of progress indicator showing what docker-sync is doing on subsequent calls to docker-sync start and, even more importantly, when it has finished. To be clear, I’m not even talking about an ETA. It would suffice to know that something is not yet finished and that one needs to wait for it (just as with the initial sync after running docker-sync clean). The initial docker-sync start blocks until it finishes its initial tasks, whereas subsequent calls don’t, which made me think docker-sync was ready to be used when it wasn’t yet.

TL;DR: docker-sync works even in my scenario, but one has to wait for it to finish its setup tasks on every start of the sync, not only the initial one. Something that tells when those tasks are finished and docker-sync is ready to be used would be helpful in this case.
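
Until such an indicator exists, the observation above (CPU usage drops once the setup work is done) can be turned into a crude readiness check. The following is only a heuristic sketch: it polls `docker stats` and waits for a few consecutive calm samples. The threshold, the sample count, and the assumption that the sync container is named like the sync ("app-sync") are illustrative guesses, not docker-sync behaviour.

```python
import subprocess
import time


def parse_cpu_percent(text):
    """Turn a value such as '12.34%' into the float 12.34."""
    return float(text.strip().rstrip("%"))


def container_cpu_percent(container):
    """Take one CPU usage sample of a container via the docker CLI."""
    result = subprocess.run(
        ["docker", "stats", "--no-stream", "--format", "{{.CPUPerc}}", container],
        check=True, capture_output=True, text=True,
    )
    return parse_cpu_percent(result.stdout)


def wait_until_idle(sample, threshold=5.0, needed=3, interval=5.0, sleep=time.sleep):
    """Block until `needed` consecutive samples are below `threshold` percent."""
    calm = 0
    while calm < needed:
        # A busy sample resets the streak; a calm one extends it.
        calm = calm + 1 if sample() < threshold else 0
        if calm < needed:
            sleep(interval)


# Example call (the container name is an assumption):
#   wait_until_idle(lambda: container_cpu_percent("app-sync"))
```

Low CPU usage is only a hint that the catalog build has finished; it is not a signal emitted by unison or docker-sync.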

@EugenMayer
Owner

What you are waiting for is unison building its initial catalog / index, less so the initial sync.

200k files is not sane for any project, let me explain.

Deal with it: d4m is a crook on OSX. You either adapt to that fact, change the OS, or keep trying and failing over and over again. docker-sync can help you, but it cannot erase the underlying "ground issues".

That said, if you want to develop reliably under macOS, you have to adapt your project layout to avoid the "dark area" and problem spots. If you do not, you will never have anything you can rely on. It will fail randomly, more randomly on weaker devices, and more often the longer your MBP (or d4m) has been running.
So you either fight the fact that your project layout has to be adapted and lose in the long run, or you accept it.

That said, there is no way your project has 200k files of source code; I'll just be ballsy and tell you that. We are talking either about 200k including vendor/dependencies or including blank assets for some sort of application.

No matter what, that is certainly not source code, or, let me rephrase, not "parts of the application you will be changing with your IDE/on your host while running the docker stack". Those files hardly ever go above 5k (usually far fewer).

I can understand that project layout has to be a compromise, so landing on 20/30k is OK if you take on some of the vendor symbols.

Whatever the 200k are, it's beyond that.

Restructure your project: make sure you split vendor out, and neither sync nor watch it. At this number of files, use an initial docker cp task for the host-to-container transport and a resync.sh for doing that again (no matter whether it's container to host or host to container).
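
The resync step suggested above could look roughly like this. It is a minimal sketch written in Python rather than as a resync.sh; the helper names, the container name "myapp", and the paths are hypothetical. It relies only on `docker cp`, where a source path ending in `/.` copies the contents of a directory rather than the directory itself.

```python
import subprocess


def docker_cp_command(host_dir, container, container_dir, to_container=True):
    """Build the docker cp argument list for copying a directory's contents."""
    remote_dir = f"{container}:{container_dir}"
    source, target = (host_dir, remote_dir) if to_container else (remote_dir, host_dir)
    # The trailing '/.' tells docker cp to copy the directory contents.
    return ["docker", "cp", source.rstrip("/") + "/.", target]


def resync(host_dir, container, container_dir, to_container=True):
    """Run the copy in the chosen direction; raises if docker cp fails."""
    subprocess.run(
        docker_cp_command(host_dir, container, container_dir, to_container),
        check=True,
    )


# Example calls (container name and paths are placeholders):
#   resync("vendor/", "myapp", "/app/vendor")                      # host to container
#   resync("vendor/", "myapp", "/app/vendor", to_container=False)  # container to host
```

Note that docker cp only copies; files deleted on the source side stay on the target, so an occasional clean copy may be needed.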

You can call this opinionated, but it's not; it's not my opinion. I would LOVE not to go down this road. It's just a fact I have to bow to as well.

Again, I am not lecturing you with my opinion; I am telling you what you can do. You can get all the reasons and facts behind this by reading the issues in docker-sync and understanding osxfs of d4m, unison, inotify events and OSX fs events, and how the hypervisor works for d4m. Gather all that, do benchmarks, and draw your own conclusions.

Or take it as an "experiment a lot of people in here have already done", and there seems to be only one solution: project layout adjustments and some compromises where we can make them.

@ohcibi
Author

ohcibi commented Dec 2, 2018

First off: as I said, docker-sync does work under said circumstances, and the fact that an initial catalog build takes longer than usual is a given for anyone working on this project. One just needs to know about it, and your explanation helped me with that. So everything is good here. Unless it already exists and I just missed it, the only thing I’d improve here is verbosity.

About the project: you are absolutely right that I’m not talking about 200k source files. It is an entire operating system which is built with Yocto, and I already excluded the majority of files by excluding the built files. It contains an entire OpenEmbedded layer, and messes of circular soft links make everything that scans the complete source dir super slow. I was working in an Ubuntu VM, and Idea would reliably crash while building its index. There are only 3 or 4 developers with a Mac, so there is no urge to change this. I can think of all sorts of ways to optimize this but can’t apply them just yet.

I don’t care whether what you’re saying is opinionated, because I simply agree. The project layout sucks like hell, but I can’t do anything about it because it “just works” on the majority of the developers’ devices. Some don’t even build on their own PCs but wait for CI to finish. It’s a mess, you don’t have to tell me 8-). What I’m doing is finding the easiest way to gain ground in this mess, and so far docker-sync has helped me a lot, because waiting for the catalog to be built is way less frustrating than the Ubuntu VM I was using before. The upside is that this only has to work for me.

@ohcibi
Author

ohcibi commented Dec 2, 2018

Could you elaborate on how removing the options you told me to remove optimizes this?

@EugenMayer
Owner

How often do your files change, let's say within 10 minutes?

@ohcibi
Author

ohcibi commented Dec 2, 2018

During development, at an absolutely normal rate. I can’t give numbers, but it’s really just the full build where file changes go berserk, as it downloads and extracts a gazillion files. So I need the discipline to turn off sync during full builds, which hopefully shouldn’t happen too often. The irony is that I’m not even trying to solve trouble with the number of files with docker-sync. The build fails on d4m without docker-sync because of the softlink mess 8-). Until it fails, the performance is not even notably worse than on a native docker setup.

I noticed that docker-sync has trouble with .git/index.lock, as it is deleted too fast, but it’s not causing any other trouble. That file shouldn’t be synced anyway, but as I need .git inside the container (don’t ask 8-)), I just left it as is for now.

@EugenMayer
Owner

I think in your special situation it would take far more effort than I can provide to figure out how to optimize it, sorry.

You are stretching ds into a region where it is known to have handicaps, and you are stretching d4m into a region where hope might be lost entirely.

I cannot really offer to dig into your build, build tools and packaging / artefact flow, or the docker container runtime, to find a good solution. It's just not something I can find enough time for, sorry.

I see that you are very much aware of the layout issues and that your hands are tied ... so I guess you are kind of on your own.

As far as I can see, I cannot really help with anything docker-sync related and would therefore close the issue. Any objections?

@ohcibi
Author

ohcibi commented Dec 2, 2018

Did you get the point that a) I understood that you can’t help here and therefore I am not requesting help, b) everything basically works as expected, with the only downside being that I have to wait a bit longer than usual on ds start, and c) as for the matter of this issue, I would appreciate ds being more verbose about when it builds the catalog and when that is finished?

I feel like there’s a major misunderstanding here, as you are trying to solve things I already told you can’t be solved. I also understood and expected the limitations of ds even before opening this issue, so please don’t put any more effort into this. Instead, focus on my verbosity request and tell me whether I simply missed it, or whether it would be feasible to add said verbosity for when the catalog is built, as you explained.

@EugenMayer
Owner

I understood.

If you have an idea how to do that, I am happy to review a PR anytime :)

@metalfm

metalfm commented Dec 2, 2018

I have a similar problem.
Try docker volume prune.
It helped me.

@ohcibi
Author

ohcibi commented Dec 2, 2018

@EugenMayer unfortunately I don't. I just wasn't sure whether I had simply missed it. But isn't it possible to simply write something like "Starting to build unison catalog, this could take a long time if you have a large amount of files" to the output of docker-sync start? And maybe simply log it once that's finished? Or is there no way to know when it is finished? I could in fact PR the "start" log, but I'm not sure about the "finish" one.

As my setup also pointed me to issues caused by a too-small Docker VM virtual disk, I'd additionally recommend not logging into a file and then running tail -f on that file, but making that file a symlink to /dev/stdout instead. But you might have reasons for not doing that which I don't know or understand, so ignore this if that's the case.
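
The symlink idea can be sketched as follows. This is an illustration only: the helper name is made up, the log path in the example is a placeholder, and nothing here reflects how docker-sync's containers are actually set up. Anything a process writes to the log path then goes to that process's standard output instead of growing a file on the VM disk.

```python
import os


def link_log_to_stdout(log_path):
    """Replace log_path with a symlink pointing at /dev/stdout."""
    # lexists is also true for a dangling symlink, which exists() would miss.
    if os.path.lexists(log_path):
        os.remove(log_path)
    os.symlink("/dev/stdout", log_path)


# Example call inside the container (the path is a placeholder):
#   link_log_to_stdout("/tmp/unison.log")
```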

@metalfm what do you mean by a similar problem? The initial problem of docker-sync not syncing was a false alarm, as the real problem was that the catalog needed to be built. Are you talking about that or something different? Will prune speed up the catalog build?

@EugenMayer
Owner

@ohcibi I'm not sure there is an easy way to determine when it's done and make that transparent. I am focusing purely on maintenance of docker-sync right now, so any feature has to come from a contributor, sorry ;/

The docker volume prune thing might be related to the anonymous volume we create in unison, which is then overlaid by the osxfs mount. If things are really broken in d4m, there might be a side effect (which I have never heard of).

Also, prune could get rid of a volume in such a way that the internal osxfs "service" under the hv gets flushed / killed / reset.

All of these are rough assumptions and there is hardly any proof for them; it's just not entirely absurd.

@ohcibi please open a new FR issue for the task to "make the unison catalog building inside the container more transparent" if you want to work on it. Let's close this issue, since it is already drifting away. You might start with http://www.cis.upenn.edu/~bcpierce/unison/download/releases/stable/unison-manual.html, maybe specifically http://www.cis.upenn.edu/~bcpierce/unison/download/releases/stable/unison-manual.html#unisondir, I do not know.
