Skip to content

fix apply chown permissions in parallel for large workspace #73

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Sep 16, 2021

Conversation

gjrtimmer
Copy link
Contributor

@gjrtimmer gjrtimmer commented Jul 4, 2021

chown is always bound to a single core. See man page chown. Applying the permissions through a single process on a large workspace causes the container boot process to be slow. In order to mitigate slow container start times for large work spaces chown can be run in parallel. This PR provides that functionality.

Gains

  • Increase container boot time by a factor ~20
  • Significant reduction in memory use on container start because the entire /config will not be loaded in a singular chown process which for a large config directory and/or workspace can easily increase to over 1.2 GB of additional memory.
  • Automatic parallel process allocation based upon available cores

  • I have read the contributing guideline and understand that I have made the correct modifications

Description:

Apply permissions through chown in parallel.

Benefits of this PR and context:

Benefits are pure speed when it comes to starting the container.

The following test results are from a workspace with the following details.

Files: 134,834
Size: 9.8 GB

Results before PR

real    17m27.335s
user    0m1.222s
sys     0m50.652s

Results after PR

# chown /config
real    0m15.775s
user    0m0.809s
sys     0m12.201s

# chown /config/workspace
real    0m33.875s
user    0m2.461s
sys     0m42.309s

How Has This Been Tested?

  • Container build
  • prefixed command with linux time command, recorded differences and attached them in this PR.

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for opening this pull request! Be sure to follow the pull request template!

@aptalca aptalca self-assigned this Jul 4, 2021
@LinuxServer-CI
Copy link
Collaborator

@gjrtimmer
Copy link
Contributor Author

[EDIT] Updated PR to include automatic process allocation based upon available number of cores in container

@gjrtimmer
Copy link
Contributor Author

58e865e fixes the error found in the logs of LinuxSever-CI

log

...
setting permissions::workspace
chown: missing operand after ‘abc:abc’
Try 'chown --help' for more information.
...

Fixed in 58e865e do not run xargs when empty

@gjrtimmer
Copy link
Contributor Author

@aptalca Ready for evaluation

@LinuxServer-CI
Copy link
Collaborator

2 similar comments
@LinuxServer-CI
Copy link
Collaborator

@LinuxServer-CI
Copy link
Collaborator

@gjrtimmer
Copy link
Contributor Author

gjrtimmer commented Jul 14, 2021

@aptalca Any update ?

(Rebased / Synced due to bot update)

@LinuxServer-CI
Copy link
Collaborator

@aptalca
Copy link
Member

aptalca commented Jul 20, 2021

@gjrtimmer thanks for the PR. I've been quite busy with real life and other projects.
I did do a quick review and it looks good.

To be honest, when I first put in the chown in there, I didn't consider that folks may have many gigs of data in the config folder. In such cases (like plex or emby where they have hundreds of gigs), we usually check the top folder and if the perms there are good, we skip the chown.

I'm thinking we can do the same here for the workspace folder, where we check /config/workspace and skip its subfolders if its perms are good. The rest of the config folder can be chowned via xargs.

It also got me thinking, since we provide sudo/root access in this container, the user may have files in workspace with different perms or owners intentionally. We certainly don't want to mess with them.

Let me know if you see any issues with that plan. If not, please modify it accordingly (only chown /config/workspace if needed, and not recursive, and the rest of /config can be chowned recursively).

Thanks

@gjrtimmer
Copy link
Contributor Author

gjrtimmer commented Jul 20, 2021

@aptalca I totally understand, I work myself in medical IT, so I know how taxing real life can be.

Your suggestions got me also thinking, what if we make a few changes and introduce some variables, this way we can do even both, have the user decide and provide some flexibility which even can be ported to other containers.

I will change the PR according to your suggestions, and keep your initial suggestion as the default action.

I would still like to preserve some of this improvement because as you said, we can recursively chown the rest of /config without the /config/workspace. We can still save time on just the /config by running it in parallel because that will still give a severe boost.

I've included a recent overview of my current running docker-code-server its also the workspace in where I even keep this PR.

❯ du -sh *
24K     crontabs
0       custom-cont-init.d
0       custom-services.d
279M    data
435M    extensions
0       npm-global
11G     workspace

As you can see, I this current container there is not much npm data, but as you can see, the data and the extensions take some space than initially expected, and all of those are small javascript files. I think we should go with the hybrid you suggested. I will whip up something nice, that will be usable for all end-users.

BTW

I've noticed something really weird, did you know that the base image sets the user for the custom-cont-init.d and custom-services.d to root:root even if the users creates the directories; is that a bug ?

@Roxedus
Copy link
Member

Roxedus commented Aug 8, 2021

I've noticed something really weird, did you know that the base image sets the user for the custom-cont-init.d and custom-services.d to root:root even if the users creates the directories; is that a bug ?

No this is intended, to block compromised apps to be able to write to files that are executed by root.

I can also mention that I mounted my projects to another folder, mainly because my /config is backed up.

@github-actions
Copy link

github-actions bot commented Sep 8, 2021

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@gjrtimmer
Copy link
Contributor Author

@Roxedus Thank you.

@gjrtimmer
Copy link
Contributor Author

@aptalca | @Roxedus | Request review

@LinuxServer-CI
Copy link
Collaborator

@aptalca
Copy link
Member

aptalca commented Sep 16, 2021

I'm not in favor of adding an env var (let alone 2) for permissions. I think we should simply chown everything under config recursively except for workspace, which we'll chown non-recursive (so code-server can access it on start). Anything inside workspace will be the user's own responsibility.

@LinuxServer-CI
Copy link
Collaborator

@gjrtimmer
Copy link
Contributor Author

@aptalca | I understand, will provide fix within 5m

@aptalca
Copy link
Member

aptalca commented Sep 16, 2021

Thanks so much. While you're at it, can you also update the changelog entry to today's date?

@gjrtimmer
Copy link
Contributor Author

@aptalca | Fix provided, corrections made upon comments, please review

@LinuxServer-CI
Copy link
Collaborator

@aptalca
Copy link
Member

aptalca commented Sep 16, 2021

From the container startup log:

xargs: WARNING: a NUL character occurred in the input.  It cannot be passed through in the argument list.  Did you mean to use the --null option?

@gjrtimmer
Copy link
Contributor Author

gjrtimmer commented Sep 16, 2021

@aptalca during the rewrite, I forget to put the --null back in. Fix is building. Request review; I tested the fix, I missed this during my last test, I apologize.

@LinuxServer-CI
Copy link
Collaborator

@gjrtimmer
Copy link
Contributor Author

@aptalca DONE

@LinuxServer-CI
Copy link
Collaborator

@aptalca aptalca merged commit 44f7b9f into linuxserver:master Sep 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants