docker container (gitlab-ce) fails to start when using mergerfs as mounted data #1267

Closed
tarchive opened this issue Oct 20, 2023 · 6 comments

Comments

@tarchive

Describe the bug

When a mergerfs pool is used as the data volume of a fresh Docker container running the official gitlab-ce image, the container fails to start because the startup reconfigure fails.
After changing the Docker volume mounts to use the disk directly, the container is able to run its reconfiguration script without error.

To Reproduce

Docker Compose file for the container. All the data directories are empty to begin with.

version: '3'
services:
  gitlab-ce:
    container_name: GitLab-CE
    image: 'gitlab/gitlab-ce:15.4.6-ce.0'
    ports:
      - '9081:80'
    volumes:
        - '/mnt/data/gitlab-ce/config:/etc/gitlab'
        - '/mnt/data/gitlab-ce/data:/var/opt/gitlab'
        - '/mnt/data/gitlab-ce/log:/var/log/gitlab'

Container starts then terminates.

GitLab container error message:
Recipe: gitlab::gitlab-shell
  * storage_directory[/var/opt/gitlab/.ssh] action create
    * ruby_block[directory resource: /var/opt/gitlab/.ssh] action run
      
      ================================================================================
      Error executing action `run` on resource 'ruby_block[directory resource: /var/opt/gitlab/.ssh]'
      ================================================================================
      
      Mixlib::ShellOut::ShellCommandFailed
      ------------------------------------
      Expected process to exit with [0], but received '1'
      ---- Begin output of chgrp git /var/opt/gitlab/.ssh ----
      STDOUT: 
      STDERR: chgrp: changing group of '/var/opt/gitlab/.ssh': No such file or directory
      ---- End output of chgrp git /var/opt/gitlab/.ssh ----
      Ran chgrp git /var/opt/gitlab/.ssh returned 1
      
      Cookbook Trace: (most recent call first)
      ----------------------------------------
      /opt/gitlab/embedded/cookbooks/cache/cookbooks/package/libraries/storage_directory_helper.rb:35:in `run_command'
      /opt/gitlab/embedded/cookbooks/cache/cookbooks/package/libraries/storage_directory_helper.rb:52:in `ensure_permissions_set'
      /opt/gitlab/embedded/cookbooks/cache/cookbooks/package/resources/storage_directory.rb:42:in `block (3 levels) in class_from_file'
      /opt/gitlab/embedded/cookbooks/cache/cookbooks/package/resources/storage_directory.rb:36:in `block in class_from_file'
      
      Resource Declaration:
      ---------------------
      # In /opt/gitlab/embedded/cookbooks/cache/cookbooks/package/resources/storage_directory.rb
      
       36:   ruby_block "directory resource: #{new_resource.path}" do
       37:     block do
       38:       # Ensure the directory exists
       39:       storage_helper.ensure_directory_exists(new_resource.path)
       40: 
       41:       # Ensure the permissions are set
       42:       storage_helper.ensure_permissions_set(new_resource.path)
       43: 
       44:       # Error out if we have not achieved the target permissions
       45:       storage_helper.validate!(new_resource.path)
       46:     end
       47:     not_if { storage_helper.validate(new_resource.path) }
       48:   end
       49: end
      
      Compiled Resource:
      ------------------
      # Declared in /opt/gitlab/embedded/cookbooks/cache/cookbooks/package/resources/storage_directory.rb:36:in `block in class_from_file'
      
      ruby_block("directory resource: /var/opt/gitlab/.ssh") do
        action [:run]
        default_guard_interpreter :default
        declared_type :ruby_block
        cookbook_name "gitlab"
        recipe_name "gitlab-shell"
        block #<Proc:0x0000557356bac5d8 /opt/gitlab/embedded/cookbooks/cache/cookbooks/package/resources/storage_directory.rb:37>
        not_if { #code block }
      end
      
      System Info:
      ------------
      chef_version=17.10.0
      platform=ubuntu
      platform_version=20.04
      ruby=ruby 2.7.6p219 (2022-04-12 revision c9c2245c0a) [x86_64-linux]
      program_name=/opt/gitlab/embedded/bin/cinc-client
      executable=/opt/gitlab/embedded/bin/cinc-client
      
    
    ================================================================================
    Error executing action `create` on resource 'storage_directory[/var/opt/gitlab/.ssh]'
    ================================================================================
    
    Mixlib::ShellOut::ShellCommandFailed
    ------------------------------------
    ruby_block[directory resource: /var/opt/gitlab/.ssh] (gitlab::gitlab-shell line 36) had an error: Mixlib::ShellOut::ShellCommandFailed: Expected process to exit with [0], but received '1'
    ---- Begin output of chgrp git /var/opt/gitlab/.ssh ----
    STDOUT: 
    STDERR: chgrp: changing group of '/var/opt/gitlab/.ssh': No such file or directory
    ---- End output of chgrp git /var/opt/gitlab/.ssh ----
    Ran chgrp git /var/opt/gitlab/.ssh returned 1
    
    Cookbook Trace: (most recent call first)
    ----------------------------------------
    /opt/gitlab/embedded/cookbooks/cache/cookbooks/package/libraries/storage_directory_helper.rb:35:in `run_command'
    /opt/gitlab/embedded/cookbooks/cache/cookbooks/package/libraries/storage_directory_helper.rb:52:in `ensure_permissions_set'
    /opt/gitlab/embedded/cookbooks/cache/cookbooks/package/resources/storage_directory.rb:42:in `block (3 levels) in class_from_file'
    /opt/gitlab/embedded/cookbooks/cache/cookbooks/package/resources/storage_directory.rb:36:in `block in class_from_file'
    
    Resource Declaration:
    ---------------------
    # In /opt/gitlab/embedded/cookbooks/cache/cookbooks/gitlab/recipes/gitlab-shell.rb
    
     34:   storage_directory dir do
     35:     owner git_user
     36:     group git_group
     37:     mode "0700"
     38:   end
     39: end
    
    Compiled Resource:
    ------------------
    # Declared in /opt/gitlab/embedded/cookbooks/cache/cookbooks/gitlab/recipes/gitlab-shell.rb:34:in `block in from_file'
    
    storage_directory("/var/opt/gitlab/.ssh") do
      action [:create]
      default_guard_interpreter :default
      declared_type :storage_directory
      cookbook_name "gitlab"
      recipe_name "gitlab-shell"
      owner "git"
      group "git"
      mode "0700"
      path "/var/opt/gitlab/.ssh"
    end
    
    System Info:
    ------------
    chef_version=17.10.0
    platform=ubuntu
    platform_version=20.04
    ruby=ruby 2.7.6p219 (2022-04-12 revision c9c2245c0a) [x86_64-linux]
    program_name=/opt/gitlab/embedded/bin/cinc-client
    executable=/opt/gitlab/embedded/bin/cinc-client
    
[2023-10-20T12:20:17-04:00] INFO: Running queued delayed notifications before re-raising exception
Running handlers:
[2023-10-20T12:20:17-04:00] ERROR: Running exception handlers
There was an error running gitlab-ctl reconfigure:
storage_directory[/var/opt/gitlab/.ssh] (gitlab::gitlab-shell line 34) had an error: Mixlib::ShellOut::ShellCommandFailed: ruby_block[directory resource: /var/opt/gitlab/.ssh] (gitlab::gitlab-shell line 36) had an error: Mixlib::ShellOut::ShellCommandFailed: Expected process to exit with [0], but received '1'
---- Begin output of chgrp git /var/opt/gitlab/.ssh ----
STDOUT: 
STDERR: chgrp: changing group of '/var/opt/gitlab/.ssh': No such file or directory
---- End output of chgrp git /var/opt/gitlab/.ssh ----
Ran chgrp git /var/opt/gitlab/.ssh returned 1
Running handlers complete
[2023-10-20T12:20:17-04:00] ERROR: Exception handlers complete
Infra Phase failed. 3 resources updated in 09 seconds
[2023-10-20T12:20:17-04:00] FATAL: Stacktrace dumped to /opt/gitlab/embedded/cookbooks/cache/cinc-stacktrace.out
[2023-10-20T12:20:17-04:00] FATAL: ---------------------------------------------------------------------------------------
[2023-10-20T12:20:17-04:00] FATAL: PLEASE PROVIDE THE CONTENTS OF THE stacktrace.out FILE (above) IF YOU FILE A BUG REPORT
[2023-10-20T12:20:17-04:00] FATAL: ---------------------------------------------------------------------------------------
[2023-10-20T12:20:17-04:00] FATAL: Mixlib::ShellOut::ShellCommandFailed: storage_directory[/var/opt/gitlab/.ssh] (gitlab::gitlab-shell line 34) had an error: Mixlib::ShellOut::ShellCommandFailed: ruby_block[directory resource: /var/opt/gitlab/.ssh] (gitlab::gitlab-shell line 36) had an error: Mixlib::ShellOut::ShellCommandFailed: Expected process to exit with [0], but received '1'
---- Begin output of chgrp git /var/opt/gitlab/.ssh ----
STDOUT: 
STDERR: chgrp: changing group of '/var/opt/gitlab/.ssh': No such file or directory
---- End output of chgrp git /var/opt/gitlab/.ssh ----
Ran chgrp git /var/opt/gitlab/.ssh returned 1

System information:

  • OS, kernel version: Linux photon-5 6.1.45-8.ph5-esx #1-photon SMP Wed Sep 20 03:11:02 UTC 2023 x86_64 GNU/Linux

  • mergerfs version: mergerfs v2.37.1

  • mergerfs settings: /mnt/DISK/NytroWarpDrive:/mnt/DISK/FlashMaxIII:/mnt/SAN/NAS_LUN-1 /mnt/data fuse.mergerfs cache.files=partial,dropcacheonclose=true,moveonenospc=true,minfreespace=20G,fsname=mergerfsPool 0 0

  • List of drives, filesystems, & sizes:

    • df -h
    mergerfsPool               175G   11G  163G   6% /mnt/data
    /dev/sdd1                   25G  3.8M   25G   1% /mnt/DISK/FlashMaxIII
    /dev/sdb1                   50G   11G   40G  21% /mnt/DISK/NytroWarpDrive
    /dev/sdc1                  100G  3.8M   98G   1% /mnt/SAN/NAS_LUN-1
    
    • lsblk -f
    sdb
    └─sdb1 btrfs              7bb09c40-6032-45c3-b1c1-f8414a19d853   39.8G    20% /mnt/DISK/NytroWarpDrive
    sdc
    └─sdc1 btrfs              193fb243-9dd0-476a-a2ba-e3c8793f1851     98G     0% /mnt/SAN/NAS_LUN-1
    sdd
    └─sdd1 btrfs              f5419648-dbe6-461c-b88c-3e42f590b070   24.5G     0% /mnt/DISK/FlashMaxIII
    
  • A strace of the application having a problem:
    Same strace as mentioned in container log above. Renamed to .txt for upload.
    cinc-stacktrace.out.txt

  • strace of mergerfs while app tried to do its thing:
    mergerfs.strace.txt

@trapexit
Owner

494   16:20:17.797566 newfstatat(AT_FDCWD, "/mnt/DISK/NytroWarpDrive/gitlab-ce/data/.ssh", 0x7f8d00b101a0, AT_SYMLINK_NOFOLLOW) = -1 EACCES (Permission denied) <0.000030>
494   16:20:17.797655 newfstatat(AT_FDCWD, "/mnt/DISK/FlashMaxIII/gitlab-ce/data/.ssh", 0x7f8d00b101a0, AT_SYMLINK_NOFOLLOW) = -1 EACCES (Permission denied) <0.000028>
494   16:20:17.797737 newfstatat(AT_FDCWD, "/mnt/SAN/NAS_LUN-1/gitlab-ce/data/.ssh", 0x7f8d00b101a0, AT_SYMLINK_NOFOLLOW) = -1 EACCES (Permission denied) <0.000029>

Because you don't have a strace of the offending app, I can't really correlate things properly, but this certainly looks like a problem. The underlying filesystems are returning permission denied for that file / path. Have you confirmed perms are properly set?
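
For anyone triaging a similar mergerfs strace, a small filter can pull out the paths whose syscalls failed with a given errno. This is only a minimal sketch and not part of mergerfs: the function name `failed_paths` and the regular expression are assumptions made for illustration, and the pattern only covers lines shaped like the three quoted above.

```python
import re

# Captures the first quoted path argument and the errno name of a failed
# syscall in a strace line (assumed format: ..."<path>"... = -1 ERRNO (...)).
LINE_RE = re.compile(r'"(?P<path>[^"]+)".*= -1 (?P<errno>[A-Z0-9]+) ')


def failed_paths(strace_text, errno="EACCES"):
    """Return the paths whose syscall failed with `errno`, in log order."""
    hits = []
    for line in strace_text.splitlines():
        match = LINE_RE.search(line)
        if match and match.group("errno") == errno:
            hits.append(match.group("path"))
    return hits


# One of the lines from the mergerfs strace in this thread.
sample = (
    '494   16:20:17.797566 newfstatat(AT_FDCWD, '
    '"/mnt/DISK/NytroWarpDrive/gitlab-ce/data/.ssh", 0x7f8d00b101a0, '
    'AT_SYMLINK_NOFOLLOW) = -1 EACCES (Permission denied) <0.000030>'
)
print(failed_paths(sample))
```

Running it over the whole mergerfs strace file would show whether every branch returns the same error for the same relative path, as happened here.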

@tarchive
Author

Ah, I misread that support step and only gave the stack trace GitLab generated internally.
Here is the strace of the app command gitlab-ctl reconfigure, since I'm not sure which more basic command is failing. This is just a snippet from where I see the error begin until the end. Let me know if you need the whole file (90M).
snippet.app.strace.txt

I don't think folder permissions are the issue because these are all folders that the container creates and sets permissions on itself. Their script is here: update-permissions

Further testing: if I swap the volumes in my compose file to use /mnt/DISK/NytroWarpDrive directly instead of the /mnt/data pool, the container starts up normally without error. Issue only exists when using pool as a mount

version: '3'
services: 
  gitlab-ce:
    container_name: GitLab-CE
    image: 'gitlab/gitlab-ce:15.4.6-ce.0'
    ports:
      - '9081:80'
    volumes:
        - '/mnt/DISK/NytroWarpDrive/gitlab-ce/config:/etc/gitlab'
        - '/mnt/DISK/NytroWarpDrive/gitlab-ce/data:/var/opt/gitlab'
        - '/mnt/DISK/NytroWarpDrive/gitlab-ce/log:/var/log/gitlab'

Other notes: NytroWarpDrive is the only disk with the gitlab-ce folders. This is a fresh vm install which is why disk sizes are small and empty.

@trapexit
Owner

trapexit commented Oct 20, 2023

From mergerfs' perspective the OS is absolutely returning permission denied. It is there in the mergerfs strace as I shared. It stat'ed the .ssh path and all three mounts returned EACCES.

I need both the strace from mergerfs and the strace from the app at the same time so I can correlate what request the app sends with the behavior of mergerfs.

Issue only exists when using pool as a mount

Yes, but perms can be different due to how containers work and how you have mergerfs set up. For instance: many people don't share groups between a container and host, leading to a supplemental group difference. Some users use user namespacing, which further changes what is going on between a container and the host. You could have perm errors on parts of the path that don't translate to a bind mount. Not everything can be exactly replicated between mergerfs and the underlying filesystem. Hence why I need all the information possible about the setup to comment. Almost certainly the issue is permissions... but I need to know what they are.

@trapexit
Owner

This is a fresh vm install

So what are the perms of /mnt/DISK/*? Are you positive they are set up properly? mergerfs sees /mnt/DISK/foo... not what you bind mount. It does not evaluate from the bind-mounted point down; it looks at the whole path. If the base of the path is not permissioned such that it works from outside the container, then it won't work in the container either.
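
Since the whole branch path matters, it can help to list the mode of every component from the root down, similar in spirit to what `namei -l` prints. The following is a minimal Python sketch under that assumption; `path_modes` is a hypothetical helper name, and the example path is simply the branch path from this thread.

```python
import os
import stat
from pathlib import Path


def path_modes(path):
    """Return [(component, mode)] for each ancestor of `path`, root first.

    `mode` is the octal permission string, or the OS error text when the
    component cannot be examined (for example, missing or not searchable).
    """
    target = Path(path)
    chain = [target, *target.parents][::-1]  # root ... target
    rows = []
    for component in chain:
        try:
            mode = stat.S_IMODE(os.lstat(component).st_mode)
            rows.append((str(component), format(mode, "03o")))
        except OSError as exc:
            rows.append((str(component), exc.strerror))
    return rows


if __name__ == "__main__":
    for component, mode in path_modes("/mnt/DISK/NytroWarpDrive/gitlab-ce/data"):
        print(mode, component)
```

Running it as the same user the application runs as (not as root) gives the most representative result, since a component that is not searchable for that user shows up as an error rather than a mode.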

@tarchive
Author

D'oh! You were right, it was a permission issue. I went over the folder permissions earlier before opening this issue, but I must have misread the output. Specifically, /mnt/DISK had 750 for its permissions. Once I changed it to 755 and recreated the gitlab container, the startup ran smoothly without error. Sorry to waste your time. I swear I checked all this beforehand.
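
The difference between 750 and 755 is the read and search (execute) bits for "other" users, and the search bit is the one needed to traverse a directory. A short Python sketch that decodes both modes follows; the helper name is hypothetical, and the idea that the requesting user was neither the owner of /mnt/DISK nor in its group is an assumption inferred from the fix, not something stated in the thread.

```python
import stat


def others_can_traverse(mode):
    """True if users outside the owner and group may search a directory
    that has this permission mode (the 'other' execute bit is set)."""
    return bool(mode & stat.S_IXOTH)


for mode in (0o750, 0o755):
    # stat.filemode renders the bits the way `ls -l` does.
    print(format(mode, "o"), stat.filemode(stat.S_IFDIR | mode),
          others_can_traverse(mode))
# 750 drwxr-x--- False
# 755 drwxr-xr-x True
```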

Thank you for your ongoing support and overall awesome projects. Long time scorch user, but still a mergerfs newbie.

@trapexit
Owner

no problem. glad we resolved it.

This is a somewhat complicated situation. If I rewrote mergerfs to work more like what a bind mount would do, it would 1) require a major rewrite and 2) mean that branches can't as easily be added and removed without explicitly configuring mergerfs, because it would require holding an open file for the life of the usage. It might be worth doing, but I need to carefully consider all the consequences.
