dsconf backup create hangs on a container created with 389ds/dirsrv:3.1 docker image #6372

Closed
@mayurcrunchr

Issue Description

We have a cron job that runs the following command to back up the LDAP database online every hour at minute 55 (55 * * * *):
docker exec -i ldap dsconf -D "cn=Directory Manager" localhost backup create /tmp/ansible.1bu66l05/db

The job completes the backup successfully for a number of hours (23 in this case) and then suddenly hangs while copying backup files (on the 24th iteration, i.e. after 24 hours in this case).

"[20/Oct/2024:22:55:33.029998779 +0000] - INFO - archive_copyfile - Copying /etc/dirsrv/slapd-localhost/slapd-collations.conf to /tmp/ansible.1bu66l05/db/config_files/slapd-collations.conf"

We are using the Docker image 389ds/dirsrv:3.1 on an Ubuntu 20.04.6 LTS host.
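
To see which file the backup was copying when it stopped, the last archive_copyfile entry in the error log can be pulled out. This is only a minimal sketch, assuming log lines in the format of the excerpt above; `last_copied_file` is a hypothetical helper name, and the error log path to pass to it depends on the deployment.

```shell
# Hypothetical helper: print the destination path of the last
# "archive_copyfile - Copying SRC to DST" entry in the given log file.
# Assumes the line format shown in the log excerpt above.
last_copied_file() {
    # Keep only the copy entries, take the most recent one,
    # then strip everything up to and including the final " to ".
    grep 'archive_copyfile - Copying ' "$1" | tail -n 1 | sed 's/.* to //'
}

# Example (not executed here; the path is a placeholder):
#   last_copied_file /path/to/errors
```

If the printed path is the same file on every hang, that points at a specific file; if it varies, the hang is more likely unrelated to the file being copied.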

Package Version and Platform:

  • Platform: openSUSE Tumbleweed, release: 20240726
  • Package and version: [389-ds 3.1.1~git0.aef1668-180.1 x86_64]

Steps to Reproduce
Steps to reproduce the behavior:

  1. Create an LDAP Docker container from the image 389ds/dirsrv:3.1 on an Ubuntu 20.04.6 LTS host
  2. Add a backend and set nsslapd-backend-implement to bdb (we have also observed the same problem on a fresh mdb database)
  3. Create a test group
  4. Create a test user and add it to the test group created in step 3
  5. Enable the cron job to back up the LDAP database every hour at minute 55:
    55 * * * * docker exec -i ldap dsconf -D "cn=Directory Manager" localhost backup create /tmp/ansible.1bu66l05/db
  6. After a few hours (24 hours in our case), the dsconf backup hangs
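
Because the hang in step 6 blocks the cron job indefinitely, one way to at least detect it is to bound each run with a time limit. The following is a minimal sketch rather than part of the reported setup: `run_with_limit` is a hypothetical helper name, the 600-second limit in the usage comment is an arbitrary assumption, and it relies on GNU coreutils `timeout` being available on the host.

```shell
# Hypothetical wrapper: run a command, but give up after LIMIT seconds.
# Relies on GNU coreutils `timeout`, which exits with status 124
# when the time limit is reached.
run_with_limit() {
    limit="$1"
    shift
    timeout "$limit" "$@"
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "command timed out after ${limit}s: $*" >&2
    fi
    return "$status"
}

# Example cron usage (not executed here; 600 is an assumed limit):
#   run_with_limit 600 docker exec -i ldap dsconf -D "cn=Directory Manager" localhost backup create /tmp/ansible.1bu66l05/db
```

Note that stopping the `docker exec` client on the host may leave the dsconf process inside the container running, so this only keeps the cron job from blocking and makes the hang visible through the exit status; it does not address the hang itself.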

Screenshots
(three screenshots attached to the original issue; images not reproduced here)

Additional context
Please find attached the backend config, nsslapd config params, access logs, error logs, db_stat output, and gdb backtraces for the ns-slapd, dsconf and dscontainer processes.

389ds-3.1-debug-files-all.tar.gz

The GDB output was captured using the command described in https://www.port389.org/docs/389ds/FAQ/faq.html

We tried running the command docker exec -i ldap dsconf -D "cn=Directory Manager" localhost backup create /tmp/ansible.1bu66l05/db in a for loop 140 times, but were not able to reproduce the issue this way.
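
A back-to-back loop of this kind can be sketched roughly as follows. `repeat_command` is a hypothetical helper, not the exact loop that was run; it executes the given command a fixed number of times, stops at the first failure, and reports how many iterations completed.

```shell
# Hypothetical helper: run a command COUNT times in a row,
# stopping at the first non-zero exit status.
repeat_command() {
    count="$1"
    shift
    done_runs=0
    while [ "$done_runs" -lt "$count" ]; do
        "$@" || break
        done_runs=$((done_runs + 1))
    done
    echo "completed $done_runs of $count iterations"
    # Succeed only if every iteration succeeded.
    [ "$done_runs" -eq "$count" ]
}

# Example (not executed here):
#   repeat_command 140 docker exec -i ldap dsconf -D "cn=Directory Manager" localhost backup create /tmp/ansible.1bu66l05/db
```

If the command itself hangs, this loop blocks as well, so it only distinguishes failures from successes, not hangs.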

FYI: we also have a health check service running on the host that runs the following every 30 seconds:
"/usr/bin/ldapwhoami ldapsearch -H ldap://127.0.0.1:3389 -x -D 'cn=Directory Manager' -y /etc/ldap_password"

We did not encounter this problem with 389ds/dirsrv:2.4

Metadata

Assignees: No one assigned

Labels: needs triage (the issue will be triaged during scrum), priority_high (need urgent fix / highly valuable / easy to fix)
