WMAgent: install/run CouchDB from Dockerhub #11312

Closed
amaltaro opened this issue Oct 3, 2022 · 18 comments
Assignees
Labels
containerization deployment Issue related to deployment of the services WMAgent

Comments

@amaltaro
Contributor

amaltaro commented Oct 3, 2022

Impact of the new feature
WMAgent

Fixed by: dmwm/CMSKubernetes#1409

Is your feature request related to a problem? Please describe.
As part of the migration to PyPI and RPM-less deployment, we should start looking into running CouchDB as a container on the WMAgent nodes.

Describe the solution you'd like
Update and/or use the CouchDB dockerfile already available in (latest stable tag is v5):
https://github.com/dmwm/CMSKubernetes/tree/master/docker/couchdb

Also provide relevant configuration changes and/or scripts (we might have to update the configuration with the replication section, which is not used in central services).

Changes to the WMAgent deployment script might be required as well, moving away from a localhost MariaDB deployment to a containerized model (through host network or similar).

Note that all of the CouchDB data and logs need to be in persistent storage (resilient to container restarts/recreation).
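
A minimal sketch of how the persistence and host-network requirements could translate into a `docker run` invocation: data, logs and configuration live on the host and are bind-mounted, so recreating the container does not lose them. The host directory `/data/dockerMount`, the function name and the image tag are assumptions for illustration, not the actual deployment script.

```python
import shlex

def couchdb_run_command(image, host_root="/data/dockerMount", user=None):
    """Build a `docker run` command line that keeps CouchDB data on the host.

    host_root is an assumed host directory; adjust it to the node layout.
    """
    cmd = ["docker", "run", "--detach", "--name", "couchdb",
           "--network", "host"]  # host networking, as suggested above
    if user:
        cmd += ["--user", user]
    # Bind mounts survive container restarts and re-creation.
    for sub in ("srv/couchdb", "admin/couchdb"):
        cmd += ["--volume", f"{host_root}/{sub}:/data/{sub}"]
    cmd.append(image)
    return cmd

print(shlex.join(couchdb_run_command("local/couchdb:3.2.2", user="cmst1")))
```

Returning the argument list rather than a shell string keeps quoting problems out of the picture if the command is later passed to `subprocess.run`.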

Lastly, once the image is stable, we need to label it accordingly to avoid automated registry cleanup.
Valentin says that the policy is: tag needs to have a -stable suffix.
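
As an illustration of the tagging policy, a small helper could append the `-stable` suffix exactly once before an image is pushed. The registry and repository names in the example call are assumptions; only the suffix rule comes from the policy quoted above.

```python
def stable_tag(tag, suffix="-stable"):
    """Return the tag with the protective suffix, adding it only once."""
    return tag if tag.endswith(suffix) else tag + suffix

def image_ref(registry, repository, tag):
    """Compose a full image reference using the stable form of the tag."""
    return f"{registry}/{repository}:{stable_tag(tag)}"

# Hypothetical registry/repository, shown only to illustrate the format.
print(image_ref("registry.cern.ch", "cmsweb/wmagent-couchdb", "3.2.2"))
```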

Describe alternatives you've considered
None

Additional context
None

Part of the following meta issue: #11314

@todor-ivanov
Contributor

Logging it here, because this is important.

Whenever I tried to run this container I got a silent Killed message and the couchdb process never started. After 2 days of heavy debugging of the couchdb setup I finally stumbled on the actual culprit. Looking into the host's system log, I found:

Aug 21 12:32:59  kernel: Node 0 DMA32: 779*4kB (UME) 968*8kB (UE) 1500*16kB (UME) 257*32kB (UME) 61*64kB (UME) 33*128kB (UME) 0*256kB 0*512kB 0*1024kB 1*2048kB (H) 0*4096kB = 53260kB
Aug 21 12:32:59  kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Aug 21 12:32:59  kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Aug 21 12:32:59  kernel: 2640 total pagecache pages
Aug 21 12:32:59  kernel: 0 pages in swap cache
Aug 21 12:32:59  kernel: Swap cache stats: add 0, delete 0, find 0/0
Aug 21 12:32:59  kernel: Free swap  = 0kB
Aug 21 12:32:59  kernel: Total swap = 0kB
Aug 21 12:32:59  kernel: 475125 pages RAM
Aug 21 12:32:59  kernel: 0 pages HighMem/MovableOnly
Aug 21 12:32:59  kernel: 76879 pages reserved
Aug 21 12:32:59  kernel: 0 pages cma reserved
Aug 21 12:32:59  kernel: 0 pages hwpoisoned
Aug 21 12:32:59  kernel: Tasks state (memory values in pages):
Aug 21 12:32:59  kernel: [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
Aug 21 12:32:59  kernel: [    572]     0   572     5491      890    86016        0          -250 systemd-journal
...
Aug 21 12:32:59  kernel: [   6841]  5984  6841   685997   254053  2146304        0             0 beam.smp
Aug 21 12:32:59  kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=docker-d5d2eeeab1a5040463c16d98b5ae7cb83b7c629573c6058e697af962f6d7522a.scope,mems_allowed=0,global_oom,task_memcg=/system.slice/docker-d5d2eeeab1a5040463c16d98b5ae7cb83b7c629573c6058e697af962f6d7522a.scope,task=beam.smp,pid=6841,uid=5984
Aug 21 12:32:59  kernel: Out of memory: Killed process 6841 (beam.smp) total-vm:2743988kB, anon-rss:1016188kB, file-rss:8kB, shmem-rss:16kB, UID:5984 pgtables:2096kB oom_score_adj:0
Aug 21 12:32:59  systemd-journald[572]: Data hash table of /run/log/journal/1bdd8c2e8625485f9936c0b6e3b89a4f/system.journal has a fill level at 75.0 (5403 of 7203 items, 4149248 file size, 767 bytes per hash table item), suggesting rotation.
Aug 21 12:32:59  systemd-journald[572]: /run/log/journal/1bdd8c2e8625485f9936c0b6e3b89a4f/system.journal: Journal header limits reached or header out-of-date, rotating.
Aug 21 12:32:59  systemd[1]: docker-d5d2eeeab1a5040463c16d98b5ae7cb83b7c629573c6058e697af962f6d7522a.scope: A process of this unit has been killed by the OOM killer.
Aug 21 12:32:59  rsyslogd[644]: imjournal: journal files changed, reloading...  [v8.2102.0-105.el9 try https://www.rsyslog.com/e/0 ]

So the process was killed by the OOM killer.
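
Since the container itself only reports a silent Killed, a quick scan of the host log for OOM records can shorten this kind of debugging. The sketch below is an assumption about how one might do that; it only matches the "Out of memory: Killed process" line format shown in the log above.

```python
import re

OOM_RE = re.compile(
    r"Out of memory: Killed process (?P<pid>\d+) \((?P<name>[^)]+)\)"
    r".*?anon-rss:(?P<anon_rss_kb>\d+)kB"
)

def find_oom_kills(lines):
    """Yield (pid, process name, anonymous RSS in kB) for each OOM kill record."""
    for line in lines:
        m = OOM_RE.search(line)
        if m:
            yield int(m["pid"]), m["name"], int(m["anon_rss_kb"])

sample = [
    "Aug 21 12:32:59  kernel: Out of memory: Killed process 6841 (beam.smp) "
    "total-vm:2743988kB, anon-rss:1016188kB, file-rss:8kB, shmem-rss:16kB, "
    "UID:5984 pgtables:2096kB oom_score_adj:0",
]
print(list(find_oom_kills(sample)))  # [(6841, 'beam.smp', 1016188)]
```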

@todor-ivanov
Contributor

Indeed, my VM's profile was of type m2.small, which is:

  • 1 Core
  • 2 GB RAM

Meanwhile, according to the CouchDB documentation, the minimum recommended resources for CouchDB are:

  • 2 Cores
  • 4 GB RAM

Adding a swap file to the machine did fix the issue, for the time being.
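
The arithmetic behind this can be sketched as follows, assuming a 4 kB page size and taking the 2-core / 4 GB minimum quoted above at face value. The function names are illustrative only.

```python
PAGE_KB = 4  # page size assumed to be 4 kB, typical on x86_64

def pages_to_gib(pages, page_kb=PAGE_KB):
    """Convert a page count from the kernel log into GiB."""
    return pages * page_kb / (1024 * 1024)

def meets_minimum(cores, ram_gib, min_cores=2, min_ram_gib=4):
    """Check a host against the minimum resources quoted in this comment."""
    return cores >= min_cores and ram_gib >= min_ram_gib

ram = pages_to_gib(475125)  # "475125 pages RAM" from the kernel log above
print(f"{ram:.2f} GiB, sufficient: {meets_minimum(1, ram)}")
```

With roughly 1.8 GiB of RAM and no swap ("Total swap = 0kB"), a beam.smp process with about 1 GB resident leaves little headroom, which is consistent with the OOM kill.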

@todor-ivanov
Contributor

I just noticed that, because the PR for resolving this issue is in the CMSKubernetes repository, it cannot automatically resolve the current issue in the DMWM repository. So here is the link: dmwm/CMSKubernetes#1409

@khurtado Please consider giving it a try and eventually providing some initial feedback. I plan to apply a few things from what we have discovered for the MariaDB container here as well, though.

@amaltaro
Contributor Author

I just found some interesting documentation on this which might be useful:
https://github.com/apache/couchdb-docker

I spent only a few minutes reading it, but it made me wonder if we should run the service with the standard image user and work on the nodes puppet template to have that user added to the schedds and in the same group as the users we use to run WMAgent(?)

@todor-ivanov
Contributor

@amaltaro I'd prefer not to involve puppet in any of this. Let's resolve the users-related issue in the MariaDB container (see my comment there), and we will apply the same technique to all our containers for WMAgent.

@todor-ivanov
Contributor

Hi @khurtado, I realized I was not adding the local.ini file to the container, but was rather copying it manually into the config directory mounted from the host. I just changed that. Here is how I use the container now. The steps are almost the same as with mariadb:

user@vocms0290 wmagent-couchdb]$ ./couchdb-docker-build.sh -t 3.2.2
#0 building with "default" instance using docker driver

#1 [internal] load build definition from Dockerfile
#1 transferring dockerfile: 2.79kB done
#1 DONE 0.0s
...

cmst1@vocms0290:wmagent-couchdb $ docker image ls
REPOSITORY                        TAG       IMAGE ID       CREATED         SIZE
local/couchdb                     3.2.2     d2bcf802b682   6 minutes ago   819MB
local/couchdb                     latest    d2bcf802b682   6 minutes ago   819MB
local/wmagent                     2.3.0     d46ffbe2ca29   19 hours ago    2.05GB
local/wmagent                     latest    d46ffbe2ca29   19 hours ago    2.05GB
local/mariadb                     10.6.5    f05b41bf2cd3   23 hours ago    950MB
local/mariadb                     latest    f05b41bf2cd3   23 hours ago    950MB


cmst1@vocms0290:wmagent-couchdb $ docker kill couchdb
couchdb

cmst1@vocms0290:wmagent-couchdb $ ./couchdb-docker-run.sh -t 3.2.2
Starting the couchdb:3.2.2 docker container with the following parameters:  --user cmst1
088e1dbeb8e5d233a87551b007997066547cded33d0f826e2d7233c68750e8c8

cmst1@vocms0290:wmagent-couchdb $ docker ps 
CONTAINER ID   IMAGE                 COMMAND      CREATED         STATUS         PORTS     NAMES
088e1dbeb8e5   local/couchdb:3.2.2   "./run.sh"   4 seconds ago   Up 3 seconds             couchdb

cmst1@vocms0290:wmagent-couchdb $ docker exec -it couchdb bash
(CouchDB-3.2.2) [cmst1@vocms0290:data]$ 

(CouchDB-3.2.2) [cmst1@vocms0290:data]$ manage status  
ME : 3.2.2
TOP : /data
ROOT : /data/srv
CFGDIR : /data/srv/couchdb/3.2.2/config
LOGDIR : /data/srv/couchdb/3.2.2/logs
STATEDIR : /data/srv/couchdb/3.2.2/state
KEYFILE : /data/srv/couchdb/auth//hmackey.ini

COUCH_ROOT_DIR : /data
COUCH_BASE_DIR : /data/srv/couchdb
COUCH_STATE_DIR : /data/srv/couchdb/3.2.2/state
COUCH_INSTALL_DIR : /data/srv/couchdb/3.2.2/install
COUCH_CONFIG_DIR : /data/srv/couchdb/3.2.2/config

3.2.2 is NOT RUNNING.

(CouchDB-3.2.2) [cmst1@vocms0290:data]$ manage start
ME : 3.2.2
TOP : /data
ROOT : /data/srv
CFGDIR : /data/srv/couchdb/3.2.2/config
LOGDIR : /data/srv/couchdb/3.2.2/logs
STATEDIR : /data/srv/couchdb/3.2.2/state
KEYFILE : /data/srv/couchdb/auth//hmackey.ini

COUCH_ROOT_DIR : /data
COUCH_BASE_DIR : /data/srv/couchdb
COUCH_STATE_DIR : /data/srv/couchdb/3.2.2/state
COUCH_INSTALL_DIR : /data/srv/couchdb/3.2.2/install
COUCH_CONFIG_DIR : /data/srv/couchdb/3.2.2/config

Which couchdb: /opt/couchdb/bin/couchdb
  With configuration directory: /data/srv/couchdb/3.2.2/config
  With logdir: /data/srv/couchdb/3.2.2/logs
  nohup couchdb -couch_ini /data/srv/couchdb/3.2.2/config >> /data/srv/couchdb/3.2.2/logs/couch.log 2>&1 &


(CouchDB-3.2.2) [cmst1@vocms0290:data]$ manage status 
ME : 3.2.2
TOP : /data
ROOT : /data/srv
CFGDIR : /data/srv/couchdb/3.2.2/config
LOGDIR : /data/srv/couchdb/3.2.2/logs
STATEDIR : /data/srv/couchdb/3.2.2/state
KEYFILE : /data/srv/couchdb/auth//hmackey.ini

COUCH_ROOT_DIR : /data
COUCH_BASE_DIR : /data/srv/couchdb
COUCH_STATE_DIR : /data/srv/couchdb/3.2.2/state
COUCH_INSTALL_DIR : /data/srv/couchdb/3.2.2/install
COUCH_CONFIG_DIR : /data/srv/couchdb/3.2.2/config

3.2.2 is RUNNING
{"error":"unauthorized","reason":"Name or password is incorrect."}

@todor-ivanov
Contributor

todor-ivanov commented Mar 2, 2024

@amaltaro As we discussed yesterday, this:

(CouchDB-3.2.2) [cmst1@vocms0290:data]$ manage status 
...
{"error":"unauthorized","reason":"Name or password is incorrect."}

is the minor error I was talking about, the one I had solved 6 months ago without preserving the solution in the config files. I hope you've met that unauthorized error before. The same happens when I try to push the couchapps:

(CouchDB-3.2.2) [cmst1@vocms0290:data]$ manage pushapps
ME : 3.2.2
TOP : /data
ROOT : /data/srv
CFGDIR : /data/srv/couchdb/3.2.2/config
LOGDIR : /data/srv/couchdb/3.2.2/logs
STATEDIR : /data/srv/couchdb/3.2.2/state
KEYFILE : /data/srv/couchdb/auth//hmackey.ini

COUCH_ROOT_DIR : /data
COUCH_BASE_DIR : /data/srv/couchdb
COUCH_STATE_DIR : /data/srv/couchdb/3.2.2/state
COUCH_INSTALL_DIR : /data/srv/couchdb/3.2.2/install
COUCH_CONFIG_DIR : /data/srv/couchdb/3.2.2/config

Couchapps not found. Installing from latest WMCore tag.
/data/srv/couchdb/3.2.2/install/stagingarea/tmp /data
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  4340  100  4340    0     0  25086      0 --:--:-- --:--:-- --:--:-- 24942

Pulling couchapps version latest from Github...
2024-03-02 12:59:57 URL:https://codeload.github.com/dmwm/WMCore/tar.gz/refs/tags/2.3.1 [11601952] -> "2.3.1.tar.gz" [1]

....
    return CouchdbResponse(resp).json_body
  File "/usr/local/lib/python3.9/dist-packages/couchapp/client.py", line 62, in json_body
    raise Unauthorized(str(self.response))
couchapp.errors.Unauthorized: CouchDB Error! Exit code: None, Reason: <Response [401]>, Response: None
An error occured while cleaning the views. Please look in the CouchDB logs.

It seems I forgot to set the user password somewhere. Do you remember the previous configuration off the top of your head?
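
For reference, a 401 like the ones above means the request reached CouchDB without valid admin credentials. The sketch below shows how credentials are turned into an HTTP Basic `Authorization` header; it builds the request without sending it. That the status check queries `/_active_tasks` is an assumption, and the user name and password are placeholders.

```python
import base64
import urllib.request

def admin_request(url, user, password):
    """Build (but do not send) a GET request carrying HTTP Basic credentials."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(url)
    req.add_header("Authorization", f"Basic {token}")
    return req

# Placeholder credentials; the endpoint is an assumption about what is checked.
req = admin_request("http://localhost:5984/_active_tasks", "cmst1", "secret")
print(req.get_method(), req.full_url)
```

If the name is not listed in the `[admins]` section of the server configuration (or the password differs), CouchDB answers with the "Name or password is incorrect." error shown above.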

@khurtado
Contributor

khurtado commented Mar 4, 2024

@todor-ivanov I can confirm the setup is working as intended. Additionally, I went through that authorization issue and fixed it by using the COUCH_USER and COUCH_PASS information from the original WMAgent secrets file, in the following format in local.ini:

COUCH_USER = COUCH_PASS

Then, CouchDB encrypts COUCH_PASS when it runs for the first time.
Here is my output of status and pushapps successfully running (see below, at the end of the post).

With that said, should we require a new file CouchDB.secrets to get that information from, just for consistency with what we do with MariaDB (or we could just get it from the WMAgent secrets file as well, but the former seems more intuitive for someone new to understand what's happening in this WMAgent/MariaDB/CouchDB decoupled deployment approach)?

That is:

  1. We require a CouchDB.secrets with the COUCH_USER and COUCH_PASS admin information (like with MariaDB)
  2. In the *build.sh script, we read that information and modify local.ini accordingly
  3. Once couch starts, that information will be encrypted and the DB should start running properly
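
The first two steps above can be sketched as follows: read `KEY=VALUE` pairs from a secrets file and format the `user = password` entry for the `[admins]` section of local.ini. This is only an illustration of the idea, not the code in the build script; the file content and function names are hypothetical.

```python
def parse_secrets(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    secrets = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        secrets[key.strip()] = value.strip()
    return secrets

def admins_line(secrets):
    """Format the `user = password` entry for the [admins] section."""
    return f"{secrets['COUCH_USER']} = {secrets['COUCH_PASS']}"

# Hypothetical CouchDB.secrets content, for illustration only.
example = "# CouchDB.secrets\nCOUCH_USER=cmst1\nCOUCH_PASS=changeme\n"
print(admins_line(parse_secrets(example)))  # cmst1 = changeme
```
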
(CouchDB-3.2.2) [cmst1@vocms0265:data]$ manage status
ME : 3.2.2
TOP : /data
ROOT : /data/srv
CFGDIR : /data/srv/couchdb/3.2.2/config
LOGDIR : /data/srv/couchdb/3.2.2/logs
STATEDIR : /data/srv/couchdb/3.2.2/state
KEYFILE : /data/srv/couchdb/auth//hmackey.ini

COUCH_ROOT_DIR : /data
COUCH_BASE_DIR : /data/srv/couchdb
COUCH_STATE_DIR : /data/srv/couchdb/3.2.2/state
COUCH_INSTALL_DIR : /data/srv/couchdb/3.2.2/install
COUCH_CONFIG_DIR : /data/srv/couchdb/3.2.2/config

3.2.2 is NOT RUNNING.
(CouchDB-3.2.2) [cmst1@vocms0265:data]$ manage start
ME : 3.2.2
TOP : /data
ROOT : /data/srv
CFGDIR : /data/srv/couchdb/3.2.2/config
LOGDIR : /data/srv/couchdb/3.2.2/logs
STATEDIR : /data/srv/couchdb/3.2.2/state
KEYFILE : /data/srv/couchdb/auth//hmackey.ini

COUCH_ROOT_DIR : /data
COUCH_BASE_DIR : /data/srv/couchdb
COUCH_STATE_DIR : /data/srv/couchdb/3.2.2/state
COUCH_INSTALL_DIR : /data/srv/couchdb/3.2.2/install
COUCH_CONFIG_DIR : /data/srv/couchdb/3.2.2/config

Which couchdb: /opt/couchdb/bin/couchdb
  With configuration directory: /data/srv/couchdb/3.2.2/config
  With logdir: /data/srv/couchdb/3.2.2/logs
  nohup couchdb -couch_ini /data/srv/couchdb/3.2.2/config >> /data/srv/couchdb/3.2.2/logs/couch.log 2>&1 &
(CouchDB-3.2.2) [cmst1@vocms0265:data]$ manage status
ME : 3.2.2
TOP : /data
ROOT : /data/srv
CFGDIR : /data/srv/couchdb/3.2.2/config
LOGDIR : /data/srv/couchdb/3.2.2/logs
STATEDIR : /data/srv/couchdb/3.2.2/state
KEYFILE : /data/srv/couchdb/auth//hmackey.ini

COUCH_ROOT_DIR : /data
COUCH_BASE_DIR : /data/srv/couchdb
COUCH_STATE_DIR : /data/srv/couchdb/3.2.2/state
COUCH_INSTALL_DIR : /data/srv/couchdb/3.2.2/install
COUCH_CONFIG_DIR : /data/srv/couchdb/3.2.2/config

3.2.2 is RUNNING
No active tasks (e.g. compactions)
(CouchDB-3.2.2) [cmst1@vocms0265:data]$ ls
admin  certs  manage  run.sh  srv
(CouchDB-3.2.2) [cmst1@vocms0265:data]$ manage pushapps
ME : 3.2.2
TOP : /data
ROOT : /data/srv
CFGDIR : /data/srv/couchdb/3.2.2/config
LOGDIR : /data/srv/couchdb/3.2.2/logs
STATEDIR : /data/srv/couchdb/3.2.2/state
KEYFILE : /data/srv/couchdb/auth//hmackey.ini

COUCH_ROOT_DIR : /data
COUCH_BASE_DIR : /data/srv/couchdb
COUCH_STATE_DIR : /data/srv/couchdb/3.2.2/state
COUCH_INSTALL_DIR : /data/srv/couchdb/3.2.2/install
COUCH_CONFIG_DIR : /data/srv/couchdb/3.2.2/config

Couchapps not found. Installing from latest WMCore tag.
/data/srv/couchdb/3.2.2/install/stagingarea/tmp /data
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  4340  100  4340    0     0  25380      0 --:--:-- --:--:-- --:--:-- 25380

Pulling couchapps version latest from Github...
2024-03-04 21:02:14 URL:https://codeload.github.com/dmwm/WMCore/tar.gz/refs/tags/2.3.1 [11601952] -> "2.3.1.tar.gz" [1]

Pulling additional reqmon and t0_reqmon dependencies...

Downloading jquery-ui.min.js...
2024-03-04 21:02:14 URL:https://ajax.googleapis.com/ajax/libs/jqueryui/1.8.18/jquery-ui.min.js [201842/201842] -> "jquery-ui.min.js" [1]

Downloading jquery.min.js...
2024-03-04 21:02:14 URL:http://code.jquery.com/jquery-1.7.2.min.js [94840/94840] -> "jquery-1.7.2.min.js" [1]

Downloading Datatables...
2024-03-04 21:02:15 URL:https://datatables.net/releases/DataTables-1.9.1.zip [2415658/2415658] -> "DataTables-1.9.1.zip" [1]

Downloading YUI...
2024-03-04 21:02:15 URL:https://yui.github.io/yui2/archives/yui_2.9.0.zip [14294111/14294111] -> "yui_2.9.0.zip" [1]
/data/srv/couchdb/3.2.2/install/stagingarea/tmp/yui /data/srv/couchdb/3.2.2/install/stagingarea/tmp /data
/data/srv/couchdb/3.2.2/install/stagingarea/tmp /data
Removing old couchapps...
Installing new couchapps...
/data
Cleaning up!
Installing ACDC app into database: http://localhost:5984/acdcserver
Installing GroupUser app into database: http://localhost:5984/acdcserver
Installing ReqMgrAux app into database: http://localhost:5984/reqmgr_auxiliary
Installing ReqMgr app into database: http://localhost:5984/reqmgr_workload_cache
Installing ConfigCache app into database: http://localhost:5984/reqmgr_config_cache
Installing WorkloadSummary app into database: http://localhost:5984/workloadsummary
Installing LogDB app into database: http://localhost:5984/wmstats_logdb
Installing WMStats app into database: http://localhost:5984/wmstats
Installing WMStatsErl app into database: http://localhost:5984/wmstats
Installing WMStatsErl1 app into database: http://localhost:5984/wmstats
Installing WMStatsErl2 app into database: http://localhost:5984/wmstats
Installing WMStatsErl3 app into database: http://localhost:5984/wmstats
Installing WMStatsErl4 app into database: http://localhost:5984/wmstats
Installing WMStatsErl5 app into database: http://localhost:5984/wmstats
Installing WMStatsErl6 app into database: http://localhost:5984/wmstats
Installing WMStatsErl7 app into database: http://localhost:5984/wmstats
Installing T0Request app into database: http://localhost:5984/t0_request
Installing WorkloadSummary app into database: http://localhost:5984/t0_workloadsummary
Installing LogDB app into database: http://localhost:5984/t0_logdb
Installing WMStats app into database: http://localhost:5984/tier0_wmstats
Installing WMStatsErl app into database: http://localhost:5984/tier0_wmstats
Installing WMStatsErl1 app into database: http://localhost:5984/tier0_wmstats
Installing WMStatsErl2 app into database: http://localhost:5984/tier0_wmstats
Installing WMStatsErl3 app into database: http://localhost:5984/tier0_wmstats
Installing WMStatsErl4 app into database: http://localhost:5984/tier0_wmstats
Installing WMStatsErl5 app into database: http://localhost:5984/tier0_wmstats
Installing WMStatsErl6 app into database: http://localhost:5984/tier0_wmstats
Installing WMStatsErl7 app into database: http://localhost:5984/tier0_wmstats
Installing WorkQueue app into database: http://localhost:5984/workqueue
Installing WorkQueue app into database: http://localhost:5984/workqueue_inbox

@todor-ivanov
Contributor

todor-ivanov commented Mar 11, 2024

Hi @khurtado,
Sorry for the delay. I finally added the pieces which would implement the same logic for parsing and adding credentials to the CouchDB container, similar to the one used in the MariaDB container. Please give it a try.

P.S. I was hoping to merge this before our meeting today, but I could not provide the code in time. Anyway, we can still merge it later.

@khurtado
Contributor

khurtado commented Mar 11, 2024

Hi @todor-ivanov
I just tested and found a problem likely with sed.

I created the following files:

cmst1@vocms0265:admin $ pwd
/data/dockerMount/admin
cmst1@vocms0265:admin $ ls couchdb/
CouchDB.secrets
cmst1@vocms0265:admin $ ls wmagent/
WMAgent.secrets

There, I have COUCH_ROOT and COUCH_ROOTPASS in CouchDB.secrets and COUCH_USER and COUCH_PASS in the agent secrets.

Then I built and ran the container. I can verify those files were bind-mounted into /data/admin/{couchdb,wmagent} inside the container.

The content of /data/srv/couchdb/3.2.2/config/local.ini ended up wrong though, with the following format (note the duplicated couch user and couchRoot not being replaced):

[admins]
cmst1 = somepasshere
cmst1 = somepasshere
couchRoot = Fixme
; produser = FIXME
; unittestagent = passwd

Extra info:

CouchDB.secrets looks like this:

COUCH_ROOT=root
COUCH_ROOTPASS=somepasswordhere

WMAgent like this:

COUCH_USER=cmst1
COUCH_PASS=Somepasswordhereaswell

The original local.ini, of course, looks like this:

[admins]
couchRoot = Fixme
; produser = FIXME
; unittestagent = passwd
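
One way to avoid both symptoms (a duplicated entry and an unreplaced placeholder) is to make the update idempotent instead of relying on a pattern substitution. The sketch below is an assumption about how that could look, not the logic used in the container scripts; `set_admin` and its parameters are hypothetical.

```python
def set_admin(ini_text, user, password, placeholder=None):
    """Return ini_text with exactly one `user = password` entry in [admins].

    An existing entry for `user` (or for `placeholder`, e.g. the shipped
    default account) is replaced in place; otherwise the entry is appended
    to the section. Commented-out lines are left untouched. If there is no
    [admins] section, no entry is added.
    """
    out, in_admins, done = [], False, False
    for line in ini_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            if in_admins and not done:      # leaving [admins] without a match
                out.append(f"{user} = {password}")
                done = True
            in_admins = stripped == "[admins]"
        elif in_admins and "=" in stripped and not stripped.startswith(";"):
            name = stripped.split("=", 1)[0].strip()
            if name in (user, placeholder):
                if not done:
                    out.append(f"{user} = {password}")
                    done = True
                continue                     # drop duplicates of the entry
        out.append(line)
    if in_admins and not done:               # [admins] was the last section
        out.append(f"{user} = {password}")
    return "\n".join(out) + "\n"

original = "[admins]\ncouchRoot = Fixme\n; produser = FIXME\n"
print(set_admin(original, "cmst1", "changeme", placeholder="couchRoot"))
```

Running the function a second time on its own output leaves the file unchanged, so repeated container initializations would not accumulate entries.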

@todor-ivanov
Contributor

todor-ivanov commented Mar 11, 2024

thanks @khurtado

Well one thing is for sure:

The couchRoot user was not replaced because you did not configure couchRoot in your CouchDB.secrets, but rather root. This makes me think: shouldn't we enforce the same logic as with MariaDB, where the user running the service must also be the root (admin) user for the database? This also helps to keep the logic consistent across all services. That way, the local OS username and the service root (admin) user configured in the CouchDB.secrets file would be the same. This also simplifies the code, but requires an extra user to be configured in the WMAgent.secrets file, which most probably should not be put in the local.ini file, but rather pushed through the regular REST interface, as explained here: https://docs.couchdb.org/en/stable/intro/security.html#creating-a-new-user

As for the duplicated cmst1 account, it is indeed worth investigating. Could you please send me the logs as well?

  • If the container is still running:
docker logs couchdb
  • if not, maybe the init.log from your last CouchDB initialization may have survived:
cat   /data/dockerMount/srv/couchdb/current/logs/run.log
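
For the REST route mentioned earlier in this comment, CouchDB creates a regular user when a document with id `org.couchdb.user:<name>` is PUT into the `_users` database. The sketch below only builds that request; the user name `wmagentdb` and the password are hypothetical, and the request would have to be sent with admin credentials.

```python
import json
import urllib.request

def new_user_request(base_url, name, password, roles=()):
    """Build (but do not send) the PUT request that creates a CouchDB user."""
    doc = {"name": name, "password": password,
           "roles": list(roles), "type": "user"}
    return urllib.request.Request(
        url=f"{base_url}/_users/org.couchdb.user:{name}",
        data=json.dumps(doc).encode(),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )

# Hypothetical database user, for illustration only.
req = new_user_request("http://localhost:5984", "wmagentdb", "changeme")
print(req.get_method(), req.full_url)
```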

@khurtado
Contributor

@todor-ivanov Ahh, thank you!

I deleted the container, but once I changed COUCH_ROOT=couchRoot things worked fine and the couch user did not get duplicated either.
With that in mind, some small description like:

# COUCH_ROOT: admin user defined in the local.ini CouchDB configuration file (default: couchRoot)

would be useful.

The couchRoot user was not replaced because you did not configure couchRoot in your CouchDB.secrets, but rather root. This makes me think: shouldn't we enforce the same logic as with MariaDB, where the user running the service must also be the root (admin) user for the database? This also helps to keep the logic consistent across all services. That way, the local OS username and the service root (admin) user configured in the CouchDB.secrets file would be the same. This also simplifies the code, but requires an extra user to be configured in the WMAgent.secrets file, which most probably should not be put in the local.ini file, but rather pushed through the regular REST interface, as explained here: https://docs.couchdb.org/en/stable/intro/security.html#creating-a-new-user

I'm confused here. Let's say we run as cmst1. If we set the CouchDB admin to be cmst1, then who is the couchDB user?

@todor-ivanov
Contributor

todor-ivanov commented Mar 11, 2024

hi @khurtado

I'm confused here. Let's say we run as cmst1. If we set the CouchDB admin to be cmst1, then who is the couchDB user?

I get it, so am I. We need to invent a new one. So far (just like with MariaDB) we have never had a user with only database access rights. We were always boldly running with the server admin user, which also handled access from the WMAgent itself to the database. That was not a big deal as long as everything was running in the same installation. But now we are on the path of splitting the databases from the WMAgent service, so we had better have:

  • one user defined as a service admin user (the cmst1 one), defined in CouchDB.secrets and filled into local.ini
  • one user defined as a database user (we invent a new name), defined in WMAgent.secrets and configured only at the database level, using the REST interface provided

Then here are some more details on how to give that new user the proper database only access level.
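
As a rough illustration of database-only access: in CouchDB each database carries a `_security` object with `admins` and `members` lists, updated with a PUT to `/<db>/_security`. The sketch below prepares those bodies for a couple of databases; the user name `wmagentdb` and the choice of databases are assumptions, and nothing is sent.

```python
import json

def security_doc(member_names, admin_names=()):
    """Build a CouchDB _security object granting member access by user name."""
    return {
        "admins": {"names": list(admin_names), "roles": []},
        "members": {"names": list(member_names), "roles": []},
    }

def security_updates(base_url, databases, member):
    """Map each database's _security URL to the JSON body to PUT there."""
    body = json.dumps(security_doc([member]))
    return {f"{base_url}/{db}/_security": body for db in databases}

# Hypothetical user and database selection, for illustration only.
updates = security_updates("http://localhost:5984",
                           ["wmstats", "workqueue"], "wmagentdb")
for url in updates:
    print("PUT", url)
```

Members can read and write regular documents but cannot modify design documents, so pushing couchapps would still need an admin account.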

@amaltaro
Contributor Author

My experience with authz in CouchDB tells me that this setup isn't simple, as users need to be granted privileges by database (and I think the allowed operations as well).

Is there any problem in sticking with the current setup (single admin user for running the service and accessing the database)? If such changes are not required at the moment, then I would suggest to come back to this once it's no longer an urgent migration.

@khurtado
Contributor

@todor-ivanov Thank you! I think I get it now. In that case, I do agree that is the ideal but it should likely be separated from this current issue, which is trying to replicate exactly what we had but in containers, that includes software versions and user account setup.

@todor-ivanov
Contributor

hi @khurtado @amaltaro,

OK then, please take a look at my latest commit to the PR in the CMSKubernetes repository: dmwm/CMSKubernetes@ca5ddbc. It fixes the double account issue, and it now uses only whatever password is provided in the WMAgent.secrets file.

@todor-ivanov
Contributor

With all the comments from the PR review addressed, I have pushed one image to the registry as well. Here is the result of running it both from the local repository and from the CERN registry:

  • From local builds:
cmst1@vocms0290:wmagent-couchdb $ ./couchdb-docker-run.sh -t 3.2.2
Starting the couchdb:3.2.2 docker container with the following parameters:  --user cmst1
c891b19e299ecf85257c396348789aa99c3de21cceb2d5ea9b637e50d28cf77d

cmst1@vocms0290:wmagent-couchdb $ docker ps 
CONTAINER ID   IMAGE                         COMMAND      CREATED         STATUS         PORTS     NAMES
c891b19e299e   local/wmagent-couchdb:3.2.2   "./run.sh"   4 seconds ago   Up 3 seconds             couchdb
  • From CERN registry:
cmst1@vocms0290:wmagent-couchdb $ ./couchdb-docker-run.sh -t 3.2.2 -p
Pulling Docker image: registry.cern.ch/cmsweb/wmagent-couchdb:3.2.2
3.2.2: Pulling from cmsweb/wmagent-couchdb
Digest: sha256:3c6e1c80de0f736ad0da04cebe5cdfda6ebf9ea40c5b1fd81ad84024f7d1734c
Status: Image is up to date for registry.cern.ch/cmsweb/wmagent-couchdb:3.2.2
registry.cern.ch/cmsweb/wmagent-couchdb:3.2.2
Starting the couchdb:3.2.2 docker container with the following parameters:  --user cmst1
335c02fa07be2f7cea26ea03f337ec37cc6dd0ecec866c11f36e935bf3534a2f

cmst1@vocms0290:wmagent-couchdb $ docker ps 
CONTAINER ID   IMAGE                                    COMMAND      CREATED         STATUS         PORTS     NAMES
335c02fa07be   registry.cern.ch/wmagent-couchdb:3.2.2   "./run.sh"   5 seconds ago   Up 5 seconds             couchdb
cmst1@vocms0290:wmagent-couchdb $ dokcer exec -it couchdb bash
bash: dokcer: command not found
cmst1@vocms0290:wmagent-couchdb $ docker exec -it couchdb bash
(CouchDB-3.2.2) [cmst1@vocms0290:data]$ 
(CouchDB-3.2.2) [cmst1@vocms0290:data]$ manage status 
ME : 3.2.2
TOP : /data
ROOT : /data/srv
CFGDIR : /data/srv/couchdb/3.2.2/config
LOGDIR : /data/srv/couchdb/3.2.2/logs
STATEDIR : /data/srv/couchdb/3.2.2/state
KEYFILE : /data/srv/couchdb/auth//hmackey.ini

COUCH_ROOT_DIR : /data
COUCH_BASE_DIR : /data/srv/couchdb
COUCH_STATE_DIR : /data/srv/couchdb/3.2.2/state
COUCH_INSTALL_DIR : /data/srv/couchdb/3.2.2/install
COUCH_CONFIG_DIR : /data/srv/couchdb/3.2.2/config

3.2.2 is RUNNING

With this I'd say we are ready for merging and closing this issue as well.

@todor-ivanov
Contributor

resolved by: dmwm/CMSKubernetes#1409

Projects
Status: Done

3 participants