
Refactor WMAgent initialization #1410

Merged (3 commits) on Sep 26, 2023

Conversation


@todor-ivanov todor-ivanov commented Aug 23, 2023

fixes dmwm/WMCore#11627

READY

This PR introduces the following modifications:

  • Reads every parameter needed for the agent initialization process only from the WMAgent.secrets file and creates the config file based solely on those values (no parameters are read from the command line).

  • Removes the stage of initializing things in the container at runtime (like copying the WMAgent.secrets file from the host into the docker image on every start and working only with the copy).

  • Removes the HostAdmin mount point and leaves the container to interact directly with the WMAgent.secrets file provided at the host.

  • Improves the WMAgent.secrets validation, parsing, and loading mechanism.

  • Improves the container runtime wrapper to properly create current links to the actual wmagent tag to be used in the host mount area.

  • Adds a file with definitions of functions common between manage and run.sh.

  • Unifies the initialization steps validation between manage and run.sh into a single mechanism.

  • Splits actions between the run.sh and manage scripts such that:

    • run.sh takes care of creating and keeping track of all initialization and validation markers, while
    • manage acts solely as an executor of management commands and only checks whether a given management action is allowed.
  • Adds fine-grained initialization markers for every initialization step, rather than for full areas as before.

  • Removes most hard-coded paths and relies only on the relative paths created in the Dockerfile, stemming from a single $WMA_ROOT_DIR.

  • Exposes everything from the runtime on a proper (extended) set of mount points.

  • The above is done in parallel with a reorganization of the agent's directory structure, such as:

    • Removing extra nested levels. We no longer need
      * /data/srv/wmagent/current/install/{wmagent,mysql,couchdb} or
      * /data/srv/wmagent/current/config/{wmagent,mysql,couchdb}
      * /data/srv/wmagent/current/logs/{wmagent,mysql,couchdb}
      * /data/srv/wmagent/current/state/{wmagent,mysql,couchdb}
      because now every service is supposed to run in a separate container with its own mount point at the host, so we need the following structure instead:
      * /data/srv/wmagent/current/{install,config,logs,state} and
      * /data/srv/couchdb/current/{install,config,logs,state} and
      * /data/srv/mysql/current/{install,config,logs,state}
    • Adopting a common directory structure for all docker containers as described above, to be applied later to the wmagent-couch and wmagent-mysql containers as well.
  • Improves the variable naming convention by adding a WMA_ prefix to anything related to the agent container and environment.

  • Simplifies some manage functionalities and fixes others.

  • Adds minimal help to the manage script.

  • Adds automatic initialization of the whole environment at login time, which improves operations; e.g. the manage script usage is now as simple as shown in [1].

  • Removes management functions related to database manipulation from the manage script and prepares them to be adopted in the manage scripts of the relevant CouchDB and MariaDB docker containers. (I've already prepared a big portion of them, so half of the future work is done as well.)

  • Removes socket connections to the SQL database and connects only through 127.0.0.1.

  • Improves status checks for all databases, both relational and CouchDB.

  • Adds direct checks of the databases for both MySQL and Oracle, rather than, as of now, simply tracking the MySQL server process (its PID file is not reachable anymore).

    • Adds marking of the SQL database with the current WMAgent build id and hostname in order to prevent a container-to-database mismatch.
    • Adds a full schema validation at initialization time to protect database integrity.

[1]

$ docker exec -it wmagent bash
(WMAgent-2.2.3.2) [cmst1@:current]$ manage status 
----------------------------------------------------------------------
Status of services:
+ Status of CouchDB:
++ {"couchdb":"Welcome","version":"3.3.2","git_sha":"11a234070","uuid":"72e2936520d0a6104b7cfb64c986c6ab","features":["access-ready","partitioned","pluggable-storage-engines","reshard","scheduler"]}

+ Status of MySQL
Uptime: 139112  Threads: 1  Questions: 5744  Slow queries: 0  Opens: 151  Flush tables: 2  Open tables: 90  Queries per second avg: 0.041
+ MySQL connection is OK!

----------------------------------------------------------------------
Status of WMAgent components:
ERROR: This agent is not fully initialized. Cannot use it.
----------------------------------------------------------------------
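The reorganized per-service host layout described in the bullets above could be sketched as follows. This is a minimal illustration only: `BASE` stands in for the host mount area, and the handling of the `current` link is an assumption about what the runtime wrapper does, not the PR's exact code.

```shell
#!/bin/bash
# Sketch: one flat {install,config,logs,state} tree per service
# container, with "current" always pointing at the tag in use.
BASE=/tmp/dockerMount/srv   # stand-in for /data/dockerMount/srv
WMA_TAG=2.2.3.2

for service in wmagent couchdb mysql; do
    for d in install config logs state; do
        mkdir -p "$BASE/$service/$WMA_TAG/$d"
    done
    # -sfn replaces any pre-existing link, so re-running with a new
    # tag atomically repoints "current"
    ln -sfn "$BASE/$service/$WMA_TAG" "$BASE/$service/current"
done

ls "$BASE/wmagent/current"
```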

@todor-ivanov todor-ivanov changed the title Refactor WMAgent initialization && Fix WMA_TAG && Rename WMA_ROOT_DIR to HOST_MOUNT_DIR && Remove HOSTADMIN mount point. Refactor WMAgent initialization Aug 24, 2023
@todor-ivanov (Contributor Author)

hi @amaltaro, as last time, I have no rights in this repository to ask for a review; I do not know why. But anyway, things are starting to take shape now. Feel free to look at the code at your convenience.

@amaltaro (Contributor)

@todor-ivanov please consider adding a clear description of this pull request.
Once we settle with this development, please also update this wiki accordingly:
https://github.com/dmwm/WMCore/wiki/WMAgent-in-Docker

@todor-ivanov (Contributor Author)

hi @amaltaro, I will add the description shortly.
Indeed, the documentation in this wiki is quite old and actually relates to the really old image we had, based on the RPM packages. I can actually start modifying it immediately.

@todor-ivanov (Contributor Author)

@amaltaro , Sorry it took a while but I am done with the update of this PR description now. Please read it again.

@todor-ivanov (Contributor Author)

@amaltaro, @vkuznet, @khurtado

The code is ready and tested.
Please feel free to start reviewing it at your convenience.



### List of all config/.init* files
* config/.initActive
Collaborator

Todor, why do you name all init files with a dot prefix, making them hidden? Is there any concern about their visibility? Second question: where are these files, and what is their structure? Which env and other setup parameters do they set? Finally, is there any order to their initialization?

Contributor Author

Well, it has been like this before. And mostly, having them hidden prevents them from getting mixed up with the rest of the config and stateful files mounted at the host.

Contributor Author

where are these files?

Previously those were scattered all over the WMAgent deployment area, depending on which step they referred to. Now they are all concentrated in the config directory, meaning one can find all of them there:

  • From inside the container at /data/srv/wmagent/<WMA_TAG>/config/.init*
  • From the host at /data/dockerMount/srv/wmagent/<WMA_TAG>/config/.init*

@todor-ivanov (Contributor Author) Sep 5, 2023

Which env and other setup parameters do they set?

They hold only one variable, $WMA_BUILD_ID, which is the WMAgent image unique identifier set at build time.

Finally, is there any order in their initialization?

Yes, there is. I think I have listed them in the README in the order of initialization; if I have not, I will fix that.

```
[cmst1@unit02:current]$ docker kill wmagent

[cmst1@unit02:current]$ rm /data/dockerMount/srv/wmagent/<WMA_TAG>/config/.init*
```
Collaborator

It is unclear to me why the .init files should be removed.

Contributor Author

The general mechanism is as follows:
On every container restart, the run.sh script makes a full check of all .init* files, verifying that they:

  • are present and
  • contain the same wma_build_id
    If one of them is either missing or holding a different wma_build_id, it means the step has never been properly completed and needs to be performed during this restart (along with any other step depending on it).

The same applies to forced initialization: if one would (for any reason) want to repeat an initialization step together with all its dependent steps, one just needs to delete the proper .init file.
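A minimal sketch of this marker mechanism, assuming hypothetical variable names and paths (the helper names `_init_valid`/`_init_done` are illustrative, not the PR's exact code):

```shell
#!/bin/bash
# Sketch of the per-step initialization markers: a step counts as done
# only if its marker file exists AND holds the current image build id.
WMA_BUILD_ID="build-1234"          # set at image build time in the real setup
WMA_CONFIG_DIR=/tmp/wma-config    # stand-in for the real config directory

# Returns 0 if the step marker exists and matches this build id.
_init_valid() {
    local marker="$WMA_CONFIG_DIR/.init$1"
    [[ -s $marker && $(cat "$marker") == "$WMA_BUILD_ID" ]]
}

# Marks a step as completed for this exact image.
_init_done() {
    echo "$WMA_BUILD_ID" > "$WMA_CONFIG_DIR/.init$1"
}

mkdir -p "$WMA_CONFIG_DIR"
if ! _init_valid Agent; then
    echo "Step 'Agent' missing or stale - re-running initialization"
    # ... perform the actual initialization step here ...
    _init_done Agent
fi
```

Deleting a marker file (or rebuilding the image, which changes the build id) makes `_init_valid` fail, which is exactly what forces the step to be repeated on the next restart.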

docker/pypi/wmagent/Dockerfile (resolved)
@todor-ivanov (Contributor Author)

Hi @vkuznet,
I added the requested information to the README. You may find it in the latest version of the file: https://github.com/dmwm/CMSKubernetes/blob/7f775b51fd0cf810f2ee9cb6ca692634afbf214e/docker/pypi/wmagent/README.md Please go ahead and take another look.

@todor-ivanov (Contributor Author) commented Sep 8, 2023

Hi @amaltaro, @vkuznet, @khurtado,

A major update on this:
I went a little bit further in testing this setup.
I was brave and decided to use this containerized setup for one of my next debugging sessions. So I deployed the docker image to vocms0260, using the basic database setup provided with the other two temporary PRs I created for CouchDB and MariaDB. Then I connected the agent to cmsweb-testbed.cern.ch, and this way I tested not only the initialization process and all the scripts from this PR, but also successfully started injecting work into the agent.
Here is one example:

https://cmsweb-testbed.cern.ch/reqmgr2/fetch?rid=tivanov_TaskChain_Prod_SiteBlockedList_v1_230908_100533_7777

P.S. I keep the JobCreator down, as I need this workflow blocked and not progressing any further than WMBS, so no condor jobs will be created out of it; but I have no doubt that once I release this workflow it will go through successfully.

@vkuznet (Collaborator) left a comment

Todor, I went through this PR multiple times and it is very difficult to follow due to the large number of changes. I mostly paid attention to the high-level logic, which seems fine to me. That said, I still do not understand why you removed the couch and maria DB configs from the pypi/wmagent/etc area, as they are needed to start the corresponding DBs. My feedback on this PR is the following: if you managed to run it, we should merge and iterate on missing or misbehaving parts, managing them in smaller chunks. But if @amaltaro insists on a detailed review of every single line, then I bet it will require a few hours of careful screening, and we may still easily miss many things due to the large impact/changes to lots of different places. The more productive approach, it seems to me, is to apply it to the WMA deployment and verify how it goes.

@todor-ivanov (Contributor Author)

Thanks @vkuznet for your additional look into this PR.

I agree this is a big one; yes, it does change a lot of things, and yes, it does work well. I am already using this setup for my constant debugging on vocms0260. People may log in there, connect to the container, and play around so they can get used to the environment.

As for:

I still do not understand why did you remove couch and maria DB configs from pypi/wmagent/etc area as they are needed to start corresponding DB

Those are not needed in the wmagent image any more. They are supposed to be moved into the respective CouchDB and MariaDB images, which are the containers driving those databases. Those two configs are mostly related to the respective server's configuration and access rights, and hardly to the databases themselves. The latter are to be created during the initialization process, and the pieces exercising it are part of the wmagent code itself.

This is true at the very least for CouchDB. For the SQL wmagent database, there are two different cases covered:

  • Oracle - predefined instances at the server, managed per user and identified during connection (no direct equivalent of the MySQL database per se; the closest one should be a tablespace, where the wmagent schema should be created),
  • MariaDB - the container driving the SQL server needs to create the database at startup.

So, long story short: other than knowing which scripts from WMCore/bin are to be called at init time (through the command manage execute-agent ...), nothing more is needed in this container/image.

@amaltaro (Contributor) left a comment

Todor, in addition to the comments made along the code, I wanted to leave more general questions/concerns here:

  • Please make a PR against the WMCore repo updating the 2 config/secrets files under: https://github.com/dmwm/WMCore/tree/master/deploy
  • Please create a new WMCore GH issue reporting which Oracle functionalities are pending implementation (please list any pending MariaDB functionalities as well)
  • As mentioned along the code, the new relational database table - for container management - should be defined in the WMCore repository, likely under: https://github.com/dmwm/WMCore/tree/master/src/python/WMCore/Agent/Database
    • We can address it in 2 ways: update this PR and create the relevant WMCore PR; or create yet another WMCore ticket to re-refactor this. I would prefer the former, but given the chances of injecting issues into this PR, maybe we should stick to the latter.
    • In addition, I don't think it is a good idea to use case-sensitive table/column names, as both Oracle and MariaDB are not case sensitive. The best is to use underscores in the variable names, e.g. init_param.
  • Concerning the manage-common.sh script: right now (RPM) we need to source the "init.sh" agent file in order to get sqlplus/mysqldump/etc. in the environment. If a similar step is needed for the container, then please update the top of that script with the needed information.
  • As previously discussed, if it is simple enough, please provide a data flow diagram (even in ASCII is fine) with the order of scripts executed in a standard container initialization.
  • Is there any functionality currently present in https://github.com/dmwm/WMCore/blob/master/deploy/deploy-wmagent.sh which is not yet in run.sh? I guess one of those is the addition of HPC resources, but there could be others. I would suggest making a GH issue for this.

docker/pypi/wmagent/bin/manage (resolved)
docker/pypi/wmagent/bin/manage-common.sh (4 resolved review threads)

echo "Start sleeping now ...zzz..."

while true; do sleep 10; done
Contributor

Is this line just so the container keeps running? If that is the case, is there any reason to keep the start-agent line below?

Contributor Author

Yes, this line is just for keeping the container alive. (Which, BTW, leads to another change in wmagent_docker_run.sh, but I will fix that one separately.)

Regarding the line below - we had better remove it indeed.

Contributor Author

@amaltaro, I've always been under the impression that the names of those two scripts should be swapped. And now, as I look at the ASCII diagram showing the script relations, this impression only solidifies. But I did not proceed with the renaming, because it would smear all the changes and the review would become practically impossible. I may suggest that once we are ready for merging, as a last step, I swap the names of those two scripts in a single commit separate from the rest. What do you think?

Contributor Author

@amaltaro , Sorry if it was not clear enough - I was talking about swapping the names of: init.sh <--> run.sh .
Such that run.sh gets back as a default entry point for Docker and just calls init.sh and goes into an endless loop, while init.sh drives the whole initialization process.
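A rough sketch of that swapped layout (the file contents and paths here are illustrative assumptions, not the PR's actual scripts):

```shell
#!/bin/bash
# Sketch: run.sh as the Docker entrypoint, delegating all initialization
# to init.sh and then idling so the container stays up.
mkdir -p /tmp/wma-sketch

cat > /tmp/wma-sketch/run.sh <<'EOF'
#!/bin/bash
# Docker entrypoint: drive the full initialization once, then idle so
# the container stays alive and can be managed interactively.
/tmp/wma-sketch/init.sh || exit $?
echo "Start sleeping now ...zzz..."
while true; do sleep 10; done
EOF

cat > /tmp/wma-sketch/init.sh <<'EOF'
#!/bin/bash
# Drives the whole initialization process (validation, markers, etc.).
echo "initialization done"
EOF

chmod +x /tmp/wma-sketch/run.sh /tmp/wma-sketch/init.sh
```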

Contributor

@todor-ivanov apologies for the delay. You have a better view of this than me, so please go ahead if you consider it better. If you do so though, please make sure the README file reflects it as well.

docker/pypi/wmagent/install.sh (resolved)
@@ -176,7 +189,7 @@ echo "Start $stepMsg"
chmod +x $WMA_DEPLOY_DIR/deploy/renew_proxy.sh $WMA_DEPLOY_DIR/deploy/restartComponent.sh

crontab -u $WMA_USER - <<EOF
55 */12 * * * $WMA_DEPLOY_DIR/deploy/renew_proxy.sh
55 */12 * * * $WMA_MANAGE_DIR/manage renew-proxy
Contributor

Oh, this is where the continuous proxy renewal process is defined.

Contributor Author

yep :)

Contributor Author

And the reason why it lives in install.sh instead of run.sh is that it is predefined logic, independent of any runtime parameter. So there is no need for it to be part of the init process at all; it is easily definable at build time. While you are looking at the rest of the PR, if you notice more logic that is independent of any runtime or initial configuration parameters, I suggest you point it out and we can make the effort to move it here as well.

docker/pypi/wmagent/run.sh (resolved)
@@ -52,15 +52,15 @@ done

dockerOpts=" --network=host --progress=plain --build-arg WMA_TAG=$WMA_TAG "

docker build $dockerOpts -t wmagent:$WMA_TAG -t wmagent:latest .
docker build $dockerOpts -t local/wmagent:$WMA_TAG -t local/wmagent:latest .
Contributor

Please educate me on why prefixing it with local?

Contributor Author

I changed this line based on experience with the couchdb container, because there is an upstream repository called couchdb, which is actually the official one.

Same here - during local builds and tests, it would be good to be able to distinguish whether one runs from cmsweb/wmagent or local/wmagent.

@todor-ivanov (Contributor Author) commented Sep 14, 2023

Hi @amaltaro,

Thanks for this Review. I think I have addressed all requested code changes, but let me try to answer your general comments one at a time:

Please make a PR against the WMCore repo updating the 2 config/secrets file under: https://github.com/dmwm/WMCore/tree/master/deploy

Here it is: dmwm/WMCore#11717

Please create a new WMCore GH issue reporting what are the Oracle functionalities that are pending implementation (list any MariaDB pending functionalities as well please)

Here it is: dmwm/WMCore#11720

As mentioned along the code, the new relational database table - for container management - should be defined in the WMCore repository, likely under: https://github.com/dmwm/WMCore/tree/master/src/python/WMCore/Agent/Database

Here I'd choose your second suggestion: to create a new issue just for moving the metadata table creation under the WMCore repository.

Here it is: dmwm/WMCore#11721

Concerning the manage-common.sh script. Right now (RPM) we need to source the "init.sh" agent file in order to get sqlplus/mysqldump/etc in the environment. If a similar case is needed for the container, then please update the top of that script with the needed information.

With this implementation, init.sh is completely obsolete. Now the manage and run.sh scripts load the common definitions from manage-common.sh on every execution, and the set of environment variables comes mostly from three different places, as explained in the README:

  • variables set at build time by the Dockerfile
  • variables set at runtime by sourcing $WMA_ENV_FILE
  • variables set at runtime, defined inside the manage-common.sh file itself

Everything is supposed to be already in the environment, so as long as one sticks to using the manage script for executing commands, it will all be fine. There could be a few cases where one would want to execute some of those common functions directly in the shell instead of through manage, but this is fairly easy as well - the WMAgent.secrets file just needs to be loaded like this:

(WMAgent-2.2.3.2) [cmst1@unit02:current]$ source $WMA_DEPLOY_DIR/bin/manage-common.sh
(WMAgent-2.2.3.2) [cmst1@unit02:current]$ _load_wmasecrets
(WMAgent-2.2.3.2) [cmst1@unit02:current]$ _renew_proxy

This is also explained in the README file.
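For illustration, here is a minimal sketch of how such a secrets file could be parsed and validated. The helper name `_load_wmasecrets_sketch`, the file path, and the required-parameter list are assumptions for illustration, not the PR's exact implementation.

```shell
#!/bin/bash
# Sketch: load KEY=VALUE pairs from a WMAgent.secrets-style file and
# fail loudly if a required parameter is missing.
WMA_SECRETS_FILE=/tmp/WMAgent.secrets

cat > "$WMA_SECRETS_FILE" <<'EOF'
# example secrets file (values are dummies)
MYSQL_USER=wmagent
COUCH_PORT=5984
EOF

_load_wmasecrets_sketch() {
    # export every KEY=VALUE pair, skipping comments and blank lines
    local line
    while IFS= read -r line; do
        case "$line" in
            ''|'#'*) continue ;;
        esac
        export "$line"
    done < "$WMA_SECRETS_FILE"
    # validation: every required parameter must be set and non-empty
    local req val
    for req in MYSQL_USER COUCH_PORT; do
        eval "val=\$$req"
        [ -n "$val" ] || { echo "Missing required parameter: $req" >&2; return 1; }
    done
}

_load_wmasecrets_sketch && echo "secrets loaded"
```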

As previously discussed, if it is simple enough, please provide a data flow diagram (even in ASCII will be fine) with the order of scripts executed in a standard container initialization.

Here it is:

Docker entrypoint: init.sh

+--------------------+     +--------+     +------------------+
|    run.sh          | --> |init.sh | <-- | manage-common.sh |
| (goes to inf loop) |     |        |     +------------------+
+--------------------+     |        |                     /
                           |        |     +--------+     /
                           |        | --> | manage | <--+
                           |        |     | <Step> |
                           |        |     +--------+
                           |        | --> .init<Step>
                              ....
                           |        |
                           |        |     +--------+
                           |        | --> | manage |
                           |        |     | <Step> |
                           |        |     +--------+
                           |        | --> .init<Step>
                           +--------+


And it was also added to the README file as well.

Is there any functionality currently present in https://github.com/dmwm/WMCore/blob/master/deploy/deploy-wmagent.sh which is not yet in run.sh? I guess one of those is the addition of HPC resources, but there could be others. I would suggest to make a GH issue for this.

Here it is: dmwm/WMCore#11722

Alan, please feel free to look into this PR again, while I am working on preparing the new set of issues.

@todor-ivanov
Copy link
Contributor Author

todor-ivanov commented Sep 14, 2023

@amaltaro
I think I have addressed all of your requests for changes, and I have updated my previous comment to reflect the newly created issues and the new info, so please read it again. And feel free to take another look at the end result at your convenience.
Thanks!

BTW: there was only one remark for which I could not find what it relates to:

In addition, I don't think it is a good idea to use case sensitive table/column names, as both Oracle and MariaDB are not case sensitive. The best is to use underscore in the variable names, e.g. init_param

Following the same reasoning, I intentionally used only table and column names with underscores instead of case-sensitive naming. If you have found any such names, please let me know so I can fix them, but I personally double-checked and found none.

  • The table name is wma_init
  • The column names are as follows:
    • initparam
    • initvalue
  • The values preserved are as follows:
    • wma_build_id
    • hostname
    • is_active

@amaltaro (Contributor)

@todor-ivanov I don't understand your statement above. Did you keep:

The column names are as follows:
initparam
initvalue

or not? My suggestion was to actually rename them to init_param and init_value.

@todor-ivanov (Contributor Author) commented Sep 15, 2023

My suggestion was to actually rename them to init_param and init_value.

That's easy enough. I thought you were left with the impression that those names were following some case-sensitive naming convention, e.g. initParam. Never mind - I'll make the change in a minute.
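For reference, a hypothetical sketch of the resulting bookkeeping table, using the underscored names settled on in this thread (wma_init, init_param, init_value). The column types, the placeholder values, and the client invocation mentioned in the comment are assumptions, not the PR's actual schema.

```shell
#!/bin/bash
# Sketch: build the DDL/seed statements for the wma_init table that
# marks the database with the agent's build id and hostname.
wmaInitSQL=$(cat <<'EOF'
CREATE TABLE wma_init (
    init_param VARCHAR(100) NOT NULL,
    init_value VARCHAR(100) NOT NULL,
    PRIMARY KEY (init_param)
);
INSERT INTO wma_init (init_param, init_value) VALUES ('wma_build_id', '<WMA_BUILD_ID>');
INSERT INTO wma_init (init_param, init_value) VALUES ('hostname', '<hostname>');
INSERT INTO wma_init (init_param, init_value) VALUES ('is_active', 'true');
EOF
)

# In the container this would be fed to the SQL client, e.g. through
# the manage db-prompt command (exact invocation is an assumption).
echo "$wmaInitSQL"
```

On restart, comparing the stored wma_build_id/hostname against the container's own values is what detects a container-to-database mismatch.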

fi
[[ $errVal -eq 0 ]] && echo $WMA_BUILD_ID > $WMA_CONFIG_DIR/$service/.dockerInit
done
# TODO: This is to be removed once we decide to run it directly from the deploy area
@todor-ivanov (Contributor Author) Sep 15, 2023

@amaltaro, I'd like to draw your attention to these two lines below. To me they are obsolete. I kept them just for legacy reasons, but the way things are currently organized, there is no need to copy the manage script from $WMA_DEPLOY_DIR to $WMA_MANAGE_DIR; both are listed in the agent's PATH, so the script can easily be run from either location. The only difference is that if we use it from $WMA_DEPLOY_DIR, it is run from the docker image and never shows up in the host mount area ($WMA_MANAGE_DIR is currently equivalent to $WMA_CONFIG_DIR). The current situation gives us a little bit of freedom, though: we have the ability to test new changes or fix bugs in the manage script without the need to build a new container every time. And this was the only reason I did not remove this step at once. I'd like to hear your opinion on this one as well.

@todor-ivanov (Contributor Author)

@amaltaro @vkuznet @khurtado, yet another report of tests here:

As you know, I am already using this setup at vocms0260 for debugging. So far I have been able to test only the correct population of database tables and major functionalities such as workflow creation and synchronization between the central and local work queues, etc.

But today I also released the two paused workflows from the local queue and let them propagate down to condor. And here is the result:

cmst1@vocms0260:/data/dockerMount/srv/wmagent/2.2.3.2/install $ condor_q 


-- Schedd: vocms0260.cern.ch : <188.185.29.211:4080?... @ 09/16/23 03:01:40
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI    SIZE CMD
8209.0   cmst1           9/16 01:40   0+00:00:00 I  190000  0.0 submit.sh tivanov_TaskChain_Prod_SiteBlockedList_v1_230908_100533_7777-Sandbox.tar.bz2 1 0
8209.1   cmst1           9/16 01:40   0+00:00:00 I  190000  0.0 submit.sh tivanov_TaskChain_Prod_SiteBlockedList_v1_230908_100533_7777-Sandbox.tar.bz2 2 0
8209.2   cmst1           9/16 01:40   0+00:00:00 I  190000  0.0 submit.sh tivanov_TaskChain_Prod_SiteBlockedList_v1_230908_100533_7777-Sandbox.tar.bz2 3 0
8209.3   cmst1           9/16 01:40   0+00:00:00 I  190000  0.0 submit.sh tivanov_TaskChain_Prod_SiteBlockedList_v1_230908_100533_7777-Sandbox.tar.bz2 4 0
8209.4   cmst1           9/16 01:40   0+00:00:00 I  190000  0.0 submit.sh tivanov_TaskChain_Prod_SiteBlockedList_v1_230908_100533_7777-Sandbox.tar.bz2 5 0
8209.5   cmst1           9/16 01:40   0+00:00:00 I  190000  0.0 submit.sh tivanov_TaskChain_Prod_SiteBlockedList_v1_230908_100533_7777-Sandbox.tar.bz2 6 0
8210.0   cmst1           9/16 03:01   0+00:00:00 I  190000  0.0 submit.sh tivanov_TaskChain_Prod_SiteBlockedList_v3_230915_234534_345-Sandbox.tar.bz2 7 0
8210.1   cmst1           9/16 03:01   0+00:00:00 I  190000  0.0 submit.sh tivanov_TaskChain_Prod_SiteBlockedList_v3_230915_234534_345-Sandbox.tar.bz2 8 0
8210.2   cmst1           9/16 03:01   0+00:00:00 I  190000  0.0 submit.sh tivanov_TaskChain_Prod_SiteBlockedList_v3_230915_234534_345-Sandbox.tar.bz2 9 0
8210.3   cmst1           9/16 03:01   0+00:00:00 I  190000  0.0 submit.sh tivanov_TaskChain_Prod_SiteBlockedList_v3_230915_234534_345-Sandbox.tar.bz2 10 0
8210.4   cmst1           9/16 03:01   0+00:00:00 I  190000  0.0 submit.sh tivanov_TaskChain_Prod_SiteBlockedList_v3_230915_234534_345-Sandbox.tar.bz2 11 0
8210.5   cmst1           9/16 03:01   0+00:00:00 I  190000  0.0 submit.sh tivanov_TaskChain_Prod_SiteBlockedList_v3_230915_234534_345-Sandbox.tar.bz2 12 0

Total for query: 12 jobs; 0 completed, 0 removed, 12 idle, 0 running, 0 held, 0 suspended 
Total for all users: 12 jobs; 0 completed, 0 removed, 12 idle, 0 running, 0 held, 0 suspended

I couldn't be happier... Hope you like it as well.

@amaltaro (Contributor)

@todor-ivanov we have 38 commits in this PR, which will make our life miserable if we need to go through the history of these files. On the other hand, I'd rather not risk squashing commits and losing developments along the way. Please let me know how you would like to proceed with the merge of this PR.

Note that it is not ready yet and there is one change that you wanted to make (swap run.sh and init.sh). So let us wait for that one.

@todor-ivanov (Contributor Author) commented Sep 25, 2023

Thanks for taking yet another look @amaltaro

  • First bullet. About:

we have 38 commits in this PR, which will make our life miserable if we need to go through the history of these files. On the other hand, I'd rather not risk squashing commits and losing developments along the way

Well, I do not think there is any groundbreaking history in this development that would be lost, or any previous state we would be unable to restore if needed. So I went ahead and squashed those 38 commits into 2: one for code changes and one for README updates.

  • Second, I am about to swap the names of those two files, run.sh and init.sh, because this will make the startup process clearer and all the containers' entrypoints will be aligned.
  • Third, while working on the above I found why all Condor jobs were failing with error code 99303: it was due to a path misalignment between condor and the docker container. I fixed that with my latest commit. I will squash the rest of the code commits later.

@amaltaro (Contributor)

@todor-ivanov okay, please let me know when it's ready for a final look.

@todor-ivanov (Contributor Author)

hi @amaltaro
Everything is now ready and tested. I have left the 3 new commits separate, so you may take a final look. Once you approve them, I can squash everything again to only 2 commits before the merge.

@amaltaro (Contributor)

@todor-ivanov, regarding your last commit to renew the proxy once it is below 24h: please change it to 7 days.

@todor-ivanov (Contributor Author)

hi @amaltaro

for your last commit to renew proxy once it is below 24h. Please change it to 7 days.

Done

@amaltaro (Contributor)

Thanks Todor. Please squash the commits accordingly and we can proceed with merging it.

… to HOST_MOUNT_DIR && Remove HOSTADMIN mount point.

Typo

Add all files needed for the new initialisation logic.

Remaking the interaction between run and manage scripts

Move *-new files as defaults && Remove unneded files.

Move _renew_proxy to manage-common && Add manage renew-proxy command && Fix broken _parse_wmasecrets call && Fix broken aliases && Strip login variables and exports from WMA_ENV_FILE

Remove redundant parameter to db-prompt && Fix password typos

Fix missing return value from init-agent

Add database status functions

Fix forgotten return from bad WMAgent.secrets parsing

Start building the full database checks mechanisms before agent startup

Move _status_of_couch to manage-common && call it from run.sh

Add rough, and currently broken, mechanism for databse validation

Add check for empty database  for mysql && reduce wmagent-docker* paramters && improve all functions' printout messages

Implement clean-mysql && Reduce global definitions in manage script

Remove wmaInitSqlDB when cleaning the whole agent as well.

Reorder database checks and agent activation && Fix COUCH_HOST to be read from WMAgent.secrets && Improve printouts

Write WMA_BUILD_ID and hostname at the sql database.

Implement _sql_dbid_valid

Move wmagent database creation to init-agent

Stop cleaning databases on WMAgent.secrets file change && Fix RUCIO_HOME && Fix default wmaSchemaFile

Implement a single function _exec_mysql to use for all calls to mysql client

Fix _renew_proxy

Switch -v to -t option in install.sh.

Remove unneeded check_services function && Start the agent on a restart upon full intialization

Cleanup manage script from old comment lines

Fix sql schema dump options

Add better pager options to the mysql db-prompt

Addressing review comments

Fix renamed variables from MYSQL_ADDRESS to MYSQL_HOST

Rename wma_init columns

Fix broken tree at the host for condor jobs.

Fix broken tree at the host for condor jobs one level down - at /data/srv/wmagent instead of /data/srv .
Update README

Update README

Update README

Update README

Update README

Adding ASCII diagram to the README file && Cleaning comments && Removing a redundant base check.

Update README
Renew myProxy 7 days before expiration

Renew myProxy 7 days before expiration
@todor-ivanov (Contributor Author)

hi @amaltaro

Please squash the commits accordingly and we can proceed with merging it.

Done. I left the commit swapping run.sh with init.sh separate, though. It is a significant change which is worth having in its own commit. I actually failed one attempt to move it up over the Update README commits and decided not to push too hard at the very end of the whole process. So let them be as they are (the 3 commits, I mean).

@amaltaro (Contributor)

Thanks Todor.


Successfully merging this pull request may close these issues.

Refactor/reorganize agent initialization and manage actions
3 participants