implement lightweight CRAB distribution as part of CMSSW distribution #4844
Update from November 6th: |
@lecriste @belforte please see cms-sw/cmsdist#5356 Once the cmsdist PR is merged then new crab wrapper scripts ( https://github.com/cms-sw/cms-common/blob/master/common/crab-prod {dev,pre} ) will be available under @belforte , this only solves the command-line usage of crab-client. The |
By the way, as mentioned during our meeting. Can we change crab so that the crab wrapper script always unset cmssw env (i.e by running |
hmmm... how would crab find python and pycurl then ?
Sorry if I did not understand things in the vidyo meeting. Was this discussed with Leo and Marco ?
…On 15/11/2019 07:08, Malik Shahzad Muzaffar wrote:
By the way, as mentioned during our meeting. Can we change crab so that the crab wrapper script always unset cmssw env (i.e by running |eval scram unset -sh|) and crab python sets it back when ever it has to deal with cmssw?
|
Sorry for poor communication. There can be a lot of python files involved,
but AFAIK the only binary dependence is pycurl. We do not use cjson anymore,
I removed references from the code but did not think about the spec file,
should I ? Could Leo ?
If further simplifications are needed, we will look into it.
Can you elaborate on the problem with CRAB python API ?
AFAIU (and I can be wrong, which is why I asked Marco's opinion) all that
CRAB needs after cmsenv is to put a bunch of stuff on $PYTHONPATH
And nothing in that stuff should have same names, and hence possibly
conflict, with CMSSW. What am I missing ?
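The "nothing should have the same names" condition above can be checked mechanically. The following is only a sketch: `top_level_names` and `find_conflicts` are hypothetical helpers, not part of CRAB or CMSSW, and the directory lists are assumed to be whatever each side would put on $PYTHONPATH.

```python
import os


def top_level_names(directory):
    """Return the importable top-level names (modules and packages) in a directory."""
    names = set()
    for entry in os.listdir(directory):
        path = os.path.join(directory, entry)
        if os.path.isdir(path) and os.path.exists(os.path.join(path, "__init__.py")):
            names.add(entry)          # a package
        elif entry.endswith(".py"):
            names.add(entry[:-3])     # a plain module
    return names


def find_conflicts(crab_dirs, cmssw_dirs):
    """Return the sorted names that are importable from both sets of directories."""
    crab_names = set()
    for d in crab_dirs:
        crab_names |= top_level_names(d)
    cmssw_names = set()
    for d in cmssw_dirs:
        cmssw_names |= top_level_names(d)
    return sorted(crab_names & cmssw_names)
```

An empty result from `find_conflicts` would support the claim that adding the CRAB directories to $PYTHONPATH cannot shadow anything from CMSSW, at least at the level of top-level module names.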
…On 15/11/2019 07:06, Malik Shahzad Muzaffar wrote:
My initial understanding was that there are a few simple python scripts for crab client which only depend on pycurl, but it turned out (trying to build https://github.com/cms-sw/cmsdist/blob/comp_gcc630/crabclient.spec ) that it depends on many python and DMWM packages. For now I have simplified the build recipe ( https://github.com/cms-sw/cmsdist/pull/5356/files#diff-7d6e1f3d0fe3aaef69481cabb9985504 ) to just get directly the DMWM tools and only depend on |py2-pycurl py2-python-cjson|
|
crab wrapper script finds the latest version of I am not sure how exactly crab uses cmssw? If it is just running some cmssw commands to dump cmssw configuration then unsetting cmssw env in crab wrapper and setting it again (in sub-shell/process) before calling cmssw will work. But if crab imports cmssw python configuration then it will not work. |
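A rough sketch of the "unset in the wrapper, set again in a sub-process" idea, assuming CRAB only needs to run CMSSW commands rather than import CMSSW python. The prefix-based filter is a stand-in for what `scram unset -sh` does in a real shell, and both helper names are hypothetical.

```python
import os
import subprocess
import sys


def without_prefixes(env, prefixes):
    """Return a copy of env without variables whose name starts with any prefix."""
    prefixes = tuple(prefixes)
    return {k: v for k, v in env.items() if not k.startswith(prefixes)}


def run_with_env(cmd, env):
    """Run cmd in a child process with exactly this environment; return stdout as text."""
    return subprocess.check_output(cmd, env=env).decode()


# Remember the full (CMSSW-enabled) environment before cleaning it.
full_env = dict(os.environ)
# Assumption: variables starting with CMSSW_ stand in for the CMSSW environment.
clean_env = without_prefixes(full_env, ["CMSSW_"])

# The client itself would run with clean_env; only the step that needs
# CMSSW gets full_env back, in its own child process.
probe = [sys.executable, "-c", "import os; print('CMSSW_BASE' in os.environ)"]
print(run_with_env(probe, clean_env).strip())
```

As noted above, this only helps if CMSSW is used through external commands. If the client imports the CMSSW python configuration in its own process, the parent environment must already contain the CMSSW paths.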
problem with PYTHON API is that when someone imports |
one thing at a time
|
One problem here is that there's a large distance between what's needed to make CRABClient run, and what's declared in current spec files, and what's defined in current setup script. FWIW I could run crab submit, crab status and crab checkwrite (hence I suspect everything) after:
some additional definitions will be needed to e.g. make command autocompletion work, but I think that's what I mean when I say "CRAB Client is just a set of python scripts + pycurl". The question is: what's the simplest way to configure things so that it can easily be set up and run. |
The "python version" question needs to be clarified. |
I do not think we have any active cmssw release where we have python 2.4. CMSSW_5_3 uses python 2.6 |
|
As long as it is possible, it should not be hard !
Would be good to know @amaltaro 's plans on this.
…On 18/11/2019 03:25, Malik Shahzad Muzaffar wrote:
|3 python version:|
If you have two different versions of CRAB for python2 and 3 then yes it will be hard to set up the env. How hard is it to have crab client (and its dependencies e.g. CRABServer, DBS and WMCore) support both versions of python?
|
Our goal for WMCore is to have it supporting both python2.7 and python3.x, at least until we can fully migrate it to python3. |
Great. I guess that if you ever decide to make http request stuff (pycurl wrappers)
incompatible with python 2.7, we can fork a frozen version inside CrabClient
repository. I guess we need to make our core py2/3 compatible, which for
some reasons was not fully done, iteration on dictionaries key/val is
still in py27 format. We'll see.
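For the dictionary iteration point, a minimal illustration of a form that runs unchanged on python 2.7 and python 3 (`summarize` is a made-up function, not CRAB code):

```python
def summarize(counts):
    """Return sorted 'key=value' strings for a dictionary."""
    # Python 2 only: counts.iteritems() (removed in Python 3).
    # Portable: counts.items() is a list on 2.7 and a view on 3; both are iterable.
    return sorted("%s=%s" % (key, value) for key, value in counts.items())


print(summarize({"running": 3, "failed": 1}))
```

On python 2.7 `items()` builds a full list, which costs some memory on very large dictionaries but is usually negligible for client-side code.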
|
Crab client is now available
Can you please give it a try and see if you can run most of crab commands? [a]
|
In order to update crab in cmssw distribution one needs to request changes via PR for
For now all of these are using crab client |
I can't understand this: system pycurl does not have proper SSL support. Why do we build our own otherwise ? |
we should ask Yuyi to remove cjson from dbs. I'll follow up. |
I tested crab-prod and while other commands seemed to work, submit fails
|
@lecriste does it work for you ? |
@belforte , pycurl just needs to load libcurl.so. Since we added the curl dependency for crab, pycurl will load the cms libcurl (which has SSL support) [a] System pycurl and system curl
[b] System pycurl and our curl
|
Thanks, I think I understand the point about pycurl now. I am fine with keeping CRAB Client compatible with system python. I am also a bit puzzled how I can go about testing new code: should I make a personal copy of /cvmfs/cms-ib.cern.ch/week1/common/_crab-startup and edit it to point PYTHONPATH to my git area ? Simply stated, this looks different from what we talked about, or at least what I had in mind, and I am lost. |
We can move https://github.com/cms-sw/cmsdist/blob/IB/CMSSW_11_1_X/master/_crab-startup.file in to crab repository and make it work in standalone mode too. By the way, I noticed that crab needs to have cmssw PYTHONPATH available when it reads cmssw configuration. That was the reason you were not able to submit a job as mentioned here #4844 (comment) ) . I have proposed two changes
New crab startup script to set With these two changes I was able to submit a test job [a]
|
All in all "starting with 2020" is a good point to change naming standard. We could e.g. drop the |
@smuzaffar see answers/comments inline
OK. But surely it will be good to remove those old commands (which I did not even know existed !), who can do this ?
Let's stay with an explicit crab-setup for a while. In a few months, when user have transitioned and
Thanks ! |
@belforte , the old commands are part of new distribution mechanism and were installed as a part of CMSSW_11_1_0_pre1 release. New crab version (and its |
@smuzaffar as I see that you are going ahead with more work here, let me ask before you finalize next release: |
@belforte , I think I am mostly done with the changes. I was just waiting for CMSSW 11.1.0.pre2 to be out so that it can deploy the new crab-* versions. |
@smuzaffar do you know if anything changed in |
No, nothing changed due to new crab. Looks like /cvmfs/cms.cern.ch/crab3/crab_standalone.sh is missing or env CRAB_SOURCE_SCRIPT is not set properly before sourcing crab.sh. I have no idea who creates/deploys this file
|
looks like there are broken symlinks
|
Ok, hopefully Bockjoo can help there. |
yes this is expected ... crab-prod/pre/dev commands which are available via /cvmfs/cms.cern.ch/common currently only work for slc7. No one should use these and we should not advertise these new commands to users. New crab packaging in cmssw should work for any arch (slc6, slc7 , cc8) but those will only be available once cmssw 11.1.0.pre is deployed. Still I would suggest not to tell users to use new commands, I would like you to test them before you inform cms users about these |
@smuzaffar @amaltaro I have found a small sticky point. I see 4 ways out and would like your opinion on which to choose or to suggest a better one
Kindly let me know. I think we can build CRABClient with old WMCore 1.2.8 until this is sorted out, but should not get locked out of WMCore updates. [1]
|
Currently my preference is for 3. so to give me time to do 2. As to going to dasgoclient... it is appealing, but of all tools around DBS Python API is the one with the best documentation IMHO, so it looks faster. The only drawback wrt using dasgoclient is that cjson is needed, but it is there already. |
opts 3 and 4 are not going to fix existing cmssw versions. Better to go for 1 |
I'm not very fond of it, but another possibility would be to actually move that function |
By the way, new crabclient 3.3.2001 has been distributed via CMSSW 11.1.0.pre2 and is available on lxplus (under In order to use crabclient python API, one needs to Can you please give it a try on lxplus. Again this is still under tests so please do not recommend these |
Thanks. Yes, I will test carefully and get back to you. I will also make a PR to roll back WMCore from 1.2.9 to 1.2.8 until I have changed the code in order not to need py2-retry. I will implement either 1. or 2. in my list. No action needed from @smuzaffar nor @amaltaro |
@smuzaffar crab commands autocompletion does not work. I guess this line Of course I keep testing by sourcing myself the |
@smuzaffar aside from "it works", about the overall design, i.e. how it is supposed to work: I do not see it as user-friendly that one has to source some script only when using the python API. Is this because of the worry that adding CRAB stuff to PYTHONPATH may break something in CMSSW ? In any case it may easily lead to confusion on the user side and at minimum a lot of run/fail/oops/setup/runagain/swear-and-curse, if not complaints. I would find it simpler if we start like now: user needs to source something after "cmsenv" to use CRAB (also avoids that they source current setup on top of what is in CMSSW). Then if/when you are confident that CRAB does not break anything, we can make it part of cmsenv and dummyfy the setup script. Or am I confused and missing some important fact ? |
@belforte , looks like the new cms-common package was not installed for /cvmfs/cms.cern.ch . cms-common is a special package and needs special treatment from Bockjoo. I am going to ask him to update it. New cms-common should automatically source the autocomplete scripts for crab-* commands (/cvmfs/cms.cern.ch/share/etc/profile.d/S99crab-env.sh ). |
No, you did not confuse me. Yes, for users the instructions will be that they have to source some setup scripts. But during this test phase, I wanted to check the proof of concept that we can run command-line crab client without sourcing anything. For now sourcing /cvmfs/cms.cern.ch/common/crab-setup.sh only updates PYTHONPATH that is why I asked to source it (for your testing) for python API tests. But there is no harm to source it always. |
Thanks
|
@smuzaffar Bockjoo updated cvmfs and now the command completion works. I presume this is intentional, even if a bit confusing. I am indeed intrigued by the magic in was prepared to tell users to fetch them from GitHub, or to clone them to CRABClient/bin or whatever (they are extremely stable), but how do we run them ? Somehow inserting |
@belforte , Yes, the python path is intentional. For the same reason above, I only made CRABAPI and CRABClient available via PYTHONPATH (after sourcing crab-setup.sh). This distribution is supposed to provide the CRAB* API and not dbs. So users should import CRAB* first :-) |
I understand. Thanks. |
I have updated the spec for dev with the new CRABClient tag which should work with new WMCore. cms-sw/cmsdist#5501 |
I have started the test. For now the tests only make sure that we can build it. If you can provide us a simple
you can also provide a python script to test CRAB Python API functionality. By the way, why does crab always create a crab.log file? Sometimes I run it from a cvmfs directory and it fails to create this log file [a]. [a] |
thanks. Testing will hopefully come up care of our new operator.
About the log.. well it has been like this since ever, we try to keep
stdout terse but when someone reports a problem the full log is needed.
I do not know if there's a way to avoid it, but if useful it can be added,
I will have a look.
In any case things appear to work quite well. I'd like to understand
the process now: how/when get things deployed, what is automatic and
what requires action, and again "what's the best way to use my
git clone in place of the libraries in CVMFS for developing",
and how do we roll back in case we screw up.
|
There is no option in CRABClient to disable the log file. Let me know if you need it. I think it could be rather easy via an env.var, otherwise I have to learn the part of CRAB code which parses command arguments and add an option. |
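The env.var route could look roughly like the sketch below. This is not CRABClient code: `CRAB_DISABLE_LOGFILE` is an invented variable name, `make_logger` is a hypothetical helper, and the fallback for an unwritable directory (for example a read-only CVMFS area) is an extra assumption.

```python
import logging
import os


def make_logger(logfile="crab.log", environ=None):
    """Create a logger that writes to logfile unless disabled or not writable."""
    environ = os.environ if environ is None else environ
    logger = logging.getLogger("crab_sketch")
    logger.setLevel(logging.DEBUG)
    # Drop handlers left over from a previous call so output is not duplicated.
    for old in list(logger.handlers):
        logger.removeHandler(old)
        old.close()
    # CRAB_DISABLE_LOGFILE is a hypothetical name, not an existing CRAB option.
    if environ.get("CRAB_DISABLE_LOGFILE"):
        logger.addHandler(logging.NullHandler())
        return logger
    try:
        handler = logging.FileHandler(logfile)
    except (IOError, OSError):
        # The file cannot be created here (e.g. read-only directory): log nowhere.
        handler = logging.NullHandler()
    logger.addHandler(handler)
    return logger
```

With something like this, setting the variable in the shell before running a command would skip the file entirely, while the default behaviour (full log on disk for problem reports) stays unchanged.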
Everything is now working satisfactorily in my opinion, and the new procedures to change, deploy and use CRAB Client are documented in the twiki Latest CRAB Client works with latest WMCore (1.2.9) and in crab report compares results from DBS Python API and DASGOCLIENT. When I am sure of my handling of dasgoclient, we can stop using DBS API in the client, but keep it as a dependency for distribution. Thanks all |
As per kickoff chat on Oct 30. Participants were:
Stefano, Marco Mascheroni, Leonardo, Bockjoo, Shahzad, David Lange.
Notes from the chat from Stefano. If anything wrong is written, my fault.
These are also in the Minutes in Indico:
https://indico.cern.ch/event/859940/
Thanks Leonardo for taking responsibility for this from now on.
Crab Client is python 2.7 only, with pycurl as its only dependency.
Crab developers do not want to bring any binary.
We agreed to try to move CRAB Client into an external distributed via CMSSW
distribution release. The idea is that CRAB Client and CMSSW can be
executed in the same environment. Many users have python scripts where
they call CRAB python API while manipulating CMSSW configuration, we
do not want to break those.
CrabClient will not be part of the release, so CRAB version stays the same whichever
CMSSW version people use. CRAB version will be updated when the distribution is
updated, i.e. at the time of any new release build (usually every two weeks).
The distribution update will push out whatever CRAB tag we set in the spec file
(in CMSSW repository) via a PullRequest. The new version will be available in the
IB in the meanwhile.
Given the above constraints we will have two versions of CRAB Client in the
CMSSW distribution: pre and prod, or beta and prod, or whatever... which we
will update via separate PR. So that we can update the PRE, make sure it
works, and only when sure, update production. If we push a bug in production
we may need to wait two weeks for the fix !!
If this works, Bockjoo can stop the current machinery for CRAB and will not need
to do anything.
We will most likely need to change/add some setup script in CRAB, details need
to be figured out.
Marco Mascheroni is available to help. Shahzad is willing to help us write
the needed spec file.
Leonardo is in charge of this effort and will communicate progress or problems.
Next week Marco will be at CERN and Leonardo, Shahzad and Marco will get
together to get things started.
We do not need to have python3 compatible version. But of course in time
we may want to.