new component Tier0Feeder #1960
Comments
metson: Milestone T0 2_0_0 deleted |
hufnagel: For each run/stream combination we need to set up a fileset, a subscription and either a repack or an express workflow. The association between run/stream and subscription is kept in T0AST in a run_stream_sub_assoc table; the association to the fileset is via another join on the wmbs_subscription table. New unused streamer files will automatically (all inside a single Oracle query) be fed into the correct target fileset based on the run_stream_sub_assoc table. New streamer files without an association in the run_stream_sub_assoc table will trigger a RunConfig population to set up the needed filesets/subscriptions/workflows and associations. When a run ends and closes out, this is passed along to repacking and express processing by closing the filesets. The job splitters check the open status of their input filesets and take the appropriate actions. This also means that we eliminate the concept of "late" arriving data. First, there shouldn't be any, because with run and lumi closing we should have a full accounting from the StorageManager. Beyond that, even if the transfer system inserts streamer files into the Tier0, they will not be processed until they have been fed into the repack/express input filesets, and that does not happen after the filesets are closed. |
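A minimal sketch of the feeding logic described above, using an in-memory SQLite database in place of Oracle/T0AST. Only the table names run_stream_sub_assoc and wmbs_subscription come from this ticket; the `streamer`, `wmbs_fileset` and `wmbs_fileset_files` layouts, all column names, and the helper functions are assumptions made for illustration and are not the actual T0AST schema or the Tier0Feeder DAO.

```python
import sqlite3

# Hypothetical, heavily simplified schema (column names are assumptions).
SCHEMA = """
CREATE TABLE streamer (id INTEGER PRIMARY KEY, run INTEGER, stream TEXT,
                       used INTEGER DEFAULT 0);
CREATE TABLE wmbs_fileset (id INTEGER PRIMARY KEY, is_open INTEGER);
CREATE TABLE wmbs_subscription (id INTEGER PRIMARY KEY, fileset INTEGER);
CREATE TABLE run_stream_sub_assoc (run INTEGER, stream TEXT, subscription INTEGER);
CREATE TABLE wmbs_fileset_files (fileset INTEGER, fileid INTEGER);
"""

# Route unused streamers to their target fileset via the association table.
# Closed filesets are skipped, so data arriving after close-out is never fed.
FEED_SQL = """
INSERT INTO wmbs_fileset_files (fileset, fileid)
SELECT sub.fileset, st.id
FROM streamer st
JOIN run_stream_sub_assoc assoc
  ON assoc.run = st.run AND assoc.stream = st.stream
JOIN wmbs_subscription sub ON sub.id = assoc.subscription
JOIN wmbs_fileset fs ON fs.id = sub.fileset
WHERE st.used = 0 AND fs.is_open = 1
"""

MARK_USED_SQL = """
UPDATE streamer SET used = 1
WHERE id IN (SELECT fileid FROM wmbs_fileset_files)
"""

# Run/stream pairs that have unused streamers but no association yet;
# these are the ones that would trigger a RunConfig population.
UNCONFIGURED_SQL = """
SELECT DISTINCT st.run, st.stream
FROM streamer st
WHERE st.used = 0 AND NOT EXISTS (
    SELECT 1 FROM run_stream_sub_assoc assoc
    WHERE assoc.run = st.run AND assoc.stream = st.stream)
ORDER BY st.run, st.stream
"""


def make_db():
    """Create an empty in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def feed(conn):
    """Feed unused streamers into open filesets and mark them as used."""
    with conn:  # one transaction around both statements
        conn.execute(FEED_SQL)
        conn.execute(MARK_USED_SQL)


def unconfigured(conn):
    """Return (run, stream) pairs that still need a RunConfig setup."""
    return conn.execute(UNCONFIGURED_SQL).fetchall()
```

In this sketch a streamer whose target fileset is already closed simply stays unused, which mirrors the statement that nothing is fed after the filesets are closed.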
hufnagel: Please Review |
mnorman: My two cents. |
mnorman: So, elaboration: I can't make Python load the T0Feeder at all unless I move T0.src.python.WMComponent to T0.src.python.T0Component. This should be an arbitrary change, but it fixes most of my problems. I think this may be a defensive measure on Python's part. After all, you can run: and it can't let that load two directories. It may only allow one top-level name in the path. In other news, the code still doesn't work. In the Tier0FeederPoller the code:
Gives me a syntax error. I thought it was just the end of the log statement, but when I fixed that it seemed to get really confused... |
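The import problem mentioned above can be reproduced in isolation. The sketch below assumes both source trees ship a regular package (one with an `__init__.py`) under the same top-level name; in that case Python uses the first matching directory on `sys.path` and never looks inside the second. The package name `DemoComponent`, the module names and the helper functions are hypothetical, and the code targets present-day Python rather than the python2.6 used in this thread.

```python
import importlib
import os
import sys
import tempfile


def make_package(root, package, module):
    """Create root/package/__init__.py and root/package/module.py."""
    pkg_dir = os.path.join(root, package)
    os.makedirs(pkg_dir)
    open(os.path.join(pkg_dir, "__init__.py"), "w").close()
    with open(os.path.join(pkg_dir, module + ".py"), "w") as handle:
        handle.write("NAME = %r\n" % module)


def shadowing_demo(package="DemoComponent"):
    """Put two same-named packages on sys.path and try to import from both."""
    first = tempfile.mkdtemp()
    second = tempfile.mkdtemp()
    make_package(first, package, "JobCreator")    # stands in for the WMCore tree
    make_package(second, package, "Tier0Feeder")  # stands in for the T0 tree
    sys.path[:0] = [first, second]
    importlib.invalidate_caches()
    try:
        found = importlib.import_module(package + ".JobCreator").NAME
        try:
            importlib.import_module(package + ".Tier0Feeder")
            hidden = False
        except ImportError:
            # The second directory is shadowed by the first one.
            hidden = True
    finally:
        sys.path.remove(first)
        sys.path.remove(second)
    return found, hidden
```

Renaming one of the two top-level packages, as done here with T0Component, avoids the collision without touching `sys.path`.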
hufnagel: OK, I am not going to reuse the WMComponent path then and will use T0Component instead. Apart from the syntax error in that try/except/else clause, there is a problem with transaction handling, likely related to the fact that the DAO has two queries in it. I found a way that works for now, but it could cause problems if for some reason the first query succeeds and the second does not (the only way I can see that happening is an Oracle server problem hitting at exactly the right/wrong time). I'll leave this as is for now and open another ticket to take another look later. In its current form the Tier0Feeder works for me: it picks up new data, configures the run and run/stream settings, creates the input filesets for repack/express, sets up the subscriptions, etc. I then get a follow-up error in the JobCreator that some information is missing, but I'll open another ticket for that as well. For reviewing this, just make sure the unit tests work (both RunConfig and Tier0Feeder). Starting the Tier0Feeder is non-trivial: you would need a new version of the manage script from #2967 with an additional change to also install the T0.WMBS schema with the agent (which would break pure WMAgent deployment). Again, I'll open another ticket to work on proper deployment procedures, but I do not want to hold this one hostage. |
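The two-query concern can be illustrated with a small sketch in which both statements share one transaction, so a failure of the second also undoes the first. It uses Python's standard `sqlite3` module as a stand-in; it is not the WMCore DAO or transaction API, and the function name `run_atomically` is hypothetical.

```python
import sqlite3


def run_atomically(conn, statements):
    """Execute all (sql, params) pairs in one transaction.

    Returns True if every statement succeeded and was committed,
    False if any statement failed and everything was rolled back.
    """
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            for sql, params in statements:
                conn.execute(sql, params)
    except sqlite3.Error:
        return False
    return True
```

With this pattern a partial result (first query applied, second missing) is not left behind in the database when the second query fails.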
hufnagel: Please Review |
mnorman: Here are the outputs:
[mnorman@cms-xen39 Tier0Feeder_t]$ python2.6 Tier0Feeder_t.py
.
Ran 1 test in 38.347s
[mnorman@cms-xen39 RunConfig_t]$ python2.6 RunConfig_t.py
.
Ran 1 test in 20.199s
OK
Things look good to me, so I think we can pass this back for commit. |
hufnagel: (In 77cef11) Provide first Tier0Feeder version, fixes #1960 Signed-off-by: Dirk Hufnagel Dirk.Hufnagel@cern.ch |
When the Tier0 detects new data, it will have to read its configuration file and set up the RunConfig in T0AST. Then it has to create WMSpecs for repacking and express and set up the input filesets for these. It then has to employ data feeders to populate these filesets. With the WMSpecs and input filesets in place, work requests can go into an internal (shallow) Tier0 workqueue.
For the data feeders to work, they will use association tables between data type (run/stream) and input filesets. These association tables can be populated in the component, which means the data feeder can be fairly generic, driven by the content of these association tables.
All the RunConfig-related database code should be reviewed and possibly simplified and improved in the process of porting it over. It should also be changed to DAOs to be consistent with other database access code in WMBS/Tier0.