new component Tier0Feeder #1960

Closed
hufnagel opened this issue Jul 15, 2011 · 9 comments
@hufnagel
Member

When the Tier0 detects new data, it has to read its configuration file and set up the RunConfig in T0AST. It then has to create WMSpecs for repacking and express, set up the input filesets for these, and employ data feeders to populate those filesets. With the WMSpecs and input filesets in place, work requests can go into an internal (shallow) Tier0 workqueue.

For the data feeders to work, they will use association tables between data type (run/stream) and input filesets. These association tables can be populated in the component, which means the data feeder can be fairly generic, driven by the content of these association tables.

All the RunConfig-related database code should be reviewed, and possibly simplified and improved, in the process of porting it over. It should also be changed to DAOs to be consistent with the other database access code in WMBS/Tier0.

@drsm79

drsm79 commented Aug 17, 2011

metson: Milestone T0 2_0_0 deleted

@hufnagel
Member Author

hufnagel: For each run/stream combination we need to set up a fileset, a subscription and either a repack or an express workflow. The association between run/stream and subscription is kept in T0AST in a run_stream_sub_assoc table; the association to the fileset is via another join on the wmbs_subscription table.

New unused streamer files will be fed automatically (all inside a single Oracle query) into the correct target fileset based on the run_stream_sub_assoc table. New streamer files without an association in the run_stream_sub_assoc table will trigger a RunConfig population to set up the needed filesets/subscriptions/workflows and association.
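As a rough illustration of the feeding step described above, here is a minimal sketch using SQLite as a stand-in for Oracle. Only the table names run_stream_sub_assoc and wmbs_subscription come from this thread; the `streamer` and `fileset_files` tables, all column names, and the sample rows are assumptions made up for the example, and the real T0AST schema will differ. The sketch uses one INSERT ... SELECT to feed the files plus one UPDATE to mark them as used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, hypothetical schemas; only the names run_stream_sub_assoc
    -- and wmbs_subscription are taken from the discussion.
    CREATE TABLE streamer (id INTEGER PRIMARY KEY, run_id INTEGER,
                           stream TEXT, used INTEGER DEFAULT 0);
    CREATE TABLE wmbs_subscription (id INTEGER PRIMARY KEY, fileset INTEGER);
    CREATE TABLE run_stream_sub_assoc (run_id INTEGER, stream TEXT,
                                       subscription INTEGER);
    CREATE TABLE fileset_files (fileid INTEGER, fileset INTEGER);

    -- Run 1 / stream A is already configured; run 1 / stream B is not.
    INSERT INTO wmbs_subscription VALUES (10, 100);
    INSERT INTO run_stream_sub_assoc VALUES (1, 'A', 10);
    INSERT INTO streamer (id, run_id, stream)
        VALUES (1, 1, 'A'), (2, 1, 'A'), (3, 1, 'B');
""")

with conn:  # one transaction for both statements
    # Feed every unused streamer that has an association into its fileset.
    conn.execute("""
        INSERT INTO fileset_files (fileid, fileset)
        SELECT s.id, sub.fileset
        FROM streamer s
        JOIN run_stream_sub_assoc a
          ON a.run_id = s.run_id AND a.stream = s.stream
        JOIN wmbs_subscription sub ON sub.id = a.subscription
        WHERE s.used = 0
    """)
    # Mark the streamers that were just fed as used.
    conn.execute("""
        UPDATE streamer SET used = 1
        WHERE id IN (SELECT fileid FROM fileset_files)
    """)

fed = conn.execute(
    "SELECT fileid, fileset FROM fileset_files ORDER BY fileid").fetchall()

# Unused streamers without an association: these run/stream pairs would
# trigger the RunConfig population.
unassociated = conn.execute("""
    SELECT DISTINCT s.run_id, s.stream
    FROM streamer s
    LEFT JOIN run_stream_sub_assoc a
      ON a.run_id = s.run_id AND a.stream = s.stream
    WHERE s.used = 0 AND a.subscription IS NULL
    ORDER BY s.run_id, s.stream
""").fetchall()

print(fed)
print(unassociated)
```

With the sample rows, the two stream A files land in fileset 100 and the stream B file is reported as needing configuration.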

When a run ends and closes out, this is passed along to the repacking and express processing by closing the filesets. The job splitters check on the open status of their input filesets and take the appropriate actions.

This also means that we eliminate the concept of "late" arriving data. First, there shouldn't be any, because with the run and lumi closing we should have a full accounting from the StorageManager. If there is anything beyond that, even if the transfer system inserts streamer files into the Tier0, they will not be processed until they have been fed into the repack/express input filesets, and that does not happen after the filesets are closed.
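The close-out behaviour can be pictured with a small in-memory sketch. The `Fileset` class, the `feed` helper, and the splitter rule below are all hypothetical and are not the WMBS API; they only illustrate the idea that nothing is fed once a fileset is closed and that a splitter can react to the open flag.

```python
from dataclasses import dataclass, field


@dataclass
class Fileset:
    """Hypothetical stand-in for a WMBS input fileset."""
    name: str
    open: bool = True
    files: list = field(default_factory=list)


def feed(fileset, streamer):
    """Add a streamer only while the fileset is open; report whether it was fed."""
    if not fileset.open:
        return False
    fileset.files.append(streamer)
    return True


def splitter_action(fileset, threshold):
    """Assumed splitter rule: wait for a full job while the fileset is open,
    flush whatever remains once it has been closed."""
    if len(fileset.files) >= threshold:
        return "create_job"
    if not fileset.open and fileset.files:
        return "create_final_job"
    return "wait"


repack_input = Fileset("Run1_StreamA")
fed_early = feed(repack_input, "streamer_1")
while_open = splitter_action(repack_input, threshold=3)

repack_input.open = False  # the run ends and closes out
after_close = splitter_action(repack_input, threshold=3)
fed_late = feed(repack_input, "late_streamer")
```

In this sketch the late streamer is simply never added, so it can never reach a job.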

@hufnagel
Member Author

hufnagel: Please Review

@DMWMBot

DMWMBot commented Dec 22, 2011

mnorman: {{{

>>> os.listdir('/uscms/home/mnorman/T0/src/python')
['T0', 'WMComponent']
>>> os.listdir('/uscms/home/mnorman/T0/src/python/WMComponent')
['Tier0Feeder', '__init__.py', '__init__.pyc']
>>> from WMComponent.Tier0Feeder import Tier0Feeder
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named Tier0Feeder

>>> shutil.copytree('/uscms/home/mnorman/T0/src/python/WMComponent', '/uscms/home/mnorman/T0/src/python/T0Component')
>>> os.listdir('/uscms/home/mnorman/T0/src/python')
['T0', 'WMComponent', 'T0Component']
>>> from T0Component.Tier0Feeder import Tier0Feeder
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/uscms/home/mnorman/T0/src/python/T0Component/Tier0Feeder/Tier0Feeder.py", line 15, in <module>
    from WMComponent.Tier0Feeder.Tier0FeederPoller import Tier0FeederPoller
ImportError: No module named Tier0Feeder.Tier0FeederPoller
}}}

My two cents.

@DMWMBot

DMWMBot commented Dec 22, 2011

mnorman: To elaborate: I can't make Python load the T0Feeder at all unless I move T0.src.python.WMComponent to T0.src.python.T0Component. This should be an arbitrary change, but it fixes most of my problems.

I think this may be a defensive measure on python's part. After all, you can run:
{{{
import WMComponent
}}}

and it can't let that load two directories. It may only allow one top-level name in the path.
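The shadowing described here can be reproduced with a small sketch. It is written for modern Python 3 (the session above was Python 2.6, where the same first-match rule applies to packages that have an `__init__.py`), and the temporary directory layout and the `JobCreator` subpackage are stand-ins chosen for the example, not the real checkout. Two source trees each contain a regular package called `WMComponent`; only the first one found on `sys.path` is used, so the subpackage that lives in the second tree is invisible.

```python
import importlib
import os
import sys
import tempfile


def make_package(root, dotted_name):
    """Create nested regular packages (directories with __init__.py) under root."""
    path = root
    for part in dotted_name.split("."):
        path = os.path.join(path, part)
        os.makedirs(path, exist_ok=True)
        init = os.path.join(path, "__init__.py")
        if not os.path.exists(init):
            open(init, "w").close()


def can_import(name):
    """Return True if the dotted module name imports cleanly."""
    importlib.invalidate_caches()
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False


base = tempfile.mkdtemp()
# Hypothetical layout: two independent source trees on the path.
wmcore_src = os.path.join(base, "WMCore", "src", "python")
t0_src = os.path.join(base, "T0", "src", "python")

make_package(wmcore_src, "WMComponent.JobCreator")
make_package(t0_src, "WMComponent.Tier0Feeder")   # shadowed by the first tree
make_package(t0_src, "T0Component.Tier0Feeder")   # unique top-level name

sys.path[:0] = [wmcore_src, t0_src]

shadowed = can_import("WMComponent.Tier0Feeder")
renamed = can_import("T0Component.Tier0Feeder")
print(shadowed, renamed)
```

Giving the Tier0 tree its own top-level package name sidesteps the clash. The standard library's `pkgutil.extend_path` is another way to let one package name span several directories, though whether that would suit this code base is not something this thread settles.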

On another note, the code still doesn't work. In the Tier0FeederPoller, this code:

    except:
        myThread.transaction.rollback()
        logging.exception("Can't feed data, bailing out..."
        raise

gives me a syntax error. I thought it was just the end of the log statement, but when I fixed that it seemed to get really confused...
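For reference, a well-formed try/except/else block of this shape might look like the sketch below. `FakeTransaction` and `feed` are hypothetical stand-ins for `myThread.transaction` and the poller method, used only so the example runs on its own; this is not the actual Tier0FeederPoller code. Note the closing parenthesis on the `logging.exception` call and the `else` branch that commits only when nothing was raised.

```python
import logging


class FakeTransaction:
    """Hypothetical stand-in for myThread.transaction; it only records calls."""

    def __init__(self):
        self.calls = []

    def begin(self):
        self.calls.append("begin")

    def commit(self):
        self.calls.append("commit")

    def rollback(self):
        self.calls.append("rollback")


def feed(transaction, work):
    """Run work() inside a transaction, rolling back and re-raising on error."""
    transaction.begin()
    try:
        work()
    except Exception:
        transaction.rollback()
        logging.exception("Can't feed data, bailing out...")
        raise
    else:
        transaction.commit()


def failing_work():
    raise RuntimeError("simulated database failure")


good = FakeTransaction()
feed(good, lambda: None)

bad = FakeTransaction()
try:
    feed(bad, failing_work)
    propagated = False
except RuntimeError:
    propagated = True
```

The sketch catches `Exception` rather than using a bare `except:`, which is a stylistic choice of the example and differs from the snippet quoted above.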

@hufnagel
Member Author

hufnagel: Ok, I am not going to reuse the WMComponent path then and instead use T0Component. Apart from the syntax error in that try/except/else clause there is a problem with transaction handling, likely related to the fact that the DAO has two queries in it. I found a way that works for now but could cause problems if for some reason the first query succeeds and the second does not (only way I can see that happening is if there is some Oracle server problem hitting at exactly the right/wrong time).
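The failure mode mentioned here, where the first query succeeds and the second does not, is the case a shared transaction guards against. Below is a minimal sketch using Python's `sqlite3` module rather than Oracle and the WMCore DAO layer; the table names and the `feed_atomically` helper are invented for the illustration. Both statements run inside one `with conn:` block, which commits if both succeed and rolls back if either raises.

```python
import sqlite3


def feed_atomically(conn, file_id, fileset_id):
    """Run both statements in one transaction; undo the first if the second fails."""
    try:
        with conn:  # commits on success, rolls back on any exception
            conn.execute(
                "INSERT INTO fileset_files (fileid, fileset) VALUES (?, ?)",
                (file_id, fileset_id))
            conn.execute(
                "INSERT INTO streamer_used (fileid) VALUES (?)", (file_id,))
        return True
    except sqlite3.Error:
        return False


conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables for the sketch only.
    CREATE TABLE fileset_files (fileid INTEGER, fileset INTEGER);
    CREATE TABLE streamer_used (fileid INTEGER PRIMARY KEY);
""")

first = feed_atomically(conn, 1, 100)   # both statements succeed
second = feed_atomically(conn, 1, 100)  # second statement violates the primary key
rows = conn.execute("SELECT COUNT(*) FROM fileset_files").fetchone()[0]
print(first, second, rows)
```

After the failed second call, the row its first statement inserted is gone again, so the two tables stay consistent.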

I'll leave this as is for now and open another ticket to take another look later.

In its current form, the Tier0Feeder works for me: it picks up new data, configures the run and run/stream settings, creates the input filesets for repack/express, sets up the subscriptions, etc. I then get a follow-up error in the JobCreator saying that some information is missing, but I'll open another ticket for that as well.

For reviewing this, just make sure the unit tests work (both RunConfig and Tier0Feeder). Starting the Tier0Feeder is non-trivial: you would need a new version of the manage script from #2967 with an additional change to also install the T0.WMBS schema with the agent (which would break pure WMAgent deployment). Again, I'll open another ticket to work on proper deployment procedures, but I do not want to hold this one hostage.

@hufnagel
Member Author

hufnagel: Please Review

@DMWMBot

DMWMBot commented Dec 29, 2011

mnorman: Here are the outputs:

[mnorman@cms-xen39 Tier0Feeder_t]$ python2.6 Tier0Feeder_t.py
You do not have WMAGENT_CONFIG in your environment
Using reference HLT config instead

.

Ran 1 test in 38.347s

[mnorman@cms-xen39 RunConfig_t]$ python2.6 RunConfig_t.py
You do not have WMAGENT_CONFIG in your environment
Using reference HLT config instead

.

Ran 1 test in 20.199s

OK

Things look good to me, so I think we can pass this back for commit.

@hufnagel
Member Author

hufnagel commented Jan 7, 2012

hufnagel: (In 77cef11) Provide first Tier0Feeder version, fixes #1960

Signed-off-by: Dirk Hufnagel <Dirk.Hufnagel@cern.ch>

@ghost ghost assigned hufnagel Jul 24, 2012
This issue was closed.