Tickets/dm 6459 #17

Merged
merged 31 commits on Jul 20, 2016
Commits
bdad6bd
make Butler class docstrings consistent
n8pease May 17, 2016
32fc8a9
add cfg and repo management code to butler
n8pease May 19, 2016
85cf64a
implement Butler input/output API and fix unit tests
n8pease May 24, 2016
02d56d4
implement repository tagging
n8pease May 31, 2016
52d223b
fix Policy to accomodate set
n8pease Jun 1, 2016
631f247
reuse RepoData from outputs if its cfg matches
n8pease Jun 6, 2016
34246b1
allow empty dicts as valid values when getting keys
n8pease Jun 7, 2016
1cfb8af
add DataId class with tag support and convert butler
Jun 8, 2016
e70330b
add mapper inference for cfg in repository
n8pease Jun 10, 2016
cf4eadd
make repository test use new execution code
n8pease Jun 22, 2016
cc178c2
change repository cfg to RepositoryArgs
n8pease Jun 28, 2016
c7ba1c7
fix typos
n8pease Jul 2, 2016
d256cb0
add dataId unit test
n8pease Jul 2, 2016
e9770bd
add support for RepositoryCfg version and fix parent cfg redundancy i…
n8pease Jul 5, 2016
8d9b691
remove old, unused assignment of globals = {}
n8pease Jul 6, 2016
fadbfc8
make Storage a proper base class (again)
n8pease Jul 7, 2016
e268dbe
add optimization code for RepoDataContainer.all
n8pease Jul 7, 2016
64a7917
clean up code that consructs repositories
n8pease Jul 8, 2016
15452cf
use dissimilar repo paths in tests
n8pease Jul 8, 2016
515261f
add WriteOnceCompareSame support and use on repo cfg
n8pease Jul 11, 2016
0ef38c7
change import statments to be relative
n8pease Jul 18, 2016
d989bc8
add docstrings for RepoData and RepoDataContainer
n8pease Jul 18, 2016
818c87b
add tag dostring to Butler.getKeys
n8pease Jul 18, 2016
a3a4f44
make butler.getKeys accept non-set tag arg
n8pease Jul 18, 2016
b17d368
improve queryMetadata unit test for new DataId
n8pease Jul 18, 2016
245428c
make imports from this package relative
n8pease Jul 18, 2016
76a22d6
use different folder names in different test classes
n8pease Jul 19, 2016
09a7e99
compare different object types in getDefaultMapper
n8pease Jul 19, 2016
2fb8145
add comment to explain impl of setify()
n8pease Jul 19, 2016
eabcc23
allow getMapperClass to retreive instance
n8pease Jul 19, 2016
ba289af
remove erroneous comment from test code
n8pease Jul 19, 2016
36 changes: 20 additions & 16 deletions python/lsst/daf/persistence/__init__.py
@@ -22,21 +22,25 @@

"""Python interface to lsst::daf::persistence classes
"""
from registries import *
from fsScanner import *
from persistenceLib import *
from butlerExceptions import *
from policy import *
from registries import *
from butlerLocation import *
from readProxy import *
from butlerSubset import *
from access import *
from posixStorage import *
from mapper import *
from repositoryMapper import *
from repository import *
from butler import *
from butlerFactory import *
from .utils import *
from .registries import *
from .fsScanner import *
from .persistenceLib import *
from .butlerExceptions import *
from .policy import *
from .registries import *
from .dataId import *
from .butlerLocation import *
from .readProxy import *
from .butlerSubset import *
from .access import *
from .repositoryCfg import *
from .storage import *
from .posixStorage import *
from .mapper import *
from .repositoryMapper import *
from .repository import *
from .butler import *
from .butlerFactory import *
from .version import *
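
The switch to explicit relative imports does not change how the package is consumed; client code still imports from the top level, which now also re-exports the new dataId, repositoryCfg, storage, utils and version modules. A minimal, hedged sketch of unaffected client code (the repository path is hypothetical, and the mapper is assumed to be inferable from the repository's cfg per the commits above):

from lsst.daf.persistence import Butler, DataId

butler = Butler(root='/path/to/repo')          # hypothetical repository root
dataId = DataId({'visit': 123}, tag='calib')   # hypothetical dataset keys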

470 changes: 359 additions & 111 deletions python/lsst/daf/persistence/butler.py

Large diffs are not rendered by default.

8 changes: 6 additions & 2 deletions python/lsst/daf/persistence/butlerFactory.py
@@ -26,7 +26,7 @@

Contributor

Do we still need this class at all? I think all the places that use it can be converted to just the Butler() constructor.

Contributor Author

I can do that but it'll touch maybe a dozen packages, and I think it would be more manageable to do it as part of a separate ticket.

"""This module defines the ButlerFactory class."""

from lsst.daf.persistence import Butler
from lsst.daf.persistence import Butler, RepositoryArgs, PosixStorage

class ButlerFactory(object):
"""ButlerFactory creates data Butlers containing data mappers. Use of it
@@ -70,4 +70,8 @@ def create(self):
@returns a new Butler.
"""

return Butler(None, mapper=self.mapper)
if hasattr(self.mapper, 'root'):
root = self.mapper.root
else:
root = None
return Butler(root=root, mapper=self.mapper)
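
Per the review thread above, call sites could eventually bypass ButlerFactory and use the Butler constructor directly. A hedged sketch of that conversion (someMapper is a placeholder for whatever mapper instance the caller already has), mirroring the hasattr check in create():

# Before: construct through the factory
bf = ButlerFactory(mapper=someMapper)
butler = bf.create()

# After: hand the mapper straight to Butler, taking root from the mapper
# when it exposes one, exactly as create() does above.
root = someMapper.root if hasattr(someMapper, 'root') else None
butler = Butler(root=root, mapper=someMapper)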
6 changes: 3 additions & 3 deletions python/lsst/daf/persistence/butlerLocation.py
@@ -47,12 +47,12 @@ def __repr__(self):
'ButlerLocation(pythonType=%r, cppType=%r, storageName=%r, locationList=%r, additionalData=%r, mapper=%r)' % \
(self.pythonType, self.cppType, self.storageName, self.locationList, self.additionalData, self.mapper)

def __init__(self, pythonType, cppType, storageName, locationList, dataId, mapper, access=None):
def __init__(self, pythonType, cppType, storageName, locationList, dataId, mapper, storage=None):
self.pythonType = pythonType
self.cppType = cppType
self.storageName = storageName
self.mapper = mapper
self.access = access
self.storage = storage
if hasattr(locationList, '__iter__'):
self.locationList = locationList
else:
@@ -76,7 +76,7 @@ def to_yaml(dumper, obj):
"""
return dumper.represent_mapping(ButlerLocation.yaml_tag,
{'pythonType':obj.pythonType, 'cppType':obj.cppType, 'storageName':obj.storageName,
'locationList':obj.locationList, 'mapper':obj.mapper, 'access':obj.access, 'dataId':obj.dataId})
'locationList':obj.locationList, 'mapper':obj.mapper, 'storage':obj.storage, 'dataId':obj.dataId})

@staticmethod
def from_yaml(loader, node):
17 changes: 12 additions & 5 deletions python/lsst/daf/persistence/butlerSubset.py
@@ -29,6 +29,8 @@

from __future__ import with_statement

from . import DataId

class ButlerSubset(object):

"""ButlerSubset is a container for ButlerDataRefs. It represents a
@@ -70,11 +72,11 @@ def __init__(self, butler, datasetType, level, dataId):
"""
self.butler = butler
self.datasetType = datasetType
self.dataId = dataId
self.dataId = DataId(dataId)
self.cache = []
self.level = level

keys = self.butler.getKeys(datasetType, level)
keys = self.butler.getKeys(datasetType, level, tag=dataId.tag)
if keys is None:
return
fmt = list(keys.iterkeys())
@@ -218,11 +220,13 @@ def subLevels(self):

return set(
self.butlerSubset.butler.getKeys(
self.butlerSubset.datasetType).keys()
self.butlerSubset.datasetType,
tag=self.butlerSubset.dataId.tag).keys()
) - set(
self.butlerSubset.butler.getKeys(
self.butlerSubset.datasetType,
self.butlerSubset.level).keys()
self.butlerSubset.level,
tag=self.butlerSubset.dataId.tag).keys()
)

def subItems(self, level=None):
@@ -238,7 +242,10 @@
"""

if level is None:
mappers = self.butlerSubset.butler.repository.mappers()
mappers = []
for repoData in self.butlerSubset.butler._repos.all():
if repoData.repo._mapper not in mappers:
mappers.append(repoData.repo._mapper)
if len(mappers) != 1:
raise RuntimeError("Support for multiple repositories not yet implemented!")
Contributor

At least this should say "Support for multiple mappers", since this PR is about multiple repositories 😃. It might be possible to use multiple mappers if they all had the same default level. (I'm a little concerned about the "todo" comment below, as well.)

mapper = mappers[0]
64 changes: 64 additions & 0 deletions python/lsst/daf/persistence/dataId.py
@@ -0,0 +1,64 @@
#!/usr/bin/env python

#
# LSST Data Management System
# Copyright 2016 LSST Corporation.
#
# This product includes software developed by the
# LSST Project (http://www.lsst.org/).
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the LSST License Statement and
# the GNU General Public License along with this program. If not,
# see <http://www.lsstcorp.org/LegalNotices/>.
#

import copy
import UserDict

class DataId(UserDict.IterableUserDict):
"""DataId is used to pass scientifically meaningful key-value pairs. It may be tagged as applicable only
to repositories that are tagged with the same value"""

def __init__(self, initialdata=None, tag=None, **kwargs):
"""Constructor

Parameters
-----------
initialdata : dict or dataId
A dict of initial data for the DataId
tag : any type, or a container of any type
A value or container of values used to restrict the DataId to one or more repositories that
share that tag value. It will be stored in a set for comparison with the set of tags assigned to
repositories.
kwargs : any values
key-value pairs to be used as part of the DataId's data.
"""
UserDict.UserDict.__init__(self, initialdata)
try:
self.tag = copy.deepcopy(initialdata.tag)
except AttributeError:
self.tag = set()

if tag is not None:
if isinstance(tag, basestring):
self.tag.update([tag])
else:
try:
self.tag.update(tag)
except TypeError:
self.tag.update([tag])

self.data.update(kwargs)

def __repr__(self):
return "DataId(initialdata=%s, tag=%s)" %(self.data.__repr__(), self.tag)
8 changes: 4 additions & 4 deletions python/lsst/daf/persistence/mapper.py
@@ -24,7 +24,7 @@

import yaml

from lsst.daf.persistence import Policy
from . import Policy

"""This module defines the Mapper base class."""

@@ -41,14 +41,14 @@ class MapperCfg(Policy):
yaml_loader = yaml.Loader
yaml_dumper = yaml.Dumper

def __init__(self, cls, policy, access):
def __init__(self, cls, policy, storage):
super(MapperCfg, self).__init__()
self.update({'cls':cls, 'policy':policy, 'access':access})
self.update({'cls':cls, 'policy':policy, 'storage':storage})

@staticmethod
def to_yaml(dumper, obj):
return dumper.represent_mapping(RepositoryMapperCfg.yaml_tag,
{'cls':obj['cls'], 'policy':obj['policy'], 'access':obj['access']})
{'cls':obj['cls'], 'policy':obj['policy'], 'storage':obj['storage']})

@staticmethod
def from_yaml(loader, node):
30 changes: 30 additions & 0 deletions python/lsst/daf/persistence/policy.py
@@ -376,6 +376,36 @@ def getStringArray(self, key):
val = [val]
return val

def __lt__(self, other):
Contributor @ktlim, Jun 18, 2016

Does it really make sense to be doing these types of comparisons on Policies? __eq__ and __ne__ are fine, but I'm not sure that the others are useful.

Contributor Author

It had to do with putting these into a container: the rich comparison methods were needed for that.

if isinstance(other, Policy):
other = other.data
return self.data < other

def __le__(self, other):
if isinstance(other, Policy):
other = other.data
return self.data <= other

def __eq__(self, other):
if isinstance(other, Policy):
other = other.data
return self.data == other

def __ne__(self, other):
if isinstance(other, Policy):
other = other.data
return self.data != other

def __gt__(self, other):
if isinstance(other, Policy):
other = other.data
return self.data > other

def __ge__(self, other):
if isinstance(other, Policy):
other = other.data
return self.data >= other

#######
# i/o #

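
A minimal sketch of the container use case mentioned in the thread above (the key name is made up; it assumes Policy() followed by update() populates the backing dict, as MapperCfg.__init__ earlier in this diff does):

a = Policy()
a.update({'storageType': 'FitsStorage'})
b = Policy()
b.update({'storageType': 'FitsStorage'})

assert a == b            # __eq__ delegates to the underlying dicts
assert a in [b]          # membership tests in containers rely on __eq__
assert not (a < b)       # ordering likewise falls through to dict comparison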