New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-38444 Add Sasquatch datastore #824
Conversation
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## main #824 +/- ##
==========================================
- Coverage 87.76% 86.90% -0.86%
==========================================
Files 268 269 +1
Lines 35175 35497 +322
Branches 7407 7462 +55
==========================================
- Hits 30871 30850 -21
- Misses 3150 3494 +344
+ Partials 1154 1153 -1
☔ View full report in Codecov by Sentry. |
7c40db8
to
92c8ee8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all this. I've made lots of comments and will take another look.
There needs to be some text added to doc/lsst.daf.butler/datastores.rst
explaining the datastore and, importantly, the configuration options.
doc/changes/DM-38444.md
Outdated
@@ -0,0 +1 @@ | |||
Introduced a special datastore to upload metric measurements to a Sasquatch instance when a MetricMeasurementBundle is stored with a butler.put call. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Introduced a special datastore to upload metric measurements to a Sasquatch instance when a MetricMeasurementBundle is stored with a butler.put call. | |
Introduced a special datastore to upload metric measurements to a Sasquatch instance when a `MetricMeasurementBundle` is stored with a `butler.put()` call. |
Also can the full name of the datastore be included so people know where to find it?
self._dispatcher.dispatchRef(inMemoryDataset, ref) | ||
except SasquatchDispatchPartialFailure: | ||
raise DatasetPutError( | ||
"One or more records may have failed to upload, or only partially succeeded" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In a chained datastore this message is going to be swallowed. Do you want a log entry to be written somewhere? Should you include the sasquatch error text? Should you raise from e
? (I'm looking forward to python 3.11 where we can add this message to the other exception)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed around how exceptions are being handled
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually this is going to need some discussion
super().__init__(config, bridgeManager) | ||
|
||
# Name ourselves either using an explicit name or a name | ||
# derived from the (unexpanded) root |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# derived from the (unexpanded) root | |
# derived from the (unexpanded) root. |
|
||
# Name ourselves either using an explicit name or a name | ||
# derived from the (unexpanded) root | ||
self.name = self.config.get("name", "{}@{}".format(type(self).__name__, self.config["root"])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see an example yaml configuration in this PR so I'm not sure what root
means here. Are you always explicitly naming with the name
or should the URL be used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cribbed this from another datastore prior to fully understanding how it all worked and fit together, now that I do I think I need to re-visit that. I think the URL is reasonably unique
class SasquatchDatastore(GenericBaseDatastore): | ||
"""Basic Datastore for writing to an in Sasquatch instance. | ||
|
||
This Datastore is write only, meaning that it can dispatch data to a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This Datastore is write only, meaning that it can dispatch data to a | |
This Datastore is currently write only, meaning that it can dispatch data to a |
"""Basic Datastore for writing to an in Sasquatch instance. | ||
|
||
This Datastore is write only, meaning that it can dispatch data to a | ||
sasquatch instance, but at the present can not be used to retrieve values. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sasquatch instance, but at the present can not be used to retrieve values. | |
Sasquatch instance, but at the present can not be used to retrieve values. |
return | ||
|
||
def validateKey(self, lookupKey: LookupKey, entity: DatasetRef | DatasetType | StorageClass) -> None: | ||
# Docstring is inherited from base class |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# Docstring is inherited from base class | |
# Docstring is inherited from base class. |
return | ||
|
||
def getLookupKeys(self) -> set[LookupKey]: | ||
# Docstring is inherited from base class |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# Docstring is inherited from base class | |
# Docstring is inherited from base class. |
def export_records(self, refs: Iterable[DatasetIdRef]) -> Mapping[str, DatastoreRecordData]: | ||
# Docstring inherited from the base class. | ||
|
||
# In-memory Datastore records cannot be exported or imported |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# In-memory Datastore records cannot be exported or imported | |
# Sasquatch Datastore records cannot be exported or imported. |
An additional point I've just noticed, you need to make sure that |
92c8ee8
to
cae7172
Compare
Do we want to insert a record into a table about who knows about this data? I would prefer not to at this moment because a) we cant fetch anything from Sasquatch anyway b) we cant do anything about the fact that someone might delete the record on the sasquatch side. Have a table that is wrong seems worse than not having a table entry c) the worst that will happen if someone runs something again is the same value will just be uploaded again. |
Regarding recording that the datastore has accepted it, I'm fine with us punting for now but we do need to address it at some point. |
c830e06
to
b30b1b3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the clean ups.
@@ -37,3 +37,13 @@ class ValidationError(RuntimeError): | |||
"""Some sort of validation error has occurred.""" | |||
|
|||
pass | |||
|
|||
|
|||
class DatasetPutError(RuntimeError): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is used anywhere now so can be deleted.
Some minor changes were made in chained datastore to support the read only SasquatchDatastore going into analysis_tools. A note of the datastore was added to the datastore document.
Checklist
doc/changes