Allow re-compression at copy to frontend #407

Merged 4 commits on Mar 25, 2021
Changes from 2 commits
18 changes: 14 additions & 4 deletions strax/context.py
@@ -1404,14 +1404,19 @@ def _apply_function(self, data, targets):
             data = function(data, targets)
         return data

-    def copy_to_frontend(self, run_id, target,
-                         target_frontend_id=None, rechunk=False):
+    def copy_to_frontend(self,
+                         run_id: str,
+                         target: str,
+                         target_frontend_id: int = None,
+                         target_compressor: str = None,
Contributor

Is it worth considering passing non-null but type-correct "empty" default args here? So an empty string for the target_compressor and something equivalent for target_frontend_id?

Member Author

Sure, let me change the type hint to typing.Optional[str]

Member Author

Alright, fixed in f8e0809

See typing.Optional for the documentation: https://docs.python.org/3/library/typing.html#typing.Optional
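
The point raised in this thread can be illustrated with a small standalone sketch. It is not the strax implementation: `describe_copy` is a hypothetical helper that only mirrors the parameter names from the diff. It shows how `typing.Optional[...]` with a `None` default lets the function body tell "not specified" apart from a real value, which an "empty" default such as `0` could not do because `0` is itself a valid frontend index.

```python
from typing import Optional


def describe_copy(run_id: str,
                  target: str,
                  target_frontend_id: Optional[int] = None,
                  target_compressor: Optional[str] = None,
                  rechunk: bool = False) -> str:
    """Hypothetical helper (not part of strax) mirroring the new signature."""
    # None means "not specified". Compare against None explicitly,
    # because 0 is a valid index and would be falsy in a plain `if`.
    if target_frontend_id is None:
        where = 'all frontends'
    else:
        where = f'frontend {target_frontend_id}'

    if target_compressor is None:
        how = 'original compressor'
    else:
        how = f'compressor {target_compressor}'

    return f'{run_id}/{target} -> {where}, {how}, rechunk={rechunk}'
```

Calling `describe_copy('012345', 'raw_records', target_frontend_id=0)` selects frontend 0 rather than falling back to "all frontends", which is the behaviour the explicit `is None` check is meant to protect. The run id and target name here are example values only.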

+                         rechunk: bool = False):
         """
         Copy data from one frontend to another
         :param run_id: run_id
         :param target: target datakind
         :param target_frontend_id: index of the frontend that the data should go to
-        in context.storage. If no index is specified, try all.
+            in context.storage. If no index is specified, try all.
+        :param target_compressor: if specified, recompress with this compressor.
         :param rechunk: allow re-chunking for saving
         """
         if not self.is_stored(run_id, target):
@@ -1440,7 +1445,7 @@ def copy_to_frontend(self, run_id, target,
                      (not self._is_stored_in_sf(run_id, target, t_sf) and
                       t_sf._we_take(target) and
                       t_sf.readonly is False)]
-
+        self.log.info(f'Copy data from {source_sf} to {target_sf}')
         if not len(target_sf):
             raise ValueError('No frontend to copy to! Perhaps you already stored '
                              'it or none of the frontends is willing to take it?')
@@ -1453,6 +1458,11 @@ def copy_to_frontend(self, run_id, target,
         s_be = source_sf._get_backend(s_be_str)
         md = s_be.get_metadata(s_be_key)

+        if target_compressor is not None:
+            self.log.info(f'Changing compressor from {md["compressor"]} '
+                          f'to {target_compressor}.')
+            md.update({'compressor': target_compressor})
+
         for t_sf in target_sf:
             try:
                 # Need to load a new loader each time since it's a generator
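
The last hunk overrides the `compressor` entry in the source metadata before the data is written to the target frontends. A minimal standalone sketch of that step follows. It makes several assumptions: `metadata_for_copy` is a hypothetical helper, the metadata is modelled as a plain dict, and the compressor names are example values. Unlike the diff, which updates `md` in place, the sketch returns a copy so the source metadata stays untouched.

```python
import logging
from typing import Optional

log = logging.getLogger('copy_sketch')


def metadata_for_copy(md: dict, target_compressor: Optional[str] = None) -> dict:
    """Hypothetical helper: return the metadata to write at the target."""
    # Shallow copy, so the caller's source metadata is not modified
    # (a choice made for this sketch; the diff mutates `md` directly).
    new_md = dict(md)
    if target_compressor is not None:
        log.info(f'Changing compressor from {md["compressor"]} '
                 f'to {target_compressor}.')
        new_md['compressor'] = target_compressor
    return new_md


# Example metadata; the keys and values are illustrative only
source_md = {'compressor': 'blosc', 'run_id': '012345'}
recompressed = metadata_for_copy(source_md, target_compressor='zstd')
unchanged = metadata_for_copy(source_md)
```

With `target_compressor=None` the metadata passes through unchanged, so existing callers of `copy_to_frontend` that do not pass the new argument would keep the source compressor.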