New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tag remove preprocessor #640

Merged
merged 8 commits into from Aug 8, 2017

Conversation

Projects
None yet
3 participants
@mpacer
Copy link
Member

mpacer commented Aug 7, 2017

This PR implements tag based element filtering using a Preprocessor to remove entire cells, entire output areas, and individual outputs.

Cell removal is enabled by adding a tag to tags in the cell's metadata and then adding the tag to TagRemovePreprocessor.remove_cell_tags.

Output area removal is enabled by adding a tag to tags in the cell's metadata and then adding the tag to TagRemovePreprocessor.remove_all_outputs_tags.

Individual output removal is enabled by adding a tag to a tags field created in an output's metadata, and then adding the tag to TagRemovePreprocessor.remove_single_output_tags.

I have an idea talked through with @Carreau on how to additionally filter inputs (which would create invalid notebook objects), but that can't be done with a preprocessor (since preprocessors can only produce valid notebooks). Therefore, I'll leave that to another PR.

@mpacer mpacer force-pushed the mpacer:tag_preprocessor branch from 9eb211f to 447f99c Aug 7, 2017


remove_cell_tags = List(Unicode, default_value=[]).tag(config=True)
remove_all_outputs_tags = List(Unicode, default_value=[]).tag(config=True)
remove_single_output_tags = List(Unicode, default_value=[]).tag(config=True)

This comment has been minimized.

@takluyver

takluyver Aug 7, 2017

Member

Since the order doesn't matter here, I think you could make them sets instead of lists. That means we can use set operations to check if any tag matches:

self.remove_cell_tags.intersection(cell.get('metadata', {}).get('tags', []))

This comment has been minimized.

@mpacer

mpacer Aug 7, 2017

Author Member

Done

if 'metadata' in cell:
for field in self.remove_metadata_fields:
cell.metadata.pop(field, None)
if cell.get('outputs', {}):

This comment has been minimized.

@takluyver

takluyver Aug 7, 2017

Member

It doesn't really matter to the code, but for the sake of clarity, I'd make the fallback value a list rather than a dict. Or maybe even None.

This comment has been minimized.

@mpacer

mpacer Aug 7, 2017

Author Member

Done.


def preprocess_output(self, output, resources,
cell_index, output_index):
return output, resources

This comment has been minimized.

@takluyver

takluyver Aug 7, 2017

Member

Are you planning some future extension here, or can this method be discarded?

This comment has been minimized.

@mpacer

mpacer Aug 7, 2017

Author Member

I was partially trying to give people a handle on preprocessing individual outputs if they wanted to subclass (I thought it might be helpful for things like tag based preprocessing… but I'm realising now that something like that is more broad ranging and should probably be put into a different PR).

@takluyver

This comment has been minimized.

Copy link
Member

takluyver commented Aug 7, 2017

This basically looks good, I've just noted a few places where I think the code can be clearer/simpler. :-)

@mpacer

This comment has been minimized.

Copy link
Member Author

mpacer commented Aug 7, 2017

Should we be considering a major release soon? I want to add a new class to handle the input version of this, so maybe we should have it target 6.0?

@takluyver

This comment has been minimized.

Copy link
Member

takluyver commented Aug 7, 2017

Historically with IPython we did feature releases like '6.0' and bugfix rollups like '6.1', but since the big split we've tended more towards semver style versioning, so we increment the major version to indicate breaking changes. To some extent this is up to the people working on each individual project, though.

I'm not the biggest fan of semver - I think it ignores the complexity of real development, where almost every change can break something - but I know that when we're aiming for a new .0 release, it tends to become a Big Deal, and everyone wants their thing backported to the previous series. So I'm inclined to leave 6.0 until we actually want to break something.

@mpacer

This comment has been minimized.

Copy link
Member Author

mpacer commented Aug 7, 2017

K @takluyver would you say this is good to go?

@mpacer

This comment has been minimized.

Copy link
Member Author

mpacer commented Aug 7, 2017

@Carreau @minrk I'll merge if either of you look it over and give a 👍.

@minrk minrk merged commit f3dee67 into jupyter:master Aug 8, 2017

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@mpacer mpacer modified the milestone: 5.3 Aug 18, 2017

@mpacer mpacer added unlogged and removed unlogged labels Aug 31, 2017

@mpacer mpacer referenced this pull request Sep 1, 2017

Closed

Tags: an attempted overview #2826

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment