Programmatically run nbstripout and output new file. #57

Closed
stanleyng8 opened this issue May 30, 2017 · 7 comments

@stanleyng8

I am trying to run nbstripout programmatically within a Python script and create a new file from the input file (and not in-place). The documentation doesn't have this use case covered. Is there a way I can do it?

@kynan
Owner

kynan commented Jun 1, 2017

There is a way of doing it programmatically (undocumented, as you rightly point out): do what is done in main(). Something like this (untested):

import io
from nbstripout import strip_output, read, write, NO_CONVERT

if __name__ == '__main__':
    filename = ...
    outfile = ...
    with io.open(filename, 'r', encoding='utf8') as f:
        nb = read(f, as_version=NO_CONVERT)
    nb = strip_output(nb)
    with io.open(outfile, 'w', encoding='utf8') as f:
        write(nb, f)
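
To make the read/strip/write steps above concrete without depending on nbstripout's internals, here is a minimal stdlib-only sketch. It is not nbstripout's implementation: it assumes the nbformat 4 JSON layout (a top-level "cells" list), it only clears outputs and execution counts, and the function name strip_outputs_to_new_file is made up for illustration. The real strip_output may do more than this.

```python
import json


def strip_outputs_to_new_file(src_path, dst_path):
    """Copy a notebook to dst_path with code-cell outputs removed.

    Hypothetical helper; assumes the nbformat 4 JSON layout.
    The source file is only read, never modified.
    """
    with open(src_path, 'r', encoding='utf8') as f:
        nb = json.load(f)

    for cell in nb.get('cells', []):
        if cell.get('cell_type') == 'code':
            cell['outputs'] = []            # drop stored outputs
            cell['execution_count'] = None  # reset the execution counter

    with open(dst_path, 'w', encoding='utf8') as f:
        json.dump(nb, f, indent=1, ensure_ascii=False)
        f.write('\n')
    return nb
```

Calling strip_outputs_to_new_file('notebook.ipynb', 'notebook_clean.ipynb') would leave the input untouched and write the cleaned copy next to it.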

@stanleyng8
Author

stanleyng8 commented Jun 2, 2017

I tried the above in Python 3.6 with Anaconda on Mac OS X, and with just the import statement I get the following error:

File "/Users/stanley/anaconda3/envs/py36/lib/python3.6/site-packages/nbstripout.py", line 96, in <module>
  output_stream = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')

AttributeError: 'OutStream' object has no attribute 'buffer'

I looked at nbstripout.py, and that code is written specifically for Python 3, yet this is where the error is coming from. Any ideas for a fix?

@kynan
Owner

kynan commented Jun 9, 2017

Are you trying the import from within a notebook? This will not work because sys.stdout is an ipykernel.iostream.OutStream in that case.
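
A quick way to check which situation applies is to test whether the current sys.stdout exposes a .buffer attribute, which is what the failing line in the traceback relies on. The sketch below uses io.StringIO only as a stand-in for a replaced stdout that has no binary buffer; the helper name has_binary_buffer is hypothetical.

```python
import io
import sys


def has_binary_buffer(stream):
    """Return True if the text stream exposes an underlying .buffer."""
    return hasattr(stream, 'buffer')


# A regular text wrapper around a binary stream has .buffer ...
wrapped = io.TextIOWrapper(io.BytesIO(), encoding='utf-8')
print(has_binary_buffer(wrapped))

# ... while an in-memory text stream (standing in for a replaced stdout) does not.
print(has_binary_buffer(io.StringIO()))

# What the current interpreter's stdout looks like:
print(type(sys.stdout).__name__, has_binary_buffer(sys.stdout))
```

If the last line reports no buffer, something has replaced sys.stdout before the import, and the AttributeError above would be expected.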

@psthomas
Contributor

I was trying to do something similar and might have a solution. If you want to do this from within a notebook, it's possible to run nbstripout as part of a bash command in a cell:

# Both original and the cleaned example are in the current working directory
notebook_path = './notebook.ipynb'
cleaned_path = './notebook_clean.ipynb'

# Bash commands within notebooks are preceded by "!"
!cat {notebook_path} | nbstripout > {cleaned_path}

I've been running a command like this in a cell to make it more versatile and provide some feedback:

import os
from datetime import datetime

notebook_path = os.path.join(os.getcwd(),'notebook.ipynb')
cleaned_path = os.path.join(os.getcwd(),'notebook_clean.ipynb')

!cat {notebook_path} | nbstripout > {cleaned_path}

date_str = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
print('{0} Cleaned file created at: {1}'.format(date_str, cleaned_path))
2017-06-23 12:57:40 Cleaned file created at: <current directory>/notebook_clean.ipynb
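
Outside a notebook, the same pipeline can be sketched from a plain Python script with subprocess, feeding the input file to the command's stdin and sending its stdout to a new file. This assumes the nbstripout executable is on PATH and acts as a stdin-to-stdout filter as in the shell command above; the function name strip_via_cli and its command parameter are hypothetical, added so a different filter can be swapped in.

```python
import subprocess


def strip_via_cli(src_path, dst_path, command=('nbstripout',)):
    """Pipe src_path through a filter command and write its stdout to dst_path.

    `command` defaults to the nbstripout executable on PATH (an assumption);
    check=True raises CalledProcessError if the command exits with an error.
    """
    with open(src_path, 'rb') as src, open(dst_path, 'wb') as dst:
        subprocess.run(list(command), stdin=src, stdout=dst, check=True)
```

For example, strip_via_cli('notebook.ipynb', 'notebook_clean.ipynb') mirrors the cat ... | nbstripout > ... pipeline without going through a notebook cell.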

Note: I think there might be an error in readme.rst because the shell pipeline example doesn't use cat while the docstring in nbstripout.py does. I can open a separate issue if needed.

@stanleyng8
Author

Thanks for the comments. I was running nbstripout with Python 3.6 as a script and not a notebook.

@kynan
Owner

kynan commented Jun 24, 2017

@stanleyng8 I cannot reproduce this issue; I have tried with Python 3.6 from Anaconda. Which nbstripout version are you using?

@stanleyng8
Author

Thanks Florian but I have decided not to pursue this avenue any further.
