Add support for merging/concatenating multiple notebooks #253

fperez · 2016-02-22T19:00:07Z

This simple gist offers a command-line tool for concatenating/merging multiple notebooks. As requested by @jamespjh, this could be a useful nbconvert feature (it would also make it robust against evolution of the internal API for users, as they'd only have to remember the cmd line call, and we'd update the internals if the nbformat API changes).

Carreau · 2016-02-22T19:05:33Z

I'm worried about the logic for merging metadata at the notebook level. While in many cases it is obvious what to do, I'm worried about the slippery slope we would get onto when the metadata differ.

fperez · 2016-02-23T02:27:33Z

I would simply make an explicit decision: the merged metadata basically corresponds to that of the first notebook in the list, plus any keys that appear only in the later ones (the algorithm is simply to call meta.update() with each notebook's metadata, in reverse order from the command line).

That's a simple, unambiguous choice with known semantics. If users don't like it, they can edit it back by hand later.

I don't see a problem with the feature having this constraint.

Carreau · 2016-02-23T02:32:32Z

Ok, I like a strong limitation like that. I came almost to the same conclusion while walking back home.

It might be hard to shoehorn that into the nbconvert structure itself, as right now it's built around the assumption that one exporter converts one notebook, and the looping over all the notebooks is implicit, but we can likely arrange that.

Carreau · 2016-02-23T02:48:34Z

I propose adding a --merge flag that merges all the notebooks into one before feeding it to the rest of the pipeline. The metadata is handled as you proposed:

metadata = {}
for n in reversed(notebooks):
    metadata.update(n.metadata)

and for the name of the notebook (if needed) we use the first one.

This allows us not only to merge, but to merge (virtually) and generate a PDF/HTML version in one go.
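To make the proposed semantics concrete, here is a minimal sketch that works on plain dicts laid out like nbformat v4 JSON. The function name `merge_notebooks` and the sample notebooks are illustrative assumptions, not part of nbconvert or nbformat. Note that `dict.update()` is shallow, so a nested value such as `kernelspec` is taken whole from the first notebook that defines it.

```python
import copy


def merge_notebooks(notebooks):
    """Merge notebook dicts (nbformat v4 layout assumed) into one new dict.

    Hypothetical helper for illustration; not an nbconvert API.
    """
    if not notebooks:
        raise ValueError("need at least one notebook")

    # Apply metadata in reverse order so that earlier notebooks overwrite
    # later ones: the first notebook wins whenever a key conflicts.
    metadata = {}
    for nb in reversed(notebooks):
        metadata.update(copy.deepcopy(nb.get("metadata", {})))

    # Start from a copy of the first notebook so its nbformat fields are kept.
    merged = copy.deepcopy(notebooks[0])
    merged["metadata"] = metadata
    # Concatenate the cells in command-line order.
    merged["cells"] = [
        copy.deepcopy(cell) for nb in notebooks for cell in nb.get("cells", [])
    ]
    return merged


# Two tiny made-up notebooks with conflicting kernelspec metadata.
nb_a = {
    "nbformat": 4,
    "nbformat_minor": 0,
    "metadata": {"kernelspec": {"name": "python3"}},
    "cells": [{"cell_type": "markdown", "metadata": {}, "source": "# Chapter 1"}],
}
nb_b = {
    "nbformat": 4,
    "nbformat_minor": 0,
    "metadata": {"kernelspec": {"name": "python2"}, "title": "Chapter 2"},
    "cells": [{"cell_type": "markdown", "metadata": {}, "source": "# Chapter 2"}],
}

merged = merge_notebooks([nb_a, nb_b])
print(merged["metadata"])
```

Here the merged notebook keeps the `kernelspec` of the first notebook and picks up `title` from the second, because `title` exists only there.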

fperez · 2016-02-23T02:51:53Z

+1

jamespjh · 2016-03-02T13:43:58Z

This would be great. I'm using @fperez's nbmerge.py script from https://gist.github.com/fperez/e2bbc0a208e82e450f69 at the moment, and would be delighted to replace it with a simple invocation of nbconvert.

chadlagore · 2016-06-28T23:22:06Z

+1 here. Using nbmerge.py fairly frequently as well.

aoboy · 2016-12-15T15:11:53Z

I am trying to use fperez version and I am getting the following errors..
Traceback (most recent call last):
  File "nbmerge.py", line 49, in <module>
    merge_notebooks(notebooks)
  File "nbmerge.py", line 38, in merge_notebooks
    print(nbformat.writes(merged))
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in position 2519513: ordinal not in range(128)

npyoung · 2016-12-15T21:07:25Z

Happen to be using Python 3, @aoboy? I think you're seeing this issue. If so, there's an easy fix mentioned in that thread.

aoboy · 2016-12-15T21:24:54Z

@npyoung I solved it. I'm actually using Python 2.7.
I changed the line print(nbformat.writes(merged)) to
print(nbformat.writes(merged).encode('utf-8'))
Basically, the encoding is what was missing.
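For anyone hitting the same error, the idea behind the fix is to encode the text to UTF-8 yourself instead of letting print fall back to ASCII. Below is a small sketch of that idea that should behave the same on Python 2 and 3; `write_utf8` is a hypothetical helper, not something from nbmerge.py or nbformat.

```python
import io
import sys


def write_utf8(text, stream=None):
    """Write text as UTF-8 bytes, bypassing the stream's default encoding."""
    data = text.encode("utf-8")
    if stream is None:
        # Python 3 exposes the underlying byte stream as sys.stdout.buffer;
        # on Python 2, sys.stdout itself accepts byte strings.
        stream = getattr(sys.stdout, "buffer", sys.stdout)
    stream.write(data)


# Demonstrate with an in-memory byte stream and the en dash (u'\u2013')
# from the traceback above.
buf = io.BytesIO()
write_utf8(u"pages 1\u20132\n", buf)
```

In the script, this would mean calling something like `write_utf8(nbformat.writes(merged))` in place of the print call, assuming the output is meant to go to stdout.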

ketch · 2017-02-06T18:57:12Z

This capability would be very useful for a book I am currently working on, where each chapter is a Jupyter notebook. This feature would make it simpler to generate the print version.

Carreau · 2017-02-06T20:50:26Z

@ketch have a look at @takluyver's BookBook

chrisjsewell · 2017-07-07T00:21:57Z

Hey guys, I've just created a repo (ipypublish), with a simple workflow/scripts for creating/editing 'publication ready' scientific reports from one or more Jupyter Notebooks (containing matplotlib, pandas, scipy, ...), without leaving the browser. Sorry for the spam but, since I used the gist posted here (thanks!), I thought it might be nice to share.

In particular it would be great to get any feedback, especially in the case where future Jupyter versions might break (or enhance) this. Since I intend to write my doctoral thesis with it!

Ta, Chris

mpacer · 2017-07-07T01:11:33Z

@chrisjsewell Really cool project!! You might be interested in looking at JupyterLab. It looks like your system is a beautiful application of the kind of workflow it makes possible, and you will be able to influence the development of that interface to ensure that it can support the kinds of features you want going forward.

chrisjsewell · 2017-07-07T09:05:48Z

@mpacer thanks :) Yes I've seen a bit about it, looks good, I'll definitely be keeping tabs on it. I see you mentioning about easier manipulation of metadata (jupyterlab/jupyterlab#902), that's definitely relevant for my repo (chrisjsewell/ipypublish#1).

From the perspective of my research (atomic/quantum level simulations), I'm really interested in the interactive capability that JavaScript bridging now offers for 3D graphics (ipywidgets, pythreejs, ipyvolume and my other repo pandas3js) and how it can be applied to the exploratory analysis -> publication workflow that notebooks offer. Being able to 'pop' out a view of such a GUI to a separate window would definitely be pretty neat.

ketch · 2017-07-09T12:36:42Z

People interested in this thread may also be interested in this book project, which is a collection of notebooks viewable as PDF, HTML, or executable notebooks and runnable on binder or Microsoft Azure; it's not completely finished but is in an advanced state:

https://github.com/clawpack/riemann_book

We are using bookbook, among several other tools.

Yensan · 2018-01-31T02:31:17Z

Although I have finished reading, I have not got the HowTo thing. And nbmerge.py failed...
😕

mgeier · 2018-12-14T13:17:39Z

Since it hasn't been mentioned yet in this issue, let me suggest using https://nbsphinx.readthedocs.io/.

It basically concatenates notebooks and creates HTML pages or a LaTeX/PDF from them.

choldgraf · 2018-12-30T13:45:25Z

Just a note that this project sort-of exists now: https://github.com/jbn/nbmerge

(FWIW, I think it's better to have a separate tool than nbconvert do merging)

sfixedgear · 2019-12-12T08:39:48Z

ipynb files are in JSON format. What I do is open all the files I want to merge in a new Python notebook and convert them to dicts. You can then use the 'cells' key to concatenate all the cells, or do whatever else you want, and finally convert the resulting dict back to JSON and export it to a new file.

Here is an example where I import 2 different ipynb files, and merge them into a new ipynb file:

import json

# First file
with open('file1.ipynb', 'r', encoding='utf-8') as file:
    dict_1 = json.load(file)
cells_1 = dict_1['cells']

# Second file
with open('file2.ipynb', 'r', encoding='utf-8') as file:
    dict_2 = json.load(file)
cells_2 = dict_2['cells']

# New file (merging the first and second files): keep everything from the
# first notebook and replace its cell list with the cells of both files.
# Plain list concatenation is enough here, so numpy is not needed.
new_dict = dict_1.copy()
new_dict['cells'] = cells_1 + cells_2
with open('new_file.ipynb', 'w', encoding='utf-8') as json_file:
    json.dump(new_dict, json_file)
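The same two-file idea extends to any number of files. The sketch below is one possible way to write it with only the standard library on Python 3; `merge_notebook_files` is a made-up name, and the demo creates two throwaway notebooks in a temporary directory so it runs without any existing .ipynb files. Like the example above, it keeps the first notebook's metadata unchanged and does not validate the result against the notebook schema.

```python
import json
import os
import tempfile


def merge_notebook_files(paths, out_path):
    """Concatenate the cells of several .ipynb files into one new file.

    Hypothetical helper: the first file supplies metadata and format fields.
    """
    if not paths:
        raise ValueError("need at least one notebook path")
    merged = None
    for path in paths:
        with open(path, "r", encoding="utf-8") as f:
            nb = json.load(f)
        if merged is None:
            merged = nb  # the first notebook is the base
        else:
            merged["cells"].extend(nb["cells"])
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(merged, f, indent=1, ensure_ascii=False)
    return merged


# Demo with two minimal, made-up notebooks written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for i in (1, 2):
        path = os.path.join(tmp, "file%d.ipynb" % i)
        nb = {
            "nbformat": 4,
            "nbformat_minor": 0,
            "metadata": {},
            "cells": [
                {"cell_type": "markdown", "metadata": {}, "source": "Chapter %d" % i}
            ],
        }
        with open(path, "w", encoding="utf-8") as f:
            json.dump(nb, f)
        paths.append(path)

    out = os.path.join(tmp, "new_file.ipynb")
    merge_notebook_files(paths, out)
    with open(out, "r", encoding="utf-8") as f:
        result = json.load(f)

print(len(result["cells"]))
```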

maximveksler · 2021-06-11T05:39:39Z

Does the feature for loading a notebook as a module offer an answer for the use case discussed here? https://jupyter-notebook.readthedocs.io/en/stable/examples/Notebook/Importing%20Notebooks.html

Carreau added the question label Feb 22, 2016

Carreau added this to the wishlist milestone Feb 22, 2016

Carreau added good first issue great for new contributors URAP and removed question labels Feb 24, 2016

jbn mentioned this issue Apr 30, 2017

Add support for merging notebooks jbn/dissertate#1

Closed

Carreau removed the URAP label Jul 16, 2019

takluyver removed the good first issue great for new contributors label Nov 2, 2019
