Implement cell id based on source #223

Merged 3 commits on Jan 6, 2022
24 changes: 20 additions & 4 deletions lib/doconce/ipynb.py
@@ -5,6 +5,7 @@
 import sys, shutil, os
 import regex as re
 import shlex
+import hashlib
 from .common import default_movie, plain_exercise, table_analysis, indent_lines, \
     bibliography, fix_ref_section_chapter, cite_with_multiple_args2multiple_cites, \
     _CODE_BLOCK, _MATH_BLOCK, DEFAULT_ARGLIST, envir_delimiter_lines
@@ -85,7 +86,7 @@ def ipynb_author(authors_and_institutions, auth2index,
 s += ' at ' + ' & '.join(i)
 s += ' -->\n'
 # Add extra line between heading and first author
-s+= '<!-- Author: --> \n_%s_' % (author)
+s+= '<!-- Author: --> \n_%s_' % (author)
 '''
 # Write the authors
 s += '_%s_' % (author)
@@ -752,13 +753,28 @@ def subst(m):
     for cell in cells_output:
         if cell:
             cells.append(cell)
-
+    """
+    By default, each cell gets a random unique ID
+    (currently, `uuid.uuid4().hex[:8]` in nbformat/corpus/words.py)
+    To ensure cells do not change ID every time they are compiled,
+    replace this ID with a hash of the 'source' field.
+    If needed, add a running number to duplicates to ensure uniqueness.
+    """
     # Replace the random id with a reproducible hash of the content (issue #213)
     # nbformat 5.1.3 creates random cell ID with `uuid.uuid4().hex[:8]`
+    hashed_ids = {} # dict of created hashes
     for i in range(len(cells)):
         if 'id' in cell.keys():
-            cell_content = str(i) + cells[i]['source']
-            cells[i]['id'] = hashlib.sha224(cell_content.encode()).hexdigest()[:8]
+            cell_content = cells[i]['source']
+            hashed_id = hashlib.sha224(cell_content.encode()).hexdigest()[:8]
+            cells[i]['id'] = hashed_id
+            # check for duplicate IDs
+            if hashed_id in hashed_ids:
+                # append running number for each next one,
+                # use the current count for this hash to append _1, _2, ...
+                cells[i]['id'] = hashed_id + "_" + str(hashed_ids[hashed_id])
+            # add to dict with existing IDs
+            hashed_ids[hashed_id] = hashed_ids.get(hashed_id, 0) + 1

     # Create the notebook in string format
     nb = new_notebook(cells=cells)
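The ID scheme in this change can be tried outside doconce with a small standalone sketch. The names `content_id` and `assign_content_ids` are hypothetical and do not exist in doconce, and cells are modelled as plain dicts rather than nbformat cell objects. The sketch also checks for the `'id'` key on each cell individually, whereas the diff tests `cell`, the variable left over from the preceding loop; whether that difference matters in practice is an assumption not verified here.

```python
import hashlib


def content_id(source, length=8):
    """Return the first `length` hex digits of the SHA-224 digest of `source`."""
    return hashlib.sha224(source.encode()).hexdigest()[:length]


def assign_content_ids(cells):
    """Give every cell that has an 'id' key a reproducible, unique ID.

    `cells` is assumed to be a list of dicts with a string 'source' entry
    (a stand-in for nbformat cells, chosen for illustration only).
    """
    seen = {}  # hash -> number of cells already seen with that hash
    for cell in cells:
        if 'id' not in cell:
            continue
        hashed_id = content_id(cell['source'])
        count = seen.get(hashed_id, 0)
        # The first occurrence keeps the bare hash; later ones get _1, _2, ...
        cell['id'] = hashed_id if count == 0 else '%s_%d' % (hashed_id, count)
        seen[hashed_id] = count + 1
    return cells


cells = [
    {'id': 'random01', 'source': 'import numpy as np'},
    {'id': 'random02', 'source': 'print(1)'},
    {'id': 'random03', 'source': 'print(1)'},  # same source as the cell above
]
assign_content_ids(cells)
ids = [c['id'] for c in cells]

# Insert a new cell at the front and recompute: cells with unchanged
# source keep their IDs, because the index no longer enters the hash.
shifted = [{'id': 'random00', 'source': '# new first cell'}] + [
    {'id': 'x', 'source': c['source']} for c in cells
]
assign_content_ids(shifted)
shifted_ids = [c['id'] for c in shifted]
```

Because the hash depends only on the `source` text, `shifted_ids[1:]` equals `ids` here. With the earlier `str(i) + source` scheme, inserting one cell would have changed the ID of every cell after it.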