Notebook validation failed: Non-unique cell id #6001

Open
kohlerjl opened this issue Mar 6, 2021 · 49 comments
Comments

@kohlerjl

kohlerjl commented Mar 6, 2021

I have recently started receiving this popup error frequently when saving notebooks:

Title: Notebook validation failed

The save operation succeeded, but the notebook does not appear to be valid. The validation error was:
-------------------------
Notebook validation failed: Non-unique cell id 'waiting-opening' detected. Corrected to 'noted-romania'.:
"<UNKNOWN>"

It does not occur in new notebooks, but seems to be triggered after copying and pasting cells from other notebooks. But it is now occurring frequently on two different systems, editing different sets of notebooks. Once it appears within a notebook, it reappears on every subsequent save, with the same 'non-unique cell id' detected, but a different 'corrected to ' value.

I've attached the full output of conda list, but the versions of some particularly relevant packages are:

jupyter 1.0.0 py37h03978a9_6 conda-forge
jupyter_client 6.1.11 pyhd8ed1ab_1 conda-forge
jupyter_console 6.2.0 py_0 conda-forge
jupyter_contrib_core 0.3.3 py_2 conda-forge
jupyter_contrib_nbextensions 0.5.1 py37hc8dfbb8_1 conda-forge
jupyter_core 4.7.1 py37h03978a9_0 conda-forge
jupyter_highlight_selected_word 0.2.0 py37h03978a9_1002 conda-forge
jupyter_latex_envs 1.4.6 py37hc8dfbb8_1001 conda-forge
jupyter_nbextensions_configurator 0.4.1 py37h03978a9_2 conda-forge
jupyterlab_widgets 1.0.0 pyhd8ed1ab_1 conda-forge
nbconvert 5.6.1 py37hc8dfbb8_1 conda-forge
nbformat 5.1.2 pyhd8ed1ab_1 conda-forge
notebook 6.2.0 py37h03978a9_0 conda-forge

@kohlerjl
Author

kohlerjl commented Mar 6, 2021

conda_list.txt

@kevin-bates
Member

This message is related to some fairly recent changes to nbformat that introduce cell id metadata for each notebook cell. In this case, the validation logic is encountering a duplicate cell-id that was previously, and randomly, generated 'waiting-opening' and encountering that same cell-id later in the notebook. It's strange that this occurs for the same cell-id value and is as if the corrected-to value is not getting persisted. I'm unable to reproduce this, but I suspect there are a few factors in play here.

I'm cc-ing @MSeal for comment as to what might be going on and how best to proceed (as I'd rather not recommend downgrading).

@MSeal MSeal closed this as completed Mar 6, 2021
@MSeal MSeal reopened this Mar 6, 2021
@MSeal

MSeal commented Mar 6, 2021

Sorry I accidentally hit the close button. 👀 on this now

@MSeal

MSeal commented Mar 6, 2021

I am also struggling to reproduce the event with a very similar dependency list. Even if I manually force cell ids to be invalid or equal, the notebook server (version 6.2.0) corrects them before it ever gets to the validation error reported here.

@kohlerjl could you post a notebook exhibiting the behavior? I am wondering if something in the file structure will lend a clue as to why the replaced cell-id is not being fixed.

@kohlerjl
Author

kohlerjl commented Mar 7, 2021

Thanks for the quick follow up.

I spent some time trying to isolate a simple notebook that would reproduce the event. What I've found is that this behavior does not persist between closing and reopening the notebook, or even just refreshing the browser tab (leaving the kernel still running).

However, I've been able to consistently reproduce this behavior by following these steps:

  1. Create a new notebook and enter some code in a cell (e.g. 'a = 1')
  2. Save the notebook, then refresh the tab
  3. Copy the cell and paste a duplicate within the same notebook
  4. Save the notebook again, triggering the error

It appears that this behavior does not occur if you duplicate a cell created in the same 'session' (i.e. while the notebook is open in the tab). But if I copy and paste a cell created prior to opening/refreshing the notebook, either from the same notebook or a different notebook, then I get errors about duplicate ids. The notebook changes do save, though, and I can simply refresh the tab to mitigate the errors.

I have attached a notebook I produced this way, which gave me the errors prior to reloading it. However, I can't see any difference in the file structure. test.ipynb.txt

I also cannot reproduce this behavior on another system, running Python 3.9.2 and:
ipython 7.19.0
jupyter-client 6.1.7
jupyter-console 6.2.0
jupyter-core 4.6.3
jupyter-server 1.4.1
nbconvert 6.0.7
nbformat 5.0.8
notebook 6.2.0

@kevin-bates
Member

Thanks @kohlerjl. I'm still unable to reproduce this (sorry). Are you using the Notebook classic or Jupyter Lab interface? (Neither reproduces the issue for me, however.)

I also cannot reproduce this behavior on another system, running Python 3.9.2 and:
...
nbformat 5.0.8

This would be because that version of nbformat doesn't contain this consistency check. I'm using nbformat 5.1.2, but, again, no luck. Really curious what else could be going on.

Btw, we started displaying the notebook server version in the console when the server is started in the 6.x timeframe. Could you please confirm the displayed value? The log entry should look similar to the following:

[I 10:06:42.433 NotebookApp] Jupyter Notebook 6.2.0 is running at:
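
If reading the startup log is inconvenient, the installed versions can also be checked from Python. This is only a rough sketch using the standard library's `importlib.metadata`; the helper name `installed_versions` and the list of packages queried are illustrative assumptions, not something the notebook server provides.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Packages discussed in this thread; adjust the list as needed.
for name, version in installed_versions(
    ["notebook", "nbformat", "jupyter_server", "jupyterlab"]
).items():
    print(f"{name}: {version or 'not installed'}")
```

Run it with the same interpreter (or conda environment) that launches the notebook server, otherwise the reported versions may not match what the server actually uses.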

@kohlerjl
Author

I can confirm that the notebook server reports "Jupyter Notebook 6.2.0 is running at:" on startup.

I've been using Notebook classic. I installed Jupyter Lab and tried the same procedure, but cannot reproduce the behavior there.

I might just transition to using Jupyter Lab going forward, since that seems to be more actively developed now.

@kevin-bates
Member

I can confirm that the notebook server reports "Jupyter Notebook 6.2.0 is running at:" on startup.

Thank you. I'll have to defer to @MSeal on this one.

I've been using Notebook classic. I installed Jupyter Lab and tried the same procedure, but cannot reproduce the behavior there.

Just a point of clarity, Jupyter Lab >= 3 uses a different server (jupyter_server) whereas Lab < 3 still uses notebook as its server - although I believe the issue here lies in the front-end where cells are manipulated. As a result, it might be helpful to know which version of Lab did not reproduce this behavior and whether you've tried this with Lab < 3.

... Jupyter Lab ... seems to be more actively developed now.

Yes, that is absolutely the case.

@njohnsson

I get a similar error saying cell ID was corrected to 'domestic-communist'. From comments above it seems cell-id names are randomly generated. Maybe that randomization algorithm should be changed a bit. In this case I thought I was dealing with a virus...

Title: Notebook validation failed
The save operation succeeded, but the notebook does not appear to be valid. The validation error was:
Notebook validation failed: Non-unique cell id 'moving-ultimate' detected. Corrected to 'domestic-communist'.:
""

I am running:
Classic Notebook (Not Jupyter Lab), v 6.2.0; Ubuntu; Chrome Browser
Python 3.8.6 | packaged by conda-forge | (default, Oct 7 2020, 19:08:05)

@MSeal

MSeal commented Jun 2, 2021

@njohnsson make sure you have the latest package versions in your environment. We changed the id algorithm to use hashes and invalidated some of the older packages with name based id generation because it was creating problematic ids and marked the older nbformat packages as deprecated.

Similar to @kevin-bates I've struggled to reproduce the issue in classic. That being said classic is not being actively developed. If you're looking for the same simple look-n-feel I'd suggest using Retrolab or NBClassic with lab instead as new package capabilities will slowly be less supported in classic over time.

@njohnsson

@MSeal: OK, I will update package versions, but just FYI: the latest message I got ended with "....Non-unique cell id 'civil-spring' detected. Corrected to 'lesbian-voluntary'."

@EricThomson

EricThomson commented Jun 6, 2021

I have been getting this as I do a lot of multi-cell ctrl-c/ctrl-v in my standard jupyter notebook lately (not jupyterlab). I am on jupyter_client 6.1.12 in Windows 10 installed using conda, working in Firefox with Python 3.7.6.

It seems to go away when I close/reopen the nb.

@MSeal

MSeal commented Jun 6, 2021

@njohnsson You can see why we revoked the nbformat packages. It was horribly problematic, and attempts to correct the lexicon were still showing problematic combinations. The new version (5.1.3) just uses hashes.
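
For anyone curious what a hash-style id looks like in practice, here is a minimal sketch. It assumes the ids are short hex strings cut from a random UUID (which matches ids such as '6a14e3ed' quoted elsewhere in this thread) and that valid ids are 1 to 64 characters drawn from letters, digits, '-' and '_', as I understand the cell-id rules. The function names are made up for illustration and are not nbformat's API.

```python
import re
import uuid

# Assumed cell-id rule: 1-64 characters from letters, digits, '-' and '_'.
CELL_ID_PATTERN = re.compile(r"[a-zA-Z0-9_-]{1,64}")


def random_cell_id(length=8):
    """Return a short random hex id such as '6a14e3ed'."""
    return uuid.uuid4().hex[:length]


def is_valid_cell_id(cell_id):
    """Check a candidate id against the assumed pattern and length limit."""
    return isinstance(cell_id, str) and CELL_ID_PATTERN.fullmatch(cell_id) is not None


print(random_cell_id())                     # a different 8-character hex string each run
print(is_valid_cell_id("waiting-opening"))  # old word-pair ids also satisfy the pattern
```

Both the old word-pair ids and the hex ids are valid under this rule; the change only affects how new ids are generated.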

@EricThomson FYI the change needed is around nbformat and notebook server. The jupyter_client package is mostly unrelated.

@hadivafaii

hadivafaii commented Jun 7, 2021

This worked for me as a temporary fix:

import uuid
from glob import glob

import nbformat as nbf


def get_cell_id(id_length=8):
    # short random hex id, e.g. '6a14e3ed'
    return uuid.uuid4().hex[:id_length]


# your notebook name/keyword
nb_name = 'my_notebook'
# notebooks in the current directory whose path contains nb_name
notebooks = [path for path in glob("./*.ipynb") if nb_name in path]

# iterate over notebooks
for ipath in sorted(notebooks):
    # load notebook without converting its format version
    ntbk = nbf.read(ipath, nbf.NO_CONVERT)

    # cells written by older nbformat versions may have no 'id' key
    cell_ids = [cell['id'] for cell in ntbk.cells if 'id' in cell]

    # reset cell ids if there are duplicates
    if len(cell_ids) != len(set(cell_ids)):
        for cell in ntbk.cells:
            cell['id'] = get_cell_id()

        nbf.write(ntbk, ipath)

@irmagaladi

Also for me! Thank you very much for sharing.
I cannot copy-paste cells in my jupyter notebook... because this error always appears afterwards.
But with this code, the error disappears. After running it, what worked for me was to save and, in the message that appears in a new window, click on "Reload".

@aloosley

aloosley commented Jul 8, 2021

I have been getting this as I do a lot of multi-cell ctrl-c/ctrl-v in my standard jupyter notebook lately (not jupyterlab). I am on jupyter_client 6.1.12 in Windows 10 installed using conda, working in Firefox with Python 3.7.6.

It seems to go away when I close/reopen the nb.

I appear to have the same issue after copying / pasting cells (pop_os! 20.10, python 3.9.5, jupyter 1.0.0). Thanks for the fix everybody.

@EricThomson

EricThomson commented Jul 14, 2021

Workaround I've been using: cut in command mode (blue margin) but paste in edit mode (green margin).

No more errors.

Downside: when you paste in edit mode it all gets thrown into one cell, and it is put in code mode, so if you have a ton of formatted cells with lots of markdown, you will have to redo that. For my use case it is not that big of a deal, so I'm pretty happy with this workaround.

@shankari

shankari commented Jul 24, 2021

@MSeal I am still seeing name-based cell ids, even on nbformat 5.1.3.

Background:

  • I have a repository with some notebooks checked in.
  • The notebooks don't have outputs embedded.
  • My goal is to be able to diff the notebooks and review them and to follow a standard source control format

Since around March of this year, the "cell id" has been causing problems with this approach: when I use "Reset kernel and Clear Outputs" to clear my outputs before checking in, all the ids change. This makes it really hard to identify the real changes.
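
One way to tell real changes from id churn while reviewing is to compare the two versions with the ids ignored. The following is a sketch only: it works on notebooks already loaded as plain dicts (for example via `json.load`), and the helper names `without_cell_ids` and `same_apart_from_ids` are hypothetical.

```python
import copy


def without_cell_ids(notebook):
    """Return a deep copy of a notebook dict with every cell's 'id' removed."""
    stripped = copy.deepcopy(notebook)
    for cell in stripped.get("cells", []):
        cell.pop("id", None)
    return stripped


def same_apart_from_ids(old, new):
    """True when two notebook dicts differ only in their cell ids."""
    return without_cell_ids(old) == without_cell_ids(new)


old = {"cells": [{"cell_type": "markdown", "id": "cutting-april",
                  "metadata": {}, "source": ["### Read data and setup variables"]}]}
new = {"cells": [{"cell_type": "markdown", "id": "furnished-webcam",
                  "metadata": {}, "source": ["### Read data and setup variables"]}]}
print(same_apart_from_ids(old, new))
```

The stripped copies are meant for comparison only. Writing them back to disk would remove ids from a notebook whose format version may require them.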

I found this PR today, so I upgraded to nbformat 5.1.3

Downloading and Extracting Packages
folium-0.12.0        | 64 KB     | ################################################################### | 100%
nbformat-5.1.3       | 47 KB     | ################################################################### | 100%
...

$ conda list | grep nb
libblas                   3.9.0                8_openblas    conda-forge
libcblas                  3.9.0                8_openblas    conda-forge
liblapack                 3.9.0                8_openblas    conda-forge
libopenblas               0.3.12          openmp_h54245bb_1    conda-forge
nbclient                  0.5.1                      py_0    conda-forge
nbconvert                 6.0.7            py37hf985489_3    conda-forge
nbformat                  5.1.3              pyhd8ed1ab_0    conda-forge
widgetsnbextension        3.5.1            py37hf985489_4    conda-forge

I then ran "Kernel -> Restart and Clear Output". The new ids generated are still name-based and don't appear to be hashes - e.g.

-   "id": "cutting-april",
+   "id": "furnished-webcam",
    "metadata": {},
    "source": [
     "### Read data and setup variables"
@@ -119,7 +131,7 @@
   {
    "cell_type": "code",
    "execution_count": null,
-   "id": "external-westminster",
+   "id": "french-place",

Is this expected?

My project is open source, so I have an environment and notebooks that I can share with you. But it seems like what you really need are logs. Happy to send you as many as you like if you let me know where to get them.

This is currently 100% reproducible for me.

@shankari

shankari commented Jul 25, 2021

Good news: I created a new notebook instead of editing an existing one, and now the IDs do seem to be hashes!
Bad news: The ids still seem to change although there are no changes in the cell

See e-mission/e-mission-eval-private-data@d81a234 for an example. There are no changes in several of the cells, but the hashes have changed.

@ginward

ginward commented Jul 29, 2021

I am having similar issues ...

@chivalry123

any solution yet?

@SoundBoySelecta

SoundBoySelecta commented Aug 16, 2021

Similar issues; the only action I was performing was copying a horizontal line and pasting it elsewhere. It has something to do with what ET mentioned. The first time, I copied the horizontal line in command (blue) mode and pasted without creating a new cell, so it was pasted in command mode while an existing empty cell was highlighted. The second time, I copied it but pasted the markdown into a cell in edit mode (green). I wasn't able to replicate it in a new page. What's also interesting (not sure if intentional): when I paste the horizontal line in command mode versus pasting the markdown code into a cell in edit mode, the colours of the lines are two different shades.

[Screenshot: Screen Shot 2021-08-15 at 11 35 12 PM]

@SoundBoySelecta

I used "Download as", saved the file as a .ipynb, and reopened it with no errors. One method I was thinking of, if the error persists, is to import the raw format, which is a dict of dicts, and write some code to get all the "id" values, find the duplicate id, count the index position of the duplicate "id", then delete that cell in either the nb or the dict of dicts.
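
A rough sketch of that idea, assuming the notebook has been loaded as a plain dict whose "cells" entry is a list of cell dicts. The function name `duplicate_id_positions` and the sample data are invented for illustration.

```python
import json
from collections import defaultdict


def duplicate_id_positions(notebook):
    """Map each repeated cell id to the indices of the cells that carry it."""
    positions = defaultdict(list)
    for index, cell in enumerate(notebook.get("cells", [])):
        cell_id = cell.get("id")
        if cell_id is not None:
            positions[cell_id].append(index)
    return {cid: idxs for cid, idxs in positions.items() if len(idxs) > 1}


# A tiny stand-in for the contents of an .ipynb file.
raw = '{"cells": [{"id": "moving-ultimate"}, {"id": "civil-spring"}, {"id": "moving-ultimate"}]}'
notebook = json.loads(raw)
print(duplicate_id_positions(notebook))
```

For a real file, replace `json.loads(raw)` with `json.load` on the opened notebook. Giving the later cells a new id is probably safer than deleting them, since a pasted cell usually holds content you want to keep.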

@JMBurley

I just updated my notebook packages yesterday and started getting this error when I copy-paste anything (perhaps because I am working with notebooks made on older versions of Jupyter Notebook).

It is pretty crippling to workflow. Any copy-paste of a cell means you have to close and reopen the notebook to stop the save warnings popping up every few mins. Not ideal.

@JMBurley

@kevin-bates @MSeal Is this issue extant and does it have a path to solution?

@hadivafaii posted a solution that could be a fallback in the save routine to stop this problem ever reaching the end user.

@caitlinpetro

+1 also still having this problem

@MSeal

MSeal commented Sep 27, 2021

The solution has to be done in the Jupyter frontend here. I don't maintain any of the JS code here, but the path @hadivafaii takes should be how the save operation needs to change on the JS side of things.

@jasongrout
Member

jasongrout commented Sep 29, 2021

FYI, here is the issue and pull request we made in jlab for supporting cell ids.

@blois

blois commented Sep 29, 2021

Is this related to #5928? It should be clearing the ID before putting the item on the clipboard but appears that something is going wrong there.

I'm having trouble repro'ing, but would it make sense to revert #5928? Being unable to save is worse than unstable cell IDs. Though there may just be an edge case which was missed in the PR and can be cleaned up.
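
To make the intended behavior concrete, here is a Python sketch of "clear the id on copy, assign a fresh one on paste". It is not the notebook frontend's JavaScript and not the code from #5928; the function names and the dict-based cell model are assumptions made purely for illustration.

```python
import copy
import uuid


def copy_to_clipboard(cell):
    """Return a clipboard copy of a cell dict with its id cleared."""
    clip = copy.deepcopy(cell)
    clip.pop("id", None)
    return clip


def paste_from_clipboard(notebook, clip, index):
    """Insert a copy of the clipboard cell at index with a freshly generated id."""
    existing = {cell.get("id") for cell in notebook["cells"]}
    new_cell = copy.deepcopy(clip)
    new_id = uuid.uuid4().hex[:8]
    while new_id in existing:  # regenerate on the unlikely chance of a clash
        new_id = uuid.uuid4().hex[:8]
    new_cell["id"] = new_id
    notebook["cells"].insert(index, new_cell)
    return new_cell


notebook = {"cells": [{"cell_type": "code", "id": "6a14e3ed", "source": "a = 1"}]}
clip = copy_to_clipboard(notebook["cells"][0])
paste_from_clipboard(notebook, clip, 1)
print([cell["id"] for cell in notebook["cells"]])
```

If some second copy path puts the cell on the clipboard with its id intact, the pasted cell would keep the original id, which would be consistent with the duplicate reported on save.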

@MSeal

MSeal commented Sep 29, 2021

I'm not positive reverting cell id awareness would resolve the issue. Also, reverting it causes every notebook save to replace ids, which caused a lot of personal DMs to me about git diffs generated from other projects 😂.

@blois I can reproduce but I sometimes had to refresh the page twice or restart the server to get it to have the duplicate cell id. I'm not sure why local state was affecting it. Is there a second path for copy where it wouldn't clear the cell id in the buffer?

@blois

blois commented Sep 29, 2021

@MSeal can nbformat's constraints be relaxed to resolve the conflict by re-generating conflicting keys? I think that's what users want in the end- for the tools to make a best effort but continue working.

@MSeal

MSeal commented Sep 29, 2021

@blois it's actually doing that already. It's just complaining about it in the UI as a warning on each save once in this state:

Notebook JSON is invalid: Non-unique cell id '6a14e3ed' detected. Corrected to 'cabc0751'.

is the message sent back and the save isn't halted.
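
As an illustration of that "correct and continue" behavior, the sketch below regenerates only the conflicting ids and leaves the first occurrence of each id untouched. It operates on a plain notebook dict and is not nbformat's actual validation code; `correct_duplicate_ids` is a hypothetical name.

```python
import uuid


def correct_duplicate_ids(notebook):
    """Give later cells that reuse an id a fresh one; return (old, new) pairs."""
    cells = notebook.get("cells", [])
    taken = {cell["id"] for cell in cells if "id" in cell}
    seen = set()
    corrections = []
    for cell in cells:
        cell_id = cell.get("id")
        if cell_id is None:
            continue
        if cell_id in seen:
            new_id = uuid.uuid4().hex[:8]
            while new_id in taken:  # avoid clashing with any id in the notebook
                new_id = uuid.uuid4().hex[:8]
            taken.add(new_id)
            cell["id"] = new_id
            corrections.append((cell_id, new_id))
        else:
            seen.add(cell_id)
    return corrections


notebook = {"cells": [{"id": "6a14e3ed"}, {"id": "6a14e3ed"}, {"id": "other"}]}
for old, new in correct_duplicate_ids(notebook):
    print(f"Non-unique cell id '{old}' detected. Corrected to '{new}'.")
```

Unlike the script posted earlier in the thread, which resets every id once a duplicate is found, this keeps existing ids stable and so produces smaller diffs under version control.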

@JMBurley

JMBurley commented Oct 13, 2021

@blois To be clear, the problem isn't that saving is prevented (it actually saves fine).

The problem is that once the error occurs, it will pop up that warning EVERY. TIME. IT. SAVES. And the popup steals focus from whatever cell you were typing in at the time. And the popup needs a mouse interaction to get rid of it.

All of which is, to put it mildly, a bit annoying and not conducive to a good workflow. I had to make the resave code an importable function because I have to use it so frequently to fix continually breaking notebooks.

@SoundBoySelecta

SoundBoySelecta commented Oct 14, 2021 via email

@eric-ljk

eric-ljk commented Nov 3, 2021

This solution worked for me. https://stackoverflow.com/a/69291092/17288157.

I created a new folder. I put the code from the link in a notebook in the folder, and put the problematic notebook in the same folder. I then changed 'my_notebook' (that was assigned to nb_name) to the filename of my notebook, with the extension.

After running the code from the link, the problematic notebook could save with no issues.

The code was as follows:

import nbformat as nbf
from glob import glob

import uuid
def get_cell_id(id_length=8):
    return uuid.uuid4().hex[:id_length]

# -- SETUP
nb_name = 'my_notebook'

# -- MAIN
# grab notebook
notebooks = list(filter(lambda x: nb_name in x, glob("./*.ipynb", recursive=True)))

# iterate over notebooks
for ipath in sorted(notebooks):
    # load notebook
    ntbk = nbf.read(ipath, nbf.NO_CONVERT)
    
    cell_ids = []
    for cell in ntbk.cells:
        cell_ids.append(cell['id'])

    # reset cell ids if there are duplicates
    if not len(cell_ids) == len(set(cell_ids)): 
        for cell in ntbk.cells:
            cell['id'] = get_cell_id()

        nbf.write(ntbk, ipath)

@saraafernandeez

saraafernandeez commented Nov 3, 2021

Same thing happened to me and I found the solution. First select all the cells in the Jupyter Notebook, then press the "cut/scissors" button (don't panic, at this point your Notebook will be empty), and finally press the "paste" button.

This worked for me after a long time !! I hope it helps you.

@eric-ljk

eric-ljk commented Nov 3, 2021

@saraafernandeez Wonderful solution, thanks! Tested it.

@ricejack

@saraafernandeez this is by far the easiest solution, especially if you have a lot of markdown cells. thank you!

@alainrafiki

(Quoting @hadivafaii's temporary-fix script from Jun 7, 2021.)

Thanks, worked for me!

@luisjvca-menhir

I had the same problem. Thank you for the temporary fix! In any case, I hope it will be properly fixed in newer versions :)

@sergiomora03

Same thing happened to me and I found the solution. First select all the cells in the Jupyter Notebook, then press the "cut/scissors" button (don't panic, at this point your Notebook will be empty), and finally press the "paste" button.

This worked for me after a long time !! I hope it helps you.

Thanks @saraafernandeez!!!

@YuyaR

YuyaR commented Mar 21, 2022

For me the problem was actually duplicate cells -- cells with exact same input and output. Once I deleted those Github was able to render the notebook!

@DeepakSaini119

DeepakSaini119 commented May 10, 2022

(Quoting @hadivafaii's temporary-fix script from Jun 7, 2021.)

Thanks, running this code in the notebook and reloading the notebook solved the issue for me.

@vivianamarquez

I've had the same problem for over a year using different computers

@sergiomora03

Same thing happened to me and I found the solution. First select all the cells in the Jupyter Notebook, then press the "cut/scissors" button (don't panic, at this point your Notebook will be empty), and finally press the "paste" button.

This worked for me after a long time !! I hope it helps you.

@vivianamarquez Maybe, this way could help u?

@guzmanojero

I had the same issue. I don't understand why it happened, as nothing changed in my environment.

I have many files in a folder and this error occurs in all of them.
It even happens in files that used to work well, but I didn't touch in a while (like not copying/pasting any cells).

PARTIAL SOLUTION: I just copied the files into a different folder and all worked fine.

:0

@GabrielOsuna

I have recently started receiving this popup error frequently when saving notebooks:

Title: Notebook validation failed

The save operation succeeded, but the notebook does not appear to be valid. The validation error was:
-------------------------
Notebook validation failed: Non-unique cell id 'waiting-opening' detected. Corrected to 'noted-romania'.:
"<UNKNOWN>"

I just had the same error when copy-pasting cells, on Jupyter. Duplicating the nb worked for me and have not received the error message "Notebook validation failed: Non-unique cell id '########' detected. Corrected to '########'.: UNKNOWN" again. :)

@Ilham-Rofii

My solution: I uploaded my .ipynb file to Google Drive, opened it in Google Colab, saved and downloaded it, then overwrote my old file with the new one from Google Colab. The error message disappeared.

@laurentperrinet

A simple solution is to use nbconvert:

jupyter nbconvert --allow-errors  --to notebook  --inplace mynotebook.ipynb
