Cannot load notebooks: ModuleNotFoundError: Error processing dotted path #878

Closed
lambdaofgod opened this issue Jun 24, 2022 · 16 comments · Fixed by #923
Labels
bug (Something isn't working), stash (Label used to categorize issues that will be worked on next)

Comments

@lambdaofgod

I have a notebooks folder inside a ploomber project, and ploomber tries to run something when I open a notebook, so I can't even load it (and BTW the notebook is not referenced in pipeline.yaml in any way).

What happened? How is it possible that ploomber breaks jupyter?

Error message

[I 11:59:22.671 NotebookApp] [Ploomber] Requested model: work/wwf/notebooks/Enrich_Test.ipynb. Looking for DAG with root dir: /home/kuba/Projects
[E 11:59:22.786 NotebookApp] Uncaught exception GET /api/contents/work/wwf/notebooks/Enrich_Test.ipynb?type=notebook&_=1656064762442 (127.0.0.1)
    HTTPServerRequest(protocol='http', host='localhost:8888', method='GET', uri='/api/contents/work/wwf/notebooks/Enrich_Test.ipynb?type=notebook&_=1656064762442', version='HTTP/1.1', remote_ip='127.0.0.1')
    Traceback (most recent call last):
      File "/home/kuba/.local/lib/python3.8/site-packages/tornado/web.py", line 1704, in _execute
        result = await result
      File "/home/kuba/.local/lib/python3.8/site-packages/tornado/gen.py", line 234, in wrapper
        yielded = ctx_run(next, result)
      File "/home/kuba/.local/lib/python3.8/site-packages/notebook/services/contents/handlers.py", line 118, in get
        model = yield maybe_future(self.contents_manager.get(
      File "/home/kuba/.local/lib/python3.8/site-packages/ploomber/jupyter/manager.py", line 267, in get
        self.load_dag(
      File "/home/kuba/.local/lib/python3.8/site-packages/ploomber/jupyter/manager.py", line 193, in load_dag
        pairs = [(resolve_path(
      File "/home/kuba/.local/lib/python3.8/site-packages/ploomber/jupyter/manager.py", line 196, in <listcomp>
        if t.source.loc is not None]
      File "/home/kuba/.local/lib/python3.8/site-packages/ploomber/sources/pythoncallablesource.py", line 130, in loc
        self._loc = self._callable_loader.get_loc()
      File "/home/kuba/.local/lib/python3.8/site-packages/ploomber/sources/pythoncallablesource.py", line 58, in get_loc
        loc, _ = lazily_locate_dotted_path(self._primitive)
      File "/home/kuba/.local/lib/python3.8/site-packages/ploomber/util/dotted_path.py", line 343, in lazily_locate_dotted_path
        raise ModuleNotFoundError('Error processing dotted '
    ModuleNotFoundError: Error processing dotted path 'deepsense_wwf.data_utils.copy_zipped_data': 'deepsense_wwf' appears to be a namespace package, which are not supported
[W 11:59:22.790 NotebookApp] Unhandled error
[E 11:59:22.791 NotebookApp] {
      "Host": "localhost:8888",
      "Accept": "application/json, text/javascript, */*; q=0.01",
      "Referer": "http://localhost:8888/notebooks/work/wwf/notebooks/Enrich_Test.ipynb",
      "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:101.0) Gecko/20100101 Firefox/101.0"
    }
@edublancas
Contributor

As a temporary measure, you may disable the extension with:

jupyter serverextension disable ploomber

Here's what's happening:

Once the extension is installed, ploomber runs a function whenever you open any notebook and checks whether that notebook belongs to the pipeline.yaml. If it does, it injects the cell; if it doesn't, it does nothing. Is this breaking Jupyter for you? If so, please tell us a bit more so we can understand the problem, since that should not happen. Ploomber should simply ignore the notebook and let you use it as usual.

Since ploomber needs to determine whether the current notebook belongs to a pipeline, it needs to load your pipeline. It seems that something is breaking during this loading process and it fails to load deepsense_wwf.data_utils.copy_zipped_data. Can you try running:

ploomber status

First in the same folder that contains your pipeline.yaml and then in the folder that contains the notebook you're opening (Enrich_Test.ipynb); let me know if you get the same error.
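
The error in the traceback says that 'deepsense_wwf' appears to be a namespace package. As a rough way to check that from the same Python environment, here is a minimal sketch using only the standard library. The helper names top_level_package and is_namespace_package are made up for illustration and are not part of ploomber; the check assumes Python 3.7+, where a namespace package's spec has no origin.

```python
import importlib.util


def top_level_package(dotted_path):
    """Return the first component of a dotted path such as 'pkg.module.func'."""
    return dotted_path.split(".")[0]


def is_namespace_package(name):
    """Return True if `name` resolves to a namespace package.

    A namespace package is a directory on sys.path without an __init__.py;
    on Python 3.7+ its spec has no origin but does have search locations.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        # Not importable at all from the current environment
        return False
    return spec.origin is None and spec.submodule_search_locations is not None


if __name__ == "__main__":
    dotted_path = "deepsense_wwf.data_utils.copy_zipped_data"
    package = top_level_package(dotted_path)
    print(package, "is a namespace package:", is_namespace_package(package))
```

If this reports a namespace package, the package directory most likely has no __init__.py, which matches the "not supported" message in the traceback.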

@lambdaofgod
Author

Thanks, it worked for jupyter notebook but not for jupyterlab

I disabled it for jupyterlab, and when I run jupyter labextension list I don't see ploomber, but it still loads when I run jupyter lab.

The funny part is I don't even remember installing ploomber extension...

@edublancas
Contributor

Ok, thanks for the feedback. The ploomber extension activates when you run pip install ploomber. So if you uninstall ploomber, it should remove it as well.

It's weird that disabling it didn't turn it off on jupyter, but many users have experienced issues with this in the past (not being able to activate it, the extension doesn't show up, etc). So who knows what's happening.

this is a pretty bad bug so we'll get to it. I'll work on it this week.

@lambdaofgod
Author

Wouldn't disabling automatic installation of the extension basically solve the problem? I would prefer to have the option to install the extension myself, and in the meantime the biggest problem would go away.

I didn't even know that I had this extension until it failed 😄

@edublancas
Contributor

good point. the problem is that if the extension doesn't enable automatically, then most people will never find it. so I think it's best to have it turned on but we should ensure it doesn't break Jupyter.

@edublancas
Contributor

I'm trying to debug this but I'm unable to reproduce it. Does deepsense_wwf.data_utils.copy_zipped_data appear in your pipeline.yaml? Or in any of them, if you have more than one? It looks like the extension is trying to load it since it's part of a pipeline, but is unable to find it.

@lambdaofgod
Author

Yes. Basically the pipeline is broken.
But I find it overkill that I can't open notebooks from the project folder, even though I didn't tell ploomber to do anything with notebooks.

@edublancas
Contributor

edublancas commented Jul 1, 2022

Yeah, I agree. We'll fix this as soon as possible. It's pretty bad that ploomber is breaking jupyter just because the pipeline isn't loading. I'll try to reproduce and follow up with more questions if needed

Notes for when we fix this: the problem occurs when accessing .source.loc, which breaks if it is unable to find the source for a function in the pipeline.yaml. We need a try/except around this statement:

pairs = [(resolve_path(

@edublancas added the bug and stash labels Jul 1, 2022
@edublancas
Contributor

@lambdaofgod can you install from git and let me know if the error persists?

pip uninstall ploomber -y
pip install git+https://github.com/ploomber/ploomber@master

@94rain
Contributor

94rain commented Jul 18, 2022

> Yeah, I agree. We'll fix this as soon as possible. It's pretty bad that ploomber is breaking jupyter just because the pipeline isn't loading. I'll try to reproduce and follow up with more questions if needed
>
> Notes for when we fix this, the problem is when accessing .source.loc, it'll break if unable to find the source for the function in the pipeline.yaml
>
> pairs = [(resolve_path(
>
> , we need a try catch there

How should we handle the exception in the except clause? By assigning an empty list (pairs = [])?
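
One possible alternative to falling back to an empty list for everything, sketched here under assumptions: guard each .loc lookup individually, so a single broken task is skipped while the others are kept. FakeSource, safe_loc and collect_locs are hypothetical names used only for this illustration; they are not ploomber's classes or functions, and the real pairs = ... statement is only partially shown in this thread.

```python
import logging

logger = logging.getLogger(__name__)


class FakeSource:
    """Hypothetical stand-in for a task source; not ploomber's class."""

    def __init__(self, loc=None, error=None):
        self._loc = loc
        self._error = error

    @property
    def loc(self):
        # Mimic a lookup that may fail, as in the traceback in this issue
        if self._error is not None:
            raise self._error
        return self._loc


def safe_loc(source):
    """Return source.loc, or None if looking it up raises."""
    try:
        return source.loc
    except Exception:
        logger.exception("Could not locate source, skipping it")
        return None


def collect_locs(sources):
    """Collect locations, dropping sources with no (or a failing) location."""
    locs = (safe_loc(source) for source in sources)
    return [loc for loc in locs if loc is not None]
```

The trade-off is that this only covers failures raised by .loc; anything else that goes wrong while loading the pipeline would still propagate.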

@edublancas
Contributor

@94rain before you start working on this, ensure you get the latest version from master since I recently pushed changes to this file.

I couldn't reproduce @lambdaofgod error, but the overall problem is that we should catch any exceptions raised by load_dag.

def load_dag(self, starting_dir=None, log=True, model=None):

So on second thought, the solution should be more general (as opposed to only covering the pairs = ... statement).

something like this should work:

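def _load_dag(self, starting_dir=None, log=True, model=None):
    # actual implementation
    ...

def load_dag(self, starting_dir=None, log=True, model=None):
    try:
        self._load_dag(starting_dir=starting_dir, log=log, model=model)
    except Exception as e:
        # log the exception along with a message such as
        # "a problem happened when loading your pipeline"
        pass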

I'm unsure if this would cause any side effects, so let's get this change done and then see if any tests break.
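
To make the intended behavior concrete, here is a small self-contained sketch of the wrapper pattern. DemoManager and its get method are hypothetical stand-ins and not ploomber's actual contents manager; the point is only that a failure inside _load_dag is logged and the notebook request still completes.

```python
import logging

logger = logging.getLogger(__name__)


class DemoManager:
    """Hypothetical stand-in for the contents manager; not ploomber's class."""

    def __init__(self, fail=False):
        self.fail = fail
        self.dag = None

    def _load_dag(self):
        # Stand-in for the real loading logic, which may raise anything
        if self.fail:
            raise ModuleNotFoundError("Error processing dotted path")
        self.dag = "loaded"

    def load_dag(self):
        """Load the DAG, logging (not propagating) any failure."""
        try:
            self._load_dag()
        except Exception:
            self.dag = None
            logger.exception("A problem happened when loading your pipeline")

    def get(self, path):
        # The notebook is returned whether or not the pipeline loaded
        self.load_dag()
        return {"path": path, "dag_loaded": self.dag is not None}
```

With fail=True, get still returns a model and the traceback only shows up in the log, which is roughly the behavior being asked for in this issue.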

@edublancas
Contributor

What we want is for Jupyter to keep working normally if load_dag fails.

@94rain
Contributor

94rain commented Jul 19, 2022

I just opened a PR #923 (with all tests passing) that hopefully will fix this.

@lambdaofgod
Author

I can confirm this is fixed. Thanks!

@idomic
Contributor

idomic commented Aug 2, 2022

@lambdaofgod feel free to post about our delivery speed on social media: 2 weeks from bug to production 😁🙌

@edublancas
Contributor

edublancas commented Aug 2, 2022

click here to tweet something nice
