Contributor

@tolgacangoz tolgacangoz commented Jun 5, 2024

This pull request updates the is_google_colab check to use an environment variable instead of checking for the presence of the google.colab module, which makes the check more reliable. Currently, running !diffusers-cli env in a Colab cell reports No.

Also, the is_notebook check gives No if someone runs !diffusers-cli env in a notebook cell. I guess this is because the checking code runs from a .py file? What should we do here? I took this check from huggingface_hub. In the PR that proposed the expansion, I don't remember whether I tried the command in a cell; I probably ran the checking code directly within a cell. The same goes for is_google_colab. Should I remove is_notebook completely?

@tolgacangoz tolgacangoz changed the title Fix Google Colab check for diffusers-cli env Fix Notebook and Colab checks for diffusers-cli env Jun 5, 2024
@tolgacangoz tolgacangoz changed the title Fix Notebook and Colab checks for diffusers-cli env Fix Colab and Notebook checks for diffusers-cli env Jun 5, 2024
@sayakpaul sayakpaul requested a review from BenjaminBossan June 6, 2024 06:47
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Member

@BenjaminBossan BenjaminBossan left a comment

Not sure why I was tagged to review this as I never use colab :D

I've seen conflicting information online about what does or does not work to detect the colab environment. This is probably also something that has changed over time.

As is, the current method works for me, as does the suggested change:

[image]

I wouldn't feel comfortable making this change before we understand what works in what circumstances.

@tolgacangoz
Contributor Author

"google.colab" in sys.modules gives False if we run !diffusers-cli env in a cell, but evaluated directly in the cell it gives True. Checking the env variable gives True in both cases. I don't know whether the name of the env variable will change in the future, though. A similar thing happens with the is_notebook check as well.
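
A minimal sketch of why the two checks can disagree, assuming that `!command` in a cell starts a separate process (which is how IPython's shell escape behaves). A child process inherits the parent's environment variables but starts with its own `sys.modules`. The stdlib module `colorsys`, the variable `DEMO_FLAG`, and the helper `probe_child` are stand-ins picked for illustration; none of them is something Colab or diffusers actually uses.

```python
import colorsys  # noqa: F401  # stand-in for a module only the parent has imported
import os
import subprocess
import sys

# Code executed in the child interpreter: report whether it sees the
# module in sys.modules and whether it sees the environment variable.
CHILD_CODE = (
    "import os, sys; "
    "print('colorsys' in sys.modules, os.environ.get('DEMO_FLAG'))"
)


def probe_child(flag_value: str = "1") -> str:
    """Run CHILD_CODE in a fresh interpreter with DEMO_FLAG set (hypothetical helper)."""
    env = dict(os.environ, DEMO_FLAG=flag_value)
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print("parent sees module:", "colorsys" in sys.modules)
    print("child reports:", probe_child())
```

The parent has the module loaded, while the child only sees the environment variable. That matches the behaviour described above for a CLI command launched from a cell.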

@BenjaminBossan
Member

Oh I see, so somehow Google Colab injects the library into ipython but not into python:

!python -c 'import sys;print("google.colab" in sys.modules)'
False
!ipython -c 'import sys;print("google.colab" in sys.modules)'
True

Relying on a single env var still sounds fragile to me, as there is no guarantee that it won't change at any moment. The most official statement I've seen is this reddit comment by a Google Colab employee, and it uses the current approach.

I wonder if we should check if any(k.startswith("COLAB_") for k in os.environ) to be a bit more robust.
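
One way the two signals discussed here could be combined, as a rough sketch rather than the code that ended up in the PR. The function name `looks_like_colab` and its injectable parameters are hypothetical; they exist only so the logic can be exercised outside Colab. Whether any particular `COLAB_*` variable is set is not guaranteed.

```python
import os
import sys
from typing import Container, Mapping, Optional


def looks_like_colab(
    modules: Optional[Container[str]] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> bool:
    """Best-effort Colab detection (hypothetical helper, not the diffusers API)."""
    modules = sys.modules if modules is None else modules
    environ = os.environ if environ is None else environ
    # Signal 1: the module is already loaded (true inside a Colab cell).
    module_loaded = "google.colab" in modules
    # Signal 2: any COLAB_-prefixed env var, which a child process also inherits.
    has_colab_var = any(key.startswith("COLAB_") for key in environ)
    return module_loaded or has_colab_var


if __name__ == "__main__":
    print("Running in Google Colab?:", "Yes" if looks_like_colab() else "No")
```

Checking a prefix instead of one exact variable name tolerates a rename of a single variable, but it still depends on Colab keeping the `COLAB_` prefix.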

@tolgacangoz
Contributor Author

Also, !diffusers-cli env produces "No" for notebook checking as well if it is run in a notebook cell. I guess, because the checking code itself is in a .py file? Let's remove is_notebook completely for now?

@BenjaminBossan
Member

Also, !diffusers-cli env produces "No" for notebook checking as well if it is run in a notebook cell.

Even with the proposed changes?

@tolgacangoz
Contributor Author

I meant this:

# Taken from `huggingface_hub`.
_is_notebook = False
try:
    shell_class = get_ipython().__class__  # type: ignore # noqa: F821
    for parent_class in shell_class.__mro__:  # e.g. "is subclass of"
        if parent_class.__name__ == "ZMQInteractiveShell":
            _is_notebook = True  # Jupyter notebook, Google colab or qtconsole
            break
except NameError:
    pass  # Probably standard Python interpreter

get_ipython() is only available inside a notebook/IPython session. But here the check lives in a .py file 🤔.
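
The same check wrapped in functions, as a sketch that makes the failure mode easier to see. The names `is_zmq_shell` and `running_in_notebook` are hypothetical. When this code runs in a plain Python process, `get_ipython` is never defined, so the `NameError` branch is taken and the result is False no matter where the command was launched from.

```python
def is_zmq_shell(shell: object) -> bool:
    """True if the shell's class derives from a class named ZMQInteractiveShell."""
    return any(
        cls.__name__ == "ZMQInteractiveShell" for cls in type(shell).__mro__
    )


def running_in_notebook() -> bool:
    """Best-effort notebook check; only meaningful inside an IPython process."""
    try:
        # IPython makes get_ipython available in its own sessions; a plain
        # interpreter does not define it, so the lookup raises NameError.
        shell = get_ipython()  # type: ignore[name-defined]  # noqa: F821
    except NameError:
        return False
    return is_zmq_shell(shell)


if __name__ == "__main__":
    print("Running in a notebook?:", "Yes" if running_in_notebook() else "No")
```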

@tolgacangoz
Contributor Author

Can/Should I remove is_notebook for now?

@BenjaminBossan
Member

I meant this:

I see. Indeed that doesn't work inside a colab notebook but at least locally, it works in a Jupyter notebook. So maybe we can extend the logic to say _is_notebook = _is_notebook or _is_google_colab (except if _is_notebook is supposed to not include colab notebooks, @sayakpaul do you know this?).

@tolgacangoz
Contributor Author

When I run !diffusers-cli env in a Jupyter notebook cell on my laptop, it says No. Do you see Yes on your local machine?

@sayakpaul
Member

except if _is_notebook is supposed to not include colab notebooks, @sayakpaul do you know this?

We could include the logic of Colab check within the scope of the notebooks, too if that makes sense?

@BenjaminBossan
Member

When I run !diffusers-cli env in a Jupyter notebook cell on my laptop, it says No. Do you see Yes on your local machine?

Ah yes you're right, sorry for my mistake.

At least in my environment, I see that jupyter sets JPY_SESSION_NAME and JPY_PARENT_PID. So similarly to what we do for colab, we could check for env vars that start with JPY_.

@@ -330,7 +330,7 @@ def is_timm_available():
            _is_notebook = True  # Jupyter notebook, Google colab or qtconsole
            break
except NameError:
    pass  # Probably standard Python interpreter
_is_notebook = any(k.startswith("JPY_") for k in os.environ)
Contributor Author

It is more generic in this way, right?
Btw, this gives Yes in a conventional Jupyter notebook; but gives No in VS Code's notebook :D

Contributor Author

@tolgacangoz tolgacangoz Jun 19, 2024

Can I add any(k.startswith("VSCODE") for k in os.environ) as well? No, this gives Yes in VS Code's terminal too. But, "VSCODE_CLI" env var seems to give Yes only in the VS Code notebook.
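
A sketch of how the two observations in this review thread could be combined: a `JPY_` prefix for conventional Jupyter and the exact key `VSCODE_CLI` for VS Code notebooks. It rests only on what was reported in this thread; these variables may differ across Jupyter or VS Code versions, and `looks_like_notebook` is a hypothetical name.

```python
import os
from typing import Mapping, Optional


def looks_like_notebook(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Heuristic notebook detection from env vars (assumption-based sketch)."""
    environ = os.environ if environ is None else environ
    # Prefix match: covers vars such as JPY_SESSION_NAME and JPY_PARENT_PID.
    has_jupyter_var = any(key.startswith("JPY_") for key in environ)
    # Exact match: a bare "VSCODE" prefix would also match the VS Code terminal.
    has_vscode_cli = "VSCODE_CLI" in environ
    return has_jupyter_var or has_vscode_cli


if __name__ == "__main__":
    print("Running in a notebook?:", "Yes" if looks_like_notebook() else "No")
```

The exact-key test is deliberately narrower than the prefix test, because a broad `VSCODE` prefix was reported to give Yes in the VS Code terminal as well.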

Member

Well, we can't cover all possible cases. Honestly, I'm not sure why we need this information. For accelerate it is because there is a special notebook launcher. I don't know if/how this helps debugging diffusers, so I can't say what cases we need to cover.

@tolgacangoz
Contributor Author

I have just removed the notebook check completely. Let's merge this?

@sayakpaul
Member

@BenjaminBossan WDYT?

So, from what I understand, we are removing the notebook check because it can break and we cannot detect notebooks reliably most of the time, yeah?

@BenjaminBossan
Member

So my understanding is that the any(k.startswith("JPY_") for k in os.environ) should work for Jupyter notebooks. No idea about vscode notebooks, as I don't use vscode. So it would be possible to make this attribute work, even if it's not super robust.

As to whether it's needed: I don't know enough about diffusers to answer that. Would it in any way help to debug diffusers issues if we had this information? If not, I would tend to remove it. If yes, I'd keep it, even if it's not super reliable.

@sayakpaul
Member

Yeah I don’t think having this information is super important for debugging purposes TBH.

Member

@BenjaminBossan BenjaminBossan left a comment

Thanks a lot @tolgacangoz for fixing this and for your patience.

Yeah I don’t think having this information is super important for debugging purposes TBH.

In that case, the PR LGTM. I'll leave the final review and merge to @sayakpaul.

Member

@sayakpaul sayakpaul left a comment

Thanks a lot!

@sayakpaul sayakpaul requested a review from DN6 June 25, 2024 12:38
@tolgacangoz
Contributor Author

tolgacangoz commented Jul 18, 2024

Thanks for the approvals!

@DN6 Gently pinging here.

@tolgacangoz tolgacangoz requested a review from sayakpaul July 23, 2024 10:10
@DN6 DN6 merged commit cf55dcf into huggingface:main Jul 23, 2024
@DN6
Collaborator

DN6 commented Jul 23, 2024

Sorry for the delay @tolgacangoz appreciate your patience here.

@tolgacangoz
Contributor Author

tolgacangoz commented Jul 23, 2024

Thanks all for the reviews and merging!

@tolgacangoz tolgacangoz deleted the fix-envinfo-notebook branch July 23, 2024 12:46
sayakpaul added a commit that referenced this pull request Dec 23, 2024
* chore: Update is_google_colab check to use environment variable

* Check Colab with all possible COLAB_* env variables

* Remove unnecessary word

* Make `_is_google_colab` more inclusive

* Revert "Make `_is_google_colab` more inclusive"

This reverts commit 6406db2.

* Make `_is_google_colab` more inclusive.

* chore: Update import_utils.py with notebook check improvement

* Refactor import_utils.py to improve notebook detection for VS Code's notebook

* chore: Remove `is_notebook()` function and related code

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>