Fixed imports and Checked Notebooks #928

Merged (14 commits, Jan 19, 2023)

Conversation

@sawradip (Contributor) commented Jan 4, 2023

All notebooks are checked and working.
Added some helpful comments.

@review-notebook-app (bot) commented: Check out this pull request on ReviewNB to see visual diffs & provide feedback on Jupyter Notebooks.

@BenjaminBossan (Collaborator) left a comment

Thanks a lot for working on this. I have a few comments, but I'll add them here instead of on the notebook diff, because the diff view for notebooks is not really ergonomic.

1. If the import fails, you print the line:

If not already installed, you can install skorch by running 'pip install skorch'

I don't think that's necessary: if someone runs the notebook locally, we can assume that skorch is installed. If more dependencies are needed (e.g. gpytorch), we already document that. Therefore, I think you can remove that line and just pass.

2. Furthermore, you added the comment # Installation. Could you please extend it to # Installation on Google colab? The shorter version might not be clear to some users.
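
For reference, the resulting install cell would then look roughly like this (a sketch reconstructing the pattern from the review; the detection via importing google.colab and the subprocess call are assumptions, not the notebooks' exact code):

    import subprocess
    import sys

    # Installation on Google colab
    try:
        import google.colab
        subprocess.run([sys.executable, '-m', 'pip', 'install', 'skorch'])
    except ImportError:
        pass  # not on Colab; assume skorch is already installed locally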

3. For some of the notebooks, I could see that they were not run from top to bottom. E.g. in Basic_Usage.ipynb, after cell 12 comes cell 15, then 11. Some cells have not run at all. Before checking in a notebook, please ensure that it runs from top to bottom. This is important for reproducibility.

4. In Transfer_Learning.ipynb, we get a bunch of warnings:

UserWarning: This DataLoader will create 4 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.

They are really distracting, so let's get rid of them. Either suppress the warning, or, better, set the number of workers to 2 in cell 9:

    iterator_train__num_workers=2,
    iterator_valid__num_workers=2,
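
For context, skorch routes these dunder parameters to the corresponding DataLoader; a minimal self-contained sketch (the module below is a placeholder, not the notebook's actual model):

    import torch
    from torch import nn
    from skorch import NeuralNetClassifier

    class TinyModule(nn.Module):
        def __init__(self):
            super().__init__()
            self.lin = nn.Linear(20, 2)

        def forward(self, X):
            return torch.log_softmax(self.lin(X), dim=-1)

    net = NeuralNetClassifier(
        TinyModule,
        # iterator_train__* / iterator_valid__* arguments are passed on to
        # the train and validation DataLoaders, capping each at 2 workers:
        iterator_train__num_workers=2,
        iterator_valid__num_workers=2,
    )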

5. You deleted two notebooks, CORA-geometric.ipynb and MNIST-torchvision.ipynb. Could you please restore them?

6. For some notebooks, the diff view is completely broken: Gaussian_Processes.ipynb, Hugging_Face_Finetuning.ipynb, Hugging_Face_Model_Checkpoint.ipynb, Transfer_Learning.ipynb. This makes it difficult for me to check if you changed something besides the install instruction. Could you please say if you did?

7. In Basic_Usage.ipynb, you added import os in cell 35, but os is never used.

8. Finally, you checked in a bunch of files that are not required, in the datasets and runs folders. Could you please remove them?

@sawradip (Contributor, Author) commented Jan 5, 2023

Thank you for your reviews; I am fixing those. One thing I have faced and think should be fixed is:

[screenshot of the error traceback]

If a GPU is connected, this error arises.

I am seeing that changing this line

    samples = samples.detach().numpy()

to this line

    samples = samples.cpu().detach().numpy()

will make it work for both CPU and GPU.
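
A minimal standalone illustration of the fix (illustrative only, not the notebook's actual code):

    import torch

    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    samples = torch.randn(3, 4, device=device, requires_grad=True)

    # .numpy() only works on CPU tensors, so move the tensor to the CPU
    # first; .detach() removes it from the autograd graph:
    arr = samples.cpu().detach().numpy()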

And as you asked whether I changed anything in those notebooks: I don't remember changing anything that affects the code. One or two comments might have been added, but as some were already there, it is tough for me to figure out too, since the diff is not a particularly good view for notebooks. You can verify by taking a look at the comments.

@BenjaminBossan (Collaborator) left a comment

Thank you for implementing the change so quickly; this looks very good.

I saw that you added some comments; those that I found looked very reasonable (but again, they are hard to find with the diff). So let's keep it that way.

I only saw some minor issues left in Transfer_Learning.ipynb. In cell 9, could you please remove the line

iterator_valid__shuffle=True,

It was already wrong before, so this has nothing to do with your PR, but shuffling the validation data serves no purpose, so let's fix it.

In the same notebook, for some reason, there are still warnings about "This DataLoader will create 4 worker processes in total", even though the number of workers was reduced to 2. When I run the notebook on Colab, I don't get any warnings. Could you please try running it again?

Finally, could you please add an entry to CHANGES.md about fixing the install command in colab?

@sawradip (Contributor, Author) commented Jan 5, 2023

  • All the notebooks have been updated according to the code review.
  • All notebooks have been run top-to-bottom in Google Colab.

As this is my first time updating a PR, can you guide me if I need to do anything now? I have committed all the changes to my corresponding branch.

@BenjaminBossan (Collaborator) left a comment

@sawradip There is nothing more you need to do; after pushing to your branch, the PR is updated automatically.

Thanks for implementing the remaining fixes. The Transfer_Learning.ipynb notebook looks much better now.

I'm ready to merge, but will leave this PR open for a few days in case @ottonemo or @thomasjpfan want to add anything.

@BenjaminBossan (Collaborator)

Since there were no more comments, I assume it means we're good to go :)

@BenjaminBossan (Collaborator)

@sawradip

Just found this suggestion by a Google Colab PM (ckperry) on Reddit:

https://www.reddit.com/r/MachineLearning/comments/114hphp/comment/j8xxnpp/

Conclusion: there is no official way to check if a notebook is running on Colab. The suggested snippet does basically the same thing as your changes to the notebooks, so I think we're good here.

@sawradip (Contributor, Author)

Awesome! It was a great experience. I would like to contribute more later.

@sawradip (Contributor, Author) commented Mar 4, 2023

While working on a PR in OpenVINO, I found another way to verify. This is also not official, but I am leaving it here in case it is ever necessary.

from IPython.core.getipython import get_ipython

# Checking if running in Google Colab
IN_COLAB = 'google.colab' in str(get_ipython())

@BenjaminBossan (Collaborator)

Thanks, good to know. I prefer the current method, but it's good to have a backup.

BenjaminBossan added a commit that referenced this pull request May 17, 2023
Preparation for release of version 0.13.0

Release text:

The new skorch release is here and it has some changes that will be exciting for
some users.

- First of all, you may have heard of the [PyTorch 2.0
  release](https://pytorch.org/get-started/pytorch-2.0/), which includes the
  option to compile the PyTorch module for better runtime performance. This
  skorch release allows you to pass `compile=True` when initializing the net to
  enable compilation.
- Support for training on multiple GPUs with the help of the
  [`accelerate`](https://huggingface.co/docs/accelerate/index) package has been
  improved by fixing some bugs and providing a dedicated [history
  class](https://skorch.readthedocs.io/en/latest/user/history.html#distributed-history).
  Our documentation contains more information on [what to consider when training
  on multiple
  GPUs](https://skorch.readthedocs.io/en/latest/user/huggingface.html#caution-when-using-a-multi-gpu-setup).
- If you have ever been frustrated with your neural net not training properly,
  you know how hard it can be to discover the underlying issue. Using the new
  [`SkorchDoctor`](https://skorch.readthedocs.io/en/latest/helper.html#skorch.helper.SkorchDoctor)
  class will simplify the diagnosis of underlying issues. Take a look at the
  accompanying
  [notebook](https://nbviewer.org/github/skorch-dev/skorch/blob/master/notebooks/Skorch_Doctor.ipynb); a minimal usage sketch follows.
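
  A minimal usage sketch (based on the linked documentation; the module and
  data below are placeholders):

  ```python
  import numpy as np
  import torch
  from torch import nn
  from skorch import NeuralNetClassifier
  from skorch.helper import SkorchDoctor

  class TinyModule(nn.Module):
      def __init__(self):
          super().__init__()
          self.lin = nn.Linear(20, 2)

      def forward(self, X):
          return torch.log_softmax(self.lin(X), dim=-1)

  X = np.random.randn(100, 20).astype('float32')
  y = np.random.randint(0, 2, size=100)

  # wrap a net with SkorchDoctor and fit on a small sample to collect
  # diagnostic data about activations, gradients, and parameter updates
  doctor = SkorchDoctor(NeuralNetClassifier(TinyModule, max_epochs=2))
  doctor.fit(X, y)
  ```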

Apart from that, a few bugs have been fixed and the included notebooks have been
updated to properly install requirements on Google Colab.

We are grateful for external contributions; many thanks to:

- Kshiteej K (kshitij12345)
- Muhammad Abdullah (abdulasiraj)
- Royi (RoyiAvital)
- Sawradip Saha (sawradip)
- y10ab1 (y10ab1)

Find below the list of all changes since v0.12.1:

### Added
- Add support for compiled PyTorch modules using the `torch.compile` function,
  introduced in [PyTorch 2.0
  release](https://pytorch.org/get-started/pytorch-2.0/), which can greatly
  improve performance on new GPU architectures; to use it, initialize your net
  with the `compile=True` argument; further compilation arguments can be
  specified using the dunder notation, e.g. `compile__dynamic=True` (see the
  sketch after this list)
- Add a class
  [`DistributedHistory`](https://skorch.readthedocs.io/en/latest/history.html#skorch.history.DistributedHistory)
  which should be used when training in a multi-GPU setting (#955)
- `SkorchDoctor`: A helper class that assists in understanding and debugging the
  neural net training, see [this
  notebook](https://nbviewer.org/github/skorch-dev/skorch/blob/master/notebooks/Skorch_Doctor.ipynb)
  (#912)
- When using `AccelerateMixin`, it is now possible to prevent unwrapping of the
  modules by setting `unwrap_after_train=False` (#963)
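
A minimal usage sketch for the `compile` option described in the first entry
above (the module is a placeholder):

```python
from torch import nn
from skorch import NeuralNetClassifier

class MyModule(nn.Module):
    def __init__(self):
        super().__init__()
        self.lin = nn.Linear(10, 2)

    def forward(self, X):
        return self.lin(X).log_softmax(dim=-1)

net = NeuralNetClassifier(
    MyModule,
    compile=True,            # compile the module via torch.compile
    compile__dynamic=True,   # extra torch.compile arguments via dunder notation
)
```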

### Fixed
- Fixed install command to work with recent changes in Google Colab (#928)
- Fixed a couple of bugs related to using non-default modules and criteria
  (#927)
- Fixed a bug when using `AccelerateMixin` in a multi-GPU setup (#947)
- `_get_param_names` returns a list instead of a generator so that subsequent
  error messages return useful information instead of a generator `repr` string
  (#925)
- Fixed a bug that caused modules to not be sufficiently unwrapped at the end of
  training when using `AccelerateMixin`, which could prevent them from being
  pickleable (#963)