Revert "Bring changes from main into JOSE branch following JOSE Review" #50

Closed
wants to merge 7 commits into from
58 changes: 14 additions & 44 deletions README.md
@@ -6,7 +6,6 @@

![GitHub](https://img.shields.io/github/license/Cambridge-ICCS/ml-training-material)
[![CC BY-NC-SA 4.0][cc-by-nc-sa-shield]][cc-by-nc-sa]
[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/Cambridge-ICCS/ml-training-material/main)

This repository contains documentation, resources, and code for the Introduction to
Machine Learning with PyTorch session designed and delivered by [Jack Atkinson](https://jackatkinson.net/) ([**@jatkinson1000**](https://github.com/jatkinson1000))
@@ -24,7 +23,6 @@ A website for this workshop can be found at [https://cambridge-iccs.github.io/ml
- [Preparation and prerequisites](#preparation-and-prerequisites)
- [Installation and setup](#installation-and-setup)
- [License information](#license)
- [Contribution Guidelines and Support](#contribution-guidelines-and-support)


## Learning Objectives
@@ -72,6 +70,13 @@ These are for recapping after the course in case you missed anything, and contai
[linted](https://docs.pylint.org/intro.html), and conforming to the
[black](https://black.readthedocs.io/en/stable/) code style.

If you are working in Colab, you can open the worked solutions using the following links:

* [Exercise 01](https://colab.research.google.com/github/Cambridge-ICCS/ml-training-material/blob/colab/worked-solutions/01_penguin_classification_solutions.ipynb)
* [Exercise 02](https://colab.research.google.com/github/Cambridge-ICCS/ml-training-material/blob/colab/worked-solutions/02_penguin_regression_solutions.ipynb)
* [Exercise 03](https://colab.research.google.com/github/Cambridge-ICCS/ml-training-material/blob/colab/worked-solutions/03_mnist_classification_solutions.ipynb)
* [Exercise 04](https://colab.research.google.com/github/Cambridge-ICCS/ml-training-material/blob/colab/worked-solutions/04_ellipse_regression_solutions.ipynb)


## Preparation and prerequisites

@@ -131,18 +136,17 @@ us before a training session.

## Installation and setup

There are three options for participating in this workshop for which instructions are provided below:
There are two options for participating in this workshop for which instructions are provided below:

* via a [local install](#local-install)
* on [Google Colab](#google-colab)
* on [binder](#binder)

We recommend the [local install](#local-install) approach, especially if you forked
the repository, as it is the easiest way to keep a copy of your work and push back to GitHub.

However, if you experience issues with the installation process or are unfamiliar with
the terminal/installation process there is the option to run the notebooks in
[Google Colab](#google-colab) or on [binder](#binder).
[Google Colab](#google-colab).

### Local Install

@@ -215,31 +219,18 @@ python -m ipykernel install --user --name=MLvenv

### Google Colab

Running on Colab is useful as it allows you to access GPU resources.
To launch the notebooks in Google Colab click the following links for each of the exercises:
To run the notebooks in Google Colab click the following links for each of the exercises:

* [Exercise 01](https://colab.research.google.com/github/Cambridge-ICCS/ml-training-material/blob/colab/exercises/01_penguin_classification.ipynb) - [Worked Solution 01](https://colab.research.google.com/github/Cambridge-ICCS/ml-training-material/blob/colab/worked-solutions/01_penguin_classification_solutions.ipynb)
* [Exercise 02](https://colab.research.google.com/github/Cambridge-ICCS/ml-training-material/blob/colab/exercises/02_penguin_regression.ipynb) - [Worked Solution 02](https://colab.research.google.com/github/Cambridge-ICCS/ml-training-material/blob/colab/worked-solutions/02_penguin_regression_solutions.ipynb)
* [Exercise 03](https://colab.research.google.com/github/Cambridge-ICCS/ml-training-material/blob/colab/exercises/03_mnist_classification.ipynb) - [Worked Solution 03](https://colab.research.google.com/github/Cambridge-ICCS/ml-training-material/blob/colab/worked-solutions/03_mnist_classification_solutions.ipynb)
* [Exercise 04](https://colab.research.google.com/github/Cambridge-ICCS/ml-training-material/blob/colab/exercises/04_ellipse_regression.ipynb) - [Worked Solution 04](https://colab.research.google.com/github/Cambridge-ICCS/ml-training-material/blob/colab/worked-solutions/04_ellipse_regression_solutions.ipynb)
* [Exercise 01](https://colab.research.google.com/github/Cambridge-ICCS/ml-training-material/blob/colab/exercises/01_penguin_classification.ipynb)
* [Exercise 02](https://colab.research.google.com/github/Cambridge-ICCS/ml-training-material/blob/colab/exercises/02_penguin_regression.ipynb)
* [Exercise 03](https://colab.research.google.com/github/Cambridge-ICCS/ml-training-material/blob/colab/exercises/03_mnist_classification.ipynb)
* [Exercise 04](https://colab.research.google.com/github/Cambridge-ICCS/ml-training-material/blob/colab/exercises/04_ellipse_regression.ipynb)

_Notes:_
* _Running in Google Colab requires you to have a Google account._
* _If you leave a Colab session your work will be lost, so be careful to save any work
you want to keep._

### binder

If you cannot operate using a local install, and do not wish to sign up for a Google account,
the repository can be launched
[on binder](https://mybinder.org/v2/gh/Cambridge-ICCS/ml-training-material/main).

_Notes:_
* _If you leave a binder session your work will be lost, so be careful to save any work
you want to keep._
* _Due to the limited resources provided by binder you will struggle to run training in
exercises 3 and 4._


## License

@@ -253,24 +244,3 @@ The teaching materials are licensed under a
[cc-by-nc-sa-shield]: https://img.shields.io/badge/License-CC%20BY--NC--SA%204.0-lightgrey.svg

[![CC BY-NC-SA 4.0][cc-by-nc-sa-image]][cc-by-nc-sa]

## Contribution Guidelines and Support

If you spot an issue with the materials please let us know by
[opening an issue](https://github.com/Cambridge-ICCS/ml-training-material/issues/new/choose)
here on GitHub clearly describing the problem.

If you are able to fix an issue that you spot, or an
[existing open issue](https://github.com/Cambridge-ICCS/ml-training-material/issues)
please get in touch by commenting on the issue thread.

Contributions from the community are welcome.
To contribute back to the repository please first
[fork it](https://github.com/Cambridge-ICCS/ml-training-material/fork),
make the necessary changes to fix the problem, and then open a pull request back to
this repository clearly describing the changes you have made.
We will then perform a review and merge once ready.

If you would like support using these materials, adapting them to your needs, or
delivering them please get in touch either via GitHub or via
[ICCS](https://github.com/Cambridge-ICCS).
12 changes: 3 additions & 9 deletions exercises/01_penguin_classification.ipynb
@@ -105,9 +105,7 @@
" train=True,\n",
")\n",
"\n",
"\n",
"for features, target in data_set:\n",
" # print the features and targets here\n",
" pass"
]
},
@@ -126,7 +124,7 @@
"source": [
"### Task 4: Applying transforms to the data\n",
"\n",
"A common way of transforming inputs to neural networks is to apply a series of transforms using ``torchvision.transforms.Compose``. The [``Compose``](https://pytorch.org/vision/stable/generated/torchvision.transforms.Compose.html) object takes a list of callable objects (i.e., functions) and applies them to the incoming data.\n",
"A common way of transforming inputs to neural networks is to apply a series of transforms using ``torchvision.transforms.Compose``. The ``Compose`` object takes a list of callable objects and applies them to the incoming data.\n",
"\n",
"These transforms can be very useful for mapping between file paths and tensors of images, etc.\n",
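Conceptually, ``Compose`` just chains callables, feeding each transform's output to the next. A minimal pure-Python stand-in (an illustrative sketch, not the actual torchvision class) behaves like:

```python
class Compose:
    """Minimal stand-in for torchvision.transforms.Compose: chain callables."""

    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, x):
        # Apply each transform in order, passing each output to the next.
        for transform in self.transforms:
            x = transform(x)
        return x


# Parse a string to a float, then double it.
pipeline = Compose([float, lambda v: v * 2.0])
print(pipeline("3.5"))  # → 7.0
```

The real ``Compose`` works the same way, which is why any callable — a function, a lambda, or a transform object — can appear in the list.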
"\n",
@@ -143,12 +141,8 @@
"outputs": [],
"source": [
"from torchvision.transforms import Compose\n",
"# import some useful functions here, see https://pytorch.org/docs/stable/torch.html\n",
"# where `tensor` and `eye` are used for constructing tensors,\n",
"# and using a lower-precision float32 is advised for performance\n",
"from torch import tensor, eye, float32 \n",
"\n",
"# Apply the transforms we need to the PenguinDataset to get out input\n",
"# Apply the transforms we need to the PenguinDataset to get our inputs and\n",
"# targets as Tensors."
]
},
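One reason ``tensor``, ``eye``, and ``float32`` are imported in the cell above: row ``i`` of an identity matrix is the one-hot encoding of class ``i``, a handy way to turn integer class labels into target vectors. A torch-free sketch of the idea (``one_hot`` is an illustrative helper, not part of the exercise code):

```python
def one_hot(index: int, num_classes: int) -> list[float]:
    """Return row `index` of a `num_classes` x `num_classes` identity matrix."""
    return [1.0 if j == index else 0.0 for j in range(num_classes)]


# Three classes -> three-element one-hot target vectors.
print(one_hot(1, 3))  # → [0.0, 1.0, 0.0]
```

With torch, the equivalent is indexing ``eye(num_classes, dtype=float32)`` with the class index.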
@@ -160,7 +154,7 @@
"\n",
"- Once we have created a ``Dataset`` object, we wrap it in a ``DataLoader``.\n",
" - The ``DataLoader`` object allows us to put our inputs and targets in mini-batches, which makes for more efficient training.\n",
" - Note: rather than supplying one input-target pair to the model at a time, we supply \"mini-batches\" of these data at once (typically a small power of 2, like 16 or 32).\n",
" - Note: rather than supplying one input-target pair to the model at a time, we supply \"mini-batches\" of these data at once.\n",
" - The number of items we supply at once is called the batch size.\n",
" - The ``DataLoader`` can also randomly shuffle the data each epoch (when training).\n",
" - It allows us to load different mini-batches in parallel, which can be very useful for larger datasets and images that can't all fit in memory at once.\n",
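The mini-batching idea above can be sketched without ``DataLoader`` — ``minibatches`` below is a hypothetical helper that slices a dataset into consecutive batches:

```python
def minibatches(data, batch_size):
    """Yield consecutive mini-batches; the final batch may be smaller."""
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]


# Ten (input, target) pairs with a batch size of 4 -> batches of 4, 4, and 2.
pairs = list(zip(range(10), range(10, 20)))
sizes = [len(batch) for batch in minibatches(pairs, 4)]
print(sizes)  # → [4, 4, 2]
```

The real ``DataLoader`` adds shuffling, parallel loading, and collation of samples into stacked tensors on top of this basic slicing.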
6 changes: 0 additions & 6 deletions setup.py

This file was deleted.

25 changes: 2 additions & 23 deletions slides/index.html
@@ -86,12 +86,10 @@ <h2 id="contents">Contents</h2>
<ul>
<li><a href="#github">GitHub</a></li>
<li><a href="#colab">Colab</a></li>
<li><a href="#binder">binder</a></li>
<li><a href="#solutions">Solutions</a></li>
</ul>
<li><a href="#prerequisites">Prerequisites</a></li>
<li><a href="#license">License</a></li>
<li><a href="#contribution-guidelines-and-support">Contribution Guidelines and Support</a></li>
</ul>
</div>

@@ -137,7 +135,6 @@ <h2 id="setup">Setup Instructions</h2>
<ul>
<li><a href="#github">a local install via Github</a></li>
<li><a href="#colab">online via Google Colab</a></li>
<li><a href="#binder">online via binder</a></li>
</ul>

<p>We recommend the local install approach, especially if you forked the repository, as it is the easiest way to keep a copy of your work and push back to GitHub.</p>
@@ -186,8 +183,7 @@ <h6>Optional) Keep virtual environment persistent in Jupyter Notebooks</h6>
<code>python -m ipykernel install --user --name=MLvenv</code></p>

<h4 id="colab">Google Colab</h4>
<p>Running on Colab is useful as it allows you to access GPU resources.<br>
To launch the notebooks in Google Colab click the following links for each of the exercises:</p>
<p>To run the notebooks in Google Colab click the following links for each of the exercises:</p>
<ul>
<li><a href="https://colab.research.google.com/github/Cambridge-ICCS/ml-training-material/blob/colab/exercises/01_penguin_classification.ipynb">Exercise 01</a></li>
<li><a href="https://colab.research.google.com/github/Cambridge-ICCS/ml-training-material/blob/colab/exercises/02_penguin_regression.ipynb">Exercise 02</a></li>
@@ -200,17 +196,6 @@ <h4 id="colab">Google Colab</h4>
<indent>If you leave a Colab session your work will be lost, so be careful to save any work you want to keep.</indent>
</i></p>

<h4 id="binder">binder</h4>
<p>To run the notebooks in binder click the following link:</p>
<ul>
<li><a href="https://mybinder.org/v2/gh/Cambridge-ICCS/ml-training-material/main">Launch repository in binder</a></li>
</ul>

<p><i>Notes:<br>
<indent>If you leave a binder session your work will be lost, so be careful to save any work you want to keep.</indent>
<indent>Due to the limited resources provided by binder you will struggle to run training in exercises 3 and 4.</indent>
</i></p>

<h4 id="solutions">Solutions</h4>

<p>Worked solutions for all of the exercises can be found in the <inlinecode>worked-solutions/</inlinecode> directory.
@@ -261,7 +246,7 @@ <h4>Python</h4>
</ul>

<h4>git and GitHub</h4>
<p>Unless participating via <a href="#colab">Colab</a> or <a href="#binder">binder</a> you will be expected to know how to:</p>
<p>Unless participating via <a href="#colab">Colab</a> you will be expected to know how to:</p>
<ul>
<li>clone and/or fork a repository,</li>
<li>commit, and</li>
@@ -291,12 +276,6 @@ <h2 id="license">License</h2>
<p>The teaching materials are licensed under <a rel="license" href="https://creativecommons.org/licenses/by-nc-sa/4.0/">CC BY-NC-SA 4.0</a>.<br>
<img src="https://licensebuttons.net/l/by-nc-sa/4.0/88x31.png"></img></p>

<h2 id="contribution-guidelines-and-support">Contribution Guidelines and Support</h2>
<p>If you spot an issue with the materials please let us know by <a href="https://github.com/Cambridge-ICCS/ml-training-material/issues/new/choose" target=blank>opening an issue</a> on GitHub clearly describing the problem.</p>
<p>If you are able to fix an issue that you spot, or an <a href="https://github.com/Cambridge-ICCS/ml-training-material/issues/new/choose" target=blank>existing open issue</a> please get in touch by commenting on the issue thread.</p>
<p>Contributions from the community are welcome. To contribute back to the repository please first <a href="https://github.com/Cambridge-ICCS/ml-training-material/fork" target=blank>fork it</a>, make the neccessary changes to fix the problem, and then open a pull request back to this repository clerly describing the changes you have made. We will then preform a review and merge once ready.</p>
<p>If you would like support using these materials, adapting them to your needs, or delivering them please get in touch either via GitHub or via <a href="https://github.com/Cambridge-ICCS" target=blank>ICCS</a>.</p>

</div>
</div>

8 changes: 3 additions & 5 deletions src/ml_workshop/_penguins.py
@@ -1,5 +1,4 @@
"""Penguins dataset."""

from typing import Optional, List, Dict, Tuple, Any

from torch.utils.data import Dataset
@@ -18,9 +17,9 @@ class PenguinDataset(Dataset):

Parameters
----------
input_keys : List[str]
input_keys : Sequence[str]
The column titles to use in the input feature vectors.
target_keys : List[str]
target_keys : Sequence[str]
The column titles to use in the target feature vectors.
train : bool
If ``True``, this object will serve as the training set, and if
@@ -40,7 +39,7 @@ class PenguinDataset(Dataset):
def __init__(
self,
input_keys: List[str],
target_keys: List[str],
target_keys: str,
train: bool,
x_tfms: Optional[Compose] = None,
y_tfms: Optional[Compose] = None,
@@ -110,7 +109,6 @@ def _load_penguin_data() -> DataFrame:
.sort_values(by=sorted(data.keys()))
.reset_index(drop=True)
)
# Transform the sex field into a float, with male represented by 1.0, female by 0.0
data.sex = (data.sex == "male").astype(float)
return data
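The ``data.sex`` line above encodes the categorical column numerically: the comparison yields booleans, which cast to ``1.0`` for ``"male"`` and ``0.0`` otherwise. In plain Python (without pandas) the same mapping is:

```python
sexes = ["male", "female", "female", "male"]

# (s == "male") is a bool; float() maps True -> 1.0 and False -> 0.0.
encoded = [float(s == "male") for s in sexes]
print(encoded)  # → [1.0, 0.0, 0.0, 1.0]
```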

11 changes: 4 additions & 7 deletions worked-solutions/01_penguin_classification_solutions.ipynb
@@ -214,7 +214,7 @@
"source": [
"### Task 4: Applying transforms to the data\n",
"\n",
"A common way of transforming inputs to neural networks is to apply a series of transforms using ``torchvision.transforms.Compose``. The [``Compose``](https://pytorch.org/vision/stable/generated/torchvision.transforms.Compose.html) object takes a list of callable objects and applies them to the incoming data.\n",
"A common way of transforming inputs to neural networks is to apply a series of transforms using ``torchvision.transforms.Compose``. The ``Compose`` object takes a list of callable objects and applies them to the incoming data.\n",
"\n",
"These transforms can be very useful for mapping between file paths and tensors of images, etc.\n",
"\n",
@@ -242,14 +242,11 @@
}
],
"source": [
"from torchvision.transforms import Compose\n",
"# import some useful functions here, see https://pytorch.org/docs/stable/torch.html\n",
"# where `tensor` and `eye` are used for constructing tensors,\n",
"# and using a lower-precision float32 is advised for performance\n",
"from torch import tensor, float32, eye\n",
"from torchvision.transforms import Compose\n",
"\n",
"\n",
"# Apply the transforms we need to the PenguinDataset to get out input\n",
"# Apply the transforms we need to the PenguinDataset to get our inputs and\n",
"# targets as Tensors.\n",
"\n",
"\n",
@@ -324,7 +321,7 @@
"\n",
"- Once we have created a ``Dataset`` object, we wrap it in a ``DataLoader``.\n",
" - The ``DataLoader`` object allows us to put our inputs and targets in mini-batches, which makes for more efficient training.\n",
" - Note: rather than supplying one input-target pair to the model at a time, we supply \"mini-batches\" of these data at once (typically a small power of 2, like 16 or 32).\n",
" - Note: rather than supplying one input-target pair to the model at a time, we supply \"mini-batches\" of these data at once.\n",
" - The number of items we supply at once is called the batch size.\n",
" - The ``DataLoader`` can also randomly shuffle the data each epoch (when training).\n",
" - It allows us to load different mini-batches in parallel, which can be very useful for larger datasets and images that can't all fit in memory at once.\n",