From bd25d7803bc49206ef422639fd4ee8d5b6f7ebda Mon Sep 17 00:00:00 2001
From: Yunxuan Xiao <yunxuanx@anyscale.com>
Date: Thu, 7 Sep 2023 20:14:44 -0700
Subject: [PATCH] [2.7] Cleanup all LightningTrainer Mentions in Ray Doc
 (#39406)

Signed-off-by: woshiyyya <xiaoyunxuan1998@gmail.com>
Signed-off-by: Jim Thompson <jimthompson5802@gmail.com>
---
 doc/source/ray-overview/examples.rst          |    2 +-
 .../lightning/lightning_mnist_example.ipynb   |    2 +-
 .../tune/examples/includes/mnist_ptl_mini.rst |    5 -
 .../examples/tune-pytorch-lightning.ipynb     | 1151 ++++++++++++++---
 .../tune-vanilla-pytorch-lightning.ipynb      |    2 +-
 5 files changed, 970 insertions(+), 192 deletions(-)

diff --git a/doc/source/ray-overview/examples.rst b/doc/source/ray-overview/examples.rst
index 1572039910921..4a707a606722f 100644
--- a/doc/source/ray-overview/examples.rst
+++ b/doc/source/ray-overview/examples.rst
@@ -1373,7 +1373,7 @@ Ray Examples
         :link: /train/examples/lightning/vicuna_13b_lightning_deepspeed_finetune
         :link-type: doc
 
-        Fine-tune vicuna-13b-v1.3 with DeepSpeed and LightningTrainer
+        Fine-tune vicuna-13b-v1.3 with DeepSpeed, PyTorch Lightning and Ray Train
     
     .. grid-item-card:: :bdg-secondary:`Code example`
         :class-item: gallery-item training llm pytorch nlp
diff --git a/doc/source/train/examples/lightning/lightning_mnist_example.ipynb b/doc/source/train/examples/lightning/lightning_mnist_example.ipynb
index 850116cf01600..738bc4d47c523 100644
--- a/doc/source/train/examples/lightning/lightning_mnist_example.ipynb
+++ b/doc/source/train/examples/lightning/lightning_mnist_example.ipynb
@@ -51,7 +51,7 @@
             "source": [
                 "## Prepare Dataset and Module\n",
                 "\n",
-                "The Pytorch Lightning Trainer takes either `torch.utils.data.DataLoader` or `pl.LightningDataModule` as data inputs. You can keep using them without any changes for the Ray AIR LightningTrainer. "
+                "The PyTorch Lightning Trainer takes either `torch.utils.data.DataLoader` or `pl.LightningDataModule` as data inputs. You can keep using them without any changes with Ray Train. "
             ]
         },
         {
diff --git a/doc/source/tune/examples/includes/mnist_ptl_mini.rst b/doc/source/tune/examples/includes/mnist_ptl_mini.rst
index f13434a9a6507..32d4f3b5fbfea 100644
--- a/doc/source/tune/examples/includes/mnist_ptl_mini.rst
+++ b/doc/source/tune/examples/includes/mnist_ptl_mini.rst
@@ -3,9 +3,4 @@
 MNIST PyTorch Lightning Example
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-.. note::
-
-    In version 2.4, we introduced :class:`LightningTrainer <ray.train.lightning.LightningTrainer>`, which provides better integration with PyTorch Lightning.
-    For more information, please refer to :ref:`Using PyTorch Lightning with Tune <tune-pytorch-lightning-ref>`.
-
 .. literalinclude:: /../../python/ray/tune/examples/mnist_ptl_mini.py
diff --git a/doc/source/tune/examples/tune-pytorch-lightning.ipynb b/doc/source/tune/examples/tune-pytorch-lightning.ipynb
index 35812c4cd4137..e6dc7172df35a 100644
--- a/doc/source/tune/examples/tune-pytorch-lightning.ipynb
+++ b/doc/source/tune/examples/tune-pytorch-lightning.ipynb
@@ -1,7 +1,6 @@
 {
  "cells": [
   {
-   "attachments": {},
    "cell_type": "markdown",
    "metadata": {
     "tags": []
@@ -11,33 +10,21 @@
     "\n",
     "(tune-pytorch-lightning-ref)=\n",
     "\n",
-    "PyTorch Lightning is a framework which brings structure into training PyTorch models. It\n",
-    "aims to avoid boilerplate code, so you don't have to write the same training\n",
-    "loops all over again when building a new model.\n",
+    "PyTorch Lightning is a framework which brings structure into training PyTorch models. It aims to avoid boilerplate code, so you don't have to write the same training loops all over again when building a new model.\n",
     "\n",
     "```{image} /images/pytorch_lightning_full.png\n",
     ":align: center\n",
     "```\n",
     "\n",
-    "The main abstraction of PyTorch Lightning is the `LightningModule` class, which\n",
-    "should be extended by your application. There is [a great post on how to transfer your models from vanilla PyTorch to Lightning](https://towardsdatascience.com/from-pytorch-to-pytorch-lightning-a-gentle-introduction-b371b7caaf09).\n",
+    "The main abstraction of PyTorch Lightning is the `LightningModule` class, which should be extended by your application. There is [a great post on how to transfer your models from vanilla PyTorch to Lightning](https://towardsdatascience.com/from-pytorch-to-pytorch-lightning-a-gentle-introduction-b371b7caaf09).\n",
     "\n",
-    "The class structure of PyTorch Lightning makes it very easy to define and tune model\n",
-    "parameters. This tutorial will show you how to use Tune with AIR {class}`LightningTrainer <ray.train.lightning.LightningTrainer>` to find the best set of\n",
-    "parameters for your application on the example of training a MNIST classifier. Notably,\n",
-    "the `LightningModule` does not have to be altered at all for this - so you can\n",
-    "use it plug and play for your existing models, assuming their parameters are configurable!\n",
-    "\n",
-    ":::{note}\n",
-    "If you don't want to use AIR {class}`LightningTrainer <ray.train.lightning.LightningTrainer>` and prefer using vanilla lightning trainer with function trainable, please refer to this document: {ref}`Using vanilla Pytorch Lightning with Tune <tune-vanilla-pytorch-lightning-ref>`.\n",
-    "\n",
-    ":::\n",
+    "The class structure of PyTorch Lightning makes it very easy to define and tune model parameters. This tutorial shows you how to use Tune with Ray Train's {class}`TorchTrainer <ray.train.torch.TorchTrainer>` to find the best set of parameters for your application, using the training of an MNIST classifier as the example. Notably, the `LightningModule` does not have to be altered at all for this, so you can use it plug and play for your existing models, assuming their parameters are configurable!\n",
     "\n",
     ":::{note}\n",
     "To run this example, you will need to install the following:\n",
     "\n",
     "```bash\n",
-    "$ pip install \"ray[tune]\" torch torchvision pytorch-lightning\n",
+    "$ pip install \"ray[tune]\" torch torchvision pytorch_lightning\n",
     "```\n",
     ":::\n",
     "\n",
@@ -48,26 +35,16 @@
     "\n",
     "## PyTorch Lightning classifier for MNIST\n",
     "\n",
-    "Let's first start with the basic PyTorch Lightning implementation of an MNIST classifier.\n",
-    "This classifier does not include any tuning code at this point.\n",
+    "Let's first start with the basic PyTorch Lightning implementation of an MNIST classifier. This classifier does not include any tuning code at this point.\n",
     "\n",
     "First, we run some imports:"
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 1,
+   "execution_count": null,
    "metadata": {},
-   "outputs": [
-    {
-     "name": "stderr",
-     "output_type": "stream",
-     "text": [
-      "/home/ray/anaconda3/lib/python3.7/site-packages/tqdm/auto.py:22: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
-      "  from .autonotebook import tqdm as notebook_tqdm\n"
-     ]
-    }
-   ],
+   "outputs": [],
    "source": [
     "import os\n",
     "import torch\n",
@@ -78,14 +55,12 @@
     "from torchmetrics import Accuracy\n",
     "from torch.utils.data import DataLoader, random_split\n",
     "from torchvision.datasets import MNIST\n",
-    "from torchvision import transforms\n",
-    "\n",
-    "from ray.train.lightning import LightningTrainer, LightningConfigBuilder"
+    "from torchvision import transforms"
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 3,
+   "execution_count": 2,
    "metadata": {
     "tags": [
      "remove-cell"
@@ -98,7 +73,6 @@
    ]
   },
   {
-   "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
@@ -220,7 +194,43 @@
    ]
   },
   {
-   "attachments": {},
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Define a training function that creates the model, the datamodule, and the Lightning trainer with Ray Train utilities."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from ray.train.lightning import (\n",
+    "    RayDDPStrategy,\n",
+    "    RayLightningEnvironment,\n",
+    "    RayTrainReportCallback,\n",
+    "    prepare_trainer,\n",
+    ")\n",
+    "\n",
+    "\n",
+    "def train_func(config):\n",
+    "    dm = MNISTDataModule(batch_size=config[\"batch_size\"])\n",
+    "    model = MNISTClassifier(config)\n",
+    "\n",
+    "    trainer = pl.Trainer(\n",
+    "        devices=\"auto\",\n",
+    "        accelerator=\"auto\",\n",
+    "        strategy=RayDDPStrategy(),\n",
+    "        callbacks=[RayTrainReportCallback()],\n",
+    "        plugins=[RayLightningEnvironment()],\n",
+    "        enable_progress_bar=False,\n",
+    "    )\n",
+    "    trainer = prepare_trainer(trainer)\n",
+    "    trainer.fit(model, datamodule=dm)"
+   ]
+  },
+  {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
@@ -235,40 +245,58 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 5,
+   "execution_count": 21,
    "metadata": {},
    "outputs": [],
    "source": [
-    "from pytorch_lightning.loggers import TensorBoardLogger\n",
-    "from ray import air, tune\n",
-    "from ray.train import RunConfig, ScalingConfig, CheckpointConfig\n",
+    "from ray import tune\n",
     "from ray.tune.schedulers import ASHAScheduler"
    ]
   },
   {
-   "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
     "### Configuring the search space\n",
     "\n",
-    "Now we configure the parameter search space using {class}`LightningConfigBuilder <ray.train.lightning.LightningConfigBuilder>`. We would like to choose between three different layer and batch sizes. The learning rate should be sampled uniformly between `0.0001` and `0.1`. The `tune.loguniform()` function is syntactic sugar to make sampling between these different orders of magnitude easier, specifically we are able to also sample small values.\n",
-    "\n",
-    ":::{note}\n",
-    "In `LightningTrainer`, the frequency of metric reporting is the same as the frequency of checkpointing. For example, if you set `builder.checkpointing(..., every_n_epochs=2)`, then for every 2 epochs, all the latest metrics will be reported to the Ray Tune session along with the latest checkpoint. Please make sure the target metrics(e.g. metrics specified in `TuneConfig`, schedulers, and searchers) are logged before saving a checkpoint.\n",
-    "\n",
-    ":::\n",
+    "Now we configure the parameter search space. We would like to choose between different layer dimensions, learning rates, and batch sizes. The learning rate is sampled between `0.0001` and `0.1`. The `tune.loguniform()` function samples uniformly in log space, which makes it easier to cover several orders of magnitude and, in particular, to also sample small values. Similarly, `tune.choice()` samples from the provided options."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 22,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "search_space = {\n",
+    "    \"layer_1_size\": tune.choice([32, 64, 128]),\n",
+    "    \"layer_2_size\": tune.choice([64, 128, 256]),\n",
+    "    \"lr\": tune.loguniform(1e-4, 1e-1),\n",
+    "    \"batch_size\": tune.choice([32, 64]),\n",
+    "}"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Selecting a scheduler\n",
     "\n",
+    "In this example, we use an [Asynchronous Hyperband](https://blog.ml.cmu.edu/2018/12/12/massively-parallel-hyperparameter-optimization/)\n",
+    "scheduler. This scheduler decides at each iteration which trials are likely to perform\n",
+    "badly, and stops these trials. This way we don't waste any resources on bad hyperparameter\n",
+    "configurations.\n",
     "\n",
     ":::{note}\n",
-    "Use `LightningConfigBuilder.checkpointing()` to specify the monitor metric and checkpoint frequency for the Lightning ModelCheckpoint callback. To properly save AIR checkpoints, you must also provide an AIR {class}`CheckpointConfig <ray.train.CheckpointConfig>`. Otherwise, LightningTrainer will create a default CheckpointConfig, which saves all the reported checkpoints by default.\n",
+    "\n",
+    "    Currently, `LightningTrainer` is not compatible with {class}`PopulationBasedTraining <ray.tune.schedulers.PopulationBasedTraining>` scheduler, which may mutate hyperparameters during training time. \n",
     "\n",
     ":::"
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 6,
+   "execution_count": 24,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -276,13 +304,10 @@
     "num_epochs = 5\n",
     "\n",
     "# Number of sampls from parameter space\n",
-    "num_samples = 10\n",
-    "\n",
-    "accelerator = \"gpu\""
+    "num_samples = 10"
    ]
   },
   {
-   "attachments": {},
    "cell_type": "markdown",
    "metadata": {
     "tags": []
@@ -293,7 +318,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 4,
+   "execution_count": 9,
    "metadata": {
     "tags": [
      "remove-cell"
@@ -303,84 +328,12 @@
    "source": [
     "if SMOKE_TEST:\n",
     "    num_epochs = 3\n",
-    "    num_samples = 3\n",
-    "    accelerator = \"cpu\""
-   ]
-  },
-  {
-   "attachments": {},
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "For hyper-parameter tuning, we only need to adjust a subset of configurations while keeping the others fixed. Therefore, we create two `lightning_configs` below:\n",
-    "\n",
-    "- `static_lightning_config`: specifies the static configs that are used for creating a base `LightningTrainer`.\n",
-    "- `searchable_lightning_config`: specifies the searchable configs using Tune {ref}`search space APIs <tune-search-space>`. It defines the search space for the Tuner."
+    "    num_samples = 3"
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 8,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "dm = MNISTDataModule(batch_size=64)\n",
-    "logger = TensorBoardLogger(save_dir=os.getcwd(), name=\"tune-ptl-example\", version=\".\")\n",
-    "\n",
-    "# Static configs that does not change across trials\n",
-    "static_lightning_config = (\n",
-    "    LightningConfigBuilder()\n",
-    "    .module(cls=MNISTClassifier)\n",
-    "    .trainer(max_epochs=num_epochs, accelerator=accelerator, logger=logger)\n",
-    "    .fit_params(datamodule=dm)\n",
-    "    .checkpointing(monitor=\"ptl/val_accuracy\", save_top_k=2, mode=\"max\")\n",
-    "    .build()\n",
-    ")\n",
-    "\n",
-    "# Searchable configs across different trials\n",
-    "searchable_lightning_config = (\n",
-    "    LightningConfigBuilder()\n",
-    "    .module(config={\n",
-    "        \"layer_1_size\": tune.choice([32, 64, 128]),\n",
-    "        \"layer_2_size\": tune.choice([64, 128, 256]),\n",
-    "        \"lr\": tune.loguniform(1e-4, 1e-1),\n",
-    "    })\n",
-    "    .build()\n",
-    ")\n",
-    "\n",
-    "# Make sure to also define an AIR CheckpointConfig here\n",
-    "# to properly save checkpoints in AIR format.\n",
-    "run_config = RunConfig(\n",
-    "    checkpoint_config=CheckpointConfig(\n",
-    "        num_to_keep=2,\n",
-    "        checkpoint_score_attribute=\"ptl/val_accuracy\",\n",
-    "        checkpoint_score_order=\"max\",\n",
-    "    ),\n",
-    ")"
-   ]
-  },
-  {
-   "attachments": {},
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### Selecting a scheduler\n",
-    "\n",
-    "In this example, we use an [Asynchronous Hyperband](https://blog.ml.cmu.edu/2018/12/12/massively-parallel-hyperparameter-optimization/)\n",
-    "scheduler. This scheduler decides at each iteration which trials are likely to perform\n",
-    "badly, and stops these trials. This way we don't waste any resources on bad hyperparameter\n",
-    "configurations.\n",
-    "\n",
-    ":::{note}\n",
-    "\n",
-    "    Currently, `LightningTrainer` is not compatible with {class}`PopulationBasedTraining <ray.tune.schedulers.PopulationBasedTraining>` scheduler, which may mutate hyperparameters during training time. \n",
-    "\n",
-    ":::"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 9,
+   "execution_count": 25,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -388,7 +341,6 @@
    ]
   },
   {
-   "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
@@ -401,18 +353,28 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 10,
+   "execution_count": 26,
    "metadata": {},
    "outputs": [],
    "source": [
+    "from ray.train import RunConfig, ScalingConfig, CheckpointConfig\n",
+    "\n",
     "scaling_config = ScalingConfig(\n",
     "    num_workers=3, use_gpu=True, resources_per_worker={\"CPU\": 1, \"GPU\": 1}\n",
+    ")\n",
+    "\n",
+    "run_config = RunConfig(\n",
+    "    checkpoint_config=CheckpointConfig(\n",
+    "        num_to_keep=2,\n",
+    "        checkpoint_score_attribute=\"ptl/val_accuracy\",\n",
+    "        checkpoint_score_order=\"max\",\n",
+    "    ),\n",
     ")"
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 11,
+   "execution_count": 13,
    "metadata": {
     "tags": [
      "remove-cell"
@@ -428,108 +390,929 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 12,
+   "execution_count": 27,
    "metadata": {},
    "outputs": [],
    "source": [
-    "# Define a base LightningTrainer without hyper-parameters for Tuner\n",
-    "lightning_trainer = LightningTrainer(\n",
-    "    lightning_config=static_lightning_config,\n",
+    "from ray.train.torch import TorchTrainer\n",
+    "\n",
+    "# Define a TorchTrainer without hyper-parameters for Tuner\n",
+    "ray_trainer = TorchTrainer(\n",
+    "    train_func,\n",
     "    scaling_config=scaling_config,\n",
     "    run_config=run_config,\n",
     ")"
    ]
   },
   {
-   "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
     "### Putting it together\n",
     "\n",
-    "Lastly, we need to create a `Tuner()` object and start Ray Tune with `tuner.fit()`.\n",
-    "\n",
-    "The full code looks like this:"
+    "Lastly, we need to create a `Tuner()` object and start Ray Tune with `tuner.fit()`. The full code looks like this:"
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 13,
+   "execution_count": 28,
    "metadata": {},
-   "outputs": [],
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div class=\"tuneStatus\">\n",
+       "  <div style=\"display: flex;flex-direction: row\">\n",
+       "    <div style=\"display: flex;flex-direction: column;\">\n",
+       "      <h3>Tune Status</h3>\n",
+       "      <table>\n",
+       "<tbody>\n",
+       "<tr><td>Current time:</td><td>2023-09-07 14:03:52</td></tr>\n",
+       "<tr><td>Running for: </td><td>00:05:13.92        </td></tr>\n",
+       "<tr><td>Memory:      </td><td>20.5/186.6 GiB     </td></tr>\n",
+       "</tbody>\n",
+       "</table>\n",
+       "    </div>\n",
+       "    <div class=\"vDivider\"></div>\n",
+       "    <div class=\"systemInfo\">\n",
+       "      <h3>System Info</h3>\n",
+       "      Using AsyncHyperBand: num_stopped=10<br>Bracket: Iter 4.000: 0.9709362387657166 | Iter 2.000: 0.9617255330085754 | Iter 1.000: 0.9477165043354034<br>Logical resource usage: 4.0/48 CPUs, 3.0/4 GPUs (0.0/1.0 accelerator_type:None)\n",
+       "    </div>\n",
+       "    \n",
+       "  </div>\n",
+       "  <div class=\"hDivider\"></div>\n",
+       "  <div class=\"trialStatus\">\n",
+       "    <h3>Trial Status</h3>\n",
+       "    <table>\n",
+       "<thead>\n",
+       "<tr><th>Trial name              </th><th>status    </th><th>loc             </th><th style=\"text-align: right;\">   train_loop_config/ba\n",
+       "tch_size</th><th style=\"text-align: right;\">    train_loop_config/la\n",
+       "yer_1_size</th><th style=\"text-align: right;\">    train_loop_config/la\n",
+       "yer_2_size</th><th style=\"text-align: right;\">  train_loop_config/lr</th><th style=\"text-align: right;\">  iter</th><th style=\"text-align: right;\">  total time (s)</th><th style=\"text-align: right;\">  ptl/train_loss</th><th style=\"text-align: right;\">  ptl/train_accuracy</th><th style=\"text-align: right;\">  ptl/val_loss</th></tr>\n",
+       "</thead>\n",
+       "<tbody>\n",
+       "<tr><td>TorchTrainer_5144b_00000</td><td>TERMINATED</td><td>10.0.0.84:63990 </td><td style=\"text-align: right;\">32</td><td style=\"text-align: right;\"> 64</td><td style=\"text-align: right;\">256</td><td style=\"text-align: right;\">           0.0316233  </td><td style=\"text-align: right;\">     5</td><td style=\"text-align: right;\">         29.3336</td><td style=\"text-align: right;\">      0.973613  </td><td style=\"text-align: right;\">            0.766667</td><td style=\"text-align: right;\">     0.580943 </td></tr>\n",
+       "<tr><td>TorchTrainer_5144b_00001</td><td>TERMINATED</td><td>10.0.0.84:71294 </td><td style=\"text-align: right;\">64</td><td style=\"text-align: right;\">128</td><td style=\"text-align: right;\"> 64</td><td style=\"text-align: right;\">           0.0839278  </td><td style=\"text-align: right;\">     1</td><td style=\"text-align: right;\">         12.2275</td><td style=\"text-align: right;\">      2.19514   </td><td style=\"text-align: right;\">            0.266667</td><td style=\"text-align: right;\">     1.56644  </td></tr>\n",
+       "<tr><td>TorchTrainer_5144b_00002</td><td>TERMINATED</td><td>10.0.0.84:73540 </td><td style=\"text-align: right;\">32</td><td style=\"text-align: right;\"> 64</td><td style=\"text-align: right;\">256</td><td style=\"text-align: right;\">           0.000233034</td><td style=\"text-align: right;\">     5</td><td style=\"text-align: right;\">         29.1314</td><td style=\"text-align: right;\">      0.146903  </td><td style=\"text-align: right;\">            0.933333</td><td style=\"text-align: right;\">     0.114229 </td></tr>\n",
+       "<tr><td>TorchTrainer_5144b_00003</td><td>TERMINATED</td><td>10.0.0.84:80840 </td><td style=\"text-align: right;\">64</td><td style=\"text-align: right;\">128</td><td style=\"text-align: right;\"> 64</td><td style=\"text-align: right;\">           0.00109259 </td><td style=\"text-align: right;\">     5</td><td style=\"text-align: right;\">         21.6534</td><td style=\"text-align: right;\">      0.0474913 </td><td style=\"text-align: right;\">            0.966667</td><td style=\"text-align: right;\">     0.0714878</td></tr>\n",
+       "<tr><td>TorchTrainer_5144b_00004</td><td>TERMINATED</td><td>10.0.0.84:88077 </td><td style=\"text-align: right;\">32</td><td style=\"text-align: right;\"> 32</td><td style=\"text-align: right;\">128</td><td style=\"text-align: right;\">           0.00114083 </td><td style=\"text-align: right;\">     5</td><td style=\"text-align: right;\">         29.6367</td><td style=\"text-align: right;\">      0.0990443 </td><td style=\"text-align: right;\">            0.966667</td><td style=\"text-align: right;\">     0.0891999</td></tr>\n",
+       "<tr><td>TorchTrainer_5144b_00005</td><td>TERMINATED</td><td>10.0.0.84:95388 </td><td style=\"text-align: right;\">32</td><td style=\"text-align: right;\"> 64</td><td style=\"text-align: right;\"> 64</td><td style=\"text-align: right;\">           0.00924264 </td><td style=\"text-align: right;\">     4</td><td style=\"text-align: right;\">         25.7089</td><td style=\"text-align: right;\">      0.0349707 </td><td style=\"text-align: right;\">            1       </td><td style=\"text-align: right;\">     0.153937 </td></tr>\n",
+       "<tr><td>TorchTrainer_5144b_00006</td><td>TERMINATED</td><td>10.0.0.84:101434</td><td style=\"text-align: right;\">32</td><td style=\"text-align: right;\">128</td><td style=\"text-align: right;\">256</td><td style=\"text-align: right;\">           0.00325671 </td><td style=\"text-align: right;\">     5</td><td style=\"text-align: right;\">         29.5763</td><td style=\"text-align: right;\">      0.0708755 </td><td style=\"text-align: right;\">            0.966667</td><td style=\"text-align: right;\">     0.0820903</td></tr>\n",
+       "<tr><td>TorchTrainer_5144b_00007</td><td>TERMINATED</td><td>10.0.0.84:108750</td><td style=\"text-align: right;\">32</td><td style=\"text-align: right;\"> 32</td><td style=\"text-align: right;\"> 64</td><td style=\"text-align: right;\">           0.000123766</td><td style=\"text-align: right;\">     1</td><td style=\"text-align: right;\">         13.9326</td><td style=\"text-align: right;\">      0.27464   </td><td style=\"text-align: right;\">            0.966667</td><td style=\"text-align: right;\">     0.401102 </td></tr>\n",
+       "<tr><td>TorchTrainer_5144b_00008</td><td>TERMINATED</td><td>10.0.0.84:111019</td><td style=\"text-align: right;\">64</td><td style=\"text-align: right;\">128</td><td style=\"text-align: right;\">256</td><td style=\"text-align: right;\">           0.00371762 </td><td style=\"text-align: right;\">     5</td><td style=\"text-align: right;\">         21.8337</td><td style=\"text-align: right;\">      0.00108961</td><td style=\"text-align: right;\">            1       </td><td style=\"text-align: right;\">     0.0579874</td></tr>\n",
+       "<tr><td>TorchTrainer_5144b_00009</td><td>TERMINATED</td><td>10.0.0.84:118255</td><td style=\"text-align: right;\">32</td><td style=\"text-align: right;\">128</td><td style=\"text-align: right;\">128</td><td style=\"text-align: right;\">           0.00397956 </td><td style=\"text-align: right;\">     5</td><td style=\"text-align: right;\">         29.8334</td><td style=\"text-align: right;\">      0.00940019</td><td style=\"text-align: right;\">            1       </td><td style=\"text-align: right;\">     0.0685028</td></tr>\n",
+       "</tbody>\n",
+       "</table>\n",
+       "  </div>\n",
+       "</div>\n",
+       "<style>\n",
+       ".tuneStatus {\n",
+       "  color: var(--jp-ui-font-color1);\n",
+       "}\n",
+       ".tuneStatus .systemInfo {\n",
+       "  display: flex;\n",
+       "  flex-direction: column;\n",
+       "}\n",
+       ".tuneStatus td {\n",
+       "  white-space: nowrap;\n",
+       "}\n",
+       ".tuneStatus .trialStatus {\n",
+       "  display: flex;\n",
+       "  flex-direction: column;\n",
+       "}\n",
+       ".tuneStatus h3 {\n",
+       "  font-weight: bold;\n",
+       "}\n",
+       ".tuneStatus .hDivider {\n",
+       "  border-bottom-width: var(--jp-border-width);\n",
+       "  border-bottom-color: var(--jp-border-color0);\n",
+       "  border-bottom-style: solid;\n",
+       "}\n",
+       ".tuneStatus .vDivider {\n",
+       "  border-left-width: var(--jp-border-width);\n",
+       "  border-left-color: var(--jp-border-color0);\n",
+       "  border-left-style: solid;\n",
+       "  margin: 0.5em 1em 0.5em 1em;\n",
+       "}\n",
+       "</style>\n"
+      ],
+      "text/plain": [
+       "<IPython.core.display.HTML object>"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "\u001b[2m\u001b[36m(TrainTrainable pid=63990)\u001b[0m 2023-09-07 13:58:43.025064: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=63990)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=63990)\u001b[0m 2023-09-07 13:58:43.165187: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=63990)\u001b[0m 2023-09-07 13:58:43.907088: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=63990)\u001b[0m 2023-09-07 13:58:43.907153: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=63990)\u001b[0m 2023-09-07 13:58:43.907160: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\n",
+      "\u001b[2m\u001b[36m(TorchTrainer pid=63990)\u001b[0m Starting distributed worker processes: ['64101 (10.0.0.84)', '64102 (10.0.0.84)', '64103 (10.0.0.84)']\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64101)\u001b[0m Setting up process group for: env:// [rank=0, world_size=3]\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64102)\u001b[0m 2023-09-07 13:58:50.419714: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64102)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64101)\u001b[0m 2023-09-07 13:58:50.419718: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64101)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64102)\u001b[0m 2023-09-07 13:58:50.555450: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64102)\u001b[0m 2023-09-07 13:58:51.317522: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64102)\u001b[0m 2023-09-07 13:58:51.317610: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64102)\u001b[0m 2023-09-07 13:58:51.317618: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64102)\u001b[0m Missing logger folder: /home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00000_0_batch_size=32,layer_1_size=64,layer_2_size=256,lr=0.0316_2023-09-07_13-58-38/lightning_logs\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64101)\u001b[0m /home/ray/anaconda3/lib/python3.9/site-packages/pytorch_lightning/loops/utilities.py:92: PossibleUserWarning: `max_epochs` was not set. Setting it to 1000 epochs. To train without an epoch limit, set `max_epochs=-1`.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64101)\u001b[0m   rank_zero_warn(\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64101)\u001b[0m GPU available: True, used: True\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64101)\u001b[0m TPU available: False, using: 0 TPU cores\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64101)\u001b[0m IPU available: False, using: 0 IPUs\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64101)\u001b[0m HPU available: False, using: 0 HPUs\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64102)\u001b[0m Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64102)\u001b[0m Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to /tmp/tmpydcy4598/MNIST/raw/train-images-idx3-ubyte.gz\n"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "100%|██████████| 9912422/9912422 [00:00<00:00, 120812916.07it/s]\n",
+      "100%|██████████| 9912422/9912422 [00:00<00:00, 101305832.98it/s]\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64102)\u001b[0m Extracting /tmp/tmpydcy4598/MNIST/raw/train-images-idx3-ubyte.gz to /tmp/tmpydcy4598/MNIST/raw\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64102)\u001b[0m \n"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64102)\u001b[0m LOCAL_RANK: 1 - CUDA_VISIBLE_DEVICES: [0,1,2]\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64101)\u001b[0m \n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64101)\u001b[0m   | Name     | Type               | Params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64101)\u001b[0m ------------------------------------------------\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64101)\u001b[0m 0 | accuracy | MulticlassAccuracy | 0     \n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64101)\u001b[0m 1 | layer_1  | Linear             | 50.2 K\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64101)\u001b[0m 2 | layer_2  | Linear             | 16.6 K\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64101)\u001b[0m 3 | layer_3  | Linear             | 2.6 K \n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64101)\u001b[0m ------------------------------------------------\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64101)\u001b[0m 69.5 K    Trainable params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64101)\u001b[0m 0         Non-trainable params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64101)\u001b[0m 69.5 K    Total params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64101)\u001b[0m 0.278     Total estimated model params size (MB)\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64102)\u001b[0m [W reducer.cpp:1300] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration,  which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\u001b[2m\u001b[1m\u001b[36m(autoscaler +7m33s)\u001b[0m [autoscaler] Current infeasible resource requests: {\"resourcesBundle\":{\"bundle_group_289661bddaad4820732f117e33d702000000\":0.001}}, {\"resourcesBundle\":{\"bundle_group_d14ed93ffcb267f77984fc5e097c02000000\":0.001}}, {\"resourcesBundle\":{\"bundle_group_9d0f0584af89d9185ad87362359402000000\":0.001}}, {\"resourcesBundle\":{\"bundle_group_b8fdebe2246b003d6e5d0451465b02000000\":0.001}}, {\"resourcesBundle\":{\"bundle_group_35d0a11b5707ef020363a907e5fc02000000\":0.001}}, {\"resourcesBundle\":{\"bundle_group_ba2b3c448809cad351fc7dc545a402000000\":0.001}}, {\"resourcesBundle\":{\"bundle_group_05283c0cbfbb775ad68aacf47bc702000000\":0.001}}, {\"resourcesBundle\":{\"bundle_group_2cd0e3d931d1e356a1ab0f3afb6a02000000\":0.001}}, {\"resourcesBundle\":{\"bundle_group_14f2bd9329dfcde35c77e8474b0f02000000\":0.001}}\n"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64102)\u001b[0m Checkpoint successfully created at: Checkpoint(filesystem=local, path=/home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00000_0_batch_size=32,layer_1_size=64,layer_2_size=256,lr=0.0316_2023-09-07_13-58-38/checkpoint_000000)\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64103)\u001b[0m 2023-09-07 13:58:50.448640: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64103)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64101)\u001b[0m 2023-09-07 13:58:50.555450: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64101)\u001b[0m 2023-09-07 13:58:51.317611: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\u001b[32m [repeated 4x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64101)\u001b[0m 2023-09-07 13:58:51.317618: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64101)\u001b[0m Missing logger folder: /home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00000_0_batch_size=32,layer_1_size=64,layer_2_size=256,lr=0.0316_2023-09-07_13-58-38/lightning_logs\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "100%|██████████| 4542/4542 [00:00<00:00, 42147187.54it/s]\u001b[32m [repeated 11x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64101)\u001b[0m LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2]\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64101)\u001b[0m [W reducer.cpp:1300] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration,  which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64102)\u001b[0m Checkpoint successfully created at: Checkpoint(filesystem=local, path=/home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00000_0_batch_size=32,layer_1_size=64,layer_2_size=256,lr=0.0316_2023-09-07_13-58-38/checkpoint_000002)\u001b[32m [repeated 6x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64102)\u001b[0m Checkpoint successfully created at: Checkpoint(filesystem=local, path=/home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00000_0_batch_size=32,layer_1_size=64,layer_2_size=256,lr=0.0316_2023-09-07_13-58-38/checkpoint_000004)\u001b[32m [repeated 6x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=71294)\u001b[0m 2023-09-07 13:59:19.340985: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=71294)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64101)\u001b[0m Checkpoint successfully created at: Checkpoint(filesystem=local, path=/home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00000_0_batch_size=32,layer_1_size=64,layer_2_size=256,lr=0.0316_2023-09-07_13-58-38/checkpoint_000004)\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=71294)\u001b[0m 2023-09-07 13:59:19.479380: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=71294)\u001b[0m 2023-09-07 13:59:20.227539: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=71294)\u001b[0m 2023-09-07 13:59:20.227616: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=71294)\u001b[0m 2023-09-07 13:59:20.227623: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\n",
+      "\u001b[2m\u001b[36m(TorchTrainer pid=71294)\u001b[0m Starting distributed worker processes: ['71407 (10.0.0.84)', '71408 (10.0.0.84)', '71409 (10.0.0.84)']\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71407)\u001b[0m Setting up process group for: env:// [rank=0, world_size=3]\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71408)\u001b[0m 2023-09-07 13:59:26.852631: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71408)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71407)\u001b[0m 2023-09-07 13:59:26.854221: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71407)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71408)\u001b[0m 2023-09-07 13:59:26.986178: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71408)\u001b[0m 2023-09-07 13:59:27.752593: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71408)\u001b[0m 2023-09-07 13:59:27.752672: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71408)\u001b[0m 2023-09-07 13:59:27.752679: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71407)\u001b[0m /home/ray/anaconda3/lib/python3.9/site-packages/pytorch_lightning/loops/utilities.py:92: PossibleUserWarning: `max_epochs` was not set. Setting it to 1000 epochs. To train without an epoch limit, set `max_epochs=-1`.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71407)\u001b[0m   rank_zero_warn(\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71407)\u001b[0m GPU available: True, used: True\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71407)\u001b[0m TPU available: False, using: 0 TPU cores\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71407)\u001b[0m IPU available: False, using: 0 IPUs\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71407)\u001b[0m HPU available: False, using: 0 HPUs\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71408)\u001b[0m Missing logger folder: /home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00001_1_batch_size=64,layer_1_size=128,layer_2_size=64,lr=0.0839_2023-09-07_13-58-38/lightning_logs\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71408)\u001b[0m Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz\u001b[32m [repeated 12x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64101)\u001b[0m Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to /tmp/tmpt8k8jglf/MNIST/raw/t10k-labels-idx1-ubyte.gz\u001b[32m [repeated 11x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64101)\u001b[0m Extracting /tmp/tmpt8k8jglf/MNIST/raw/t10k-labels-idx1-ubyte.gz to /tmp/tmpt8k8jglf/MNIST/raw\u001b[32m [repeated 11x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=64101)\u001b[0m \u001b[32m [repeated 11x across cluster]\u001b[0m\n"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "  0%|          | 0/9912422 [00:00<?, ?it/s]\n",
+      "100%|██████████| 9912422/9912422 [00:00<00:00, 100664900.56it/s]\n",
+      "100%|██████████| 9912422/9912422 [00:00<00:00, 86590268.41it/s]\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71408)\u001b[0m LOCAL_RANK: 1 - CUDA_VISIBLE_DEVICES: [0,1,2]\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71407)\u001b[0m   | Name     | Type               | Params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71407)\u001b[0m ------------------------------------------------\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71407)\u001b[0m 0 | accuracy | MulticlassAccuracy | 0     \n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71407)\u001b[0m 1 | layer_1  | Linear             | 100 K \n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71407)\u001b[0m 2 | layer_2  | Linear             | 8.3 K \n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71407)\u001b[0m 3 | layer_3  | Linear             | 650   \n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71407)\u001b[0m ------------------------------------------------\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71407)\u001b[0m 109 K     Trainable params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71407)\u001b[0m 0         Non-trainable params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71407)\u001b[0m 109 K     Total params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71407)\u001b[0m 0.438     Total estimated model params size (MB)\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71407)\u001b[0m [W reducer.cpp:1300] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration,  which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71408)\u001b[0m Checkpoint successfully created at: Checkpoint(filesystem=local, path=/home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00001_1_batch_size=64,layer_1_size=128,layer_2_size=64,lr=0.0839_2023-09-07_13-58-38/checkpoint_000000)\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71409)\u001b[0m 2023-09-07 13:59:26.851614: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71409)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71409)\u001b[0m 2023-09-07 13:59:26.986178: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71409)\u001b[0m 2023-09-07 13:59:27.752674: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\u001b[32m [repeated 4x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71409)\u001b[0m 2023-09-07 13:59:27.752681: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=73540)\u001b[0m 2023-09-07 13:59:38.336002: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=73540)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71409)\u001b[0m Missing logger folder: /home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00001_1_batch_size=64,layer_1_size=128,layer_2_size=64,lr=0.0839_2023-09-07_13-58-38/lightning_logs\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "100%|██████████| 4542/4542 [00:00<00:00, 23461242.33it/s]\u001b[32m [repeated 11x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71407)\u001b[0m \u001b[32m [repeated 5x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71409)\u001b[0m LOCAL_RANK: 2 - CUDA_VISIBLE_DEVICES: [0,1,2]\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71408)\u001b[0m [W reducer.cpp:1300] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration,  which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71409)\u001b[0m Checkpoint successfully created at: Checkpoint(filesystem=local, path=/home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00001_1_batch_size=64,layer_1_size=128,layer_2_size=64,lr=0.0839_2023-09-07_13-58-38/checkpoint_000000)\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=73540)\u001b[0m 2023-09-07 13:59:38.476177: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=73540)\u001b[0m 2023-09-07 13:59:39.222782: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=73540)\u001b[0m 2023-09-07 13:59:39.222788: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\n",
+      "\u001b[2m\u001b[36m(TorchTrainer pid=73540)\u001b[0m Starting distributed worker processes: ['73647 (10.0.0.84)', '73648 (10.0.0.84)', '73649 (10.0.0.84)']\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73647)\u001b[0m Setting up process group for: env:// [rank=0, world_size=3]\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73648)\u001b[0m 2023-09-07 13:59:45.901023: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73648)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73648)\u001b[0m 2023-09-07 13:59:46.041760: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73649)\u001b[0m 2023-09-07 13:59:45.964229: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73649)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73648)\u001b[0m 2023-09-07 13:59:46.807096: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73648)\u001b[0m 2023-09-07 13:59:46.807173: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73648)\u001b[0m 2023-09-07 13:59:46.807180: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73648)\u001b[0m Missing logger folder: /home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00002_2_batch_size=32,layer_1_size=64,layer_2_size=256,lr=0.0002_2023-09-07_13-58-38/lightning_logs\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73647)\u001b[0m /home/ray/anaconda3/lib/python3.9/site-packages/pytorch_lightning/loops/utilities.py:92: PossibleUserWarning: `max_epochs` was not set. Setting it to 1000 epochs. To train without an epoch limit, set `max_epochs=-1`.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73647)\u001b[0m   rank_zero_warn(\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73647)\u001b[0m GPU available: True, used: True\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73647)\u001b[0m TPU available: False, using: 0 TPU cores\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73647)\u001b[0m IPU available: False, using: 0 IPUs\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73647)\u001b[0m HPU available: False, using: 0 HPUs\n",
+      "  0%|          | 0/9912422 [00:00<?, ?it/s]\n",
+      "100%|██████████| 9912422/9912422 [00:00<00:00, 88411180.44it/s]\n",
+      " 60%|█████▉    | 5931008/9912422 [00:00<00:00, 57942493.14it/s]\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73648)\u001b[0m Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz\u001b[32m [repeated 12x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73648)\u001b[0m Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to /tmp/tmpcy67mfe_/MNIST/raw/train-images-idx3-ubyte.gz\u001b[32m [repeated 13x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71409)\u001b[0m Extracting /tmp/tmpmxchio03/MNIST/raw/t10k-labels-idx1-ubyte.gz to /tmp/tmpmxchio03/MNIST/raw\u001b[32m [repeated 12x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=71409)\u001b[0m \u001b[32m [repeated 12x across cluster]\u001b[0m\n"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73648)\u001b[0m LOCAL_RANK: 1 - CUDA_VISIBLE_DEVICES: [0,1,2]\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73647)\u001b[0m   | Name     | Type               | Params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73647)\u001b[0m ------------------------------------------------\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73647)\u001b[0m 0 | accuracy | MulticlassAccuracy | 0     \n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73647)\u001b[0m 1 | layer_1  | Linear             | 50.2 K\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73647)\u001b[0m 2 | layer_2  | Linear             | 16.6 K\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73647)\u001b[0m 3 | layer_3  | Linear             | 2.6 K \n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73647)\u001b[0m ------------------------------------------------\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73647)\u001b[0m 69.5 K    Trainable params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73647)\u001b[0m 0         Non-trainable params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73647)\u001b[0m 69.5 K    Total params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73647)\u001b[0m 0.278     Total estimated model params size (MB)\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73648)\u001b[0m [W reducer.cpp:1300] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration,  which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73648)\u001b[0m Checkpoint successfully created at: Checkpoint(filesystem=local, path=/home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00002_2_batch_size=32,layer_1_size=64,layer_2_size=256,lr=0.0002_2023-09-07_13-58-38/checkpoint_000000)\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73647)\u001b[0m 2023-09-07 13:59:46.102948: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73647)\u001b[0m 2023-09-07 13:59:45.969366: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73647)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73647)\u001b[0m 2023-09-07 13:59:46.898646: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\u001b[32m [repeated 4x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73647)\u001b[0m 2023-09-07 13:59:46.898654: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73647)\u001b[0m Missing logger folder: /home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00002_2_batch_size=32,layer_1_size=64,layer_2_size=256,lr=0.0002_2023-09-07_13-58-38/lightning_logs\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "100%|██████████| 4542/4542 [00:00<00:00, 45575427.67it/s]\u001b[32m [repeated 11x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73647)\u001b[0m \u001b[32m [repeated 4x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73647)\u001b[0m LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2]\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73647)\u001b[0m [W reducer.cpp:1300] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration,  which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73648)\u001b[0m Checkpoint successfully created at: Checkpoint(filesystem=local, path=/home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00002_2_batch_size=32,layer_1_size=64,layer_2_size=256,lr=0.0002_2023-09-07_13-58-38/checkpoint_000002)\u001b[32m [repeated 6x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73648)\u001b[0m Checkpoint successfully created at: Checkpoint(filesystem=local, path=/home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00002_2_batch_size=32,layer_1_size=64,layer_2_size=256,lr=0.0002_2023-09-07_13-58-38/checkpoint_000004)\u001b[32m [repeated 6x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=80840)\u001b[0m 2023-09-07 14:00:14.333330: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=80840)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=80840)\u001b[0m 2023-09-07 14:00:14.472277: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73647)\u001b[0m Checkpoint successfully created at: Checkpoint(filesystem=local, path=/home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00002_2_batch_size=32,layer_1_size=64,layer_2_size=256,lr=0.0002_2023-09-07_13-58-38/checkpoint_000004)\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=80840)\u001b[0m 2023-09-07 14:00:15.216259: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=80840)\u001b[0m 2023-09-07 14:00:15.216329: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=80840)\u001b[0m 2023-09-07 14:00:15.216336: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\n",
+      "\u001b[2m\u001b[36m(TorchTrainer pid=80840)\u001b[0m Starting distributed worker processes: ['80950 (10.0.0.84)', '80951 (10.0.0.84)', '80952 (10.0.0.84)']\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80950)\u001b[0m Setting up process group for: env:// [rank=0, world_size=3]\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80950)\u001b[0m 2023-09-07 14:00:21.817341: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80950)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80951)\u001b[0m 2023-09-07 14:00:21.817340: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80951)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80950)\u001b[0m 2023-09-07 14:00:21.952950: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80950)\u001b[0m 2023-09-07 14:00:22.721445: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80950)\u001b[0m 2023-09-07 14:00:22.721524: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80950)\u001b[0m 2023-09-07 14:00:22.721531: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80950)\u001b[0m /home/ray/anaconda3/lib/python3.9/site-packages/pytorch_lightning/loops/utilities.py:92: PossibleUserWarning: `max_epochs` was not set. Setting it to 1000 epochs. To train without an epoch limit, set `max_epochs=-1`.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80950)\u001b[0m   rank_zero_warn(\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80950)\u001b[0m GPU available: True, used: True\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80950)\u001b[0m TPU available: False, using: 0 TPU cores\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80950)\u001b[0m IPU available: False, using: 0 IPUs\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80950)\u001b[0m HPU available: False, using: 0 HPUs\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80950)\u001b[0m Missing logger folder: /home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00003_3_batch_size=64,layer_1_size=128,layer_2_size=64,lr=0.0011_2023-09-07_13-58-38/lightning_logs\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80950)\u001b[0m Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz\u001b[32m [repeated 12x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80950)\u001b[0m Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to /tmp/tmpdj6sv23q/MNIST/raw/train-images-idx3-ubyte.gz\u001b[32m [repeated 12x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73647)\u001b[0m Extracting /tmp/tmpjm0jv6rr/MNIST/raw/t10k-labels-idx1-ubyte.gz to /tmp/tmpjm0jv6rr/MNIST/raw\u001b[32m [repeated 12x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=73647)\u001b[0m \u001b[32m [repeated 12x across cluster]\u001b[0m\n"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "100%|██████████| 9912422/9912422 [00:00<00:00, 120421348.01it/s]\n",
+      "100%|██████████| 9912422/9912422 [00:00<00:00, 111998101.50it/s]\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80950)\u001b[0m LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2]\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80950)\u001b[0m   | Name     | Type               | Params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80950)\u001b[0m ------------------------------------------------\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80950)\u001b[0m 0 | accuracy | MulticlassAccuracy | 0     \n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80950)\u001b[0m 1 | layer_1  | Linear             | 100 K \n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80950)\u001b[0m 2 | layer_2  | Linear             | 8.3 K \n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80950)\u001b[0m 3 | layer_3  | Linear             | 650   \n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80950)\u001b[0m ------------------------------------------------\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80950)\u001b[0m 109 K     Trainable params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80950)\u001b[0m 0         Non-trainable params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80950)\u001b[0m 109 K     Total params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80950)\u001b[0m 0.438     Total estimated model params size (MB)\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80950)\u001b[0m [W reducer.cpp:1300] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration,  which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80950)\u001b[0m Checkpoint successfully created at: Checkpoint(filesystem=local, path=/home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00003_3_batch_size=64,layer_1_size=128,layer_2_size=64,lr=0.0011_2023-09-07_13-58-38/checkpoint_000000)\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80952)\u001b[0m 2023-09-07 14:00:21.817339: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80952)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80952)\u001b[0m 2023-09-07 14:00:21.952959: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80952)\u001b[0m 2023-09-07 14:00:22.721494: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\u001b[32m [repeated 4x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80952)\u001b[0m 2023-09-07 14:00:22.721502: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80952)\u001b[0m Missing logger folder: /home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00003_3_batch_size=64,layer_1_size=128,layer_2_size=64,lr=0.0011_2023-09-07_13-58-38/lightning_logs\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "100%|██████████| 4542/4542 [00:00<00:00, 39279440.76it/s]\u001b[32m [repeated 11x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80950)\u001b[0m \n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80952)\u001b[0m LOCAL_RANK: 2 - CUDA_VISIBLE_DEVICES: [0,1,2]\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80952)\u001b[0m [W reducer.cpp:1300] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration,  which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80950)\u001b[0m Checkpoint successfully created at: Checkpoint(filesystem=local, path=/home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00003_3_batch_size=64,layer_1_size=128,layer_2_size=64,lr=0.0011_2023-09-07_13-58-38/checkpoint_000003)\u001b[32m [repeated 9x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=88077)\u001b[0m 2023-09-07 14:00:43.334099: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=88077)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80952)\u001b[0m Checkpoint successfully created at: Checkpoint(filesystem=local, path=/home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00003_3_batch_size=64,layer_1_size=128,layer_2_size=64,lr=0.0011_2023-09-07_13-58-38/checkpoint_000004)\u001b[32m [repeated 5x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=88077)\u001b[0m 2023-09-07 14:00:43.474522: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=88077)\u001b[0m 2023-09-07 14:00:44.217911: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=88077)\u001b[0m 2023-09-07 14:00:44.217986: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=88077)\u001b[0m 2023-09-07 14:00:44.217994: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\n",
+      "\u001b[2m\u001b[36m(TorchTrainer pid=88077)\u001b[0m Starting distributed worker processes: ['88184 (10.0.0.84)', '88185 (10.0.0.84)', '88186 (10.0.0.84)']\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88184)\u001b[0m Setting up process group for: env:// [rank=0, world_size=3]\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88186)\u001b[0m 2023-09-07 14:00:50.980950: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88186)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88185)\u001b[0m 2023-09-07 14:00:50.969448: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88185)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88186)\u001b[0m 2023-09-07 14:00:51.106653: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88186)\u001b[0m 2023-09-07 14:00:51.878087: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88186)\u001b[0m 2023-09-07 14:00:51.878157: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88186)\u001b[0m 2023-09-07 14:00:51.878165: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88186)\u001b[0m Missing logger folder: /home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00004_4_batch_size=32,layer_1_size=32,layer_2_size=128,lr=0.0011_2023-09-07_13-58-38/lightning_logs\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88184)\u001b[0m /home/ray/anaconda3/lib/python3.9/site-packages/pytorch_lightning/loops/utilities.py:92: PossibleUserWarning: `max_epochs` was not set. Setting it to 1000 epochs. To train without an epoch limit, set `max_epochs=-1`.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88184)\u001b[0m   rank_zero_warn(\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88184)\u001b[0m GPU available: True, used: True\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88184)\u001b[0m TPU available: False, using: 0 TPU cores\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88184)\u001b[0m IPU available: False, using: 0 IPUs\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88184)\u001b[0m HPU available: False, using: 0 HPUs\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88186)\u001b[0m Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz\u001b[32m [repeated 12x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88186)\u001b[0m Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to /tmp/tmpd1qkzrfz/MNIST/raw/train-images-idx3-ubyte.gz\u001b[32m [repeated 12x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80951)\u001b[0m Extracting /tmp/tmpyrcbok27/MNIST/raw/t10k-labels-idx1-ubyte.gz to /tmp/tmpyrcbok27/MNIST/raw\u001b[32m [repeated 12x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=80951)\u001b[0m \u001b[32m [repeated 12x across cluster]\u001b[0m\n"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "  0%|          | 0/9912422 [00:00<?, ?it/s]\n",
+      "100%|██████████| 9912422/9912422 [00:00<00:00, 135946084.34it/s]\n",
+      " 61%|██████▏   | 6094848/9912422 [00:00<00:00, 60581952.53it/s]\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88186)\u001b[0m LOCAL_RANK: 2 - CUDA_VISIBLE_DEVICES: [0,1,2]\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88184)\u001b[0m   | Name     | Type               | Params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88184)\u001b[0m ------------------------------------------------\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88184)\u001b[0m 0 | accuracy | MulticlassAccuracy | 0     \n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88184)\u001b[0m 1 | layer_1  | Linear             | 25.1 K\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88184)\u001b[0m 2 | layer_2  | Linear             | 4.2 K \n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88184)\u001b[0m 3 | layer_3  | Linear             | 1.3 K \n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88184)\u001b[0m ------------------------------------------------\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88184)\u001b[0m 30.6 K    Trainable params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88184)\u001b[0m 0         Non-trainable params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88184)\u001b[0m 30.6 K    Total params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88184)\u001b[0m 0.123     Total estimated model params size (MB)\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88186)\u001b[0m [W reducer.cpp:1300] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration,  which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88186)\u001b[0m Checkpoint successfully created at: Checkpoint(filesystem=local, path=/home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00004_4_batch_size=32,layer_1_size=32,layer_2_size=128,lr=0.0011_2023-09-07_13-58-38/checkpoint_000000)\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88184)\u001b[0m 2023-09-07 14:00:50.969450: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88184)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88184)\u001b[0m 2023-09-07 14:00:51.106653: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88184)\u001b[0m 2023-09-07 14:00:51.876301: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\u001b[32m [repeated 4x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88184)\u001b[0m 2023-09-07 14:00:51.876309: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88184)\u001b[0m Missing logger folder: /home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00004_4_batch_size=32,layer_1_size=32,layer_2_size=128,lr=0.0011_2023-09-07_13-58-38/lightning_logs\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "100%|██████████| 4542/4542 [00:00<00:00, 47154774.18it/s]\u001b[32m [repeated 11x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88184)\u001b[0m \u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "100%|██████████| 9912422/9912422 [00:00<00:00, 87231776.04it/s]\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88184)\u001b[0m LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2]\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88184)\u001b[0m [W reducer.cpp:1300] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration,  which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88186)\u001b[0m Checkpoint successfully created at: Checkpoint(filesystem=local, path=/home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00004_4_batch_size=32,layer_1_size=32,layer_2_size=128,lr=0.0011_2023-09-07_13-58-38/checkpoint_000002)\u001b[32m [repeated 6x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88186)\u001b[0m Checkpoint successfully created at: Checkpoint(filesystem=local, path=/home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00004_4_batch_size=32,layer_1_size=32,layer_2_size=128,lr=0.0011_2023-09-07_13-58-38/checkpoint_000004)\u001b[32m [repeated 6x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=95388)\u001b[0m 2023-09-07 14:01:20.343383: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=95388)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88184)\u001b[0m Checkpoint successfully created at: Checkpoint(filesystem=local, path=/home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00004_4_batch_size=32,layer_1_size=32,layer_2_size=128,lr=0.0011_2023-09-07_13-58-38/checkpoint_000004)\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=95388)\u001b[0m 2023-09-07 14:01:20.484476: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=95388)\u001b[0m 2023-09-07 14:01:21.230226: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=95388)\u001b[0m 2023-09-07 14:01:21.230300: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=95388)\u001b[0m 2023-09-07 14:01:21.230307: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\n",
+      "\u001b[2m\u001b[36m(TorchTrainer pid=95388)\u001b[0m Starting distributed worker processes: ['95492 (10.0.0.84)', '95493 (10.0.0.84)', '95494 (10.0.0.84)']\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95492)\u001b[0m Setting up process group for: env:// [rank=0, world_size=3]\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95494)\u001b[0m 2023-09-07 14:01:27.861861: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95494)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95492)\u001b[0m 2023-09-07 14:01:27.861862: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95492)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95494)\u001b[0m 2023-09-07 14:01:27.995553: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95494)\u001b[0m 2023-09-07 14:01:28.761910: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95494)\u001b[0m 2023-09-07 14:01:28.761983: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95494)\u001b[0m 2023-09-07 14:01:28.761990: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95492)\u001b[0m /home/ray/anaconda3/lib/python3.9/site-packages/pytorch_lightning/loops/utilities.py:92: PossibleUserWarning: `max_epochs` was not set. Setting it to 1000 epochs. To train without an epoch limit, set `max_epochs=-1`.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95492)\u001b[0m   rank_zero_warn(\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95492)\u001b[0m GPU available: True, used: True\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95492)\u001b[0m TPU available: False, using: 0 TPU cores\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95492)\u001b[0m IPU available: False, using: 0 IPUs\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95492)\u001b[0m HPU available: False, using: 0 HPUs\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95492)\u001b[0m Missing logger folder: /home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00005_5_batch_size=32,layer_1_size=64,layer_2_size=64,lr=0.0092_2023-09-07_13-58-38/lightning_logs\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95494)\u001b[0m Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz\u001b[32m [repeated 12x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95494)\u001b[0m Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to /tmp/tmpkvf1rrst/MNIST/raw/train-images-idx3-ubyte.gz\u001b[32m [repeated 12x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88184)\u001b[0m Extracting /tmp/tmppk4zrz1w/MNIST/raw/t10k-labels-idx1-ubyte.gz to /tmp/tmppk4zrz1w/MNIST/raw\u001b[32m [repeated 12x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=88184)\u001b[0m \u001b[32m [repeated 12x across cluster]\u001b[0m\n"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "  0%|          | 0/9912422 [00:00<?, ?it/s]\n",
+      "100%|██████████| 9912422/9912422 [00:00<00:00, 117459779.70it/s]\n",
+      " 74%|███████▍  | 7372800/9912422 [00:00<00:00, 73213483.02it/s]\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95494)\u001b[0m LOCAL_RANK: 2 - CUDA_VISIBLE_DEVICES: [0,1,2]\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95492)\u001b[0m   | Name     | Type               | Params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95492)\u001b[0m ------------------------------------------------\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95492)\u001b[0m 0 | accuracy | MulticlassAccuracy | 0     \n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95492)\u001b[0m 1 | layer_1  | Linear             | 50.2 K\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95492)\u001b[0m 2 | layer_2  | Linear             | 4.2 K \n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95492)\u001b[0m 3 | layer_3  | Linear             | 650   \n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95492)\u001b[0m ------------------------------------------------\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95492)\u001b[0m 55.1 K    Trainable params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95492)\u001b[0m 0         Non-trainable params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95492)\u001b[0m 55.1 K    Total params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95492)\u001b[0m 0.220     Total estimated model params size (MB)\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95494)\u001b[0m [W reducer.cpp:1300] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration,  which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95494)\u001b[0m Checkpoint successfully created at: Checkpoint(filesystem=local, path=/home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00005_5_batch_size=32,layer_1_size=64,layer_2_size=64,lr=0.0092_2023-09-07_13-58-38/checkpoint_000000)\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95493)\u001b[0m 2023-09-07 14:01:27.861861: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95493)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95493)\u001b[0m 2023-09-07 14:01:27.995552: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95493)\u001b[0m 2023-09-07 14:01:28.758718: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\u001b[32m [repeated 4x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95493)\u001b[0m 2023-09-07 14:01:28.758742: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95494)\u001b[0m Missing logger folder: /home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00005_5_batch_size=32,layer_1_size=64,layer_2_size=64,lr=0.0092_2023-09-07_13-58-38/lightning_logs\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "100%|██████████| 4542/4542 [00:00<00:00, 48598287.67it/s]\u001b[32m [repeated 10x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95492)\u001b[0m \u001b[32m [repeated 4x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95493)\u001b[0m LOCAL_RANK: 1 - CUDA_VISIBLE_DEVICES: [0,1,2]\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95493)\u001b[0m [W reducer.cpp:1300] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration,  which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95494)\u001b[0m Checkpoint successfully created at: Checkpoint(filesystem=local, path=/home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00005_5_batch_size=32,layer_1_size=64,layer_2_size=64,lr=0.0092_2023-09-07_13-58-38/checkpoint_000002)\u001b[32m [repeated 6x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=101434)\u001b[0m 2023-09-07 14:01:53.326795: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=101434)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95493)\u001b[0m Checkpoint successfully created at: Checkpoint(filesystem=local, path=/home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00005_5_batch_size=32,layer_1_size=64,layer_2_size=64,lr=0.0092_2023-09-07_13-58-38/checkpoint_000003)\u001b[32m [repeated 5x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=101434)\u001b[0m 2023-09-07 14:01:53.463803: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=101434)\u001b[0m 2023-09-07 14:01:54.201636: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=101434)\u001b[0m 2023-09-07 14:01:54.201711: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=101434)\u001b[0m 2023-09-07 14:01:54.201718: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\n",
+      "\u001b[2m\u001b[36m(TorchTrainer pid=101434)\u001b[0m Starting distributed worker processes: ['101544 (10.0.0.84)', '101545 (10.0.0.84)', '101546 (10.0.0.84)']\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101544)\u001b[0m Setting up process group for: env:// [rank=0, world_size=3]\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101545)\u001b[0m 2023-09-07 14:02:00.834273: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101545)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101544)\u001b[0m 2023-09-07 14:02:00.834274: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101544)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101545)\u001b[0m 2023-09-07 14:02:00.968155: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101545)\u001b[0m 2023-09-07 14:02:01.736107: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101545)\u001b[0m 2023-09-07 14:02:01.736184: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101545)\u001b[0m 2023-09-07 14:02:01.736191: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101545)\u001b[0m Missing logger folder: /home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00006_6_batch_size=32,layer_1_size=128,layer_2_size=256,lr=0.0033_2023-09-07_13-58-38/lightning_logs\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101544)\u001b[0m /home/ray/anaconda3/lib/python3.9/site-packages/pytorch_lightning/loops/utilities.py:92: PossibleUserWarning: `max_epochs` was not set. Setting it to 1000 epochs. To train without an epoch limit, set `max_epochs=-1`.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101544)\u001b[0m   rank_zero_warn(\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101544)\u001b[0m GPU available: True, used: True\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101544)\u001b[0m TPU available: False, using: 0 TPU cores\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101544)\u001b[0m IPU available: False, using: 0 IPUs\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101544)\u001b[0m HPU available: False, using: 0 HPUs\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101545)\u001b[0m Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz\u001b[32m [repeated 12x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95492)\u001b[0m Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to /tmp/tmpyy7a6r11/MNIST/raw/t10k-labels-idx1-ubyte.gz\u001b[32m [repeated 11x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95492)\u001b[0m Extracting /tmp/tmpyy7a6r11/MNIST/raw/t10k-labels-idx1-ubyte.gz to /tmp/tmpyy7a6r11/MNIST/raw\u001b[32m [repeated 12x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=95492)\u001b[0m \u001b[32m [repeated 12x across cluster]\u001b[0m\n"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "  0%|          | 0/9912422 [00:00<?, ?it/s]\n",
+      "100%|██████████| 9912422/9912422 [00:00<00:00, 104607984.65it/s]\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101545)\u001b[0m Extracting /tmp/tmpxobpdr_p/MNIST/raw/train-images-idx3-ubyte.gz to /tmp/tmpxobpdr_p/MNIST/raw\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101545)\u001b[0m Extracting /tmp/tmpxobpdr_p/MNIST/raw/train-labels-idx1-ubyte.gz to /tmp/tmpxobpdr_p/MNIST/raw\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101545)\u001b[0m Extracting /tmp/tmpxobpdr_p/MNIST/raw/t10k-images-idx3-ubyte.gz to /tmp/tmpxobpdr_p/MNIST/raw\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101545)\u001b[0m Extracting /tmp/tmpxobpdr_p/MNIST/raw/t10k-labels-idx1-ubyte.gz to /tmp/tmpxobpdr_p/MNIST/raw\n"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101545)\u001b[0m LOCAL_RANK: 1 - CUDA_VISIBLE_DEVICES: [0,1,2]\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101544)\u001b[0m   | Name     | Type               | Params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101544)\u001b[0m ------------------------------------------------\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101544)\u001b[0m 0 | accuracy | MulticlassAccuracy | 0     \n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101544)\u001b[0m 1 | layer_1  | Linear             | 100 K \n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101544)\u001b[0m 2 | layer_2  | Linear             | 33.0 K\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101544)\u001b[0m 3 | layer_3  | Linear             | 2.6 K \n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101544)\u001b[0m ------------------------------------------------\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101544)\u001b[0m 136 K     Trainable params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101544)\u001b[0m 0         Non-trainable params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101544)\u001b[0m 136 K     Total params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101544)\u001b[0m 0.544     Total estimated model params size (MB)\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101545)\u001b[0m [W reducer.cpp:1300] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration,  which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101545)\u001b[0m Checkpoint successfully created at: Checkpoint(filesystem=local, path=/home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00006_6_batch_size=32,layer_1_size=128,layer_2_size=256,lr=0.0033_2023-09-07_13-58-38/checkpoint_000000)\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101546)\u001b[0m 2023-09-07 14:02:00.834275: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101546)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101546)\u001b[0m 2023-09-07 14:02:00.968160: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101546)\u001b[0m 2023-09-07 14:02:01.736182: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\u001b[32m [repeated 4x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101546)\u001b[0m 2023-09-07 14:02:01.736190: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101546)\u001b[0m Missing logger folder: /home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00006_6_batch_size=32,layer_1_size=128,layer_2_size=256,lr=0.0033_2023-09-07_13-58-38/lightning_logs\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "100%|██████████| 4542/4542 [00:00<00:00, 38642046.18it/s]\u001b[32m [repeated 11x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101544)\u001b[0m \u001b[32m [repeated 3x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101546)\u001b[0m LOCAL_RANK: 2 - CUDA_VISIBLE_DEVICES: [0,1,2]\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101544)\u001b[0m [W reducer.cpp:1300] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration,  which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101545)\u001b[0m Checkpoint successfully created at: Checkpoint(filesystem=local, path=/home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00006_6_batch_size=32,layer_1_size=128,layer_2_size=256,lr=0.0033_2023-09-07_13-58-38/checkpoint_000002)\u001b[32m [repeated 6x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101545)\u001b[0m Checkpoint successfully created at: Checkpoint(filesystem=local, path=/home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00006_6_batch_size=32,layer_1_size=128,layer_2_size=256,lr=0.0033_2023-09-07_13-58-38/checkpoint_000004)\u001b[32m [repeated 6x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=108750)\u001b[0m 2023-09-07 14:02:30.387715: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=108750)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101546)\u001b[0m Checkpoint successfully created at: Checkpoint(filesystem=local, path=/home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00006_6_batch_size=32,layer_1_size=128,layer_2_size=256,lr=0.0033_2023-09-07_13-58-38/checkpoint_000004)\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=108750)\u001b[0m 2023-09-07 14:02:30.526490: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=108750)\u001b[0m 2023-09-07 14:02:31.271200: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=108750)\u001b[0m 2023-09-07 14:02:31.271270: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=108750)\u001b[0m 2023-09-07 14:02:31.271277: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\n",
+      "\u001b[2m\u001b[36m(TorchTrainer pid=108750)\u001b[0m Starting distributed worker processes: ['108861 (10.0.0.84)', '108862 (10.0.0.84)', '108863 (10.0.0.84)']\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108861)\u001b[0m Setting up process group for: env:// [rank=0, world_size=3]\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108862)\u001b[0m 2023-09-07 14:02:38.000239: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108862)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108863)\u001b[0m 2023-09-07 14:02:38.000240: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108863)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108862)\u001b[0m 2023-09-07 14:02:38.137493: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108862)\u001b[0m 2023-09-07 14:02:38.911788: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108862)\u001b[0m 2023-09-07 14:02:38.911870: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108862)\u001b[0m 2023-09-07 14:02:38.911877: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108861)\u001b[0m /home/ray/anaconda3/lib/python3.9/site-packages/pytorch_lightning/loops/utilities.py:92: PossibleUserWarning: `max_epochs` was not set. Setting it to 1000 epochs. To train without an epoch limit, set `max_epochs=-1`.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108861)\u001b[0m   rank_zero_warn(\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108861)\u001b[0m GPU available: True, used: True\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108861)\u001b[0m TPU available: False, using: 0 TPU cores\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108861)\u001b[0m IPU available: False, using: 0 IPUs\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108861)\u001b[0m HPU available: False, using: 0 HPUs\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108862)\u001b[0m Missing logger folder: /home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00007_7_batch_size=32,layer_1_size=32,layer_2_size=64,lr=0.0001_2023-09-07_13-58-38/lightning_logs\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108863)\u001b[0m Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz\u001b[32m [repeated 12x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101546)\u001b[0m Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to /tmp/tmpt_if2tuu/MNIST/raw/t10k-labels-idx1-ubyte.gz\u001b[32m [repeated 12x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101546)\u001b[0m Extracting /tmp/tmpt_if2tuu/MNIST/raw/t10k-labels-idx1-ubyte.gz to /tmp/tmpt_if2tuu/MNIST/raw\u001b[32m [repeated 8x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=101546)\u001b[0m \u001b[32m [repeated 12x across cluster]\u001b[0m\n"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "  0%|          | 0/9912422 [00:00<?, ?it/s]\n",
+      "100%|██████████| 9912422/9912422 [00:00<00:00, 111226266.99it/s]\n",
+      "100%|██████████| 9912422/9912422 [00:00<00:00, 89971437.39it/s]\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108862)\u001b[0m LOCAL_RANK: 1 - CUDA_VISIBLE_DEVICES: [0,1,2]\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108861)\u001b[0m   | Name     | Type               | Params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108861)\u001b[0m ------------------------------------------------\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108861)\u001b[0m 0 | accuracy | MulticlassAccuracy | 0     \n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108861)\u001b[0m 1 | layer_1  | Linear             | 25.1 K\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108861)\u001b[0m 2 | layer_2  | Linear             | 2.1 K \n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108861)\u001b[0m 3 | layer_3  | Linear             | 650   \n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108861)\u001b[0m ------------------------------------------------\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108861)\u001b[0m 27.9 K    Trainable params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108861)\u001b[0m 0         Non-trainable params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108861)\u001b[0m 27.9 K    Total params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108861)\u001b[0m 0.112     Total estimated model params size (MB)\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108862)\u001b[0m [W reducer.cpp:1300] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration,  which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108862)\u001b[0m Checkpoint successfully created at: Checkpoint(filesystem=local, path=/home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00007_7_batch_size=32,layer_1_size=32,layer_2_size=64,lr=0.0001_2023-09-07_13-58-38/checkpoint_000000)\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108861)\u001b[0m 2023-09-07 14:02:38.000239: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108861)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108861)\u001b[0m 2023-09-07 14:02:38.137493: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108861)\u001b[0m 2023-09-07 14:02:38.911832: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\u001b[32m [repeated 4x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108861)\u001b[0m 2023-09-07 14:02:38.911839: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108861)\u001b[0m Missing logger folder: /home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00007_7_batch_size=32,layer_1_size=32,layer_2_size=64,lr=0.0001_2023-09-07_13-58-38/lightning_logs\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "100%|██████████| 4542/4542 [00:00<00:00, 42054147.39it/s]\u001b[32m [repeated 11x across cluster]\u001b[0m\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\u001b[2m\u001b[1m\u001b[36m(autoscaler +11m23s)\u001b[0m [workspace snapshot] New snapshot created successfully (Size: 327.01 KB)\n"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "\u001b[2m\u001b[36m(TrainTrainable pid=111019)\u001b[0m 2023-09-07 14:02:51.352608: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=111019)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108861)\u001b[0m \u001b[32m [repeated 3x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108861)\u001b[0m LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2]\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108861)\u001b[0m [W reducer.cpp:1300] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration,  which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108861)\u001b[0m Checkpoint successfully created at: Checkpoint(filesystem=local, path=/home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00007_7_batch_size=32,layer_1_size=32,layer_2_size=64,lr=0.0001_2023-09-07_13-58-38/checkpoint_000000)\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=111019)\u001b[0m 2023-09-07 14:02:51.493509: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=111019)\u001b[0m 2023-09-07 14:02:52.239731: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=111019)\u001b[0m 2023-09-07 14:02:52.239805: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=111019)\u001b[0m 2023-09-07 14:02:52.239812: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\n",
+      "\u001b[2m\u001b[36m(TorchTrainer pid=111019)\u001b[0m Starting distributed worker processes: ['111129 (10.0.0.84)', '111130 (10.0.0.84)', '111131 (10.0.0.84)']\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111129)\u001b[0m Setting up process group for: env:// [rank=0, world_size=3]\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111131)\u001b[0m 2023-09-07 14:02:58.909958: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111131)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111130)\u001b[0m 2023-09-07 14:02:58.910530: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111130)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111131)\u001b[0m 2023-09-07 14:02:59.041760: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111131)\u001b[0m 2023-09-07 14:02:59.809607: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111131)\u001b[0m 2023-09-07 14:02:59.809682: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111131)\u001b[0m 2023-09-07 14:02:59.809690: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111129)\u001b[0m /home/ray/anaconda3/lib/python3.9/site-packages/pytorch_lightning/loops/utilities.py:92: PossibleUserWarning: `max_epochs` was not set. Setting it to 1000 epochs. To train without an epoch limit, set `max_epochs=-1`.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111129)\u001b[0m   rank_zero_warn(\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111129)\u001b[0m GPU available: True, used: True\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111129)\u001b[0m TPU available: False, using: 0 TPU cores\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111129)\u001b[0m IPU available: False, using: 0 IPUs\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111129)\u001b[0m HPU available: False, using: 0 HPUs\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111131)\u001b[0m Missing logger folder: /home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00008_8_batch_size=64,layer_1_size=128,layer_2_size=256,lr=0.0037_2023-09-07_13-58-38/lightning_logs\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111131)\u001b[0m Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz\u001b[32m [repeated 12x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111131)\u001b[0m Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to /tmp/tmpddnnc0iv/MNIST/raw/train-images-idx3-ubyte.gz\u001b[32m [repeated 13x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108863)\u001b[0m Extracting /tmp/tmpxcg0v86z/MNIST/raw/t10k-labels-idx1-ubyte.gz to /tmp/tmpxcg0v86z/MNIST/raw\u001b[32m [repeated 12x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=108863)\u001b[0m \u001b[32m [repeated 12x across cluster]\u001b[0m\n"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "100%|██████████| 9912422/9912422 [00:00<00:00, 109686001.97it/s]\n",
+      "100%|██████████| 9912422/9912422 [00:00<00:00, 81254614.76it/s]\n",
+      "100%|██████████| 1648877/1648877 [00:00<00:00, 35741410.23it/s]\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111131)\u001b[0m LOCAL_RANK: 2 - CUDA_VISIBLE_DEVICES: [0,1,2]\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111129)\u001b[0m   | Name     | Type               | Params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111129)\u001b[0m ------------------------------------------------\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111129)\u001b[0m 0 | accuracy | MulticlassAccuracy | 0     \n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111129)\u001b[0m 1 | layer_1  | Linear             | 100 K \n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111129)\u001b[0m 2 | layer_2  | Linear             | 33.0 K\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111129)\u001b[0m 3 | layer_3  | Linear             | 2.6 K \n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111129)\u001b[0m ------------------------------------------------\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111129)\u001b[0m 136 K     Trainable params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111129)\u001b[0m 0         Non-trainable params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111129)\u001b[0m 136 K     Total params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111129)\u001b[0m 0.544     Total estimated model params size (MB)\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111131)\u001b[0m [W reducer.cpp:1300] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration,  which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111131)\u001b[0m Checkpoint successfully created at: Checkpoint(filesystem=local, path=/home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00008_8_batch_size=64,layer_1_size=128,layer_2_size=256,lr=0.0037_2023-09-07_13-58-38/checkpoint_000000)\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111129)\u001b[0m 2023-09-07 14:02:58.906403: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111129)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111129)\u001b[0m 2023-09-07 14:02:59.041757: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111129)\u001b[0m 2023-09-07 14:02:59.809306: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\u001b[32m [repeated 4x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111129)\u001b[0m 2023-09-07 14:02:59.809314: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111129)\u001b[0m Missing logger folder: /home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00008_8_batch_size=64,layer_1_size=128,layer_2_size=256,lr=0.0037_2023-09-07_13-58-38/lightning_logs\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "100%|██████████| 4542/4542 [00:00<00:00, 37135533.66it/s]\u001b[32m [repeated 11x across cluster]\u001b[0m\n",
+      "100%|██████████| 9912422/9912422 [00:00<00:00, 92298990.88it/s]\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111129)\u001b[0m \u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111129)\u001b[0m LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2]\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111129)\u001b[0m [W reducer.cpp:1300] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration,  which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111131)\u001b[0m Checkpoint successfully created at: Checkpoint(filesystem=local, path=/home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00008_8_batch_size=64,layer_1_size=128,layer_2_size=256,lr=0.0037_2023-09-07_13-58-38/checkpoint_000003)\u001b[32m [repeated 9x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=118255)\u001b[0m 2023-09-07 14:03:20.351292: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=118255)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111129)\u001b[0m Checkpoint successfully created at: Checkpoint(filesystem=local, path=/home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00008_8_batch_size=64,layer_1_size=128,layer_2_size=256,lr=0.0037_2023-09-07_13-58-38/checkpoint_000004)\u001b[32m [repeated 5x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=118255)\u001b[0m 2023-09-07 14:03:20.492641: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=118255)\u001b[0m 2023-09-07 14:03:21.239037: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=118255)\u001b[0m 2023-09-07 14:03:21.239106: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(TrainTrainable pid=118255)\u001b[0m 2023-09-07 14:03:21.239113: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\n",
+      "\u001b[2m\u001b[36m(TorchTrainer pid=118255)\u001b[0m Starting distributed worker processes: ['118362 (10.0.0.84)', '118363 (10.0.0.84)', '118364 (10.0.0.84)']\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118362)\u001b[0m Setting up process group for: env:// [rank=0, world_size=3]\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118363)\u001b[0m 2023-09-07 14:03:27.930188: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118363)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118364)\u001b[0m 2023-09-07 14:03:27.917602: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118364)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118363)\u001b[0m 2023-09-07 14:03:28.052415: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118363)\u001b[0m 2023-09-07 14:03:28.822569: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118363)\u001b[0m 2023-09-07 14:03:28.822644: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118363)\u001b[0m 2023-09-07 14:03:28.822652: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118363)\u001b[0m Missing logger folder: /home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00009_9_batch_size=32,layer_1_size=128,layer_2_size=128,lr=0.0040_2023-09-07_13-58-38/lightning_logs\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118362)\u001b[0m /home/ray/anaconda3/lib/python3.9/site-packages/pytorch_lightning/loops/utilities.py:92: PossibleUserWarning: `max_epochs` was not set. Setting it to 1000 epochs. To train without an epoch limit, set `max_epochs=-1`.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118362)\u001b[0m   rank_zero_warn(\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118362)\u001b[0m GPU available: True, used: True\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118362)\u001b[0m TPU available: False, using: 0 TPU cores\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118362)\u001b[0m IPU available: False, using: 0 IPUs\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118362)\u001b[0m HPU available: False, using: 0 HPUs\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118364)\u001b[0m Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz\u001b[32m [repeated 12x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118364)\u001b[0m Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to /tmp/tmp0sbwiedt/MNIST/raw/train-images-idx3-ubyte.gz\u001b[32m [repeated 12x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111130)\u001b[0m Extracting /tmp/tmpfmuq9_qh/MNIST/raw/t10k-labels-idx1-ubyte.gz to /tmp/tmpfmuq9_qh/MNIST/raw\u001b[32m [repeated 12x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=111130)\u001b[0m \u001b[32m [repeated 12x across cluster]\u001b[0m\n"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "  0%|          | 0/9912422 [00:00<?, ?it/s]\n",
+      "100%|██████████| 9912422/9912422 [00:00<00:00, 109752309.17it/s]\n",
+      "100%|██████████| 9912422/9912422 [00:00<00:00, 92575620.67it/s]\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118363)\u001b[0m LOCAL_RANK: 1 - CUDA_VISIBLE_DEVICES: [0,1,2]\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118362)\u001b[0m   | Name     | Type               | Params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118362)\u001b[0m ------------------------------------------------\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118362)\u001b[0m 0 | accuracy | MulticlassAccuracy | 0     \n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118362)\u001b[0m 1 | layer_1  | Linear             | 100 K \n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118362)\u001b[0m 2 | layer_2  | Linear             | 16.5 K\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118362)\u001b[0m 3 | layer_3  | Linear             | 1.3 K \n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118362)\u001b[0m ------------------------------------------------\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118362)\u001b[0m 118 K     Trainable params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118362)\u001b[0m 0         Non-trainable params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118362)\u001b[0m 118 K     Total params\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118362)\u001b[0m 0.473     Total estimated model params size (MB)\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118363)\u001b[0m [W reducer.cpp:1300] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration,  which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118363)\u001b[0m Checkpoint successfully created at: Checkpoint(filesystem=local, path=/home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00009_9_batch_size=32,layer_1_size=128,layer_2_size=128,lr=0.0040_2023-09-07_13-58-38/checkpoint_000000)\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118362)\u001b[0m 2023-09-07 14:03:27.912682: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F AVX512_VNNI FMA\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118362)\u001b[0m To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118362)\u001b[0m 2023-09-07 14:03:28.050355: I tensorflow/core/util/port.cc:104] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118362)\u001b[0m 2023-09-07 14:03:28.816159: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/nvidia/lib:/usr/local/nvidia/lib64\u001b[32m [repeated 4x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118362)\u001b[0m 2023-09-07 14:03:28.816166: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118362)\u001b[0m Missing logger folder: /home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00009_9_batch_size=32,layer_1_size=128,layer_2_size=128,lr=0.0040_2023-09-07_13-58-38/lightning_logs\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "100%|██████████| 4542/4542 [00:00<00:00, 42810177.01it/s]\u001b[32m [repeated 11x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118362)\u001b[0m \u001b[32m [repeated 4x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118362)\u001b[0m LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2]\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118362)\u001b[0m [W reducer.cpp:1300] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration,  which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())\u001b[32m [repeated 2x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118363)\u001b[0m Checkpoint successfully created at: Checkpoint(filesystem=local, path=/home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00009_9_batch_size=32,layer_1_size=128,layer_2_size=128,lr=0.0040_2023-09-07_13-58-38/checkpoint_000002)\u001b[32m [repeated 6x across cluster]\u001b[0m\n",
+      "\u001b[2m\u001b[36m(RayTrainWorker pid=118363)\u001b[0m Checkpoint successfully created at: Checkpoint(filesystem=local, path=/home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00009_9_batch_size=32,layer_1_size=128,layer_2_size=128,lr=0.0040_2023-09-07_13-58-38/checkpoint_000004)\u001b[32m [repeated 6x across cluster]\u001b[0m\n",
+      "2023-09-07 14:03:52,186\tINFO tune.py:1143 -- Total run time: 313.94 seconds (313.92 seconds for the tuning loop).\n"
+     ]
+    }
+   ],
    "source": [
     "def tune_mnist_asha(num_samples=10):\n",
     "    scheduler = ASHAScheduler(max_t=num_epochs, grace_period=1, reduction_factor=2)\n",
     "\n",
     "    tuner = tune.Tuner(\n",
-    "        lightning_trainer,\n",
-    "        param_space={\"lightning_config\": searchable_lightning_config},\n",
+    "        ray_trainer,\n",
+    "        param_space={\"train_loop_config\": search_space},\n",
     "        tune_config=tune.TuneConfig(\n",
     "            metric=\"ptl/val_accuracy\",\n",
     "            mode=\"max\",\n",
     "            num_samples=num_samples,\n",
     "            scheduler=scheduler,\n",
     "        ),\n",
-    "        run_config=air.RunConfig(\n",
-    "            name=\"tune_mnist_asha\",\n",
-    "        ),\n",
     "    )\n",
-    "    results = tuner.fit()\n",
-    "    best_result = results.get_best_result(metric=\"ptl/val_accuracy\", mode=\"max\")\n",
-    "    best_result\n",
+    "    return tuner.fit()\n",
     "\n",
     "\n",
-    "tune_mnist_asha(num_samples=num_samples)"
+    "results = tune_mnist_asha(num_samples=num_samples)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 29,
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "Result(\n",
+       "  metrics={'ptl/train_loss': 0.00108961365185678, 'ptl/train_accuracy': 1.0, 'ptl/val_loss': 0.05798737704753876, 'ptl/val_accuracy': 0.9820601940155029, 'epoch': 4, 'step': 1435},\n",
+       "  path='/home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00008_8_batch_size=64,layer_1_size=128,layer_2_size=256,lr=0.0037_2023-09-07_13-58-38',\n",
+       "  filesystem='local',\n",
+       "  checkpoint=Checkpoint(filesystem=local, path=/home/ray/ray_results/TorchTrainer_2023-09-07_13-58-38/TorchTrainer_5144b_00008_8_batch_size=64,layer_1_size=128,layer_2_size=256,lr=0.0037_2023-09-07_13-58-38/checkpoint_000004)\n",
+       ")"
+      ]
+     },
+     "execution_count": 29,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "results.get_best_result(metric=\"ptl/val_accuracy\", mode=\"max\")"
    ]
   },
   {
-   "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
     "In the example above, Tune runs 10 trials with different hyperparameter configurations.\n",
-    "An example output could look like so:\n",
-    "\n",
-    "```{code-block} bash\n",
-    ":emphasize-lines: 12\n",
-    "\n",
-    "  +------------------------------+------------+-------------------+----------------+----------------+-------------+----------+-----------------+----------------------+\n",
-    "  | Trial name                   | status     | loc               |   layer_1_size |   layer_2_size |          lr |     loss |   mean_accuracy |   training_iteration |\n",
-    "  |------------------------------+------------+-------------------+----------------+----------------+-------------+----------+-----------------+----------------------|\n",
-    "  | LightningTrainer_9532b_00001 | TERMINATED |  10.0.37.7:448989 |            32  |            64  | 0.00025324  | 0.58146  |       0.866667  |                   1  |\n",
-    "  | LightningTrainer_9532b_00002 | TERMINATED |  10.0.37.7:449722 |            128 |            128 | 0.000166782 | 0.29038  |       0.933333  |                   2  |\n",
-    "  | LightningTrainer_9532b_00003 | TERMINATED |  10.0.37.7:453404 |            64  |            128 | 0.0004948\t  | 0.15375  |       0.9       |                   4  |\n",
-    "  | LightningTrainer_9532b_00004 | TERMINATED |  10.0.37.7:457981 |            128 |            128 | 0.000304361 | 0.17622  |       0.966667  |                   4  |\n",
-    "  | LightningTrainer_9532b_00005 | TERMINATED |  10.0.37.7:467478 |            128 |            64  | 0.0344561\t  | 0.34665  |       0.866667  |                   1  |\n",
-    "  | LightningTrainer_9532b_00006 | TERMINATED |  10.0.37.7:484401 |            128 |            256 | 0.0262851\t  | 0.34981  |       0.866667  |                   1  |\n",
-    "  | LightningTrainer_9532b_00007 | TERMINATED |  10.0.37.7:490670 |            32  |            128 | 0.0550712\t  | 0.62575  |       0.766667  |                   1  |\n",
-    "  | LightningTrainer_9532b_00008 | TERMINATED |  10.0.37.7:491159 |            32  |            64  | 0.000489046 | 0.27384  |       0.966667  |                   2  |\n",
-    "  | LightningTrainer_9532b_00009 | TERMINATED |  10.0.37.7:491494 |            64  |            256 | 0.000395127 | 0.09642  |       0.933333  |                   4  |\n",
-    "  +------------------------------+------------+-------------------+----------------+----------------+-------------+----------+-----------------+----------------------+\n",
-    "```\n",
     "\n",
-    "As you can see in the `training_iteration` column, trials with a high loss\n",
-    "(and low accuracy) have been terminated early. The best performing trial used\n",
-    "`layer_1_size=32`, `layer_2_size=64`, and `lr=0.000489046`."
+    "As you can see in the `training_iteration` column, trials with a high loss (and low accuracy) have been terminated early. The best-performing trial used\n",
+    "`batch_size=64`, `layer_1_size=128`, `layer_2_size=256`, and `lr=0.0037`."
    ]
   },
   {
-   "attachments": {},
    "cell_type": "markdown",
    "metadata": {},
    "source": [
     "## More PyTorch Lightning Examples\n",
     "\n",
-    "- {ref}`Use LightningTrainer for Image Classification <lightning_mnist_example>`.\n",
-    "- {ref}`Use LightningTrainer with Ray Data and Batch Predictor <lightning_advanced_example>`\n",
-    "- {ref}`Experiment Tracking with Ray Train <train-experiment-tracking-native>`\n",
-    "- {ref}`Fine-tune a Large Language Model with LightningTrainer and FSDP <dolly_lightning_fsdp_finetuning>`\n",
+    "- {ref}`[Basic] Train a PyTorch Lightning Image Classifier with Ray Train <lightning_mnist_example>`\n",
+    "- {ref}`[Intermediate] Fine-tune a BERT Text Classifier with PyTorch Lightning and Ray Train <lightning_advanced_example>`\n",
+    "- {ref}`[Advanced] Fine-tune dolly-v2-7b with PyTorch Lightning and FSDP <dolly_lightning_fsdp_finetuning>`\n",
     "- {doc}`/tune/examples/includes/mlflow_ptl_example`: Example for using [MLflow](https://github.com/mlflow/mlflow/)\n",
-    "  and [Pytorch Lightning](https://github.com/PyTorchLightning/pytorch-lightning) with Ray Tune.\n",
-    "- {doc}`/tune/examples/includes/mnist_ptl_mini`:\n",
-    "  A minimal example of using [Pytorch Lightning](https://github.com/PyTorchLightning/pytorch-lightning)\n",
-    "  to train a MNIST model. This example utilizes the Ray Tune-provided\n",
-    "  {ref}`PyTorch Lightning callbacks <tune-integration-pytorch-lightning>`.\n"
+    "  and [PyTorch Lightning](https://github.com/PyTorchLightning/pytorch-lightning) with Ray Tune.\n"
    ]
   }
  ],
@@ -549,7 +1332,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.8.16"
+   "version": "3.9.15"
   }
  },
  "nbformat": 4,
diff --git a/doc/source/tune/examples/tune-vanilla-pytorch-lightning.ipynb b/doc/source/tune/examples/tune-vanilla-pytorch-lightning.ipynb
index 2f1e4a1e30af2..58746f314e87e 100644
--- a/doc/source/tune/examples/tune-vanilla-pytorch-lightning.ipynb
+++ b/doc/source/tune/examples/tune-vanilla-pytorch-lightning.ipynb
@@ -494,7 +494,7 @@
     "make sure multiple trials can share GPUs and there is enough memory to do so.\n",
     "Ray does not automatically handle this for you.\n",
     "\n",
-    "If you want to use multiple GPUs per trial, you should check out {class}`LightningTrainer <ray.train.lightning.LightningTrainer>`.\n",
+    "If you want to use multiple GPUs per trial, you should check out {ref}`Getting Started with Lightning and Ray TorchTrainer <train-pytorch-lightning>`.\n",
     "\n",
     "### Putting it together\n",
     "\n",