
TPU support for gossipy #7

Open
ParsaMokhtariHessar opened this issue Aug 25, 2023 · 2 comments

Comments

@ParsaMokhtariHessar

I recently attempted to bring TPU support to gossipy. However, when I use it, the estimated time to complete my simulation is 4 hours and 30 minutes, significantly slower than the GPU training time of approximately 45 minutes. I have read that TPUs are supposed to be significantly faster, but my attempt does not reflect that! I was wondering if you could make it work.
Here is how I changed things.
First, installing torch_xla (torch is also imported, since GlobalSettings uses torch.device):

!pip install https://storage.googleapis.com/tpu-pytorch/wheels/colab/torch_xla-2.0-cp310-cp310-linux_x86_64.whl

import torch
import torch_xla.core.xla_model as xm
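
Before touching gossipy, a quick sanity check that the runtime actually sees the TPU (a minimal sketch, assuming the torch_xla 2.0 API):

# An empty list here means no XLA/TPU device is visible to this runtime.
print(xm.get_xla_supported_devices())
print(xm.xla_device())  # e.g. an xla:0 device on a Colab TPU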

class GlobalSettings(metaclass=Singleton):
    """Global settings for the library.""" 
    
    _device = 'cpu'

    def auto_device(self) -> torch.device:
        """Set device to TPU if available, otherwise cuda if available, otherwise cpu.
        
        Returns
        -------
        torch.device
            The device.
        """
        # Note: xla_device_exists() is not a torch_xla API;
        # get_xla_supported_devices() returns a (possibly empty) list of
        # visible XLA devices.
        if xm.get_xla_supported_devices():
            self._device = xm.xla_device()
        elif torch.cuda.is_available():
            self._device = torch.device('cuda')
        else:
            self._device = torch.device('cpu')
        return self._device
    
    def set_device(self, device_name: str) -> torch.device:
        """Set the device.
    
        Parameters
        ----------
        device_name: name of the device to set (possible values are 'auto', 'cuda', 'cpu', and 'tpu').
        When device_name is 'auto', the TPU is used if available, otherwise 'cuda' if available, otherwise 'cpu'.
        
        Returns
        -------
        torch.device
            The device.
        """

        if device_name == "auto":
            return GlobalSettings().auto_device()
        elif device_name == "tpu" and xm.xla_device():
            self._device = xm.xla_device()
        else:
            self._device = torch.device(device_name)
        
        return self._device
    
    def get_device(self):
        """Get the device.

        Returns
        -------
        torch.device
            The device.
        """
        return self._device
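
A hypothetical usage sketch (the model and shapes below are my own illustration, not gossipy code):

import torch.nn as nn

# Select the device once via the singleton, then move models/tensors onto it.
settings = GlobalSettings()
device = settings.set_device("auto")   # TPU if visible, else CUDA, else CPU
model = nn.Linear(10, 2).to(device)
out = model(torch.randn(4, 10).to(device))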

[Screenshots: estimated TPU run time (CaptureTPU) and GPU run time (CaptureGPU)]

@makgyver
Owner

Hi @ParsaMokhtariHessar, thank you for your interest in gossipy. What you have done seems correct, so it's hard to say what is wrong. Have you tried working with TPUs on a neural net outside gossipy, to confirm that the problem comes from the framework? In any case (and this happens with GPUs too), these simulations involve a bunch of models, and only one of them is loaded into GPU/TPU memory at a time; the time overhead of moving the models in and out of memory usually outweighs the benefit of using a GPU/TPU. This is my take, but I honestly have to say that I did not dig into it.
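
A minimal sketch of the effect (my own illustration, not gossipy code): timing repeated host/device swaps of many small models against a single resident model.

import time
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
models = [nn.Linear(1024, 1024) for _ in range(100)]
x = torch.randn(64, 1024, device=device)

start = time.perf_counter()
for m in models:
    m.to(device)   # transfer weights to the device
    _ = m(x)       # one forward pass
    m.to("cpu")    # transfer weights back to the host
if device.type == "cuda":
    torch.cuda.synchronize()
print(f"swap in/out each step: {time.perf_counter() - start:.2f}s")

start = time.perf_counter()
resident = models[0].to(device)
for _ in range(100):
    _ = resident(x)
if device.type == "cuda":
    torch.cuda.synchronize()
print(f"single resident model: {time.perf_counter() - start:.2f}s")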

@ParsaMokhtariHessar
Author

I see! And I guess it would take too much memory to move all the models into TPU memory at once. Nonetheless, the loading round trip is significantly faster in the case of a GPU. I am going to time a simple MNIST training run on CPU, GPU, and TPU and get back to you.
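
Something along these lines (a hypothetical harness; random MNIST-shaped batches stand in for the real dataset):

import time
import torch
import torch.nn as nn

def time_training(device: torch.device, steps: int = 200) -> float:
    """Train a small MLP for a fixed number of steps and return elapsed seconds."""
    model = nn.Sequential(nn.Flatten(), nn.Linear(784, 128), nn.ReLU(),
                          nn.Linear(128, 10)).to(device)
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = nn.CrossEntropyLoss()
    x = torch.randn(64, 1, 28, 28, device=device)
    y = torch.randint(0, 10, (64,), device=device)
    start = time.perf_counter()
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
        # On a TPU one would pass xm.xla_device() above and call
        # xm.mark_step() here so the lazy XLA graph actually executes.
    if device.type == "cuda":
        torch.cuda.synchronize()
    return time.perf_counter() - start

devices = ["cpu"] + (["cuda"] if torch.cuda.is_available() else [])
for name in devices:
    print(name, f"{time_training(torch.device(name)):.2f}s")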
