
Refactor torch device types out of od and into _types #829

Merged: 13 commits into SeldonIO:master on Jul 26, 2023

Conversation

@mauicv (Collaborator) commented on Jul 11, 2023:

What this is:

Defines `TorchDeviceType: TypeAlias = Optional[Union[Literal['cuda', 'gpu', 'cpu'], 'torch.device']]` in `_types.py` and refactors the typing of the `device` kwarg across the detectors.

Fixes #779 and #679. Also fixes #763.
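For illustration, here is a minimal sketch of the alias and a `get_device`-style resolver that consumes it (the resolver's behaviour below is an assumption for illustration only; the real `get_device` helper is discussed further down this thread):

```python
# Sketch only: mirrors the alias added in this PR; the resolver is a
# hypothetical illustration of how detectors can consume it.
from typing import Literal, Optional, Union

import torch
from typing_extensions import TypeAlias

# 'torch.device' is kept as a forward reference so torch stays an optional dependency.
TorchDeviceType: TypeAlias = Optional[Union[Literal['cuda', 'gpu', 'cpu'], 'torch.device']]


def get_device(device: TorchDeviceType = None) -> torch.device:
    """Resolve a user-supplied device spec to a concrete torch.device,
    defaulting to GPU when available and falling back to CPU (assumed behaviour)."""
    if isinstance(device, torch.device):
        return device
    if device is None or device.lower() in ('cuda', 'gpu'):
        return torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    return torch.device('cpu')
```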

@codecov (bot) commented on Jul 11, 2023:

Codecov Report

Merging #829 (a3519f3) into master (d19cf09) will increase coverage by 0.08%.
The diff coverage is 94.84%.

Additional details and impacted files


```diff
@@            Coverage Diff             @@
##           master     #829      +/-   ##
==========================================
+ Coverage   81.90%   81.98%   +0.08%
==========================================
  Files         159      159
  Lines       10338    10375      +37
==========================================
+ Hits         8467     8506      +39
+ Misses       1871     1869       -2
```
| Files Changed | Coverage Δ |
|---|---|
| alibi_detect/saving/schemas.py | 97.96% <86.84%> (-0.82%) ⬇️ |
| alibi_detect/cd/classifier.py | 100.00% <100.00%> (ø) |
| alibi_detect/cd/context_aware.py | 97.50% <100.00%> (+0.06%) ⬆️ |
| alibi_detect/cd/keops/learned_kernel.py | 94.20% <100.00%> (+0.04%) ⬆️ |
| alibi_detect/cd/keops/mmd.py | 98.24% <100.00%> (+0.03%) ⬆️ |
| alibi_detect/cd/learned_kernel.py | 100.00% <100.00%> (ø) |
| alibi_detect/cd/lsdd.py | 97.14% <100.00%> (+0.08%) ⬆️ |
| alibi_detect/cd/lsdd_online.py | 93.75% <100.00%> (+0.13%) ⬆️ |
| alibi_detect/cd/mmd.py | 97.77% <100.00%> (+0.05%) ⬆️ |
| alibi_detect/cd/mmd_online.py | 94.44% <100.00%> (+0.10%) ⬆️ |
| ... and 31 more | |

@mauicv requested a review from @ascillitoe on Jul 11, 2023, 15:49.
```diff
- Can be specified by passing either 'cuda', 'gpu' or 'cpu'. Only relevant for 'pytorch' backend.
+ Device type used. The default tries to use the GPU and falls back on CPU if needed.
+ Can be specified by passing either ``'cuda'``, ``'gpu'``, ``'cpu'`` or an instance of
+ ``torch.device``. Only relevant for 'pytorch' backend.
```
@ascillitoe (Contributor) commented:

Just out of curiosity: if you update the `intersphinx_mapping` as in pytorch/pytorch#10400 and then reference `torch.device` via ``:py:class:`torch.device```, does it work?
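For reference, a minimal sketch of what that `conf.py` change might look like (the inventory URL is an assumption based on PyTorch's hosted docs, not taken from this repo's config):

```python
# docs/source/conf.py -- hypothetical addition, per pytorch/pytorch#10400
intersphinx_mapping = {
    'python': ('https://docs.python.org/3/', None),
    # assumed: PyTorch publishes an objects.inv at its stable docs root
    'torch': ('https://pytorch.org/docs/stable/', None),
}
```

With such a mapping in place, ``:py:class:`torch.device``` should resolve to the upstream PyTorch API docs.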

@mauicv (Collaborator, author) commented:

Can I leave this to explore in a separate issue? This PR already has a much wider scope than initially intended! 😅

```diff
@@ -37,3 +34,5 @@
 # type aliases, for use with mypy (must be FwdRef's if involving opt. deps.)
 OptimizerTF: TypeAlias = Union['tf.keras.optimizers.Optimizer', 'tf.keras.optimizers.legacy.Optimizer',
                                Type['tf.keras.optimizers.Optimizer'], Type['tf.keras.optimizers.legacy.Optimizer']]
+
+TorchDeviceType: TypeAlias = Optional[Union[Literal['cuda', 'gpu', 'cpu'], 'torch.device']]
```
@ascillitoe (Contributor) commented:

Re the forward reference `'torch.device'` in here: I can't think of a good fix at the moment, but just noting that this introduces lots of additional Sphinx warnings, and is not rendered "perfectly" in the docs (we've gone from 6 to 29 warnings, which makes me sad).

I suspect the forward ref would be resolved during docs compilation if we installed `alibi-detect[all]` on Read the Docs (#499), which is now allowed, but it seems wasteful...

@mauicv (Collaborator, author) commented on Jul 21, 2023:

😮‍💨 argh... I'll open an issue. This PR might need to be reined in! Or split into two!

```
device
    Device type used. The default tries to use the GPU and falls back on CPU if needed.
    Can be specified by passing either ``'cuda'``, ``'gpu'``, ``'cpu'`` or an instance of
    ``torch.device``.
```
@ascillitoe (Contributor) commented on Jul 13, 2023:

Unnecessary indents?

@ascillitoe (Contributor) commented on Jul 13, 2023:

Docstring also seems slightly inaccurate? Maybe just something like "Torch device to be serialised."?


```
Returns
-------
a string with value ``'cuda'`` or ``'cpu'``.
```
@ascillitoe (Contributor) commented:

`str(torch.device('cuda:0'))` will return `'cuda:0'`, which makes the Returns docstring slightly incorrect, but could also break our save/load. I think save/load itself would work, as `'cuda:0'` will be resolved by `get_device` just fine. However, pydantic validation will fail since we have `Literal['cpu', 'gpu', 'cuda']`.

Possible solutions to me are:

1. Implement #679 (comment) ("Inconsistency in device kwarg between detectors and preprocess_drift function") properly, by implementing a custom pydantic validator to validate `'cuda:<int>'` strings (a rough sketch follows this list).
2. Relax the pydantic validation in `schemas.py` to `device: Optional[str] = None` for now.
3. Remove support for passing `torch.device` from this PR completely.
4. Do nothing, except throw a warning/error in `get_device` if a `torch.device` is passed with a device index, so the user knows they cannot serialise the detector when doing this...
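As referenced in option 1, a rough sketch of what such a custom validator could look like (pydantic v1 style, matching the `Literal` validation mentioned above; the model name and field here are illustrative stand-ins, not the repo's actual schema):

```python
# Hypothetical pydantic v1 validator for option 1; not the code in this PR.
import re
from typing import Optional

from pydantic import BaseModel, validator


class DetectorConfig(BaseModel):  # illustrative stand-in for the real schema
    device: Optional[str] = None

    @validator('device')
    def validate_device(cls, value: Optional[str]) -> Optional[str]:
        # Accept 'cpu', 'gpu', 'cuda', or 'cuda:<int>' (e.g. 'cuda:0').
        if value is None or re.fullmatch(r'cpu|gpu|cuda(:\d+)?', value):
            return value
        raise ValueError(f"{value!r} is not a valid device. "
                         "Expected 'cpu', 'gpu', 'cuda' or 'cuda:<int>'.")
```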

@mauicv (Collaborator, author) commented:

Shall we just format `str(torch.device('cuda:0'))` to remove the device index and raise a warning alerting the user to the change? Something like the sketch below.
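A rough sketch of the idea (assuming `save_device_config_pt` is the serialisation helper named later in this thread; its exact signature here is an assumption):

```python
# Sketch of the proposed fix: strip the device index before serialising
# and warn the user. Not the final code in this PR.
import warnings
from typing import Union

import torch


def save_device_config_pt(device: Union[str, torch.device]) -> str:
    device_str = str(device) if isinstance(device, torch.device) else device
    if ':' in device_str:
        device_type = device_str.split(':')[0]
        warnings.warn(
            f"Saving '{device_str}' as '{device_type}': the device index is "
            "dropped so the detector can be loaded on machines with a "
            "different GPU layout."
        )
        return device_type
    return device_str
```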

@ascillitoe (Contributor) commented:

I like this solution. It is simple, and prevents serialised detectors from being unloadable, e.g. if saved with `cuda:8` and loaded on a 4-GPU machine.

If we extend the pydantic validation to support the device index in the future, we could still save as `cuda`, and the user could manually add a device index in the `config.toml` if they desired.

@mauicv (Collaborator, author) commented on Jul 21, 2023:

Ah, so when saving, the detector first gets the config, then validates, and then replaces the values with their string representations... 🤔 I've added the pydantic validation as it seems like the best way of going about this.

I've kept it simple for now though: it just validates the device type from `str(device)`.

@ascillitoe (Contributor) commented:

Damn I forgot about the pre-saving validation...

```diff
@@ -188,6 +189,11 @@ def _save_detector_config(detector: ConfigurableDetector,
     if optimizer is not None:
         cfg['optimizer'] = _save_optimizer_config(optimizer)

+    # Serialize device
+    device = cfg.get('device')
+    if device is not None:
```
@ascillitoe (Contributor) commented on Jul 13, 2023:

Instead of the `_save_device_config` wrapper, isn't it easier just to do `cfg['device'] = save_device_config_pt(device)` here?

Granted, we do have a `_save_optimizer_config` wrapper, but that is a little different since we do have some sort of optimizer for both tensorflow and torch. Device is torch-only atm, so I'm not sure we need the wrapper...

@ascillitoe (Contributor) commented:

P.s. maybe `_save_device` would be more accurate than `_save_device_config`? `_save_optimizer_config` etc. are named `_config` since they do actually return a "config dict", whereas `_save_device_config` only returns a str.


```python
# if device is not None then we're using the pytorch backend
if device is not None:
    return save_device_config_pt(device)
```
@ascillitoe (Contributor) commented:

Isn't `if device is not None` unnecessary? Can you even arrive inside `_save_device_config` if `device` is `None`, since it's already checked here?

```diff
@@ -295,7 +295,7 @@ class PreprocessConfig(CustomBaseModel):
     Optional tokenizer for text drift. Either a string referencing a HuggingFace tokenizer model name, or a
     :class:`~alibi_detect.utils.schemas.TokenizerConfig`.
     """
-    device: Optional[Literal['cpu', 'cuda']] = None
+    device: Optional[Literal['cpu', 'cuda', 'gpu']] = None
```
@mauicv (Collaborator, author) commented:

Note: agreed to format the device string to remove the device index prior to saving. See this comment.

@ascillitoe (Contributor) left a review:

A few minor comments, the main one regarding serialisation.

I'll do a final pass once tests are written. Regarding tests, I reckon we could get away with a single unit test saving with `save_device_config` and then running through `get_device`, parameterised with all the supported device types...
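A rough sketch of what that parameterised round-trip test might look like (import paths and exact behaviour are assumptions based on this discussion; the thread names `save_device_config` and `get_device` but not their modules):

```python
# Hypothetical round-trip test; import paths and assertions are assumptions
# based on the discussion above, not the repo's actual API.
import pytest
import torch

from alibi_detect.saving import save_device_config  # assumed import path
from alibi_detect.utils.pytorch import get_device   # assumed import path


@pytest.mark.parametrize('device', [
    'cpu', 'cuda', 'gpu',
    torch.device('cpu'), torch.device('cuda'), torch.device('cuda', 0),
])
def test_save_device_round_trip(device):
    serialised = save_device_config(device)
    # Per the agreed solution, the device index is stripped before saving,
    # so only bare device-type strings should ever be serialised.
    assert serialised in ('cpu', 'cuda', 'gpu')
    # The serialised string should resolve cleanly back to a torch.device.
    assert isinstance(get_device(serialised), torch.device)
```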

@mauicv requested a review from @ascillitoe on Jul 21, 2023, 16:30.
@ascillitoe (Contributor) left a review:

LGTM bar one minor nitpick

@mauicv merged commit c2f0a5a into SeldonIO:master on Jul 26, 2023.
16 checks passed.

Issues that may be closed by merging this pull request:

- Refactor device types to _types for outlier detectors
- Remove duplicated utils._types logic