Raise error for inconsistent add_ml_model and add_script parameters #324

juliaputko · 2023-07-21T00:02:45Z

Added an error for Model.add_ml_model and Model.add_script when the devices_per_node parameter is >1 and the device parameter is set to CPU.

Added tests for add_ml_model and add_script to ensure the error is raised correctly.

ankona · 2023-07-25T18:52:00Z

smartsim/entity/dbobject.py

            for device_num in range(self.devices_per_node):
                devices.append(f"{self.device}:{str(device_num)}")
        else:
            devices = [self.device]

        return devices

+    def _check_arguments(self):


I love pulling complex validation out into separate methods, but _check_arguments feels like it could be unclear about which arguments are being checked (especially since it takes none).

I'd probably rename this to _check_devices to make it clear where it's valid to use. I'd also probably pass the inputs in explicitly instead of relying on an object that may be only partially initialized.

I could see this as being a general sanity checker of the initialization. There's a bit of a tradeoff between cluttering up the constructor and breaking it out into a validator for just devices and number of devices.

As currently coded up (where it only relies on the set properties after initialization is completed), it's unambiguous that the settings are correct. Thoughts @juliaputko @ankona?

It looks like _check_xxx methods in the constructor follow the pattern of offloading per-field validation to another method. I wouldn't care if they're combined but _check_parameters is definitely improper naming given the existing:

self.file = self._check_filepath(file_path)
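
A minimal sketch of the rename-and-pass-inputs idea from this thread. The class name DBObjectSketch is hypothetical and ValueError stands in for whichever error type the project settles on; only the two failure conditions (a device numeral, or CPU, combined with devices_per_node > 1) come from the diff under review. Because the check is a static method that receives its inputs, it can run before any attribute is assigned:

```python
class DBObjectSketch:
    """Hypothetical stand-in for the reviewed class, not SmartSim's DBObject."""

    def __init__(self, device: str, devices_per_node: int = 1) -> None:
        # Validate the raw inputs first, so a failure never leaves a
        # half-initialized object behind.
        self._check_devices(device, devices_per_node)
        self.device = device
        self.devices_per_node = devices_per_node

    @staticmethod
    def _check_devices(device: str, devices_per_node: int) -> None:
        """Reject device / devices_per_node combinations that conflict."""
        if ":" in device and devices_per_node > 1:
            raise ValueError(
                "Cannot set devices_per_node>1 if a device numeral is specified, "
                f"the device was set to {device} and "
                f"devices_per_node=={devices_per_node}"
            )
        if device == "CPU" and devices_per_node > 1:
            raise ValueError(
                "Cannot set devices_per_node>1 if CPU is specified under devices"
            )
```

The method name now says which fields it covers, which matches the existing per-field pattern of _check_filepath and _check_device.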

ankona · 2023-07-25T19:03:13Z

smartsim/entity/dbobject.py

+            msg = "Cannot set devices_per_node>1 if a device numeral is specified, "
+            msg += f"the device was set to {self.device} and devices_per_node=={self.devices_per_node}"
+            raise ValueError(msg)
+        if self.device in ["CPU"] and self.devices_per_node > 1:


No need for in? A direct comparison (self.device == "CPU") would do.

ankona · 2023-07-25T19:05:05Z

smartsim/entity/dbobject.py

            for device_num in range(self.devices_per_node):
                devices.append(f"{self.device}:{str(device_num)}")
        else:
            devices = [self.device]

        return devices

+    def _check_arguments(self):
+        devices = []
+        if ":" in self.device and self.devices_per_node > 1:


If we do a check at the start:

if self.devices_per_node <= 1: return

we can avoid having compound conditionals below, making it easier to read what's required to trigger a failure.
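
As a rough illustration of the early-return suggestion, here is a standalone sketch. The function name check_devices and the use of ValueError are assumptions made for the example; the conditions themselves are the ones in the diff above.

```python
def check_devices(device: str, devices_per_node: int) -> None:
    """Hypothetical validator; ValueError stands in for the project's error."""
    # Guard clause: with a single device per node, nothing can conflict.
    if devices_per_node <= 1:
        return

    # From here on devices_per_node > 1 is known, so each failure
    # is triggered by exactly one condition.
    if ":" in device:
        raise ValueError(
            "Cannot set devices_per_node>1 if a device numeral is specified, "
            f"the device was set to {device} and "
            f"devices_per_node=={devices_per_node}"
        )
    if device == "CPU":
        raise ValueError(
            "Cannot set devices_per_node>1 if CPU is specified under devices"
        )
```

Each remaining branch now reads as a single requirement rather than a compound "X and devices_per_node > 1" test.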

ankona · 2023-07-25T19:09:38Z

tests/backends/test_dbscript.py

@@ -578,3 +578,60 @@ def test_db_script_errors(fileutils, wlmutils, mlutils):
    # an in-memory script
    with pytest.raises(SSUnsupportedError):
        colo_ensemble.add_model(colo_model)
+
+def test_inconsistent_params_add_script(fileutils, wlmutils, mlutils):


It looks like we're testing a lot of extra code here. Aren't the new validators in a constructor?

Can we build tests around the smallest unit?

...
# you could vary x, y, z and exercise the validation code
# more directly instead of allowing other code to potentially break
with pytest.raises(SSUnsupportedError):
    dbscript = DBScript(x, y, z)

Good point especially because the add_model, add_script, add_function are essentially just factory methods for their respective DBEntity
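
To make the "smallest unit" point concrete, here is a self-contained sketch that exercises only a constructor. DBScriptSketch is a hypothetical stand-in (the real DBScript takes different arguments), ValueError replaces the project's error type, and the test is written in plain Python so it runs without pytest.

```python
class DBScriptSketch:
    """Hypothetical stand-in; the real DBScript takes more arguments."""

    def __init__(self, name: str, device: str, devices_per_node: int) -> None:
        if devices_per_node > 1 and (":" in device or device == "CPU"):
            raise ValueError(
                f"device={device!r} conflicts with "
                f"devices_per_node={devices_per_node}"
            )
        self.name = name
        self.device = device
        self.devices_per_node = devices_per_node


# (device, devices_per_node) pairs the constructor is expected to reject.
BAD_PARAMS = [("CPU", 2), ("GPU:0", 2), ("CPU:0", 4)]


def rejects(device: str, devices_per_node: int) -> bool:
    """Return True if constructing the object raises ValueError."""
    try:
        DBScriptSketch("script", device, devices_per_node)
    except ValueError:
        return True
    return False


def test_inconsistent_params() -> None:
    for device, devices_per_node in BAD_PARAMS:
        assert rejects(device, devices_per_node), (device, devices_per_node)
    # A consistent combination must still construct cleanly.
    assert not rejects("GPU", 2)
```

Under pytest, the loop over BAD_PARAMS would naturally become a pytest.mark.parametrize decorator and the rejects helper a pytest.raises block. No experiment, model, or ensemble needs to be built, so a failure points straight at the validation code.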

ankona · 2023-07-25T19:12:45Z

smartsim/entity/dbobject.py

+        if self.device in ["CPU"] and self.devices_per_node > 1:
+            raise SSUnsupportedError(
+                "Cannot set devices_per_node>1 if CPU is specified under devices"
+            )


Do we need to check device?

if not self._check_device(self.device): raise ValueError("invalid device...")

I think we have a hole since device is an attribute. Changes after the constructor won't be validated.

Consider adding properties that use _check_device on sets!
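
A minimal sketch of the property idea, under stated assumptions: DeviceHolder is a hypothetical class, _check_device is assumed to return the normalized device string (as the constructor assignment in the diff suggests), and numbered forms such as "GPU:0" are assumed to be valid.

```python
class DeviceHolder:
    """Hypothetical stand-in showing validation on every assignment."""

    _VALID_DEVICES = ("CPU", "GPU")

    def __init__(self, device: str) -> None:
        # Assigning through the property runs the same check as later sets.
        self.device = device

    @property
    def device(self) -> str:
        return self._device

    @device.setter
    def device(self, value: str) -> None:
        self._device = self._check_device(value)

    @classmethod
    def _check_device(cls, device: str) -> str:
        """Return the normalized device name or raise ValueError."""
        normalized = device.upper()
        # Accept "GPU" as well as numbered forms such as "GPU:0" (assumed).
        if normalized.split(":")[0] not in cls._VALID_DEVICES:
            raise ValueError(f"invalid device: {device!r}")
        return normalized
```

With this shape, an assignment like holder.device = "TPU" fails immediately instead of leaving an invalid value to surface later.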

ankona · 2023-07-25T19:19:31Z

smartsim/entity/dbobject.py

@@ -53,6 +54,7 @@ def __init__(
            self.file = self._check_filepath(file_path)
        self.device = self._check_device(device)
        self.devices_per_node = devices_per_node
+        self._check_arguments()


Consider using device: Literal["CPU", "GPU"] type hint to tighten the arguments for device (instead of str) in _check_device/__init__
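
A sketch of what the Literal hint could look like. The alias name Device and the function check_device are hypothetical. Note that Literal is enforced only by static type checkers, so a runtime check is still needed, and numbered devices such as "GPU:0" would not fit this particular Literal.

```python
from typing import Literal, get_args

# Hypothetical alias; the suggestion names only "CPU" and "GPU".
Device = Literal["CPU", "GPU"]


def check_device(device: str) -> Device:
    """Normalize a device string and confirm it is one of the Literal values."""
    normalized = device.upper()
    # Returning the literal constants keeps static type checkers satisfied.
    if normalized == "CPU":
        return "CPU"
    if normalized == "GPU":
        return "GPU"
    raise ValueError(
        f"device must be one of {get_args(Device)}, got {device!r}"
    )
```

Callers annotated with Device then get editor and type-checker feedback on misspelled values before the code ever runs.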

ashao

Thanks for fixing this up!

There are a few typos and some excellent suggestions from @ankona to address.

ashao · 2023-07-25T23:37:08Z

smartsim/entity/dbobject.py

            for device_num in range(self.devices_per_node):
                devices.append(f"{self.device}:{str(device_num)}")
        else:
            devices = [self.device]

        return devices

+    def _check_arguments(self):


I could see this as being a general sanity checker of the initialization. There's a bit of a tradeoff between cluttering up the constructor and breaking it out into a validator for just devices and number of devices.

As currently coded up (where it only relies on the set properties after initialization is completed), it's unambiguous that the settings are correct. Thoughts @juliaputko @ankona?

ashao · 2023-07-25T23:41:58Z

smartsim/entity/dbobject.py

-            msg += f"the device was set to {self.device} and devices_per_node=={self.devices_per_node}"
-            raise ValueError(msg)
-        if self.device in ["CPU", "GPU"] and self.devices_per_node > 1:
+        if self.device in ["GPU"] and self.devices_per_node > 1:


Just check against GPU (no in needed). I suppose too we could always make CPU and GPU something like an enum and attach them to DBObject so that we're comparing an object of some variety instead of a magic string
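
One possible shape for the enum idea, as a sketch only: the names Device, parse_device, and is_gpu are hypothetical and nothing here is taken from SmartSim itself.

```python
from enum import Enum


class Device(Enum):
    """Hypothetical enum replacing the "CPU" / "GPU" magic strings."""

    CPU = "CPU"
    GPU = "GPU"


def parse_device(name: str) -> Device:
    """Convert user input to a Device; unknown names raise ValueError."""
    # Enum lookup by value doubles as validation of the string.
    return Device(name.upper())


def is_gpu(device: Device) -> bool:
    # Identity comparison against an enum member: no list and no `in`.
    return device is Device.GPU
```

A misspelled member such as Device.GPUU fails with AttributeError at the point of use, whereas a misspelled string literal would silently compare unequal.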

ashao · 2023-07-25T23:47:25Z

tests/backends/test_dbmodel.py

@@ -793,3 +793,37 @@ def test_colocated_db_model_errors(fileutils, wlmutils, mlutils):

    with pytest.raises(SSUnsupportedError):
        colo_ensemble.add_model(colo_model)
+
+def test_inconsistent_params_add_ml_model(fileutils, wlmutils, mlutils):
+    """Test error when devices_per_node parameter>1 when devices is set to CPU in addd_ml_model function"""


Typo addd_ml_model

ashao · 2023-07-25T23:49:21Z

tests/backends/test_dbscript.py

@@ -578,3 +578,60 @@ def test_db_script_errors(fileutils, wlmutils, mlutils):
    # an in-memory script
    with pytest.raises(SSUnsupportedError):
        colo_ensemble.add_model(colo_model)
+
+def test_inconsistent_params_add_script(fileutils, wlmutils, mlutils):


Good point especially because the add_model, add_script, add_function are essentially just factory methods for their respective DBEntity

codecov · 2023-07-27T23:47:32Z

Codecov Report

Merging #324 (52abef3) into develop (4c741be) will increase coverage by 0.02%.
The diff coverage is 81.81%.

Additional details and impacted files

@@             Coverage Diff             @@
##           develop     #324      +/-   ##
===========================================
+ Coverage    87.30%   87.33%   +0.02%     
===========================================
  Files           59       59              
  Lines         3522     3529       +7     
===========================================
+ Hits          3075     3082       +7     
  Misses         447      447

Files Changed	Coverage Δ
smartsim/entity/ensemble.py	`99.20% <ø> (ø)`
smartsim/entity/model.py	`95.07% <ø> (ø)`
smartsim/entity/dbobject.py	`68.51% <80.00%> (+2.86%)`	⬆️
smartsim/_core/utils/redis.py	`87.23% <100.00%> (-0.53%)`	⬇️

Julia Putko added 9 commits July 20, 2023 11:55

added warning to enumerate_devices and tests for add_ml_model and add…

ad90ade

…_script bad params

fixed loop through GPU+CPU

f9feac9

typo

9862775

added _check_arguments

d50c64c

properly calling add_script

1f1e607

added test for add_function

6fe959c

added test for add_function

930243f

cleaned tests

30b1dc9

cleaned and tested

9933498

juliaputko requested review from ashao and ankona July 21, 2023 18:51

clean

cb53903

ankona reviewed Jul 25, 2023

ankona reviewed Jul 25, 2023

ankona reviewed Jul 25, 2023

ashao requested changes Jul 25, 2023

Julia Putko added 7 commits July 26, 2023 11:30

cleaned, renamed, tightened arguments for device

874f127

added tests around smaller units (dbscript and dbmodel)

e5a3807

sync and fixed merge conflict

a6e6ac7

type checking and cleaning

6f42d8a

shortened line lengths

1ec0c2e

line length + removed else

0720599

cleaning lint

52abef3

ashao approved these changes Jul 28, 2023

juliaputko merged commit 53bff05 into CrayLabs:develop Jul 29, 2023
10 checks passed

MattToast added area: api Issues related to API changes type: usability Issues related to ease of use labels Sep 13, 2023
