[BUG] Keras's incompatibility with numpy>=2 breaks cellfinder's model training #446

Closed
alessandrofelder opened this issue Aug 7, 2024 · 4 comments · Fixed by #457
Labels
bug Something isn't working

Comments

@alessandrofelder
Member

alessandrofelder commented Aug 7, 2024

Describe the bug

When I try to train a model with cellfinder napari's Training widget, I get a keras-related error:

AttributeError: `np.Inf` was removed in the NumPy 2.0 release. Use `np.inf` instead.

This is likely due to a reported incompatibility between Keras and NumPy 2.

Full stack trace
File ~/mambaforge/envs/cellfinder-py311/lib/python3.11/site-packages/superqt/utils/_qthreading.py:613, in create_worker.<locals>.reraise(e=AttributeError('`np.Inf` was removed in the NumPy 2.0 release. Use `np.inf` instead.'))
    612 def reraise(e):
--> 613     raise e
        e = AttributeError('`np.Inf` was removed in the NumPy 2.0 release. Use `np.inf` instead.')

File ~/mambaforge/envs/cellfinder-py311/lib/python3.11/site-packages/superqt/utils/_qthreading.py:175, in WorkerBase.run(self=<napari._qt.qthreading.FunctionWorker object>)
    173     warnings.filterwarnings("always")
    174     warnings.showwarning = lambda *w: self.warned.emit(w)
--> 175     result = self.work()
        self = <napari._qt.qthreading.FunctionWorker object at 0x74b3e0e87f40>
    176 if isinstance(result, Exception):
    177     if isinstance(result, RuntimeError):
    178         # The Worker object has likely been deleted.
    179         # A deleted wrapped C/C++ object may result in a runtime
    180         # error that will cause segfault if we try to do much other
    181         # than simply notify the user.

File ~/mambaforge/envs/cellfinder-py311/lib/python3.11/site-packages/superqt/utils/_qthreading.py:354, in FunctionWorker.work(self=<napari._qt.qthreading.FunctionWorker object>)
    353 def work(self) -> _R:
--> 354     return self._func(*self._args, **self._kwargs)
        self._func = <function run_training at 0x74b4a889f740>
        self = <napari._qt.qthreading.FunctionWorker object at 0x74b3e0e87f40>
        self._args = (TrainingDataInputs(yaml_files=(PosixPath('/home/alessandro/dev/training.yml'),), output_directory=PosixPath('/home/alessandro')), OptionalNetworkInputs(trained_model=None, model_weights=None, model_depth='50', pretrained_model='resnet50_tv'), OptionalTrainingInputs(continue_training=False, augment=True, tensorboard=False, save_weights=False, save_checkpoints=True, save_progress=True, epochs=100, learning_rate=0.0001, batch_size=16, test_fraction=0.1), MiscTrainingInputs(number_of_free_cpus=2))
        self._kwargs = {}

File ~/dev/cellfinder/cellfinder/napari/train/train.py:29, in run_training(training_data_inputs=TrainingDataInputs(yaml_files=(PosixPath('/home/..., output_directory=PosixPath('/home/alessandro')), optional_network_inputs=OptionalNetworkInputs(trained_model=None, model_...model_depth='50', pretrained_model='resnet50_tv'), optional_training_inputs=OptionalTrainingInputs(continue_training=False, ...ng_rate=0.0001, batch_size=16, test_fraction=0.1), misc_training_inputs=MiscTrainingInputs(number_of_free_cpus=2))
     21 @thread_worker
     22 def run_training(
     23     training_data_inputs: TrainingDataInputs,
   (...)
     26     misc_training_inputs: MiscTrainingInputs,
     27 ):
     28     print("Running training")
---> 29     train_yml(
        train_yml = <function run at 0x74b517751800>
        training_data_inputs = TrainingDataInputs(yaml_files=(PosixPath('/home/alessandro/dev/training.yml'),), output_directory=PosixPath('/home/alessandro'))
        optional_network_inputs = OptionalNetworkInputs(trained_model=None, model_weights=None, model_depth='50', pretrained_model='resnet50_tv')
        optional_training_inputs = OptionalTrainingInputs(continue_training=False, augment=True, tensorboard=False, save_weights=False, save_checkpoints=True, save_progress=True, epochs=100, learning_rate=0.0001, batch_size=16, test_fraction=0.1)
        misc_training_inputs = MiscTrainingInputs(number_of_free_cpus=2)
     30         **training_data_inputs.as_core_arguments(),
     31         **optional_network_inputs.as_core_arguments(),
     32         **optional_training_inputs.as_core_arguments(),
     33         **misc_training_inputs.as_core_arguments(),
     34     )
     35     print("Finished!")

File ~/dev/cellfinder/cellfinder/core/train/train_yml.py:431, in run(output_dir=PosixPath('/home/alessandro'), yaml_file=(PosixPath('/home/alessandro/dev/training.yml'),), n_free_cpus=2, trained_model=None, model_weights=PosixPath('/home/alessandro/.brainglobe/cellfinder/models/resnet50_tv.h5'), install_path=PosixPath('/home/alessandro/.brainglobe/cellfinder/models'), model=<Functional name=functional, built=True>, network_depth='50', learning_rate=0.0001, continue_training=False, test_fraction=0.1, batch_size=16, no_augment=False, tensorboard=False, save_weights=False, no_save_checkpoints=False, save_progress=True, epochs=100)
    426     else:
    427         filepath = str(
    428             output_dir / ("model" + base_checkpoint_file_name + ".keras")
    429         )
--> 431     checkpoints = ModelCheckpoint(
        filepath = '/home/alessandro/model-epoch.{epoch:02d}-loss-{val_loss:.3f}.keras'
        save_weights = False
    432         filepath,
    433         save_weights_only=save_weights,
    434     )
    435     callbacks.append(checkpoints)
    437 if save_progress:

File ~/mambaforge/envs/cellfinder-py311/lib/python3.11/site-packages/keras/src/callbacks/model_checkpoint.py:173, in ModelCheckpoint.__init__(self=<keras.src.callbacks.model_checkpoint.ModelCheckpoint object>, filepath='/home/alessandro/model-epoch.{epoch:02d}-loss-{val_loss:.3f}.keras', monitor='val_loss', verbose=0, save_best_only=False, save_weights_only=False, mode='auto', save_freq='epoch', initial_value_threshold=None)
    171         self.monitor_op = np.less
    172         if self.best is None:
--> 173             self.best = np.Inf
        self.best = None
        self = <keras.src.callbacks.model_checkpoint.ModelCheckpoint object at 0x74b3c00a95d0>
        np = <module 'numpy' from '/home/alessandro/mambaforge/envs/cellfinder-py311/lib/python3.11/site-packages/numpy/__init__.py'>
    175 if self.save_freq != "epoch" and not isinstance(self.save_freq, int):
    176     raise ValueError(
    177         f"Unrecognized save_freq: {self.save_freq}. "
    178         "Expected save_freq are 'epoch' or integer values"
    179     )

File ~/mambaforge/envs/cellfinder-py311/lib/python3.11/site-packages/numpy/__init__.py:397, in __getattr__(attr='Inf')
    394     raise AttributeError(__former_attrs__[attr])
    396 if attr in __expired_attributes__:
--> 397     raise AttributeError(
        attr = 'Inf'
        __expired_attributes__ = {'geterrobj': 'Use the np.errstate context manager instead.', 'seterrobj': 'Use the np.errstate context manager instead.', 'cast': 'Use `np.asarray(arr, dtype=dtype)` instead.', 'source': 'Use `inspect.getsource` instead.', 'lookfor': "Search NumPy's documentation directly.", 'who': 'Use an IDE variable explorer or `locals()` instead.', 'fastCopyAndTranspose': 'Use `arr.T.copy()` instead.', 'set_numeric_ops': 'For the general case, use `PyUFunc_ReplaceLoopBySignature`. For ndarray subclasses, define the ``__array_ufunc__`` method and override the relevant ufunc.', 'NINF': 'Use `-np.inf` instead.', 'PINF': 'Use `np.inf` instead.', 'NZERO': 'Use `-0.0` instead.', 'PZERO': 'Use `0.0` instead.', 'add_newdoc': "It's still available as `np.lib.add_newdoc`.", 'add_docstring': "It's still available as `np.lib.add_docstring`.", 'add_newdoc_ufunc': "It's an internal function and doesn't have a replacement.", 'compat': "There's no replacement, as Python 2 is no longer supported.", 'safe_eval': 'Use `ast.literal_eval` instead.', 'float_': 'Use `np.float64` instead.', 'complex_': 'Use `np.complex128` instead.', 'longfloat': 'Use `np.longdouble` instead.', 'singlecomplex': 'Use `np.complex64` instead.', 'cfloat': 'Use `np.complex128` instead.', 'longcomplex': 'Use `np.clongdouble` instead.', 'clongfloat': 'Use `np.clongdouble` instead.', 'string_': 'Use `np.bytes_` instead.', 'unicode_': 'Use `np.str_` instead.', 'Inf': 'Use `np.inf` instead.', 'Infinity': 'Use `np.inf` instead.', 'NaN': 'Use `np.nan` instead.', 'infty': 'Use `np.inf` instead.', 'issctype': 'Use `issubclass(rep, np.generic)` instead.', 'maximum_sctype': 'Use a specific dtype instead. 
You should avoid relying on any implicit mechanism and select the largest dtype of a kind explicitly in the code.', 'obj2sctype': 'Use `np.dtype(obj).type` instead.', 'sctype2char': 'Use `np.dtype(obj).char` instead.', 'sctypes': 'Access dtypes explicitly instead.', 'issubsctype': 'Use `np.issubdtype` instead.', 'set_string_function': 'Use `np.set_printoptions` instead with a formatter for custom printing of NumPy objects.', 'asfarray': 'Use `np.asarray` with a proper dtype instead.', 'issubclass_': 'Use `issubclass` builtin instead.', 'tracemalloc_domain': "It's now available from `np.lib`.", 'mat': 'Use `np.asmatrix` instead.', 'recfromcsv': 'Use `np.genfromtxt` with comma delimiter instead.', 'recfromtxt': 'Use `np.genfromtxt` instead.', 'deprecate': 'Emit `DeprecationWarning` with `warnings.warn` directly, or use `typing.deprecated`.', 'deprecate_with_doc': 'Emit `DeprecationWarning` with `warnings.warn` directly, or use `typing.deprecated`.', 'disp': 'Use your own printing function instead.', 'find_common_type': 'Use `numpy.promote_types` or `numpy.result_type` instead. To achieve semantics for the `scalar_types` argument, use `numpy.result_type` and pass the Python values `0`, `0.0`, or `0j`.', 'round_': 'Use `np.round` instead.', 'get_array_wrap': '', 'DataSource': "It's still available as `np.lib.npyio.DataSource`.", 'nbytes': 'Use `np.dtype(<dtype>).itemsize` instead.', 'byte_bounds': "Now it's available under `np.lib.array_utils.byte_bounds`", 'compare_chararrays': "It's still available as `np.char.compare_chararrays`.", 'format_parser': "It's still available as `np.rec.format_parser`."}
        __expired_attributes__[attr] = 'Use `np.inf` instead.'
    398         f"`np.{attr}` was removed in the NumPy 2.0 release. "
    399         f"{__expired_attributes__[attr]}"
    400     )
    402 if attr == "chararray":
    403     warnings.warn(
    404         "`np.chararray` is deprecated and will be removed from "
    405         "the main namespace in the future. Use an array with a string "
    406         "or bytes dtype instead.", DeprecationWarning, stacklevel=2)

AttributeError: `np.Inf` was removed in the NumPy 2.0 release. Use `np.inf` instead.
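
For readers unfamiliar with the last frame of the trace: NumPy 2 raises this error from a module-level `__getattr__`, which Python calls only when normal attribute lookup on the module fails. The sketch below imitates that mechanism with a stand-in module so it runs without NumPy installed. `fake_np`, `_expired`, `_module_getattr`, and `lookup` are hypothetical names for illustration, and the table is a two-entry subset of the one shown in the trace; this is not NumPy's actual code.

```python
import math
import types

# Hypothetical stand-in for the numpy module; only the lookup mechanism is modelled.
fake_np = types.ModuleType("fake_np")
fake_np.inf = math.inf  # the lowercase spelling still exists

# Two-entry subset of the expired-attribute table shown in the trace.
_expired = {
    "Inf": "Use `np.inf` instead.",
    "NaN": "Use `np.nan` instead.",
}


def _module_getattr(attr):
    # Python calls this only when normal lookup on the module fails (PEP 562).
    if attr in _expired:
        raise AttributeError(
            f"`np.{attr}` was removed in the NumPy 2.0 release. "
            f"{_expired[attr]}"
        )
    raise AttributeError(f"module 'fake_np' has no attribute {attr!r}")


fake_np.__getattr__ = _module_getattr


def lookup(name):
    """Return the attribute value, or the error message if lookup fails."""
    try:
        return getattr(fake_np, name)
    except AttributeError as exc:
        return str(exc)
```

Under these assumptions, `lookup("inf")` returns a float, while `lookup("Inf")` returns the same message that Keras's `ModelCheckpoint.__init__` triggers via `self.best = np.Inf`.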

To Reproduce

  • Clean conda env
  • Install cellfinder
  • Open napari and the cellfinder training widget
  • Pass it a YAML file with some training data
  • Hit the run button.

Expected behaviour
I can train cellfinder through napari

Computer used (please complete the following information):

  • Ubuntu 22.04
  • Dell Desktop

Additional context

I can make this go away by running `pip install "numpy<2"`.

@alessandrofelder added the bug label on Aug 7, 2024
@IgorTatarnikov
Member

Should we pin to NumPy < 2.0 for now?

@alessandrofelder
Member Author

yes, a PR is in progress - will ask for your review shortly 😁

@IgorTatarnikov
Member

This is now fixed in keras-team/keras#20049 and released as part of 3.5.0. I tested it locally and training proceeds without errors with numpy==2.0.1 and keras==3.5.0. We can now unpin numpy, but perhaps pin keras>=3.5.0?
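
As a rough illustration of the version constraint discussed here, the sketch below flags the combination reported in this thread: NumPy 2 or later together with a Keras release older than 3.5.0. `parse_release` and `is_affected` are hypothetical helpers, not part of cellfinder, Keras, or NumPy, and the rule encodes only what this thread reports. Pre-release suffixes are ignored, so `3.5.0rc1` is treated as `3.5.0`.

```python
def parse_release(version: str) -> tuple:
    """Parse the leading numeric components of a version string.

    The result is padded to three components, so '3.5' becomes (3, 5, 0).
    Suffixes such as 'rc1' are ignored (a simplifying assumption).
    """
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad with zeros so short versions compare sensibly.
    return tuple((parts + [0, 0, 0])[:3])


def is_affected(numpy_version: str, keras_version: str) -> bool:
    """True for the combination reported above: numpy>=2 with keras<3.5.0."""
    numpy_is_v2 = parse_release(numpy_version)[0] >= 2
    keras_lacks_fix = parse_release(keras_version) < (3, 5, 0)
    return numpy_is_v2 and keras_lacks_fix
```

In a real environment the two version strings could come from `importlib.metadata.version("numpy")` and `importlib.metadata.version("keras")`. A pin such as `keras>=3.5.0` in the package metadata expresses the same constraint declaratively.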

@IgorTatarnikov mentioned this issue on Aug 19, 2024
@IgorTatarnikov
Member

We need to wait for torch 2.4.1 before we can unpin numpy on Windows.
