Add interrupt handlers #51

Closed
jfsantos opened this issue Apr 16, 2015 · 17 comments

@jfsantos
Contributor

Since many experiments can take a while and may run in an environment where the user does not have much control (e.g., a shared cluster), it would be useful to have interrupt handlers that do something when the operating system sends a signal to kill the process during execution of the fit method. Blocks does this using the signal module (which is part of the standard library). That way, you can save the current model state (using pickle, for example) before letting the OS kill your process. Would you be interested in adding something similar to the fit method in model.py?
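
For reference, the idea boils down to a few lines with the signal module. A minimal sketch (the function name and checkpoint path are made up for illustration, and pickling the whole model is just the simplest placeholder):

import pickle
import signal
import sys

def install_interrupt_handler(model, path='interrupted_model.pkl'):
    # Save the current model state before letting the OS kill the process.
    def handler(signum, frame):
        with open(path, 'wb') as f:
            pickle.dump(model, f)
        sys.exit(signum)
    signal.signal(signal.SIGINT, handler)   # ctrl-c
    signal.signal(signal.SIGTERM, handler)  # what most schedulers send before SIGKILL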

@fchollet
Member

I think this could be interesting to have! One issue is that model saving with pickle takes time and can be impractical for large models. So maybe a more urgent feature would be a saving function that stores weight matrices to HDF5, alongside a serialized version of the model structure, for ultrafast model saving & loading.
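
A rough sketch of that direction, assuming we depend on h5py (the function and group/dataset names here are illustrative, not an existing API):

import h5py

def save_weights_hdf5(model, filepath):
    # One HDF5 group per layer, one dataset per weight array.
    with h5py.File(filepath, 'w') as f:
        for i, layer in enumerate(model.layers):
            g = f.create_group('layer_{}'.format(i))
            for j, w in enumerate(layer.get_weights()):
                g.create_dataset('param_{}'.format(j), data=w)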

Other potential improvements to the Sequential model would be a better logger (currently logging is handled with if statements in the code, which is not the cleanest thing) and visualization features (plotting your loss/accuracy/validation metrics over time, your decaying learning rate, your gradient magnitudes, visualizing learned features, and more...).

@jfsantos
Contributor Author

Yes, pickling models really takes forever! Since we added HDF5 support by adding h5py as a requirement, I think using an HDF5 file would be the way to go. I am not sure how to serialize the model structure without the actual parameters, though; do you have any suggestions?

Regarding the logger, the best approach would be a list of extensions that execute some task as soon as a condition is reached (e.g., a number of epochs/iterations). You could then add extensions for plotting, logging, or even stopping the training algorithm in some cases (as in early stopping, for example).

@fchollet
Member

To load a model, we would need:

  • the parameters passed to .compile()
  • the list of layers along with their current parameters
  • the weight matrices of each layer.

Recovering the weights is easy:

weights = []
for l in model.layers:
    weights.append(l.get_weights())

The main issue is serializing the list of layers and their parameters (sans weights). Maybe we could delete the weights then pickle the layers. Or maybe there is a better way. Any ideas?
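
To make the "delete then pickle" option concrete, one sketch (the params attribute is hypothetical; it stands in for wherever a layer actually stores its weight variables, and deep-copying layers may not be cheap in practice):

import copy
import pickle

def serialize_structure(model):
    # Deep-copy so the live model keeps its weights.
    layers = copy.deepcopy(model.layers)
    for layer in layers:
        if hasattr(layer, 'params'):
            layer.params = []  # hypothetical: drop the heavy weight storage
    return pickle.dumps(layers)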

@jfsantos
Contributor Author

That does make sense. I'll open another issue to discuss the model serialization, so we don't mix subjects :)

@jfsantos
Contributor Author

Now that we have model serialization, we can discuss this. I think it would be useful to have not only interrupt handlers, but also functions that run at given times (e.g., after each epoch or iteration). That would enable us to make model snapshots, report test/validation errors, etc. For the snapshots, it would also be interesting to be able to retrieve solver parameters/state after each iteration/epoch, as then we could interrupt and restart training at any time.

Any suggestions on how we can implement this?

@fchollet
Member

Having a configurable/scriptable callback that runs after each epoch and at interrupt signal would be neat. You could use it to:

  • backup the model
  • recover info such as training/validation loss/accuracy, current learning rate, current mean gradient amplitude. This info can then be displayed in a webapp (on localhost or remote), that you could use as an experiment monitoring station. Total visualization is critical to doing good research.

A basic initial version would be to dump everything to a folder (one per experiment) at each epoch. The exact config (what to dump, where) would be passed to the .fit() method as a dictionary.
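
In spirit, something like this per-epoch dump (every name below is invented for illustration, not an actual API):

import json
import os

def dump_epoch_state(config, epoch, model, history):
    # config example: {'folder': 'experiments/run_01', 'save_weights': True}
    folder = config['folder']
    if not os.path.exists(folder):
        os.makedirs(folder)
    if config.get('save_weights'):
        model.save_weights(os.path.join(folder, 'epoch_%d.h5' % epoch))
    with open(os.path.join(folder, 'history.json'), 'w') as f:
        json.dump(history, f)  # losses, accuracies, learning rate, ...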

@jfsantos
Contributor Author

Yes, I was thinking of both plotting and saving model/optimizer snapshots. To make it more flexible, we could even pass a list of callback functions/objects to fit instead of fixing the behaviour inside the model object. I'm just not sure what values should be passed to the callback function, since depending on its functionality it will need access to the model instance, the optimizer state, and things computed during the current epoch (accuracy, validation/test error, etc.).
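
To make that concrete, here is roughly the shape I have in mind (a sketch only; the argument names are placeholders):

class Callback(object):
    # fit() would invoke this at the end of every epoch.
    def on_epoch_end(self, epoch, model, optimizer, metrics):
        pass

class SnapshotCallback(Callback):
    def __init__(self, folder):
        self.folder = folder

    def on_epoch_end(self, epoch, model, optimizer, metrics):
        model.save_weights('%s/epoch_%d.h5' % (self.folder, epoch))

# fit would then accept something like: callbacks=[SnapshotCallback('run_01')]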

I am not sure how we can keep the optimizer state (when it is meaningful, e.g., when using momentum). Any ideas?

@fchollet
Member

In the case of SGD and Adam, the optimizer class attribute iterations gives you indirect access to the current learning rate. That's not really optimal though.

I imagine a solution would be to identify for each optimizer which quantities are meaningful to monitor, make these available as class attributes (updated at each iteration) and expose a method get_state() that lists the attributes and their values.
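
As a sketch of that get_state() idea (illustrative classes, not the real optimizers):

class SGD(object):
    monitored = ('lr', 'momentum', 'iterations')

    def __init__(self, lr=0.01, momentum=0.9):
        self.lr = lr
        self.momentum = momentum
        self.iterations = 0  # incremented at each training iteration

    def get_state(self):
        # Expose the monitored attributes and their current values.
        return dict((name, getattr(self, name)) for name in self.monitored)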

How would you envision the callback feature?

@jfsantos
Contributor Author

Yes, I was doing some tests with SGD with momentum, updating iterations after loading weights, but the problem is that you don't have the current gradients available to compute velocities for the first iteration. This is an issue in many other optimizers as well. We would have to be able to restart an optimizer by passing starting values for the updates, but I'm not sure what the best way to implement this in Theano is.

For the callbacks, I was thinking of having a list of objects from a Monitor subclass. Each object would have a run method that is called at a given frequency (a number of iterations or epochs). Inside this method, we could store current test/validation metrics, plot things, take snapshots, or anything else. To test whether this is a good idea, we could try to implement Progbar like this.
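
Something along these lines (a sketch; Monitor is not an existing class):

class Monitor(object):
    def __init__(self, frequency=1):
        self.frequency = frequency  # run every `frequency` epochs

    def run(self, epoch, model, metrics):
        raise NotImplementedError

class ValidationLogger(Monitor):
    def run(self, epoch, model, metrics):
        if epoch % self.frequency == 0:
            print('epoch %d: val_loss=%.4f' % (epoch, metrics['val_loss']))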

However, if you think this is too complicated, let's just add the snapshotting (I already implemented this in my personal branch and will make a PR soon) and something to store the per-iteration/per-epoch metrics so we can generate plots (either offline or on-the-fly, with Bokeh for example).

@fchollet
Member

One clean solution to "save" optimizers would be to heavily modify them to store all moving parts as class attributes, and to allow state snapshotting (as a dictionary of these attributes) and instantiation from a previous state. We would lose some agility by doing so, but anything else would be pretty hacky and possibly unsafe.

For performance reasons it would definitely be preferable to incorporate the monitoring / data dumping into the Progbar (abstracted as a Monitor class), rather than having to pass updates to both the progbar for logging and the monitor for data dumping... in fact we can get there with only a slight modification of the existing progbar.

As we add more features, the challenge will be the interface: we want to design a way to configure the monitor that is sufficiently powerful, but that isn't heavy or complicated. Large configuration dictionaries have a tendency to quickly get out of hand.

@fchollet
Member

fchollet commented May 5, 2015

> One clean solution to "save" optimizers would be to heavily modify them to store all moving parts as class attributes, and to allow state snapshotting (as a dictionary of these attributes) and instantiation from a previous state. We would lose some agility by doing so, but anything else would be pretty hacky and possibly unsafe.

Just wondering: at this point, would we have any better solution for monitoring the state of an optimizer? And wouldn't this kill performance, given Theano's memory management model?

@jfsantos
Contributor Author

jfsantos commented May 5, 2015

I agree that keeping the moving parts of the optimizers as class attributes would probably kill performance. In Blocks, they simply pickle the optimizer (actually, the whole main loop) but state that this is neither reliable nor portable (e.g., files pickled in Python 2 do not load in Python 3, and if libraries change, the pickled objects stop working). I think pickling is only a solution as an emergency measure (say you're training on a shared machine and someone shuts it down, or you have limited walltime and underestimated the training time).

I don't have any better idea, so we could do exactly as they do in Blocks: pickle the optimizer as this "emergency escape pod" solution, and rely on saving only the model as something that can be trusted. For most practical cases of stopping and restarting training (for example, using a dataset to pre-train the model and then another dataset to fine-tune it), it's probably OK to restart the optimizer.

@asafh

asafh commented Dec 21, 2016

I made a small Keras callback that listens for a signal (SIGINT by default, using the signal module) and stops training once the current epoch is complete. Would it make sense to open a pull request adding it to callbacks.py?
I think it's a decent solution for easily interrupting a long training session while still letting your subsequent code execute (e.g. save the model, then exit).

Relatedly, would it be safe to call model.save mid-training?

@yogeshg

yogeshg commented Mar 8, 2017

I would love to use and contribute to this feature, could you point me to your callback, @asafh ?

asafh pushed a commit to asafh/keras that referenced this issue Mar 9, 2017
@asafh

asafh commented Mar 9, 2017

I made a mistake with the commit on my branch, and a couple more trying to fix it. The PR itself doesn't include the stray commits and is contained in a single commit.

@yogeshg you can either wait for the PR to be accepted (assuming it will be) or just take the relevant changes from the callbacks.py file in my commit: #5679
If you want, you can use this class outside keras (in your own project); just fully qualify the Callback being extended (keras.callbacks.Callback).

@halflings

Any chance this can be reopened? It's fair that programs not ending on SIGINT are annoying, but somebody who adds this callback explicitly would be well aware of that, and would just need to send SIGINT twice to stop the running command immediately.

When iterating on a model running in the cloud, I need to run some clean-up code once training ends, and I often realize too late that I used too many epochs. This would help with those cases.

@swilson314

This was an issue for me too. Additionally, I found that importing tf keeps me from receiving ctrl-c: https://stackoverflow.com/questions/52798454/import-of-tensorflow-stops-sigint-handler-from-working

I modified the code slightly so the existence of a file can signal an early stop:

import os
import signal
import sys
import time

from keras.callbacks import Callback


class SignalStopping(Callback):
    '''Stop training when an interrupt signal (or a stop file) is received.
        # Arguments
        sig: the signal to listen to. Defaults to signal.SIGINT.
        doubleSignalExits: receiving the signal twice exits the Python
            process instead of waiting for this epoch to finish.
        verbose: verbosity mode.
        stop_file: path of a file whose existence also signals an early
            stop, e.g. ./path/_StopTraining.txt (useful when ctrl-c
            trapping isn't working).
        stop_file_delta: minimum number of seconds between checks of
            the stop file.
    '''
    def __init__(self, sig=signal.SIGINT, doubleSignalExits=False, verbose=0,
                 stop_file=None, stop_file_delta=10):
        super(SignalStopping, self).__init__()
        self.signal_received = False
        self.verbose = verbose
        self.doubleSignalExits = doubleSignalExits
        # SBW 2018.10.15 Since ctrl-c trapping isn't working, watch for the
        # existence of a file, e.g. .\path\_StopTraining.txt.
        self.stop_file = stop_file
        self.stop_file_time = time.time()
        self.stop_file_delta = stop_file_delta
        self.stopped_epoch = 0

        def signal_handler(sig, frame):
            if self.signal_received and self.doubleSignalExits:
                if self.verbose > 0:
                    print('')  # new line so we don't print on the current status bar
                    print('Received signal to stop ' + str(sig) + ' twice. Exiting..')
                sys.exit(sig)
            self.signal_received = True
            if self.verbose > 0:
                print('')  # new line so we don't print on the current status bar
                print('Received signal to stop: ' + str(sig))

        signal.signal(sig, signal_handler)  # was hardcoded to SIGINT; honor the argument

    def on_epoch_end(self, epoch, logs=None):
        if self.stop_file is not None:
            # Checking the file system is slow in the training loop,
            # so don't check on every epoch.
            delta = time.time() - self.stop_file_time
            if delta > self.stop_file_delta:
                self.stop_file_time += delta
                if os.path.isfile(self.stop_file):
                    self.signal_received = True
        if self.signal_received:
            self.stopped_epoch = epoch
            self.model.stop_training = True

    def on_train_end(self, logs=None):
        if self.stopped_epoch > 0 and self.verbose > 0:
            print('Epoch %05d: stopping due to signal' % self.stopped_epoch)
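
For what it's worth, a usage sketch under the same assumptions (the file path and variable names are just this example's):

stopping = SignalStopping(verbose=1, stop_file='./_StopTraining.txt')
model.fit(x_train, y_train, epochs=1000, callbacks=[stopping])
# fit() returns once the signal (or stop file) is seen, so cleanup code runs:
model.save('interrupted_model.h5')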

hubingallin pushed a commit to hubingallin/keras that referenced this issue Sep 22, 2023
kernel-loophole pushed a commit to kernel-loophole/keras that referenced this issue Sep 25, 2023