Skip to content
This repository has been archived by the owner on May 6, 2022. It is now read-only.

NLU module updates, threaded through the Spokestack wrapper #123

Merged
merged 3 commits into from Dec 16, 2020

Conversation

space-pope
Copy link
Collaborator

@space-pope space-pope commented Dec 14, 2020

I'd like an API sanity check here—see if the commit message pasted below makes sense. Further explanation: I think we need to keep Spokestack.start() and stop() limited to interacting with the speech pipeline—a user might want to force the pipeline to stop listening while playing a TTS prompt, so stop() can't call close() on TTS—but it would also be good to have the ability to fully release all modules' resources if necessary.

Would it be less surprising if start() also ran prepare() implicitly?


This change makes the NLUService extend AutoCloseable, which forces a
close() method on all implementors. The existing service uses this to
release TensorFlow model and vocab resources.

On the NLU manager, close() has been duplicated as a convenience
method called release(), named for parallelism with the newly
added prepare() method, which is its inverse.

The NLU module was the last one to provide release/prepare
support, so adding it suggests a change to the Spokestack wrapper
API, removing the release/prepare methods used for the TTS module
and repurposing them to handle resources for both NLU and TTS.

Copy link
Contributor

@noelweichbrodt noelweichbrodt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know I wasn't tagged, but have a couple of comments. I agree with the sentiment, and AutoCloseable is a reasonable thing to implement to add clarity to resource management, but it does bother me that there's two different semantics in use (stop/start & prepare/release). Additionally, unclear how initialize relates to this change (from a client's perspective). Finally, if you came across a use case from the tray work that motivated this change, it would be good to share that.

* <p>
* This is useful for stopping passive listening (listening for wakeword
* activation); for fully releasing <em>all</em> internal resources held by
* Spokestack, see {@link #release()}.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* Spokestack, see {@link #release()}.
* the Spokestack pipeline, see {@link #release()}.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Saying "the Spokestack pipeline" here introduces confusion between Spokestack (a collection of all modules), and the speech pipeline. This is meant to refer to the former.

@space-pope
Copy link
Collaborator Author

space-pope commented Dec 14, 2020

I agree with the start/stop vs prepare/release problem. It stems from tension between Spokestack originally being just a speech pipeline (and the fact that that's arguably still its primary/most valuable function) and the fact that it's grown into a larger system.

Because of the primacy of the speech pipeline, I think it's still useful to think of start/stop as things that relate only to the app "listening" to the user. I'd be open to making those methods manage resources for other subsystems (and almost did), were it not for the fact that the speech pipeline might need to be managed separately from other modules—as I mentioned in the PR message, you might want to stop the speech pipeline from listening while the TTS module is active (so you can't release its resources).

Another possibility I didn't touch on here is that of expanding the speech pipeline API to more closely resemble Python's. Including a pause method would take a little refactoring in the pipeline but allow the scenario I mentioned above while letting start/stop take over all resource management duties. The only question left then is whether start/stop are good names for methods in charge of application resources. Personally, I'm not a big fan.

@space-pope
Copy link
Collaborator Author

space-pope commented Dec 14, 2020

Oh, and I'm not sure what you're referring to with initialize—that's not a method name in use anywhere in the Android lib. Initialization in Java is usually synonymous with object construction; that's how it's used in the docs.

Edit: I missed your last sentence on my first read-through. This change isn't motivated by the tray, but by me noticing that there was no way for an application to temporarily release resources held by the NLU module, but the other modules supported this use pattern. Once I'd put the modules on equal footing, it made sense to modify the wrapper to expose the full functionality.

This change makes the NLUService extend AutoCloseable, which forces a
`close()` method on all implementors. The existing service uses this to
release TensorFlow model and vocab resources.

On the NLU manager, `close()` has been duplicated as a convenience
method called `release()`, named for parallelism with the newly
added `prepare()` method, which is its inverse.

The NLU module was the last one to provide release/prepare
support, so adding it suggests a change to the Spokestack wrapper
API, removing the release/prepare methods used for the TTS module
and repurposing them to handle resources for both NLU and TTS.
@brentspell
Copy link
Contributor

The main purpose of the start/stop states are to minimize resource usage in the application when the speech features have been disabled. If you want to be able to stop listening during tts while not tearing everything down, it would probably make sense to add the pause/resume functions to the pipeline and then automatically call them in your spokestack wrapper during tts.

This adds `pause` and `resume` methods to the speech pipeline
to allow listening to be temporarily suspsended without fully
releasing the pipeline's resources.

In turn, `prepare` and `release` in the `Spokestack` wrapper
have been removed, with `start` and `stop` controlling the
resources of all modules instead of just the speech pipeline.
`pause` and `resume` have been threaded through the wrapper
and are automatically called by TTS events.
@space-pope
Copy link
Collaborator Author

In consideration of the feedback here, prepare and release have been removed, their functionality taken over by start and stop, respectively. The speech pipeline can now be temporarily paused, and that functionality has been threaded through the wrapper. pause and resume can be called manually, but they are called automatically when TTS events are received.

Copy link
Contributor

@brentspell brentspell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks like a simple solution, nice work

@brentspell brentspell merged commit d154c13 into master Dec 16, 2020
@brentspell brentspell deleted the jz-close-nlu branch December 16, 2020 19:12
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants