Refactor: factor algorithm out of server; bundle resource lookups into a single object #99

ronen · 2016-08-25T23:42:29Z

Hi, thanks for making Gentle available!

I'm looking into using Gentle, but without running a server. This PR includes:

align.py, a shell command that runs the forced alignment algorithm and outputs JSON
To do that cleanly, I factored out the algorithm from serve.py, creating two classes ForcedAligner and FullTranscriber
And to do that cleanly, I bundled all resource lookups into a single GentleResources instance which I pass around; and once I had that I propagated it into the other bits of code that used those resources.

I took my best shot at organizing and naming and whatnot, but of course I'm open to changes that you'd find more suitable.

[Oh I should also point out that I haven't tested FullTranscriber since the install didn't seem to include data/graph/HCLG.fst. It "should" work since the code is just moved over from where it had been earlier, but of course bugs could have easily crept in.]

serve.py now contains no logic for dealing with the resources, building queues, running the multiple passes, etc. It just runs the aligner/transcriber and displays the output. NB I haven’t been able to test the FullTranscriber since I don’t have the relevant HCLG file

strob · 2016-08-26T11:40:35Z

This looks great! I've been experimenting with ways to use Gentle programmatically from Python, but hadn't settled on a strategy yet. Two issues to discuss:

There are lots of path issues that come along with the desire to import gentle. In particular, thinking through where to put all of the language model resources when the python package is installed system-wide. Until now, every project I make that uses gentle has a mess of symlinks to ext/, PROTO_LANGDIR, etc. (Incidentally, the install_language_model.sh script will provide the `data/graph/HCLG.fst` file that you're missing.)
The other thing I often do when using gentle programmatically is align part of a wavefile. I usually do that by using the lower-level StandardKaldi object directly, but we may want to think about exposing a partial transcription/alignment API.

Let me know what you think about these points, and we can figure out how to complete the merge.
Thanks!

ronen · 2016-08-26T15:53:12Z

Great, glad you like this direction!

... thinking through where to put all of the language model resources when the python package is installed system-wide. Until now, every project I make that uses gentle has a mess of symlinks to ext/, PROTO_LANGDIR, etc.

I don't have a specific thought as to where to put the resources. But I wonder whether the GentleResources object could be made smarter to help deal with this? Here's a proposal:

GentleResources would expect a manifest JSON that would list the resources and their locations.
GentleResources could allow the manifest to be specified by explicit file path and/or by search path and/or via environment variables. (And for completeness, by programmatically providing the JSON data.)
The manifest contents would be keyed by a language name, allowing multiple languages or versions of language data to be installed simultaneously. By default GentleResources would use the first/only language in the manifest.
Also provide tools to help create the manifest JSON while installing the resources and/or create the manifest JSON after the fact by somehow finding existing resources?
Advanced, maybe: The manifest could support specifying URIs rather than only local filesystem paths, and GentleResources could download & cache the files locally somewhere. In fact, the manifest itself could be specified by a URI.
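
To make the proposal above concrete, here is a minimal sketch of the lookup order (explicit path, then environment variable, then search path) and the language-keyed manifest. Everything in it is an assumption for illustration: the `GENTLE_MANIFEST` variable, the `manifest.json` file name, the default search directories, and the `find_manifest` / `Resources.from_file` / `Resources.path` names are hypothetical and not part of Gentle. The URI download-and-cache idea is left out.

```python
import json
import os
from pathlib import Path

# Hypothetical names, chosen only to illustrate the proposal.
MANIFEST_ENV_VAR = "GENTLE_MANIFEST"
MANIFEST_NAME = "manifest.json"
DEFAULT_SEARCH_PATH = [Path.cwd(), Path.home() / ".gentle"]


def find_manifest(explicit_path=None, search_path=None, environ=None):
    """Locate a manifest: explicit path first, then env var, then search path."""
    environ = os.environ if environ is None else environ
    if explicit_path is not None:
        # An explicit path always wins; a bad path fails later when opened.
        return Path(explicit_path)
    if MANIFEST_ENV_VAR in environ:
        return Path(environ[MANIFEST_ENV_VAR])
    directories = DEFAULT_SEARCH_PATH if search_path is None else search_path
    for directory in directories:
        candidate = Path(directory) / MANIFEST_NAME
        if candidate.is_file():
            return candidate
    raise FileNotFoundError("no %s found on the search path" % MANIFEST_NAME)


class Resources:
    """Resource paths for one language, taken from a language-keyed manifest."""

    def __init__(self, manifest, language=None):
        # manifest is a dict: {language name: {resource name: location}}
        if not manifest:
            raise ValueError("manifest lists no languages")
        if language is None:
            # Default to the first/only language listed in the manifest.
            language = next(iter(manifest))
        self.language = language
        self._paths = manifest[language]

    @classmethod
    def from_file(cls, path, language=None):
        with open(path) as f:
            return cls(json.load(f), language)

    def path(self, name):
        return self._paths[name]
```

With a manifest such as `{"en": {"hclg": "data/graph/HCLG.fst"}}`, `Resources.from_file(find_manifest()).path("hclg")` would return the configured location; passing the parsed dict straight to `Resources(...)` covers the "programmatically providing the JSON data" case.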

(and really GentleResources is a bad name, it should just be gentle.Resources I'd say)

What do you think about all that?

(Incidentally, the install_language_model.sh script will provide the `data/graph/HCLG.fst` file that you're missing.)

Oh good to know thanks. I'll get the file and try it out. (Next week :)

... align part of a wavefile. I usually do that by using the lower-level StandardKaldi object directly, but we may want to think about exposing a partial transcription/alignment API.

Yes, that seems like it would make sense. I guess via optional parameters for starting time and length in the wavefile? Would it also make sense to have start and length in the transcript? And/or an easy way to specify a single wavefile and transcript and support the ability to do a partial alignment then later continue where that alignment left off? (I haven't used alignment in practice yet to have a good sense for important use cases.)
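
As a rough illustration of the "starting time and length" idea, the sketch below cuts a segment out of a WAV file using only the standard-library `wave` module, so that the segment could then be handed to an aligner. It does not touch Gentle or Kaldi; the function name `extract_segment` and its parameters are hypothetical. Times reported for the segment would need the returned offset added back to refer to the original file.

```python
import wave


def extract_segment(src_path, dst_path, start=0.0, length=None):
    """Copy `length` seconds of audio starting at `start` seconds to dst_path.

    Returns the actual start offset in seconds (after rounding to a frame).
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        total = src.getnframes()
        # Clamp the requested window to the frames that actually exist.
        first = min(int(start * rate), total)
        remaining = total - first
        count = remaining if length is None else min(int(length * rate), remaining)
        src.setpos(first)
        frames = src.readframes(count)
    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)
        # writeframes() fixes up the frame count in the header on close.
        dst.writeframes(frames)
    return first / rate
```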

BTW, ultimately for the project I'm looking at I'd need to do everything in C++ without Python, so at some point I'll be reimplementing the core alignment algorithm in C++, directly calling Kaldi. Let me know if that's something you'd want contributed back here--subject to my client OK'ing releasing the C++ code that is.

ronen · 2016-08-31T15:28:02Z

(Incidentally, the install_language_model.sh script will provide the `data/graph/HCLG.fst` file that you're missing.)
Oh good to know thanks. I'll get the file and try it out. (Next week :)

Needed a small fix, but it's working now.

ronen · 2016-08-31T15:36:26Z

I've gone a little farther with the refactoring & encapsulation:

Now the top-level apps serve.py and align.py just import gentle and use gentle.Resources, gentle.FullTranscriber and gentle.ForcedAligner and don't reach into the gentle package to import anything else.
The value returned by the .transcribe() methods is now an instance of gentle.Transcription rather than plain dictionary. The formerly top-level to_csv() and to_json() functions are now methods of that class.
Utilities (paths, ffmpeg and cyst) that aren't particularly gentle-specific, and that are used only by the top-level apps or are shared by the gentle package and the top-level apps, are now in a separate package util.
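
The move from free functions to methods can be pictured with a minimal sketch like the one below. It is not the code from this PR: the constructor arguments, the `word` / `start` / `end` keys, and the CSV column order are assumptions made for illustration. The point is only that knowledge of the data layout lives in one class instead of in `to_csv(data)` and `to_json(data)` helpers.

```python
import csv
import io
import json


class Transcription:
    """Result object that owns its own serialization (illustrative sketch)."""

    def __init__(self, transcript, words):
        self.transcript = transcript
        # Assumed layout: a list of dicts with "word", "start" and "end" keys.
        self.words = words

    def to_json(self, **kwargs):
        """Serialize to a JSON string; kwargs are passed to json.dumps."""
        return json.dumps({"transcript": self.transcript, "words": self.words}, **kwargs)

    def to_csv(self):
        """Serialize to CSV text, one row per word: word, start, end."""
        buf = io.StringIO()
        writer = csv.writer(buf)
        for w in self.words:
            writer.writerow([w["word"], w["start"], w["end"]])
        return buf.getvalue()
```

A caller such as a command-line script would then only need `result.to_json(indent=2)` or `result.to_csv()`, without knowing how the words are stored.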

Hope you approve

strob · 2016-09-13T07:12:22Z

Thanks so much for this and sorry it took me so long to finish testing. Big code quality improvements!

ronen added 3 commits August 25, 2016 16:21

Command-line script to run the forced aligner, without starting a server

f26b374

use resources object rather than hardwired paths in remaining code

2c5349b

ronen added 6 commits August 26, 2016 16:59

Merge branch 'master' into refactor

2117598

bug fix (finally tested it) — added missing self.

d268306

changed GentleResources to gentle.Resources (exported from gentle/__i…

6b2d617

…nit__.py)

export gentle.ForcedAligner and gentle.FullTranscriber

4d75d85

refactor: make Transcription object with to_json() and to_csv() methods

5d1e181

rather than to_csv(data) and to_json(data) that know about the structure of the data.

refactor: move shared helpers that aren’t gentle-specific from gentle…

f57f054

…/ into util/

rename util.ffmpeg.to_wav as gentle.resample since the particulars of…

1ed0f82

… resampling really are gentle-specific. also provide context manager version that creates a tempfile

strob merged commit 1ed0f82 into lowerquality:master Sep 13, 2016

ronen mentioned this pull request Sep 13, 2016

delete broken vestigial import #104

Merged

strob mentioned this pull request Dec 5, 2016

Option for caching result of transcription pass #115

Open
