More audio filetypes support #62

Merged
deanmalmgren merged 17 commits into deanmalmgren:master from arvindch:more-audio-support on Sep 5, 2014

3 participants
@arvindch
Contributor

arvindch commented Aug 28, 2014

I've added more support for audio extraction (see issue #60).
MP3 and OGG are now supported via a command-line tool, sox.
Adding more file types is possible as well.
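
As a rough illustration of the approach described above (convert a non-WAV file to WAV with sox before running speech recognition), here is a minimal sketch. It is not the code from this pull request: the function names `build_sox_command` and `convert_to_wav`, the `SUPPORTED_EXTENSIONS` set, and the plain `sox <input> <output>` invocation are assumptions chosen for illustration, and the sketch assumes sox is installed with MP3 and OGG format support.

```python
import os
import subprocess
import tempfile

# Hypothetical set of extensions to route through sox (assumption, not from the PR).
SUPPORTED_EXTENSIONS = {".mp3", ".ogg"}


def build_sox_command(src_path, dst_path):
    """Return the argument list for a basic sox conversion.

    sox infers the input and output formats from the file extensions,
    so `sox in.mp3 out.wav` is enough for a plain conversion.
    """
    return ["sox", src_path, dst_path]


def convert_to_wav(src_path):
    """Return the path of a WAV version of src_path.

    WAV input is returned unchanged; MP3/OGG input is converted into a
    temporary .wav file, which the caller is responsible for deleting.
    """
    ext = os.path.splitext(src_path)[1].lower()
    if ext == ".wav":
        return src_path
    if ext not in SUPPORTED_EXTENSIONS:
        raise ValueError("unsupported audio file type: %s" % ext)

    # Reserve a temporary output path, then let sox overwrite it.
    fd, dst_path = tempfile.mkstemp(suffix=".wav")
    os.close(fd)
    # check_call raises CalledProcessError if sox exits with a non-zero status.
    subprocess.check_call(build_sox_command(src_path, dst_path))
    return dst_path
```

Under these assumptions, supporting another file type would only mean adding its extension to `SUPPORTED_EXTENSIONS`, provided the local sox build can read that format.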

@arvindch arvindch force-pushed the arvindch:more-audio-support branch from 0408181 to f26357f Aug 28, 2014

@deanmalmgren

Owner

deanmalmgren commented Aug 28, 2014

Um, this is awesome! Thanks for coding this up. I saw your gchat about the tests failing. In your most recent commit, it appears that the issue is one of two things: either (i) the raw_text.txt files do not have a newline at the end of the file and they should, or (ii) the raw_text.txt files have a newline at the end of the file and they shouldn't. Does that help?
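
The trailing-newline mismatch described above can be checked with a small helper. This is only a sketch for diagnosing that kind of test failure; the function names `ends_with_newline` and `with_single_trailing_newline` are hypothetical and are not part of textract.

```python
def ends_with_newline(text):
    """Return True if text ends with a newline character."""
    return text.endswith("\n")


def with_single_trailing_newline(text):
    """Return text with exactly one trailing newline.

    Any existing trailing newlines are stripped first, so the result
    is the same whether the input had zero, one, or several.
    """
    return text.rstrip("\n") + "\n"


# Example: an expected-output file and an extracted string that differ
# only by the final newline compare equal once both are normalized.
expected = "hello world\n"
extracted = "hello world"
same_after_normalizing = (
    with_single_trailing_newline(expected)
    == with_single_trailing_newline(extracted)
)
```

Comparing `ends_with_newline` on the expected file and on the parser output shows which of the two cases applies.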

@coveralls

coveralls commented Aug 28, 2014

Coverage Status

Coverage increased (+0.73%) when pulling 4005c39 on arvindch:more-audio-support into 1178019 on deanmalmgren:master.

@coveralls

coveralls commented Aug 28, 2014

Coverage Status

Coverage increased (+0.73%) when pulling 8065d08 on arvindch:more-audio-support into 1178019 on deanmalmgren:master.

@deanmalmgren

Owner

deanmalmgren commented Aug 28, 2014

This is really awesome. When I started this project I wasn't even coming close to thinking about audio stuff. I made a couple of comments on a few commits, but once we address those super minor things let's get this merged in!

@arvindch arvindch force-pushed the arvindch:more-audio-support branch from 20fe8fa to 4fe8f82 Sep 4, 2014

arvindch added some commits Aug 28, 2014

clarified wav_parser docstring
Signed-off-by: Arvind Chembarpu <achembarpu@gmail.com>
added sox to system requirements
Signed-off-by: Arvind Chembarpu <achembarpu@gmail.com>
added audio filetype conversion
uses sox cmdline tool

Signed-off-by: Arvind Chembarpu <achembarpu@gmail.com>
added mp3, ogg support
Signed-off-by: Arvind Chembarpu <achembarpu@gmail.com>
added mp3, ogg test cases
Signed-off-by: Arvind Chembarpu <achembarpu@gmail.com>
added conversion libs to requirements
Signed-off-by: Arvind Chembarpu <achembarpu@gmail.com>
added debian-specific requirement
Signed-off-by: Arvind Chembarpu <achembarpu@gmail.com>
tweaked sox cmdline params
Signed-off-by: Arvind Chembarpu <achembarpu@gmail.com>
updated mp3, ogg test cases
Signed-off-by: Arvind Chembarpu <achembarpu@gmail.com>
added newlines to .py source files
Signed-off-by: Arvind Chembarpu <achembarpu@gmail.com>
modified speech extraction to add newline
makes output cleaner

Signed-off-by: Arvind Chembarpu <achembarpu@gmail.com>
added sox testing info to docstring
Signed-off-by: Arvind Chembarpu <achembarpu@gmail.com>
updated documentation
removed remnant file

Signed-off-by: Arvind Chembarpu <achembarpu@gmail.com>
switched to ShellParserTestCase
Signed-off-by: Arvind Chembarpu <achembarpu@gmail.com>
edited cmd formatting
now matches other source formatting

Signed-off-by: Arvind Chembarpu <achembarpu@gmail.com>
refactored speechrecognition parser
simplifies issue diagnosis

Signed-off-by: Arvind Chembarpu <achembarpu@gmail.com>
updated changelog
Signed-off-by: Arvind Chembarpu <achembarpu@gmail.com>

@arvindch arvindch force-pushed the arvindch:more-audio-support branch from 4fe8f82 to b5c52ac Sep 4, 2014

@coveralls

coveralls commented Sep 4, 2014

Coverage Status

Coverage increased (+1.22%) when pulling b5c52ac on arvindch:more-audio-support into 4615083 on deanmalmgren:master.

@arvindch

Contributor

arvindch commented Sep 4, 2014

I was a bit busy with college stuff over the past few days, but I'm back now.

I've coded in the fixes and refactored a few things. I think it's probably ready to merge.
Tell me what you think!

@deanmalmgren

Owner

deanmalmgren commented Sep 5, 2014

Thanks again for the contribution @arvindch! I just merged this in and it will be included in the next release of textract, possibly next week.

Have a great weekend.

@deanmalmgren deanmalmgren merged commit b5c52ac into deanmalmgren:master Sep 5, 2014

1 check passed

continuous-integration/travis-ci The Travis CI build passed

@arvindch arvindch deleted the arvindch:more-audio-support branch Sep 6, 2014
