Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Couple comments on reading #2

Closed
danpovey opened this issue Apr 26, 2020 · 5 comments · Fixed by #5
Closed

Couple comments on reading #2

danpovey opened this issue Apr 26, 2020 · 5 comments · Fixed by #5

Comments

@danpovey
Copy link
Collaborator

https://github.com/pzelasko/lhotse/blob/7555df605def57836c9454ae44aac95c504d86b0/lhotse/audio.py#L21

There should possibly be at least a TODO here to support getting a subset of channels. I'm thinking of scenarios like Switchboard. Since I think typical storage formats store the different channels interleaved, it may be just as efficient to read all of them and then select the needed ones. But there are scenarios where different channels are stored separately (AMI). We have to figure out at what level to do this.

Regarding the raw-PCM thing: I think if there is any common/universal format it would probably be wav, not raw pcm, as raw pcm is too easy to get wrong (no error if wrong format). I didn't really understand the AudioSource code well as your Python style is very modern... e.g. no init function.

@pzelasko
Copy link
Collaborator

pzelasko commented Apr 26, 2020

Regarding subests of channels - it's already supported, note the test_get_audio_multichannel - I'll document it better. I elaborated more in issue #4

Regarding the raw-PCM thing: I think if there is any common/universal format it would probably be wav, not raw pcm, as raw pcm is too easy to get wrong (no error if wrong format).

That's just for the command type of input, where we could always place sox at the end of the pipe to convert practically anything to PCM, and it is very easy to load with numpy as it's just a cast of binary buffer to int16. But we can make it work with wav too I guess.

I didn't really understand the AudioSource code well as your Python style is very modern... e.g. no init function.

Alright, let me elaborate on that. Python 3.7 introduced dataclass, based on attrs library. It basically reduces the amount of boilerplate code you need to write. You can specify the members similarly as in C++, e.g.:

@dataclass
class Foo:
  x: int
  y: List[str]

and it automatically creates default __init__, __eq__, __repr__, and can also create comparison operators, __hash__, or impose immutability with extra arguments to dataclass. The __init__ would look something like:

def __init__(self, x: int, y: List[str]):
  self.x = x
  self.y = y
  self.__post_init__()

It also provides a method asdict which can convert a dataclass into a nice dictionary, making serialization very easy (notice the code is quite terse for what it does). There's a __post_init__ that's automatically called in the auto-generated constructor, it allows you to customize __init__ without having to write the boilerplate.

Also, I find the typing module extremely useful for static code analysis and improving documentation. List[int] type means the variable will be a list of int, Optional means it can be None, Union[int, str] means it can be either an int or str.

@csukuangfj
Copy link
Contributor

Alright, let me elaborate on that. Python 3.7 introduced dataclass

The users of this library are forced to use Python 3.7. If some of them do not have
sudo permission, they will have difficulties upgrading their python version to 3.7

@danpovey
Copy link
Collaborator Author

danpovey commented May 8, 2020

Mm. My feeling is the python3.7 thing may be OK... this is a rather forward-looking project, and my feeling is enough people have python3.7 (or could install it in user-space).
If it's a problem later we can always change the code, but for now I don't want to slow this project down by worrying too much about such thigns.

@pzelasko
Copy link
Collaborator

pzelasko commented May 8, 2020 via email

@jtrmal
Copy link
Collaborator

jtrmal commented May 9, 2020 via email

m-wiesner pushed a commit to m-wiesner/lhotse that referenced this issue Sep 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants