Audio displayobject #3656

Closed
wants to merge 12 commits into from

nils-werner

[ not for 1.0 ]

This pull request adds a class Audio(DisplayObject) to the list of display objects.

The object makes use of HTML5's <audio> tag to render audio data. Right now, the code generates .WAV PCM-data and includes it as src="data:audio/wav;base64,..." into the HTML.
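As a rough illustration of the approach described above (not the code from this PR), the sketch below builds one second of a 440 Hz tone, writes it as 16-bit PCM WAV into an in-memory buffer with the standard-library `wave` module, and wraps the result in an `<audio>` tag with a base64 data URI. The helper names `wav_bytes` and `audio_tag` are hypothetical.

```python
import base64
import io
import math
import struct
import wave


def wav_bytes(samples, rate):
    """Pack 16-bit mono PCM samples into an in-memory WAV file."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes per sample (int16)
        wav.setframerate(rate)
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return buffer.getvalue()


def audio_tag(wav_data):
    """Wrap WAV bytes in an <audio> element using a base64 data URI."""
    encoded = base64.b64encode(wav_data).decode("ascii")
    return '<audio controls src="data:audio/wav;base64,%s"></audio>' % encoded


rate = 44100
# One second of a 440 Hz sine tone at roughly a quarter of full scale.
tone = [int(2**13 * math.sin(2 * math.pi * 440 * n / rate)) for n in range(rate)]
html = audio_tag(wav_bytes(tone, rate))
```

Because the whole file travels inside the HTML, the output grows by about a third relative to the raw WAV size due to base64 encoding.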

There is currently no way to show the object in Qtconsole or in any other terminal type.

The code that generates the .WAV data is taken from scipy, as their implementation can currently only write to files and cannot fill a BytesIO-type buffer.

I have also submitted a PR to scipy that will fix this issue. Once it has been merged (and IPython is using that version of scipy), lines 730 to 750 can be replaced by a simple call to scipy.io.wavfile.write(buffer, rate, data).

Todo:

  • [] implement embed=True|False
  • [] add to the example notebooks.


# Specifying Audio(url=...) or Audio(filename=...) does not
# embed the audio data, it only generates an `<audio>` tag with a
# link to the source. This will not work when offline.
Member

filename should embed the data. There is no guarantee that files are actually local (they could be in /tmp, in which case they are not actually accessible with the files url). If someone wants to use files/ url, they should specify that with the url field, rather than the filename field.
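A minimal sketch of the behaviour requested in this review comment, assuming a hypothetical helper `audio_html` (this is not the PR's code): `filename=` is read on the Python side and embedded, while `url=` is only linked.

```python
import base64
from html import escape


def audio_html(filename=None, url=None, mimetype="audio/wav"):
    """Return an <audio> tag: `filename` is embedded, `url` is only linked."""
    if (filename is None) == (url is None):
        raise ValueError("pass exactly one of filename= or url=")
    if url is not None:
        # The browser fetches the resource itself; nothing is embedded.
        return '<audio controls src="%s"></audio>' % escape(url, quote=True)
    # Read the file here so the browser never needs access to the path,
    # which may be somewhere it cannot reach (for example /tmp).
    with open(filename, "rb") as f:
        payload = base64.b64encode(f.read()).decode("ascii")
    return '<audio controls src="data:%s;base64,%s"></audio>' % (mimetype, payload)
```

With this split, someone who wants the notebook server's files/ URL would pass it explicitly as `url="files/..."`.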

@Carreau
Member

Carreau commented Jul 17, 2013

May I also propose that we start considering a side library of displayable objects that is, most of the time, compatible with all IPython versions.

Of course we could keep the low-level pieces in IPython itself. But a library like this, with a faster release cycle than IPython, would let older versions of IPython use it too.

@ellisonbg
Member

The semantics of the keyword arguments should follow those of the IPython.display.Image class. This would mean that there should be an embed keyword argument, defaulting to False, that forces URL-based resources to be embedded when set to True.
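One way to read this suggestion, sketched under the assumption that Audio would copy the default documented for Image (embed unless a URL is given, in which case embedding must be requested explicitly). The function name `resolve_embed` is hypothetical.

```python
def resolve_embed(url=None, embed=None):
    """Decide whether the audio bytes should be inlined into the page.

    With no explicit choice, data and files are embedded and URLs are
    only linked; an explicit `embed` value always wins.
    """
    if embed is None:
        return url is None
    return bool(embed)


# URL-based resources are linked unless the caller forces embedding.
link_only = resolve_embed(url="http://example.com/tone.wav")
forced = resolve_embed(url="http://example.com/tone.wav", embed=True)
```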

This object should be in IPython.lib.display rather than core. I agree with @Carreau that we should encourage third party display libraries in general. But this one is simple and important enough that it should be in IPython proper - but lib not core.

@ellisonbg
Member

Also, please add an example to the rich display system notebook in examples/notebooks.

@Carreau
Member

Carreau commented Jul 17, 2013

> I agree with @Carreau that we should encourage third party display libraries in general.

That was not my main point. I am really thinking of a repo with lots of display objects, maybe even in the IPython organisation but released out of sync with IPython itself.

Of course audio seems pretty simple here (and I have no objection to having it in lib), but where do we draw the line?
We embed wav and video. What about mp3? ogg? ogv? webm? We have a specific YouTube class. Why not DailyMotion? Vimeo? Putlocker? NovaMov?

Those would be nice and are independent of the IPython version, so we should consider having one place where they could be added, with, why not, a monthly release (at least more frequent than IPython itself).

@nils-werner
Author

I have moved the component to IPython.lib.display (I did not know that existed, I thought there was only core). Also, when you include files or URLs, the component now guesses the correct MIME type and passes it on to the browser.

This means that generated audio will still be WAV but you can include any format you like and your browser supports.
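The MIME type guessing mentioned here could look roughly like the sketch below, which uses the standard-library `mimetypes` module. Whether the PR uses this module is an assumption; the helper name and the `audio/wav` fallback are hypothetical.

```python
import mimetypes


def guess_audio_mimetype(name, default="audio/wav"):
    """Guess a MIME type from a filename or URL, falling back to `default`."""
    mimetype, _encoding = mimetypes.guess_type(name)
    if mimetype is None:
        return default
    return mimetype


mp3_type = guess_audio_mimetype("http://example.com/music/song.mp3")
fallback = guess_audio_mimetype("recording.unknownext")
```

The guess is based on the file extension only, so whether playback works still depends on what the browser supports.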

I haven't gotten around to implementing embed=False yet.

Examples
--------
# embedded raw audio data
Audio(data=2**13*numpy.sin(2*pi*440/44100*numpy.arange(44100)).astype(numpy.int16),
Member

I tried this example and it didn't work. Can you try it out and update as needed?

@ellisonbg
Member

So I think it should be possible to pass the data as floats as well. It is pretty simple to normalize and convert to int16 internally.
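A possible shape for that conversion, written without numpy so it works on any sequence of floats. This is only a sketch: peak normalization is one assumption about what "normalize" should mean here, and the function name is hypothetical.

```python
import struct


def floats_to_int16_bytes(samples):
    """Peak-normalize float samples and pack them as little-endian int16."""
    peak = max((abs(s) for s in samples), default=0.0)
    # Scale so the loudest sample maps to 32767; pure silence stays at zero.
    scale = 32767.0 / peak if peak > 0 else 0.0
    ints = [int(round(s * scale)) for s in samples]
    return struct.pack("<%dh" % len(ints), *ints)


pcm = floats_to_int16_bytes([0.0, 1.0, -1.0])
```

Peak normalization discards the absolute level of the signal. An alternative would be to assume the floats already lie in [-1, 1] and clip anything outside that range.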

@nils-werner
Author

I've fixed the code example in the docstring; it now works for me. However, I am still struggling to get filenames and URLs to work, as well as with passing floats to the object.

Also, the WAV-writing code assumes it receives a numpy array. Should it be implemented so that it accepts any byte string?

@filmor
Contributor

filmor commented Jul 24, 2013

A good example notebook might be http://nbviewer.ipython.org/urls/raw.github.com/Carreau/posts/master/07-the-sound-of-hydrogen.ipynb, which seems to share a lot of the code.

You could also use http://nbviewer.ipython.org/url/gist.github.com/filmor/c7ae1a867fc9058ffcd4/raw/91ce69c1400540ed39f68bd92234abfb1dc2ae70/tone-generator.ipynb as a starting point; it incorporates the Python 3 changes and implements a small "synthesizer".

@Carreau
Member

Carreau commented Jul 24, 2013

We'll try to review this soon (middle of IPython meeting).

@fperez
Member

fperez commented Jul 27, 2013

Moving to milestone 2.0 b/c we're too close to release time. This is nice, but it can wait.

@Carreau
Member

Carreau commented Aug 10, 2013

1.0 has now been released, back on track.

Would be great to have an example notebook.
Also, I have no clue how to do it, but do you think sending it to the browser in a compressed format like MP3/OGG would be doable?

@filmor
Contributor

filmor commented Aug 10, 2013

I looked into that, but there doesn't seem to be a Python 3 compatible library for encoding audio data.

@nils-werner
Author

Audio formats are a bit tricky, because no single format is supported by everyone (1) (2).

Also, these formats are lossy and make heavy use of auditory and perceptual models, which would disqualify this routine for specific tasks like listening tests.

@ellisonbg
Member

Like this:

http://nbviewer.ipython.org/urls/raw.github.com/ellisonbg/talk-sicm2-2013/master/audio.ipynb

On Sat, Aug 10, 2013 at 9:21 AM, Nils Werner wrote:

> Audio formats are a bit tricky, because no single format is supported by everyone (1) https://www.scirra.com/blog/44/on-html5-audio-formats-aac-and-ogg (2) http://en.wikipedia.org/wiki/HTML5_Audio#Supported_audio_codecs.
>
> Also, these formats are lossy and make heavy use of auditory and perceptual models, which would disqualify this routine for specific tasks like listening tests.

@Carreau
Member

Carreau commented Aug 12, 2013

> Also, these formats are lossy and make heavy use of auditory and perceptual models, which would disqualify this routine for specific tasks like listening tests.

I didn't mean to do it by default, but to also be able to do it. WAV is quite heavy compared to other formats, which might be annoying in services like nbviewer.

How would people feel about merging this and doing the demo notebook in a subsequent PR?

@nils-werner
Author

No, please do not merge this yet. It is not yet in the state I would like to have it in.

@nils-werner
Author

  1. It assumes it receives a numpy array, while the rest of IPython seems to be numpy-independent
  2. The way I am passing and parsing data=, filename=, url= etc. is not done properly

@fperez
Member

fperez commented Sep 7, 2013

Just a quick comment, we'll need to consider cache issues with this element. Otherwise it can be extremely frustrating to use, as browsers cache media very aggressively (on Chrome, even clearing the cache won't play updated audio files whose name hasn't changed, you need to open a new incognito tab or flat out restart the browser).

This page has some suggestions on how to achieve no caching. Maybe that needs to be added to the files/ handler for safety (even at the cost of some performance over the network)...
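Two commonly used techniques for avoiding stale media are sketched below: response headers that ask the browser not to cache, and a changing query parameter that makes each URL look new. These are not necessarily the suggestions from the linked page; the header set, the `cache_busted` helper and the `_` parameter name are assumptions for illustration.

```python
import time
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Response headers often sent to discourage browser caching of a resource.
NO_CACHE_HEADERS = {
    "Cache-Control": "no-cache, no-store, must-revalidate",
    "Pragma": "no-cache",
    "Expires": "0",
}


def cache_busted(url, token=None):
    """Append a changing query parameter so the browser sees a new URL."""
    if token is None:
        # Millisecond timestamp; any value that changes per update would do.
        token = str(int(time.time() * 1000))
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("_", token))  # "_" is an arbitrary, hypothetical name
    return urlunsplit(parts._replace(query=urlencode(query)))


fresh = cache_busted("files/tone.wav", token="1")
```

The query-parameter approach needs no server changes, while the header approach would live in the files/ handler mentioned above.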

@ellisonbg
Member

@nils-werner our usual policy is to close PRs after they sit for more than a month. That doesn't mean we are not interested in the work moving forward - we would love to see this completed. Are you OK with me closing it to keep our PR queue focused on active work? If you finish this up, we would encourage you to reopen the PR to finish review.

@nils-werner
Author

Yeah, no problem; I am currently busy doing other stuff anyways.

I would love to find others to help me implement this, though. While I have experience with web stuff and audio, I have no idea how IPython works internally or how I should implement this.

@ellisonbg
Member

OK I will close it, but open an issue to track progress with the hopes of getting other people to help out.

@ellisonbg mentioned this pull request Sep 20, 2013
@ellisonbg
Member

Closing this PR until work restarts. I have opened an issue to track progress on the effort: #4241

@ellisonbg closed this Sep 20, 2013