Audio Messages 🗣️💬 #10669

Closed
errorists opened this issue May 18, 2020 · 31 comments · Fixed by #10810
Labels
feature feature requests

Comments

@errorists
Contributor

errorists commented May 18, 2020

User Story

As a user, I want to be able to compose and send audio messages in chat instead of having to type them on a keyboard. I want message recipients to be able to play back my audio messages.

Description

The scope of this issue is to add a new message type of an Audio Message to our chat and the interface to record one.

We should cap the maximum length of an audio message to something reasonable and test which compression / payload works best for sending the message. These don't have to be high quality; a 128 kb/s bitrate is already better than what you get on a phone. I assume we're reusing the same Waku logic as we did for images.

The UI:

  • we add a new button, placed on the last position for starting a new audio message
  • when tapped we show a brief hint that to start recording you need to press and hold it

tooltip

  • when pressed and held, we animate in the recording UI. It's composed of two parts: one has the audio waveform with controls for playback and cancelling and the duration in seconds; the other is for pausing / continuing recording (we record only when the button is pressed) and sending the message.

recording

  • ideally you should be able to scrub the waveform to position the play head when not recording, that can be followed up as a nice to have in a separate issue
  • tapping outside the recording toolbar dismisses the recording
  • sending the message renders the audio message on the chat thread as expected. inside the message, we place a waveform, a control to play/pause and a time indicator.

sending

Prototype 👉 https://framer.cloud/mnwpw
Figma design 👉 here

Acceptance Criteria

You can send and play back audio messages in chats

Notes

Thanks! 🙌

@errorists errorists added chat feature feature requests labels May 18, 2020
@tbenr
Contributor

tbenr commented Jun 5, 2020

@rachelhamlin I'd like to start working on this audio feature, if not yet allocated to someone :)
My plan would be to start working on the underlying audio functionality first; the UI could come later, or maybe be done by someone else

@hesterbruikman
Contributor

@tbenr that would be lovely! Please bear with me while I set up a bounty.

Rachel moved on to new adventures recently: https://www.honestday.work/

@tbenr
Contributor

tbenr commented Jun 5, 2020

ohh, good luck to Rachel!

no rush at all for the bounty!

@hesterbruikman
Contributor

@cammellos where does this issue stand on the protocol level? Does it require multipart attachments and is there work required to implement that?

@flexsurfer
Member

hey @tbenr welcome back ;) We just finished sending images, and for this one we did some research. We send images as a base64 string, and we probably want the same with sounds: record audio, send it as base64, and then be able to play it. There are a few libraries for RN, or we could implement file saving, reading and converting in status-go. With images we send the path to a local file to status-go, and it converts the image to base64 and sends it

@cammellos
Contributor

Welcome back @tbenr!

as @flexsurfer said, we just implemented images, we have quite a lot of freedom in the format.

You can pass status-go a filepath, for example, and we can take care of the conversion if necessary. For images we send the binary and convert to base64 on receiving the message.
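As a rough illustration of the base64 step described above, here is a minimal sketch of wrapping raw audio bytes in a data URI and unwrapping them again. It is not status-go code: the helper names `to_data_uri` and `from_data_uri` and the `audio/ogg` default are assumptions made for the example.

```python
import base64
from typing import Tuple


def to_data_uri(audio_bytes: bytes, mime_type: str = "audio/ogg") -> str:
    """Wrap raw audio bytes in a base64 data URI (hypothetical helper)."""
    encoded = base64.b64encode(audio_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def from_data_uri(uri: str) -> Tuple[str, bytes]:
    """Split a base64 data URI back into (mime_type, raw_bytes)."""
    header, _, payload = uri.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("not a base64 data URI")
    mime_type = header[len("data:"):-len(";base64")]
    return mime_type, base64.b64decode(payload)
```

Sending the binary and encoding only on receipt, as described for images, keeps the base64 size overhead off the wire.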

I had a quick look at how to implement it. The main challenge, to me, is that unlike images, we can't play a sound inline (I could not find a React Native library for it), i.e.

<audio controls src="data:audio/ogg;base64...

That would be convenient for us, as audio files need to be stored encrypted in the database, and ideally we never save them unencrypted on the filesystem.

If that's not feasible, then the best we can probably do is to store them on play and remove them as soon as finished playing (I have seen some libraries doing that), but we also likely need to take extra care in cleaning things up.
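The "store on play, remove when finished" fallback could look roughly like the sketch below. It is only an illustration of the cleanup pattern, not the behaviour of any specific React Native library; `temporary_audio_file` is a hypothetical name, and plain deletion does not securely erase the data from disk.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def temporary_audio_file(audio_bytes: bytes, suffix: str = ".ogg"):
    """Write decrypted audio to a temp file and delete it when playback ends."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(audio_bytes)
        yield path  # hand the path to the player while the block is active
    finally:
        # Runs on normal completion and on errors, so no file is left behind.
        if os.path.exists(path):
            os.remove(path)
```

A caller would wrap playback in `with temporary_audio_file(data) as path:` so the cleanup also happens if playback fails part-way.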

We probably don't need multipart for this, we can play with the audio quality (maximum size we can send is probably 800KB, but I would have to try this to make sure) and length to ensure we don't exceed the file limit. @errorists did some exploratory work already.
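To make the quality/length trade-off concrete, here is a small sketch of the arithmetic. It assumes a constant bitrate, ignores container overhead, and treats the tentative 800KB limit as 800,000 bytes; the function names are made up for the example.

```python
def audio_size_bytes(duration_s: float, bitrate_bps: int) -> int:
    """Approximate encoded size for a constant-bitrate recording."""
    return int(duration_s * bitrate_bps / 8)


def base64_size_bytes(raw_bytes: int) -> int:
    """Size of the padded base64 encoding of raw_bytes bytes (4 chars per 3 bytes)."""
    return 4 * ((raw_bytes + 2) // 3)


def max_duration_s(limit_bytes: int, bitrate_bps: int) -> float:
    """Longest recording that fits in limit_bytes at the given bitrate."""
    return limit_bytes * 8 / bitrate_bps
```

Under these assumptions, `max_duration_s(800_000, 128_000)` is 50 seconds, and a one-minute message at 128 kbit/s comes to 960,000 bytes, so a 1-minute cap would need a lower bitrate (64 kbit/s allows about 100 seconds). If the payload were sent as base64 rather than binary, the same minute would grow to 1,280,000 bytes.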

If we see that we have to generate larger files, we can think about multipart, but the feature won't really have to change as it will be transparent to status-react (multipart will be handled at a lower level), hopefully that won't be the case.

I can take care of the status-go bits, but of course you are welcome to work on both if you wish to do so.

@tbenr
Contributor

tbenr commented Jun 5, 2020

@flexsurfer nice to talk to you again. Yes i saw what you did for images and we could go similarly with sound.

looking into this:
https://github.com/react-native-community/react-native-audio-toolkit

@tbenr
Contributor

tbenr commented Jun 5, 2020

hi @cammellos, nice to talk to you again too :) yes it's clear. Thanks for the explanations. I'll spend some other time to find an inline solution before going to filesystem. I'll let you know!

@andremedeiros
Contributor

@cammellos I actually think we should lay the foundation for multipart attachments. After we do that we don't have to limit file sizes and people can build whatever they want on top of it.

Is the cost and complexity of implementing that a lot higher than going straight for audio messages?

@cammellos
Contributor

I would say it is,
the thing is that audio messages and multipart can be worked on separately; the only dependency is the file size limit, but we would in any case limit audio messages to a reasonable size (what that is, is to be discussed).

What I think multipart depends on rather, is rate limiting by size (currently we limit only by number of messages, but if those messages are 1MB each, then it's a lot of bandwidth).
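Rate limiting by size rather than by message count can be sketched as a token bucket whose tokens are bytes. This is only an illustration of the idea, not the existing limiter; the class name, parameters and the injectable clock are assumptions for the example.

```python
import time


class ByteRateLimiter:
    """Token bucket that budgets bytes instead of message counts (sketch)."""

    def __init__(self, bytes_per_second: float, burst_bytes: float,
                 clock=time.monotonic):
        self.rate = bytes_per_second
        self.capacity = burst_bytes
        self.tokens = burst_bytes
        self.clock = clock
        self.last = clock()

    def allow(self, message_size: int) -> bool:
        """Return True and spend the budget if the message fits right now."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above the burst capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if message_size <= self.tokens:
            self.tokens -= message_size
            return True
        return False
```

With this shape, many small text messages and a few large audio messages draw from the same bandwidth budget.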

So the way I would go about this is to work on audio (changes in status-go are likely to be minimal, it would be a carbon copy of status-im/status-go@f5ab58b ), in order to unlock the work in status-react, which is going to be fairly complex.

In the meantime, we can address rate limiting.
That should not be difficult, as it's already in place for number of messages.

Once that's done we can start with multipart.
If multipart is ready by the time audio is ready to be released, better: we can cap the audio to, say, 5 minutes (or whatever). Otherwise, we can either decide to release without multipart, capping the audio to, say, 1 minute, or wait for multipart to be ready. We can very

Essentially I would defer the decision and work on this independently. Once we are ready to release, we weigh the pros and cons of going live with or without it; by that time we will have a better understanding of the remaining effort required etc.

What do you think?

@andremedeiros
Contributor

You had me at

in order to unlock the work in status-react

@tbenr
Contributor

tbenr commented Jun 7, 2020

@cammellos just an update on my current experiments/thoughts

first a summary of requirements we have so far:

  1. playback: inline base64 data (nice to have)
  2. recording: metering (to get a realtimeish waveform representation?)
  3. playback: some sort of waveform representation

let's see:

  1. react-native-sound plus this PR (Support base64 audio on iOS, zmxv/react-native-sound#635) seems able to do that (need to try). Android needs to be implemented as well, but I think MediaPlayer with MediaDataSource should do the trick (need to try as well)

  2. it is possible to get metering data while recording (react-native-sound-level) (tested, I also pushed a PR related to Android). So some sort of dynamic representation could be done, but it won't be the actual waveform. What we get here is the max dB since the last poll. If we configure 10 Hz polling we can have an acceptable representation. This leads us to 3.

  3. we could encode the simple metering data taken during recording and send it along with the audio. It will be complex(?) but you don't need to load and analyze the entire audio to get a waveform. The alternative is to use metering only while recording and render the waveform (for both sender and receiver) when the message is sent/received. But the representation while recording will be different from the representation of the actual sent message.
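To illustrate points 2 and 3, the sketch below computes a "max dB since last poll" series from raw samples and packs it into one byte per reading. It is not the API of react-native-sound-level; it assumes samples are floats normalized to [-1, 1] and uses an arbitrary -60 dB floor, and the function names are invented for the example.

```python
import math
from typing import List, Sequence


def peak_db_per_window(samples: Sequence[float], sample_rate: int,
                       poll_hz: int = 10, floor_db: float = -60.0) -> List[float]:
    """Peak level in dBFS for each polling window (10 Hz gives 100 ms windows)."""
    window = max(1, sample_rate // poll_hz)
    levels = []
    for start in range(0, len(samples), window):
        peak = max(abs(s) for s in samples[start:start + window])
        level = 20 * math.log10(peak) if peak > 0 else floor_db
        levels.append(max(floor_db, level))
    return levels


def quantize_levels(levels: Sequence[float], floor_db: float = -60.0) -> bytes:
    """Map levels in [floor_db, 0] dB onto single bytes (0..255)."""
    return bytes(
        round(255 * (max(floor_db, min(0.0, level)) - floor_db) / -floor_db)
        for level in levels
    )
```

At 10 Hz and one byte per reading, a one-minute recording adds 600 bytes of metering data, which is small next to the audio itself.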

another alternative is to delegate player and representation to a webview with injected wavesurfer.js (https://wavesurfer-js.org/) (LOL?) which, btw, supports inline base64 audio too.

ignoring webview, I think i can prepare a react-native module which does what we need, if maintaining our own module is a viable solution, but waveform representation complicates things quite a bit. To get a real representation we need to decode it (which is how https://github.com/juananime/react-native-audiowaveform works)

@cammellos
Contributor

Sounds good, thanks for looking into it @tbenr!
If the waveform is a bit of a hurdle, shall we maybe implement it at a later stage? (We can have a player just like WhatsApp, for example, at least in the first version.) What do you say @errorists ?

@errorists
Contributor Author

enjoying the discussion! @cammellos re: waveform, can I please ask you guys to give it a shot first? In my experience every time I agree to carve out some UI deemed difficult for later, it never gets a follow up and @tbenr said it complicates things a bit and not that it's impossible :)

@tbenr
Contributor

tbenr commented Jun 8, 2020

@errorists one question for you: in your design, while recording there is a scrolling waveform, so you see the last x seconds of the recording. When recording is done you see the whole waveform.

what about having something like this while recording?
image

like last second or two of recording bars. The bars remain still but change amplitude.
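One possible reading of this idea is a fixed set of bars fed by a rolling window of the most recent meter readings. The sketch below assumes levels already normalized to [0, 1]; `RecordingBars` and the default bar count are hypothetical, not part of the design.

```python
from collections import deque
from typing import List


class RecordingBars:
    """Keeps only the most recent meter readings, one per on-screen bar."""

    def __init__(self, bar_count: int = 20):
        # Start with silent bars so the layout is stable from the first frame.
        self.levels = deque([0.0] * bar_count, maxlen=bar_count)

    def push(self, level: float) -> None:
        """Add a new reading; the oldest one falls off automatically."""
        self.levels.append(max(0.0, min(1.0, level)))

    def snapshot(self) -> List[float]:
        """Amplitudes to draw, oldest first."""
        return list(self.levels)
```

With 10 Hz polling, 20 bars would cover the last two seconds of recording.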

@cammellos and others, i'm not strong in UI.
we may have the sound module having a function like
getWaveform("data:audio/mp4;base64,xxxxx",options)

alternative 1
this function may return a rendered PNG like this
image
the current position is a scrolling box, and on top of it we have the PNG with tintColor set to the background (blue)

alternative 2
returns an array of peaks and render in js the wave form including the "position"
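For alternative 2, the "position" part could be as simple as splitting the peaks array at the playhead so the two halves can be drawn in different colours. A minimal sketch, with a hypothetical function name and no claim about how the module would actually expose peaks:

```python
from typing import List, Sequence, Tuple


def split_bars_at_position(peaks: Sequence[float], position_s: float,
                           duration_s: float) -> Tuple[List[float], List[float]]:
    """Return (played, unplayed) bars for the current playback position."""
    if duration_s <= 0:
        return [], list(peaks)
    # Clamp so seeking past either end never indexes outside the array.
    fraction = min(1.0, max(0.0, position_s / duration_s))
    cut = int(fraction * len(peaks))
    return list(peaks[:cut]), list(peaks[cut:])
```

The renderer then only needs the two lists and two colours, regardless of how the peaks were obtained.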

@flexsurfer
Member

flexsurfer commented Jun 8, 2020

imo, we're not developing a professional sound editor, so not having the actual waveform sounds fine to me

alternative 2 looks more flexible ?

@tbenr
Contributor

tbenr commented Jun 8, 2020

@flexsurfer what is the best way to draw the peak bars?

this?
https://github.com/react-native-community/art

@tbenr
Contributor

tbenr commented Jun 8, 2020

my idea is that while recording we have only a few bars, so each bar could be a single view.
in the case of the full peaks array, we have a ton of them (unless we cap to a low-resolution representation)
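Capping to a low-resolution representation could be done by bucketing the full peaks array and keeping the maximum of each bucket. A sketch under that assumption (`downsample_peaks` is a made-up name):

```python
from typing import List, Sequence


def downsample_peaks(peaks: Sequence[float], max_bars: int) -> List[float]:
    """Reduce a long peaks array to at most max_bars values, keeping bucket maxima."""
    if max_bars <= 0:
        raise ValueError("max_bars must be positive")
    count = len(peaks)
    if count <= max_bars:
        return list(peaks)
    bars = []
    for i in range(max_bars):
        # Integer bucket edges cover the whole array without gaps or overlap.
        start = i * count // max_bars
        end = (i + 1) * count // max_bars
        bars.append(max(peaks[start:end]))
    return bars
```

Taking the maximum rather than the mean keeps short loud moments visible after the reduction.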

@errorists
Contributor Author

@tbenr sure, we could do that, would need to adjust the design

@flexsurfer in general yes, you're right but it is the most well known and accurate representation of audio and it's useful as it will show you if you're recording too loudly or not loud enough. We have no deadlines here, so why not try.

@cammellos
Contributor

I would consider splitting this in 2 tasks to be honest.

While I agree with @errorists and I think he makes a fair point that often when we split tasks the second part never gets done, in this case it looks to me that most of the complexity will be the waveform, which is not going to provide the most benefit to the user (although it does look pretty neat), and we might delay the release of this for this.

What about, something like, we start working on the simple version first, and then once that's done we re-evaluate if we still want to keep working on the waveform or release as it is? In that way we can start sorting out any eventual issue with the protocol etc? At least I would focus on the waveform last, to clear out room for testing etc.

In any case, I am ok with anything, that's my 2p.

@flexsurfer
Member

@tbenr we already use https://github.com/react-native-community/react-native-svg it looks like a good fit for this task

@tbenr
Contributor

tbenr commented Jun 8, 2020

@cammellos
ok, I'll start working on the audio module with:

  1. metering feedback while recording
  2. inline base64 data support for playback

agree?

@cammellos
Contributor

@tbenr sounds good to me, thanks!

@tbenr
Contributor

tbenr commented Jun 11, 2020

@cammellos in the end I decided to fork react-native-audio-toolkit, hopefully they'll accept the PRs.

it has been harder to manage the base64 since it is based on AVPlayer instead of AVAudioPlayer. Never worked on objective-c stuff but I finally managed :)

working on this branch:
https://github.com/tbenr/react-native-audio-toolkit/tree/base64-data-url-support

now iOS base64 is working. I'll move to Android, which I hope will be easier for me.

modified their example app :)
image

@tbenr
Contributor

tbenr commented Jun 12, 2020

android works as well. much easier.

image

switching to metering.

@tbenr tbenr mentioned this issue Jun 14, 2020
19 tasks
@gitcoinbot

Issue Status: 1. Open 2. Started 3. Submitted 4. Done


This issue now has a funding of 1.26 ETH (301.51 USD @ $239.29/ETH) attached to it as part of the Status fund.

@gitcoinbot

gitcoinbot commented Jul 7, 2020



Work has been started.

These users each claimed they can complete the work by 1 year, 4 months ago.
Please review their action plans below:

1) tbenr has started work.

Already started. The work is almost done!

Learn more on the Gitcoin Issue Details page.

@gitcoinbot



Work for 1.26 ETH (303.58 USD @ $240.94/ETH) has been submitted by:

  1. @tbenr

@StatusSceptre please take a look at the submitted work:


@gitcoinbot



The funding of this issue was increased to 4.26 ETH (993.43 USD @ $233.2/ETH) .

@gitcoinbot

⚡️ A tip worth 4.26000 ETH (1811.6 USD @ $425.26/ETH) has been granted to @tbenr for this issue from @StatusSceptre. ⚡️

Nice work @tbenr! Your tip has automatically been deposited in the ETH address we have on file.

@gitcoinbot



This Bounty has been completed.

Additional Tips for this Bounty:

  • StatusSceptre tipped 4.2600 ETH worth 2034.33 USD to tbenr.


7 participants