Button to record audio snippets and send them as audio events (voice messages) #1358

Closed
aviraldg opened this issue Apr 10, 2016 · 57 comments

Comments

@aviraldg
Contributor

aviraldg commented Apr 10, 2016

Button, which when pushed records audio and sends it as m.audio that automatically plays.

  • needs support for inline audio recording and file attachment
  • sent audio should auto play
  • but users on the "other side" should have an option to disable auto play
  • custom typing indicator for recording audio?
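Read together, the bullets above describe a small receive-side policy: an incoming voice clip plays automatically unless the recipient has opted out. A minimal sketch of that decision, assuming a hypothetical `autoplayVoiceMessages` setting and a simplified message shape (neither name comes from Riot's code):

```typescript
// Hypothetical, simplified shape of an incoming room message.
interface IncomingMessage {
    sender: string;
    msgtype: string; // e.g. "m.audio" or "m.text"
}

// Hypothetical per-user preferences; the setting name is an assumption.
interface UserSettings {
    userId: string;
    autoplayVoiceMessages: boolean;
}

// Decide whether a received message should start playing on its own.
function shouldAutoplay(msg: IncomingMessage, settings: UserSettings): boolean {
    if (msg.msgtype !== "m.audio") return false;      // only audio can autoplay
    if (msg.sender === settings.userId) return false; // do not replay our own clip
    return settings.autoplayVoiceMessages;            // the recipient's opt-out wins
}
```

Keeping the check in one pure function would let the sender's intent (autoplay) and the recipient's preference (opt-out) be tested without any audio stack.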
aviraldg added a commit to aviraldg/matrix-doc that referenced this issue Apr 13, 2016
@aviraldg aviraldg changed the title from "Push-to-talk" to "Push-to-talk voice messages" Jun 29, 2016
@freelock

I'm interested in exactly this, for a little side project for my daughter... in this case I'm looking to run a Matrix client on a headless device (a CHIP, RPi, etc.) to act as a kids' "walkie talkie". We have two CHIPs arriving in the next week!

I may be able to help on this if I can carve out some time...

@ara4n
Member

ara4n commented Aug 18, 2016

@aviraldg did anything ever land for this?

@richvdh
Member

richvdh commented Sep 28, 2016

I fear this got stuck in matrix-org/matrix-spec-proposals#310

aviraldg added a commit to aviraldg/matrix-react-sdk that referenced this issue Feb 11, 2017
aviraldg added a commit to aviraldg/matrix-react-sdk that referenced this issue Feb 16, 2017
@spacekitteh

spacekitteh commented Apr 10, 2017

close this?

edit: Oh, @aviraldg hasn't merged it into master I think?

@aviraldg
Contributor Author

@spacekitteh There was no consensus on the formats to be supported, so it's still unmerged. I currently do not have the time to create a proposal for it, but I believe the core team would welcome one if it was created and then it could be merged.

@spacekitteh

Just adding a link here to the current PR... matrix-org/matrix-react-sdk#690

@BloodyIron

If this is akin to push to talk in VoIP apps like Mumble, Teamspeak, etc, then I'd be very interested in this getting into riot! :O can use right away!

@BloodyIron

Can we get this bumped to a milestone or higher priority? Seriously, push to talk is actually a big deal for our implementation! I know a lot of other people will just not use Riot because it doesn't have push to talk. The value of this needs to be revisited! :(

@alphapapa

@BloodyIron Mumble and TeamSpeak are real-time voice chat systems. Matrix is message/event-oriented.

It would be interesting if there were an addon to associate Matrix rooms with Mumble servers and channels.

@t3chguy
Member

t3chguy commented Jan 16, 2018

@BloodyIron this is more akin to old Facebook and WhatsApp, record audio at the push of a button then send on release

@BloodyIron

@alphapapa so what? Riot (which is what this is categorised under) has voice and video conferencing. Furthermore Matrix does interface with other voice communication tech such as Freeswitch. So push to talk for Riot makes a LOT of sense, especially in a busy channel.

If there is no push to talk, a lot of people just simply won't use Riot. I'm not just talking about myself, I run a large gaming community, and almost all of them require push to talk for whatever voice tech they use. Which, Riot advertises itself as (voice/video tech).

@BloodyIron

@t3chguy that's going to cause a lot of latency problems, especially if you need to get info to someone in a sensitive time-frame, or if multiple people are talking. Nobody is going to want push to talk if it means they can only talk for a short period before they're heard at all. At scale it's going to lead to one big echo chamber, and actually make things worse.

Have you ever had to listen to your own voice when it was time delayed? It's extremely disorienting, and I can seriously see not sending until release causing new problems.

@t3chguy
Member

t3chguy commented Jan 16, 2018

@BloodyIron it's the concept of sending audio messages, like voicemails; it's not Push-to-Talk for WebRTC calls.

Riot advertises itself as open source Team Collaboration, not voice/video tech.

@BloodyIron

BloodyIron commented Jan 16, 2018

@t3chguy : http://i.imgur.com/hnxQsxc.png "VOIP & VIDEO CALLING"

Also, having the audio not send till release the button will lead to abuse when trolls find it, as they will just send large audio bombs. Seriously, I see no good reason why the audio should not send right when the button is pressed. But I see a LOT more problems being caused if that's the case (audio not sending till release).

@t3chguy
Member

t3chguy commented Jan 16, 2018

That's a feature within

It'd be no different than sending an audio file from your computer, so the concern that "having the audio not send till release the button will lead to abuse when trolls find it, as they will just send large audio bombs" is moot.

This is simply not the issue you should be arguing in. This issue is not for the feature you care about.

Read the OP:

Button, which when pushed records audio and sends it as m.audio that automatically plays.

m.audio is an event type; events are sent after uploading the media they refer to, so they can't be "live".
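The ordering described above (upload the media, then send the event that points at it) can be sketched roughly as follows. In the Matrix spec, `m.audio` is a `msgtype` carried by an `m.room.message` event, with a `url` field holding the `mxc://` URI and an optional `info` object. The `Upload` and `Send` callbacks and the `"Voice message"` body below are hypothetical placeholders for whatever SDK a client uses, not real API names:

```typescript
// Metadata the spec allows in the `info` field of an m.audio message.
interface AudioInfo {
    mimetype: string;
    size: number;     // bytes
    duration: number; // milliseconds
}

interface AudioMessageContent {
    msgtype: "m.audio";
    body: string; // fallback text or file name
    url: string;  // mxc:// URI returned by the media upload
    info: AudioInfo;
}

// Build the content of an m.room.message event for an already-uploaded clip.
function buildAudioContent(mxcUri: string, body: string, info: AudioInfo): AudioMessageContent {
    if (!mxcUri.startsWith("mxc://")) {
        throw new Error("expected an mxc:// URI from the media upload");
    }
    return { msgtype: "m.audio", body, url: mxcUri, info };
}

// Hypothetical transport callbacks; a real client would back these with its SDK.
type Upload = (data: Uint8Array, mimetype: string) => Promise<string>;
type Send = (roomId: string, content: AudioMessageContent) => Promise<void>;

// Upload first, then send the event that references the uploaded media.
async function sendVoiceMessage(
    roomId: string,
    data: Uint8Array,
    mimetype: string,
    durationMs: number,
    upload: Upload,
    send: Send,
): Promise<void> {
    const mxcUri = await upload(data, mimetype);
    const info: AudioInfo = { mimetype, size: data.byteLength, duration: durationMs };
    await send(roomId, buildAudioContent(mxcUri, "Voice message", info));
}
```

Because the event can only be built once the upload has returned a URI, the recipient never sees anything until the whole clip exists, which is why this flow behaves like voicemail rather than a live stream.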

@alphapapa

@BloodyIron I feel like you're lacking some context here. What you're asking for is real-time, multi-party, channel-oriented voice chat, i.e. Mumble, TeamSpeak, Discord.

Discord is interesting to compare Matrix with since it's also got chat rooms like Matrix. But Discord is a large system with lots of funding that runs on AWS. Matrix/Riot is a relatively small, barely funded organization and service. The matrix.org homeserver already gets overloaded and slow sometimes. Providing real-time, channel-based voice chat as you desire would require much more infrastructure.

Ideally, sure, Matrix would provide everything. And maybe someday it will. But in the meantime, something like Mumble already provides real-time, channel-based voice chat in a distributed way, with lots of servers available. If there were a standard way to interface Matrix rooms with Mumble servers and channels, it would require no additional infrastructure on the Matrix side, as well as avoiding reimplementing all the functionality that Mumble provides. And interfacing a Matrix client with the Mumble client could make it work seamlessly and transparently to the user.

Do you understand what I mean?

@BloodyIron

@t3chguy is right, I misread the scope of this particular issue. Sorry about that! I'll open another one more appropriate for what I am seeking. My bad, sorry for stealing your guys' time on my silliness. :(

@alphapapa just to respond to what you said, before I stop being a silly goose in this thread, I'm not talking about when something like what I was talking about would be implemented, more how. But that's not actually relevant to this discussion, so I'm going to exit stage left.

Again, sorry for the misunderstanding!

@t3chguy
Member

t3chguy commented Jan 16, 2018

I do completely agree a PTT for both native calling and Jitsi conferencing would be useful fwiw

@josephtocci

josephtocci commented Jan 16, 2018

For simplicity, here are the two general types of PTT that people use today:

Nextel PTT (Always online, push notifications with audio, Example: Zello)
Remember this? I was too young to have a phone, but my parents each had one and the PTT was far above everything else at the time. There were no smartphones at the time. It was faster than calling, and usually the message got through. These days texting won out, partially because if you put your phone down and came back to it you could just read it. A PTT message you might miss. The other reason texting won out is because you don't necessarily want the person you are talking to IRL to hear what the person on the phone said. There are pros and cons to this method. There is a decent following on Zello which does this. In fact I know someone who uses Zello with all his camping buddies.

Mumble PTT (Must be running, no push notifications for audio, login/join/accept call to use and hear PTT)
This one is more what people will do now. Especially gamers. Even my friend on Zello would probably use this instead. The way he uses Zello is to schedule an hour with everybody, and everybody goes online at the same time and talks. There is no reason to use the always on Zello if you are just going to chatroom with voice. Basically you just mute the microphone, and have a keyboard button to hold down to unmute. (Headphone button on phone) If someone is trying to be quiet IRL and can't get around it, he can plug his headphones in to hear, and type in the room instead of talking. Optionally a text-to-speech thing would be cool, but that is a separate subject.

Those are the two types. I believe some of the confusion when I read this thread is people are talking about two different PTTs. Almost everyone is trying to go the Nextel PTT route, and I will admit it's pretty cool and I want it. But the Mumble type PTT is what people actually seem to need. You set up a room for the game, and everyone logs on and plays the game. (Not necessarily games but games is the common example) Text is king unless you are doing something with your hands. So I recommend we add a temporary unmute button to the native calling and Jitsi conferencing like the previous comment states until someone implements the Nextel PTT version. A keyboard button for desktop and headphone button for phone would be excellent shortcuts for this button.

Edit: Apparently an issue was just added an hour ago for the Mumble-style PTT, #5993. My apologies for spamming a finished thread.
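For the Mumble-style variant described above (microphone muted by default, unmuted only while a key or headphone button is held), the core logic is a small piece of state. A minimal sketch, assuming a hypothetical `setMuted` hook into whatever call stack is in use; nothing here is taken from Riot or Jitsi code:

```typescript
// Mumble-style push-to-talk: the microphone stays muted except while a key is held.
class PushToTalk {
    private held = false;
    // Hypothetical hook that mutes or unmutes the local microphone.
    private readonly setMuted: (muted: boolean) => void;

    constructor(setMuted: (muted: boolean) => void) {
        this.setMuted = setMuted;
        this.setMuted(true); // start muted
    }

    // Call when the talk key (or headphone button) goes down.
    keyDown(): void {
        if (this.held) return; // ignore key auto-repeat
        this.held = true;
        this.setMuted(false);
    }

    // Call when the talk key is released.
    keyUp(): void {
        if (!this.held) return;
        this.held = false;
        this.setMuted(true);
    }
}
```

On desktop, `keyDown` and `keyUp` could be wired to keyboard events; on a phone they could be wired to the headphone button, as suggested above.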

@brettz9

brettz9 commented Sep 19, 2020

Hope this isn't considered as noise, but in case some weren't tracking the news on this, I just wanted to share how the popular WeChat service of China is now to be blocked to millions in the U.S. (as other platforms are already blocked in China) and offer the notion of the potential opportunity for the Matrix platform in appealing to such a large number of users if voice messages were to be implemented.

As a regular user of WeChat (based now in China) to communicate to my relatives back in the U.S., I can personally live without Matrix's current lack of payments compared to WeChat (the only other feature I would miss), but the inability of the client to leave brief voice messages is pretty fundamental.

If that could be implemented, I think it would be a much stronger "sell" to WeChat users, no less given that from my experience, Element video/audio quality has actually been much better and more stable than even WeChat for international communications. (And if payments were integrated, Matrix might become truly a force to be reckoned with, no less given how accustomed Chinese users are to being able to use it for payments everywhere from the subway to restaurants, offices, etc.)

The domain matrix.org is blocked in China as well as the Element app in app stores (but not element.io) but with this of course being a federated, HTTPS-based protocol, unless China were to block all foreign sites, I don't see that China would seek to cut themselves off from the world if the Matrix protocol expands, just as they have not sought to block email as a whole.

While I understand Matrix is being driven by the company behind Element, with a need to spend limited resources in adequate measure on its own interests and sustainable business model, and thus might be concerned about a China focus given that its matrix.org accounts are unavailable here (except to those on VPN), I would think that a greater attention to the ready distributability of the open source Synapse server implementation (e.g., through Ubuntu package managers) might also help gain adoption for the protocol, and once the Matrix protocol became more of a proven federation of servers, China would come to see Matrix.org as just another service in a truly international system, like Yahoo Mail among other internationally distributed email servers, rather than solely as a gateway to the West. They could, as with other countries, still block domains, but not be as likely to block average users, and give up on blocking some whole domains as well. (And such a genuine federation may promote trust in the decentralized, truly open nature of the system elsewhere as well.)

Another complementary approach might perhaps be applying to have a domain and hosting of matrix.org in China. While being subject to its laws (to the extent E2EE would even raise problems), the federation as a whole would not be restricted. This could open you to many Chinese users (and them in turn to much of the world as they can currently through email).

Anyways, I apologize for the seeming tangents, but it does strike me that they could be potentially relevant as far as prioritization of this issue especially. Thanks!

@dbkr dbkr changed the title from "Button to record audio snippets and send them as audio events" to "Button to record audio snippets and send them as audio events (voice messages)" Sep 25, 2020
@EchedelleLR

Any update to this?

@Pablini

Pablini commented Oct 18, 2020

Any update to this?

@Pablini

Pablini commented Oct 18, 2020

Any update to this?
I don't use GitHub that much. Is there a way I can offer a bounty for someone to implement this under a proper free software license and add it to the Android app and Web?

@aaronraimist
Collaborator

@Pablini there are some websites you can use to add bounties to GitHub issues like https://www.bountysource.com

@Atalonica
Contributor

This is the only major missing feature from a user's point of view; I hope it is considered.

@gabmert

gabmert commented Nov 24, 2020

Cross referencing:
this feature is also wished for in android: element-hq/element-android#29
and it seems both are waiting for an update of the Matrix protocol matrix-org/matrix-spec-proposals#2516

@ludwigbald
Contributor

ludwigbald commented Nov 24, 2020 via email

@JimmyCushnie

Indeed. You can already send a voice memo if you record it using a program like Audacity and upload the audio file. My understanding was that this feature request is just for a built-in way to record those audio files.

@morrisonbrett

I think that for Element to really be viable and formidable for mass adoption, all of these features are important. This is a big one. I use voice memos all the time on WhatsApp and Signal. To be able to just tap-and-hold, record, and release is so easy. We need to have this in Element on desktop and mobile.

@DanHakimi

DanHakimi commented Dec 10, 2020

I like this, except for the idea of auto play. None of the proprietary apps that do this have auto play, and it's always a jarring and unfortunate experience to have audio play without your direct and immediate action.

I also think Element needs an audio player. Right now, when I get an audio file, it opens in VLC in the background, and then I can't pause or seek from Element, which is annoying. More annoying -- I can't always even tell if the file has actually loaded. Sometimes I need to click several times.

I suppose that's a separate feature, though. I'll post separately.

@aviraldg
Contributor Author

I'm willing to pick this up again. I assume as long as we just send an m.audio event and don't do anything fancy re: typing indicators, a custom message type or autoplaying audio, a spec change should not block an initial version of this?

@dpflug

dpflug commented Dec 29, 2020

"None of the proprietary apps that do this have auto play, and it's always a jarring and unfortunate experience to have audio play without your direct and immediate action."

Voxer does, actually, and it works well. Its primary use is the audio clips instead of text, so it's not jarring at all. It allows more-or-less natural sounding conversations and makes a 100% eyes-free experience possible, especially if you spring for their premium speech detection feature.

But it's not worth holding up implementation and it probably shouldn't be the default.

@lordadamson

So is this happening or what?

@turt2live
Member

It's on a roadmap. Let's keep the comments in relation to the feature, not the scheduling, thanks.

@element-hq element-hq locked as off-topic and limited conversation to collaborators Jan 11, 2021
@turt2live turt2live added P1 and removed P3 labels Feb 23, 2021
@turt2live turt2live assigned turt2live and unassigned aviraldg Feb 23, 2021
@turt2live
Member

This has now left labs and should be included in the next major release of Element Web (anticipated to be 1.8). It's also available on develop without a labs flag at the moment, and is receiving some final polish before launch: please open new issues for concerns with the feature.
