Porting Teletype from Atom repo a.k.a Collaboration tool #536

Open
1 task done
matbgn opened this issue May 16, 2023 · 23 comments
Labels
enhancement New feature or request

Comments

@matbgn

matbgn commented May 16, 2023

Have you checked for existing feature requests?

  • Completed

Summary

It would be great if we could capitalize on the work done by Atom (https://github.com/atom/teletype) to offer real-time collaboration in Pulsar.

What benefits does this feature provide?

An Atom package that lets developers share their workspace with team members and collaborate on code in real time.

Source: Atom / Teletype

Any alternatives?

Based on the previous work done on Teletype by Atom, I'm convinced that built-in real-time collaboration is a key feature that would give Pulsar a real advantage in the IDE market.

Again, given Atom's previous working solution and Pulsar's newly integrated CI tests, Pulsar seems to be the legitimate repo/owner of this reboot.

Other examples:

  • CodeTogether
  • Microsoft Live Share
  • etc.

All are closed-source solutions. Even VSCodium and its server variant are not able to solve that easily.

But with the previous work done by Atom on Teletype, I guess Pulsar has an advantage in implementing this feature.

@matbgn matbgn added the enhancement New feature or request label May 16, 2023
@Daeraxa
Member

Daeraxa commented May 16, 2023

We have had quite a few discussions on this already (informally on Discord mostly) as it is likely going to be a big project. The main issue is that Teletype runs on a server and we simply might not be able to justify running such a service without charging a maintenance cost for access.

For that reason we were also discussing the possibility of making it easier and simpler for somebody to host their own server or doing something p2p.

It is a huge project and currently most of the team are still very much occupied with some other large scale changes but I'd be keen to see this supported again.

@confused-Techie
Member

I'll also add, since I've personally attempted a few rewrites of Teletype at this point, that the server's functionality is littered with logic that performs logging or other types of data collection that we simply wouldn't be able to utilize (nor want to).

But like @Daeraxa said, I think our most likely path forward would be in the form of a fully p2p implementation, to attempt to cut out the central connection broker server, allowing us to not have to front the cost of such a system, but also remove one of the more common complaints about such a system.

@matbgn
Author

matbgn commented May 18, 2023

What about just open-sourcing the server running part with, say, a documented docker-compose.yml?

I mean, a lot of the open source community is concerned about data collection, but the solution of self hosting the server part is a good solution that is accepted by the vast majority.

To be more issue oriented and less debate focused, wouldn't it be a solution to document the server running process to capitalize on Teletype's previous work and address this issue quickly to give us plenty of time to address the p2p full rewrite, as it could take years for the community to finalize a working solution?

What would be the macro steps to solve this Issue? In what form would a merge request be accepted?

@confused-Techie
Member

@matbgn I appreciate the issue focused approach here. So, since we only have access to what Atom had open sourced originally, we do have access to the already open sourced code of teletype, meaning that if anybody wanted to they could absolutely set up their own instance.

So just to make sure we are on the same page, here's what we have of teletype, which as far as I know is everything needed to get it working:

  • teletype: This is the package you install itself
  • teletype-client: This is the actual networking client to the teletype functionality
  • teletype-crdt: This is the CRDT logic allowing for two people to edit the same text file
  • teletype-server: This is the server itself, the real important bit in our discussion.
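
To make the CRDT bullet concrete, here is a minimal sketch of the general idea: every character gets a globally unique id and a position, so operations from different replicas can be applied in any order and still converge. This is a toy position-based design written for illustration; it is not the algorithm teletype-crdt uses, and the class and method names are hypothetical.

```python
from fractions import Fraction


class TextReplica:
    """Toy position-based sequence CRDT (illustrative; not teletype-crdt's algorithm)."""

    def __init__(self, site):
        self.site = site          # unique replica name, also used to break position ties
        self.counter = 0          # per-site counter so every element id is unique
        self.elements = {}        # element id -> (position, character)
        self.tombstones = set()   # ids of deleted elements

    def _visible(self):
        live = [(pos, eid, ch) for eid, (pos, ch) in self.elements.items()
                if eid not in self.tombstones]
        return sorted(live)       # order by position, then by id

    def text(self):
        return "".join(ch for _, _, ch in self._visible())

    def insert(self, index, ch):
        """Insert ch at a visible index and return the operation to broadcast."""
        visible = self._visible()
        left = visible[index - 1][0] if index > 0 else Fraction(0)
        right = visible[index][0] if index < len(visible) else Fraction(1)
        self.counter += 1
        op = ("insert", (self.site, self.counter), (left + right) / 2, ch)
        self.apply(op)
        return op

    def delete(self, index):
        """Delete the character at a visible index and return the operation."""
        op = ("delete", self._visible()[index][1])
        self.apply(op)
        return op

    def apply(self, op):
        """Apply a local or remote operation; the order of application does not matter."""
        if op[0] == "insert":
            _, eid, pos, ch = op
            self.elements[eid] = (pos, ch)
        else:
            self.tombstones.add(op[1])


a, b = TextReplica("A"), TextReplica("B")
for op in (a.insert(0, "h"), a.insert(1, "i")):
    b.apply(op)

# Concurrent edits: each replica edits locally, then the operations are exchanged.
op_a = a.insert(2, "!")
op_b = b.insert(0, ">")
a.apply(op_b)
b.apply(op_a)
print(a.text(), b.text())
```

Because the replicas only exchange operations, whatever sits between them (a server or a direct peer connection) never needs to understand the text itself.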

So right now, anyone could technically run the teletype-server themselves and get it up and running, the teletype package even lets you put in a custom URL to use as the server, so this would be very very possible.

The issue isn't that it can't be set up, or that the community doesn't already have access to the code. The issue really is that teletype-server is ingrained with the setup that Atom had created. To move away from it in any reasonable manner would be very difficult.

Right now, the source code of teletype-server is completely reliant on a few different services, some of which are paid (which has been our big blocker in setting it up as is):

  • It requires an SQL database
  • Requires using Pusher
  • Requires Bugsnag
  • Requires a Boomtown setup
  • Requires having GitHub credentials stored, which also means figuring out rate limiting
  • Requires having a GitHub OAuth sign in
  • Requires a Twilio account (as far as I can tell, Twilio provides the ICE server for WebRTC)

Now from there, almost all of these integrations do require payment in some way for any reasonable functionality.
Plus it doesn't help that the only doc to accompany the server is a single deployment doc, which essentially says who to ping to get it deployed after testing. It also really doesn't help that the SQL database migrations are what I can only describe as purposefully misleading, making it intentionally confusing to get a functioning SQL setup.

So with all that said, and making it obvious what our starting point is, let's get into how to fix it.
Now the best success I've had at rewriting this is on my what-if-i-just-rewrote-teletype branch, which does "moderately" function, as in it properly has the SQL DB working as needed, does direct clients correctly to connect, but in the end the clients will always fail to connect.

So as for solving this issue right now, I'd see really two best ways forward here:

And I want to add, while I could relatively easily get this into a docker hostable setup, I don't think that really addresses the problems I'm seeing with it, unless I'm underestimating how willing people would be to rely on third party paid services, but continuing with that assumption, here's what solutions I'm seeing.

  • We find a way to get my rewrite functional. My rewrite forgoes many of the services that are relied on, except Pusher. Again, the free tier of this service would very quickly run out if teletype saw any real usage, and would require us to then pay for it. (And it's only as I type this out that I realize the missing component may in fact be Twilio.)
  • The other solution, would be independent of the work I've done, we find some way to cut down the third party services used, while having the server function as it's supposed to.

So really, in short the macro steps needed to solve this issue:

  • Lower the barrier of entry into having a functional teletype server. That is:
    • Confirm, and simplify the SQL database setup needed
    • Remove any and all Boomtown and Bugsnag integrations (as these only perform data collection and logging, which is not something we particularly want or need to continue)
    • Find a new ICE server, or decide we would rather pay for the Twilio ICE server
    • Find out if there's another solution for Pub/Sub than Pusher, or decide we want to pay for this service
    • Then for the GitHub OAuth, we do already have one for Pulsar accounts, so we can likely use that, but wouldn't help any community hosts of teletype as they would have to create their own
  • Update teletype-client as needed, to support and work with any changes made to teletype-server
  • Update teletype to utilize any changes that may or may not have had to be made in teletype-client

So sorry, I know that was a lot, but hope it can point out a bit why this hasn't been done yet, since we do want to get it done, and all of that leads to why I've partially nearly decided a full rewrite is the easier answer.

I do want to also point out, if people really really just want collaborative editing, we can do it totally differently. As in having a single centralized server, this is something that we can already do, as we can see with our backend, and frontend websites. It is something I'm already super familiar with. But would then remove any advantages we gain by being peer to peer, and would still up our costs for hosting.
But if the community felt, that they would rather have ANY solution to collaborative editing, even if just temporarily, I'd be more than happy to implement a single central server that makes collaborative editing possible. Which in that case, since I'm a stickler for not wanting to rely on external services, it would very likely be pretty easy to then just hand out a docker-compose.yaml that others could use to host this server themselves.


Also a quick aside, I totally agree that it would be best to document the server running process. But I do want to make it clear, the Pulsar team has never had any assistance or communication from the original Atom team. Nor have we ever been given any special kind of access. We have the same exact resources as anybody else would, anything we learn has to be learned from whatever documentation existed for Atom and their tools before they took everything offline, or the source code itself. So there's a lot we have to just figure out by staring at it for a few hours lol

@matbgn
Author

matbgn commented May 19, 2023

What an amazing answer! You really made my day, so thank you for your thoughtful and complete response 🙏

First of all, I totally agree that we need to get rid of all non-open source third parties.

Now what do you mean by "ANY"? How big is the technical debt associated with this solution, what does it mean for further development of cleaner p2p solutions, and how do you think we would pay it back? In most cases I would vote for a quick "go2market" solution, especially if you are familiar with the technical stack. Which library is it based on, by the way? Hoping I, or someone else, can contribute 😉


Finally, regarding the Pulsar team, I meant no offence, apologies if you felt that way, and sorry for my English in general. I wish this project all the best and would be happy to help in any reasonable way I can 😄.

@confused-Techie
Member

@matbgn So to start at the end of your message, I'm sorry if my tone was inaccurate. You caused zero offence, absolutely no apologies needed, and I honestly had zero clue English wasn't your primary language, so no sweat! You're fantastic!


But what I meant by 'ANY' solution, is simply, does our community just want collaborative editing, no matter how it's implemented? Or is the want of teletype, truly for the peer2peer nature of teletype itself? As in, what's the most important, collaborative editing, or collaborative editing via p2p technology?

But interesting to hear you yourself would vote for a "go2market" solution, since that's what the alternative would be. But @Daeraxa and I were just talking about what pursuing something like this would mean for a proper implementation, and we essentially seemed to come to the concern that if we instead pursued a centralized server methodology for collaborative editing, it would more than likely delay or completely kill a proper p2p implementation. If only because the time needed to maintain it afterwards would all be time taken where we couldn't work on the proper solution.

@Daeraxa said this best: "For example if the p2p implementation was going to take 300 work hours to complete and [a centralized server] took 200, is it worth spending that 200 just to delay the p2p one and end up deprecating this [the centralized server]?"


But if I did pursue a centralized server, I would very likely want to use the same stack that we currently use on the Pulsar Package Registry, which you can view the source code on pulsar-edit/package-backend. But it's a tech stack of mainly the following:

  • JavaScript
  • ExpressJS
  • PostgreSQL
  • Google Cloud Platform's App Engine

@matbgn
Author

matbgn commented May 20, 2023

OK, the central question is first of all the cost comparison. Based on your quote, I did not expect 200 hours.

I'm already super familiar with it.

Looking at the stack you pointed out, I don't see a core feature for real-time collaboration on flat files. I was looking for something like a CRDT, with this kind of library in mind: https://github.com/yjs/yjs

As an aside, since I'm talking about the yjs library, a central server (to initiate communication) relayed to p2p is also a possibility, as from what I've heard so far yjs seems relatively straightforward to implement. Maybe you or @Daeraxa could give some feedback on the technical feasibility of implementing yjs in the Pulsar codebase?

@Daeraxa
Member

Daeraxa commented May 20, 2023

Based on your quote, I did not expect 200 hours.

This was just me giving a random example to get my point across, I honestly have no idea what the effort would actually be.

Thanks for the link to that library, I hadn't seen that one before - from a technical standpoint I'm definitely not the one you want to ask so I'll leave that one up to others :)

@confused-Techie
Member

@matbgn
So a few points. Let's pretend we did move forward with scrapping the code of teletype-server: we would still absolutely be able to use teletype and teletype-crdt (as well as teletype-client, but that's less relevant to our discussion). That means that even if we scrap the server for something simpler to implement, we still already have a CRDT module that's made for teletype, and for Pulsar by extension. Not that the library you shared doesn't look very interesting.

Since as it is now teletype-server is pretty data agnostic, as in it doesn't care what data is being passed, the actual handling of data (in reference to CRDT methods) is totally handled client side, not server side. A big reason from what I can gather, is the text itself is encrypted between each client, so the server couldn't modify it if needed.

That's why the tech stack I shared doesn't seem like it's helpful for a collaborative editor, it's not. The tech stack I shared doesn't care what the data is used for beyond the internal structures that's used to store data in transmission.


As for the feasibility of implementing yjs into a collaborative editing experience for Pulsar, it seems very possible. Although I am slightly nervous since I can see their p2p implementation uses libp2p, which is actually one of the earliest libraries I pursued as an alternative setup for teletype and found their documentation very lacking. As in libp2p's docs were very outdated, and just about every code snippet I looked at didn't work. But it's totally possible that y-libp2p has done something different, and alleviated my concerns there, so I'll want to take a further look.


And lastly, you are totally right that having a central server to help initiate a p2p connection is possible. Thing is, that's basically what all p2p is. Whether you use a STUN, TURN, ICE, ultrapeer, or relay server to get clients connected, nearly every popular, modern form of p2p relies on some centralization, so following this, is essentially going the route of a total rewrite of a p2p implementation, which now we've come full circle lol.

So it does make me consider, if our best bet isn't just finding the right answer in terms of technology we want to use to pursue a p2p implementation, since of course we wouldn't want to create something brand new from the ground up, although was still holding onto hope to move away from centralization totally, to forgo the hosting costs. But fingers crossed y-libp2p does better than libp2p.


Also last note, yeah 200 hours is a totally made up number lol

@matbgn
Author

matbgn commented May 25, 2023

Thank you very much for these technical details. I must admit, I know very little about CRDT and p2p protocols in general.

If we summarize the effort to be made in terms of hours, and considering that 200 hours are overestimated, what would be your best estimate?

Based on Teletype's code inspection in terms of LOC with scc, a very first estimation for rewriting from scratch based on the p2p protocol is more than 2'000 hours (yes, you read it right 2k - see below for details, but take it only as a base of discussion).

The question is therefore, as @Daeraxa pointed out, to know on the basis of this evaluation, what would be the best route to take. But if I understand correctly, the gain factor is roughly 10x, actually.

Moreover, in either case and in a solution-oriented approach, what would be the next step to take in terms of code if we could split this challenge into smaller chunks? I'm thinking, for example, of information of lesser importance, such as just getting the status of connected clients, the position of the cursor, the name or any other simplistic idea that would allow a first Pull Request to see the light of day if we were to start with a complete rewrite.


───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines   Blanks  Comments     Code Complexity
───────────────────────────────────────────────────────────────────────────────
JavaScript                  54      6251     1010       111     5130        321
Markdown                    12       627      251         0      376          0
JSON                         3      4242        0         0     4242          0
LESS                         1       417       66         3      348          0
YAML                         1        42        7         0       35          0
gitignore                    1         2        0         0        2          0
───────────────────────────────────────────────────────────────────────────────
Total                       72     11581     1334       114    10133        321
───────────────────────────────────────────────────────────────────────────────
Estimated Cost to Develop (organic) $307,340
Estimated Schedule Effort (organic) 8.78 months
Estimated People Required (organic) 3.11
───────────────────────────────────────────────────────────────────────────────
Processed 411548 bytes, 0.412 megabytes (SI)
───────────────────────────────────────────────────────────────────────────────
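
For anyone wondering where the schedule and headcount figures come from, the sketch below applies the textbook Basic COCOMO "organic" coefficients (2.4, 1.05, 2.5, 0.38) to the 10133 lines of code reported above. That scc uses exactly these coefficients is an assumption on my part, although they do reproduce the 8.78 months and 3.11 people shown in the report. The 152 hours per person-month is the usual COCOMO convention and is likewise an assumption, so treat the hours figure as a rough base for discussion only.

```python
def cocomo_organic(sloc, hours_per_person_month=152):
    """Basic COCOMO estimate for an 'organic' project (textbook coefficients)."""
    kloc = sloc / 1000
    effort = 2.4 * kloc ** 1.05        # person-months
    schedule = 2.5 * effort ** 0.38    # calendar months
    people = effort / schedule         # average team size
    return {
        "effort_person_months": effort,
        "schedule_months": schedule,
        "people": people,
        "hours": effort * hours_per_person_month,  # assumed conversion factor
    }


estimate = cocomo_organic(10133)
for name, value in estimate.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the effort comes out at roughly 27 person-months, i.e. a little over 4,000 hours, which is consistent with the "more than 2'000 hours" ballpark above.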

@confused-Techie
Member

@matbgn Very interesting approach you've taken to determining the cost of producing the code. I've never seen this tool before so will have to take a look at it.

But in terms of what we need to do first, we need to consider, that Teletype is fully written, Teletype-Client is fully written, as well as Teletype-CRDT, ideally, we just rewrite Teletype-Server, and make it compatible with these other items.

So the first order of business is research, determine exactly what each of these expects, and determine what data structures they are intending to work with.

Then from there, it'd be research to determine how we do that on the server side.

You are correct that if we rewrote everything from scratch, we would need to set much smaller goals, such as seeing other clients and so on. If we pursued a centralized approach, our goal would be much higher: first a functional server that can receive data and return it as expected, then one that works with the existing code.

But it sounds like we may all be thinking sticking with p2p is our best path forward. So considering that I'll stop responding for other methodologies.


Getting started with rewritten p2p:

While I can't say definitively what we would need to do for a first pull request, I can at least lay out what we need to get started:

  • We need to determine the technology we plan to use to make everything work. By researching thoroughly into what's available, and what fits our needs best.
  • Ideally, once we determine what's best, we can then find actively maintained and working modules that will allow us to quickly use these technologies rather than implement from scratch.
  • Once we have determined the tech stack, we then need to mockup what the application structure will look like.
  • Finally, once we have all the above, we will be able to start implementing any code to do this.

So if we are committed to rewriting the p2p stack used within Teletype, I think the first order of business is to determine what tech we use to support this.

Obviously, we want to use a Pub/Sub system to communicate between clients. But then the big question, is how do we set up their connection? There's a few choices in this space, each with their own pros and cons, but some of these may be

  • TURN
  • STUN
  • Relay
  • Ultrapeer
  • ICE

When looking at which of these we want to use, we also should consider, are they fully open source? Are there any public services we could use to already implement this? What are the resources needed to host this if we have to do it (What would that cost us to do so)?

So determining that bit is probably our next step to start implementation.

@zhuyifei1999

Obviously, we want to use a Pub/Sub system to communicate between clients. But then the big question, is how do we set up their connection? There's a few choices in this space, each with their own pros and cons, but some of these may be

Hi, I thought to comment on this since I've written p2p hole punching for a personal project of mine

STUN

All STUN really tells you is your external IP : Port pair as observed from the STUN server. Unfortunately this is insufficient, because a lot of NATs will use different external ports when the destination of the packet is different, leaving the port you get from STUN unusable for your peer. So STUN can be a useful reference, but not something to rely on alone.
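
A toy simulation of that port-remapping problem, with no real networking involved: one NAT model reuses a single external port per internal port, the other allocates a fresh external port per destination. The class names, addresses (taken from the documentation-only IP ranges), and port numbers are all made up for illustration.

```python
import itertools


class SymmetricNat:
    """Toy NAT that allocates a fresh external port per (internal port, destination)."""

    def __init__(self, external_ip, first_port=40000):
        self.external_ip = external_ip
        self._ports = itertools.count(first_port)
        self._mappings = {}

    def external_endpoint(self, internal_port, destination):
        key = (internal_port, destination)
        if key not in self._mappings:
            self._mappings[key] = next(self._ports)
        return (self.external_ip, self._mappings[key])


class ConeNat(SymmetricNat):
    """Toy NAT that reuses one external port per internal port, whatever the destination."""

    def external_endpoint(self, internal_port, destination):
        return super().external_endpoint(internal_port, None)


STUN_SERVER = ("198.51.100.1", 3478)   # hypothetical STUN server
PEER = ("203.0.113.7", 50000)          # hypothetical remote peer


def stun_result_is_usable(nat, internal_port=5000):
    """True if the endpoint the STUN server observes is the one the peer will see."""
    seen_by_stun = nat.external_endpoint(internal_port, STUN_SERVER)
    seen_by_peer = nat.external_endpoint(internal_port, PEER)
    return seen_by_stun == seen_by_peer


print("cone NAT:", stun_result_is_usable(ConeNat("192.0.2.10")))
print("symmetric NAT:", stun_result_is_usable(SymmetricNat("192.0.2.10")))
```

With the destination-dependent model, the port learned via STUN is simply not the port the peer would have to send to, which is the failure described above.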

Relay
TURN

Relay == TURN (Traversal Using Relays around NAT). This isn't really p2p but it can be a last resort, sure.

Ultrapeer

Never heard of this.

ICE

This looked really promising, in that it gathers address candidates that the host could potentially use to connect to the peer.

I initially tried STUN, then kept getting reports of it not working. I didn't use ICE at the time because I didn't want to read the RFC that closely. I searched for a library to use and found libnice, but I didn't want to introduce glib as a dependency.

For my project, in the end it was just simpler for me to invent my own protocol. The following is what I wrote at the time. You can substitute "VMs" / "switch" (they refer to the Nintendo Switch) with "clients". We use a mesh of p2p connections where every client connects to every other client, instead of having one client only connect to one other client. There's also a server that does some initial message exchange (such as exchanging IP and port numbers and performing STUN) before the p2p connection is established.

[1:26 PM] ok I got an idea for how to do this on a high level, even in the [...]'s case of immediate port remap
[1:27 PM] instead of one port per VM that's used for all outside communication
[1:28 PM] use a mesh. each port will be talking to one other remote switch only
[1:31 PM] say A B and C are connected to the network, and D is joining the network. then A B and C will receive a message about D joining and D will receive a message about A B and C already being here.
[1:33 PM] A B and C will then open up a new port each and D will open up three new ports. let's say we are trying to establish a connection between A and D. A will attempt to send a packet to D's published port number for A, and D will send a packet to A's published port number for D
[1:36 PM] let's call that packet 1
[1:36 PM] when either of them see packet 1, they respond with what port number they are seeing and use that as the 'port to use', let's call this packet 2
[1:41 PM] say if packet 1 is A -> D, then packet 2 is D -> A, and if A can see packet 2, that have the expected port number, then it sends packet 3, A-> D, that it can see D so the bidirectional-communication is complete.
[1:42 PM] this is like a SYN; SYN ACK; ACK in TCP
[1:43 PM] at the same time, this sequence will happen the other way, with D->A trying to establish a connection
[1:46 PM] if both connections are established then the port number of both sides are the expected number. since during packet 2's processing both sides will check if the other's port is the published port number
[1:47 PM] so if both are established then both are using published port number, and everything is great
[1:47 PM] let's say neither connection can be established in say, 5 seconds
[1:52 PM] then both sides will attempt to use STUN to get its external port number, and publish it if it's different from what it sees directly. when one sides, say A, publish their STUN'ed port, the other side D will attempt to send to A's STUN'ed port instead, trying again to establish a connection D -> A
[1:53 PM] and if this still doesn't work, then I don't think p2p can be established between A and D.
[1:55 PM] since these are connections that might break any time, once a second each VM sends a keep alive packet to every other node so NATs won't suddenly forget the connections
[1:55 PM] if it does, and we'll just hope that it doesn't happen both directions at the same time
[1:56 PM] if one NAT forget a connection, it might at the next outgoing packet start remapping ports again
[1:57 PM] assuming the other side did not remap, we just change the port number when we send packets out and be done with it
[1:57 PM] ^ this is what I originally wanted to do
[1:58 PM] this is so complicated. don't think I'll finish in a weekend :/
[1:59 PM] this will be a breaking change so after the change and before the change cannot see each other
[2:00 PM] UDP is supposed to be simple... now I feel like I'm reinventing TCP
[2:02 PM] oh wait that won't work even
[2:04 PM] if I send A->D, D won't ever receive it unless D sends to A first
[2:04 PM] because of NAT
[2:04 PM] and if D sends to A A won't ever receive it :/
[2:07 PM] so I guess I'll just blindly rely on timing then

Although I should mention that even in this case there are still combinations of NATs that cannot p2p each other, so I implemented a relay later when all else failed.
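
The three-packet exchange from the chat log (probe, observed-port reply, confirmation) can be sketched as a small state machine. This is a simplified reading of the log, not the actual project code: the names are hypothetical, delivery is simulated by plain function calls, and it deliberately ignores the NAT filtering problem raised at the end of the log (a packet not arriving at all unless the other side has sent first).

```python
from dataclasses import dataclass


@dataclass
class Packet:
    kind: int               # 1 = probe, 2 = observed-port reply, 3 = confirmation
    observed_port: int = 0  # only meaningful for kind 2


class PunchSession:
    """One endpoint's view of the handshake with a single remote peer."""

    def __init__(self, my_published_port):
        self.my_published_port = my_published_port
        self.established = False

    def start(self):
        return Packet(kind=1)

    def on_packet(self, packet, source_port):
        """Handle a packet that arrived from source_port; return a reply or None."""
        if packet.kind == 1:
            # Tell the sender which source port we actually observed.
            return Packet(kind=2, observed_port=source_port)
        if packet.kind == 2:
            # Only accept if the peer saw the port we published (no NAT remap).
            if packet.observed_port == self.my_published_port:
                self.established = True
                return Packet(kind=3)
            return None  # remapped: fall back to STUN, then to a relay
        if packet.kind == 3:
            self.established = True
        return None


# A publishes port 4001, D publishes port 5001, and neither NAT remaps.
a, d = PunchSession(4001), PunchSession(5001)
reply = d.on_packet(a.start(), source_port=4001)   # packet 1: A -> D
confirm = a.on_packet(reply, source_port=5001)     # packet 2: D -> A
d.on_packet(confirm, source_port=4001)             # packet 3: A -> D
print(a.established, d.established)
```

If D had observed a different source port (say 4777) for A's probe, A would see the mismatch in packet 2 and never send packet 3, which is the point where the log falls back to STUN and, eventually, a relay.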

Though, I'm not exactly sure how all the p2p libraries in the wild perform hole punching, so they may perform better than my reinvention. (I do know that Nintendo is able to establish connections in some cases where mine cannot.)

@confused-Techie
Member

@zhuyifei1999 I really appreciate the info! Yeah, I know my short list left out a bit of information there, but you do seem to have some really cool ideas around how to perform hole punching for these purposes.

One thing that I'm hoping to be able to use, since I'm wanting to pursue a Sub/Pub communication style, I was aiming to implement a gossipsub, which would allow data to travel between nodes on a network to others. Which would solve the problem you pointed out of D not being able to communicate directly with A. Since if A sends information on a gossipsub network, then it'll be 'gossiped' about from B to C to D, and vice versa. Obviously, it's better to maintain multiple peers rather than a single one, to avoid being cut off from the network and having to rejoin, but on the small scale this would be a methodology to fix that issue.
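
A minimal sketch of that relaying idea, assuming plain flooding over whatever links exist. Real gossipsub implementations do considerably more (mesh maintenance, heartbeats, message caches), none of which is modelled here; the function name and the A/B/C/D topology are just for illustration.

```python
from collections import deque


def gossip(links, origin):
    """Flood a message from origin over undirected links; return hops per reached node."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    hops = {origin: 0}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for peer in sorted(neighbours.get(node, ())):
            if peer not in hops:          # forward only the first time a node sees the message
                hops[peer] = hops[node] + 1
                queue.append(peer)
    return hops


# A and D have no direct connection, but the message still arrives via B and C.
chain = [("A", "B"), ("B", "C"), ("C", "D")]
print(gossip(chain, "A"))

# With an extra link the network survives losing a single peer.
print(gossip(chain + [("A", "C")], "A"))
```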


As for what protocol to use to implement the initial bootstrapping into the network, I am starting to think it might be worth it to study up on an ICE implementation, because (if memory serves) ICE actually includes STUN and TURN within it, so it's able to do quite a bit for us in terms of bootstrapping. But I do appreciate the info and link.

@schadomi7
Contributor

Hi, sorry to drop in like this. I only noticed this issue now.
Also, I apologize in advance for the long comment.

I made a self-hosted teletype version:

file.webm

(here is an embedded video file, it shows the teletype-diy package in action)

It works without Github-Login, Pusher, twilio, etc. It uses socket.io for a really simple signaling-server. Currently, it works without any NAT traversal, I guess it should work over the internet with IPv6. Though I did not test that, yet.

I seem to remember that the original teletype could also share the project files in the tree-view. That does not work, yet.
Have not really looked into it, to be honest.

I was planning on publishing it as a community-package, but then I noticed this issue.
What do you think?
Do you want to take over, or are you set on writing your own implementation? I only fixed what broke in the teletype package. I did not run the tests, though those are probably broken.
I wish I had seen this issue earlier, and in particular the what-if-i-just-rewrote-teletype branch. I only glanced at it, but I believe I made mostly the same changes...

PS: I used your usernames in the video, this only fetched the user profile icons, which are public. No other information is fetched from Github. I hope that is ok.


A little bit of background:
On Friday, I was talking to a colleague about pair-programming while in home-office.
I mentioned that there was this really awesome thing called teletype.
They seemed interested to give it a try, but upon installing it failed to connect.
And that is the story of how I spent a rainy Saturday.
As a student, I got no money.
Therefore, I tried to get rid of any external service.
I do not know how to publish a node package; heck, I do not even really know JavaScript.


Atom was my favorite editor. Thank you for making my new favorite editor more awesome every day!

@confused-Techie
Member

@schadomi7 This sounds amazing!
And I think far too much time has passed for me to be set on writing my own lol, even if it's something I'd still like to take a shot at. But if the community (yourself) is offering up a solution on a silver platter it seems irresponsible to not consider using that.

Now of course you can publish and take ownership, but if you'd like us to do so, that's something I'd be much more than happy to discuss with the team so that you don't have to worry about cloud hosting and long term ownership.

But I'd love to take a look at the source and see what you did to get it working, and see if I can get everything else fully functional. Mind sharing a link?

@schadomi7
Contributor

schadomi7 commented Feb 12, 2024

I do not know why it works, either. Something must still be wrong with it, other than the missing ICE/STUN/TURN-Stuff, I mean.
When I tried to connect my desktop and my notebook, it worked and failed as follows:

Share      Join       signal/api server   status
notebook   notebook   desktop             works
desktop    desktop    desktop             works
notebook   desktop    desktop             works
desktop    notebook   desktop             works? - PS: I'm sure that it did not work yesterday, but before I posted I went back and checked, and now it also works??

I would guess, the signal-server is not working correctly. I only guessed what the PusherClient is doing with the channels (based on function names), replicated that behavior with socket.io rooms, and moved on when it seemed to work.
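
To illustrate what "channels replicated as rooms" means here, the sketch below is an in-memory relay where clients join named rooms and a published payload is delivered to everyone else in that room. It does not use socket.io or Pusher, and the class, method, and room names are hypothetical; how the real signal-server maps Pusher channels onto rooms is not shown in this thread.

```python
from collections import defaultdict


class SignalRelay:
    """In-memory stand-in for a room-based signaling server (illustrative only)."""

    def __init__(self):
        self.rooms = defaultdict(set)      # room name -> ids of joined clients
        self.inboxes = defaultdict(list)   # client id -> delivered (room, sender, payload)

    def join(self, client_id, room):
        self.rooms[room].add(client_id)

    def leave(self, client_id, room):
        self.rooms[room].discard(client_id)

    def publish(self, sender_id, room, payload):
        """Deliver an opaque payload to everyone in the room except the sender."""
        recipients = self.rooms[room] - {sender_id}
        for client_id in recipients:
            self.inboxes[client_id].append((room, sender_id, payload))
        return len(recipients)


relay = SignalRelay()
relay.join("host", "portal-room")    # hypothetical room name
relay.join("guest", "portal-room")

# The relay never inspects the payload, so it can be an encrypted blob.
delivered = relay.publish("guest", "portal-room", b"webrtc-offer-bytes")
print(delivered, relay.inboxes["host"])
```

The relay only needs to know who is in which room; it treats the signaling payload as opaque bytes.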
Anyway, I still have to test whether it works when the signal/api server is not one of the participants and they have to do more networking than connecting to localhost.
Also, I only really tested on linux. Don't have Windows or macOS machines available to me.

I don't mind sharing a link, and I really should have done so already. I was hoping to get a chance to clean up some of the mess I caused, but I see now that was futile.
As you also wrote above, I needed to touch teletype-client and teletype-server as well. Together with the signal-server they live here for now: https://github.com/teletype-diy/

I think it would be best if teletype became/stayed a core package. I, personally, would really like to retain an easy possibility to self-host it.
Ultimately, I think it would be cool if it became true P2P with a distributed hash table or something. But, as I understand it, that would necessitate a much greater rewrite or a new approach, as whoever has access to the portal names (the two concatenated uuids, host_peer_id in table portals), can effectively connect to your editor (and steal your code or save malware).

Maybe that could be solved/mitigated by just using the portal name as the invite-link; as it stands, the single uuid in the invite-link does nothing more than facilitate the lookup in the database for the real portal name. That would also eliminate the PostgreSQL requirement, and maybe even the API server, too. Just thinking out loud, not like I really know what I am talking about.

I think I will keep on experimenting with it, at least for a bit. Feel free to pick and choose what you need for the core package.

PPS: Maybe it would be sufficient to ask the user for confirmation before a remote user connects.

@schadomi7
Contributor

I did some stuff. I continued to hack on my self-hosted version, here are the broad changes:

  • no API-server anymore, teletype-server is not used at all.
  • ClientIds/Remote PeerIds/portal names are directly in the invite-link now, no need for a central instance to store and arbitrate them.
  • signal-server now has a Dockerfile/docker-compose config for convenient hosting. It uses an in-memory database by default; optionally you can enable a PostgreSQL database for (hopefully) near-limitless scalability, as this also lets you run multiple signal-server instances. (I plan on running a load test, WIP.)

Other than that, I managed to successfully test a connection between two instances over the internet. Both had an IPv6 address, so still no NAT traversal. The signal-server was a third party.
I plan on trying https://github.com/coturn/coturn as an ICE/STUN/TURN server. In my understanding, it should be mostly a drop-in solution.

I added a tag no-api-server for this version, and also a tag first-draft for the version last week.

I had some other ideas, like encrypting the signaling or using base62-encoded UUIDs. I would love to get some feedback on them and on the changes above, too. From what I understand of the direction and intention of the discussion above, I think they are aligned with the same goal.
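
As a rough illustration of the base62 idea, a UUID can be treated as a 128-bit number and re-encoded, which shortens it from 36 characters to at most 22. The function names and the alphabet order below are assumptions for this sketch, not something teletype-diy defines.

```javascript
// Alphabet order (digits, upper case, lower case) is an assumption of this sketch.
const ALPHABET = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'

// Interpret the UUID's 32 hex digits as one big integer and write it in base 62.
function uuidToBase62 (uuid) {
  let n = BigInt('0x' + uuid.replace(/-/g, ''))
  if (n === 0n) return '0'
  let out = ''
  while (n > 0n) {
    out = ALPHABET[Number(n % 62n)] + out
    n /= 62n // BigInt division truncates
  }
  return out
}

// Reverse the encoding and restore the 8-4-4-4-12 grouping.
function base62ToUuid (text) {
  let n = 0n
  for (const ch of text) {
    const digit = ALPHABET.indexOf(ch)
    if (digit < 0) throw new Error(`invalid base62 character: ${ch}`)
    n = n * 62n + BigInt(digit)
  }
  const hex = n.toString(16).padStart(32, '0')
  return [
    hex.slice(0, 8), hex.slice(8, 12), hex.slice(12, 16),
    hex.slice(16, 20), hex.slice(20)
  ].join('-')
}

const sampleUuid = '123e4567-e89b-12d3-a456-426614174000'
const shortId = uuidToBase62(sampleUuid)
const restoredUuid = base62ToUuid(shortId)
```

Since a portal name is two concatenated UUIDs, each half could be encoded this way and joined with a separator that is not part of the alphabet.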

@confused-Techie Any news from your side? Were you able to reproduce my results? What would still need changing in your opinion?

If you do not mind, I could prepare merge requests with the changes mentioned (or a subset of those), maybe that provides a better forum to discuss changes.

@YoSiJo

YoSiJo commented Feb 21, 2024

Please note that when using STUN/TURN etc. it would be important to take SRV records into account.
Since the whole thing will possibly work without a central instance, I would have considered the following URL scheme:

0. Foreword

Since the list in the URL could become very long for some participants, I would suggest that gzip and base64 should be used by default.
Example:
pulsar://localhost/ipv6=::&ipv4=0.0.0.0&stun=0.0.0.0&token=foo

 echo 'MemberIPv6=...&MemberIPv4=...&stun=...&token=...' | gzip | base64 | urlencode

pulsar://localhost/H4sIAAAAAAAAA8ssKDOztbJSyywoM7E10ANDteKS0jw4pyQ%2fOzXPNi0%2fnwsA2itrPCwAAAA%3d%0a

Specifications such as MemberIPv6, MemberIPv4, MemberDNS, and Member can be specified several times and should each time stand for one client.
Specifications such as stun, token etc. should be unique.
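
The `gzip | base64 | urlencode` pipeline above could look roughly like this in Node.js. `packInvite` and `unpackInvite` are hypothetical names, and the `pulsar://localhost/` prefix simply follows the example; nothing here is an agreed-upon format.

```javascript
const zlib = require('node:zlib')

// Compress the query string, base64-encode it, and make it URL-safe.
function packInvite (query) {
  const packed = zlib.gzipSync(Buffer.from(query, 'utf8')).toString('base64')
  return 'pulsar://localhost/' + encodeURIComponent(packed)
}

// Reverse the steps and expose the fields through URLSearchParams,
// which keeps repeated keys such as MemberIPv6.
function unpackInvite (url) {
  const pathStart = url.indexOf('/', url.indexOf('://') + 3)
  const encoded = url.slice(pathStart + 1)
  const packed = Buffer.from(decodeURIComponent(encoded), 'base64')
  return new URLSearchParams(zlib.gunzipSync(packed).toString('utf8'))
}

const inviteQuery = 'MemberIPv6=2001:db8::1&MemberIPv6=2001:db8::2&stun=192.0.2.10&token=foo'
const inviteUrl = packInvite(inviteQuery)
const inviteParams = unpackInvite(inviteUrl)
const members = inviteParams.getAll('MemberIPv6') // one entry per client
```

For short parameter lists gzip can make the payload longer rather than shorter, so compression probably only pays off once there are many members.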

1. URL without any STUN/TURN instance

pulsar://localhost/...

2. URL with STUN/TURN instance

pulsar://example.com/...

In this case, the domain should be used and, if necessary, the SRV record should also be checked.

2.1 URL with double STUN/TURN instance

pulsar://example.com/stun=...&...

In such a case, the additional stun and turn parameters should take higher priority: an attempt should be made to use the servers they specify first. If domains are specified there, an SRV record check should of course also be carried out.
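
A possible reading of this priority rule, as a sketch: servers named explicitly in the URL come first, followed by whatever the domain's SRV records yield. The function names are hypothetical. The record shape (`name`, `port`, `priority`, `weight`) matches what Node's `dns.promises.resolveSrv` returns, and sorting by weight within a priority is a simplification of the weighted random selection that SRV actually calls for.

```javascript
// Records could come from, e.g., dns.promises.resolveSrv('_stun._udp.example.com');
// they are passed in here so the sketch runs without network access.

// Lower priority value first; within one priority, higher weight first (simplified).
function orderSrvRecords (records) {
  return [...records].sort((a, b) => a.priority - b.priority || b.weight - a.weight)
}

// Explicit stun= entries from the invite URL outrank SRV-discovered servers.
// The result uses the { urls } shape that RTCPeerConnection expects in iceServers.
function buildIceServers ({ explicitStun = [], srvRecords = [] }) {
  const fromUrl = explicitStun.map((host) => ({ urls: `stun:${host}` }))
  const fromSrv = orderSrvRecords(srvRecords)
    .map((record) => ({ urls: `stun:${record.name}:${record.port}` }))
  return [...fromUrl, ...fromSrv]
}

const iceServers = buildIceServers({
  explicitStun: ['stun.example.org:3478'],
  srvRecords: [
    { name: 'b.example.com', port: 3478, priority: 20, weight: 0 },
    { name: 'a.example.com', port: 3478, priority: 10, weight: 5 },
    { name: 'c.example.com', port: 3479, priority: 10, weight: 50 }
  ]
})
```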

Ps. These are all just rough ideas and can be changed as much as necessary and desired. In the end, however, it would be great if it were possible to communicate sessions via URL and also influence the information contained for special cases.

@confused-Techie
Member

@schadomi7 This is amazing! I'm taking some time to fully review the code, but this is looking amazing! Love that there's no central server, and love that things could technically function without one (in localhost and IPv6 instances).

But overall what you are describing sounds fantastic, and is far better success than I've had thus far. I'll get started reviewing your changes properly, and if everything seems good, I'd absolutely agree we could move forward with a PR to review the more minor code details there.

Supremely awesome work!

@schadomi7
Contributor

schadomi7 commented Feb 21, 2024

@confused-Techie
Oh, I am sorry, my last comment was misleading. In hindsight, I can see it, too.
I'll blame it on me not being a native English speaker ;-)
The api-server (teletype-server) is no longer needed, that is true. Every call that was needed before is now a direct call to the signal-server.
An instance of the signal-server is (so far) always needed.
For GitHub it made perfect sense to proxy the calls to the signal-server through the api-server/teletype-server; that way they could ensure only logged-in users could connect.
If we do not mandate a GitHub account/login, we no longer care. So it made sense to just call the signal-server directly.

The signal-server does know the portal names.
As such, it can know which clients connect to which other clients.
In the signal-encryption branch (https://github.com/teletype-diy/teletype-diy-client/tree/signal-encryption) all signals (WebRTC SDP/idk. some JSON with IPs in it) sent to the signal-server are encrypted on the clients/peers.
That way, the signal-server can no longer know details about your (local/work/university) network, such as subnets or routers.
Still, the signal-server provider has the capability to log the connecting IP-addresses (at least the public ones).
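
To illustrate what encrypting signals on the peers can mean, here is a sketch using AES-256-GCM from Node's `crypto` module. This is not necessarily how the branch does it: the cipher choice, the payload layout, and the assumption that both peers already share a 32-byte key (for example via the invite link) are all mine.

```javascript
const crypto = require('node:crypto')

// Encrypt a signal object so a relaying server only sees opaque base64.
// Payload layout (an assumption of this sketch): 12-byte IV | 16-byte tag | ciphertext.
function encryptSignal (key, signal) {
  const iv = crypto.randomBytes(12)
  const cipher = crypto.createCipheriv('aes-256-gcm', key, iv)
  const ciphertext = Buffer.concat([
    cipher.update(JSON.stringify(signal), 'utf8'),
    cipher.final()
  ])
  return Buffer.concat([iv, cipher.getAuthTag(), ciphertext]).toString('base64')
}

// Decrypt on the receiving peer; throws if the key is wrong or the data was altered.
function decryptSignal (key, payload) {
  const raw = Buffer.from(payload, 'base64')
  const iv = raw.subarray(0, 12)
  const tag = raw.subarray(12, 28)
  const ciphertext = raw.subarray(28)
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, iv)
  decipher.setAuthTag(tag)
  const plain = Buffer.concat([decipher.update(ciphertext), decipher.final()])
  return JSON.parse(plain.toString('utf8'))
}

const sharedKey = crypto.randomBytes(32) // stands in for a key from the invite link
const sealed = encryptSignal(sharedKey, { type: 'offer', sdp: 'v=0 ...' })
const opened = decryptSignal(sharedKey, sealed)
```

The relay still sees who talks to whom and from which public IP address; only the signal contents are hidden.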

@YoSiJo
Very cool idea to encode the needed information in the invite URL. Though I do not know how we would know the peer IP addresses beforehand.
So far, the signal-server is sort of a central meeting place to get to know who you want to talk with.
After that, the clients try to talk to each other directly, with the help of the exchanged information. If you have an IPv6 address this normally just works™. If you are behind a NAT with a shared IPv4 address, not so much.
That's where the ICE/STUN/TURN stuff would do its magic to fool your NAT.

I suppose we could ask the users to transmit the WebRTC SDP queries via the same third-party communication link they already use to exchange the invite link.
That way, we would not need the signal-server. I will try to do a quick-n-dirty proof-of-concept of it; with a modified signal-server, running locally on both ends.

@schadomi7
Contributor

Well, that's at least how I currently understand what is happening.

The proof-of-concept works, I think. The signals are (or at least can be) a dialogue, and you may need to send messages back and forth. Basically you play signal-server yourself. It is definitely more tedious than just using a single link. I may have made an error in the modified signal-server, so I really should test on different machines. Will do, once I have time for it.

@schadomi7
Contributor

Sorry for the spam, I am just so happy that it works, wanted to share:

teletype_no_signal_server.webm

(here is an embedded video file, it shows teletype-diy in action without a signal server)

Does not need any server at all (if you do not need NAT-Traversal). I am not so good at UI stuff, so I used the clipboard for the time being. Someone with more UI know-how should probably change this to something more usable.
Also, I think I can move one of the signal transfers (from host/share-starter to joining parties) inside the initial invite URL. I do not know if it is also a good idea to wrap the other signal into a second invite URL or make it something different altogether to clearly distinguish between them.
I can see arguments for both.

I do not know: should this P2P mode be the primary mode of operation? Or would you provide a signal-server as the default config for the core package? I realize this is mostly a cost/benefit kind of calculation. Any idea what percentage of Pulsar users might be using teletype?

I, for one, only used it sometimes. But every time I needed it, it was really convenient to have.


Also, I managed to fix the fuzzy-finder compatibility. I had to fix something inside the fuzzy-finder package, will file a PR for it.
As far as I can figure out (old marketing, videos on YouTube, git history, issues on the original repo), teletype was never able to share the workspace in the tree-view. Funny how memory works; I must have mistaken it for Live Share from VS Code. I might look into what it would take to share the tree-view, no promises. I guess it should be quite similar to the fuzzy-finder.

@schadomi7
Contributor

Here is a quick video file, showing tree-view sharing in action:

teletype_tree-view.webm

It just barely works, there is a lot of stuff still to be done for it to really feel like a normal tree-view. It is more of a proof-of-concept.
It was one of the most requested features in the old teletype, and I think it is quite nice to have.
It looks quite wonky because CSS. Need I say more?

I actually gave up trying to change the core tree-view package and modified the community package tree-view-extended instead.

I put the code here and tagged it with: first-working-version


For it to work, I more or less had to implement another suggested feature (atom/teletype#191), en passant so to say.
The teletype service now has additional API-methods to send notifications and subscribe to data channels.
This enables third-party packages to communicate over the teletype connection.

Looks like this:

async notifyOnDataChannel ({channelName, body}) { /* [...] */ }
async subscribeToDataChannel ({channelName, callback}) { /* [...] */ }
// only receive msg if we are host
async subscribeToHostDataChannel ({channelName, callback}) { /* [...] */ }
// only receive msg if we are guest
async subscribeToGuestDataChannel ({channelName, callback}) { /* [...] */ }

I used the tag api-subscribe-notify-draft.
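
To give a feel for how a third-party package might use these methods, here is a sketch against a stand-in object. Only the four method names and their `{channelName, body}` / `{channelName, callback}` parameters are taken from the signatures above; `FakeTeletypeService`, its local delivery, and the `'tree-view'` channel name are hypothetical and exist only so the example runs without a portal.

```javascript
// Hypothetical in-memory stand-in for the teletype service. The real service
// sends notifications over the peer connection; this one delivers them locally.
class FakeTeletypeService {
  constructor ({ isHost }) {
    this.isHost = isHost
    this.subscribers = new Map() // channelName -> array of callbacks
  }

  async subscribeToDataChannel ({ channelName, callback }) {
    if (!this.subscribers.has(channelName)) this.subscribers.set(channelName, [])
    this.subscribers.get(channelName).push(callback)
  }

  // only receive msg if we are host
  async subscribeToHostDataChannel ({ channelName, callback }) {
    if (this.isHost) await this.subscribeToDataChannel({ channelName, callback })
  }

  // only receive msg if we are guest
  async subscribeToGuestDataChannel ({ channelName, callback }) {
    if (!this.isHost) await this.subscribeToDataChannel({ channelName, callback })
  }

  async notifyOnDataChannel ({ channelName, body }) {
    for (const callback of this.subscribers.get(channelName) || []) callback(body)
  }
}

// A package running on the host side listens for requests on its own channel.
async function demo () {
  const service = new FakeTeletypeService({ isHost: true })
  const seen = []
  await service.subscribeToHostDataChannel({
    channelName: 'tree-view',
    callback: (body) => seen.push(body)
  })
  // Ignored on a host instance, so this callback never fires here.
  await service.subscribeToGuestDataChannel({
    channelName: 'tree-view',
    callback: () => seen.push('guest-only')
  })
  await service.notifyOnDataChannel({
    channelName: 'tree-view',
    body: { path: 'src/index.js' }
  })
  return seen
}
```

Namespacing the channel by package name is one way to keep several packages from stepping on each other on the same connection.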

Would love to get some feedback.
