Porting Teletype from Atom repo a.k.a Collaboration tool #536
Comments
We have had quite a few discussions on this already (informally on Discord mostly) as it is likely going to be a big project. The main issue is that Teletype runs on a server and we simply might not be able to justify running such a service without charging a maintenance cost for access. For that reason we were also discussing the possibility of making it easier and simpler for somebody to host their own server or doing something p2p. It is a huge project and currently most of the team are still very much occupied with some other large scale changes but I'd be keen to see this supported again. |
I'll also add, since I've personally attempted a few rewrites of Teletype at this point, that the server is littered with logic that performs logging or other types of data collection that we simply wouldn't be able to use (nor want to). But like @Daeraxa said, I think our most likely path forward is a fully p2p implementation that cuts out the central connection broker server. That would spare us the cost of fronting such a system and also remove one of the more common complaints about it. |
What about just open-sourcing the server running part with, say, a documented docker-compose.yml? I mean, a lot of the open source community is concerned about data collection, but the solution of self hosting the server part is a good solution that is accepted by the vast majority. To be more issue oriented and less debate focused, wouldn't it be a solution to document the server running process to capitalize on Teletype's previous work and address this issue quickly to give us plenty of time to address the p2p full rewrite, as it could take years for the community to finalize a working solution? What would be the macro steps to solve this Issue? In what form would a merge request be accepted? |
@matbgn I appreciate the issue focused approach here. So, since we only have access to what Atom had open sourced originally, we do have access to the already open sourced code of teletype, meaning that if anybody wanted to they could absolutely set up their own instance. So just to make sure we are on the same page, here's what we have of teletype, which as far as I know is everything needed to get it working:
So right now, anyone could technically run the The issue isn't that it can't be setup, or that the community doesn't already have access to the code, the issue really is that the Since right now, the source code of
Now from there, almost all of these integrations do require payment in some way for any reasonable functionality. So with all that said, and making it obvious what our starting point is, lets get into how to fix it. So as for solving this issue right now, I'd see really two best ways forward here: And I want to add, while I could relatively easily get this into a docker hostable setup, I don't think that really addresses the problems I'm seeing with it, unless I'm underestimating how willing people would be to rely on third party paid services, but continuing with that assumption, here's what solutions I'm seeing.
So really, in short the macro steps needed to solve this issue:
So sorry, I know that was a lot, but hope it can point out a bit why this hasn't been done yet, since we do want to get it done, and all of that leads to why I've partially nearly decided a full rewrite is the easier answer. I do want to also point out, if people really really just want collaborative editing, we can do it totally differently. As in having a single centralized server, this is something that we can already do, as we can see with our backend, and frontend websites. It is something I'm already super familiar with. But would then remove any advantages we gain by being peer to peer, and would still up our costs for hosting. Also a quick aside, I totally agree that it would be best to document the server running process. But I do want to make it clear, the Pulsar team has never had any assistance or communication from the original Atom team. Nor have we ever been given any special kind of access. We have the same exact resources as anybody else would, anything we learn has to be learned from whatever documentation existed for Atom and their tools before they took everything offline, or the source code itself. So there's a lot we have to just figure out by staring at it for a few hours lol |
What an amazing answer! You really made my day, so thank you for your thoughtful and complete response 🙏 First of all, I totally agree that we need to get rid of all non-open source third parties. Now what do you mean by "ANY"? How big is the technical debt associated with this solution and what does it mean for further development of cleaner p2p solutions, meaning how do we pay it back do you think? In mostly, "any", cases I would vote for a quick "go2market" solution, especially if you are familiar with the technical stack. Which library is it based on by the way? Hoping I, or someone else, can contribute 😉 Finally, regarding the Pulsar team, I meant no offence, apologies if you felt that way, and sorry for my English in general. I wish this project all the best and would be happy to help in any reasonable way I can 😄. |
@matbgn So to start at the end of your message, I'm sorry if my tone was off. You caused zero offence, absolutely no apologies needed, and I honestly had zero clue English wasn't your primary language, so no sweat! You're fantastic! But what I meant by 'ANY' solution is simply: does our community just want collaborative editing, no matter how it's implemented? Or is the desire for Teletype truly about its peer-to-peer nature? As in, what's most important: collaborative editing, or collaborative editing via p2p technology? It's interesting to hear that you yourself would vote for a "go2market" solution, since that's what the alternative would be. But @Daeraxa and I were just talking about what pursuing something like this would mean for a proper implementation, and we essentially came to the concern that if we instead pursued a centralized server methodology for collaborative editing, it would more than likely delay or completely kill a proper p2p implementation, if only because the time needed to maintain it afterwards would be time we couldn't spend on the proper solution. @Daeraxa said this best: "For example if the p2p implementation was going to take 300 work hours to complete and [a centralized server] took 200, is it worth spending that 200 just to delay the p2p one and end up deprecated this[the centralized server]?" But if I did pursue a centralized server, I would very likely want to use the same stack that we currently use on the Pulsar Package Registry, which you can view the source code on
|
OK, the central question is first of all the cost comparison. Based on your quote, I did not expect 200 hours.
Looking at the stack you pointed out, I don't see a core feature for real-time collaboration on flat files. I was looking for something like CRDT with this kind of libraries in mind: https://github.com/yjs/yjs As an aside, since I'm talking about the yjs library, a central server (to initiate communication) relayed to p2p is also a possibility, as from what I've heard so far yjs seems relatively straightforward to implement. Maybe you or @Daeraxa could give some feedback on the technical feasibility of implementing yjs in the Pulsar codebase? |
This was just me giving a random example to get my point across, I honestly have no idea what the effort would actually be. Thanks for the link to that library, I hadn't seen that one before - from a technical standpoint I'm definitely not the one you want to ask so I'll leave that one up to others :) |
@matbgn Since as it is now That's why the tech stack I shared doesn't seem like it's helpful for a collaborative editor, it's not. The tech stack I shared doesn't care what the data is used for beyond the internal structures that's used to store data in transmission. As for the feasibility of implementing And lastly, you are totally right that having a central server to help initiate a p2p connection is possible. Thing is, that's basically what all p2p is. Whether you use a STUN, TURN, ICE, ultrapeer, or relay server to get clients connected, nearly every popular, modern form of p2p relies on some centralization, so following this, is essentially going the route of a total rewrite of a p2p implementation, which now we've come full circle lol. So it does make me consider, if our best bet isn't just finding the right answer in terms of technology we want to use to pursue a p2p implementation, since of course we wouldn't want to create something brand new from the ground up, although was still holding onto hope to move away from centralization totally, to forgo the hosting costs. But fingers crossed Also last note, yeah 200 hours is a totally made up number lol |
Thank you very much for these technical details. I must admit, I know very little about CRDT and p2p protocols in general. If we summarize the effort to be made in terms of hours, and considering that 200 hours are overestimated, what would be your best estimate? Based on Teletype's code inspection in terms of LOC with The question is therefore, as @Daeraxa pointed out, to know on the basis of this evaluation, what would be the best route to take. But if I understand correctly, the gain factor is roughly 10x, actually. Moreover, in either case and in a solution-oriented approach, what would be the next step to take in terms of code if we could split this challenge into smaller chunks? I'm thinking, for example, of information of lesser importance, such as just getting the status of connected clients, the position of the cursor, the name or any other simplistic idea that would allow a first Pull Request to see the light of day if we were to start with a complete rewrite. |
@matbgn Very interesting approach you've taken to determining the cost of producing the code. I've never seen this tool before, so I will have to take a look at it. But in terms of what we need to do first, we need to consider that Teletype is fully written, Teletype-Client is fully written, and so is Teletype-CRDT. Ideally, we just rewrite Teletype-Server and make it compatible with these other items. So the first order of business is research: determine exactly what each of these expects and what data structures they intend to work with. From there, it'd be research to determine how we do that on the server side. You are correct that if we rewrote everything from scratch, we would need to set much smaller goals, such as seeing other clients and so on. If we pursued a centralized approach, our goal would be much higher: first a functional server that can receive data and return it as expected, then one that works with the existing code. But it sounds like we may all be thinking that sticking with p2p is our best path forward, so considering that, I'll stop responding about other methodologies.
Getting started with rewritten p2p:
While I can't say definitively what we would need to do for a first pull request, I can at least lay out what we need to get started:
So if we are committed to rewriting the p2p stack used within Teletype, I think the first order of business is to determine what tech we use to support this. Obviously, we want to use a Pub/Sub system to communicate between clients. But then the big question, is how do we set up their connection? There's a few choices in this space, each with their own pros and cons, but some of these may be
When looking at which of these we want to use, we also should consider, are they fully open source? Are there any public services we could use to already implement this? What are the resources needed to host this if we have to do it (What would that cost us to do so)? So determining that bit is probably our next step to start implementation. |
Hi, I thought to comment on this since I've written p2p hole punching for a personal project of mine
All STUN really tells you is your external IP:port pair as observed from the STUN server. Unfortunately this is insufficient, because many NATs use a different external port when the destination of the packet is different, which makes the port you get from STUN unusable for your peer. So STUN can be a useful reference, but it should not be relied on alone.
Relay == TURN (Traversal Using Relays around NAT). This isn't really p2p but it can be a last resort, sure.
Never heard of this.
This looked really promising, in that it gathers address candidates that the host could potentially use to connect to the peer. I initially tried STUN, then kept getting reports of it not working. I didn't use ICE at the time because I didn't want to read the RFC that closely. I searched for a library to use and found libnice, but I didn't want to introduce glib as a dependency. For my project, in the end it was simpler for me to invent my own protocol. The following is what I wrote at the time. You can substitute "VMs" / "switch" (they refer to the Nintendo Switch) with "clients", and we use a network of p2p connections where every client connects to every other client, instead of having one client connect to only one other client. There's also a server that does some initial message exchange (such as exchanging IP and port numbers and performing STUN) before the p2p connection is established.
Although I should mention that even in this case there are still combinations of NATs that cannot p2p each other, so I implemented a relay later when all else failed. Though, I'm not exactly sure how all the p2p libraries in the wild performs hole punching, so they may perform better than my reinvention. (I do know that Nintendo is able to establish connections in some cases where mine cannot) |
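The point above about STUN reporting a port that a peer cannot reuse can be illustrated with a toy model. This is only a sketch: the Nat class, the addresses (taken from documentation ranges), and the port numbers are all made up for illustration, and real NATs have more behaviours than the two modelled here.

```javascript
// Toy model of two NAT mapping behaviours (not a real network stack).
// Every name, address, and port below is hypothetical.
class Nat {
  constructor(externalIp, symmetric) {
    this.externalIp = externalIp;
    this.symmetric = symmetric; // true: mapping depends on the destination
    this.nextPort = 40000;
    this.mappings = new Map();
  }

  // Returns the external "ip:port" that the destination would observe
  // for a packet sent from the given internal port.
  send(internalPort, destination) {
    // A symmetric NAT keys its mapping on the destination as well,
    // so each new destination is assigned a fresh external port.
    const key = this.symmetric
      ? `${internalPort}->${destination}`
      : `${internalPort}`;
    if (!this.mappings.has(key)) {
      this.mappings.set(key, this.nextPort++);
    }
    return `${this.externalIp}:${this.mappings.get(key)}`;
  }
}

const stunServer = "198.51.100.1:3478";
const peer = "203.0.113.7:5000";

const coneNat = new Nat("192.0.2.10", false);
const symmetricNat = new Nat("192.0.2.20", true);

// What the STUN server reports versus what the peer would actually see.
const coneSeenByStun = coneNat.send(5000, stunServer);
const coneSeenByPeer = coneNat.send(5000, peer);
const symSeenByStun = symmetricNat.send(5000, stunServer);
const symSeenByPeer = symmetricNat.send(5000, peer);

console.log("destination-independent NAT:", coneSeenByStun, coneSeenByPeer);
console.log("symmetric NAT:              ", symSeenByStun, symSeenByPeer);
```

With the destination-independent NAT both observers see the same pair, so the STUN answer is usable. With the symmetric one the peer sees a different port than the STUN server did, which is the failure case described above.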
@zhuyifei1999 I really appreciate the info! Yeah, I know my short list left out a bit of information there, but you do seem to have some really cool ideas around how to perform hole punching for these purposes. Since I'm wanting to pursue a Pub/Sub communication style, one thing I was aiming to implement is a gossipsub, which would allow data to travel between nodes on a network to others. That would solve the problem you pointed out of D not being able to communicate directly with A: if A sends information on a gossipsub network, it'll be 'gossiped' from B to C to D, and vice versa. Obviously, it's better to maintain multiple peers rather than a single one, to avoid being cut off from the network and having to rejoin, but on the small scale this would be a way to fix that issue. As for what protocol to use for the initial bootstrapping into the network, I am starting to think it might be worth studying up on an ICE implementation, because (if memory serves) ICE actually includes STUN and TURN within it, so it's able to do quite a bit for us in terms of bootstrapping. But I do appreciate the info and link |
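The idea of A reaching D through B and C can be sketched with a few lines of in-memory code. Note the assumptions: this is plain flooding with duplicate suppression, not the gossipsub protocol itself, and the GossipNode class and message shape are hypothetical names used only for this illustration.

```javascript
// Minimal flooding sketch: each node forwards unseen messages to its peers.
// This is NOT gossipsub; it only shows how a message can reach a node
// that has no direct connection to the sender.
class GossipNode {
  constructor(name) {
    this.name = name;
    this.peers = [];
    this.seen = new Set(); // message ids already handled
    this.delivered = [];   // payloads handed to the local application
  }

  connect(other) {
    this.peers.push(other);
    other.peers.push(this);
  }

  publish(id, payload) {
    this.receive({ id, payload }, null);
  }

  receive(message, from) {
    // Drop duplicates so messages do not circulate forever.
    if (this.seen.has(message.id)) return;
    this.seen.add(message.id);
    this.delivered.push(message.payload);
    for (const peer of this.peers) {
      if (peer !== from) peer.receive(message, this);
    }
  }
}

// Line topology A - B - C - D: A and D are never directly connected.
const [a, b, c, d] = ["A", "B", "C", "D"].map((n) => new GossipNode(n));
a.connect(b);
b.connect(c);
c.connect(d);

a.publish("msg-1", "cursor moved");
d.publish("msg-2", "text inserted");

console.log("D received:", d.delivered);
console.log("A received:", a.delivered);
```

In this line topology, losing B or C would split the network, which is why keeping several peers per node matters in practice.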
Hi, sorry to drop in like this. I only noticed this issue now. I made a self-hosted teletype version: file.webm(here is an embedded video file, it shows the teletype-diy package in action) It works without Github-Login, Pusher, twilio, etc. It uses socket.io for a really simple signaling-server. Currently, it works without any NAT traversal, I guess it should work over the internet with IPv6. Though I did not test that, yet. I seem to remember that the original teletype could also share the project files in the tree-view. That does not work, yet. I was planning on publishing it as a community-package, but then I noticed this issue. PS: I used your usernames in the video, this only fetched the user profile icons, which are public. No other information is fetched from Github. I hope that is ok. A little bit of background: Atom was my favorite editor. Thank you for making my new favorite editor more awesome every day! |
@schadomi7 This sounds amazing! Now of course you can publish and take ownership, but if you'd like us to do so, that's something I'd be much more than happy to discuss with the team so that you don't have to worry about cloud hosting and long term ownership. But I'd love to take a look at the source and see what you did to get it working, and see if I can get everything else fully functional. Mind sharing a link? |
I do not know why it works, either. Something must still be wrong with it, other than the missing ICE/STUN/TURN-Stuff, I mean.
I would guess the signal-server is not working correctly. I only guessed what the PusherClient is doing with the channels (based on function names), replicated that behavior with socket.io rooms, and moved on when it seemed to work. I don't mind sharing a link, and I really should have done so already. I was hoping to get a chance to clean up some of the mess I caused, but I see now that was futile. I think it would be best if teletype became/stayed a core package. I, personally, would really like to retain an easy possibility to self-host it. Maybe that could be solved/mitigated by just using the portal name as the invite link; as it stands, the single UUID in the invite link does nothing more than facilitate the lookup in the database for the real portal name. That would also eliminate the PostgreSQL requirement, and maybe even the API server, too. Just thinking out loud, not like I really know what I am talking about. I think I will keep on experimenting with it, at least for a bit. Feel free to pick and choose what you need for the core package. PPS: Maybe it would be sufficient to ask the user for confirmation before a remote user connects. |
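The "portal name as the invite link" idea can be sketched as a signaling relay whose rooms are keyed directly by portal name, so no database lookup is needed. This is an in-memory stand-in and makes several assumptions: it does not use the socket.io API, it is not the teletype-diy code, and SignalHub, join, and send are hypothetical names.

```javascript
// In-memory stand-in for a room-based signaling relay.
// Rooms are keyed by portal name, so the invite link can simply carry
// the portal name and no database is required to resolve it.
class SignalHub {
  constructor() {
    this.rooms = new Map(); // portalName -> Map(clientId -> handler)
  }

  // Register a client in the room for a portal.
  join(portalName, clientId, onSignal) {
    if (!this.rooms.has(portalName)) {
      this.rooms.set(portalName, new Map());
    }
    this.rooms.get(portalName).set(clientId, onSignal);
  }

  // Relay an opaque signaling payload to one member of the room.
  // Returns false when the room or the recipient does not exist.
  send(portalName, fromId, toId, payload) {
    const room = this.rooms.get(portalName);
    if (!room || !room.has(toId)) return false;
    room.get(toId)({ from: fromId, payload });
    return true;
  }
}

const hub = new SignalHub();
const portalName = "example-portal"; // hypothetical; would come from the invite link
const hostInbox = [];
const guestInbox = [];

hub.join(portalName, "host", (msg) => hostInbox.push(msg));
hub.join(portalName, "guest", (msg) => guestInbox.push(msg));

// The guest starts the handshake and the host answers.
const offerRelayed = hub.send(portalName, "guest", "host", { type: "offer" });
const answerRelayed = hub.send(portalName, "host", "guest", { type: "answer" });
const unknownRelayed = hub.send("no-such-portal", "guest", "host", {});

console.log(offerRelayed, answerRelayed, unknownRelayed);
console.log(hostInbox, guestInbox);
```

One consequence of this design is that anyone who learns the portal name can try to join, which is where a confirmation prompt before a remote user connects would help.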
I did some stuff. I continued to hack on my self-hosted version, here are the broad changes:
Other than that, I managed to successfully test a connection between two instances over the internet. Both had an IPv6 address, so still no NAT traversal. The signal-server was a third party. I added a tag I had some other ideas, like encrypting the signaling or using base62-encoded UUIDs. I would love to get some feedback on them and the changes above, too. From what I understand of the direction and intention of the discussion above, I think those are aligned to the same goal. @confused-Techie Any news from your side? Were you able to reproduce my results? What would still need changing in your opinion? If you do not mind, I could prepare merge requests with the changes mentioned (or a subset of those); maybe that provides a better forum to discuss changes. |
Please note that when using STUN/TURN etc. it would be important to take SRV records into account.
0. Foreword
Since the list in the URL could become very long for some participants, I would suggest that
echo 'MemberIPv6=...&MemberIPv4=...&stun=...&token=...' | gzip | base64 | urlencode
Specifications such as
1. URL without any STUN/TURN instance
2. URL with STUN/TURN instance
In this case, the domain should be used and, if necessary, the SRV record should also be checked.
2.1 URL with double STUN/TURN instance
In such a case, the additional specification of stun and turn should be interpreted as a higher priority and an attempt should be made to use the servers specified with
PS: These are all just rough ideas and can be changed as much as necessary and desired. In the end, however, it would be great if it were possible to communicate sessions via URL and also influence the information contained for special cases. |
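The `gzip | base64 | urlencode` pipeline suggested above could look roughly like this in Node.js. It is a sketch under assumptions: the parameter names come from the comment, but the values, the function names encodeSession and decodeSession, and the choice of standard base64 are made up for illustration.

```javascript
const zlib = require("zlib");

// Pack session parameters into a single URL-safe string:
// query string -> gzip -> base64 -> percent-encoding.
function encodeSession(params) {
  const query = new URLSearchParams(params).toString();
  const packed = zlib.gzipSync(Buffer.from(query, "utf8")).toString("base64");
  return encodeURIComponent(packed);
}

// Reverse the steps above and return a plain object again.
function decodeSession(encoded) {
  const packed = Buffer.from(decodeURIComponent(encoded), "base64");
  const query = zlib.gunzipSync(packed).toString("utf8");
  return Object.fromEntries(new URLSearchParams(query));
}

// Hypothetical values; only the key names appear in the discussion.
const session = {
  MemberIPv6: "2001:db8::1",
  MemberIPv4: "192.0.2.10",
  stun: "stun.example.org",
  token: "example-token",
};

const encoded = encodeSession(session);
const decoded = decodeSession(encoded);

console.log(encoded);
console.log(decoded);
```

For short parameter lists the gzip header overhead can make the result longer than the plain query string, so compression only pays off once the member list grows.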
@schadomi7 This is amazing! I'm taking some time to fully review the code, but this is looking amazing! Love that there's no central server, and love that things could technically function without one (in localhost and IPv6 instances). Overall, what you are describing sounds fantastic, and is far better success than I've had thus far. I'll get started reviewing your changes properly, and if everything seems good, I'd absolutely agree we could move forward with a PR to review the more minor code details there. Supremely awesome work! |
@confused-Techie The signal-server does know the portal names. @YoSiJo I suppose we could ask the users to transmit the WebRTC SDP queries via the same third-party communication link they already use to exchange the invite link. |
Well, that's at least how I currently understand what is happening. The proof-of-concept works, I think. The signals are (or at least can be) a dialogue, and you may need to send messages back and forth. Basically you play signal-server yourself. It is definitely more tedious than just using a single link. I may have made an error in the modified signal-server, so I really should test on different machines. Will do, once I have time for it. |
Sorry for the spam, I am just so happy that it works and wanted to share: teletype_no_signal_server.webm (here is an embedded video file, it shows teletype-diy in action without a signal server) It does not need any server at all (if you do not need NAT traversal). I am not so good at UI stuff, so I used the clipboard for the time being. Someone with more UI know-how should probably change this to something more usable. I do not know: should this P2P mode be the primary mode of operation? Or would you provide a signal-server as the default config for the core package? I realize this is mostly a cost/benefit kind of calculation. Any idea what percentage of Pulsar users might be using teletype? I, for one, was using it sometimes, and every time I needed it, it was really convenient to have. Also, I managed to fix the fuzzy-finder compatibility. I had to fix something inside the fuzzy-finder package and will file a PR for it. |
Here is a quick video file, showing tree-view sharing in action: teletype_tree-view.webm
It just barely works, there is a lot of stuff still to be done for it to really feel like a normal tree-view. It is more of a proof-of-concept. I actually gave up trying to change the core tree-view package and modified the community package tree-view-extended instead. I put the code here and tagged it with: first-working-version For it to work, I more or less had to implement another suggested feature (atom/teletype#191), en passant so to say. Looks like this:
async notifyOnDataChannel ({channelName, body}) {
[…]
async subscribeToDataChannel ({channelName, callback}) {
[…]
// only receive msg if we are host
async subscribeToHostDataChannel ({channelName, callback}) {
[…]
// only receive msg if we are guest
async subscribeToGuestDataChannel ({channelName, callback}) {
[…]
I used the tag api-subscribe-notify-draft. Would love to get some feedback. |
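To make the intent of these four methods easier to discuss, here is one way they might behave, written as an in-memory sketch. The method names and parameters are taken from the snippet above; everything else (the LocalPortal and PortalBindingSketch classes, the isHost flag, the "tree-view" channel name, and delivery by direct function call) is an assumption and not the actual teletype-diy implementation, whose bodies are elided.

```javascript
// Hypothetical in-memory portal: delivers channel messages to all other members.
class LocalPortal {
  constructor() {
    this.members = [];
  }

  createBinding(isHost) {
    const binding = new PortalBindingSketch(this, isHost);
    this.members.push(binding);
    return binding;
  }
}

class PortalBindingSketch {
  constructor(portal, isHost) {
    this.portal = portal;
    this.isHost = isHost;
    this.subscriptions = new Map(); // channelName -> array of callbacks
  }

  // Send a message on a named channel to every other member of the portal.
  async notifyOnDataChannel({ channelName, body }) {
    for (const member of this.portal.members) {
      if (member !== this) member.deliver(channelName, body);
    }
  }

  async subscribeToDataChannel({ channelName, callback }) {
    if (!this.subscriptions.has(channelName)) {
      this.subscriptions.set(channelName, []);
    }
    this.subscriptions.get(channelName).push(callback);
  }

  // only receive msg if we are host
  async subscribeToHostDataChannel({ channelName, callback }) {
    if (this.isHost) await this.subscribeToDataChannel({ channelName, callback });
  }

  // only receive msg if we are guest
  async subscribeToGuestDataChannel({ channelName, callback }) {
    if (!this.isHost) await this.subscribeToDataChannel({ channelName, callback });
  }

  deliver(channelName, body) {
    for (const callback of this.subscriptions.get(channelName) || []) {
      callback(body);
    }
  }
}

async function demo() {
  const portal = new LocalPortal();
  const host = portal.createBinding(true);
  const guest = portal.createBinding(false);
  const received = [];
  const channelName = "tree-view";

  await host.subscribeToHostDataChannel({ channelName, callback: (b) => received.push(`host got ${b}`) });
  // Ignored, because this binding is a guest:
  await guest.subscribeToHostDataChannel({ channelName, callback: (b) => received.push(`guest (as host) got ${b}`) });
  await guest.subscribeToGuestDataChannel({ channelName, callback: (b) => received.push(`guest got ${b}`) });

  await guest.notifyOnDataChannel({ channelName, body: "list-files" });
  await host.notifyOnDataChannel({ channelName, body: "file-list" });
  return received;
}

demo().then((received) => console.log(received));
```

In this sketch a guest asks the host for data on a channel and the host replies on the same channel, which is roughly the request/response shape a shared tree-view would need.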
Have you checked for existing feature requests?
Summary
It would be great if we could capitalize on the work done by Atom (https://github.com/atom/teletype) to offer the real-time collaboration feature on Pulsar
What benefits does this feature provide?
An Atom package that lets developers share their workspace with team members and collaborate on code in real time.
Source: Atom / Teletype
Any alternatives?
Based on the previous work done on Teletype by Atom, and to give Pulsar a rocket advantage in the IDE market, I'm convinced that built-in real-time collaboration is a key feature.
Again, building on Atom's previous working solution and on the newly fully integrated tests CI, Pulsar seems to be the legitimate repo/owner of this reboot.
Other examples:
All are closed-source solutions. Even VSCodium and its Server variant are not able to solve that easily.
But with the previous work done by Atom on Teletype, I guess Pulsar has an advantage in implementing this feature.