0.7.x Windows support - obsolete, check new one, leaving open for discussion #104
Hi @janbiedermann , I love where you're going with this. However, this week I'm super busy and won't have time to look it over (between my birthday obligations and a few projects I am fresh out of seconds, not to mention minutes or hours). I'll review it soon :) Cheers and THANKS!!! |
Hi @boazsegev , So with -w1 -t1 in the example "Hello world!" app I get with Apache ab (ab.exe -c10 -n 100000 -k http://localhost:3000/): With multiple threads I get a lockup, and with too many clients the app exits. Still some work to do. |
Fixed all issues I came across so far (not yet pushed), but there is one issue that needs a bit more attention. On Windows there seems to be no determinable socket limit, and socket fds (HANDLEs in Winsock terminology) are all sorts of numbers and, with multiple threads, often way beyond the fixed allocation of fio_data. For example: Windows may provide a fd/HANDLE of So for Windows an indirection would be required that maps actual fds (HANDLEs) to fio_data->info[index]. I am thinking about how best to achieve that with as few changes as possible, but have not arrived at a final solution yet. |
To solve the above problem I just added a fio_data->info[i]->soket_handle. Works nicely: forward lookup is fast, reverse lookup scans fio_data->info[]. Fast enough. Single threaded it still shows excellent performance, and after fixing another crash bug (not #105) it works very reliably. Socket handling seems to be correct. |
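The indirection described above can be sketched roughly as follows. This is only an illustration under assumptions: every name here (`slot_s`, `slots`, `slot_attach`, and so on) is hypothetical and does not mirror facil.io's actual `fio_data` layout, and `uintptr_t` merely stands in for a Winsock socket handle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names are hypothetical; this is not facil.io's real data layout. */
#define SLOT_CAPA 1024                       /* fixed table size ("capa") */
#define SLOT_FREE ((uintptr_t)~(uintptr_t)0) /* stand-in for an invalid handle */

typedef struct {
  uintptr_t handle; /* OS socket handle stored in this slot */
} slot_s;

static slot_s slots[SLOT_CAPA];

/* Mark every slot as unused. */
static void slots_init(void) {
  for (size_t i = 0; i < SLOT_CAPA; ++i)
    slots[i].handle = SLOT_FREE;
}

/* Forward lookup: table index -> handle, constant time. */
static uintptr_t slot_handle(size_t index) {
  assert(index < SLOT_CAPA);
  return slots[index].handle;
}

/* Reverse lookup: handle -> table index by linear scan, -1 if absent. */
static long slot_find(uintptr_t handle) {
  for (size_t i = 0; i < SLOT_CAPA; ++i)
    if (slots[i].handle == handle)
      return (long)i;
  return -1;
}

/* Claim the first free slot for a handle; -1 when the table is full. */
static long slot_attach(uintptr_t handle) {
  long i = slot_find(SLOT_FREE);
  if (i >= 0)
    slots[i].handle = handle;
  return i;
}

/* Release a slot so it can be reused by a later connection. */
static void slot_detach(size_t index) {
  assert(index < SLOT_CAPA);
  slots[index].handle = SLOT_FREE;
}
```

With this shape, the rest of the code can keep using small, dense indices while the arbitrary OS handle values live only inside the table. The linear reverse scan costs O(capacity); a hash map would be the usual replacement if that ever became a bottleneck.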
Multi threaded with 1500 connections with a capa of 1024, 5 minute run:
Single threaded with 1500 connections and capa of 1024, 5 minute run:
|
Great performance and multithreaded! Finally! And very stable, stable enough for development. |
Hi @janbiedermann , First I want to thank you for all your amazing work. I love how minimal your code changes seem (though I think it would be even better if we could somehow abstract away some of the code changes to a centralized location using macros / inline functions). I am still at the beginning of learning what needs to be done for Windows integration and how I could test the WinSock API code (I don't have the OS and don't really want to shell out money for an OS I don't believe in)... in the meantime I found myself asking a few core questions...:
To clarify point 2: With WebSockets and (maybe in the future) HTTP/2, facil.io supports tens of thousands of concurrent connections (some of whom may be dormant for long periods of time while waiting for push events)... on Linux facil.io would be limited mostly by machine resources, not by any internal implementation detail. I would love Windows users to have the same experience.
Again, thanks for all the amazing work! Cheers! |
Hi @boazsegev ,
I'll try to get websockets running asap. Latest code is here https://github.com/janbiedermann/facil.io/tree/0.7.x_w |
Latest instructions for Windows: |
Websockets work, however ... after using some, and many with multiple threads, the app busy-loops doing nothing, just heating the room. No idea why. The debugger shows nothing of help. |
OK, poll() on Linux also has some issues: it drops websocket connections and later on stops taking any connections at all. So there definitely is a bug in the common poll()/WSAPoll() code. But on Linux it doesn't start busy looping. Anyway, it sometimes shows up like this, on Linux and on Windows:
On Linux it restarts the worker and everything is fine until the next time. On Windows, Windows kills the thread and keeps spawning new ones and killing them again, while also spinlock-waiting for something, thus causing the CPU usage while doing nothing. |
There is no reason access to ->flush should fail; it's never changed to anything other than the default flush function. So there must be some stack/heap/whatever overrun or something. But at this time I am getting nowhere with this. How to find the culprit? |
Actually this might be related to a bug I wanted to track down for a while that's probably related to race conditions in the protocol assignment logic and might be unrelated to the However, I think the I want to dedicate the main thread to the system and then all user code will run on "user" threads. This will ease much of the lock contention and allow buffered data to be sent even when all user threads are busy and without stressing the locks on the IO buffers (new client requests will wait for user threads, while buffered responses will be sent with lower latency). I'm very much behind on the new design for version 0.8.x and I don't want to pause the Windows support avenue, which is why I didn't point out before that there's additional future work in the pipeline. I thought I'll just port your code to the new design once it's ready. You can have a look at the re-write for all the The future 0.8 design will eventually go into this repo which is currently full of junk that I need to replace (the beginning of the new design is partly on my system and mostly in my head). One of the issues the new design raises is the question of the I wish I knew what I'm doing, but right now I'm spread thin on a big number of projects and nothing is moving fast enough. |
Hi @janbiedermann , I'm having some issues and I wanted your opinion. For the last week or so I've been intensively attempting to work out a manageable approach to Windows development. My main concerns are maintenance and testing. If we're looking at getting Threads and I think it would be better to code these few layers in a native Windows API rather than use MSYS2 / Cygwin / MinGW. The facil.io custom memory allocator requires the ability to allocate memory on aligned addresses (it uses masks and pointer addresses to figure out the original allocation block and access the block's metadata)... I'm not sure I can rewrite it for Windows. However... keeping an Intel computer around just so I can run a virtual Windows dev machine and provide support for Windows sounds a little too heavy duty for me. I cannot test contributions without running the code and I don't want to run a Windows machine (not even a virtual one) on my network. That's one high-risk OS where everyone (including Microsoft) is trying to mine personal data from both the machine and the local network. So... I don't know what to do. Maybe Windows support will (forever) remain "unofficial" and "untested", where I trust contributors without testing their code... or maybe Windows support will just fade away and Windows related bugs / issues / features will never get addressed. I intend to try my hand at incorporating what you've already done so that we are as close to Windows compatible as we can be, but... this week that I spent learning the Windows API (and their types, they have to name all their types in capitals?)... it really made me feel that keeping up with Windows support over time might be impossible :( What do you think? Will we be able to maintain support for Windows over time? Will there be enough contributors? Who will test Windows code? ...? Sorry if I'm ranting, it's just that this Windows experience reminded me why I left that OS both as a user and as a developer. Thanks for your input.
Bo. |
Hi @boazsegev , I'll do my best:
That's basically done in my code. There are some Windows support functions, well, POSIX-ish wrappers for the native Win API.
Sure, but I am not sure there would be any benefit, even performance-wise. Sticking to the pthreads of MSYS/MinGW keeps things simple and portable. Cygwin isn't a target for me, too slow.
I agree. But honestly, the Windows tooling is beyond my understanding. All I need is a compiler, header files and libs. What you get is a web installer with lots of fancy things that I don't understand and gigs over gigs of software. No idea what this is all about. And it seems to change over and over again with each new Visual Studio release or Windows release or whenever some new marketing guy is hired at Microsoft. I am unable to keep up with this ever-changing vastness. However, I found out that there are "build tools" available, and after installing >1G, the compiler, headers and libs are there, somewhere on the system ...
It seems Windows doesn't provide aligned memory allocation, so that must be done "manually". I omitted that part and just use malloc/free, good enough for now.
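For reference, "manual" alignment on top of plain malloc/free is usually done by over-allocating and rounding the address up. The sketch below is an assumption-laden illustration, not facil.io's allocator: the helper names (`aligned_alloc_manual`, `aligned_free_manual`, `block_of`) are hypothetical, and the mask trick only works when a block is allocated with an alignment equal to its own size. As an aside, the Windows C runtime also ships `_aligned_malloc` / `_aligned_free`, which might be an alternative.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Over-allocate with malloc, round the address up to `alignment`, and stash
 * the raw malloc pointer just below the aligned address (hypothetical helper). */
static void *aligned_alloc_manual(size_t size, size_t alignment) {
  /* alignment must be a power of two and able to hold a pointer */
  assert(alignment >= sizeof(void *) && (alignment & (alignment - 1)) == 0);
  if (size > SIZE_MAX - alignment - sizeof(void *))
    return NULL; /* size computation would overflow */
  void *raw = malloc(size + alignment + sizeof(void *));
  if (!raw)
    return NULL;
  uintptr_t start = (uintptr_t)raw + sizeof(void *);
  uintptr_t aligned = (start + alignment - 1) & ~(uintptr_t)(alignment - 1);
  ((void **)aligned)[-1] = raw; /* remembered for aligned_free_manual */
  return (void *)aligned;
}

/* Free a pointer returned by aligned_alloc_manual. */
static void aligned_free_manual(void *ptr) {
  if (ptr)
    free(((void **)ptr)[-1]);
}

/* Recover the start of a block from any pointer inside it by masking off the
 * low bits; valid only when the block size equals its alignment. */
static void *block_of(const void *inside, size_t alignment) {
  return (void *)((uintptr_t)inside & ~(uintptr_t)(alignment - 1));
}
```

Allocating, say, a 4096-byte block aligned to 4096 means `block_of(p + 100, 4096)` returns `p`, which is the kind of mask-based lookup of block metadata mentioned earlier in the thread.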
What kind of computers do you have?
No, I think we don't need to. Once Isomorfeus is running on Windows the world will gradually switch to Linux or *BSD ;-D
Maybe. I could imagine that people might be interested in something fast and simple on Windows, not sure. But maybe not.
Clearly, it's the same for me, but I would like isomorfeus to be more accessible to a broader audience of developers/users because of reasons. As long as Windows is dominating the desktop, I think I can maintain and support iodine/isomorfeus on Windows in such a way that isomorfeus development works nicely. So that is what I am testing for. And now for something completely different: |
I think I found a bug; with the fix, websocket things seem to work better on Linux:
Currently it doesn't fall through and just returns, so the buffer passed in the locked_error case never gets freed. Furthermore, on Linux with poll(), when testing websockets, there seem to be "waves" of activity and sleep: connections/data are coming in, but facil.io does nothing for 5s, then is very busy for 1s, and then again does nothing for 5s (times not accurate, just to explain). When I stop the debugger in such a situation, for example, all threads are waiting for the lock fio_postoffice.pubsub.lock or all threads are waiting for poll(). Still investigating. |
Why do you use your custom locking and not pthread_rwlock? |
I think I found the problem with the locking.
channel gets unlocked in fio_subscribe fio_unsubscribe
Let's look at 2 threads that happen to run interleaved: |
Alright, fixed all that and pushed the latest code to 0.7.x_w, but I got a new problem: Only when using websockets. |
Windows support -> heading for world domination!
To use:
ridk enable
make test
Remarks:
There is no ssl support.
There is no fork() on Windows, so only one worker will work.
There are no unix sockets on Windows, sadly. Named pipes would be the best solution, but semantics would change and make things complicated.
Instead I opted for using TCP ports for local communication, as it keeps things simple, keeps the socket semantics, and makes it possible to use WSAPoll() for everything.
https://docs.microsoft.com/en-us/windows/win32/api/winsock2/nf-winsock2-wsapoll
The point of all of this is to make iodine, and thus isomorfeus development, accessible to a broader audience. Patches for iodine follow some time soon.
Although tests work, an actual app does not respond ... need to fix.
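The idea from the remarks of using TCP ports for local communication can be sketched as a loopback socket pair. This is a minimal illustration, not the code from this branch: the helper names (`tcp_loopback_pair`, `wait_readable`) are hypothetical, and the sketch uses POSIX sockets and poll() so it can be tried anywhere. A Winsock version would presumably swap in the Winsock equivalents (`closesocket`, `WSAPoll`, and so on).

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Build a connected pair of TCP sockets over 127.0.0.1, as a stand-in for
 * a Unix socket pair. Returns 0 on success, -1 on failure (hypothetical). */
static int tcp_loopback_pair(int pair[2]) {
  struct sockaddr_in addr;
  socklen_t len = sizeof(addr);
  assert(pair != NULL);
  pair[0] = pair[1] = -1;
  int listener = socket(AF_INET, SOCK_STREAM, 0);
  if (listener < 0)
    return -1;
  memset(&addr, 0, sizeof(addr));
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
  addr.sin_port = 0; /* let the OS pick a free port */
  if (bind(listener, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
      listen(listener, 1) < 0 ||
      getsockname(listener, (struct sockaddr *)&addr, &len) < 0)
    goto fail;
  if ((pair[0] = socket(AF_INET, SOCK_STREAM, 0)) < 0)
    goto fail;
  if (connect(pair[0], (struct sockaddr *)&addr, sizeof(addr)) < 0)
    goto fail;
  if ((pair[1] = accept(listener, NULL, NULL)) < 0)
    goto fail;
  close(listener); /* the listener is no longer needed */
  return 0;
fail:
  if (pair[0] >= 0)
    close(pair[0]);
  close(listener);
  pair[0] = pair[1] = -1;
  return -1;
}

/* Wait up to timeout_ms for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
static int wait_readable(int fd, int timeout_ms) {
  struct pollfd p = {.fd = fd, .events = POLLIN, .revents = 0};
  int r = poll(&p, 1, timeout_ms);
  if (r < 0)
    return -1;
  return (r > 0 && (p.revents & POLLIN)) ? 1 : 0;
}
```

Because both ends are ordinary TCP sockets, they can sit in the same poll set as client connections. One trade-off worth keeping in mind: unlike a Unix socket, a loopback port is reachable by any local process until the accept completes.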