New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Getting rid of pthsem #29
Comments
I do not think we should completely remove threads. You either re implement libpth or we will loose the flexibility knxd has know. Now every part works more or less independent of each other. With a single event loop this will not be so easy. If pthreads is available on all target platform (I think so), we should use this. But this will change the architecture of knxd. Therefor we should not start this now. With enterBusmonitor: It not completely implemented on all interfaces. I think it's not even possible on all interfaces. Busmonitor can be ignored for now. |
Probable it's not easy to remove all threads. That's why it is my loooong term goal. But even if we only get rid of some of the threads, the transition to pthreads will be easier. And for me the code is easier to understand if there are less threads. About ignoring Busmonitor: Do you mean we can take the code out of the emi-Classes? We can leave it in but if it doesn't work and we cannot test it, it will be just in the way if we want to change anything on the code. |
It is possible, but in my opinion it in not a good idea. We should keep threads and keep the code modular. This gives tho most possibilities for future development. Removing threads is not a good goal. To me it seems you see knxd just as a knx stack to access the bus. But it can do more. Currently every class works on its own. You could create more than one backend instance for example. If you try to create one big event loop things will get much more complicated in future. Removing feature to develop some of the most used core features more easily is not the way to go.
No. Just keep it as it is. Knxd should be as compliant with the standard as possible. |
Hello Timo,
thanks for your reply. I do disagree about the benefits of having threads, but it looks like I'm the only one that does. So I'll leave them in. |
I want to clarify my opinion on this point again: I think pthsem is an organisational problem - yes!
|
With my Debian Developer hat on: Whatever you depend on needs to be in Debian as a dynamically linked library so I would vastly prefer something which is in Debian (and thus presumably more standard), already. With my CS hat on: Thread, done well, are almost always better than single-threaded. |
I also agree, threads are an integral part in getting a performant system However, there are some points for the single-process event-loop approach 2015-01-27 22:34 GMT+02:00 Richard Hartmann notifications@github.com:
Elias Karakoulakis |
I share Elias point of view, that synchronizing threads is difficult. And it is difficult to test. When I cancel knxd (with USB) with Strg-C sometimes I get a segmentation fault (something with double freeing memory I think) depending on which order the threads are destroyed. I tried to find the cause but I didn't succeed. Michael, can you clarify about the next steps:
|
I suggest to fully merge the code of pthsem into knxd and link statically. This way provides minimal risk and efforts to reach milestone version 1.0 any time soon capable to be pushed to Debian or other distributions. For >1.0 a new thread model may be developped. |
I do not think so. This just bloats the code and makes it more confusing. A better option is to add a option to autoconf/automake to download(or take a existing file) pthsem, build it and link (static) against this. This way it does not change the code base but packaging would be easier. |
Thats true but it's also true for your "one big loop" implementation. The big advantage of threads is you can develop each thread completely on its own. Without you have a lot of dependencies. In the end you will get a big state machine with a lot of queues... This is much worse then threads. knxd is not just a simple knx stack. There can happen more than one communication at a time. Your may have more than one physical interface plus more than one client. And then the single thread application gets complicated. So you either get a big complicated piece of code or you reimplement pth...
For threads we need a new architecture. Actually pthsem is your single thread select loop approach. Just in a nice library. And that is the problem now. pthsem is cooperative multitasking. Today there are no widely used cooperative multitasking libraries. So we should switch to preemptive multitasking. For this we need a new concept... Transformation of data should be done without threads. It possible this code should be reentrant. This is a big task and I thin this should be done by more than one person alone. You seem two step ahead with your plans. First we should comment and document the current code. Second we should think about restructuring parts of the code. (emi/cemi, packet/data transformation seem a bit scattered - could be nice encapsulated classes) The we can talk about how the new threading/locking should look... This is definitely after 1.0. Version 1.0 should be bcusdk-0.0.5 eibd with some fixes and cleanup. |
I agree with all of you - in most points. First of all: please see myself merely as a moderator, not claiming any lead here! Let me explain a little:
So, what are the options:
I prefer 2) or 3) - in this order. |
Disclaimer: I don't know the internals of knxd. |
(3) is something Debian really doesn't like. (2) should be possible, given that it's a drop-in; I just looked at the changes, seems reasonable. Also only five Debian packages require libpth, it should be easy to check that they still work. (1) No. Please don't. We have more important issues than to destabilize the code base and rewrite everything so that it fits into an event loop's mold. (4) We have more important problems than re-implement libpth and/or the code that uses it. |
On Wed, Mar 11, 2015 at 6:53 AM, Matthias Urlichs
Embedded libs make sure you will not get into Debian. Period.
I am asking the maintainer of pth about this. Hopefully, my other Richard |
Hi Yutaka, we want to get knxd[1], a replacement of eibd[2] into Debian. Both knxd and eibd depend on pthsem[1] a GNU pth fork which supports semaphores. pthsem is supposed to be a drop-in replacement of pth so it's not sure Would you be willing to look at replacing GNU pth with pthsem? You can find a discussion about this topic here[4]. In parallel, I am asking FTP masters if they would accept pthsem alongside pth. Richard [1] https://github.com/knxd/knxd |
On Wed, Mar 11, 2015 at 8:17 AM, Richard Hartmann richih@debian.org wrote:
Paul Wise suggested simply merging pthsem into pth. Of course, that Richard |
Theer's nothing much to merge from pth itself, as development basically stopped with 2.0.7. What's left are a bunch of patches by the distros, and to step up the version number to 2.1. I don't see the point of having pthsem alongside pth. It's suppoed to be upwards compatible after all. |
_-- This email is from Yutaka - GitHub's quoting system sucks a bit --_ On 03/11/2015 04:17 PM, Richard Hartmann wrote:
I maintain Pth in Debian, because of GnuPG and its friends. Pth has been unmaintained in the upstream any more. It seems pthsem In my opinion, Pth is mostly dead. I think that I will orphan Pth GnuPG "modern" 2.1 is now migrated to nPth. nPth is better software, I don't look at the code of pthsem and knxd, but if I were the assume cooperative threads._-- This email is from Yutaka - GitHub's quoting system sucks a bit --_ |
@RichiH Good point, I didn't know about nPth; we should investigate that option. |
@smurfix Just to make really sure, the email you are replying to is from Yutaka, the DD maintaining pth in Debian. I will poke GitHub about proper handling of email workflows... |
@RichiH Forgive me, I still don't understand two points here, maybe you can tell me:
|
@tru7 Debian had, in the past, a lot of problems with providing security fixes for libraries that are embedded in some other package and linked statically. Therefore current policy is that This Must Not Happen Again. Thus, there shall be One True Version of any library in Debian, and everybody else is supposed to link to it. Dynamically. Given that pthsem is (supposed to be) a drop-in replacement for pth, the easiest way forward is to rename libpthsem-2.0.8 to libpth-2.1.0 and upload that to Debian Experimental – short term. Long term, replace libpth with npth, as that has at least been maintained during the last decade or so. |
I understand all points, a third point is: not breaking eibd/knxd, so I ask all of you to find a good&resonable solution ! Edit: And I'm very happy to have Debian-Maintainers "on-bord" here!! |
To anybody wo does this (i.e. pthsem>npth replacement): Note that there are lots of semaphore up+down calls which are closely associated with the fill state of a queue. That should be encapsulated in the queue object, having these separate makes the code less readable and may hide a bug or two. |
I have looked at npth somewhat more closely. Basically it's a thin and mostly-undocumented layer to use pthreads as a pth replacement. The problem is that it doesn't implement most of what knxd needs. In particular, event chaining is not implemented – in fact, event anything is not implemented. knxd uses pth's events quite extensively. It's probably possible to teach npth that, but the amount of work is non-trivial. Not something that's done i a weekend. Or a week. It may be less effort to rewrite knxd to use pthreads directly. Fewer layers of software also tend to be easier to debug. On the other hand, there's no good way to do this incrementally. :.-/ |
Two reasons why libev is a better fit IMHO, at least right now.
The libev branch already uses libev's mainloop, with a pthsem back-end. I have also moved signal handling to libev because pthsem's signal handling lost signal events when both mainloops were running. I am confident that this approach will work. Layer3 is next. |
@smurfix, can you please give short instructions on how to build branch libev after cloning knxd? My libev directory stays empty. How about updating bootstrap.sh for it? |
.gitmodules had has the wrong URL. Already fixed. Putting it into bootstrap is a good idea, I'll do that. The current version, with a libev-ified L3, hits strange errors (randomly cleared pthsem event ring pointers …) on my system which I haven't figured out yet. Right now I'm also working on the unix/TCP socket handling code (which is way more work than I thought). I expect to have something useable on the 23th or so. |
Progress: Errors fixed. knxd runs successfully with TPUART (libev'd), unix/inet/systemd server (also libev), and multicast (phsem). Pthsem-using code doesn't always shutdown correctly, but that's a minor issue I'm not too concerned with. Converting multicast to libev is next on my agenda, then I'll need to refactor groupcache and management support. Both need to be disabled (which is the current default) if you build the libev branch. I'd appreciate help with the "other" back-end modules. The tpuart code should demonstrate what needs to be done. Note that it still uses its own local input buffer, which is no longer necessary (I added small classes for wrapping stream i/o) but I wanted to get knxd up and running on that interface without yet more refactoring. |
tpuart has a tpuarttcp backend which you can use to talk with a "remote" interface. Thus you can use |
Progress: multicast and the ip:/ipt: back-ends now run on libev. Of course I found a few bugs while I was at it; most notably there is some busmonitor code lying around in eibnetserver.cpp which causes all your monitor packets to get duplicated. It also (still) duplicated layer3's work of decrementing the hop counter. Somebody might want to backport these. The parts which I have tested are valgrind-clean. The tpuart back-end also has an important bugfix: if the hardware repeats a packet, knxd will now not also do that. Next on my agenda is fixing the group monitor and management code, which will hopefully make ETS5 work again. TODO: The tunneling code is untested (client as well as server). The tunnel client needs a flag to teach it not to do address mapping (i.e. it's talking to knxd instead of a dumb hardware interface). The other backends which still use pthsem need to be converted, which requires hardware that I don't have. Testers and/or co-authors wanted; the process of converting a back-end is reasonably straightforward. |
I never suceeded in compiling the libev branch. Shoud I report in this issue or create a separate one? |
Am 27. Dezember 2016 19:42:08 MEZ, schrieb Othmar Truniger <notifications@github.com>:
I never suceeded in compiling the libev branch. Shoud I report in this
issue or create a separate one?
for example: make[3]: *** No rule to make target 'tpuartcommon.cpp',
needed by 'tpuartcommon.o'. Stop.
--
You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub:
#29 (comment)
That's my fault, will be fixed tomorrow. In the meantime edit src/backends/Makefile.am and remove the 'common'.
--
-- Matthias Urlichs
|
… quick interrupt, as I added the capability to interpose filters between layer3 and a back-end. I did this in order to move the reverse address mapping from layer3 and/or the ipt: back-end to a filter, allowing us to use the ipt: driver to connect to either a hardware gateway which doesn't gate physical addresses, or a knxd running with -TS which can forward physically addressed packets. Future filters can easily do filter datagrams, translate group addresses (you want to interconnect a building where each floor uses the same address scheme … ugh), or whatever. Next will be to get the test script to work again. Then, the TODOs form the last update. |
Travis build fails because it cannot find libev. So it looks like this is a new requirement when pthsem is not used anymore?? |
Travis will show a broken build on the libev branch in any case, because the test script is still failing. |
TODO left:
The pei16s is marked as "very experimental" in configure.ac. While it's not that difficult to rewrite, the process is a lot of work. Thus, unless somebody shows up and tells me they're actually using this thing in production, I will drop it from the libev branch. |
Am 02.01.2017 um 20:01 schrieb Matthias Urlichs:
TODO left:
...
* bcu1 AKA pei16s driver
I vote for dropping it, I know noone actually using it, it never worked
"mature stable" anyway..
Michael
|
Dropped (for now). |
Current status: hardware interfaces besides tpuart (both serial-port and serial-over-TCP) work. Multicast router and tunnel, client and server, works. Group caching has been tested though it's not yet in the test script; ditto for eibread-cgi. Open issues:
|
The libev branch now works with USB interfaces (lightly tested). Lots of thanks to tru from Uster(CH) for suppyling me with an interface to test with. |
Next on the agenda to test is a strange doesn't-forward-tunnelled-packets bug (testcase exists). |
Fixed. Remaining problem is broken forwarding of phys-addressed packets (required for ETS5) which I'll look into tomorrow. As soon as that works again I'll push 0.12.0 to master. |
ETS5 works with USB interfaces in tunnel mode(!). You may have to use the "single" filter Router mode and tpuart support are still broken (but work for non-ETS use); will look at that tomorrow. Probably. |
Router mode works, ETS5 tested with tpuart (serial) and USB (CEMI) adapters. |
Pushed. |
Wait, so the whole dependency is gone? It's just too late to make it into Debian testing, I fear: https://wiki.debian.org/DebianStretch#Before_the_release |
I know, a week late. Sorry about that … |
No worries, glad it's done :)
On the positive side, I don't need to rush too much and can have it
hang and dry in unstable for some time.
|
I am starting to assess the amount of work needed to get rid of pthsem in linknx. I read this entire issue's thread and would like to thank all of you for the valuable research and investigation in moving away from pthsem. Special thank you to @smurfix for his invaluable work thus far! I am not an expert with respect to libev and GNU stuff in general, so reading from people who know what they are doing is a life-safer :) Since I am finally considering using libev in linknx three years after knxd, I was wondering if you had any feedback regarding this replacement. You definitely did it but do you still believe it was the best fit? Thanks all! |
Let me put it this way: there are reasonable ways to write state machines (you use libev or something else that's event-based) and there are reasonable ways to write multithreaded programs (you use posix threads). The latter is difficult in C because there's no good cancellation model except "add the read end of a pipe to your Frankly rewriting the code to use libev was necessary anyway because in a reasonably-designed program every procedure / method should do one thing well, not three mixed together. That's a good fit for event-based code. Yes the rewrite was a lot of work but I managed to do it somewhat incrementally, if you dig through the knxd git archive you'll find a libev-based dispatcher for pthsem. That helped a lot. |
Thanks @smurfix. I'll take a closer look at your changes to better understand how to proceed with libev. |
Getting rid of pthsem
Hello,
I want to help getting getting rid of pthsem so one day we can get knxd into Debian.
My long term goal is to get rid of all threads and have one single select-loop.
As mid term goal, I want to remove all threads that are "in the middle". I understand that threads can help with communication to the outside world (to the bus or to clients). But the layers in between should be doable without threads.
I have a created spreadsheet with all threads: http://nimkannon.de/Overview-threads.ods
The ones I could test have stacktraces for creation and shutdown.
The first thread I looked deeper into is EMI1Layer2Interface. (EMI2 and CEMI are similar.) This threads task is to read data from the LowLevelDriver and transform it L_Data_PDU / L_Busmonitor_PDU for the upper layer. This alone could be done synchronously without threads. But there are couple specialities where I have questions to it:
There is a difference between open() and enterBusmonitor():
Why is enterBusmonitor() sleeping after calling SendReset()?
Is there any public documentation to knx that could explain some of the questions?
Thanks Christian
The text was updated successfully, but these errors were encountered: