Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

comments on highlevel API and LCDproc work #2772

Open
haraldg opened this issue Jun 10, 2019 · 57 comments

Comments

Projects
None yet
3 participants
@haraldg
Copy link

commented Jun 10, 2019

Hi,

I just have looked at a few samples of the LCDproc code you wrote and in extension the highlevel API for the first time. And with "looked" I mean just looked: Hadn't time yet to setup a container where I can safely build elektra master and try a few things for real or check the benchmarks we defined at the start.

The first thing to notice is, that the code is very verbose. While tokens like ELEKTRA_TAG_LCDEXEC_ADDRESS are easy to understand on first glance, I find that they make the code harder to read if you have many lines of this.

Next you seem to love opaque data types. I'm aware of the benefits for keeping the ABI stable while things change under the surface. However the downside is, that this makes the code again harder to read in huge quantities and also bloats the resulting binaries a lot. (Each call to an external getter typically takes at least 3 instructions and invalidates the contents of all registers. From a green programming POV that's rather unfortunate.) The latter problem can be "fixed" by making the getter inlineable, but of course at the cost of all the benefits. I'd reconsider which data types are actually worth being opaque.

I see that you are providing your own versions of basic data types like long. What problem are you trying to solve with that? IMO it mostly makes the code more difficult to read.

In particular I dislike, that you also used elektra data types in LCDproc data structures not directly related to configuration.

That error handles are malloc()ed data structures is rather surprising. Especially because I remember we had a discussion about a year ago, where I advocated treating malloc() as fail safe and you disagreed. Now you are using malloc in a context where you clearly have to treat it as fail safe. There are only a few functions, that actually can cause errors, so this is a rather unimportant topic, but I find the choice rather unorthodox. Most people seem to try to do error handling on the stack.

Maybe this is already answered in some document, but I didn't find it quickly: When elektraGet() returns a string (or other pointer to a data structure), what's the scope where the object is valid? My guess is until elektraClose() is called, but it's not obvious. Also what's a good policy for keeping the Elektra handle around with regards to resource usage (and other considerations)?

I see that you sticked pretty close to the original structure to the code. I guess for some of your goals this makes sense, but OTOH I wonder if you can really show off elektra while sticking to the structure dictated by a foreign configuration API. Maybe you can share your thoughts on the topic. Also maybe it is interesting for the thesis to compare for one driver the difference between a straight translation and an idiomatic elektra implementation, if there is any.

Sorry if this sounds mostly negativ. I suppose it's natural, that, when looking at something very quickly, mostly the negative things stick out. I will try to soon find the time to explore your code in more detail.

HTH,
Harald

@markus2330

This comment has been minimized.

Copy link
Contributor

commented Jun 11, 2019

Thank you very much for your review!

The first thing to notice is, that the code is very verbose. While tokens like ELEKTRA_TAG_LCDEXEC_ADDRESS are easy to understand on first glance, I find that they make the code harder to read if you have many lines of this.

I think it is a good default to have names which are safe from collisions. Do you want for LCDproc LCDEXEC_ADDRESS instead?

Next you seem to love opaque data types.

Yes, I love them because we want to provide APIs with smooth upgrade experience. In the concrete case of getters we could make an exception (we already do this in the C++ code-generation). But I see this as an optimization which only should be done if we see that it is relevant for LCDproc. (@kodebach will benchmark this.) From my experience it is only relevant if the getters are called very often in tight loops. If the getters are only used a few time in the startup procedure it should not matter.

I see that you are providing your own versions of basic data types like long. What problem are you trying to solve with that?

That types like "int" have different sizes on different platforms but we need to know what we allow in the config file.

That error handles are malloc()ed data structures is rather surprising.

Yes, we had long discussions about this. @domhof was very much convinced that it is much better for the usability of the API. If you do not pass the error object, the API call will exit, which makes it very hard to use it wrongly. I am not sure if this is fully implemented now, though. (@kodebach?)

Now you are using malloc in a context where you clearly have to treat it as fail safe.

It is also that we call other mallocs when errors happen. The concept of the API would be fail safe: If the malloc for the error object fails, a null pointer will passed to the API and then Elektra will exit on a failure. But this is untested, though. Is it important that this works?

My guess is until elektraClose() is called, but it's not obvious.

Good point, the API docu must be very clear about that, I created #2774.

@haraldg

This comment has been minimized.

Copy link
Author

commented Jun 11, 2019

I think it is a good default to have names which are safe from collisions.

Sure, but as long as the compiler detects them, nothing really bad can happen.

Do you want for LCDproc LCDEXEC_ADDRESS instead?

LCDEXEC_ADDRESS or CONF_ADDRESS both seem reasonable to me.

But I see this as an optimization which only should be done if we see that it is relevant for LCDproc.

This was mostly a general remark. I don't thing lcdproc is affected any worse then others.

From my experience it is only relevant if the getters are called very often in tight loops. If the getters are only used a few time in the startup procedure it should not matter.

My experience is quite the opposite: In a tight loop, when everything is in cache, a few extra instructions hardly matter - even on the ARM CPUs I'm mostly using. It's loading code into the cache in the first place, that is the actually expensive operation. Of course for initialization code, that's not an issue either. In this context I'm only concerned about the size of the resulting binaries: Compiler optimization can do many things, but it can't work around a API.

Using opaque data structures costs about 20 bytes per member access over the transparent case. A RAM cost that has to be paid even if the access is in some rarely used error path and a flash cost that has to be paid even if the code isn't running at all. Multiply this with the number of accesses in a typical application and the number of systems where the application is running, and you easily get a rather big number. There are certainly cases where spending such resources for extra flexibility is reasonable. But in many other cases it's not and only happens out of habit.

I see that you are providing your own versions of basic data types like long. What problem are you trying to solve with that?
That types like "int" have different sizes on different platforms but we need to know what we allow in the config file.

Yes, "int" is bad for stored data. But AFAIK the other types like "long" don't have the same problem. And what's more important, there are already the types from "stdint.h" with explicit size, that are widely known. I'd recommend you use those. Inventing your own just makes the code harder to read for everybody.

If the malloc for the error object fails, a null pointer will passed to the API and then Elektra will exit on a failure. But this is untested, though. Is it important that this works?

Of course not, I still advocate to treat malloc() of small data structures fail safe after all. I wouldn't have commented on this, if it wasn't for the discussion last year.

@markus2330

This comment has been minimized.

Copy link
Contributor

commented Jun 11, 2019

Sure, but as long as the compiler detects them, nothing really bad can happen.

Yes, now you get quite good error messages even with macros. Some years ago the compiler error did not indicate how macros manipulated something: Then it sometimes needed a lot of time to find out that macros are involved.

This was mostly a general remark. I don't thing lcdproc is affected any worse then others.

There is a huge difference if the API is used for every config access compared to if an application copies the config data into structs. Affected will be mostly applications that do not copy. And not copying is actually recommended if you want on-the-fly updates of config.

Compiler optimization can do many things, but it can't work around a API.

Exactly, so I wonder why your experience is contrary? If our getter would return some member of a struct and can be inlined, there would be no API cost. If you only pay the API cost in a constant number of cases it is very different to applications where an unbounded number of API calls can be made (which do not copy).

Yes, "int" is bad for stored data. But AFAIK the other types like "long" don't have the same problem. And what's more important, there are already the types from "stdint.h" with explicit size, that are widely known. I'd recommend you use those. Inventing your own just makes the code harder to read for everybody.

For C99 or if stdint.h is available, we typedef to them. So we could use these fixed-sized-types if you only support systems that have C99 or stdint.h.

@kodebach kodebach referenced this issue Jun 11, 2019

Open

Meta-issue: Highlevel Code-generation #2680

9 of 24 tasks complete
@kodebach

This comment has been minimized.

Copy link
Contributor

commented Jun 11, 2019

I hope, I addressed everything...

Things that can actually be changed

(without redesigning large parts of the API)

While tokens like ELEKTRA_TAG_LCDEXEC_ADDRESS are easy to understand on first glance, I find that they make the code harder to read if you have many lines of this.

I think it is a good default to have names which are safe from collisions.

Sure, but as long as the compiler detects them, nothing really bad can happen.

I agree, the current tags are very long. Sadly since C doesn't support any kind of namespaces, it is hard to avoid long names and at the same time avoid name collisions in general. I will add some options to manipulate the generated tags. Currently they are created using the prefix ELEKTRA_TAG_ and then an adapted version of the key name. So ELEKTRA_TAG_LCDEXEC_ADDRESS stands for the key lcdexec/address.

You also don't have to use the tags at all. If you take a look at how the tags are resolved you can will find multiple alternatives that resolve to the same inline function. Some of these alternative might be shorter. It should also be fairly safe to directly use the inline functions (e.g. elektraGetLcdexecAddress).

In particular I dislike, that you also used elektra data types in LCDproc data structures not directly related to configuration.

We could probably use the equivalent stdint.h types. Although it is not necessarily 100% compatible, if you use a compiler that is not C99 compliant. (see below for details)

Maybe this is already answered in some document, but I didn't find it quickly: When elektraGet() returns a string (or other pointer to a data structure), what's the scope where the object is valid? My guess is until elektraClose() is called, but it's not obvious. Also what's a good policy for keeping the Elektra handle around with regards to resource usage (and other considerations)?

This is actually quite complicated. See my answer in #2774.

Concerning the API design

Next you seem to love opaque data types.

I am also not a fan of how Elektra uses structs. However, I did not have any influence on this decision. I did try to minimize the performance (and binary size) impact of the code-generation API. That is why all of the generated accessor functions are static inline. AFAIK that way the compiler should inline whenever possible and since they are static all the unused ones should be stripped out.

In the concrete case of getters we could make an exception (we already do this in the C++ code-generation). But I see this as an optimization which only should be done if we see that it is relevant for LCDproc. (@kodebach will benchmark this.)

What exactly do you mean? Please clarify what exactly you mean by "making an exception".

I see that you are providing your own versions of basic data types like long. What problem are you trying to solve with that? IMO it mostly makes the code more difficult to read.

I absolutely agree. AFAIK the problem we are trying to solve is that C doesn't define fixed sizes for it's data types. The C standard only defines lower limits.

Why such verbose names based on the CORBA types are used, I don't know. This decision was before my time and I have absolutely no idea, how this conclusion was reached. I get that by using our own types we retain some C89 compatibility (although I am fairly certain we are not strictly speaking C89 compatible, the code-generated API certainly isn't), but that is not a good enough argument against using stdint.h IMHO.

However: As stated above, if C99+ is used all of the kdb_ types fallback to the equivalent stdint.h versions anyway.

if you only support systems that have C99

The code-generation API more or less requires C99. We use things like for (int i = 0; ...) { ... } so we are definitely not C89 compliant (GNU89 might work).

That error handles are malloc()ed data structures is rather surprising.

Error handling was also decided before my time. I wouldn't have done it this way either. However, I am definitely of the opinion that in most cases we can treat malloc() as fail proof. Also handling errors on the stack is difficult to impossible here, since there could be many warnings. To handle everything on the stack, we would need to allocate space for the maximum number of warnings (1000 I think) on the stack.

There is also currently a major refactoring of the low-level error handling under way, so some of the high-level error handling is going to change. This is also why currently the only publicly accessible property of errors is the description.

Misconceptions

If you do not pass the error object, the API call will exit, which makes it very hard to use it wrongly. I am not sure if this is fully implemented now, though. (@kodebach?)

You don't have to pass an error object. You only pass an ElektraError **. We cause a fatal error, if this pointer is NULL. Fatal errors by default call exit(). In pure C this is probably also the only way to handle fatal errors. However, if the API is used from e.g. C++ the fatal error handler can also just throw an exception.

There is a huge difference if the API is used for every config access compared to if an application copies the config data into structs.

Yes, not copying is far more wasteful. (see below)

And not copying is actually recommended if you want on-the-fly updates of config.

On-the-fly updates (as in the notification API) are not supported in any way by the highlevel-API. Currently we only update the internal KeySet in elektraOpen, elektraEnsure (will be merged into elektraOpen) and elektraSet* calls (which makes elektraSet quite expensive). We might support on-the-fly-updates in future, but I would strongly advise against calling elektraGet* every time you need your config value.

I actually recommend calling elektraGet* only once for every config value, unless you are pretty sure it will return a new value. This is because of the way Elektras type system works. Every elektraGet* call converts the config value from a string.

Optimization Problems

Sadly my experience is that Elektra is very badly optimized and that if you really care about performance, you shouldn't use it (yet). IMO Elektra uses malloc() far to much and a lot of operations that seem to be cheap in fact aren't (°). I get that kdb* are expensive operations (kdbOpen is still more expensive than you might think), but also a lot of the Key manipulation stuff is costly. A good example is keySetMeta. It seems that should be fairly straight forward. If you know that metadata is stored in an internal KeySet (°°), you would think that keySetMeta is more or less the same as keyNew + ksAppendKey. However, there is a ksLookup and sometimes a elektraStrDup (°°°) in there.

RAM usage is also not a strong suit of Elektra. For example each Key stores the full key name twice. Once in escaped and once in unescaped form. This also makes the mmap-cache files a lot bigger than one would expect.


(°) This probably due to the fact that many people have worked on Elektra. Most of which probably weren't too concerned with performance.

(°°) Having a hidden KeySet inside every Key is another thing I don't like. Most of the KeySet and Key functionality is not necessary to store metadata.

(°°°) I guess we have to have elektraStrDup and elektraStrNDup to make sure they use elektraMalloc (also not a fan of that), but I am pretty sure by not using strdup or strndup we break some compiler optimizations. Also why elektraStrLen is a thing, I will never understand. It says it is unicode and multibyte safe, but I don't see how. Clearly it wouldn't report the correct length for UTF-16 encoded ASCII.

@haraldg

This comment has been minimized.

Copy link
Author

commented Jun 12, 2019

Looks like github ate my reply via e-mail yesterday. Pasting it here again:

Thanks for your instructions. I actually compiled elektra and lcdproc
this evening. Here are my results, I will answer the conversations tomorrow.

After compiling I compared binary sizes with and without your modifications
(commits f0cb7bb1 vs. 7973efc3) and noticed that all binaries increased
a lot. For example:

harald@debian:~$ ls -l lcdproc*/clients/lcdexec/lcdexec
33852 Jun 11 17:39 lcdproc-0.5base/clients/lcdexec/lcdexec
73520 Jun 11 17:56 lcdproc/clients/lcdexec/lcdexec

In some ways a disappointing result. Especially considering that the
base version has a statically linked configuration parser while the
new version is dynamically linked against libelektra.

The main cause seems to be the code inside elektragen.c, but most other
object files have increased in size too.

I also tried to take some timings, but any client segfaults immediatly.
This is on buster armhf. (kdb ls / works fine.)

I haven't looked into it yet, but I guess the segfault will be fairly
easy to fix. The size issue OTOH bothers me a lot.

Here is what gdb says about the segfault:

Reading symbols from lcdproc/clients/lcdproc/lcdproc...(no debugging symbols found)...done.
(gdb) r
Starting program: /home/harald/lcdproc/clients/lcdproc/lcdproc
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0xb6745ed0 in getPluginPlacementList (plugin=0x0)
    at /home/harald/libelektra/src/plugins/list/list.c:496
496             Key * pluginInfo = keyNew ("system/elektra/modules/", KEY_END);
@kodebach

This comment has been minimized.

Copy link
Contributor

commented Jun 12, 2019

Looks like github ate my reply via e-mail yesterday.

It ended up over here #2748 (comment).

all binaries increased a lot.

A part of that is the specification that is compiled into the binary. However there are still some redundancies. I will look into it.

But if you want to keep binary size small, you will always be better of with your own specialised solution. Actually, it is quite pointless to compare the binary sizes alone, since you need to have the Elektra libraries installed as well. Those alone are bigger than the old LCDproc clients. You just have to make a decision here: Do you want all the features of Elektra or do you want small binaries?

@markus2330

This comment has been minimized.

Copy link
Contributor

commented Jun 12, 2019

prefix ELEKTRA_TAG

Actually, we only need a user-chosen prefix, and in LCDproc we would choose to not have a prefix.

What exactly do you mean? Please clarify what exactly you mean by "making an exception".

We could make an exception that elektraGet does not make an API call.

Why such verbose names based on the CORBA types are used, I don't know. This decision was before my time and I have absolutely no idea, how this conclusion was reached.

It is documented in #1324.

This is because of the way Elektras type system works. Every elektraGet* call converts the config value from a string.

This is because of how the implementation works, not because of the type system. The C++ generated code already shows that it could be different. But I am afraid this would be a reimplementation of elektraGet and some parts of code generation.

The optimization is not so trivial (especially in C, where you would need to generate structs).

A part of that is the specification that is compiled into the binary. However there are still some redundancies. I will look into it.

Maybe we should reconsider compiling-in the spec if the binary size is so important. (And simply fail if there is no spec)?

Or have a compile switch to compile it in?

@haraldg What do you think?

You just have to make a decision here: Do you want all the features of Elektra or do you want small binaries?

Most features of Elektra should not affect the binary size of the applications at all. The compiled-in spec, which provides the feature that applications also start without being installed, is an exception and not the rule.

@kodebach

This comment has been minimized.

Copy link
Contributor

commented Jun 12, 2019

This is because of the way Elektras type system works. Every elektraGet* call converts the config value from a string.

This is because of how the implementation works, not because of the type system.

The type system (as in which types are available) is irrelevant yes. But "the way Elektras type system works" is not. What I mean is that the core of Elektra doesn't have a type system. Everything is a string. Why have type checking via the type plugin. But we only store strings, because a Key has no notion of a type.

The optimization is not so trivial (especially in C, where you would need to generate structs).

We would just need to store the converted value inside the Key struct after type checking. But there is no way to do that right now. Not even via metadata, which IMO would be the wrong solution anyway.

Maybe we should reconsider compiling-in the spec if the binary size is so important. (And simply fail if there is no spec)? Or have a compile switch to compile it in?

I will add an option to the code-generation to disable it. But at some point we have to define what our goal is. Elektra simply cannot fulfill all use cases. Binary size, RAM usage and performance are all metrics where a specialized solution limited to the exact needs of an application will always outperform a general solution like Elektra.

Another thing that might increase binary size is the fact that we generate setters for everything as well. I will add an option to disable setters, in case your compiler doesn't strip out these unused functions.

@markus2330

This comment has been minimized.

Copy link
Contributor

commented Jun 12, 2019

The type system (as in which types are available) is irrelevant yes. But "the way Elektras type system works" is not. What I mean is that the core of Elektra doesn't have a type system. Everything is a string. Why have type checking via the type plugin. But we only store strings, because a Key has no notion of a type.

Yes, this is how your implementation works. But you could already transform earlier and only return the value in elektraGet.

We would just need to store the converted value inside the Key struct after type checking. But there is no way to do that right now. Not even via metadata, which IMO would be the wrong solution anyway.

If you need a ksLookup you are too slow anyway. If you want it really fast, you need to generate a struct, fill it up in elektraOpen, and in elektraGet is a static inline function which returns the member of the struct. (Then you trivially also have all the guarantees.)

I will add an option to the code-generation to disable it.

@kodebach Better wait for the answer of Harald. And first we need to find out what the causes of the big binary is, maybe it is not only the spec.

I actually like your idea of compiled-in spec with specload very much. Thus I was also thinking that we could compress the spec but this would be way too much for your thesis now.

Another thing that might increase binary size is the fact that we generate setters for everything as well. I will add an option to disable setters, in case your compiler doesn't strip out these unused functions.

@kodebach Were you able to reproduce the large binary size? Please always first try to find the causes before you optimize.

@haraldg

This comment has been minimized.

Copy link
Author

commented Jun 12, 2019

Hi, since I'm late to the party and you already discussed a lot, I won't respond to everything again. If you feel you miss a reply on something from me, please raise the issue again.

Where does the size increase come from:

Yes, strings are a big part. However the text section is much bigger then the ro-data section. I guess pushing all the arguments to the many calls of keyNew() onto the stack actually takes more space then the arguments themselvs.

I just used the default CFLAGS of lcdproc, not sure which optimization they choose - maybe none. However I don't think that setters or other unsused code are an issue: The sizes of most objects only grow moderately - probably just due to extra function arguments and other minor changes.

I manually compiled elektragen.c with -O optimization and the resulting object file is a bit smaller, but nothing dramatic. I guess more experiments can be done, but this doesn't seem like a solution on its own.

On metrics and optimization

We don't actually optimize lcdproc for anything, but try to not pointlessly use resources. There definitively isn't a strong size limit. However I do care about supporting embedded systems, after all I'm the OpenWRT package maintainer of elektra.

It's true that the size increase is still small compared to the size of elektra itself. It probably would be insane to use elektra in lcdproc, if it is the only user on the typical system where lcdproc is installed. I'm gambling on elektra eventually taking off and being installed widely anyway, so the size of elektra it self doesn't count directly. However, if the trend of installed size of applications almost doubling continues and elektra does take off, I will need a new laptop just because of that. I don't like it.

Also I feel like the size of elektra is justified, because it provides many nice features. But a huge size increase in application binaries is less justified: Right now we need twice the size in lcdproc to do pretty much what we did before. (Yes, we don't care much about the specification part of elektra.)

The metrics I use are: installed size, binary size, memory footprint, (cache lines periodically filled), (instructions executed.) The last two are not relevant here, because elektra related code won't be execuated repeatedly.

Of the other three binary size is kind of the least important: Not regularly used mmaped binary sections will just be dropped from RAM and not use resources permanently. binary size is mostly useful because it is very easy to measure and compare. installed size obviously takes a space on the flash chip and any memory that is not mmaped from the file system (ie malloc()ed data structures) has to be kept in RAM (unless you have swap space, which most systems don't) forever.

Personally, I mostly spend optimization effors on cache related issues. But aside from general API considerations, that's not relevant for elektra.

Possible improvements and workarounds

I'm lacking much of the context of design decisions behind elektra, so I probably can't contribute much to this part of the discussion. A few ideas, that float in my mind:

Some time ago Markus and I discussed the possibility to use code generation with some kind of embedded mode: On embedded systems typically nobody is doing any configuration changes at all. All the (help) strings targetted for users and many of the checks aren't useful. Maybe this goes into the same direction as the "compile switch" suggested above?

Maybe the specification can be stored in a more efficient format then construct it programmatically. (I hear you already complain about ABI stability issues, but it is worth considering at least.)

Actually, this whole business of storing the specification in the binary (which binary even in case of modular applications?) and then printing it on stdout on demand seems rather roundabout to me. You could store it in an elf library and dynamically link it (and dl_open() it from kdb) or even store it in some other suitable and efficient format and load it in the elf constructor. (Assuming there actually is a reason not to just load it during elektraOpen(), which probably is what I'd have done.)

Ultimately, I don't have a clear picture of the tradeoffs, so I can't make serious suggestions. It's something that only can be developed during discussions.

@kodebach

This comment has been minimized.

Copy link
Contributor

commented Jun 12, 2019

Answers for @markus2330

But you could already transform earlier and only return the value in elektraGet.

And where would the value be stored between transformation and elektraGet? Also when would the transformation happen? Would you transform all keys in elektraOpen? That would be pretty wasteful, if only a small amount of keys are accessed. We could of course transform during the first get and then only return the value. But then there is still the question of how store the transformed values. If we modify the Keys in the internal KeySet, we have to "unmodify" them for every elektraSet otherwise kdbSet doesn't work correctly. So we would have to have a separate kind of cache KeySet, but then we need multiple lookups during elektraGet and we waste RAM. So in the end the current solution is probably the best we can do, without modifying the Key struct.

If you need a ksLookup you are too slow anyway.

If ksLookup is too slow, all of Elektra is too slow. ksLookup is one of the most optimized operations of Elektra.

If you want it really fast, you need to generate a struct, fill it up in elektraOpen, and in elektraGet is a static inline function which returns the member of the struct. (Then you trivially also have all the guarantees.)

You can generate a struct with the code-generation API already. In fact this is widely used across LCDproc already. (It is a bit different, because there is special elektraGet/elektraFillStruct that returns the whole struct.) We could hide such a struct internally, but that would mean the high-level API doesn't work without code-generation anymore.

Were you able to reproduce the large binary size? Please always first try to find the causes before you optimize.

An option to disable generation of setters may also be useful for other things, e.g. if you intentionally want to prohibit the application from setting keys. Of course just disabling the generation doesn't actually prohibit the application from setting values (there are other ways), but I reduces the likelihood of mistakes.

Disabling the generation of the specload stuff could also make sense, if used properly. See below for an example.

Answers for @haraldg

I just used the default CFLAGS of lcdproc, not sure which optimization they choose

AFAIK if debugging is not enabled, -O3 is used. Which according to this article may actually increase memory usage in favor of execution speed. To optimize for code size -Os should be used.

I manually compiled elektragen.c with -O optimization and the resulting object file is a bit smaller, but nothing dramatic.

I am certainly no compiler or optimization expert, but AFAIK object files will always contain all of the code. Only when linking the executable unused functions may be removed.

Maybe the specification can be stored in a more efficient format then construct it programmatically.

More efficient storage formats will result in a loss of execution speed.

Actually, this whole business of storing the specification in the binary and then printing it on stdout on demand seems rather roundabout to me.

The idea was that the specification is embedded into the executable the uses it. That way the specification is always exactly what the executable expects. It cannot be modified without also recompiling the executable (or some intense hacking).

You could store it in an elf library and dynamically link it

I thought about that, but there are two downsides:

  1. all applications have to ship an executable and a shared object library
  2. the application and the library might be modified separately and become out of sync

Elektra also has a static version, in case dl_open isn't available or is discouraged for some other reason. Using a .so wouldn't be compatible with this version.

which binary even in case of modular applications?

Currently the code-generation API expects exactly one executable per specification, so the question is trivial. However, there is nothing stopping you from not using the generated doSpecloadCheck (can be renamed) function. You have to mount the specification using specload manually. In theory you could easily mount a different application or even just a simple bash script that prints the specification to stdout. If you store the specification as a gzipped Elektra quickdump file /somewhere/spec.eqd.gz you could even mount it via

sudo kdb mount -R noresolver spec.eqd "spec/sw/lcdproc/lcdproc/#0/current" specload "app=/usr/bin/zcat" "app/args=#0" "app/args/#0=/somewhere/spec.eqd.gz"

or even store it in some other suitable and efficient format and load it in the elf constructor

Not sure how the whole ELF constructor stuff works. That is beyond my knowledge of C.

Assuming there actually is a reason not to just load it during elektraOpen(), which probably is what I'd have done.

One thing you need to take into account, is that the specification has to be accessible to kdb as well as the user application. Otherwise kdb set cannot run validation.

My Observations

I did some tests too and while I didn't use an ARM chip, I could reproduce similar size increases on x86-64. In my case the increases where not as extreme but still ~1.8x for lcdexec.

I then simply commented out all the keyNew lines (obviously the executable won't run correctly, but the size can be compared), and the size increase went down to just ~1.3x. Both the .text and the .rodata sections are significantly affected by these lines. I will think about more efficient ways of storing the specification.

If we then use the -flto and -fuse-linker-plugin to enable dead code elimination during linking, the size increase reduces to merely ~ 1.02x. (Note: the old version is basically not affected by these options; < 0.5% size difference). So we should probably enable these options by default (for non-debug builds).

All of these tests where done with lcdexec. For lcdvc we go from ~ 1.18x to ~1.02x. For lcdproc this only gets the difference from ~1.6x down to ~ 1.25x (here -flto -fuse-linker-plugin had more effect on the old version in addition to some inline preventing pointer stuff).

TL;DR I think the size increases can be mitigated via compiler options and some changes to how the spec is stored. Sadly it seems, the binary size will still increase, despite not using a statically linked config parser.

@markus2330

This comment has been minimized.

Copy link
Contributor

commented Jun 13, 2019

@haraldg wrote:

please raise the issue again.

If it is required that the binary must be able to start without installation is not yet answered. I started an issue for prioritization of features in #2779.

However, if the trend of installed size of applications almost doubling continues and elektra does take off, I will need a new laptop just because of that. I don't like it.

I also do not like it. It was surprising for me that it does. @kodebach I am very much interested in your results how the low-level API compares with the high-level API.

Maybe this goes into the same direction as the "compile switch" suggested above?

Yes, it goes in this direction (especially @kodebach's suggestion to not generate setters). If you are not against such "compile switches" we can discuss further possibilities to reduce the binary size.

Actually, this whole business of storing the specification in the binary (which binary even in case of modular applications?) and then printing it on stdout on demand seems rather roundabout to me. You could store it in an elf library and dynamically link it (and dl_open() it from kdb) or even store it in some other suitable and efficient format and load it in the elf constructor. (Assuming there actually is a reason not to just load it during elektraOpen(), which probably is what I'd have done.)

Of course we also support to load it on elektraOpen() but then it only works if the application is already installed (the spec needs to be installed and mounted). To have it part of the binary always works, which is very handy for development (you can run the binary from the build directory). In a further away future, something like "strip" might remove the spec during installation (as for installed binaries it might not be needed). To write such a "strip" utility, however, will also be out-of-scope for @kodebach.

@kodebach wrote:

we have to "unmodify" them for every elektraSet otherwise kdbSet doesn't work correctly.

Of course, elektraSet would be more expensive. It always needs to update the KeySet and the cache (struct).

If ksLookup is too slow, all of Elektra is too slow. ksLookup is one of the most optimized operations of Elektra.

If you consider code-generated APIs to be also part of Elektra, then these APIs are not too slow. They can be as fast as access to variables.

elektraFillStruct that returns the whole struct

Ahh, ok, now I better understand what you wanted to do. So basically our ideas are the same: I would only hide the structs from the users so that they cannot do changes in the struct (which would destroy synchronization to the KeySet).

One thing you need to take into account, is that the specification has to be accessible to kdb as well as the user application. Otherwise kdb set cannot run validation.

And we would also loose introspection, access to defaults, ... and everything else which makes the life of admins easier. Validation is only a small part of this.

@kodebach

This comment has been minimized.

Copy link
Contributor

commented Jun 13, 2019

Of course we also support to load it on elektraOpen() but then it only works if the application is already installed (the spec needs to be installed and mounted). To have it part of the binary always works, which is very handy for development (you can run the binary from the build directory).

There seems to be a huge misconception here. At the moment applications using the code-generation API absolutely and without exception require the execution of a kdb mount call before running the application. Otherwise the application should work to some extent, but it will not have all the functionality. The reason for this is simple. We don't have a way do "inject" stuff into kdbGet in a way that it is passed to plugins.

@haraldg

This comment has been minimized.

Copy link
Author

commented Jun 13, 2019

I am certainly no compiler or optimization expert, but AFAIK object files will always contain all of the code. Only when linking the executable unused functions may be removed.

IIRC when using at least -O the compiler drops unused static symbols. The linker might drop external symbols depending on flags.

More efficient storage formats will result in a loss of execution speed.

Of course there are tradeoffs. However this just isn't true in general.

Assuming there actually is a reason not to just load it during elektraOpen(), which probably is what I'd have done.

I have been sloppy. What I meant was something like: Read it during loadConfiguration() from an external file.

One thing you need to take into account, is that the specification has to be accessible to kdb as well as the user application.

Yes, which is one reason why this spec-is-in-the-binary thing is roundabout. Having the spec in a separate file would make things much simpler. And probably also support modular applications nicely.

the application and the [spec] might be modified separately and become out of sync

Should be easy to prevent accidents by generating unique filenames. You could even use checksums to mildly guard against tempering, but I don't recommend it.

If we then use the -flto and -fuse-linker-plugin to enable dead code elimination during linking, the size increase reduces to merely ~ 1.02x. (Note: the old version is basically not affected by these options; < 0.5% size difference). So we should probably enable these options by default (for non-debug builds).

That's interessting - I should check how much size the old config parser takes, to get a better idea where we are standing. However keep in mind, that this are gcc specific optimization features and also (not a concern for lcdproc at all) using these optimizations might not be feasable for linking huge applications. (I don't know anything about this topic, but doing optimzations globally instead of per file seems like something, that might scale badly.)

If it is required that the binary must be able to start without installation is not yet answered.

Ultimately, that will be nice to have. But I suggest we first focus how things should be on production systems and the general ABI for spec access. Development features and lcdproc specifics should come later.

If you are not against such "compile switches" we can discuss further possibilities to reduce the binary size.

Yes, ideally we would select with configure if this is a build for debugging, production, embedded targets, etc. configure would set things up for the code generator to do the right thing. I don't have a strong opinion on the details. I think we agree on the general idea?

Of course we also support to load it on elektraOpen() but then it only works if the application is already installed (the spec needs to be installed and mounted). To have it part of the binary always works, which is very handy for development (you can run the binary from the build directory).

You could also just embed a fallback filename instead of the whole spec inside the binary.

In a further away future, something like "strip" might remove the spec during installation (as for installed binaries it might not be needed).

This seems impractical. Much easier to do the right thing during code generation.

To explore possible approaches a bit and in the interest of me getting a clear picture about the tradeoffs: Do I understand this correctly, that, if we don't have the spec available, the application will (or at least could) still run, but will crash (via fatal error callback) when we access a value that is missing in the configuration? This behaviour probably is good for unattended embedded systems. kdb would be used to merge default configuration and custom configuration during building the firmware and validate everyting.

Other systems probably don't care much about validation, but still would like to have defaults for missing configruation items. Those could quite easily supported by a variant of the elektraGet API: Something like elektraGetLongDefault("key", default_value) would in many cases take just one extra instruction to pass default_value - ie just the minimal space needed to store the value anyway. I don't know if this would return the default value or crash on conversation errors, but let's not get distracted by details yet. At code generation time we would select between this and full-spec-mode. This is probably what I would use for OpenWRT LCDproc packages.

For non-embedded systems we clearly want the full spec around. If parsing the spec from a space efficient format is deemed too expensive, you could cache a validated version and only revalidate if timestamps change. But this is not relevant at the moment other then to point out that we don't have to take bad compromises between size and speed.

@kodebach

This comment has been minimized.

Copy link
Contributor

commented Jun 13, 2019

Do I understand this correctly, that, if we don't have the spec available, the application will (or at least could) still run, but will crash (via fatal error callback) when we access a value that is missing in the configuration?

There are two parts to this. The KeySet with the default values passed to elektraOpen and the full specification that has to be mounted in some way. The defaults passed to elektraOpen prevent calls to elektraGet from failing if a Key is missing. They do however not prevent failure because of type conversion problems. To prevent those failures the spec has to be mounted. With a spec mounted elektraGet calls should never fail. If there are problems elektraOpen would fail instead.

Something like elektraGetLongDefault("key", default_value) would in many cases take just one extra instruction to pass default_value - ie just the minimal space needed to store the value anyway.

Like stated above, you can pass default values to elektraOpen. When using code-generation this KeySet is generated and your defaults will always match your specification. There might have been other reasons as well why this API is not used, but it was decided before my time.


If I understand all the concerns correctly, everything should be more or less covered by these three modes then:

  1. Embedded spec and defaults. Similar to what we have now. Embedding the spec ensures it cannot become out of sync with the application and allows single file applications.
  2. Spec in external file, no defaults. No KeySets in binary ensure minimal binary size. Application will only run, if spec is mounted. loadConfiguration checks that spec is mounted. The external file can also store the KeySet more efficiently. kdb gen will produce a quickdump file. How that is mounted is up to the user (directly, via cat with specload, gzipped via gzip with specload, etc.).
  3. No spec, no defaults. Like 2, but loadConfiguration doesn't check for spec. Therefore the application will run, as long as the configuration is correct. Intended for use-cases where the configuration has been pre-validated and will not change.
@markus2330

This comment has been minimized.

Copy link
Contributor

commented Jun 14, 2019

@haraldg wrote:

Yes, which is one reason why this spec-is-in-the-binary thing is roundabout. Having the spec in a separate file would make things much simpler. And probably also support modular applications nicely.

Elektra is designed for external specifications and this is also the normal way in Elektra. In the discussions about -c one outcome for me was that developers want to execute LCDproc without having it installed, which is not possible if the specifications are external. Thus @kodebach invented a way that the specification can be internal, and is still usable externally (specload). Of course we do not need to use this by default or at all, it is just one further possibility Elektra provides. (Thanks to @kodebach)

Should be easy to prevent accidents by generating unique filenames. You could even use checksums to mildly guard against tempering, but I don't recommend it.

We also support signatures of config files. But the built-in spec is the most easy and safe way to never have an out-of-sync spec. Of course it increases the binary size (more than we thought it would).

I should check how much size the old config parser takes

I am also interested. Unfortunately, the YAML parser is quite big and needs an external library (yamlcpp). This is also a further decision for you to make (which config file format you want by default).

Ultimately, that will be nice to have. But I suggest we first focus how things should be on production systems and the general ABI for spec access. Development features and lcdproc specifics should come later.

Yes, you have a better feeling of what LCDproc people want.

Yes, ideally we would select with configure if this is a build for debugging, production, embedded targets, etc. configure would set things up for the code generator to do the right thing. I don't have a strong opinion on the details. I think we agree on the general idea?

Ok, should the option rather be called --works-as-standalone or --minimize-binary-size? Or asked differently: do you have plans to reuse the option also for some other optimizations?

You could also just embed a fallback filename instead of the whole spec inside the binary.

This does not help if it is not installed. And if the spec is installed, it works anyway.

Do I understand this correctly, that, if we don't have the spec available, the application will (or at least could) still run, but will crash (via fatal error callback) when we access a value that is missing in the configuration?

Exactly. It would have no defaults, no validation, no description of the config, ....

Those could quite easily supported by a variant of the elektraGet API: Something like elektraGetLongDefault("key", default_value)

We should consider here:

  1. the API already has many variants, this would again duplicate 1/3 of the API
  2. this API would be discouraged to be used in non-code generation cases as documented defaults can be different than the used defaults or even worse: the same config option could have different defaults in different code paths.

@kodebach wrote:

There might have been other reasons as well why this API is not used, but it was decided before my time.

See above.

If I understand all the concerns correctly, everything should be more or less covered by these three modes then:

@haraldg can you please prioritize these three modes?

How that is mounted is up to the user

I assume for LCDproc we should prepare some installation scripts?

Like 2, but loadConfiguration doesn't check for spec.

What about some run-time option to tell the application it should not fail for mistakes in the spec (like no spec)? Then we would have only two compile modes (either spec is included or not).

@kodebach Did you already try to compress the spec?

@kodebach

This comment has been minimized.

Copy link
Contributor

commented Jun 14, 2019

In the discussions about -c one outcome for me was that developers want to execute LCDproc without having it installed, which is not possible if the specifications are external. Thus @kodebach invented a way that the specification can be internal, and is still usable externally (specload).

The goal of specload is to link executable versions to specifications. Mounting is still required. It does not in anyway affect whether an application can be run immediately after compiling or not.

What you can do however is compile the application and mount this local executable via specload (°). Then you can run the local executable. Now you can simply recompile and run again (without another mounting procedure) and will still have the correct spec (even if you changed it before compiling).

(°) If you have an installed version of your executable, you need to unmount its spec and then mount with specload. Alternatively you can manually change the configuration of the specload plugin for your mountpoint to use a different executable path. Sadly there is no kdb tool to change the configuration of a mounted plugin.

Exactly. It would have no defaults, no validation, no description of the config, ....

Please don't post contradicting information. As I stated above, the defaults can be given separately from the spec.

@haraldg can you please prioritize these three modes?

Each of the 3 modes serves an entirely different purpose and the implementation of these modes is not really complicated so prioritization isn't really necessary. Instead it would be good to know, if these modes cover everything or if we might need additional modes.

@kodebach Did you already try to compress the spec?

First of all: Right now we use separate KeySets for defaults and spec. I will definitely make the changes necessary, so that we can use the same one for both. This should reduce the binary size quite a bit already. Beyond that there are a few possibilities.

If we want to construct the embedded spec using the C API (ksNew, keyNew), it seems there isn't much that we can do easily. As far as I few short experiments showed, using a different API (no varargs, many functions calls; different keyNew API etc.) won't help much. (Disregarding how complicated the implementation would be) keyCopyMeta would probably not help either, as the identical strings should be deduplicated by the compiler anyway. (at least for binary size, it would of course reduce RAM usage).

Another option would be to embed a single string (or byte array) that encodes whole KeySet and use a plugin to use it. Since we have to send the KeySet in quickdump format for specload, the quickdump format would offer itself for embedding. However, right now quickdump seems to be slightly larger than C code (in the case of lcdexec at least). See #2788 for more.

For non-embedded specs you can simply take a quickdump file compress it with gzip and mount the resulting file /path/x.eqd.gz with:

sudo kdb mount -R noresolver spec.eqd "spec/mountpoint" specload "app=/usr/bin/zcat" "app/args=#0" "app/args/#0=/path/x.eqd.gz" 

You could also use any other method of compressing the quickdump data as long as there is an executable that can print the uncompressed file to stdout.

@markus2330

This comment has been minimized.

Copy link
Contributor

commented Jun 14, 2019

Please don't post contradicting information.

I could not find any documentation about this, can you please point me to it? I only know of one argument in elektraOpen which accepts a specification and I assumed the code-generated API will pass the specification using this parameter and that the application called with --elektra-spec will return exactly this.

As I stated above, the defaults can be given separately from the spec.

Why would you do this?

I will definitely make the changes necessary, so that we can use the same one for both.

Thank you.

@haraldg

This comment has been minimized.

Copy link
Author

commented Jun 14, 2019

First of all I should point out, that I'm not familiar enough with Elektra jargon, to fully understand all your remarks. There might be misunderstandings in what I write. I'll try to be aware of the limits of my understanding, but might miss something.

When you say "mounting a specification": Am I correct that this basically means having an entry in a file like /etc/elektra.conf point to the file where the spec is stored except for the case where the spec is embedded in the application and you do some things in loadConfiguration(), that have the same effect without actually touching persistent files?

If I understand all the concerns correctly, everything should be more or less covered by these three modes then: [1, 2, 3]

Basically yes. I still think mode 2.5 (cheap default API) is worth exploring, but see below. Also for the sake of compleness, there might be

  1. Use some sample configuration (not necessarily the default) and use the values directly in elektragen.c without linking against libelektra at all. This mode definitely is not useful for lcdproc at all (what with all the plugins, etc.) but might in theory be useful for other applications in a deep embedded context. It's definitely very low priority, but since you already have a fancy code generator, you might actually want to show it off a little. Also it would provide a very nice benchmark to compare against. Maybe this data is not interesting for kodebach's thesis, but I can see this as a very nice thing to write some papers about. (And if this mode becomes available, I will definitely use it to benchmark lcdproc a little.)

  2. Provide a spec (in which way ever) but strip it of all the strings only intended for humans but not programmatically relevant. (Maybe this is not a task for the code generator but some other tool. I don't know - just exploring the space of ideas.)

However, I also have to point out, that I don't understand the difference between (1) and (2): This would behave the same way, just with the spec stored differently?

Elektra is designed for external specifications and this is also the normal way in Elektra. In the discussions about -c one outcome for me was that developers want to execute LCDproc without having it installed, which is not possible if the specifications are external.

I don't see this. You can at compile time (code generation time) store the current spec in a file with a unique name and embed the name (absolute path of course) in the executable. If the spec isn't mounted, the executable falls back to the stored location. No?

Thus @kodebach invented a way that the specification can be internal, and is still usable externally (specload). Of course we do not need to use this by default or at all, it is just one further possibility Elektra provides. (Thanks to @kodebach)

We really should have had more discussions earlier, if this was the result. But if this is just a development feature, I don't care that much about size actually. I was under the impression, that production builds would have a similar size.

If this is just a development feature, then I don't understand why you bother to dump the spec for external users though?

This is also a further decision for you to make (which config file format you want by default).

I don't have a strong opinion. Unless there is something wrong with ini, I'd just stick with it.

Ok, should the option rather be called --works-as-standalone or --minimize-binary-size? Or asked differently: do you have plans to reuse the option also for some other optimizations?

I don't believe I have the full picture yet, so can't really say. Also I believe somebody might add more modes to the code generator in the future. So maybe just --mode={full-spec,no-spec,...} or perhaps --spec={include,check,ignore} or something like that?

the API already has many variants, this would again duplicate 1/3 of the API

I assumed this would be cheap to implement in terms of functions already existing to support the existing APIs. If this isn't true, then the idea is much less appealing.

this API would be discouraged to be used in non-code generation cases

Exactly. Maybe except for the case where people use the API, but decide not to use a spec at all.

or even worse: the same config option could have different defaults in different code paths.

Somebody doing such sloppy work might just as well access the wrong key in the first place. ;)

I assume for LCDproc we should prepare some installation scripts?

make install likely should result in a system that keeps working even if the directories, where lcdproc was compiled, are completely deleted.

What about some run-time option to tell the application it should not fail for mistakes in the spec (like no spec)? Then we would have only two compile modes (either spec is included or not).

I think I don't really understand where you are getting with this. I don't see a usecase for ignoring mistakes that we actually detected successfully?

(°) If you have an installed version of your executable, you need to unmount its spec and then mount with specload. Alternatively you can manually change the configuration of the specload plugin for your mountpoint to use a different executable path.

I have spent hours reading things and discussing with you two and I still don't really understand what this means. Even Markus seems to get confused at times. I stand by what I said earlier: This concept seems very roundabout to me. How can you hope to ever document this in a way that the average user will understand it?

@markus2330

This comment has been minimized.

Copy link
Contributor

commented Jun 15, 2019

When you say "mounting a specification":

Mounting in Elektra is always the same concept because specifications are also stored like configurations. Mounting is the mechanism to describe how Elektra can find the configuration below some key name.

Am I correct that this basically means having an entry in a file like /etc/elektra.conf point

Yes, the file that contains the mountpoints is called /etc/kdb/elektra.ecf by default. kdb file system/elektra gives you the real file name.

to the file where the spec is stored except for the case where the spec is embedded in the application and you do some things in loadConfiguration(), that have the same effect without actually touching persistent files?

Usually mounting refers to file on the file system but it it is not need to be like this. You can also mount "files" from git, network and what executables output.

It's definitely very low priority, but since you already have a fancy code generator, you might actually want to show it off a little

I would very much prefer to only have functionality which is needed but the needed functionality works very well. Elektra has already more than enough features to "show off", which tend to rather scare away people because not everything works as expected.

In particular, Elektra already has features to make configuration values read-only. Having multiple ways to do the same thing need especially good justification. Improved performance could be such a justification but at the moment we do not know anything about run-time in LCDproc yet.

Provide a spec (in which way ever) but strip it of all the strings only intended for humans but not programmatically relevant.

You should also consider, that blowing up a build system with too many option introduces complexity. But yes, an option to only include what is needed for the guarantees (type+defaults) makes sense.

You can at compile time (code generation time) store the current spec in a file with a unique name and embed the name (absolute path of course) in the executable. If the spec isn't mounted, the executable falls back to the stored location. No?

Of course you can do this but as long LCDproc is not installed, the spec file would not be at this absolute path and LCDproc would fail finding the file. Furthermore, having absolute file names inside the executable has other drawbacks (binary would not be relocateable anymore).

If this is just a development feature, then I don't understand why you bother to dump the spec for external users though?

Having the specifications of applications available within Elektra is one of the main goals of Elektra: Only then you can query which values the application will get, provide nice tools and so on and so forth.

I don't have a strong opinion. Unless there is something wrong with ini, I'd just stick with it.

The INI parser called "ini" is the one with most features but also with most bugs.

Exactly. Maybe except for the case where people use the API, but decide not to use a spec at all.

In some early discussions we also said that the API should gently push people towards more modern and robust ways of how to access configuration. Having a spec is a must-have for many reasons. Only in very restricted systems someone should consider to not have a spec at all: in cases where you want no run-time configuration at all.

I think I don't really understand where you are getting with this. I don't see a usecase for ignoring mistakes that we actually detected successfully?

This would be a developer feature to allow developers to start LCDproc without spec. The idea here was that we reduce the number of compilation variants to two: compiled-in spec and external spec.

How can you hope to ever document this in a way that the average user will understand it?

The high-level API already has nice documentation: https://www.libelektra.org/tutorials/high-level-api

But as you noticed, the recent features of code generation currently do not have such documentation and sometimes also work differently than I would have expected it. Thus I urge to a very small number of use cases and implement and document exactly these 2 or 3 cases.

@haraldg

This comment has been minimized.

Copy link
Author

commented Jun 15, 2019

@markus2330

This comment has been minimized.

Copy link
Contributor

commented Jun 15, 2019

Is it a correct summary of your the mounting description that configuration
sources (files, whatever else) can be mounted either globally from elektra.ecf
or process-local from within loadConfiguration()?

There is some mounting-functionality in the APIs (kdbEnsure) but it is not to be used to mount configuration files but only process-local configs like command-line arguments. The reason not to use it, that any non-global mounting would destroy the correct view of tools.

I actually suggested to store the path of the not-installed location. Ie the
path in the build directory.

This would then only be interesting for developers where we can add the whole spec anyway.

I fail to see how referencing the correct not-installed executable in whatever tools you are thinking about is easier then referencing the correct not-installed specification.

I am talking about Elektra's tools (kdb, qt-gui, web-ui, ...). Of course developers can also mount the files of their build directories. Unforunately, you need to be root for mounting because mounting writes to /etc/kdb/elektra.ecf. See #1074 for a proposal to change this.

So basically this would mean we always have to run elektrified applicatons
on openwrt with some command line switch, that nobody uses on non-embedded
systems? People will forget this in their scripts more often then they
make mistakes in their configuration. Seems like a poor tradeoff.

No, the other way round: developers would need to add the switch when they get the error that the specification is missing.

But let us first focus on the use cases we need to support, and then decide how to implement them.

@kodebach

This comment has been minimized.

Copy link
Contributor

commented Jun 15, 2019

Ok there is a lot to unpack here.

When you say "mounting a specification"

Mounting in Elektra is very similar to mounting a file system in UNIX. Elektra has a system wide configuration of mountpoints. These mountpoints are just keys that defined that everything below this keys (and not below another mountpoint) is loaded using the storage plugin for this mountpoint.

So for example you can mount the key user/test/my/key with the ini plugin. This just means that if somebody accesses keys below user/test/my/key (e.g. user/test/my/key/hello), we invoke the ini plugin, which then loads some file (which file is loaded is a bit more complicated but not really relevant here).

UNIX has synthetic file systems like /proc or /sys. Similarly Elektra doesn't actually require that a plugin loads from a file. There are plugins for loading from git or via curl or by invoking some process (like specload does).

So if you "mount a specification" you will modify the system wide configuration of mountpoints to tell Elektra how to retrieve your specification. This cannot be done inside loadConfiguration(), because then tools like kdb wouldn't work, because only your application would know about the specification.
(It would be possible to automatically mount on the first startup, but since mounting requires root access, I wouldn't recommend it.)

Use some sample configuration (not necessarily the default) and use the values directly in elektragen.c without linking against libelektra at all.

I also thought of this as a future possibility. It could be useful for the unattended embedded systems you mentioned earlier. Basically one could play around with the specification and different configurations. Once the final configuration is found a dummy elektragen.c which always returns the same values (and therefore should be fully inlined by the compiler), could be generated and used to create a firmware image.

Provide a spec (in which way ever) but strip it of all the strings only intended for humans but not programmatically relevant.

Yes, I would say that should be done via a different tool. The tool would pre-process the specification before it is passed to the generator.

I don't understand the difference between (1) and (2): This would behave the same way, just with the spec stored differently?

Yes, the binary would be smaller. The external spec could also be compress in other ways then is easily possible inside the binary. In addition you could store binary and spec in different places. For example the binary could be compiled into a firmware image, while the spec is stored on some other bigger flash chip not intended for executable code. No idea, if such devices exist. Basically (2) is a bit more flexible. While (1) guarantees that binary and spec are always compatible.

I assumed this would be cheap to implement in terms of functions already existing to support the existing APIs. If this isn't true, then the idea is much less appealing.

Of course it is just a bit of copy/paste. But as soon as there are different options for doing things, you have to be very explicit about the trade-offs between the different options, otherwise people will become confused and might be deterred from using Elektra. One goal of the high-level API was to make it easier for beginners to use Elektra, as the low-level API is quite complex and unintuitive at times.

Exactly. Maybe except for the case where people use the API, but decide not to use a spec at all.

Like I said above, it is already possible to use the highlevel API without a full spec. The defaults parameter of elektraOpen would contain the expected types and defaults for all known key names. This is kind of a lightweight spec. The idea is that nobody can just insert a line elektraGetLong ("/some/key") without anybody knowing that this configuration option exists. (I found a few drivers in LCDproc where not all config keys used in code were documented).

I fail to see how referencing the correct not-installed executable in whatever
tools you are thinking about is easier then referencing the correct
not-installed specification.

It absolutely isn't. Again (right now) the only real benefit of using specload is that the binary and spec are always compatible (because they are the same file). Apart from this the main idea behind specload (although this is not implemented yet, because it needs more wide reaching changes), was that it could allow the user to modify the specification in safe ways, while preventing unsafe changes. For example, it is always safe to change the description metadata, because it is not intended for machine processing. Similarly, it would be safe to restrict the allowed values for port that LCDd uses.


@haraldg you can probably ignore the stuff below, it will contain a lot of Elektra terminology and basically talks about implementation details

Please don't post contradicting information.

I could not find any documentation about this, can you please point me to it? I only know of one argument in elektraOpen which accepts a specification and I assumed the code-generated API will pass the specification using this parameter and that the application called with --elektra-spec will return exactly this.

As I stated above, the defaults can be given separately from the spec.

Why would you do this?

There is no real documentation, but looking at a file created by the code-generator might help.

Like you said there is the defaults KeySet passed to elektraOpen and there is the KeySet printed to stdout in specload mode (--elektra-spec). These two are at different locations in the code and right now, they are created by separate ksNew statements. This is because elektraOpen needs the keys without the parent key (e.g. /lcdexec/address) while specload mode needs the full spec namespace keys (e.g. spec/sw/lcdproc/lcdexec/#0/current/lcdexec/address). I am working on a fix for this, so that we can use the same KeySet for both.

It is still true, that the defaults KeySet passed to elektraOpen and the specification (mounted via specload or otherwise) are technically independent and could even be contradicting (°). The code-generator (and to some extent specload) exists to prevent this. You are correct, the generated code does pass more or less a copy of the specification to elektraOpen. And like I said in future it will be the exact same KeySet as used by the specload mode.

However, the defaults KeySet does not have to contain the full spec. First of all, if you are sure there is a mounted spec, you don't have to pass defaults at all. If you do pass defaults, the only requirements (not yet checked in elektraOpen) are that every key has a type and a default. A copy of defaults (with the parent key prepended to each key) is used as the KeySet that is passed to kdbGet. It works, because the Keys we create in elektraOpen are the same that the spec plugin would create and are therefore used as a fallback in cascading lookups.

And finally, just to clarify this once and for all, specload does not help in any way with running non-installed executables. You still have to mount the specification for specload to work. You are just mounting in a different way. The defaults KeySet to some extent allows to run binaries without mounting, but it is not a full replacement for the mounted specification. The reason is simply, the Keys in defaults are only processed by global plugins (possibly excluding postgetstorage plugins, I need to look into that). This is because the keys don't belong to any mountpoint. To mitigate this one of two things would have to happen: 1) we need to account for cascading keys in splitAppoint or 2) kdbEnsure has to support creating mountpoints. Option 1) still requires that a mountpoint exists where the cascading keys could be put and that has the required plugins, so not really a solution. With Option 2) we could just run kdbEnsure to create a spec mountpoint in elektraOpen and remove it again in elektraClose.

PS. I have no idea, how the new cache plays into this. It might just work or it might break everything. (I suspect the latter.) Creating and removing mountpoints in kdbEnsure would probably cause problems with the case, because kdbEnsure only modifies the KDB handle, not the actual on disk configuration. I guess it would just cause cachemiss the first time. However, I don't know if we would then create a cache of this pseudo-mountpoint and what that would mean for the next time the application is run.

(°) AFAIK the spec plugin overwrites what is passed as defaults so there can't be any kind of conflict. But changing defaults without changing the specification, probably won't have an effect.

@markus2330

This comment has been minimized.

Copy link
Contributor

commented Jun 16, 2019

Thank you for the detailed answer!

No idea, if such devices exist. Basically (2) is a bit more flexible. While (1) guarantees that binary and spec are always compatible.

Better to only support (1) then?

There is no real documentation, but looking at a file created by the code-generator might help.

This is definitely an important thing to be changed. Did you consider that the code generated API looks identical to the non-code generated API? This would deduplicate a lot of the documentation. Ideally, we would simply describe one way to use the high-level API and without code-generation you simply include a different (non-generated) elektra.h file.

And like I said in future it will be the exact same KeySet as used by the specload mode.

Perfect.

The defaults KeySet to some extent allows to run binaries without mounting, but it is not a full replacement for the mounted specification. The reason is simply, the Keys in defaults are only processed by global plugins (possibly excluding postgetstorage plugins, I need to look into that).

Can you create an issue/proposal about these problems? For me the expected behavior is that the spec plugin copies the metadata so that it is available for all plugins. Ideally, keys passed to kdbGet are identical to keys retrieved. (Obviously not for cache hits or NO_UPDATE where kdbGet essentially is a no-op.)

@kodebach

This comment has been minimized.

Copy link
Contributor

commented Jun 16, 2019

Did you consider that the code generated API looks identical to the non-code generated API?

In what way? I don't see any part that is really identical. A lot is similar, since one is based on the other, but nothing is really identical.

Can you create an issue/proposal about these problems?

Once I know what I am proposing I can do that...

For me the expected behavior is that the spec plugin copies the metadata so that it is available for all plugins.

That is exactly what happens, there are no other plugins, if no mountpoint exists.

Ideally, keys passed to kdbGet are identical to keys retrieved. (Obviously not for cache hits or NO_UPDATE where kdbGet essentially is a no-op.)

Not sure what you mean by that. Documentation for kdbGet quite clearly states that the returned KeySet may contain arbitrary Keys and that retrieved keys are appended with ksAppendKey and therefore will replace existing ones.

@markus2330

This comment has been minimized.

Copy link
Contributor

commented Jun 16, 2019

In what way? I don't see any part that is really identical. A lot is similar, since one is based on the other, but nothing is really identical.

Especially in terms of the API, e.g. to have elektraOpen and not loadConfiguration. The "default" parameter would simply be unused or can even be another KeySet that is appended.

Are there any other differences in the generated vs. non-generated API? (Except of additional guarantees in the generated API.)

@haraldg

This comment has been minimized.

Copy link
Author

commented Jun 16, 2019

@kodebach: Thanks for the detailed explanation. It helps a lot, it also raises a lot of new "Why are you people doing things the way you do?" questions. They probably are not useful at the moment, so I think I have only one remaining question:

Is it correct to say, that (1) ensures that the application validates it's own configuration against the correct spec even if the wrong or no spec is mounted?

@kodebach

This comment has been minimized.

Copy link
Contributor

commented Jun 16, 2019

Is it correct to say, that (1) ensures that the application validates it's own configuration against the correct spec even if the wrong or no spec is mounted?

There is not a 100% guarantee, because we still just use, what Elektra provides to us. If everything is set up correctly, the specification Elektra provides (via kdbGet) will come from the application itself (via specload). But the specification still has to be mounted, so we don't really know where it comes from.

To be 100% certain that by executing e.g. lcdproc we will use the specification embedded in lcdproc, we would need some changes. Basically we would have to check somehow whether the executable that specload is going to call is the same as the one currently running. The only way to that (which I can think of right now) involves the absolute path to the current executable. Sadly there is no portable way to access that path, we would have to rely on things like /proc or WIN32 and OSX system calls.

@markus2330

This comment has been minimized.

Copy link
Contributor

commented Jun 17, 2019

Why don't you simply compare what spec delivered and what is built-in (and fail on mismatches)?

It would be very good to write now a summary of which modes we are gonna to support.

@kodebach

This comment has been minimized.

Copy link
Contributor

commented Jun 17, 2019

Why don't you simply compare what spec delivered and what is built-in (and fail on mismatches)?

Comparing two potentially very large KeySets will be quite expensive, I would like to avoid it if possible. Also since the KeySet we receive from kdbGet already passed through various plugins, it wouldn't be exactly the same as what we sent to specload. But I had some other ideas, I will look into.

It would be very good to write now a summary of which modes we are gonna to support.

I will implement 3 options to the code generator:

  • specLocation=( embed | external ): This changes how the processed spec is output. embed (default) embeds it into the binary, external produces a separate file.
  • defaultsHandling=( embed | speconly ): This changes how defaults are handled. embed (default) passes a defaults KeySet to elektraOpen (the KeySet is the embedded spec, if present otherwise it is the minimum KeySet needed, i.e. less metadata than spec). speconly passes NULL as defaults to elektraOpen and therefore elektraGet* will fail without a mounted spec.
  • specValidation=( none | minimal | full ): This changes what validations of the specification are done. none is no validation, minimal (default) checks that the spec parent key is present and that the first key in the spec can be retrieved (to ensure spec-mount was executed). full won't be implemented for now, it would do a full validation like the one you suggested.

Combinations of these options can be used to recreate the modes I suggested. There are a few other combinations as well. I think these options are a lot more transparent to users, and the implementation should be cleaner too.

@haraldg

This comment has been minimized.

Copy link
Author

commented Jun 17, 2019

Sorry, if I'm being pedantic now, but this still confuses me.

But the specification still has to be mounted, so we don't really know where it comes from.

But you are passing the built-in spec to kdbEnsure(), so what happens to that?

Also I think it was mentioned somewhere, that kdbEnsure() is there to "overwrite" keys of the configuration with values from the command line. So wouldn't kdbEnsure() overwrite the mounted spec with the provided one?

I guess I see now, why you didn't like my unique filenames proposal: It doesn't help because after mounting we don't know which file was used anyway.

As somebody used to linux and package managers for almost my whole live, the more I understand what you are doing, the less I understand your use case. But I guess people are still moving binaries around manually on other systems.

I think one reason, why this is so confusing, is, that elektra tries to do a square circle here: On one hand you want there to be one global database following the one and true spec to the extent where you are even mounting the spec into the database. On the other hand you allow applications to evolve their spec and are trying to deal with there not being a true spec after all.

I believe in the end there are only two ways to resolve this:
a) Pretend that there is a true spec anyway and stop worrying about keeping applications and specs in sync and expect that application developers exercise due diligence when updating their specs and run proper installation scripts and maybe do some kind of version detection.
b) Not actually mounting the specs per se. Instead invent yet an other abstraction based around the idea of application integration: Since elektra is big on application integration you probably have already thought about overlapping specs though I have no idea what your solution is. Acknowledge that multiple versions of the same spec (same application) is really just a special case of application integration and provide the mechanisms to make them coexist in the same namespace. (Might be a lot of work, not sure if it is worth it.)

My gut feeling is to go with (a), but let's also explore (b) based on my very spotty knowledge of what you are doing. The new mechanism would have the following properties:

  1. Each spec has a version field, containing one (or maybe more?) versions it implements and maybe some other IDs, checksums or whatever.
  2. Each application using full and strict validation has a compabtible value, that can be tested against the version field from (1).
  3. Each spec should actually specify all the keys an application might access, even if they are outside of it's nominal namespace. (Not sure how to deal with indirect key access, where the key is accessed because of the value of an other key.)
  4. specs aren't mounted anymore. Let's call the new mechanism "installation". When a new spec (version of a spec) is installed, it is checked against the already installed specs and installation fails if there is an incompatibility. (User needs to uninstall an other spec first.)
    Also the database is validated against the new spec and the user is required to fix any conflicts, before installation can complete.
  5. From all the installed specs the true spec is calculated by merging all restrictions and merging all the version fields.
  6. The true spec might be mounted into the global database. I'm not sure I recommend that, but for compatibility reasons you probably will want to do that.
  7. To the application it doesn't matter if the configuration is validated against the true spec or only its own fragment. The only thing that matters is that the compatible field validates against the version field.

Aside from (b) being a lot of work to implement it also adds much complexity. I'm not sure if e.g. debian maintainers would love you or hate you, if you did that. Maybe both. Anyway it is definitely outside the scope of kodebach's thesis.

My conclusion is, that in the short term we are stuck with (a) anyway and the use case for the built-in spec seems rather thin and it leads to a lot of confusion and is not well explored. Therefore I recommend to put as little effort as possible into built-in mode and not use it per default.

@haraldg

This comment has been minimized.

Copy link
Author

commented Jun 17, 2019

therefore elektraGet* will fail without a mounted spec.

"will fail" or "will fail if the database is incomplete"?

@kodebach

This comment has been minimized.

Copy link
Contributor

commented Jun 17, 2019

Sorry, if I'm being pedantic now, but this still confuses me.

That's fine, it took me some time to understand how Elektra works as well. It can be very complicated at times.

But you are passing the built-in spec to kdbEnsure(), so what happens to that? Also I think it was mentioned somewhere, that kdbEnsure() is there to "overwrite" keys of the configuration with values from the command line. So wouldn't kdbEnsure() overwrite the mounted spec with the provided one?

We are not passing the spec to kdbEnsure. kdbEnsure is there to modify the KDB handle in a non persistent way. This is used to enable the plugin that processes command-line options. Because of some limitations kdbEnsure right now cannot add mountpoints. But even if it could it still wouldn't be wise to inject the specification this way. Using kdbEnsure would only affect our current process (in fact only the current KDB handle). If we injected the specification this way, the kdb command-line tool wouldn't know about it and allow modifications that don't conform to the specification.

My conclusion is, that in the short term we are stuck with (a) anyway and the use case for the built-in spec seems rather thin and it leads to a lot of confusion and is not well explored. Therefore I recommend to put as little effort as possible into built-in mode and not use it per default.

I think we should just think of the embedded spec as a different way of storing the specification. We should probably just describe it as convenience option, as it allows to package an application into a single file. That could be useful for small applications, as long as the spec doesn't increase the binary size too much.

If you package your application as e.g. an .deb or .rpm anyway, I would recommend using the specLocation=external and defaultsHandling=speconly options for release builds.

"will fail" or "will fail if the database is incomplete"?

Will fail if the database is incomplete. Of course it still works, if you manually added the key with the correct type.

@haraldg

This comment has been minimized.

Copy link
Author

commented Jun 17, 2019

We are not passing the spec to kdbEnsure.

Hm, then I need to reread the generated code. I must have gotten lost somewhere.

Using kdbEnsure would only affect our current process (in fact only the current KDB handle). If we injected the specification this way, the kdb command-line tool wouldn't know about it and allow modifications that don't conform to the specification.

I have been thinking this was the point: To detect misconfigurations by validating against a known to be correct spec in case kdb has for some reason the wrong idea what the spec is and therefore modifications that don't conform to the specification have slipped in.

Of course, even if you did it like above, you would still want to mount the specification globally too.

Anyway, I think it is clear now and mode (1) just doesn't have any usecase for people using package managers or building lcdproc locally. Maybe its useful for peole moving binaries around on USB-sticks, though they will need to know a lot about elekra to actually make it work.

@markus2330

This comment has been minimized.

Copy link
Contributor

commented Jun 17, 2019

Comparing two potentially very large KeySets will be quite expensive, I would like to avoid it if possible.

Did you benchmark it?

I will implement 3 options to the code generator:

I think you are too eager here. Let us first agree on use cases and then on how to implement then.

specLocation=( embed | external ): This changes how the processed spec is output. embed (default) embeds it into the binary, external produces a separate file.

Agree, if we had one outcome it would be that. Maybe rewording: embedSpec: full, minimal?

defaultsHandling=( embed | speconly ): This changes how defaults are handled. embed (default) passes a defaults KeySet to elektraOpen (the KeySet is the embedded spec, if present otherwise it is the minimum KeySet needed, i.e. less metadata than spec). speconly passes NULL as defaults to elektraOpen and therefore elektraGet* will fail without a mounted spec.

Can you explain please? Why would you not pass everything to defaults what we have embedded?

specValidation=( none | minimal | full ): This changes what validations of the specification are done. none is no validation, minimal (default) checks that the spec parent key is present and that the first key in the spec can be retrieved (to ensure spec-mount was executed). full won't be implemented for now, it would do a full validation like the one you suggested.

See above.

Anyway, I think it is clear now and mode (1) just doesn't have any usecase for people using package managers or building lcdproc locally. Maybe its useful for peole moving binaries around on USB-sticks, though they will need to know a lot about elekra to actually make it work.

Yes, I think our goal now should be to find a minimum amount of use cases we actually can document and support. @haraldg can you help with that?

@kodebach

This comment has been minimized.

Copy link
Contributor

commented Jun 17, 2019

Did you benchmark it?

Do I need to? Comparing two KeySets of the same size is clearly O(m + n) (m total number of metakeys, n total number of keys). For different sizes it would be O(1). If we where to use keyCompare the whole thing would basically amount to 2*(m+n) strcmps and n memcmps.

We would also need real-world data, because the average number of metakeys per key could make a difference.

I did a quick trial run on my machine comparing two KeySets with 1,000,001 keys each, where each key had 1 metakey and everything was the same, except for the last metakey. The full run took about 4.5 seconds, of which 3 where spent on building the KeySets.

I think you are too eager here.

I am sorry but, my time is not infinite.

Can you explain please? Why would you not pass everything to defaults what we have embedded?

If specLocation=external is used, there is no embedded spec. In this case defaultsHandling=embed would use a KeySet with only key values and type metadata to save binary space.

However, this combination is a bit of an odd one. I wouldn't recommend it unless there is a very specific need for it. In the documentation I would recommend using defaultsHandling=speconly for specLocation=external.

See above.

Not sure which "above" you mean, but I will try to explain the use cases for specValidation.

  • none is useful for systems where a spec is guaranteed to be mounted and guaranteed to not be remove. For example an embedded system where everything compiled into a readonly firmware image.
  • minimal or if implemented full are the recommended setting and the default. full could only be used with specLocation=embed, since otherwise the full spec isn't embedded and can't be used for validation. The only other difference between them is performance.

minimum amount of use cases we actually can document and support

We shouldn't document use cases at all. We should document the options available in the code-generator. How people use these options is up to them. We should of course give recommendations and examples, but why intentionally limit what people can do with our tools? As long as we clearly state potential drawbacks of using different options, I don't see a problem.

@markus2330

This comment has been minimized.

Copy link
Contributor

commented Jun 18, 2019

We would also need real-world data

We have real-world data: the spec of LCDproc.

The full run took about 4.5 seconds, of which 3 where spent on building the KeySets.

Thanks for the benchmark! So comparing is much faster than creating the KeySets. But maybe we do not need it anyway: see use cases below.

I am sorry but, my time is not infinite.

Of course, exactly because of that I suggest you to reduce the number of variants you implement. The documentation and testing of all what you suggested would take too long.

Your highest priority now should be to create a minimal working PR (for one client) with good docu for one use case (I would suggest the 1. case of below) so that we finally get public feedback.

We shouldn't document use cases at all. We should document the options available in the code-generator. How people use these options is up to them. We should of course give recommendations and examples, but why intentionally limit what people can do with our tools? As long as we clearly state potential drawbacks of using different options, I don't see a problem.

There is a big problem: The cognitive load we put on people. They need to understand many details of Elektra and even if they do, they can easily mess up and pick one combination of the 223=12 variants that does not work nevertheless.

Based on what is implemented, what Elektra's goals are and the time left I would think we should have 3 use cases:

  1. (default) full spec embedded, nothing mounted: for developer who simply want to execute LCDproc from the build directory.
  2. installed variant of 1: the scripts during make install mount the spec in the binary with specload. This use case is for playing around with config and spec (once specload supports save editing of the spec. For now the minimal changes we allow should suffice).
  3. minimal binary size for embedded systems: We assume that the spec is mounted via ni (and also provide a script that mounts the ni + the spec, which is executed during make install). In this mode, the binary cannot be executed as long it is not installed. But we assume that the specification is correct for performance reasons.

Are you both ok with this? @kodebach: what would be the minimal set of variants in the code generator and LCDproc's build system to support this?

@haraldg

This comment has been minimized.

Copy link
Author

commented Jun 18, 2019

I think kodebach's proposal is quite reasonable. Not sure what help I can provide on top of that.

The mental load comes from understanding the concepts like mounting, specification etc, which must be invested no matter how the available options are presented to users. I think the biggest confusion comes from "the spec is built-in but not used without global mounting", but it seems you both are quite sold on that thing. I don't see much else to drop, that would make things easier to understand.

they can easily mess up and pick one combination of the 223=12 variants that does not work nevertheless.

I think there might be a difference between the options to the code generator and the options to configure. The code generator is used by developers (or rather developers working on the build system), while configure needs to be understandable to everybody. I'd probably just expose the code generator options via configure, but provide good defaults, so that users don't need to think about them at all.

(default) full spec embedded, nothing mounted: for developer who simply want to execute LCDproc from the build directory.

I think it was the conclusions of the last few days of discussion, that this scenario is not on the table because specload doesn't work that way and changes to internals like kdbEnsure() would have to hapen anyway to implement it?

Also for this to be useful for running LCDd from the build tree, some äquivalent to the -c option needs to be working, because a complete default setup would only work for very few drivers. I guess one way to make this work would be to have some way to tell elektra to use $BUILD_DIR/elektra.ini instead of /etc/kdb/elektra.ecf - but no idea how difficult that would be and it is yet an other distraction, so let's not explore that right now.

If you still want to know what's the minimal set of variants for LCDproc, then I think in terms of kodebachs options it would be:

specLocation=external, defaultsHandling=speconly, specValidation=( none | minimal )

So yes, it could be stripped down to one option.

I think one of the open questions is, what ´specValidation´ would actually do, when the spec is missing: fail? emit a warning, but try to continue? I guess whatever we decide here, some other developers of some other application will want something else. Maybe the space of possible actions isn't even expressable with options, but only with code. Ie allow to specify some callback in case of validation error. But again, I think this isn't something that has to be decided now as long as what we do now doesn't actually make it difficult to implement in the future.

@kodebach

This comment has been minimized.

Copy link
Contributor

commented Jun 18, 2019

I think the biggest confusion comes from "the spec is built-in but not used without global mounting", but it seems you both are quite sold on that thing.

Its an option that is already available, so why drop it? I will be careful how I document the built-in spec and will explicitly state that is only a different way of storing the specification, but doesn't really have any other advantages.

I think it was the conclusions of the last few days of discussion, that this scenario is not on the table because specload doesn't work that way

With full spec embedded and nothing mounted, you can execute from the build directory, but no validation or conversion will happen, since no plugins run without additional setup.

Also for this to be useful for running LCDd from the build tree, some äquivalent to the -c option needs to be working
I guess one way to make this work would be to have some way to tell elektra to use $BUILD_DIR/elektra.ini instead of /etc/kdb/elektra.ecf

We already have the dir namespace. You can mount a configuration there an it will be local to your working directory. Sadly the dir namespace is very weird, because the configuration is system wide. See also #1074.

I think one of the open questions is, what ´specValidation´ would actually do, when the spec is missing: fail? emit a warning, but try to continue?

loadConfiguration can already return an error that has to be handled by the application. Failed spec validation would just be another kind of error. If we see the need for it, we could also provide a way to continue despite the failed spec validation (wouldn't recommend it).

@markus2330

This comment has been minimized.

Copy link
Contributor

commented Jun 18, 2019

The mental load comes from understanding the concepts like mounting, specification etc, which must be invested no matter how the available options are presented to users.

This is not necessarily true: there are many concepts in Elektra you do not need to understand as long as you are not confronted with them. In use-case 1. you would not be confronted with mounting and spec at all.

In general Elektra is carefully designed that distributions can already decide a lot for the users (e.g. mounting all files in some config format). Then users can use the distribution in very similar ways than they use distributions now (only that configuration is unified). They would not, however, need to understand mounting. (That is also the reason why mounting is only possible for root: the feature is meant to be for distributions or make install, not for end-users.)

I think the biggest confusion comes from "the spec is built-in but not used without global mounting", but it seems you both are quite sold on that thing. I don't see much else to drop, that would make things easier to understand.

If you see we should not support/document it, we will happily do so. It means that installing LCDproc will be required for starting it. I would find that very annoying.

I think it was the conclusions of the last few days of discussion, that this scenario is not on the table because specload doesn't work that way and changes to internals like kdbEnsure() would have to hapen anyway to implement it?

specload and kdbEnsure would not be used here. We would simply lose most of the guarantees, as no validation would happen. This would be only for developing LCDproc without changing spec or conf. But imho this is also a valid use case. If you do not think so, we will not support it.

but provide good defaults, so that users don't need to think about them at all.

Of course this is always a dream but the reality often is that if you mess up in configuration design, users will find many ways to make mistakes. And exposing decisions that could be done by developers is very smelly configuration design.

So yes, it could be stripped down to one option.

Then we should do so.

Its an option that is already available, so why drop it? I will be careful how I document the built-in spec and will explicitly state that is only a different way of storing the specification, but doesn't really have any other advantages.

You assume that everyone reads and understands the docu. In fact only people that are as deeply involved as we are now, can really decide which way of storing the specification is better. So it is our responsibility to decide that properly. Pushing this decision to users is not the way to do it.

loadConfiguration can already return an error that has to be handled by the application. Failed spec validation would just be another kind of error. If we see the need for it, we could also provide a way to continue despite the failed spec validation (wouldn't recommend it).

Yes, I would also not recommend to do it that way. But it hackers will find out and use this trick locally. Hopefully @haraldg will not merge this hack 😄

@kodebach

This comment has been minimized.

Copy link
Contributor

commented Jun 18, 2019

In use-case 1. you would not be confronted with mounting and spec at all.

End users (e.g. people using LCDproc) should never be confronted with mounting, completely independent of the code generation modes. The mounting should always be automated in the install process.

Developers (e.g. people working on LCDproc) always have to understand mounting, unless there are serious changes to how Elektra works. You cannot develop an application using the code-generation API without knowing about mounting, simply because you have to create the install scripts responsible for mounting an possible have to mount manually during development.

It means that installing LCDproc will be required for starting it.

Dropping the built-in spec only requires a different way of mounting. You can still run LCDproc from any directory and and a valid mounted spec is required in both cases.

You assume that everyone reads and understands the docu.

That is why we need to decide good defaults, not why we shouldn't have options. Also if a developer uses an API without properly studying the documentation, they shouldn't expect to get the best possible results.


Since this whole discussion is leading us nowhere, I will now start the implementation. Once I have some code you can try out, we can still decide what do keep, what to change and what to add.

@haraldg

This comment has been minimized.

Copy link
Author

commented Jun 18, 2019

I mostly agree with @kodebach in the discussion above.

If you see we should not support/document it, we will happily do so. It means that installing LCDproc will be required for starting it. I would find that very annoying.

Yes, it is very annoying, but it seems to be beyond the scope of what @kodebach is doing, so we kind of have to live with it. Also it seems to tear at the foundation of elektra to allow any non-global configuration like what the -c option does, at all, so we should definitely open a new issue for this "fakeroot mode", if we want to work on it.

This would be only for developing LCDproc without changing spec or conf. But imho this is also a valid use case. If you do not think so, we will not support it.

I don't remember ever running any of the programms from LCDproc with complete default configuration. Maybe I did at some occasion for some very trivial testing, however I have 5 different variants von LCDd.conf (designed to test different aspects) in my build directory, that I use with -c frequently. So running LCDd from the build dir but without "changing conf" seems like a very very corner case to me.

Yes, I would also not recommend to do it that way. But it hackers will find out and use this trick locally. Hopefully @haraldg will not merge this hack 😄

What get's merged largely depends on people explaining their use case and convincing me, that it does more good than harm. If people sometimes need to run an application without a spec (say it is on a filesystem, that isn't always mounted), but still would like a warning when the spec isn't available, then I probably shouldn't tell them to "just use the OpenWRT mode, where the spec is never checked" but allow them, to add a switch to the build system, that makes the missing spec less fatal.

@kodebach

This comment has been minimized.

Copy link
Contributor

commented Jun 18, 2019

that I use with -c frequently

If -c is just used for development, you could kdb mount LCDd.ini dir/sw/lcdproc/lcdd/#0/current. Then whenever you execute LCDd it should use $PWD/.dir/LCDd.ini, if it exists. Then you could either run LCDd from different directories to use different configs, or you could simply swap out the .dir/LCDd.ini file whenever you want to change config. As I said, this is a super weird setup and #1074 should definitely be implemented (IMO a dir mountpoint should be specific to a single directory not system wide), but it is a solution for now.

If people sometimes need to run an application without a spec [...] but allow them, to add a switch to the build system, that makes the missing spec less fatal.

I switch in the build system would be an acceptable solution (that switch could just disable the check). What I think we shouldn't implement (unless there is absolute need for it), is to generate the check, but then allow the application to continue even if the check failed. IMO if the check is generated, it has to succeed for the application to work.

@haraldg

This comment has been minimized.

Copy link
Author

commented Jun 18, 2019

[ dir mountpoint ]

Yes, I'm sure everybody will find some workaround that will be acceptable to them.

IMO if the check is generated, it has to succeed for the application to work.

So you think a warning is worse then no check at all? (BTW this branch of the discussion wasn't about what you should implement, but what I might feel tempted to merge, if somebody proposes it for LCDproc.)

@markus2330

This comment has been minimized.

Copy link
Contributor

commented Jun 19, 2019

You cannot develop an application using the code-generation API without knowing about mounting, simply because you have to create the install scripts responsible for mounting an possible have to mount manually during development.

We make the scripts now for LCDproc. And adding some elektraGet in a module/driver hopefully will be possible without understanding mounting.

That is why we need to decide good defaults, not why we shouldn't have options.

This is a general misconception: it is not possible to make configuration sane just by providing good defaults. At the moment you need to change something, you are confronted with the complexity of the configuration. Thus you need to be always aware of two things:

  1. keep the configuration design simple
  2. have good defaults

Also if a developer uses an API without properly studying the documentation, they shouldn't expect to get the best possible results.

Yes, APIs are an excellent example! Would you like to have an API where you have calls that decide about some internals you have no clue about?

Since this whole discussion is leading us nowhere, I will now start the implementation. Once I have some code you can try out, we can still decide what do keep, what to change and what to add.

Yes, as already wished several times: please make the necessary fixes so that the clients successfully start, make a PR at the LCDproc repo and then write to the mailing list.

Further code-generator-modes and optimization of the binary size should be done after we get further feedback.

So you think a warning is worse then no check at all? (BTW this branch of the discussion wasn't about what you should implement, but what I might feel tempted to merge, if somebody proposes it for LCDproc.)

The problem with continuation on spec errors is that maybe also other errors are present that are not reported (we always report only one error).

@kodebach

This comment has been minimized.

Copy link
Contributor

commented Jun 20, 2019

You cannot develop an application using the code-generation API without knowing about mounting, simply because you have to create the install scripts responsible for mounting an possible have to mount manually during development.

We make the scripts now for LCDproc. And adding some elektraGet in a module/driver hopefully will be possible without understanding mounting.

You missed the point. To develop an application using the code-generation API you need to know about mounting. More specifically, the person responsible for creating install scripts that do the mounting has to know about the concept of mounting. In the case of LCDproc this person is me. I know about mounting, but I won't write the scripts for all project that want to use the code-generation API.

The main point was that you left the impression that there is a solution where we completely abstract over the concept of mounting. This is not the case.

Thus you need to be always aware of two things:

  1. keep the configuration design simple
  2. have good defaults

Of course simplicity is also important. But if you focus too much on simplicity, you loose flexibility. At some point you have to sacrifice simplicity, if you want to provide a general solution. Elektra wants to provide a general solution, so there has to be some complexity.

Would you like to have an API where you have calls that decide about some internals you have no clue about?

Isn't that the whole point of having a high-level API, or any API for that matter? It hides the internals of lower levels, so that you don't have to know about all the complicated stuff.

Also this kind of contradicts your previous statement. If the API doesn't decide for me, it has to expose all the options. Therefore it cannot be simple. So do you want it simple or do you want the options?

Yes, as already wished several times: please make the necessary fixes so that the clients successfully start

I already stated a few times, that the clients (and also the server) should start successfully already. I also provided instructions in #2748 (comment). If it doesn't work for you, please let me know.

@markus2330

This comment has been minimized.

Copy link
Contributor

commented Jun 20, 2019

The main point was that you left the impression that there is a solution where we completely abstract over the concept of mounting. This is not the case.

Not yet but maybe you find a way to write reusable install scripts? Something like:

magic.sh path-to-spec-file application-root-key-name

What a the two arguments are, developers obviously need to know.

Of course simplicity is also important. But if you focus too much on simplicity, you loose flexibility. At some point you have to sacrifice simplicity, if you want to provide a general solution. Elektra wants to provide a general solution, so there has to be some complexity.

If someone wants some feature, then he/she of course need to dive into some complexity, e.g. reading about some concepts or plugins (and we are willing to provide also complex features). And you did wonderful work for many different features.

But these concepts and plugins should be as simple as possible and do only what is needed. We already have enough monsters (👋 ini, spec, list plugins; the old code generator). We need to get rid of these monsters and not introduce new ones. Thus Elektra has a very clear quality goal: Prefer simplicity over robustness and extensibility (flexibility). This is stated very clearly in GOALS.md.

And we also live this: it is still possible (and always will be) to get/set values in Elektra without using any of these concepts we talk here about.

Harald asked about 5 different modes (which I already find over-the-top, but ok, he is the customer) but we should not introduce 12 nobody asked for. In particular, I still do not understand what any user should do with the "defaultsHandling" option. It seems to me a typical "I need an if in the source code, so let us add an option". This is definitely not how configuration design should be done. We should always assume that people look at us how do to proper configuration design. E.g. spec or list plugin are very embarrassing and we should get rid of this asap. You already helped a lot with the spec plugin but there is much more stuff to be removed in there. Of course I did not ask you because you already did so much work which is beyond you originally wanted to do.

I already stated a few times, that the clients (and also the server) should start successfully already. I also provided instructions in #2748 (comment). If it doesn't work for you, please let me know.

@haraldg tested it already (see some comments below your link) and he said:

I also tried to take some timings, but any client segfaults immediatly.

Furthermore, instructions should not be hidden in some discussion. Please wrap everything up and make a proper PR for the main LCDproc repo which also includes a tutorial (for instructions). Thank you!

@kodebach

This comment has been minimized.

Copy link
Contributor

commented Jun 23, 2019

Update progress in #2805

@haraldg tested it already (see some comments below your link) and he said:

I also tried to take some timings, but any client segfaults immediatly.

I don't have access to an armhf device, so I cannot reproduce this. From what little information was given in the original comment, I suspect that the gopts plugin wasn't found.

In particular, I still do not understand what any user should do with the "defaultsHandling" option.

The specValidation option is somewhat independent of the other two. defaultsHandling (DH) and specLocation (SL) always have to be coordinated to some degree.

For both DH and SL currently the default is embed, since it enables the most flexibility.

To reduce binary size, but still allow running the application without the need for kdb mount you may set SL to speconly (and leave DH as is). In this version the binary doesn't contain the full spec. Instead we embed a KeySet that only has keynames, default and type metadata. This is all we need for the highlevel API.

If you want minimal binary size you should set SL to external and DH to speconly. In that case no specification will be contained in the binary. This also saves some time during startup, that would be needed to construct the KeySet(s) (although the impact should be minimal).

The last combination specLocation=embed and defaultsHandling=speconly doesn't have as much of a use. The only advantage I can see here is, that you save a little time by not constructing the KeySet with the defaults for elektraOpen, and not processing it in elektraOpen. This setting is just a consequence of having two separate options.

Furthermore, instructions should not be hidden in some discussion.

I didn't give proper instructions yet, because they might have changed massively after this discussion. I will now start working on the documentation as well.

make a proper PR for the main LCDproc repo

After adding the different modes and some more testing and experimenting this week, I am now firmly of the opinion that this is not a good idea. Of course the feedback would be nice, but I don't see a way that this PR will be merged in the near future (=this summer). The closer I get to finishing my work, the more problems in other parts of Elektra find.

Many of these problems are systemic and obviously arise from the fact that many of Elektras features are hacks. Hacks meaning parts that use the core in ways it was never designed for. Most prominently these are the spec and type plugin. Also when using cascading keys many things just break for inexplicable reasons.

For example today I found this problem. I have no idea, why it didn't come up until now, or why it even is a problem, but here we are:

kdb set -N user /sw/lcdproc/lcdproc/#0/current/lcdproc/foreground true
#> Using name user/sw/lcdproc/lcdproc/#0/current/lcdproc/foreground
#> Set string to "true"

kdb get /sw/lcdproc/lcdproc/#0/current/lcdproc/foreground
#> Sorry, module type issued the error 52:
#> error in the type plugin: The key user/sw/lcdproc/lcdproc/#0/current/lcdproc/foreground was already normalized by a different plugin! Please ensure that there is only one plugin active that will normalize this key.

kdb get user/sw/lcdproc/lcdproc/#0/current/lcdproc/foreground
#> 1

For context, the type plugin is the only one mounted at this mountpoint that does normalization. It is also not a problem with kdb, since I originally noticed the error with lcdproc itself. So it has to be a problem in kdbGet itself. For some reason it seems type is called twice with the same key.

For another problem see #2806.

At this point, the only way forward I see is that I create a version of LCDproc that is usable enough to do benchmarks for my thesis. After that I think we need to at least wait for the new plugin system (needed anyway for a proper server implementation without workarounds).

@haraldg

This comment has been minimized.

Copy link
Author

commented Jun 23, 2019

@kodebach

This comment has been minimized.

Copy link
Contributor

commented Jun 23, 2019

you just have some uncommitted changes in your working copy [...] If I can do anything to gather information, that helps you pin down the problem, then please let me know.

Unlikely. You can try again with the code from #2805. I just tried with that version and starting and connecting the clients did work.

You can also check the CMake output for any messages regarding gopts. And check that [prefix]/lib/elektra/libelektra-gopts.so exists (where [prefix] is your CMAKE_INSTALL_PREFIX, which defaults to /usr/local) after installing.

If you get sefaults again, a full backtrace would also be helpful.

Especially since kdb works fine.

kdb doesn't use the gopts plugin right now.

@markus2330

This comment has been minimized.

Copy link
Contributor

commented Jun 24, 2019

The specValidation option is somewhat independent of the other two. defaultsHandling (DH) and specLocation (SL) always have to be coordinated to some degree.

One reason more to not export this interface to the user.

The last combination specLocation=embed and defaultsHandling=speconly doesn't have as much of a use.

Another reason.

I do not know why you are so reluctant in making this interface easier to use. If it is because you already implemented everything, then you could simply expose only one option (let us say specificationStrategy) and from this option you derive the 3 other options.

After adding the different modes and some more testing and experimenting this week, I am now firmly of the opinion that this is not a good idea. Of course the feedback would be nice,

Feedback is not nice but essential for the success of the LCDproc elektrification.

but I don't see a way that this PR will be merged in the near future (=this summer).

This is a completely different matter that is not controlled by you.

The closer I get to finishing my work, the more problems in other parts of Elektra find.

Good, please create issues.

For example today I found this problem.

Please report this issue (even if you cannot reproduce it yet). For LCDproc this issue is irrelevant as we decided that we will not care about validation. Every attempt to set a boolean to something else than "1" or "0" is not supported. I do not think that LCDproc people will have problems understanding "1" or "0".

At this point, the only way forward I see is that I create a version of LCDproc that is usable enough to do benchmarks for my thesis.

No, the only way forward is to reduce to a set that works. We already decided do exclude validation and as next step we could exclude mmap. I hope it is not necessary as we have someone actively working on mmap. Let us see in #2806.

After that I think we need to at least wait for the new plugin system (needed anyway for a proper server implementation without workarounds).

The changes in the plugin system are only related to the maximum number of plugins. It will not magically fix bugs.

@kodebach

This comment has been minimized.

Copy link
Contributor

commented Jun 24, 2019

One reason more to not export this interface to the user.

User as in user of the code-generator or user as in user of the application (e.g. LCDproc user)?

It will only be exported to the former not the latter. And not exporting it to the former means limiting the use cases of the code-generator needlessly.

then you could simply expose only one option (let us say specificationStrategy) and from this option you derive the 3 other options.

That is exactly what I don't want to do, because this is not very transparent. It is also not future proof. If we use the specification for new things, do we add more strategies? Do all the old strategies have the same behaviour or not? With separate options we don't have to think about any of this. It is very clear what the 3 options do. The developer has to decide what is best for their specific use case.

Good, please create issues.

I am sorry but that is not good. It is catastrophic. I created enough issues already to know, they won't be fixed any time soon, unless I do it myself or somebody is actively working on the broken part already.

Every attempt to set a boolean to something else than "1" or "0" is not supported.

That's another problem, because "not supported" actually means "you must not do it, even though it works" in this case. I can't say right now how or when exactly, but there are definitely cases where a kdb set -N user ... succeeds, even if the type plugin should fail.

I do not think that LCDproc people will have problems understanding "1" or "0".

No, but they certainly won't like that the version using an actual configuration framework can't do what the old minimalist handwritten parser could do. They also won't like that all numbers (even those that make no sense in decimal like USB Vendor IDs) have to be in decimal, because hexnumber is disabled since it would be one plugin too many.

Feedback is not nice but essential for the success of the LCDproc elektrification.

But the feedback doesn't help for the problems inside Elektra itself.

No, the only way forward is to reduce to a set that works. We already decided do exclude validation

It was decided that validation is not required by LCDproc people.

But removing validation and conversions is not a good idea. I know @haraldg said he doesn't care about that. But (assuming Elektra's validation actually works) having a spec actually greatly simplifies the code reading the configuration. For example by limiting the range of an integer in the spec you can eliminate the ifs checking the value after reading it.

If we give up on validation through Elektra, I have to put all the code back in.

Apart from that my main point was: Creating a PR in the LCDproc repo and using a mergable/merged version in my thesis, means involving even more people into this discussion. And that means even more delay until I have a version with which I can do benchmarks.

@markus2330

This comment has been minimized.

Copy link
Contributor

commented Jun 24, 2019

It will only be exported to the former not the latter. And not exporting it to the former means limiting the use cases of the code-generator needlessly.

It is definitely not needlessly: we found out that many (most?) combinations do not make sense. It is very much needed to restrict such configurations.

That is exactly what I don't want to do, because this is not very transparent.

It is very transparent if you document the strategies. And this will be needed anyway.

It is also not future proof. If we use the specification for new things, do we add more strategies?

It is unlikely that anyone will want a further strategy in the near future. Harald basically already wants all of them that make sense. It is much more likely, that some of the strategies will rarely needed and bring maintenance headaches.

Do all the old strategies have the same behaviour or not? With separate options we don't have to think about any of this. It is very clear what the 3 options do. The developer has to decide what is best for their specific use case.

So you suggest: let us push the responsibility to someone else?

I am completely happy with two strategies: the one you implemented up to now and another one to make Harald happy (minimize binary size).

I am not happy with the thinking "ohh, we are so flexible, so it must be the users fault when something is messed up".

I am sorry but that is not good. It is catastrophic. I created enough issues already to know, they won't be fixed any time soon, unless I do it myself or somebody is actively working on the broken part already.

Of course it is good that you find it. It will be fixed by the person who implements validation (nobody is working on this right now). Validation is a no-goal now. You can be assured it is a long term goal for me.

That's another problem, because "not supported" actually means "you must not do it, even though it works" in this case.

Of course, that is the status-quo of nearly all FLOSS apps out there. If you mess up the config it might not work, might crash, might format the hard-drive. Elektra is there to change that but one step at a time. We at least guarantee that we do not crash or format the hard-drive. We could give a better error message thus please report the bug.

No, but they certainly won't like that the version using an actual configuration framework can't do what the old minimalist handwritten parser could do. They also won't like that all numbers (even those that make no sense in decimal like USB Vendor IDs) have to be in decimal, because hexnumber is disabled since it would be one plugin too many.

This is your opinion, which I value very much. And you do your best that everything works very good in your opinion. But we are also interested in the LCDproc people's opinions. At the moment we did not ask them what they actually need, which is not a good situation.

But the feedback doesn't help for the problems inside Elektra itself.

Of course it helps: we'll see to it that Elektra evolves. But without feedback we might run into wrong directions. Especially the validation and transformation problems are very domain-specific.

But (assuming Elektra's validation actually works) having a spec actually greatly simplifies the code reading the configuration. For example by limiting the range of an integer in the spec you can eliminate the ifs checking the value after reading it.

Exactly!

If we give up on validation through Elektra, I have to put all the code back in.

No, we simply assume everything is validated for now so that you can finish your thesis. In later work we will of course fix the problems.

Apart from that my main point was: Creating a PR in the LCDproc repo and using a mergable/merged version in my thesis, means involving even more people into this discussion.

Ok, this is what it is all about: Do not worry about that. Your grade on your thesis will definitely not depend on whether the LCDproc people like it or not. I like your work except when I say I don't.

And that means even more delay until I have a version with which I can do benchmarks.

For your thesis you can pick any version you like. It will not be the latest version, this never works out so do not try to do that. Some automation for benchmarking might be nevertheless useful, sometimes it is necessary to repeat everything, even without changes in the code.

@kodebach

This comment has been minimized.

Copy link
Contributor

commented Jun 24, 2019

we found out that many (most?) combinations do not make sense

1 out of 4 is neither many nor most.

It is very much needed to restrict such configurations.

Why? What is the benefit of disallowing certain configurations? If somebody has a use-case we didn't think of why prohibit it?

It is much more likely, that some of the strategies will rarely needed and bring maintenance headaches.

Exactly why strategies are a bad idea. Separate options are much easier to maintain, since each of them affects much less of the code. That is the basic principle of "separation of concerns".

and another one to make Harald happy

So if X comes along and wants a performance optimized version we add another strategy? And if Y wants some other version we add that as well?

ohh, we are so flexible, so it must be the users fault when something is messed up

Of course not. But your version is basically "if you do want we say it works; if that doesn't fit your use case that's not our fault". Also I can't see how anything could be messed up here at all. There are different prerequisites for the different combinations to work, but that is still the case if everything is controlled by one option and there are less possibilities. We already said that there is no magic solution that just works.

@markus2330

This comment has been minimized.

Copy link
Contributor

commented Jun 24, 2019

Why? What is the benefit of disallowing certain configurations? If somebody has a use-case we didn't think of why prohibit it?

https://cseweb.ucsd.edu/~tixu/papers/fse15.pdf

Exactly why strategies are a bad idea. Separate options are much easier to maintain, since each of them affects much less of the code. That is the basic principle of "separation of concerns".

https://en.wikipedia.org/wiki/Separation_of_concerns

Hint: the term is about code and not (configuration) data.

So if X comes along and wants a performance optimized version we add another strategy? And if Y wants some other version we add that as well?

This is how software development works. You add support for another use case you originally did not think of if somebody needs it. Then it makes sense to make it configurable so that for others their old setup still works (so in this case: adding another strategy). As you know software development is not: "let us add random stuff in the case someone might need it". Unfortunately, at the moment it is not widely known that the same is also true for configuration.

@kodebach

This comment has been minimized.

Copy link
Contributor

commented Jun 24, 2019

https://cseweb.ucsd.edu/~tixu/papers/fse15.pdf

The paper talks about applications with hundreds of options. We are talking about the different between having one parameter with 2-3 values and two parameters with 2 values each.

I don't want to go into further discussion (IMO the paper supports my version at least as well as yours), so what strategies do you propose?

I think we at least need

  1. Spec external, defaults embedded: allows starting without mounting
  2. Spec external, defaults not embedded: minimizes binary size

Possibly also:
3. Spec embedded, defaults embedded: allows starting without mounting, everything is contained in one file -> version conflicts are harder

And at that point it doesn't really matter that there is a fourth setting, if it makes the documentation easier to understand. In the end the most important thing is that users of the code-generator understand the options.

Maybe this is just me, but if each option just controls a single thing (one for spec, one for defaults) that is easier than a single option controlling both.

the term is about code and not (configuration) data.

Yes and how the code-generator works is about code as well. It has nothing do with configuration or data.

This is how software development works. [...]

It is about the balance between fulfilling exactly the use cases you have right now and making a piece of software that is generally useful. Elektra wants to be useful in general not for one specific case (e.g. embedded system, or server systems). Therefore we should have some flexibility from the start. More importantly software should always be designed in a future proof way. Too many limitations and future changes may become very difficult (e.g. Elektra's limit on the number of plugins is, so deeply ingrained in Elektra that is hard to change now).

@markus2330

This comment has been minimized.

Copy link
Contributor

commented Jun 24, 2019

The paper talks about applications with hundreds of options.

In total Elektra has hundreds of options.

so what strategies do you propose?

I am okay with all strategies:

  • that we have a good use case for
  • are tested
  • are documented

In the end the most important thing is that users of the code-generator understand the options.

This is is not the only point, see above. E.g. see spec, list or csvstorage. They have only understandable options but still they are hardly usable as it is so easy to get these options wrong because most options are usually not needed and depend on each other in surprising ways.

Elektra's limit on the number of plugins is, so deeply ingrained in Elektra that is hard to change now

Yes, it was wrong to limit the number of plugins.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.