Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error message concept #2435

Merged
merged 22 commits into from Mar 28, 2019

Conversation

Projects
None yet
3 participants
@Piankero
Copy link
Contributor

Piankero commented Feb 22, 2019

This PR is done as a basis for discussions on the new error concept and is not intended to be merged.

@Piankero

This comment has been minimized.

Copy link
Contributor Author

Piankero commented Feb 22, 2019

@markus2330 Should I mark it as "Ready to merge" if want you to take a look at it? (despite it is not intended to be merged)

@markus2330
Copy link
Contributor

markus2330 left a comment

partly review

@@ -0,0 +1,250 @@
# Error Message Concept

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 22, 2019

Contributor

Maybe you can use the doc/decision format?

1. Too many errors in the [specification](src\error\specification) (210 errors/warnings)
2. Many duplicate or redundant errors
3. Description/ Reason are very similar
4. Potential file splitting (1327 LOC)

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 22, 2019

Contributor

The problem is the missing modularity of the file.

3. Description/ Reason are very similar
4. Potential file splitting (1327 LOC)
5. Verbose error message

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 22, 2019

Contributor

Another problem is that strerror sometimes is not part of the reason, sometimes (not) in quotes, sometimes \n is at the end. (These are basically bugs to be fixed but maybe a better error concept would also avoid such errors.)

This comment has been minimized.

Copy link
@Piankero

Piankero Feb 24, 2019

Author Contributor

I do not think that such problem is solvable here as in most languages exceptions/error messages are getting simply converted into another one (encapsulation, layering, etc.) and this can simply lead to information loss or formatting problems. It is up to the developer to do it correctly in my opinion.

This comment has been minimized.

Copy link
@Piankero

Piankero Feb 24, 2019

Author Contributor

I could just think about another method which takes the strerror() as a parameter and the error logging tool handles it

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 24, 2019

Contributor

Another method will not help much. The framework should reset errno correctly and it should be part of the error/warning if it was changed by the plugin.

We should also have tests if errno is handled correctly.

* I highly doubt that most common users will require this information (unnecessary)

Suggestions:
Remove it completely. If users want to know the location they still can use `kdb file` on the concrete setting.

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 22, 2019

Contributor

The information was added on request, so I would not remove it completely. Maybe we disable it by default and only show it on request (-v for cmd-line).

9. Configfile: ...../<file>.25676:1549919217.284067.tmp
```

The lines are numbered and will be referred in the following sections.

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 22, 2019

Contributor

Did you compare what qt-gui and web-ui show you?

This comment has been minimized.

Copy link
@Piankero

Piankero Feb 24, 2019

Author Contributor

Concerning specification related errors yes,
web gui does not show anything at all
qt gui shows a bad message (#2406)

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 24, 2019

Contributor

Please write this in the text, and in more detail (with examples)

#2406 is one specific problem, it is unfair to generalize this to "qt gui shows a bad message"

* I highly doubt that most common users will require this information (unnecessary)

Suggestions:
Remove it for standard error messages. Maybe a `kdb set -v` could include such information for developers, users.

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 22, 2019

Contributor

Yes, I like the idea with -v. But then we should also say that you get more error information with -v?

This comment has been minimized.

Copy link
@Piankero

Piankero Feb 24, 2019

Author Contributor

Do you mean to include it in the message (eg. extra line saying that you can get more verbose output with -v)?
Isn't it common in the unix world to simply use -v when wanting more output (meaning we could potentially spare the extra line)

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 24, 2019

Contributor

It is not uncommon but still most people would not try it without any indication that it exists (or would waste time to find out).

The text "use -v for more information" is not so long and potentially helps a lot.


Suggestion:
Having both ingroup and module is too much information. Also again in most cases users should get the relevant information out of the reason line.
As of now there are 6 different "ingroup"s. I would suggest to remove it.

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 22, 2019

Contributor

Yes, the group concept needs rework.

@markus2330
Copy link
Contributor

markus2330 left a comment

review, part II

* I highly doubt that most common users will require this information (unnecessary)

Suggestions:
Remove it for standard error messages. Maybe a `kdb set -v` could include such information for developers, users.

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 23, 2019

Contributor

Maybe even kdb <command> -d (debugging). We should however tell people that they need to use -d for bug reports.

This comment has been minimized.

Copy link
@Piankero

Piankero Feb 24, 2019

Author Contributor

Do you want to have -d and -v both present? I think one would be enough honestly

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 24, 2019

Contributor

It is for different purposes: -v is to be more verbose, -d is to debug Elektra.


Reasons to keep it:
* See the concrete plugin which causes the failure
* Prohibits error masking (eg. `stat`ing a file happens in more plugins, devs will know where the error occurs)

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 23, 2019

Contributor

Not only devs: also users mount some plugin, which might cause unexpected errors later.

Reasons to keep it:
* Gives users a quicker overview about the error (eg. permission error/ resource error/ etc. or another type of categorization such as Markus
suggested with temporary/permanent/conflict/etc.)
* Can indicate the type of solution a user has to approach

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 23, 2019

Contributor

Maybe only categories should have a description, and not the errors themselves?

This comment has been minimized.

Copy link
@Piankero

Piankero Feb 24, 2019

Author Contributor

I think I don't know what you mean by that. Can you give me an example how you imagine it to look like?

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 24, 2019

Contributor

Please finish your concept first, then it is easier to discuss about such details.

* Can indicate the type of solution a user has to approach

Reasons against it:
* Extra line

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 23, 2019

Contributor

I think we should separate the discussions for which information needs to be available and which information should be presented to (end)users.

This comment has been minimized.

Copy link
@Piankero

Piankero Feb 24, 2019

Author Contributor

But which information should be present also influences the concept in my opinion

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 24, 2019

Contributor

Yes of course, but the concept should differentiate between these two cases/viewpoints.


Reasons against it:
* Extra line
* What to do with "uncategorizable" errors

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 23, 2019

Contributor

Examples?

This comment has been minimized.

Copy link
@Piankero

Piankero Feb 24, 2019

Author Contributor

They usually are very generic and could have multiple sources of errors:
199, 196, 151, 147

Also kind difficult: 188, 177, 162


Suggestion:
This line is by far the most discussable one. I thought about categorizations such as
* Permission/ Resource/ Parse/ External(C++, Ruby, Java)/.../ Miscellaneous:

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 23, 2019

Contributor

I agree with "Resource" and "Parsing" Errors.

Permissions is too specific, I would put it to "Resource" and rather add other very general categories like: Validation, Transformation

"External" and "Miscellaneous" I don't like.

This comment has been minimized.

Copy link
@Piankero

Piankero Feb 24, 2019

Author Contributor

External is just meant for errors thrown by other programs (or programming languages) and caught be elektra.
"Miscellaneous" is for errors which are hard to classify

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 24, 2019

Contributor

Yes, but they are problematic as they are not helpful for the API user.

* What to do with "uncategorizable" errors
* Maintenance effort: introduce a new category once there are enough similar errors and change all old errors to this category?
* Awareness of developers: All devs have to know all categories and choose the correct one. Devs should also be prohibited to throw their error
into the "uncategorized" category due to lazyness.

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 23, 2019

Contributor

There should be no "other" category.

This comment has been minimized.

Copy link
@Piankero

Piankero Feb 24, 2019

Author Contributor

I agree, I would also like to avoid it if possible.

* Permission/ Resource/ Parse/ External(C++, Ruby, Java)/.../ Miscellaneous:
Most errors are resource errors (could not stat, no connection, could not close/open/find etc.) This again would not give users
a particularly useful error indication since he/she has to lookup in the reason anyway. Even subdividing the `Resource` category
would lead to the same problem. No matter how I look at it, the user

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 23, 2019

Contributor

I think only one division might be interesting for a user: Is it something that might get fixed by itself (connection problems) or something that a sysadmin can fix (permission, not enough space, ..)?

But even this division is shaky: some connection problems need to be fixed by admins, or some cron job might remove files (and thus "not enough space" might fix itself)

This comment has been minimized.

Copy link
@Piankero

Piankero Feb 24, 2019

Author Contributor

I agree on your division but got some practical concerns. Beside some connection/timeout problem I do not see a lot of temporary issues (especially when looking through the specification file which errors could be temporary). And even in theses cases, usually there is a reason behind it that need to be fixed by sysadmins (proxy settings/firewall settings/power outage of router/etc.) In by far the most cases I think the sysadmin has to fix something. For those cases in which it is a real temporary problem, the user will retry it anyway I think

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 24, 2019

Contributor

Yes, it is okay for me that we do not distinguish between temporary and permanent errors. Only the conflict, resource and validation errors need to be definitely different (as different reactions are required on these errors).

a particularly useful error indication since he/she has to lookup in the reason anyway. Even subdividing the `Resource` category
would lead to the same problem. No matter how I look at it, the user
will have to read the concrete reason in most cases which makes the Description line not that valuable any longer.
* temporary/permanent/conflict/validation:

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 23, 2019

Contributor

I agree, but see temporary/permanent in the discussion above.

This approach would lead to the same problem as we have now (many errors, difficult to maintain)

Personally I do not see how the benefit of a categorization outweighs the drawbacks because in by far the most cases users will have to look
into the reason in more detail anyway. It should be enough for users to deal with the error just by the reason.

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 23, 2019

Contributor

The categorization is for applications: if it is a conflict, they can merge and kdbSet again, if the problem is a validation error, they might drop the key and try again; if the error is temporary, they might retry (retry is quite problematic though as it is unclear how often to retry and so on...)

This comment has been minimized.

Copy link
@Piankero

Piankero Feb 24, 2019

Author Contributor

I think such solution approaches should be indicated by the "Solution: " part in the error message and not via a generic description. Users will have to look into the reason in more detail, no matter the conflict/validation error/ etc.
Therefore I do not see a big advantage of the Description field anymore and a possible categorization.

@markus2330
Copy link
Contributor

markus2330 left a comment

review part III (review finished)

* Possibility for automation (eg. if a certain error number occurred you could automatically trigger certain actions)

Reasons against it:
* Extra line

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 23, 2019

Contributor

It does not need an extra line, it is only a number.

This comment has been minimized.

Copy link
@Piankero

Piankero Feb 24, 2019

Author Contributor

Where does this message then come from:
Sorry, the error (#...) occurred ;(

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 24, 2019

Contributor

Yes, the line exists now, but it does not have to be that way.

Giving an error a number like it is done in C programs as return code seems to be outdated for elektra, especially when
the reason is given anyway. Personally I never read the error number because it has not much value.
The *reasons to keep* will also be very limited when the number of errors are significantly reduced.
I would also suggest to remove it.

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 23, 2019

Contributor

It is needed for automated checks for specific errors (especially unit and shell recorder tests).

What we could do is to not use numbers but some ID, like the macro name or alternatively something generated (which is guaranteed to be unique). This would avoid conflicts when adding errors and make it easier for external specifications.

Downside: string comparison is needed to check for which error it is. Maybe that is not really a problem.

This comment has been minimized.

Copy link
@Piankero

Piankero Feb 24, 2019

Author Contributor

I am confused by your message:
Is it absolutely needed or not? As of now the unit tests and shell recorder tests test for a very specific error number. If we reduce the number of errors (which is the main purpose here) we would have to change unit tests anyway.

Does it make such a difference when comparing for a String or Number? If the error message changes, the test too but do error messages get changes that much?

If we could get rid of the specification file we can let developers handle errors much more easier.

This comment has been minimized.

Copy link
@kodebach

kodebach Feb 24, 2019

Contributor

In my opinion an error code is absolutely essential. There has to be a way to infallibly check which error occurred, because some errors may be fixed automatically without user input. The high-level API for example retries kdbSet, if a conflict (error code 30 I think) occurred, with the user not even knowing about it.

Think of it this way:
Most modern languages (e.g. C++ and Java) support exceptions. Certain exceptions (e.g. a timeout for an HTTP request) can be caught and fixed in some way (retry HTTP request with exponential backoff). Others might be totally ignored/suppressed or wrapped in a more general exception. For this reason it is good practice to use different exception types. You could of course always use the same type and compare the error message string, but this is not reliable. A single character changed in the error message breaks the comparison. Not even a typo could be fixed without breaking backwards compatibility. And that doesn't even consider the very common case of dynamic information in an error message. If the error message contains a filename that depends on user input (e.g. mountpoint), your only chance is to partially compare the string. This partial comparison is even more error prone, because it might match other errors as well.

C doesn't have exceptions, so other values (integers in most cases) are used to uniquely identify one type of error. Too guarantee the error code is unique, a centralised specification is the easiest way.

However, I think this specification can be vastly improved:

  • Don't specify error codes explicitly. Simply start with 1 for the first error and go from there. This eliminates merge conflicts, as long as we only use macros (or other named symbols) in code.
  • Don't use macros. Use extern const int. This way the actual error codes we use are hidden from external code and are determined by the version of Elektra used. (@markus2330 please correct me if this is wrong)
  • If we also want to guarantee that the actual error codes never change, we could fixate them on each release. Either explicitly, or by using a simple test case that git pulls the last release and compares error codes.
  • Only put the names for the extern const ints and a short description/comment what this error code should be used for into the specification. This allows the error message to be fully customised and an error code would be more or less equivalent to an exception type in Java or C++. The format would basically be:
    name: FILE
    description: an error related to a file, e.g. missing permissions or missing file
    
    From that we would generate kdberror.h:
    extern const int ELEKTRA_ERROR_FILE;
    And a source file:
    #include <kdberror.h>
    
    /**
     * an error related to a file, e.g. missing permissions or missing file
     */
    const int ELEKTRA_ERROR_FILE = 1;
    

This comment has been minimized.

Copy link
@Piankero

Piankero Feb 24, 2019

Author Contributor

What you suggest though is basically what we have now, hundreds of error codes in a centralized file to distinguish them among each other. Many error codes even share the same text.

So in your opinion this task is just to squash some errors together? This would lead to the same problems again over the long run which we have now. You want to know if there has occurred a very specific error so you can actually do counter measures. My task though was to reduce the high granularity of the error messages.

Where should this be written?

name: FILE
description: an error related to a file, e.g. missing permissions or missing file

In the specification file? Or is the kdberror.h the new source of error codes to be checked for rather then the specification.

This comment has been minimized.

Copy link
@kodebach

kodebach Feb 24, 2019

Contributor

What you suggest though is basically what we have now

No, I agree with you on all parts of your concept except error codes. Like Markus said above, we don't have to use integers, but we do need some sort of unique ID.

hundreds of error codes

We don't need hundreds of codes, but we need unique codes for some errors. I personally would group the rest thematically: file errors, broken specification, invalid configuration (i.e. not in accordance to the specification), error in binding (e.g. Java exception) etc.

Looking at the current specification I see only three things that should definitely have separate error codes:

  • CONFLICT (30): can sometimes be resolved by retrying
  • TIMEOUT (195): similarly might be resolved by retrying (may be with bigger timeout)
  • NETWORK (205): An application could suppress this error, wait until a connection is available again and retry then. Of course some sort of message may still be produced to inform the user.

All of these basically come down to the same behaviour: "auto-retry". Especially, if an application is running in the background it is very nice, if the application doesn't require user input for things that might fix themselves. Auto-retry is simply not possible, if you don't have an infallible way of identifying an error.

in a centralized file to distinguish them among each other.

I really don't care whether the file is centralized, or whether there is a file at all. If you can find a different solution to ensure unique error codes, I'm okay with that too. We could even use code 1 for conflict, 2 for timeout, 3 for network, ... (any future "auto-retry" codes) and -1 for anything else. Basically defining positive error codes are auto-retry-able, negative ones are not. I really don't care, as long as the auto-retry-able errors are infallibly distinguishable from the rest and from each other.

Where should this be written?

Probably it would be better to just write kdberror.h and the corresponding source file by hand. It should only be a few lines anyway.

This would lead to the same problems again over the long run which we have now.

As long as we make sure that each new error code introduced in a PR is very well justified, I don't think this will happen.

This comment has been minimized.

Copy link
@markus2330

markus2330 Mar 11, 2019

Contributor

Thank you for the write-up! I think you can rework the whole error concept now. Maybe start with the goals, which should be very clear now. Furthermore it would be nice if you directly rework the error specification, the spread-sheet already looks good. I think we can split the categories a bit more:

  • Validation into syntactic and semantic (and maybe even others)
  • Temporary into conflicts and timeouts (watch out: only because "temporary" is in the description does not mean it is a temporary problem: 161 is a resource/permission error)
  • Permanent into Resource, Parsing, Installation errors (and maybe even others)

Maybe we want sub-categories, so that applications which e.g. do not care if a validation is syntactic or semantic can write error handling code easier.

I think there is little use of discussing this even more. Simply do it! (I would start with categorizing the errors directly in the error specification. It is a harmless thing and can be merged easily as it will not break anything.)

This comment has been minimized.

Copy link
@markus2330

markus2330 Mar 11, 2019

Contributor

I forgot a very important class of errors: Assertion failures! I.e. logical errors that should never happen and are bugs within Elektra.

This comment has been minimized.

Copy link
@kodebach

kodebach Mar 11, 2019

Contributor

I think I finally know why we disagree so much. Who is a user in your mind?

  • A user of Elektra, i.e. a developer using Elektra in their application.

or

  • A user of an application using Elektra (e.g. the kdb tool).

The first kind (let's call them "application developers") will probably find specific error codes useful. The more specific an error code is, the easier you can find documentation on how to fix it. While a good description might be enough on its own, an error code always helps, if you need look at the documentation.

The second kind of users (let's call them "end users") won't find that useful. They just want a single line that tells them want they need to do "to make it work". This includes users of kdb, so I'm going to use it as an example.

You rightfully pointed out that kdbs error messages are quite bad, mainly because of the huge amount of unnecessary (and partly redundant) information. But this has nothing to do with how Elektra itself reports errors to applications. Even if we kept everything the same, kdb could just print

Error 121: validation failed
Validation of key "<key>" with string "<value>" failed.

instead of the 9 lines you mentioned in your example. (-v is used, in which case the second verbose version makes sense)


So I think the problem is twofold and we need two separate decisions here. The questions are:

  1. How should Elektra (the library) report errors to applications?
  2. How should kdb display these errors?

The second question is easy to answer: kdb should print one or two lines of error description. kdb -v should print all the available information, so that diagnosing errors becomes easier.

The first question is not so easy. The important thing here is to get rid of all the duplicate or very similar errors (e.g. the GPG errors), while maintaining enough granularity, so that error codes are useful for testing, diagnosing and actually handling errors. The categories are a good start, but IMO they can't replace actual error codes. We don't need an separate code for each description (in fact that is exactly what we shouldn't do), but we need some distinction.

Another good step to get rid of unnecessary error codes is IMO to not have any kind of specification for warnings. The difference between a warning and an error is that a warning can (more or less) safely be ignored, so I don't see any reason to specify warnings. If we just use elektraAddWarning(Key * key, const char * format, ...), we would get rid of 84 error codes immediately. NOTE: warnings also shouldn't translate into exceptions or similar in bindings no need for specification there either

This comment has been minimized.

Copy link
@kodebach

kodebach Mar 11, 2019

Contributor

I think we can split the categories a bit more

Maybe we want sub-categories

More categories plus sub-categories, might be an alternative to error codes, but I'm not quite convinced it would be a good fit for testing purposes.

Assertion failures! I.e. logical errors that should never happen and are bugs within Elektra.

Should those even be errors? Shouldn't we just exit if an assertion fails? If this "should never happen," crashing the application shouldn't really be a problem, should it?

This comment has been minimized.

Copy link
@markus2330

markus2330 Mar 11, 2019

Contributor

I fully agree we should distinguish of which error information we emit and which parts of this information is shown to whom. @kodebach Thank you for the valuable input.

Should those even be errors? Shouldn't we just exit if an assertion fails?

The application should exit but a well behaved library should not exit. (The infallible APIs as exception.) So I think we need (assertion) errors for the situations where admins installed Elektra incorrectly or the KDB is completely broken (random stuff in system/elektra).

All in all this sentence can stay because it gives the user a first hint on what happened.
I would though change the wording and include more relevant information in it.
I would change it like this:
"Module <plugin> issued a <error/warning/fatal> while accessing the key database with the info:"

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 23, 2019

Contributor

Why having fatal? What is the difference?

This comment has been minimized.

Copy link
@Piankero

Piankero Feb 24, 2019

Author Contributor

There are 7 fatal messages in the specification. I do not know their right to exist in detail but it looks like those cause elektra to malfunction or useless (failed to open backend, could not initialize modules)

Some errors though do not seem to be worth a fatal like "could not compile regex" (you actually commited this 8 years ago)

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 24, 2019

Contributor

Yes, in retrospective I think the distinction is arbitrary. I would remove this severity.


All in all this sentence can stay because it gives the user a first hint on what happened.
I would though change the wording and include more relevant information in it.
I would change it like this:

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 23, 2019

Contributor

I would not remove the "Sorry,"

I would though change the wording and include more relevant information in it.
I would change it like this:
"Module <plugin> issued a <error/warning/fatal> while accessing the key database with the info:"
where plugin should have a different color (blue?) and warning/error/fatal (yellow, red, dark red) too depending on the severity.

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 23, 2019

Contributor

Yes, I would like to have colors.

would be changed to this:
```
1. Plugin enum issued a error while accessing the key database with the info:
2. Validation of key "<key>" with string "<value>" failed.

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 23, 2019

Contributor

Allowed values are missing.

This comment has been minimized.

Copy link
@Piankero

Piankero Feb 24, 2019

Author Contributor

This is a concrete error from the enum plugin, I agree that it should give you all allowed values in the message though.


with optionally a third line indicating a solution. Eg. for a permission related error there would be a third line:
```
3. Possible Solution: Retry the command as sudo (sudo !!)

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 23, 2019

Contributor

Possible solution in validation errors is to try a different value.

This comment has been minimized.

Copy link
@Piankero

Piankero Feb 24, 2019

Author Contributor

I agree, the solution suggestion is up to the developer though

Benefits:
* Much more relevant information in the error message
* Ease of development:
no merge conflicts, no extra line in the `specification` file leading to chaos (such as it is now)

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 23, 2019

Contributor

Please elaborate here what exactly the implication would be.

This comment has been minimized.

Copy link
@Piankero

Piankero Feb 24, 2019

Author Contributor

Which implications do you exactly mean? Those of having no specification file at all?

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 25, 2019

Contributor

Implications of your whole concept.

Description comes with too many drawbacks to be kept
* Possible misleading the user with the `Solution` line

The benefits do outweigh the drawbacks. I would suggest an overloaded method like the following:

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 23, 2019

Contributor

overloading in C is a bit of an hack (needs macros), so better to give different names or simply return null pointers if there is no possible solution.

The benefits do outweigh the drawbacks. I would suggest an overloaded method like the following:
```
// If a solution can be applied
elektraError(Key parentKey, const * char message, const * char solution)

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 23, 2019

Contributor

Please see #2432. What you propose is the only function we already have (which is not removed).

This comment has been minimized.

Copy link
@Piankero

Piankero Feb 24, 2019

Author Contributor

I looked through the PR and i am actually more confused than before.
Is it a naming problem or a public api visibility problem or what exactly?

This comment has been minimized.

Copy link
@markus2330

markus2330 Feb 25, 2019

Contributor

Your concept should describe the whole API.

@Piankero

This comment has been minimized.

Copy link
Contributor Author

Piankero commented Mar 17, 2019

@markus2330, I rewrote the concept. Please give your final review so I can finally start.
My plan are the following Milestones (also separate PRs to avoid millions of merge conflicts):

  • Remove warnings
  • Add the new Solution API
  • Migrate all errors according to the categories

@Piankero Piankero force-pushed the Piankero:error_message_concept branch from 10c7b9d to 5accf07 Mar 17, 2019

@kodebach
Copy link
Contributor

kodebach left a comment

I like this version a lot better than the first one, and it is obviously miles better than what we currently have.

The remaining ~130 errors will be categorizes into logical groups with subgroups.
Each group and subgroup will receive a range of numbers such as in the HTTP protocol.

* Permanent errors (1-500)

This comment has been minimized.

Copy link
@kodebach

kodebach Mar 17, 2019

Contributor

I think it is more user friendly to use this grouping:

  • Permanent errors (100 - 1000)
    • Resource (100 - 199)
    • Parsing (200 - 299)
    • Installation (300 - 399)
    • Logical (400 - 499)
  • Temporary (1000 - 1999)
    • Conflict (1100 - 1199)
    • Timeout (1200 - 1299)
  • Validation (2000 - 2999)
    • Syntactic (2100 - 2199)
    • Semantic (2200 - 2299)

The schema behind this is:

  • 4 digit error codes
  • 1st digit: logical group
  • 2nd digit: subgroup
  • 3rd & 4th digit: error number within subgroup

That way you could more easily determine the group and subgroup, similar to HTTP status codes. You could also use strncmp(errorCode, "22", 2) to check for semantic validation errors for example. In your system this wouldn't be as easy.

We could also reserve error codes ending in 00 to mean unknown or undefined error of this subgroup, if there is any use for that. The error code 0 could be used for "fatal assertion failure" (i.e. an error that should never happen)

If you think this doesn't allow enough future subgroups, we could also use 2 or even 3 digits for group and subgroup.

This comment has been minimized.

Copy link
@Piankero

Piankero Mar 18, 2019

Author Contributor

Thank you for that input. Your idea also works nice. But if I understood correctly, theses error codes would then be strings because permanent errors would still have digits but something like 0100 for Resource errors. My thaught is that devs check for error codes like >=100 && <200 instead of something like charAt(0) == 0 && charAt(1) == 1 (or startsWith(01)).

Lets wait if @markus2330 has another idea/input on this :). For me, both approaches are equally fine.

This comment has been minimized.

Copy link
@markus2330

markus2330 Mar 18, 2019

Contributor

I would prefer if users do not have to fiddle around with numbers. @Piankero did you already did a study what other FLOSS projects are doing? You can write the comparison as thesis text.

In the next meeting we can do a brain storm of which projects have nice error concepts.

This comment has been minimized.

Copy link
@Piankero

Piankero Mar 19, 2019

Author Contributor

Not yet, do you have a preference towards certain projects? Do they have to be similar to elektra (C/C++, Medium Large, Well Maintained, etc.)?

This comment has been minimized.

Copy link
@markus2330

markus2330 Mar 19, 2019

Contributor

I think we should look into projects that also have a generic plugin system where it is not so clear which kind of errors plugins might have to yield, e.g.: GStreamer, Apache http, Jenkins, Databases (MariaDB, PostgresQL).

Also relevant would be standards like SNMP.

Looking at projects that are to some extend alternatives to Elektra, like etcd, Windows Registry or Mac OS X plists also cannot harm.

Of course you should also look into POSIX errors (errno), although we definitely do not want to go back there as it is too primitive. (It does not give any reason or any further helpful information.)

As the error concepts are usually very short or non-existent, I think you can look at more than just a few.

The grouping of errors will allow developers to filter for specific as well as more general errors to correctly
react to them programmatically. Even though there are currently just 8 categories they will reserver up to 1500 error numbers.
This will permit additional subgrouping of errors in case it might be needed in the future. Imagine the case where
"Recourse" errors is too general because developers saw a need for splitting the errors in "permission" and "existence" errors.

This comment has been minimized.

Copy link
@kodebach

kodebach Mar 17, 2019

Contributor

"Recourse" errors

"Resource" errors


Warnings will be removed from the specification file. Any current warning will use the function
```
elektraAddWarning(Key * parentKey, const char * message)

This comment has been minimized.

Copy link
@kodebach

kodebach Mar 17, 2019

Contributor

Maybe add more arguments for, metadata. e.g. plugin/library, file, line etc. All of these should be optional, i.e. accept NULL. In addition der could be a macro ELEKTRA_ADD_WARNING (key, message) which sets them automatically.

In plugins the macro ELEKTRA_PLUGIN_NAME_C is injected by CMake, something similar could be done for this case, i.e. ELEKTRA_ERROR_MODULE which contains the plugin or library name or simply NULL for the rest.

- homepage URL
in the CMake [`project`](https://cmake.org/cmake/help/latest/command/project.html) command. *(René Schwaiger)*
- We fixed the detection of Python for the [Python 2 binding](https://www.libelektra.org/bindings/swig_python2) on macOS. *(René Schwaiger)*
- Worked out a new error concept. *(Michael Zronek)*

This comment has been minimized.

Copy link
@kodebach

kodebach Mar 17, 2019

Contributor

I think you made a mistake while merging/rebasing... The diff to master shouldn't contain any lines with other authors

@markus2330
Copy link
Contributor

markus2330 left a comment

Thank you, the concept is improving. But an analysis of other projects is definitely missing.


- Too verbose error message
- A lot of redundant errors
- Hard to manage specification file

This comment has been minimized.

Copy link
@markus2330

markus2330 Mar 20, 2019

Contributor

Write more about the goals. What does "Hard to manage" mean? Do we want the specification to be more static or do we want that adding errors is without merge conflicts?

- Hard to manage specification file
- No senseful way for application developers to use error codes from elektra

This concept should improve useability for end users and developers.

This comment has been minimized.

Copy link
@markus2330

markus2330 Mar 20, 2019

Contributor

Please also consider administrators, who will also read the error messages and are also capable of resolving resource errors and similar.

This comment has been minimized.

Copy link
@Piankero

Piankero Mar 21, 2019

Author Contributor

Where exactly is the difference between an administrator and an end user? They both use elektra via gui/cmd/etc and fix a problem once it occurs. A developer reacts to error codes or uses the elektra API to emit an error.

This comment has been minimized.

Copy link
@markus2330

markus2330 Mar 21, 2019

Contributor

Of course there is overlap but I would not see an end user in the position to fix most resource error (e.g. resizing disks, changing permissions). So an application might add the text "please contact your system administrator" if an resource error occurs.

Having resource/conflict/temporary errors in different categories, however, is maybe the only requirement we need to add in order to consider system administrators.


## Constraints

- Error numbers must stay because they are more reliable to check against than strings

This comment has been minimized.

Copy link
@markus2330

markus2330 Mar 20, 2019

Contributor

There are many more constraints, e.g.:

  • supporting multiple programming languages
  • constraints from the plugin system
  • ...

This comment has been minimized.

Copy link
@Piankero

Piankero Mar 25, 2019

Author Contributor

How does it exactly differ when supporting multiple programming languages?
What does the plugin system exactly constraint in this case?

This comment has been minimized.

Copy link
@markus2330

markus2330 Mar 26, 2019

Contributor

How does it exactly differ when supporting multiple programming languages?

The errors should be in a way so that the specification can be mapped to exceptions (or other ways different programming languages handle errors).

What does the plugin system exactly constraint in this case?

In many ways, for example, multiple plugins might try to set an error and the system needs to decide which is the error to be shown to the user. Hopefully you learn such constraints from related work.


## Considered Alternatives

- Removing the specification file without requiring error numbers

This comment has been minimized.

Copy link
@markus2330

markus2330 Mar 20, 2019

Contributor

Here can you be much more concrete after you looked into other projects. Basically every project you look into will be a "considered alternative".

## Considered Alternatives

- Removing the specification file without requiring error numbers
- Possible variations on what message should be displayed

This comment has been minimized.

Copy link
@markus2330

markus2330 Mar 20, 2019

Contributor

This sentence does not say much.

Furthermore a developer can use the command line argument `-d` (debug)
to show `At` for debugging purposes.

Warnings will be removed from the specification file as they are not needed there.

This comment has been minimized.

Copy link
@markus2330

markus2330 Mar 20, 2019

Contributor

You mean the information if something is a warning or error?

This comment has been minimized.

Copy link
@Piankero

Piankero Mar 21, 2019

Author Contributor

Warnings will be removed as a whole from the specification file. Warnings per se will still exist but will not receive error codes. Therefore it does not make sense to include it in the file, an API call is enough

This comment has been minimized.

Copy link
@markus2330

markus2330 Mar 21, 2019

Contributor

Unfortunately you were not here today to discuss this. I currently do not understand why we should throw away the categorization of warnings. It would destroy the ability to convert errors to warnings (and vice versa). I think we should keep warning and error compatible (or even have them more compatible: every error would be a valid warning and vice versa).


All "fatal" errors will be converted to "errors" as the distinguishment is not relevant.

Unused marked errors will be removed from the specification.

This comment has been minimized.

Copy link
@markus2330

markus2330 Mar 20, 2019

Contributor

It would be nice if we get a checker tool if warnings/errors are used (simple grep of macro should suffice).

The remaining ~130 errors will be categorizes into logical groups with subgroups.
Each group and subgroup will receive a range of numbers such as in the HTTP protocol.

- Permanent errors (1-500)

This comment has been minimized.

Copy link
@markus2330

markus2330 Mar 20, 2019

Contributor

I like @kodebach approach to use digits for the classification.

But even more important is some underlying logic why the classification is done how it is.

- Installation (101 - 150)
- Logical (151 - 200)
- Temporary (501 - 1000)
- Conflict (501 - 600)

This comment has been minimized.

Copy link
@markus2330

markus2330 Mar 20, 2019

Contributor

Conflict is not really temporary: it is permanent until the conflict is resolved...

This comment has been minimized.

Copy link
@Piankero

Piankero Mar 21, 2019

Author Contributor

A few days ago you wrote this:

I think we can split the categories a bit more:
...

  • Temporary into conflicts and timeouts...

What exactly should I take now. In one sentence you write I should split it and now you tell me that I should not do it?

This comment has been minimized.

Copy link
@markus2330

markus2330 Mar 21, 2019

Contributor

I do not see the contradiction you see here.

"[split] Temporary into conflicts and timeouts" means that there should be two categories: conflicts and timeouts.

This comment has been minimized.

Copy link
@Piankero

Piankero Mar 25, 2019

Author Contributor

I understood it the way that I should make a further subdivision but keep the overall category. I understand it now

This comment has been minimized.

Copy link
@Piankero

Piankero Mar 27, 2019

Author Contributor

I am a bit confused concerning "Conflicts" even though when reading through #1473. What kind of errors do you imagine under this category? Can you give some examples?

This comment has been minimized.

Copy link
@markus2330

markus2330 Apr 15, 2019

Contributor

Conflict is one kind of error: when kdbSet() fails because some external modification was done. The solution is that the application needs to do kdbGet() afterwards.

The conflict happens because we are optimistic: we assume we are the only one working with the configuration files. If this assumption does not hold, a conflict occurs.

information at all.

The grouping of errors will allow developers to filter for specific as well as more general errors to correctly
react to them programmatically. Even though there are currently just 8 categories they will reserver up to 1500 error numbers.

This comment has been minimized.

Copy link
@markus2330

markus2330 Mar 20, 2019

Contributor

reserver?

Just like Windows, plist uses standard Mac OS X errors which is a [huge catalog](http://krypted.com/lists/comprehensive-list-of-mac-os-x-error-codes/) of unordered
return codes as integers.
* [SNMP Standard](http://www.snmp.com/protocol/):
Being a standard network protocol, error codes are very specific to the domain itself. A list can be found [here](https://docs.microsoft.com/en-us/windows/desktop/snmp/snmp-error-codes) and would not meet the needs of elektra at all.

This comment has been minimized.

Copy link
@markus2330

markus2330 Mar 27, 2019

Contributor

Where do you see that the error codes are specific to the domain? With a few exceptions (e.g. SNMP_ERRORSTATUS_AUTHORIZATIONERROR) I think most of the error codes have some equivalent in Elektra.

This comment has been minimized.

Copy link
@Piankero

Piankero Mar 27, 2019

Author Contributor

UNDOFAILED, WRONGLENGTH, TOOBIG, WRONGENCODING, INCONSISTENTNAME, ... they might all could be mapped to some elektra class but in my opinion these are completely different domains. While SNMP emphasizes on errors in packets, elektra does on handling key value pairs.

@Piankero

This comment has been minimized.

Copy link
Contributor Author

Piankero commented Mar 27, 2019

I reworked the concept as you desired @markus2330 and also investigated other projects. I liked the SQLSTATES and incorporated it into the concept. The categories as of now did not change because I did not see the need for it when going through other error categories.
If we ever see the need for another category it is very easy to change.

@markus2330
Copy link
Contributor

markus2330 left a comment

The categories are fine for me now, I think you can implement this in the error specification.

For the API there are still discussions (SOLUTION) and extensions (high-level API?) needed.

I think you should split this to several decision files. In this PR please only add the error specification so that we can finally merge this.

Configfile: ...../<file>.25676:1549919217.284067.tmp
```

The new default error message will look like this:

This comment has been minimized.

Copy link
@markus2330

markus2330 Mar 27, 2019

Contributor

What is with warnings?

This comment has been minimized.

Copy link
@Piankero

Piankero Mar 27, 2019

Author Contributor

I slightly changed the message. Warnings and Errors are as u suggested interchangeable and should therefore also share the same message format

The new default error message will look like this:

```
Sorry, plugin <PLUGIN> issued error #<NR>:

This comment has been minimized.

Copy link
@markus2330

markus2330 Mar 27, 2019

Contributor

Is the # a good idea? It creates problems in GitHub as GitHub thinks it is an issue. Maybe "E" or "#E" would be better?

This comment has been minimized.

Copy link
@Piankero

Piankero Mar 27, 2019

Author Contributor

Since the new error is a String I would not prefix it with a letter.
Why not take a more straightforward approach and use ... issued error code XXXXX?

This comment has been minimized.

Copy link
@markus2330

markus2330 Apr 2, 2019

Contributor

Because people copy&paste only the XXXXX and sometimes the information if it was a warning/error gets lost then. Please also create an issue about that.

This comment has been minimized.

Copy link
@Piankero

Piankero Apr 2, 2019

Author Contributor

Well, the concept relies on the fact that errors and warning are interchangeable like you requested. It also does not make a considerable difference if it is a Resource "Warning" or "Error" for example. The message itself states if it was an error or warning and is even color highlighted. Honestly I would not prefix a code with "W" or "E" just because user could potentially lazy copy an error code. The error code itself also is primarily used for programmatic reaction rather than for users to copy. Users should rely on the message. This is one key aspect of the concept

This comment has been minimized.

Copy link
@markus2330

markus2330 Apr 3, 2019

Contributor

Ok.

The class of errors which require certain permissions (opening a file) or where some directory/file is missing.
One possible reaction to these kind of errors is to try a different directory/file or create a missing resource.
- Parsing
Errors while parsing files such as ini/yml/etc.

This comment has been minimized.

Copy link
@markus2330

markus2330 Mar 27, 2019

Contributor

yml?

This comment has been minimized.

Copy link
@Piankero

Piankero Mar 27, 2019

Author Contributor

yml and yaml are more or less interchangeable. But the official documentation suggests yaml so I will take it

This comment has been minimized.

Copy link
@markus2330

markus2330 Apr 2, 2019

Contributor

I heard of yml being a file extension (maybe to be compatible to DOS 8.3 file names) but never that yml is used to describe the file format.

ELEKTRA_ADD_WARNINGF(nr, parentKey, message, ...)
// Extended API
ELEKTRA_SET_ERROR_WITH_SOLUTION(nr, parentKey, solution, message)

This comment has been minimized.

Copy link
@markus2330

markus2330 Mar 27, 2019

Contributor

With this approach we would not have format specifiers for solutions and it would also blow up the size of kdberrors.h (maybe you even find a way how to make the file shorter).

What about having a macro ELEKTRA_ADD_SOLUTION which adds a solution message to the last error/warning? This would also allow something like src/tools/kdb/mountbase.cpp very bottom.

This comment has been minimized.

Copy link
@Piankero

Piankero Mar 27, 2019

Author Contributor

Concerning the "blow up" part I already talked about with @kodebach. With some refactoring we could drastically reduce the code duplication generated by the cpp file.

Yes, we could add another macro which could do that but would also break the atomicity then. Eg. plugin a and b throw an independent error. Plugin a also wants to add a solution. First plugin a writes to the parent key. Then plugin b follows immediately. Plugin a then adds its solution which would not fit together then

This comment has been minimized.

Copy link
@kodebach

kodebach Mar 27, 2019

Contributor

maybe you even find a way how to make the file shorter

I got some ideas about this while working on the highlevel code generation. I already sent them to @Piankero, but I will post it here again, in case you have any input:

Solution 1: 1 macro + 1 short static inline function per error code

We start with a header file that contains the basic macros (ELEKTRA_SET_ERROR etc):

#define ELEKTRA_SET_ERROR(code, key, message, <...>) elektraSetError##code (key, message, <...>)

For each error code we append something like the following:

#define ELEKTRA_ERROR_SOMETHING 01001

static inline void elektraSetError01001 (Key * key, const char * message, <...>) {
    keySetMeta(key, "error", "01001");
   [...]
    elektraSetErrorInformation (message, <...>);
}

elektraSetErrorInformation sets the common attributes of an error. It should be declared in the generated header (or one that is included in the generated file), and defined in a separate C file, so that we don't duplicate the code everytime kdberrors.h is included.

The <...> part is a placeholder for the remaining dynamic parameters needed for error creation. The [...] part is a placeholder for the remaining static properties of an error given in the specification.

Solution 2: 2(?) macros per error code

By abusing the preprocessor a bit, we could also do everything without the inline functions.
The reason the inline functions exist in the current (AFAIK) and the above solution, is so that we get compiler errors, if ELEKTRA_SET_ERROR is used with unspecified error codes. The following fulfills the same purpose:

Like above we start with a header containing the basic macros. But instead of simply calling the inline function we do something like:

#define ELEKTRA_ERROR_PROPS(code) ELEKTRA_ERROR_PROPS_##code

#define ELEKTRA_SET_ERROR(code, key, message, <...>) \
    do {                                             \
        ELEKTRA_ERROR_PROPS(code);                   \
        elektraSetErrorInformation(message, <...>);  \
   } while(0)
#define ELEKTRA_ERROR_SOMETHING 01001

#define ELEKTRA_ERROR_PROPS_01001(key)     \
        keySetMeta(key, "error", "01001"); \
        [...]

We might need additional helper macros so the substitution is done correctly

Beacuse ELEKTRA_ERROR_PROPS_##code only exists for specified we will still get a compiler error for undefined error codes. The advantage of using macros is that they are definitely inlined, although the static inline functions should also be inlined in optimized builds.

What about having a macro ELEKTRA_ADD_SOLUTION which adds a solution message to the last error/warning?

I think that is better than having ELEKTRA_SET_ERROR_WITH_SOLUTION, but I would prefer a function (should reduce binary size, if not static) instead of a macro, unless you know a reason why a function wouldn't work.

This comment has been minimized.

Copy link
@kodebach

kodebach Mar 27, 2019

Contributor

Eg. plugin a and b throw an independent error.

Plugins are always called sequentially by Elektra, so this situation cannot normally arise. Elektra is also explicitly not thread safe, so if parallelism occurs the developer has to handle all problems that may arise.

As long as the plugin doesn't try to do something stupid like setting an error in kdbGet and the solution in kdbSet, everything should be fine.

This comment has been minimized.

Copy link
@markus2330

markus2330 Mar 28, 2019

Contributor

Plugins are always called sequentially by Elektra, so this situation cannot normally arise.

There are some situations, e.g. if the plugins within the "error" position emit further errors or multiple notification fail (which happens after commits).

Of course we can prohibit to emit such errors in such situations but that should be explained in guidelines hopefully written by @Piankero

This comment has been minimized.

Copy link
@markus2330

markus2330 Apr 2, 2019

Contributor

I like @kodebach idea to reduce the size of kdberrors.h. This should also be part of the new concept. But as said below, we need a project or some other means to keep track of these proposals.

@markus2330

This comment has been minimized.

Copy link
Contributor

markus2330 commented Mar 27, 2019

Btw. the related work is nicely done!

Each error will be made up of 5 characters, where the first 2 character indicate the highest level
and character 3 to 5 will be used for subgrouping.

- Permanent errors (01000)

This comment has been minimized.

Copy link
@kodebach

kodebach Mar 27, 2019

Contributor

Use "01000" instead of 01000. Otherwise it looks like a C-style octal number.

@Piankero Piankero force-pushed the Piankero:error_message_concept branch from 8ec89c4 to f71a8cb Mar 27, 2019

@kodebach

This comment has been minimized.

Copy link
Contributor

kodebach commented Mar 27, 2019

For the API there are still discussions (SOLUTION) and extensions (high-level API?) needed.

The current error codes of the high-level API could simply integrated into the new concept. Either as their own category or otherwise marked with some attribute. The ElektraError struct of the high-level API would be replaced by one that contains all the information of the new errors. For all errors belonging to the high-level API we would also generate functions that directly create the appropriate ElektraError struct.

This would also be good test for whether the concepts is usable, if a different error reporting framework (other than an errorKey) is used.

@Piankero Piankero force-pushed the Piankero:error_message_concept branch from f71a8cb to 7412f93 Mar 27, 2019

@markus2330

This comment has been minimized.

Copy link
Contributor

markus2330 commented Mar 28, 2019

I approve the categories for now. Please split up the decision in further files in the next PR.

You can start working recategorize all errors.

@markus2330 markus2330 merged commit ae26c98 into ElektraInitiative:master Mar 28, 2019

5 of 6 checks passed

continuous-integration/jenkins/pr-merge This commit cannot be built
Details
LGTM analysis: C/C++ No code changes detected
Details
LGTM analysis: JavaScript No code changes detected
Details
LGTM analysis: Python No code changes detected
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
coverage/coveralls Coverage increased (+3.2%) to 77.023%
Details
@Piankero

This comment has been minimized.

Copy link
Contributor Author

Piankero commented Mar 30, 2019

For the API there are still discussions (SOLUTION) and extensions (high-level API?) needed.

The current error codes of the high-level API could simply integrated into the new concept. Either as their own category or otherwise marked with some attribute. The ElektraError struct of the high-level API would be replaced by one that contains all the information of the new errors. For all errors belonging to the high-level API we would also generate functions that directly create the appropriate ElektraError struct.

This would also be good test for whether the concepts is usable, if a different error reporting framework (other than an errorKey) is used.

After categorizing some errors I would kind of tend to have an own category because all errors from other languages could also be included there (python, java, etc.)

Maybe we could call is Language error or API error?

@Piankero Piankero deleted the Piankero:error_message_concept branch Mar 30, 2019

@markus2330

This comment has been minimized.

Copy link
Contributor

markus2330 commented Mar 30, 2019

There are already high-level API specific errors and other APIs should reuse these errors. Why do you want to also have API errors in the low-level errors?

@Piankero

This comment has been minimized.

Copy link
Contributor Author

Piankero commented Mar 30, 2019

Ok, well then the concept of the high level API is not clear yet to me how it works or is implemented.

I still wonder though where to put language errors into which come from java/python/lua/etc.
They do not fit anywhere as of now

@markus2330

This comment has been minimized.

Copy link
Contributor

markus2330 commented