Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Separate validation logs from operation logs #31

Closed
ydahhrk opened this issue Apr 17, 2020 · 10 comments
Closed

Separate validation logs from operation logs #31

ydahhrk opened this issue Apr 17, 2020 · 10 comments
Milestone

Comments

@ydahhrk
Copy link
Member

ydahhrk commented Apr 17, 2020

Some people have manifested annoyance over the large volume of complaints that Fort logs due to issues found in the RPKI tree:

(sample excerpt)
WRN: rsync://rpki.cnnic.cn/rpki/A9162E3D0000/2694/rXp9fDjcCAhxxza7mMddsMJ8uMc.roa: Serial number '661' is not unique. (Also found in 'rsync://rpki.cnnic.cn/rpki/A9162E3D0000/2694/5Rs-Pc07Y_zNasRoTeJVfORvmOE.roa'.)
WRN: rsync://rpki.cnnic.cn/rpki/A9162E3D0000/497/cOev3xoFWdNyUTYsMDI5Egl8PQ8.roa: Serial number '6BD' is not unique. (Also found in 'rsync://rpki.cnnic.cn/rpki/A9162E3D0000/497/4BqXb84YtfqpHX1W1oGcKGi_jYs.roa'.)
WRN: rsync://rpki.cnnic.cn/rpki/A9162E3D0000/561/R_8KCa9pc3fNftjCdSYROwMQB3c.roa: Serial number '6B9' is not unique. (Also found in 'rsync://rpki.cnnic.cn/rpki/A9162E3D0000/561/GqEqV9qIXXJK2ICJP14IjGwoQKQ.roa'.)
WRN: rsync://rpki.cnnic.cn/rpki/A9162E3D0000/561/URCkLU-hrbETGHt-zWzU-tZvi70.roa: Serial number '6B9' is not unique. (Also found in 'rsync://rpki.cnnic.cn/rpki/A9162E3D0000/561/GqEqV9qIXXJK2ICJP14IjGwoQKQ.roa'.)
ERR: rsync://rpki.cnnic.cn/rpki/A9162E3D0000/FE-4PMY9qqTI2aJ0xLDm7cD-fvw.cer: Could not open location 'tmp/rpki.cnnic.cn/rpki/A9162E3D0000/515/FE-4PMY9qqTI2aJ0xLDm7cD-fvw.mft'
ERR:   - No such file or directory
ERR: rsync://rpki.cnnic.cn/rpki/A9162E3D0000/FE-4PMY9qqTI2aJ0xLDm7cD-fvw.cer: Could not open location 'tmp/rpki.cnnic.cn/rpki/A9162E3D0000/515/FE-4PMY9qqTI2aJ0xLDm7cD-fvw.mft'
ERR:   - No such file or directory
WRN: rsync://rpki.cnnic.cn/rpki/A9162E3D0000/2329/01jyjWbE_vnmei3aHGlV_cCSXYA.roa: Serial number '732' is not unique. (Also found in 'rsync://rpki.cnnic.cn/rpki/A9162E3D0000/2329/-w8V5CMi6bHN88xgo8K9tma5MC8.roa'.)
WRN: rsync://rpki.cnnic.cn/rpki/A9162E3D0000/2329/70nRMOo1VzrcvxbocYKqzlcNbqU.roa: Serial number '732' is not unique. (Also found in 'rsync://rpki.cnnic.cn/rpki/A9162E3D0000/2329/-w8V5CMi6bHN88xgo8K9tma5MC8.roa'.)
WRN: rsync://rpki.cnnic.cn/rpki/A9162E3D0000/2329/9YAUV53TP8C8valffV_Pbl8k0Pc.roa: Serial number '732' is not unique. (Also found in 'rsync://rpki.cnnic.cn/rpki/A9162E3D0000/2329/-w8V5CMi6bHN88xgo8K9tma5MC8.roa'.)
WRN: rsync://rpki.cnnic.cn/rpki/A9162E3D0000/2329/Hi7elxud7qvaNilveqkgB0LmoHs.roa: Serial number '732' is not unique. (Also found in 'rsync://rpki.cnnic.cn/rpki/A9162E3D0000/2329/-w8V5CMi6bHN88xgo8K9tma5MC8.roa'.)
WRN: rsync://rpki.cnnic.cn/rpki/A9162E3D0000/2329/U2DxoZ7UrM_rju6WOJe1r5LpBgM.roa: Serial number '732' is not unique. (Also found in 'rsync://rpki.cnnic.cn/rpki/A9162E3D0000/2329/-w8V5CMi6bHN88xgo8K9tma5MC8.roa'.)
WRN: rsync://rpki.cnnic.cn/rpki/A9162E3D0000/2329/ZcvvMu_hxh0SyftR2-OPegdnTHA.roa: Serial number '732' is not unique. (Also found in 'rsync://rpki.cnnic.cn/rpki/A9162E3D0000/2329/-w8V5CMi6bHN88xgo8K9tma5MC8.roa'.)
WRN: rsync://rpki.cnnic.cn/rpki/A9162E3D0000/2329/yWD9OmpNgJA5rwfOF9JlXPfgHLw.roa: Serial number '732' is not unique. (Also found in 'rsync://rpki.cnnic.cn/rpki/A9162E3D0000/2329/-w8V5CMi6bHN88xgo8K9tma5MC8.roa'.)

The admin running this is unlikely to have the authority to correct these problems, and they tend to obfuscate the actual operation errors, which are much more important for them.

On the other hand, downgrading the priority of these messages to debug would be a bit extreme because they can be helpful for people trying to validate the compliance of their own RPKI objects.

It has been therefore proposed to separate these messages into a separate logging channel.

Update: This bug is independent of incidences:

  • Incidences were conceived (mainly) as a means to direct Fort not to drop RPKI objects due to minor, understood, common and specific profile validation violations.
  • This is (mainly) a means to prevent Fort from polluting the logs with potentially unimportant information. (While still likely dropping the offending objects.)
@ydahhrk
Copy link
Member Author

ydahhrk commented Apr 17, 2020

First draft by @pcarana (translated by me):

Requirements

  • Separate "validation" and "application" logs.
  • Allow users to silence one and/or the other.
  • Add local date and time (eg. "Feb 12 09:38:21: INF: Validation finished") for console logging.
  • Validation logs should have a dedicated prefix (eg. "[Validation log]"), so they can be filtered by syslog if both logs are enabled.

Log types

Application/execution/operation logs

  • Out of memory
  • Configuration (invalid values, inaccessible paths, configuration echo, etc)
  • Communication (rsync/http) infos/errors
  • Programming errors

Validation logs

  • Any message (error, warn, info, etc.) concerning RPKI tree files

Proposal

  • The current log argument set remains, but will refer only to operation logs.
  • Create new argument set validation-log, which refers to validations logs.
  • Both logs can be disabled.

Current:

{
	"log": {
		"color-output": true | false,
		"file-name-format": "global-url" | "local-path" | "file-name",
		"output": "console" | "syslog",
		"level": "error" | "warning" | "info" | "debug"
	}
}

Proposal:

{
	"log": {
		"enable": true | false,
		"color-output": true | false,
		"file-name-format": "global-url" | "local-path" | "file-name",
		"output": "console" | "syslog",
		"level": "error" | "warning" | "info" | "debug"
	},
	"validation-log": {
		"enable": true | false,
		"color-output": true | false,
		"file-name-format": "global-url" | "local-path" | "file-name",
		"output": "console" | "syslog",
		"level": "error" | "warning" | "info" | "debug"
	}
}

Definitions:

  • log.*: Configuration concerning operation logs.
  • validation-log.*: Configuration concerning RPKI tree validation logs.
  • log.enable: Enables or disables the corresponding logs.

Default values:

  • log.enable: true
  • log.output: console
  • log.level: warning
  • log.console.color: false
  • validation-log.enable: false
  • validation-log.output: console
  • validation.level: warning
  • validation.console.color: false
  • validation.file-name-format: global-url

@ydahhrk
Copy link
Member Author

ydahhrk commented Apr 17, 2020

Communication (rsync/http) infos/errors

I think these should be moved to the validation logs section.

Proposal:

Consider adding a prefix option to each log set so the prefix is not hardcoded:

{
	"log": {
		"enable": true | false,
		(...)
		"prefix": "[Operation log] "
	},
	"validation-log": {
		"enable": true | false,
		(...)
		"prefix": "[Validation log] "
	}
}

These should probably default to empty strings.

@pcarana
Copy link
Contributor

pcarana commented Apr 27, 2020

I think these should be moved to the validation logs section.

Maybe, but there's one reason that holds me for doing this: what if the "communication info/error" is the host's responsibility? e.g. the host where fort is running it's having network issues, or has a firewall that's blocking rsync/https. In that case, I believe it's important to notify the operator.

As of today, there are common errors at the global RPKI servers, so yes, this logs will be present at the operation logs even when this isn't the host fault; but I think that's better to notify on any error of this kind at such logs which will be enabled by default, rather than sending them to the validation logs (that will be disabled by default).

Consider adding a prefix option to each log set so the prefix is not hardcoded:
{...}
These should probably default to empty strings.

I like this, let the prefix to be configured rather than hardcoded. Using empty strings as default: I'm not so sure; I would prefer to use the default values just as you wrote: log.prefix="[Operation log]" and validation-log="[Validation log]".

What do you think regarding this comments?

@ydahhrk
Copy link
Member Author

ydahhrk commented Apr 28, 2020

Maybe, but there's one reason that holds me for doing this: what if the "communication info/error" is the host's responsibility? e.g. the host where fort is running it's having network issues, or has a firewall that's blocking rsync/https. In that case, I believe it's important to notify the operator.

As of today, there are common errors at the global RPKI servers, so yes, this logs will be present at the operation logs even when this isn't the host fault; but I think that's better to notify on any error of this kind at such logs which will be enabled by default, rather than sending them to the validation logs (that will be disabled by default).

Each TAL maps to a very small amounts of servers.

For example, a standalone run using Lacnic's TAL currently downloads files from repository.lacnic.net and rpki-repo.registro.br, and nowhere else. A standalone run using Afrinic's TAL downloads files from rpki.afrinic.net, and nowhere else.

Therefore, it doesn't make sense to flood the operation logs with messages like

Hey, I couldn't download repository.lacnic.net/cert3.cer
Hey, I couldn't download repository.lacnic.net/cert54.cer
Hey, I couldn't download repository.lacnic.net/cert129.cer
Hey, I couldn't download repository.lacnic.net/cert11.cer

Those belong to the validation logs. The operation logs should at most have one message per iteration like

Hey, I couldn't download 4 files from repository.lacnic.net

But I'd say that, as long as you could download at least one file from a server, then you can tell that any problems with that server are unlikely (impossible?) to be the host's responsibility.

Personally, I would only notify the admin if the entire server is unavailable.

Using empty strings as default: I'm not so sure; I would prefer to use the default values just as you wrote: log.prefix="[Operation log]" and validation-log="[Validation log]".

The validation log is disabled by default. That means that there is no need for a prefix by default.

As a compromise, how about

  • Default operation prefix: ""
  • Default validation log prefix: "[Validation] "

@pcarana
Copy link
Contributor

pcarana commented Apr 29, 2020

Personally, I would only notify the admin if the entire server is unavailable.

This seems the way, there's no need to flood the operation logs with individual messages. The operation logs will have one general message related to a communication error only if such error persists.

To achieve that, we could use a new argument to log this error (communication error) at the operation log only if the error has persisted after some time (e.g. if after 4 hours the download isn't successful). On the meanwhile, after that time hasn't passed yet, send this log the validation logs.

There's only one exception to consider: the first validation run: if there's an error during this run, log it as a warning at the operation logs.

The validation log is disabled by default. That means that there is no need for a prefix by default.

As a compromise, how about

  • Default operation prefix: ""
  • Default validation log prefix: "[Validation] "

True, I like your proposal. Let's do it that way

@pcarana
Copy link
Contributor

pcarana commented Apr 29, 2020

Add also the log.facility and validation-log.facility properties (see comments at #26).

@pcarana pcarana added this to the v1.3.0 milestone May 15, 2020
@pcarana pcarana closed this as completed in 0e0d367 Jul 7, 2020
@Iqbalmahmood99
Copy link

Can anyone help with the query: from which location of the server this log files can be viewed?

@ydahhrk
Copy link
Member Author

ydahhrk commented Jun 28, 2022

Logs are sent to syslog. By default, my Ubuntu places them in /var/log/syslog.

@Iqbalmahmood99
Copy link

Iqbalmahmood99 commented Jun 28, 2022 via email

@ydahhrk
Copy link
Member Author

ydahhrk commented Jun 28, 2022

AFAIK, If log.output and validation-log.output are console and you're running Fort as a service, then the logs are getting lost. Either don't run it as a service, or change the outputs to syslog.

If it's still not writing to /var/log/syslog, find the logs with

cd /var/log
sudo grep fort . -r

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants