[Outreachy 2023] Improve user experience on early system boot failures #26586

Closed
bluca opened this issue Feb 24, 2023 · 76 comments · Fixed by #27830 or #28873
Labels
bsod journal 🙋🏽‍♀️ outreachy 🙋 pid1 RFE 🎁 Request for Enhancement, i.e. a feature request

Comments

@bluca
Member

bluca commented Feb 24, 2023

Component

systemd/journald

Project description

Improve the user experience when something goes wrong early at system boot. We want to ensure that users are able to debug early boot failures by providing QR-encoded URLs pointing to useful descriptions of the errors that the system just experienced.

We also want to ensure that the system meets a minimum baseline level before it is allowed to continue booting, which for this project consists in checking that the battery level is adequate and above a minimum configurable threshold.

Implementing this project will require working on low-level components such as the systemd system manager, the journald logging service, and adding new early boot services that handle virtual terminals, consoles and QR encoding.

Tasks

  • detect battery level in initrd and if low refuse continuing to boot, print message and shut down
  • if boot fails for any other reason, display a QR code with a link to the journal message catalog explaining the error that can be scanned on a phone
    • audit error level logs that are relevant for early boot failures, and bump log level up
    • for each such log message, have an entry in the message catalog, and ensure that the entry has a URL
    • add a service that the journal can activate and that will encode the log message URL into a QR code and display it on the screen for the user
    • the new service should be implemented with socket connection activation: [Outreachy 2023] Improve user experience on early system boot failures #26586 (comment)
    • extend the boot assessment functionality to plug in this functionality by for example having a timing-out target
@bluca bluca added RFE 🎁 Request for Enhancement, i.e. a feature request 🙋🏽‍♀️ outreachy 🙋 labels Feb 24, 2023
@bluca bluca added the journal label Feb 24, 2023
@bluca bluca changed the title Improve user experience on early system boot failures [Outready 2023] Improve user experience on early system boot failures Feb 24, 2023
@poettering
Member

implementation idea for "activation-by-journald":

* journald: add varlink service that allows subscribing to certain log events,
  for example matching by message ID or log level, and returns a list of journal
  cursors as they happen.

* In .socket units, add ConnectStream=, ConnectDatagram=,
  ConnectSequentialPacket= that create a socket, and then *connect to* rather than
  listen on some socket. Then, add a new setting WriteData= that takes some
  base64 data that systemd will write into the socket early on. This can then
  be used to create connections to arbitrary services and issue requests into
  them, as long as the data is static. This can then be combined with the
  aforementioned journald subscription varlink service, to enable
  activation-by-message id and similar.

(from the TODO file)

@bluca bluca changed the title [Outready 2023] Improve user experience on early system boot failures [Outreachy 2023] Improve user experience on early system boot failures Feb 24, 2023
@yuwata yuwata pinned this issue Mar 12, 2023
@1awesomeJ
Contributor

Good evening @bluca.

I'm extremely elated to have gotten this opportunity to be your protege over the next months.

I know it can be quite demanding on your schedule, but please, I'd like to start my mentorship immediately.

You, as well as the other team members here, have made a huge impact on my tech journey this year sir.

Please find the time in your schedule, no matter how little it is, to start with me immediately sir.

Thank you so so much. God bless you sir.

@1awesomeJ
Contributor

First.
My last PR about tpm2 PCR has a small bug.
In its current state, users can attempt to set PCR[24] without getting any errors.

That's because of an issue of '>' rather than '>='

I'll open a PR to fix that tonight or early tomorrow.
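
The off-by-one described above can be illustrated with a minimal sketch. It assumes 24 PCRs indexed 0 to 23, so that PCR[24] is out of range; the constant and function names are hypothetical and are not taken from the systemd source.

```c
#include <stdbool.h>

/* Assumption: 24 PCRs, so the valid indexes are 0..23. Illustrative name only. */
#define SKETCH_PCR_COUNT 24U

/* Buggy form: '>' only rejects 25 and above, so index 24 slips through. */
bool pcr_index_valid_buggy(unsigned idx) {
        return !(idx > SKETCH_PCR_COUNT);
}

/* Fixed form: '>=' rejects index 24 and everything above it. */
bool pcr_index_valid(unsigned idx) {
        return !(idx >= SKETCH_PCR_COUNT);
}
```

With the buggy comparison, index 24 is accepted silently, which matches the behaviour reported above; the fixed comparison accepts 0 through 23 only.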

I have just ordered a new computer that I hope to pick up tomorrow morning.

If I can't set up a vagrant box on my old PC again tonight, then I'll push the PR tomorrow.

@1awesomeJ
Contributor

Second.

I really do not understand PID1.

I'm thinking src/core/main.c is what I should be studying sir.

Is that right?

@1awesomeJ
Contributor

Third.

What other modules should I start with, in order to get the actual flow of the init process?

Thank you sir.

@1awesomeJ
Contributor

@bluca,
Here's my current update:
I've just digested the first function in "src/core/main.c", the manager_find_user_config_paths() function.
Going through it, and verifying the file and directory names on my local machine, has reduced my confusion about the whole process flow significantly. I intend to cover up to line 2000 of the module before the week runs out. I hope by then I will have learnt enough to know the point at which our feature comes in.

It's a really cool feature. Thank you for choosing it sir, and thank you for this big opportunity of a mentorship.
I'm grateful.

When I get excited about my learning again, I'll drop an update. That way I stay in touch, and you can also monitor my progress.

@1awesomeJ
Contributor

Going through the console_setup() function, I saw a comment saying that we don't want to enforce text mode on the console, as the initrd may have started a graphical process like plymouth.

Up until now, I thought the boot process flow was:

  1. BIOS/UEFI
  2. GRUB/systemd-boot
  3. kernel
  4. systemd (PID 1)

It looks like it's actually:

  1. BIOS/UEFI
  2. GRUB/systemd-boot
  3. kernel
  4. initrd
  5. systemd (PID 1)

I thought the initrd was only accessed at some point during PID 1, but it looks like it actually runs before PID 1.

So far, I've learnt that initrd is distribution dependent and isn't directly part of our project.

@1awesomeJ
Contributor

@bluca @poettering,
Good evening sirs.
For the task "detect battery level in initrd and if low refuse continuing to boot, print message and shut down"

I'm confused about the "in initrd" part.

My understanding is that the bootloader loads the kernel and initrd/initramfs into RAM. The initrd mounts a temporary file system through which the kernel gets the drivers it needs for the machine's hardware. After the necessary hardware checks, the actual root file system is mounted and the kernel starts systemd as PID 1.

So my confusion is that the initrd seems to be outside our project. How then do we detect the battery level inside of it?

Is it that we can somehow make the initrd mount the temporary file system and thereafter run a script for detecting the battery level?

Or, more in line with my understanding, is it that as soon as the initrd completes and PID 1 starts, the first thing we do is check the battery level to decide whether or not we should continue with the init process?

I'm sorry for being verbose and naive.

I often get good clarity after your guidance.

Thank you sirs.

@AdrianVovk
Contributor

I suggest reading this man page. It will possibly help with your confusion about the boot process. But here's a quick summary of the initrd:

  1. User presses power button on their device
  2. BIOS/UEFI starts up (aka "firmware"; not under our control)
  3. Firmware starts a bootloader (examples: GRUB, systemd-boot)
  4. Bootloader finds and loads the kernel & initrd into memory
  5. Bootloader hands control into the kernel, and tells it where the initrd is loaded
  6. The kernel "mounts" the initrd as the root filesystem (it actually copies the files into a tmpfs, but it's easier to think of it as mounting)
  7. Once the kernel is ready to execute programs, it runs /init
  8. /init does whatever it needs to find the real root filesystem and mount it somewhere (usually /sysroot)
  9. /init tells the kernel to "pivot root" into /sysroot. This replaces / with everything that was mounted to /sysroot. In other words, from this point forwards the initrd's files are gone and now / is the real root filesystem. /init keeps running from memory
  10. /init finds and executes systemd in the real system. On UNIX, exec stops executing one program and turns it into another. Effectively, at this moment /init stops existing and is completely replaced by systemd from the real system

Many distros choose to use systemd inside of the initrd as well as outside of it. This is the case that you should care about. This is what is meant by "in the initrd" in the context of systemd. Here's what that looks like

  1. User presses power button on their device
  2. BIOS/UEFI starts up
  3. Firmware starts a bootloader
  4. Bootloader finds and loads the kernel & initrd into memory
  5. Bootloader hands control into the kernel, and tells it where the initrd is loaded
  6. The kernel "mounts" the initrd as the root filesystem
  7. Once the kernel is ready to execute programs, it runs /init, which is actually systemd (I'll call this initrd-systemd)
  8. initrd-systemd detects that it is running in the initrd, so it goes into initrd mode instead of a normal bootup
  9. initrd-systemd executes units that find the real root filesystem and mount it to /sysroot
    • HERE is where you can include other services that do other things, like your battery check
    • Some of the units initrd-systemd executes load drivers that the kernel needs
    • One of the units initrd-systemd executes will be plymouth, which plays a boot animation and/or prompts for a password
  10. initrd-systemd cleans up after itself: most of the units that were running are stopped, but some can ask to stick around if important
  11. Once /sysroot is ready, initrd-systemd then tells the kernel to pivot root. Again the initrd's files are now gone and / is replaced with whatever was mounted to /sysroot
  12. initrd-systemd then searches the new root filesystem to find the real systemd (I'll call that real-systemd)
  13. initrd-systemd replaces itself with real-systemd
  14. real-systemd detects that it is not running in an initrd, so it continues in normal bootup
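
For steps 8 and 14 above, the "am I in the initrd?" decision can be pictured as a marker-file check. The following is a minimal sketch, assuming the initrd image ships an /etc/initrd-release file that the real root does not; the function names are hypothetical and this is not systemd's actual implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <unistd.h>

/* Returns true if the given path exists, regardless of its type. */
bool path_exists(const char *path) {
        assert(path);
        return access(path, F_OK) == 0;
}

/* Assumption: the initrd image contains /etc/initrd-release and the
 * real root filesystem does not. */
bool running_in_initrd_sketch(void) {
        return path_exists("/etc/initrd-release");
}
```

After the pivot to /sysroot the marker file is no longer visible, so the same check would give a different answer for real-systemd.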

@1awesomeJ
Contributor

Wow, just like that? You took out your precious time to explain this in such great detail?

How awesome, how exciting.

Thank you so so so much @AdrianVovk.

I'm so so glad to be part of this community.

Thank you so so so much once again sir.

I'm super Thankful.

@1awesomeJ
Contributor

Thank you so much sir.
It's really awesome to get your guidance.

@bluca
Member Author

bluca commented May 29, 2023

@bluca @poettering, Good evening sirs. For the task "detect battery level in initrd and if low refuse continuing to boot, print message and shut down"

I'm confused about the "in initrd" part.

There are various units that run only in the initrd. They are generally named "initrd-something"; check the units/ directory. There's already a helper function to check whether the battery is critically low: battery_is_discharging_and_low() in src/shared/sleep-config.c. A simple way to implement this task is to create a new simple tool that just calls that helper, modeled after src/ac-power/ac-power.c, except that it exits with an error (and logs an emergency alert) if the battery is critical and with success otherwise. Then add an initrd unit that calls it and also sets FailureAction=poweroff-force - see: https://www.freedesktop.org/software/systemd/man/systemd.unit.html#FailureAction=
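
The exit-status contract described above could be sketched roughly as follows. This is only an illustration: the 5% threshold and both function names are hypothetical, and the real tool would call battery_is_discharging_and_low() rather than take the values as arguments.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical threshold; the cut-off used by the real helper may differ. */
#define SKETCH_BATTERY_LOW_PERCENT 5

/* Stand-in for the real helper: true only when on battery AND nearly empty. */
bool sketch_battery_discharging_and_low(bool discharging, int capacity_percent) {
        return discharging && capacity_percent <= SKETCH_BATTERY_LOW_PERCENT;
}

/* Exit status for the tool: a non-zero status makes the initrd unit fail,
 * and the unit's FailureAction= setting then decides what happens next. */
int sketch_battery_check_exit_status(bool discharging, int capacity_percent) {
        return sketch_battery_discharging_and_low(discharging, capacity_percent)
                ? EXIT_FAILURE
                : EXIT_SUCCESS;
}
```

A machine on AC power succeeds even with an almost empty battery, which is why the check combines both conditions.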

@1awesomeJ
Contributor

Yay!!!! Nice to hear from you sir.

@1awesomeJ
Contributor

Awesome sir. Thank you so much, I'll get to work right away. Thank you sir.

@1awesomeJ
Contributor

How has your day been sir? I look forward to a video meet with you soon. I hope my accent won't be terrible!
We were told to introduce ourselves to our community. Is there a place I can do that? I'm glad to have already interacted with @poettering, @keszybz, @medhefgo, @mrc0mmand, @yuwata, @DaanDeMeyer and @YHNdnzj. It's been a privilege.

@1awesomeJ
Contributor

I want to create a new tool, "battery-check.c". Should I place it in "src/battery-check/battery-check.c", or could it just stay in "src/ac-power/battery-check.c"?

Also, is the name choice good enough? Or should I use something like "early-boot-battery-check.c"?

I appreciate the time you commit to guiding me very very much, knowing how busy your schedules can be. It's a super awesome blessing to have you. Thank you sir.

Thank you everyone.

@1awesomeJ
Contributor

Also, I'm not seeing any other use cases for this tool that would warrant adding the "--help", "--version" and "--verbose" flags, as with the ac-power tool.

Should I leave those out sir?

@1awesomeJ
Contributor

Here's what I've done sir:
#27830

@1awesomeJ
Contributor

@bluca
Good evening sir,
I got an email from Outreachy that says I need to answer the question:
"Do mentor(s) meet with the intern over phone or video chat?"

Should I answer no?

Or, as I would like, would you permit me to schedule a video call sir?

Thank you.

@bluca
Member Author

bluca commented Jun 5, 2023

We can continue using github PRs/issues for now, and re-evaluate if needed

@1awesomeJ
Contributor

It's exciting though how familiar I'll get with the codebase, given all these deep concepts Poettering has mentioned. With that knowledge, I'll be able to fix at least one issue/TODO per month going forward.
I may also get to make meaningful comments on the PRs that you geniuses open. It'll be awesome.

@1awesomeJ
Contributor

I opened a PR to add a new call to server_open_varlink() here:
#28420

It's not quite ready for your reviews, but I'd rather just get started and push as I gain more knowledge.

@1awesomeJ
Contributor

So journald has a varlink IPC API already, see server_open_varlink() in src/journal/journald-server.c. I think best would be to build on that:

  1. extend this varlink api with a new call io.systemd.Journal.Subscribe() or so, which should initially just take a single log level argument, and which will reply with a notification reply every time a new message with the specified log level or higher is processed by journald. this notification would not contain the message payload though, clients are expected to check for that in the journal files direc

@poettering, @bluca , Good morning sirs.

I've been trying to implement this, but I'm not quite clear on it.
I added "io.systemd.Journal.Subscribe" to the varlink_server_bind_method_many() call,
binding the call to a function I named vl_subscribe_var()

The implementation of this vl_subscribe_var() method is where I've gotten stuck.

I understand this method should take a single argument which is the log level a service wants to subscribe to: emerg, warning, debug etc.

But I'm not sure what its prototype should be.

I have considered:
static int vl_subscribe_var(Varlink *link, JsonVariant *parameters, VarlinkMethodFlags flags, void *userdata); based on precedent in the codebase.
But:

  1. I'm not sure if all those parameters are needed

  2. I haven't quite pictured how a service like systemd-bsod will call this API. What would it be calling, and how? How does it pass the log level as a parameter, since most of these Varlink functions take many parameters that aren't even strings?

  3. How does vl_subscribe_var() fetch the log level, and what does it do with it?

I would appreciate some sort of breakdown on this sirs.

@1awesomeJ
Contributor

@poettering
Good afternoon sir.
Please I'm having issues implementing your guidance on the journald varlink API. I do not know how to implement the subscribe method in journald, I do not know how to get journald to send out notifications when it processes messages of the specified log level.

What I could implement so far is in this PR:
#28420

Kindly take a look in your spare time sir.
Thank you.

@bluca
Member Author

bluca commented Aug 3, 2023

I think we should change strategy a bit and simplify things. The idea of having journald send out notifications saves from having a running service that watches it, but puts a bit more burden in journald itself. It is also more complex.

So instead, I suggest simplifying and changing the bsod utility to gain a new mode, where it waits for messages. This is very simple to do, as you can see from the example here: https://github.com/systemd/systemd/blob/main/man/journal-iterate-wait.c
It could gain a --wait option or so, that makes it wait instead of just checking whether there is any message and exiting. And a unit file to run it.

@1awesomeJ
Contributor

Okay sir.

As a first approach at least.

Then the Varlink API at some point in the future.

@bluca
Member Author

bluca commented Aug 4, 2023

Is the description above enough to get you started or do you need more details?

@1awesomeJ
Contributor

😁😁😁
This is such an exciting question sir.
Thank you.

I haven't looked at the resource today, I'll probably do later in the night.

But I don't want to miss the chance of getting more guidance especially as we head into the weekend.

So yes sir I do need more details 😁😁

This is such an exciting question from you sir.
Thank you.

@1awesomeJ
Contributor

@bluca
I finally saw your face, and heard your voice here:
https://youtu.be/aygDMB23U2s

It's awesome 😁😁😁

@1awesomeJ
Contributor

A few things sir:
1. Please call me by name, "Josh", a few times before we end this internship. It'll be awesome.
2. About the --wait option, what I picture is that if someone runs ./systemd-bsod --wait, then we'll invoke the sd_journal_wait() call.
3. But now that we're running it with a unit, is there a way to pass that --wait option in the unit file? I only knew of setting ExecStart=path.
But on taking another look now, I see we can actually do ExecStart=systemd-bsod --wait
4. So every time journald processes a new log message, regardless of the log level, the call to sd_journal_wait() returns and the iteration continues. Doesn't running that many iterations impact our resource footprint? I estimate a few thousand per boot session.
5. Am I overestimating?
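
On the footprint question above: each wake-up only needs a cheap look at the entry's priority before doing anything expensive. Here is a minimal sketch, assuming the field arrives as a byte buffer of the form `PRIORITY=N` that is not NUL-terminated (the shape sd_journal_get_data() hands back). Both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns 0..7 for a well-formed "PRIORITY=N" buffer, -1 otherwise.
 * The buffer is treated as raw bytes and need not be NUL-terminated. */
int sketch_priority_from_field(const void *data, size_t length) {
        static const char prefix[] = "PRIORITY=";
        const size_t prefix_len = sizeof(prefix) - 1;
        const char *p = data;

        assert(data);

        if (length != prefix_len + 1 || memcmp(p, prefix, prefix_len) != 0)
                return -1;
        if (p[prefix_len] < '0' || p[prefix_len] > '7')
                return -1;

        return p[prefix_len] - '0';
}

/* syslog levels run from 0 (emerg) to 7 (debug); lower means more severe. */
bool sketch_is_emergency(const void *data, size_t length) {
        return sketch_priority_from_field(data, length) == 0;
}
```

A filter installed up front with sd_journal_add_match(j, "PRIORITY=0", 0) would likely make even this check unnecessary, since iteration would then only visit matching entries.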

@1awesomeJ
Contributor

If I'm picturing this correctly, This approach does simplify things by a factor of 100!😂😂

I do hope though that @poettering still finds time to come and guide us on that more complex approach. The knowledge from those four things he wants us to implement would be awesome to gain.

@1awesomeJ
Contributor

@bluca,
Good evening sir.
I added the --wait option here:
#28697

Kindly review at your convenience sir.

Thank you.

@1awesomeJ
Contributor

@bluca,

Good morning sir.

Kindly provide guidance on these sir:

->audit error level logs that are relevant for early boot failures, and bump log level up
->for each such log message, have an entry in the message catalog, and ensure that the entry has a URL

  1. They're outstanding tasks on my internship.
  2. They address my concern that our QR code in systemd-bsod needs to contain more than just the string already displayed. Having a URL would be great.

I know your schedule may be very busy, but for as long as you can spare time to guide me sir, I'll be available to refine these features (battery check and bsod) as much as possible.

Thank you sir.

@bluca
Member Author

bluca commented Aug 15, 2023

->for each such log message, have an entry in the message catalog, and ensure that the entry has a URL

It's easier to start from this - you can look around for existing log_emergency() and log_emergency_errno() calls, and add a catalogue entry first. Then we can add a new page in docs/ so that it gets published on systemd.io and can list the same text as the catalogue entries

@1awesomeJ
Contributor

Okay sir.

I'm coming back to ask about the details.

I saw @poettering is back, let me go express my excitement first!!!

@1awesomeJ
Contributor

@bluca,
I've gone through "log.c" and "log.h".
I attempted to trace the variables passed to log_emergency(), for instance, to find where they get used, so I could understand better how to add a catalogue entry and how it'll be processed.

I traced the calls like so: log_emergency(r, "emergency : %m") -> log_full(LOG_EMERG, r, "emergency: %m") (if PID = 1) -> log_full_errno_zerook(LOG_EMERG, 0, r, "emergency: %m") -> log_internal() -> log_internalv() -> log_dispatch_internal() -> write_to_journal() -> log_do_header(), log_do_context(), and sendmsg()

I didn't quite see any reference to the catalogue entry sir.

What I picture is that each message we process with log_emergency() and log_emergency_errno() should have some sort of message ID, which will be used to create an entry for it in the catalogue of all our journal messages. With this ID we can fetch a particular message from the journal, and such a fetch would give us many details about the message, including possible fixes for the problem it reports.
This fetching from the catalogue could also somehow be done with a URL, such that when that URL is opened, a web page with full details on the emergency message is shown.

But first, I'm not even sure if what I picture is accurate as per your guidance sir. If it is, or is close, how do I find the message catalogue, and how do I amend existing log_emergency() calls to make sure those messages get pushed to the catalogue?

Sorry I'm verbose!

@1awesomeJ
Contributor

Or could it be that you're not asking me to modify the existing calls to the logging functions,
but rather to just add the messages in those calls as entries in the message catalog?
I.e., where there's a call like log_emergency(r, "Failed to locate shared memory"),
what you're saying is that I should add the message "Failed to locate shared memory" as an entry in the journal catalog?

  1. Is that right sir? If so:
  2. How do I find the journal catalog to see its current form?
  3. How do I add new entries to it?

Thank you sir.

@1awesomeJ
Contributor

I just noticed there are some functions in the sd-journal API for the message catalogue. I'll look them up.
Perhaps catalog_list_items() will be useful for fetching all existing entries?

@1awesomeJ
Contributor

I see this entry:
-- f77379a8490b408bbe5f6940505a777b
Subject: The journal has been started
Defined-By: systemd
Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel

The system journal process has started up, opened the journal
files for writing and is now ready to process requests.

  1. Does it mean every log message we generate should have the following fields: (i) SD_ID128 ID, (ii) Subject, (iii) Defined-By, (iv) Support URL, (v) message description?

  2. On the org.freedesktop webpage, the purpose of the catalog is stated as follows:
    The message catalog has a number of purposes:

(a) Provide the administrator, user or developer with further information about the issue at hand, beyond the actual message text
(b) Provide links to further documentation on the topic of the specific message
(c) Provide native language explanations for English language system messages
(d) Provide links for support forums, hotlines or other contacts
If the fields listed in (1) above are all there are, then only (a) and (d) seem to be taken care of; (b) and (c) still need their fields added.

  3. Does this mean I have to manually check all calls to log_emergency() and log_emergency_errno() to add entries for messages which currently do not have an entry in the catalog? How would I go about that?

  4. Does this also mean that all those "AC power" messages from systemd-battery-check, and the log messages for failed function calls that I've added so far, should ideally also have their own entries in the catalog? Or would it be more expedient to just do this for emergency level messages?

@1awesomeJ
Contributor

This entry seems to be quite cool, it has everything but the native language explanations:

-- a596d6fe7bfa4994828e72309e95d61e
Subject: Messages from a service have been suppressed
Defined-By: systemd
Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Documentation: man:journald.conf(5)

A service has logged too many messages within a time period. Messages
from the service have been dropped.

Note that only messages from the service in question have been
dropped, other services' messages are unaffected.

The limits controlling when messages are dropped may be configured
with RateLimitInterval= and RateLimitBurst= in
/etc/systemd/journald.conf. See journald.conf(5) for details.

@bluca
Member Author

bluca commented Aug 17, 2023

Don't worry about translations, that is done separately. Catalog entries are added to catalog/systemd.catalog.in. You need to look at the emergency level messages, and for each one generate a new UUID with systemd-id128 new and add a catalog entry. Then change the logging call to include the ID as a MESSAGE_ID, so that they are tied together. This might require some modification to the log_emergency() and log_emergency_errno() functions

@1awesomeJ
Contributor

Okay sir.
It should be fun.

I'll reach out whenever I'm stuck.

@1awesomeJ
Contributor

@bluca,

Currently, I see 23 calls in total to log_emergency() and log_emergency_errno().

None of the messages are currently in "catalog/systemd.catalog.in"

This is the approach I'm considering:

step 1:
On my machine, I'll run systemd-id128 new, and I'll get an output like ba55201f2aaa47c7860460a14b13f457

step 2:
I'll go to src/systemd/sd-messages.h, and add a new macro like so:
#define SD_MESSAGE_FAILED_SELINUX_LOAD SD_ID128_MAKE(ba, 55, 20, 1f, 2a, aa, 47, c7, 86, 04, 60, a1, 4b, 13, f4, 57)

step 3:
I'll go to catalog/systemd.catalog.in and add an entry like so:

-- ba55201f2aaa47c7860460a14b13f457
Subject: Failed to load SELinux policy
Defined-By: systemd
Support: %SUPPORT_URL%

Systemd was unable to load the SELinux policy for a system that has "enforce" set to true.
You may try the following to fix this:
1. Set SELinux status to "permissive";
2. Restore your file context;
3. Rebuild your SELinux Policy.

(The last part will take lots of input during the PR review.)

Is that all I need to do as far as adding catalog entries for the messages is concerned?
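
Going from step 1 to step 2 above means splitting the 32-digit ID into 16 comma-separated bytes for SD_ID128_MAKE(). A minimal sketch of that conversion follows; the helper name is hypothetical and it is not part of systemd.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Writes "ba, 55, ..." for a 32-digit hex ID into out.
 * Returns 0 on success, -1 on malformed input or a too-small buffer. */
int sketch_id128_to_make_args(const char *hex, char *out, size_t out_size) {
        size_t pos = 0;

        assert(hex);
        assert(out);

        if (strlen(hex) != 32)
                return -1;

        for (size_t i = 0; i < 32; i += 2) {
                if (!isxdigit((unsigned char) hex[i]) ||
                    !isxdigit((unsigned char) hex[i + 1]))
                        return -1;

                /* Every pair after the first is preceded by ", ". */
                int n = snprintf(out + pos, out_size - pos, "%s%c%c",
                                 i == 0 ? "" : ", ", hex[i], hex[i + 1]);
                if (n < 0 || (size_t) n >= out_size - pos)
                        return -1;

                pos += (size_t) n;
        }

        return 0;
}
```

Fed the ID from step 1 and a 64-byte buffer, this produces the same argument list as the macro in step 2, and the same 32-digit string is what goes after the `--` in the catalog entry.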

@bluca
Member Author

bluca commented Aug 17, 2023

You also need to change log_emergency*() so that it uses a message-id when logging, and takes it as a parameter, so that each call can specify the ID you are adding

@1awesomeJ
Contributor

Okay sir.
I hope to get started on that early tomorrow so I can get some guidance before the weekend.

@bluca bluca linked a pull request Aug 18, 2023 that will close this issue
@bluca bluca unpinned this issue Sep 1, 2023
@yuwata yuwata added the bsod label Oct 7, 2023
5 participants