
Showing status of service via systemctl is slow (>10s) if disk journal is used #2460

Open
XANi opened this issue Jan 28, 2016 · 36 comments

XANi commented Jan 28, 2016

With a big on-disk journal (4 GB, a few months of logs), systemctl status for a service becomes very slow:

 (!) [13:37:30]:/var/log/journal☠ time  systemctl status nginx
* nginx.service - A high performance web server and a reverse proxy server
   Loaded: loaded (/lib/systemd/system/nginx.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2016-01-28 08:25:28 CET; 5h 12min ago
 Main PID: 3414 (nginx)
   CGroup: /system.slice/nginx.service
           |-3414 nginx: master process /usr/sbin/nginx -g daemon on; master_process on;
           |-3415 nginx: worker process
           |-3416 nginx: worker process
           |-3417 nginx: worker process
           `-3418 nginx: worker process

Jan 28 08:25:12 ghroth systemd[1]: Starting A high performance web server and a reverse proxy server...
Jan 28 08:25:28 ghroth systemd[1]: Started A high performance web server and a reverse proxy server.

real    0m12.505s
user    0m0.016s
sys 0m0.056s
 (!) [13:35:14]:/var/log/journal☠ du -h --max=1
4,2G    ./46feef66e59512fcd99c7ddc00000108
4,2G    .
 (!) [13:40:53]:/var/log/journal☠ strace systemctl status nginx 2>&1 |grep open |grep /var/log |wc -l
88

It is of course faster after it gets into the cache... but only for that service; querying another one is still slow.

I don't know what the right way to do it would be, but opening ~80 log files just to display a service's status seems a bit excessive.

@poettering
Member

Is this on HDD or SSD?

But yeah, we scale badly if we have too many individual files to combine. It's O(n) with each file we get...

@XANi
Author

XANi commented Jan 28, 2016

This one was on HDD, but now that I've looked into it, it can be a bit slow (~0.5 s; HDD swap, 300 MB of logs on tmpfs) even with tmpfs, if the part that was loaded happened to be swapped out (I was testing on a system with 2 weeks of uptime).

Shouldn't there be some kind of index on the journal files? Or at the very least a pointer in the service entry to the last log file that has relevant entries.

@davispuh

davispuh commented Oct 1, 2016

I think some kind of journal indexing is required because it's unbearably slow.

Right now I have 5411 journal files (43 GiB), and

$ time -p journalctl -b --no-pager > /dev/null
real 13.61
user 13.37
sys 0.22

it takes 13 seconds just to read the current boot's log, even though it's already cached in RAM.

When it's not cached

# echo 3 > /proc/sys/vm/drop_caches
$ time -p journalctl -b --no-pager > /dev/null
real 69.29
user 13.51
sys 12.34

this is on 2x 3TB HDD with RAID1 btrfs.

@XANi
Author

XANi commented Oct 1, 2016

It is laggy even when the journal is on tmpfs and the machine runs long enough for it to get swapped out.

Why doesn't journald just use SQLite for storage? It would be faster, and other apps could actually use the log files for something useful, with a proper query language, instead of relying on a bunch of options in journalctl.

@XANi
Author

XANi commented Jun 21, 2017

It is still slow as hell:

time systemctl status kdm|cat
* kdm.service - LSB: X display manager for KDE
   Loaded: loaded (/etc/init.d/kdm; generated; vendor preset: enabled)
   Active: active (exited) since Wed 2017-06-21 12:03:26 CEST; 1h 42min ago
     Docs: man:systemd-sysv-generator(8)
    Tasks: 0 (limit: 4915)
   CGroup: /system.slice/kdm.service

Jun 21 12:03:25 ghroth systemd[1]: Starting LSB: X display manager for KDE...
Jun 21 12:03:26 ghroth systemd[1]: Started LSB: X display manager for KDE.

real	0m2.873s
user	0m0.008s
sys	0m0.016s

and it opens over a hundred files (on a system that has been up for 2 hours):

strace systemctl status kdm 2>&1 |grep open |wc -l
133

@vcaputo
Member

vcaputo commented Jul 7, 2017

@XANi @davispuh is there any chance you could run your slow cases under valgrind --tool=callgrind and supply us with the output? I've seen some pathological cases where journalctl spends an exorbitant amount of CPU time (~50% of its execution) in the siphash24 code when the simple context cache in the mmap-cache is experiencing high miss rates; I'm curious whether that's also something you're observing here.
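Something along these lines should do it (the service name is just an example; callgrind makes the run much slower than normal):

valgrind --tool=callgrind systemctl status nginx --no-pager
# callgrind writes callgrind.out.<pid> into the current directory;
# gzip that file and attach it here
gzip callgrind.out.*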

@davispuh

davispuh commented Jul 8, 2017

with systemd 233.75-3 on ArchLinux

callgrind.out.systemctl-status-sshd.gz
callgrind.out.journalctl-b--no-pager.gz

@vcaputo
Member

vcaputo commented Jul 8, 2017

@davispuh Thank you for quickly providing the profiles. The callgrind.out.systemctl-status-sshd.gz profile shows mmap_cache_get() somewhat prominently with the hashmap maintenance being a significant part of that.

It's not a panacea, but #6307 may improve the runtime of systemctl status sshd for you. Would you be up for some testing? Some time systemctl status sshd --no-pager comparisons before and after would be great.

For anybody reading, the kcachegrind utility works quite well for visualizing callgrind.out files.

@davispuh

davispuh commented Jul 9, 2017

Emm, I can't test this anymore: after I compiled and reinstalled systemd, it reset my /etc/systemd/journald.conf settings, so systemd deleted all the old journals and it's fast again now. Basically, it's slow only when you have a lot of journal files in /var/log/journal/.

@davispuh

davispuh commented Oct 5, 2017

With systemd just compiled from fdb6343:

When the files aren't cached, it's really unusably slow:

$ time systemctl status sshd --no-pager
0.28user 34.82system 4:34.32elapsed 12%CPU (0avgtext+0avgdata 1355416maxresident)k
3242224inputs+0outputs (10002major+17658minor)pagefaults 0swaps

The 2nd time, when it's cached, it's quick:

$ time systemctl status sshd --no-pager
0.09user 0.33system 0:00.43elapsed 99%CPU (0avgtext+0avgdata 1303540maxresident)k
0inputs+0outputs (0major+23705minor)pagefaults 0swaps

callgrind when it's not cached and when it is

callgrind.out.systemctl-status-sshd_nocache.gz
callgrind.out.systemctl-status-sshd_cached.gz

I have 7542 *.journal files in /var/log/journal/<ID>, which is on a btrfs RAID1 partition (2x 3TB HDD).

Basically, to improve performance it needs to do less disk reading, e.g. use some kind of indexing.

@amishmm

amishmm commented Mar 31, 2018

I have 240 system log files and 860 user log files.

systemctl status or journalctl -f takes 2-4 minutes just to display logs (on an HDD).

I have added this in /usr/lib/systemd/journald.conf.d/90-custom.conf:

[Journal]
SystemMaxUse=100G
SystemMaxFileSize=200M
SystemMaxFiles=1100
MaxFileSec=1week
MaxRetentionSec=3year

systemd generates 2 to 3 system journal files every day, each about 150994944 bytes (144 MiB) in size.

Why doesn't journalctl -f (or systemctl) check only the latest / current journal?

How do I make it efficient and fast? I need to preserve logs for a long time.

In most cases people only have to check recent logs.

Maybe add some feature for automatic archival of logs into a different directory (/var/log/journal/ID-DIRECTORY/archive), with current logs (say the past 3-7 days) kept in /var/log/journal/ID-DIRECTORY?

This would speed up journalctl and systemctl status a lot. Anyone wanting to check archived logs could use the --directory option of journalctl.
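In the meantime something like this could probably be scripted by hand (a rough sketch; archived journal files are the ones with an @ in their name, and the paths and 7-day cutoff are just examples):

# move rotated (archived) journal files older than 7 days out of the active directory
mkdir -p /var/log/journal-archive
find /var/log/journal/*/ -name '*@*.journal' -mtime +7 -exec mv -t /var/log/journal-archive {} +

# when the old logs are actually needed, point journalctl at the archive directory
journalctl --directory=/var/log/journal-archive -u some.service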

@Gunni

Gunni commented Apr 20, 2018

I have the same problem; I'm on a VMware VM on an HDD SAN.

example: time systemctl status systemd-journald.service

real    0m50.484s
user    0m0.070s
sys     0m3.492s

journal size is: 101.1GB right now.

@amishmm

amishmm commented Jun 5, 2018

journalctl has a --file parameter. I am able to use it to search faster.

While:
journalctl -f took ages,

journalctl --file /var/log/journal/*/system.journal -f takes just 1-2 seconds.

Similarly, can we make systemctl status use ONLY system.journal by default? In most cases an administrator checks status only right after systemctl (re)start, or when something unexpected happens (which is also likely to be logged in system.journal, unless it rotated just recently).

This would drastically speed things up.

If the admin wants older status output, they can supply --lines=N, in which case systemctl would scan through older journals too (maybe in reverse order), or they can use journalctl -n N -u service instead.
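For a single service the --file trick already works today as a faster status substitute, e.g. (service name is just an example):

# only the current system journal file is opened, so this returns quickly
journalctl --file /var/log/journal/*/system.journal -u sshd -n 10 --no-pager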

PS: I have no idea how data is stored in journal.

@poettering do you want me to create an RFE for this?

PPS: Now every time I run systemctl status it's as good as a DoS attack!

@Gunni

Gunni commented Jun 5, 2018

Yes this is a problem for servers storing their logs.

My biggest problem is the centralized log server: I receive logs from network equipment using rsyslog, which uses omjournal to pipe them directly into the journal. It works fine to begin with but then degrades quickly (note I'm doing this on a test server; we have another server where rsyslog writes to files).

Maybe journal files could be made to contain specific timespans and only get loaded if requested. I use things like --since and --until a lot, but since they are affected by this, it doesn't help; if that per-file approach were used, journalctl could find the entries in a fraction of a second. -f should really just be instant: find the last 10 entries and only search backwards in time if nothing matched, maybe with a cutoff.

@XANi
Author

XANi commented Jun 6, 2018

@amishxda IMO status by default should just... not have to touch on-disk logs, ever. Storing the last few log lines in memory per service wouldn't be that memory hungry, wouldn't trash the OS disk cache on every status call, and it would return something more useful than the "welp, logs were rotated, too bad" it currently shows once the server has been running for a while.

It should only fetch more log lines from the journal if explicitly requested by the user.

@Gunni I don't think using systemd as a centralized log server is intended, or a good idea in the first place. The ELK stack is much more useful for that; jankiness aside, Logstash lets you do a lot of nice things, for example we use it to split iptables logs into fields (srcip/dstip/port etc.) before putting them in ES. And ES just works better for search.

@Gunni

Gunni commented Jun 6, 2018

@XANi I just expected it to be a supported use case since systemd-journal-remote is a thing, but if this is the expected performance, then of course. If it is improved, maybe with a better index or a database backend for the journal, then I'd be able to use it.

About the tools you mentioned: setting up all that stuff sounds like much more work, especially since we like to be able to watch the logs live (journalctl SRC=net -f, maybe with a grep) the second they arrive; my experience with what you mentioned has been a long delay from event to user. But I'll look into it.

@XANi
Author

XANi commented Jun 7, 2018

@Gunni I wish journald would just use SQLite instead of its current half-assed binary db. I feel like it is currently trying to reimplement that, but badly. And there are plenty of tools for querying SQLite already.

The ELK stack is definitely more effort to set up, but in exchange it has a ton of nice features; for example, we made Logstash do a GeoIP lookup on any IP that's not private, so each firewall log entry gets that info added.

Querying is also very nice, as you can make queries on fields directly instead of on text strings.

@otisg

otisg commented May 4, 2020

I found out about this issue via https://www.reddit.com/r/linuxadmin/comments/gdfi4t/how_do_you_look_at_journald_logs/ . Is this still a problem?

@XANi
Author

XANi commented May 4, 2020

@otisg as of systemd 241 (that's just the version I have on the machine that actually keeps logs on disk), it is most definitely still a problem (all timing tests done right after dropping caches; ~800 MB in the journal):

$ time systemctl status collectd  2>&1 >/dev/null
real	0m5,921s
user	0m0,004s
sys	0m0,180s

$ strace -f    systemctl status collectd  2>&1 |grep open |grep /var/log/journal |wc -l
721

Now the fun part:

find /var/log/journal |wc -l

Yes, you are reading this right: getting the last few lines of a currently running service opens every single fucking entry in the journal dir.

Now if I do just

$ time grep -R collectd /var/log/*log >/dev/null

real	0m0,370s
user	0m0,010s
sys	0m0,009s
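For anyone wanting to reproduce this, "right after dropping cache" means roughly the following, run as root (service name is just an example):

sync                                  # flush dirty pages first
echo 3 > /proc/sys/vm/drop_caches     # drop page cache, dentries and inodes
time systemctl status collectd --no-pager > /dev/null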

This shit manages to be an order of magnitude slower than just brute-force grepping the last logrotate's worth of text logs. That is fucking mind-boggling. The sheer fact that the developers decided to go with a binary format, yet didn't bother introducing any time- or service-based sharding/indexing, and just brute-force through every existing file, is insane.

It is like someone, instead of considering reasonable options like, dunno:

  • naming log files by a hash of the service name, so it never needs to open more than 1/n of the files
  • naming log files by service name (or service group), so there is only ever one file to open for recent entries, and more are needed only for history (you know, like the logrotate + syslog combo ends up working)
  • using an actual well-tested database like SQLite and just creating a few bigger files with an index/TOC

They decided one evening, "you know, I always wanted to make a binary logging format", then got bored after a few weeks and never touched it again.

@otisg

otisg commented May 5, 2020

Ouch. I had not realized things were so slow. I'm amazed that so many people at https://www.reddit.com/r/linuxadmin/comments/gdfi4t/how_do_you_look_at_journald_logs/ said they consume journal logs via journalctl. Are they all OK with the slowness?!? Why don't more people get their logs out of the journal and centralize them in an on-prem or SaaS service that is faster?

Anyhow, I see some systemd developers here. I wonder if they plan on integrating something like https://github.com/tantivy-search/tantivy ...

@PAStheLoD

The first invocation is slow. It probably goes through the journal files and checks them. Then things are fast for a while.

journald is not great for log management, and it's simply not fit for any kind of centralized log management at scale. But that is very likely not a goal of the systemd project anyway. The journal is a necessity, just like pid1, udev, and network setup (for remote filesystems), for managing a Linux system and its services reliably.

That said, it's entirely likely that this could be worked around with a few quick and dirty optimizations. (E.g. if journal files are not in cache, don't wait for them when showing status; allow streaming the journal without looking up the last 10 lines; persist some structures to speed up journal operations; enable unverified journal reads by default; etc.)

@vcaputo
Member

vcaputo commented Oct 27, 2020

@XANi Are you still suffering from these very slow uncached systemctl status collectd invocations?

If so, curious if you'd be willing to try out a build with #17459 applied and report back the results, thanks.

@XANi
Author

XANi commented Oct 27, 2020

@vcaputo Correct me if I'm wrong, but that patch only searches within the current boot ID, so if the system has been running for a long time it wouldn't change anything?

I encountered the problem on server machines in the first place (and on a personal NAS), so in almost every case the current boot ID is the only one in the logs. It would certainly help on desktops, but that's not where I hit the problem (also, AFAIK most desktop distros don't have /var/log/journal by default, so journald doesn't log to the HDD in the first place).

The other problem is that the current implementation makes it really easy for one service to swamp the logs to the point where you lose every other service's logs. It is especially apparent for services that don't emit many logs: even though there are zero actual log lines left for the service (they got rotated out), it still takes about the same amount of time as for any other service.

For reference, I have this happening on a machine with just 1 GB of journal and only the last few days of logs in it.

@vcaputo
Member

vcaputo commented Oct 27, 2020

@XANi Yes you're right, if all the journals are from the same boot the early exit never occurs.

FTR it already matches the current boot id, the patch just moves the order around. The change assumes there are probably multiple boots represented in the archived journals.

@amstan

amstan commented Jan 3, 2021

Sorry if I'm stating the obvious, but I might not be understanding this right. I understand that reading multiple GB of systemd binary logs will take a little time, and that anything cached will probably come back instantly.

But this is a common pattern for me, and I still have to wait even though the logs were produced mere seconds ago.

alex@hypertriangle:~% date
Sun Jan  3 03:18:30 EST 2021
alex@hypertriangle:~% time sudo systemctl restart ntpd  
0.21s real  0.01s user  0.02s system  14% cpu  6kB mem $ sudo systemctl restart ntpd
alex@hypertriangle:~% time sudo systemctl status ntpd 
● ntpd.service - Network Time Service
     Loaded: loaded (/usr/lib/systemd/system/ntpd.service; enabled; vendor preset: disabled)
     Active: active (running) since Sun 2021-01-03 03:18:34 EST; 5s ago
    Process: 152305 ExecStart=/usr/bin/ntpd -g -u ntp:ntp (code=exited, status=0/SUCCESS)
   Main PID: 152308 (ntpd)
      Tasks: 2 (limit: 4621)
     Memory: 1.4M
     CGroup: /system.slice/ntpd.service
             └─152308 /usr/bin/ntpd -g -u ntp:ntp

Jan 03 03:18:34 hypertriangle.com ntpd[152308]: Listen and drop on 0 v6wildcard [::]:123
.....snip....
Jan 03 03:18:34 hypertriangle.com systemd[1]: Started Network Time Service.
5.58s real  0.01s user  0.36s system  6% cpu  41kB mem $ sudo systemctl status ntpd

5.6 seconds to output a few lines that were generated a few seconds ago by the restart call.

True, another call immediately after is instant, but by then it's too late.

alex@hypertriangle:~% time sudo systemctl status ntpd
● ntpd.service - Network Time Service
     Loaded: loaded (/usr/lib/systemd/system/ntpd.service; enabled; vendor preset: disabled)
     Active: active (running) since Sun 2021-01-03 03:18:34 EST; 41s ago
    Process: 152305 ExecStart=/usr/bin/ntpd -g -u ntp:ntp (code=exited, status=0/SUCCESS)
   Main PID: 152308 (ntpd)
      Tasks: 2 (limit: 4621)
     Memory: 2.2M
     CGroup: /system.slice/ntpd.service
             └─152308 /usr/bin/ntpd -g -u ntp:ntp

Jan 03 03:18:34 hypertriangle.com ntpd[152308]: Listen and drop on 0 v6wildcard [::]:123
.....snip....
Jan 03 03:18:34 hypertriangle.com systemd[1]: Started Network Time Service.
0.42s real  0.01s user  0.04s system  13% cpu  41kB mem $ sudo systemctl status ntpd
alex@hypertriangle:~% date
Sun Jan  3 03:18:53 EST 2021

This is particularly annoying when you don't know why some service didn't restart properly and checking up on it essentially punishes you with the delay.

@vcaputo
Member

vcaputo commented Jan 3, 2021

If re-running systemctl status returns immediately, it's purely the cost of reading the journals from storage in the uncached case that's taking so long.

Unfortunately, substantially improving the uncached read performance is probably going to require changes to the journal file format. There might be some gains to be found in filesystem tuning, like trying to tune /var/log/journal's filesystem for granular mmap-oriented pagefault-based reads in random-ish access patterns. I haven't personally explored that avenue, but it might be worth exploring.

One reason uncached performance sucks is that most journal accesses are far smaller than a 4KiB page, especially in search-like operations such as systemctl status performs. But the accesses are fulfilled entirely via address space mappings set up with mmap(), so even when the code only tries to read 8 bytes to see some object's offset or seqnum, at least 4KiB gets faulted in to fulfill the access; that's the granularity of memory mappings. If the accesses were performed with explicit read() calls, maybe they could be whittled down to 512-byte loads, assuming storage with 512-byte sectors on a complementarily configured filesystem, but then syscall overhead would be painful; an io_uring implementation might be interesting on that front (though I think the Linux page cache would probably still turn this into 4KiB reads, I'd have to verify; IIRC all inode-backed I/O goes through the address_space struct in units of pages).

Due to the substantial lack of clustering in the format, there's no attempt to ensure that faulted-in pages bring in operation-relevant information as the baggage of the 4KiB+ page fault caused by something like the access of an 8-byte offset or seqnum, so most of that baggage often isn't even immediately useful (though it's at least then warm in the page cache for subsequent journal operations). Due to the combination of tiny objects and the inherent interleaving of object types in a largely appended-as-arrived layout, faulted-in pages tend not to contain the stuff immediately needed by the faulting operation.

It's not that the format lacks indexing, as many have complained; the cached performance shows that, since the results are immediate when not waiting for page faults from backing store. It tends to be more of a read-amplification effect: append-mostly tiny objects, interleaved by type and loaded in units of pages, end up reading a whole lot more than is relevant to the operation at hand when uncached.
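One way to observe this from userspace is to count the major page faults a cold run incurs, e.g. something like this (GNU time, not the shell builtin; the service name is just an example):

echo 3 > /proc/sys/vm/drop_caches               # start with a cold page cache (as root)
/usr/bin/time -v systemctl status nginx --no-pager > /dev/null
# check the "Major (requiring I/O) page faults" line in the output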

@HLFH

HLFH commented Jan 4, 2021

Frankly, the search on my IMAP server (Dovecot) was broken (too slow), and when I started using a full-text search indexing plugin that uses Elasticsearch as a backend, it was so fast! Couldn't systemctl & journalctl use Elasticsearch's indexing capabilities, in conjunction with the redesign of the journal file format advised by @vcaputo?

@vcaputo
Member

vcaputo commented Jan 4, 2021

In case it's unclear: systemd-journald is not anything remotely resembling a persistent database daemon, it's not really comparable to database "backends".

Whenever journalctl or systemctl status processes must access journal data, they do so in total isolation by faulting journal file pages into their own address space directly from the filesystem. This is sort of like a cold read-only database startup on behalf of every journal-accessing process. There is no persistent daemon servicing these clients, pre-warming the journal data and/or trying to keep prioritized parts of it memory resident.

I'm not the designer or decision maker of these things, but it seems clear based on what I've gleaned working with the code that this is all deliberate in an attempt to keep the overall constant journald footprint foisted on every systemd host to a minimum.

I imagine if we added a persistent database daemon to the architecture's read-side, the handful of people complaining about cold cache performance will be replaced with people complaining about some journal database daemon wasting resources on things which rarely occur.

Frankly, if people are generally satisfied with the cached performance of these commands, and the featureset offered, you could just schedule a systemctl status $foo or something (journalctl --verify? cat /var/log/journal/*/* > /dev/null?) to occur regularly as a cache warmer in lieu of a persistent database daemon serving readers and call it Good Enough.
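A minimal sketch of such a cache warmer as a timer unit could look like this (unit names and the interval are just placeholders):

# /etc/systemd/system/journal-cache-warm.service
[Unit]
Description=Pre-fault journal files into the page cache

[Service]
Type=oneshot
ExecStart=/bin/sh -c 'cat /var/log/journal/*/*.journal > /dev/null'

# /etc/systemd/system/journal-cache-warm.timer
[Unit]
Description=Periodically warm the journal page cache

[Timer]
OnBootSec=5min
OnUnitActiveSec=30min

[Install]
WantedBy=timers.target

Enable it with systemctl enable --now journal-cache-warm.timer.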

For more elaborate search features, can't people already bolt on more sophisticated indexers entirely external to journald as-is?

@davispuh

davispuh commented Jan 4, 2021

Frankly, if people are generally satisfied with the cached performance of these commands, and the featureset offered, you could just schedule a systemctl status $foo or something (journalctl --verify? cat /var/log/journal/*/* > /dev/null?) to occur regularly as a cache warmer in lieu of a persistent database daemon serving readers and call it Good Enough.

It's not really practical if you keep logs for a really long time. Imagine 2 years' worth of logs: it's pointless to have them all cached when you usually need only the last month. But maybe once a year you want to look further back.

For more elaborate search features, can't people already bolt on more sophisticated indexers entirely external to journald as-is?

This sounds like the best solution, but I'm not aware of anything like that. Basically it looks like the systemd journal can't handle our needs, so we need something better; it would be really great if it could be plugged into journald so that we don't need to learn new tools, e.g. keep using systemctl status and journalctl the same way whether or not there's a separate indexing/storage backend.
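Feeding an external indexer is at least already possible by streaming the journal as JSON, something like this (the endpoint and index name are made up):

# stream new journal entries as JSON and push each one into an external index
journalctl -o json --follow | while read -r entry; do
  curl -s -X POST http://localhost:9200/journal/_doc \
       -H 'Content-Type: application/json' -d "$entry" > /dev/null
done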

@marcor

marcor commented Jan 5, 2021

For the record, this is the reason I first considered switching to distros not using systemd, and now that I understand the root cause and see that it is still here after many years, I'm reassured that it was a good decision after all. I'm talking about desktop environments with small log files and relatively slow hardware.

@XANi
Author

XANi commented Jan 8, 2021

I'm not the designer or decision maker of these things, but it seems clear based on what I've gleaned working with the code that this is all deliberate in an attempt to keep the overall constant journald footprint foisted on every systemd host to a minimum.

Well, except that the way it does it trashes the cache. Cached performance is attained only under the condition that a ton of useless logs get loaded into memory just to find the last few lines.

I imagine if we added a persistent database daemon to the architecture's read-side, the handful of people complaining about cold cache performance will be replaced with people complaining about some journal database daemon wasting resources

I imagine slapping on an index that just stored "the latest journal file this service appears in" would solve ~99% of the issues with getting status here.

I imagine a semi-sensible scheme for splitting logs between files (by name hash or whatever) would cut it down by an order of magnitude or two, although it would probably generate more random I/O, so perhaps not a tradeoff worth taking.

Current performance is worse than tailing a logfile.

on things which rarely occur.

Nonsense. Anything under configuration management and proper monitoring will have that run often, from once or twice per hour to multiple times per minute. So at best it's wasting cache on files you don't even pull any data from.

@mbiebl
Contributor

mbiebl commented Jan 8, 2021

When you run systemctl status from your monitoring system, you are most likely not really interested in the last few journal lines and you could use systemctl status --lines=0 to disable journal output.

@XANi
Author

XANi commented Jan 8, 2021

@mbiebl if the service is working, sure. If the service isn't, I'm VERY interested in the last few lines. Same for monitoring checks really, but I guess there it is always possible to do systemctl is-active xyz || systemctl status instead.
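i.e. something along the lines of (service name is just an example):

# cheap liveness check first; only pay the journal cost when the unit is not active
systemctl is-active --quiet nginx || systemctl status nginx --no-pager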

But if I'm checking in the first place, there is a good chance something is wrong with the service, so that's not all that useful for CLI work.

@amstan

amstan commented Jan 12, 2021

I imagine slapping an index that just stored "latest journal file service is in" would solve ~99% issues with getting status here.

Yes please, this is the solution. And I agree with the 99% number.

Current performance is worse than tailing a logfile.

True.

@jhass

jhass commented Jan 12, 2021

Could systemctl status not flush the header first and then block while fetching the log? Then one could just Ctrl-C it after checking the status, when not interested in the log anymore.

halfdime-code added a commit to halfdime-code/efs-utils that referenced this issue Nov 29, 2022
systemctl has known performance issues with large numbers of journal files.

See systemd/systemd#2460.

Adding --lines=0 argument to call improves call speed when systemctl is called
for the first time. If this first call happens from mount.efs, this may result
in high latency for the mount as it waits for all of the journal files to be
indexed. Given that mount.efs is only looking for the exit code of the command
and does not care about the journal entries, there is no loss of usability by
adding the flag.
halfdime-code added a commit to halfdime-code/efs-utils that referenced this issue Dec 1, 2022
systemctl has known performance issues with large numbers of journal files.

See systemd/systemd#2460.

Changing to use `is-active` to improve call speed when systemctl is called. This
avoids first systemctl call issues that may result in high latency for the mount
as systemctl indexes all journal files. Given that mount.efs is only looking for
the exit code of the command and does not care about the journal entries, there
is no loss of usability by adding the flag.
Cappuccinuo pushed a commit to aws/efs-utils that referenced this issue Dec 2, 2022
systemctl has known performance issues with large numbers of journal files.

See systemd/systemd#2460.

Changing to use `is-active` to improve call speed when systemctl is called. This
avoids first systemctl call issues that may result in high latency for the mount
as systemctl indexes all journal files. Given that mount.efs is only looking for
the exit code of the command and does not care about the journal entries, there
is no loss of usability by adding the flag.
@mildred

mildred commented May 4, 2023

Still having this problem on an HDD with journald. Using btrfs with nodatacow and frequent defragmentation. Journal files are few (63 files) and not fragmented, but this is unbearably slow. Edit: I initially said that writes were slow, but that is another issue.
