Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use ps to count sessions instead of lsof #3511

Merged
merged 7 commits into from
Apr 22, 2024
Merged

Conversation

johrstrom
Copy link
Contributor

This uses ps to find the active sessions instead of lsof.

It should/could fix #3111.

@CSC-swesters
Copy link
Contributor

Thanks, this looks promising! I'll have time to test this later in the week and I can provide more feedback after that.

I'd like to see the lsof code removed at the same time. I quickly grepped around the nginx_stage directory's code, and to me it looks like it might still be used here:

https://github.com/OSC/ondemand/blob/nginx-session-count/nginx_stage/lib/nginx_stage/generators/nginx_show_generator.rb#L30-L35

I'm not well versed in Rails, so I don't see how that is used. But would it be possible to fix this in the same manner, i.e., using ps instead of lsof?

@johrstrom
Copy link
Contributor Author

I'm not well versed in Rails, so I don't see how that is used. But would it be possible to fix this in the same manner, i.e., using ps instead of lsof?

Sure thing! Updated in this PR.

module SessionFinder
def session_count(user)
`ps -o cmd -u #{user}`.split("\n").select do |command|
command.match?(/Passenger \w+App:/)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be good to leave comment about what this could match, as I presume you want to match like Passenger RubyApp: or Passenger NodeApp:. Would be interesting to see if this match works for non-Ruby and non-NodeJS apps, such as Python apps maybe.

@treydock
Copy link
Contributor

I'd recommend for the ps command to maybe do timeout 10 ps ... or setup a timeout some other way. There are cases where procfs can hang and it'd be good if that didn't also hang up nginx_stage so that when procfs does hang there aren't just a proliferation of ps commands hanging on the system forever.

@CSC-swesters
Copy link
Contributor

CSC-swesters commented Apr 16, 2024

I'd recommend for the ps command to maybe do timeout 10 ps ... or setup a timeout some other way. There are cases where procfs can hang and it'd be good if that didn't also hang up nginx_stage so that when procfs does hang there aren't just a proliferation of ps commands hanging on the system forever.

Yes, that's an excellent suggestion. +1 for that. If I'm not mistaken, these code paths are not part of the creation of new PUNs, right? So the only times it might hang should be when the nginx_stage nginx_clean command is run? (And the check_socket_for_active_sessions method, which I'll ask about below)

Sure thing! Updated in this PR.

Thanks! Would it still be possible @johrstrom to actually remove the old lsof method's code? It's supposed to be unused now, so in my opinion it's better to remove this (hopefully) dead code now, instead of leaving it in to cause confusion.

I still didn't understand whether the check_socket_for_active_sessions hook is used anywhere. Could you explain how this code is used? Or is it dead as well?

$ grep -r "check_socket_for_active_sessions"
nginx_stage/lib/nginx_stage/generators/nginx_show_generator.rb:    add_hook :check_socket_for_active_sessions do

If nothing uses it, maybe it would be better to remove this also altogether? 🙂

@@ -59,8 +62,9 @@ class NginxCleanGenerator < Generator
next if (user && user != u.to_s)
pid_path = PidFile.new NginxStage.pun_pid_path(user: u)
socket = SocketFile.new NginxStage.pun_socket_path(user: u)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that this would still run lsof as the SocketFile initializer runs @processes = get_processes, although the output itself is not used. There is the same case for nginx_show_generator.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this is done. I believe @CSC-swesters also wanted it removed, which it is now.

Co-authored-by: treydock <tdockendorf@osc.edu>
@johrstrom
Copy link
Contributor Author

I still didn't understand whether the check_socket_for_active_sessions hook is used anywhere. Could you explain how this code is used? Or is it dead as well?

All hooks get run when you use a generator. So nginx_stage nginx_show --user jeff would trigger this hook.

So running that, I get something like this which is where you see those puts statements being issued in all the hooks from that generator.

[root@1e62ecd9be6d ~]# /opt/ood/nginx_stage/sbin/nginx_stage nginx_show --user jeff
User: jeff
Instance: 429
Socket: /var/run/ondemand-nginx/jeff/passenger.sock
Sessions: 1

Copy link
Contributor

@Oglopf Oglopf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

@johrstrom johrstrom merged commit c5383fb into master Apr 22, 2024
23 checks passed
@johrstrom johrstrom deleted the nginx-session-count branch April 22, 2024 14:53
johrstrom added a commit that referenced this pull request May 10, 2024
Use the ps command to count sessions instead of lsof. This was
because lsof may cause issues in some environments with permissions
(like containers) and indeed may be quite slow.

So this changes that for simplicity and speed.
---------

Co-authored-by: treydock <tdockendorf@osc.edu>
@johrstrom johrstrom mentioned this pull request May 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

nginx_stage nginx_clean cleans up active PUN
6 participants