Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: PyroscopeAgent.stop #149

Merged
merged 9 commits into from
Apr 24, 2024
Merged

feat: PyroscopeAgent.stop #149

merged 9 commits into from
Apr 24, 2024

Conversation

korniltsev
Copy link
Collaborator

No description provided.

Copy link

@aleks-p aleks-p left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall, I left a few minor remarks.

Something to think about now (and maybe add separately later) is how users interact with the start / stop methods. Right now, they do not return anything or throw exceptions which means user code has no way of knowing if they succeeded or not and whether one can continue interacting with the API.

logger.log(Logger.Level.ERROR, "Failed to start profiling - already started");
return;
}
if (sFailed) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be a bit restrictive if we have intermittent failures (one attempt to start fails but the next one could succeed). We could keep it this way though and make changes if this becomes relevant in the future.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at first I thought it was a good idea, but then while writing a test I immediately stumbled into this check , with no way around.

I removed sFailed check alltogether.

Thanks

return;
}
if (sFailed) {
logger.log(Logger.Level.INFO, "Not start profiling due to recent failure");
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a nit, maybe this should be logged at ERROR level for consistency with the above? Also, "Will not start profiling due to recent failure" might be more clear.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed to ERROR

logger.log(Logger.Level.INFO, "Profiling started");
} catch (final Throwable e) {
sFailed = true;
logger.log(Logger.Level.ERROR, "Error starting profiler %s", e);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should probably set sOptions to null here, otherwise a subsequent start would say "profiling already started" which is not the case

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree, set to null

public static void stop() {
synchronized (sLock) {
if (sOptions == null) {
return;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps add a log statement here for consistency?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the problem here is that we dont have sOptions.logger, so we have to write to stderr, which is not ideal.

Added DefaultLogger.PRECONFIG_LOGGER.Log

return;
}
try {
sOptions.scheduler.stop();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could add a log statement here as well, to know that the profiler was stopped

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

Comment on lines 69 to 71
if (svc != null) {
awaitTermination(svc);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want this inside the synchronized block? Otherwise we can have another executor active while the old one is shutting down (which is maybe fine?)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left the comment in the code, and copypaste it here as well

        // shutdown here not under lock to avoid deadlock ( the task may block to wait for lock and
        // we are holding the lock and waiting for task to finish)
        // There is still synchronization happens from the PyroscopeAgent class,
        // so there are no concurrent calls to start/stop. So there is no lock here

Does it make sense?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, the synchronization in PyroscopeAgent will prevent the scenario I described. Someone could potentially misuse this class in the future, but it is unlikely and the impact is low.

stopSchedulerLocked();
this.executor = null;
}
if (svc != null) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we need this check here, we probably need it inside stopSchedulerLocked() since we access the same reference there.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the nil check. I think it is not needed.
Because of the logic in PyroscopeAgent, if the scheduler.start() was not successful or PyroscopeAgent.Start() was not called at all, then scheduler.stop would never be called, so if we get a call to scheduler.stop, then we have an executor
I would appreciate if you could confirm or deny :D

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've also removed null check on job.cancel

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense to me :)

this.executor = Executors.newSingleThreadScheduledExecutor(THREAD_FACTORY);
this.job = executor.scheduleAtFixedRate(this::schedulerTick,
firstProfilingDuration.toMillis(), config.uploadInterval.toMillis(), TimeUnit.MILLISECONDS);
this.started = true;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps a logging statement here could be helpful

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

Comment on lines 46 to 95
throw new IllegalStateException(throwable);
logger.log(Logger.Level.ERROR, "Error stopping profiler %s", throwable);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This used to throw but now it doesn't, is this intentional?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure I understand, there were no stop functionality before, so it did not throw.

However I see the inconsistency, start throws IllegalStateException on profiler.start fail, and stop does not. I will try to make it consistent.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made stopSchedulerLocked to rethrow IllegalStateException in case this.profiler.stop(); fails.
Moved a bunch of logic to finally blocks.
PTAL

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad, the GitHub diff fooled me but you got my concern anyway :)


@Override
public void stop() {
ScheduledExecutorService svc = this.executor;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this field reading should be under lock

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved under lock

@korniltsev
Copy link
Collaborator Author

@aleks-p Thanks for review! I addressed your comments and also added a small test

stopSchedulerLocked();
this.executor = null;
}
if (svc != null) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense to me :)

Comment on lines 46 to 95
throw new IllegalStateException(throwable);
logger.log(Logger.Level.ERROR, "Error stopping profiler %s", throwable);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad, the GitHub diff fooled me but you got my concern anyway :)

Comment on lines 69 to 71
if (svc != null) {
awaitTermination(svc);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, the synchronization in PyroscopeAgent will prevent the scenario I described. Someone could potentially misuse this class in the future, but it is unlikely and the impact is low.

@korniltsev korniltsev merged commit 54fdc53 into main Apr 24, 2024
38 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants