Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix callback registration for the RunState on_stop handler #513

Merged
merged 2 commits into from Apr 21, 2020

Conversation

Kami
Copy link
Contributor

@Kami Kami commented Apr 21, 2020

After #495 has been merged, I started to notice errors like this in the log:

2020-04-21 14:36:53.500Z ERROR [core] [scalyr_logging.py:603] [error="failedAgentMain"] Main run method for agent failed due to exception :stack_trace:
  Traceback (most recent call last):
    File "scalyr_agent/agent_main.py", line 992, in __run
      config_change_check_interval
    File "/home/kami/w/scalyr/scalyr-agent-2/scalyr_agent/util.py", line 708, in sleep_but_awaken_if_stopped
      self._wait_on_condition(timeout)
    File "/home/kami/w/scalyr/scalyr-agent-2/scalyr_agent/util.py", line 792, in _wait_on_condition
      self.__condition.wait(timeout)
    File "/home/kami/.pythonz/pythons/CPython-3.6.9/lib/python3.6/threading.py", line 299, in wait
      gotit = waiter.acquire(True, timeout)
    File "/home/kami/w/scalyr/scalyr-agent-2/scalyr_agent/platform_posix.py", line 627, in handle_interrupt
      handle_terminate(signal_num, frame)
    File "/home/kami/w/scalyr/scalyr-agent-2/scalyr_agent/platform_posix.py", line 641, in handle_terminate
      self.__termination_handler()
    File "scalyr_agent/agent_main.py", line 597, in __handle_terminate
      self.__run_state.stop()
    File "/home/kami/w/scalyr/scalyr-agent-2/scalyr_agent/util.py", line 751, in stop
      callback()
  TypeError: 'NoneType' object is not callable

It turns out, we didn't correctly register the callback - instead of registering the callable (function) we actually immediately called the function during registration phase and passed return value of that function to the register handler (that function returned None).

Instead of registering the callable, we invoked the function and as such
registered None.
@Kami Kami requested a review from yanscalyr April 21, 2020 14:46
@Kami
Copy link
Contributor Author

Kami commented Apr 21, 2020

Ideally, we would also have some unit tests for it, but it might be a bit more involved (@yanscalyr - can you please look into it).

On that note - it looks like CodeSpeed indeed caught this regression / bug:

Screenshot from 2020-04-21 16-47-32

Right now I manually check the dashboard every so often so we would have caught this issue sooner or later, but in the future we should probably also try to come up with a more automated approach.

@codecov
Copy link

codecov bot commented Apr 21, 2020

Codecov Report

Merging #513 into master will increase coverage by 0.02%.
The diff coverage is 94.11%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #513      +/-   ##
==========================================
+ Coverage   72.95%   72.97%   +0.02%     
==========================================
  Files         118      118              
  Lines       28253    28270      +17     
  Branches     3381     3381              
==========================================
+ Hits        20611    20631      +20     
+ Misses       7030     7029       -1     
+ Partials      612      610       -2     
Impacted Files Coverage Δ
scalyr_agent/agent_main.py 56.77% <ø> (+0.71%) ⬆️
scalyr_agent/tests/util/util_test.py 97.38% <94.11%> (-0.09%) ⬇️
scalyr_agent/builtin_monitors/syslog_monitor.py 80.41% <0.00%> (-0.48%) ⬇️
.../builtin_monitors/tests/kubernetes_monitor_test.py 96.71% <0.00%> (ø)
scalyr_agent/util.py 86.06% <0.00%> (+0.27%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d0707de...0efa670. Read the comment docs.

Copy link
Contributor

@yanscalyr yanscalyr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually had this fixed locally but I guess I got distracted by something because I never checked it in... My bad.

Copy link
Contributor

@yanscalyr yanscalyr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually had this fixed locally but I guess I got distracted by something because I never checked it in... My bad.

@Kami Kami merged commit 03ca114 into master Apr 21, 2020
@Kami Kami deleted the fix_on_stop_error branch April 21, 2020 21:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants