Skip to content

End of night shutdown improvements.#407

Merged
jamessynge merged 9 commits intopanoptes:developfrom
wtgee:no-obs-clean-shutdown-390
Jan 26, 2018
Merged

End of night shutdown improvements.#407
jamessynge merged 9 commits intopanoptes:developfrom
wtgee:no-obs-clean-shutdown-390

Conversation

@wtgee
Copy link
Copy Markdown
Member

@wtgee wtgee commented Jan 22, 2018

A number of fixes and changes to account for situation when no valid
observations are found but we don't want to shut down the state machine.

  • If scheduler has a list of observations then it has_valid_observations.
    which is True when starting.

Process:

  • scheduling state calls observatory.get_observation()
  • get_observation first checks if has_valid_observations is False
    if so it will reread the field list.
  • get_observation either returns an observation or, if none found,
    calls clear_available_observations, making has_valid_observations = False.
  • If no observation is returned to scheduling state, send to parking (then to sleeping).
  • In sleeping, if has_valid_observations is False then sleep for an hour
    then try again (or shut down if run_once is specified).
    If has_valid_observations is True and it is safe and dark, shut down because
    of unknown error.

Other changes:

  • Remove reread_fields_file from dispatch scheduler as it is handled at
    the observatory level by observatory.get_observation.
  • Misc small fixes

⚠️ Note: I think I've gotten all the logic correct so that Bad Things™ won't happen. I think. :)

Closes #390

A number of fixes and changes to account for situation when on valid
observations are found but we don't want to shut down the state machine.

* If scheduler has a list of `observations` then it `has_valid_observations`.
	which is True when starting.

Process:
* `scheduling` state calls `observatory.get_observation()`
* `get_observation` first checks if `has_valid_observations is False`
	if so it will reread the field list.
* `get_observation` either returns an observation or, if none found,
	calls `clear_available_observations`, making `has_valid_observations = False`.
* If no observation is returned to `scheduling` state, send to `parking` (then to `sleeping`).
* In `sleeping`, if `has_valid_observations is False` then sleep for an hour
	then try again (or shut down if `run_once` is specified).
  If `has_valid_observations is True` and it is safe and dark, shut down because
  	of unknown error.

Other changes:
* Remove `reread_fields_file` from dispatch scheduler as it is handled at
the `observatory` level by `observatory.get_observation`.
* Misc small fixes
@wtgee wtgee requested a review from a team January 22, 2018 01:24
@codecov
Copy link
Copy Markdown

codecov bot commented Jan 22, 2018

Codecov Report

Merging #407 into develop will increase coverage by 0.05%.
The diff coverage is 83.63%.

Impacted file tree graph

@@            Coverage Diff             @@
##           develop    #407      +/-   ##
==========================================
+ Coverage    69.25%   69.3%   +0.05%     
==========================================
  Files           60      60              
  Lines         4774    4806      +32     
  Branches       659     665       +6     
==========================================
+ Hits          3306    3331      +25     
- Misses        1292    1298       +6     
- Partials       176     177       +1
Impacted Files Coverage Δ
pocs/scheduler/dispatch.py 98.03% <100%> (ø) ⬆️
pocs/observatory.py 83.66% <100%> (+0.14%) ⬆️
pocs/core.py 80.32% <100%> (+0.48%) ⬆️
pocs/scheduler/scheduler.py 96.07% <100%> (+1.39%) ⬆️
pocs/state/states/default/sleeping.py 62.5% <50%> (-37.5%) ⬇️
pocs/state/machine.py 88.27% <66.66%> (+0.08%) ⬆️
pocs/state/states/default/parked.py 75% <70.58%> (-25%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2cff654...e943bad. Read the comment docs.

Copy link
Copy Markdown
Contributor

@jamessynge jamessynge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, besides some nit and questions.

Comment thread pocs/state/states/default/sleeping.py Outdated

# If it is dark and safe we shouldn't be in sleeping state
if pocs.is_dark() and pocs.is_safe():
if pocs.observatory.scheduler.has_valid_observations and \
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pep8 strongly prefers wrapping with parens around expressions rather than using a backslash.

Comment thread pocs/state/states/default/sleeping.py Outdated
if pocs.is_dark() and pocs.is_safe():
if pocs.observatory.scheduler.has_valid_observations and \
pocs.is_dark() and pocs.is_safe():
if pocs.should_retry is False:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Property should_retry is very problematic. It looks like a property, but accessing it modifies its value!

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thanks for calling this out. It's also problematic in that it is never reset without stopping and starting whole system. So if it retries once per night by the third night it is out of retry attempts no matter what. Will fix.

'name': 'New Name',
'position': '20h00m43.7135s +22d42m39.0645s',
'priority': 50
'priority': 5000
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, why the change in priority?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing this out too. I'm not sure doing what I did is quite what we want. Changing now.

Comment thread pocs/observatory.py Outdated
self.logger.debug("Getting observation for observatory")

# If observation list is empty or a reread is requested
if self.scheduler.has_valid_observations is False or \
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pep8 strongly prefers wrapping with parens around expressions rather than using a backslash.

wtgee added 3 commits January 23, 2018 13:35
Note: This changes some of the state behavior for how parking and retrying
is handled. Housekeeping state will now only be called at the very end
of an observing run (either in the morning when unsafe or when too many
retry attempts). This moves some behavior from `sleeping` into `parked`.

* Fix the `should_retry` so it does not modify values
* Create a `reset_observing_run` method. Currenlty only resets number
of retries
* Other PR fixes
* Adding back the ability to reread fields file from the `get_observation`
call so that a user can do it manually rather than relying on an empty
observations list. But do it cleaner.
* Test the rereading of a fields with same entires only with a new higher
priority target
* Skip duplicate observation entries rather than stopping with assertion error.
@wtgee wtgee mentioned this pull request Jan 23, 2018
3 tasks
it for no reason and is also not testing what it says it should be testing.
@wtgee
Copy link
Copy Markdown
Member Author

wtgee commented Jan 23, 2018

I've removed a whole test in test_pocs.py that was hanging. Looking at the test I can't figure out how it was any different from the test below it. Based off of the name of the test, it was designed to be testing some of the interrupt code that a student added about a year ago (via Huntsman) and which has been removed during repository merge and cleanup because it needs some independent work.

Feel weird about just removing a test but I think it is the right course of action.

wtgee added 2 commits January 23, 2018 21:38
Explicitly clear observations for test of no targets
Comment thread pocs/observatory.py
kwargs.get('reread_fields_file', False)):
self.scheduler.read_field_list()

self.scheduler.get_observation(*args, **kwargs)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like it should return something. Why not return what get_observation returns?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea to make it explicit, but get_observation actually also sets current_observation, which is then returned. Should maybe think about renaming this.

Comment thread pocs/state/machine.py
state_changed = self.goto_next_state()
except Exception as e:
self.logger.warning("Problem going to next state, exiting loop [{}]".format(e))
self.logger.warning("Problem going from {} to {}, exiting loop [{!r}]".format(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine for now, but it would be nice to have a recovery mechanism, so that the system automatically tries to get to mount-parked/dome-closed/state-sleeping if things go bad, and then resumes... or waits until the next noon to restart.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The state_changed is False it will do as you describe, which should be what happens under normal operations. We should probably do as you describe although I would want to do more testing and focus on it. Created #419 to address.

Comment thread pocs/state/states/default/parked.py Outdated
pocs.next_state = 'housekeeping'
has_valid_observations = pocs.observatory.scheduler.has_valid_observations

if (has_valid_observations and pocs.is_safe() is False):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parentheses not needed here.

Comment thread pocs/state/states/default/parked.py Outdated
if (has_valid_observations and pocs.is_safe() is False):
pocs.say("Cleaning up for the night!")
pocs.next_state = 'housekeeping'
elif (has_valid_observations and pocs.is_safe()):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excess parens.

Comment thread pocs/state/states/default/parked.py Outdated
pocs.next_state = 'housekeeping'
has_valid_observations = pocs.observatory.scheduler.has_valid_observations

if (has_valid_observations and pocs.is_safe() is False):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps:

if has_valid_observations:
  if not pocs.is_safe():
    ...
  else:  # pocs.is_safe
    ...
else: # not has_valid_observations
  ...

* cleaning up code so parking logic is clearer
@jamessynge jamessynge merged commit e2dd720 into panoptes:develop Jan 26, 2018
@wtgee wtgee deleted the no-obs-clean-shutdown-390 branch January 26, 2018 21:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

End of night shutdown

2 participants