This repository has been archived by the owner on Sep 17, 2021. It is now read-only.

Replace AP Scheduler with Celery 🥦 #911

Merged
merged 3 commits into from Jan 25, 2018


mikegrima
Contributor

@mikegrima mikegrima commented Jan 19, 2018

Migration away from APScheduler over to Celery tasks.

Need to:

  1. Complete the scheduling logic
  2. Update Unit Tests
  3. Update the documentation and include examples of running multiple SM workers.

Working on finishing touches. Also trying to properly fix ephemeral detection as configured on each watcher.

@mikegrima mikegrima self-assigned this Jan 19, 2018
@Netflix Netflix deleted a comment from coveralls Jan 21, 2018
@mikegrima
Contributor Author

Running internal testing now.

@mikegrima mikegrima changed the title from "EXPERIMENTAL WIP: Replace AP Scheduler with Celery 🥦" to "WIP: Replace AP Scheduler with Celery 🥦" Jan 22, 2018
@mikegrima
Contributor Author

I've decided to make the ALB (ELBv2) watcher's TargetGroupHealth ephemeral, since the output includes individual instance IDs, which churn a lot.
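
As a rough illustration of what marking a field ephemeral means for change detection, here is a minimal sketch. It is not Security Monkey's actual watcher code: the helper names `strip_ephemeral` and `durable_change`, the tuple-based path format, and the sample values are all hypothetical. Only the `TargetGroupHealth` field name comes from the comment above.

```python
import copy


def strip_ephemeral(config, ephemeral_paths):
    """Return a deep copy of `config` with every ephemeral path removed.

    Each path is a tuple of dict keys (a hypothetical format chosen for
    this sketch), e.g. ("TargetGroupHealth",).
    """
    durable = copy.deepcopy(config)
    for path in ephemeral_paths:
        node = durable
        # Walk down to the parent of the final key.
        for key in path[:-1]:
            node = node.get(key) if isinstance(node, dict) else None
            if node is None:
                break
        else:
            # Only reached when the walk did not break out early.
            if isinstance(node, dict):
                node.pop(path[-1], None)
    return durable


def durable_change(old, new, ephemeral_paths):
    """True only if something outside the ephemeral paths differs."""
    return strip_ephemeral(old, ephemeral_paths) != strip_ephemeral(new, ephemeral_paths)


# Hypothetical snapshots of the same target group at two points in time.
old = {"Name": "tg", "TargetGroupHealth": [{"Target": {"Id": "i-1"}}]}
new = {"Name": "tg", "TargetGroupHealth": [{"Target": {"Id": "i-2"}}]}
paths = [("TargetGroupHealth",)]
print(durable_change(old, new, paths))
```

With this approach, instance churn inside `TargetGroupHealth` would not count as a durable change, while a difference in any other field still would.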

@mikegrima
Contributor Author

Hopefully we finally fixed the ephemerals...

@mikegrima mikegrima changed the title from "WIP: Replace AP Scheduler with Celery 🥦" to "Replace AP Scheduler with Celery 🥦" Jan 24, 2018
@mikegrima
Contributor Author

mikegrima commented Jan 24, 2018

2 more things are needed before this is done:

  1. Fix jankiness with the SQS watcher (flips between deleted and present) (see Migrate SQS watcher to CloudAux #914)
  2. Create docs for making an ElastiCache Redis queue for use with Celery.
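
For context on item 2, Celery reads its broker location from the `broker_url` setting, which accepts a `redis://` URL. The fragment below is only a sketch of a Celery-style config module under that assumption. The endpoint hostname and the `REDIS_*` names are made-up placeholders, and this is not claimed to be the configuration Security Monkey ships.

```python
# Sketch of a Celery config module pointing at a Redis broker.
# The hostname is a hypothetical placeholder, not a real ElastiCache endpoint.
REDIS_HOST = "my-cache.example.cache.amazonaws.com"
REDIS_PORT = 6379  # Redis default port
REDIS_DB = 0       # Redis logical database number

# `broker_url` is Celery's lowercase setting name for the message broker.
broker_url = "redis://{}:{}/{}".format(REDIS_HOST, REDIS_PORT, REDIS_DB)
```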

@rayjanoka

rayjanoka commented Jan 24, 2018

Hey @mikegrima,
You may have seen this already, but this Docker script that starts the scheduler will probably need to be updated, and we'll also need a way to start Redis for Docker.
https://github.com/Netflix/security_monkey/blob/develop/docker/scheduler-start.sh

@mikegrima
Contributor Author

@rayj-pgi Thank you for the reminder! Will need to add that to the list.

@mikegrima
Contributor Author

The SQS flippage is due to the way that AWS handles throttling on the SQS API. It sends a 403 with the error code RequestThrottled, which is different from the other technologies.

I'm going to make a separate issue and PR to fix it by migrating it over to a CloudAux watcher.
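
To make the throttling point concrete, here is a small sketch of telling a throttled request apart from a genuine access failure by looking at the error code rather than the HTTP status alone. It assumes the error is available as a dict shaped like the `response` attribute of a botocore `ClientError`. The function names and the set of codes are illustrative assumptions, and this is not the fix that went into the CloudAux migration.

```python
# Error codes treated as throttling in this sketch. "RequestThrottled" is the
# code named above for SQS; the other two are common AWS throttling codes.
THROTTLE_CODES = {"RequestThrottled", "Throttling", "ThrottlingException"}


def is_throttled(error_response):
    """True if the error dict carries a known throttling error code."""
    code = error_response.get("Error", {}).get("Code", "")
    return code in THROTTLE_CODES


def classify(error_response):
    """Return 'retry', 'access_denied', or 'error' for an error dict."""
    if is_throttled(error_response):
        # Check the code first: a throttled SQS call also arrives as a 403.
        return "retry"
    status = error_response.get("ResponseMetadata", {}).get("HTTPStatusCode")
    if status == 403:
        return "access_denied"
    return "error"


# Hypothetical error payloads for illustration.
throttled = {
    "Error": {"Code": "RequestThrottled"},
    "ResponseMetadata": {"HTTPStatusCode": 403},
}
denied = {
    "Error": {"Code": "AccessDenied"},
    "ResponseMetadata": {"HTTPStatusCode": 403},
}
print(classify(throttled), classify(denied))
```

Checking the code before the status keeps a throttled call from being handled the same way as a permissions failure.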

@mikegrima mikegrima merged commit 94eeadd into Netflix:develop Jan 25, 2018
@imanandshah

imanandshah commented Jan 26, 2018

The new change is crashing the scheduler

2018-01-26 12:59:52,886 INFO success: securitymonkeyscheduler entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-01-26 12:59:52,959 INFO exited: securitymonkeyscheduler (exit status 2; expected)
2018-01-26 12:59:53,962 INFO spawned: 'securitymonkeyscheduler' with pid 1686
2018-01-26 12:59:54,964 INFO success: securitymonkeyscheduler entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-01-26 12:59:55,037 INFO exited: securitymonkeyscheduler (exit status 2; expected)
2018-01-26 12:59:56,040 INFO spawned: 'securitymonkeyscheduler' with pid 1693
2018-01-26 12:59:57,041 INFO success: securitymonkeyscheduler entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-01-26 12:59:57,158 INFO exited: securitymonkeyscheduler (exit status 2; expected)

@imanandshah

imanandshah commented Jan 26, 2018

monkey: error: invalid choice: 'start_scheduler' (choose from 'add_override_scores', 'backup_config_to_json', 'add_account_aws', 'create_user', 'sync_swag', 'add_account_openstack', 'add_override_score', 'delete_unjustified_issues', 'delete_account', 'run_change_reporter', 'enable_accounts', 'add_account_github', 'audit_changes', 'runserver', 'add_account_gcp', 'add_watcher_config', 'sync_jira', 'run_api_server', 'shell', 'db', 'fetch_aws_canonical_ids', 'drop_db', 'clean_stale_issues', 'find_changes', 'amazon_accounts', 'clear_expired_exceptions', 'disable_accounts')

@mikegrima
Contributor Author

mikegrima commented Jan 26, 2018

@imanandshah Did you update your scheduler config?

It needs to call celery now. start_scheduler was removed.

Please see the new autostarting docs: https://github.com/Netflix/security_monkey/blob/develop/docs/autostarting.md

@mikegrima mikegrima deleted the celery branch February 7, 2018 23:10