Target not available #690
Closed
…ification (linkedin#618) * create endpoint to build and render messages * update sender tests
* change cache plan delete log level (linkedin#619) * skip error logging if iris bot fails to send messages to a channel it is not in (linkedin#621) * skip error logging if iris bot fails to send messages to a channel it is not in * removed redundant slack api call and added a warning * added a return statement to keep the event from being counted as an error in the metrics Co-authored-by: Damian Durruty <ddurruty@ddurruty-mn2.linkedin.biz> * rackspace webhook support (linkedin#405) * Start rackspace webhook impl by copying alertmanager * Plumb plan name through URL parameter for rackspace webhook * add test for rackspace webhook (WIP) * fix rackspace webhook test data * rackspace webhook should parse plan inside its class This detail is only relevant to the rackspace webhook class and doesn't belong in api.py. This just reverts the changes in api.py and adds support for parsing the plan in the rackspace webhook class. * deduplicate webhook code with class inheritance authored-by: Patrick Baxter <pb@coreos.com> * Add support for custom Slack formatting via attachments/blocks (linkedin#624) * Update iris_slack.py * Update iris_slack.py * bumnp version * fix tests and incident id insert in slack message * remove extra arguments (linkedin#626) * added twillio number override mechanism (linkedin#627) * skip error logging if iris bot fails to send messages to a channel it is not in * removed redundant slack api call and added a warning * added a return statement to keep the event from being counted as an error in the metrics * added twillio number override mechanism * minor change in case application_override_mapping is not defined in the default config Co-authored-by: Damian Durruty <ddurruty@ddurruty-mn2.linkedin.biz> Co-authored-by: ddurruty-li <85372760+ddurruty-li@users.noreply.github.com> Co-authored-by: Damian Durruty <ddurruty@ddurruty-mn2.linkedin.biz> Co-authored-by: Patrick Baxter <patrickbx@gmail.com> Co-authored-by: Luke Young 
<bored-engineer@users.noreply.github.com>
* change cache plan delete log level (linkedin#619) * skip error logging if iris bot fails to send messages to a channel it is not in (linkedin#621) * skip error logging if iris bot fails to send messages to a channel it is not in * removed redundant slack api call and added a warning * added a return statement to keep the event from being counted as an error in the metrics Co-authored-by: Damian Durruty <ddurruty@ddurruty-mn2.linkedin.biz> * rackspace webhook support (linkedin#405) * Start rackspace webhook impl by copying alertmanager * Plumb plan name through URL parameter for rackspace webhook * add test for rackspace webhook (WIP) * fix rackspace webhook test data * rackspace webhook should parse plan inside its class This detail is only relevant to the rackspace webhook class and doesn't belong in api.py. This just reverts the changes in api.py and adds support for parsing the plan in the rackspace webhook class. * deduplicate webhook code with class inheritance authored-by: Patrick Baxter <pb@coreos.com> * Add support for custom Slack formatting via attachments/blocks (linkedin#624) * Update iris_slack.py * Update iris_slack.py * bumnp version * fix tests and incident id insert in slack message * remove extra arguments (linkedin#626) * added twillio number override mechanism (linkedin#627) * skip error logging if iris bot fails to send messages to a channel it is not in * removed redundant slack api call and added a warning * added a return statement to keep the event from being counted as an error in the metrics * added twillio number override mechanism * minor change in case application_override_mapping is not defined in the default config Co-authored-by: Damian Durruty <ddurruty@ddurruty-mn2.linkedin.biz> * remove experimental-sender changes from master * Update __init__.py * check for active mailing_list target (linkedin#630) * use app's category default if user's not defined * Iris-message-processor cluster management * incorporate suggestions * remove 
redundant line * remove internal app allowlist * Update __init__.py * remove non-inclusive language (linkedin#639) * remove non-inclusive language (linkedin#635) * master -> leader in config (linkedin#637) * remove non-inclusive language * master -> leader in config * restore sphinx required master_doc (linkedin#638) * remove non-inclusive language * master -> leader in config * restore sphinx required master_doc Co-authored-by: ddurruty-li <85372760+ddurruty-li@users.noreply.github.com> Co-authored-by: Damian Durruty <ddurruty@ddurruty-mn2.linkedin.biz> Co-authored-by: Patrick Baxter <patrickbx@gmail.com> Co-authored-by: Luke Young <bored-engineer@users.noreply.github.com>
* include custom sender addresses for build_message * add endpoint to fetch plan aggregation settings
…kedin#656) * return device ids with target contact * support multi-recipient msgs in external sender
* return device ids with target contact * support multi-recipient msgs in external sender * build messages for dynamic plans * remove unused variables
* return device ids with target contact * support multi-recipient msgs in external sender * build messages for dynamic plans * remove unused variables * link ui to external sender if enabled * fix wording of comment * add external sender into incident target search
* return device ids with target contact * support multi-recipient msgs in external sender * build messages for dynamic plans * remove unused variables * link ui to external sender if enabled * fix wording of comment * add external sender into incident target search * forward twilio deliver status to external sender * handle external message responses * set X-IRIS-INCIDENT header
* still display incidents if sender can't be reached * add ecternal sender peer count endpoint * flake8
linkedin#671) * still display incidents if sender can't be reached * add ecternal sender peer count endpoint * flake8 * split external incident & notification hadling cfg
* change cache plan delete log level (linkedin#619) * skip error logging if iris bot fails to send messages to a channel it is not in (linkedin#621) * skip error logging if iris bot fails to send messages to a channel it is not in * removed redundant slack api call and added a warning * added a return statement to keep the event from being counted as an error in the metrics Co-authored-by: Damian Durruty <ddurruty@ddurruty-mn2.linkedin.biz> * rackspace webhook support (linkedin#405) * Start rackspace webhook impl by copying alertmanager * Plumb plan name through URL parameter for rackspace webhook * add test for rackspace webhook (WIP) * fix rackspace webhook test data * rackspace webhook should parse plan inside its class This detail is only relevant to the rackspace webhook class and doesn't belong in api.py. This just reverts the changes in api.py and adds support for parsing the plan in the rackspace webhook class. * deduplicate webhook code with class inheritance authored-by: Patrick Baxter <pb@coreos.com> * Add support for custom Slack formatting via attachments/blocks (linkedin#624) * Update iris_slack.py * Update iris_slack.py * bumnp version * fix tests and incident id insert in slack message * remove extra arguments (linkedin#626) * added twillio number override mechanism (linkedin#627) * skip error logging if iris bot fails to send messages to a channel it is not in * removed redundant slack api call and added a warning * added a return statement to keep the event from being counted as an error in the metrics * added twillio number override mechanism * minor change in case application_override_mapping is not defined in the default config Co-authored-by: Damian Durruty <ddurruty@ddurruty-mn2.linkedin.biz> * remove experimental-sender changes from master * Update __init__.py * check for active mailing_list target (linkedin#630) * use app's category default if user's not defined * remove internal app allowlist * Update __init__.py * remove non-inclusive 
language (linkedin#639) * remove non-inclusive language (linkedin#635) * master -> leader in config (linkedin#637) * remove non-inclusive language * master -> leader in config * restore sphinx required master_doc (linkedin#638) * remove non-inclusive language * master -> leader in config * restore sphinx required master_doc * db setup instruction (linkedin#646) the `schema_0.sql` contains `drop table` statement which could be not obvious. I experienced this issue when using a docker image with `DOCKER_DB_BOOTSTRAP=1` (which does a schema and dummy data), and wondering why I lost all my custom plans and templates, after iris container restart. Additionally provide alternative way to removing ONLY_FULL_GROUP_BY. This is useful when running mysql docker images. I.e. from oracla/mysql or bitnami mariadb images. This is easier than modifying the config file, or changing global server configuration when running in the container (i.e. docker or kubernetes). * use DB port from the config in image entrypoint (linkedin#650) * until now, 3306 was hardcoded * port in config file stays optional for backward compatibility Co-authored-by: Michal Zubac <michal.zubac@inuits.eu> * rebasing container image to ubuntu:20.04 (linkedin#649) * rebasing container image to ubuntu:20.04 * switching to Python v3 * which was driven by rsa>=3.1.4 -> oauth2client==1.4.12, it no longer supports Python v2 * adding ops/packer Makefile * to automate steps only found in README.md * done just for the docker image build * instructing packer to clear ENTRYPOINT from original ubuntu image * CMD was not working with the /bin/sh in place of ENTRYPOINT * fixes to actually make it work with Python 3.8 * correct plugin references for uwsgi * fix of execv inputs to match validation, which changed in Python 3.6 * added missing mysql client package, which is used in the entrypoint * not using 'gevent' parameter for uwsgi * it causes "DAMN ! worker N (pid: NNN) died :( trying respawn ..." 
* not sure why this happens for uwsgi+python3+gevent Co-authored-by: Michal Zubac <michal.zubac@inuits.eu> * add max len to webhook context too long response (linkedin#653) * Update __init__.py * Create __init__.py * Support multiple custom incident handler hooks (linkedin#664) Refactor handling of custom incident handlers to support multiple handlers. Retain compatibility with previous config syntax * Calculate incident priority for incident creation hook - Generate priority by looking for the highest severity priority across all of the plan notifications - Set this as a new priority field within the incident details as passed to the process_create() hook * Make application name in hook incident data support overwritten app name When creating test incidents within the API, the application given to the incident creation hooks is "iris" rather than the application being used. Fix this such that the hooks receive the overwritten application name. Also: move definition of `app` to the parent block in the function so it is more obvious this variable is used later on. * Don't prepend https:// to proxy hostnames (linkedin#668) Historically, this approach has always worked because http forward proxies generally only listen on http:// (not https://) and urllib3 has not supported connecting to a http proxy via https:// so it has always ignored the scheme. However, as of urllib3 >= 1.26 or so, urllib3 does support and attempt connecting to proxies via https:// (if this schem is provided) and it raises an exception if the proxy only listens on http:// Fix this by no longer enforcing a http:// prefix to proxy hostnames. If the user desires connecting to a https:// proxy, this prefix can be provided within Iris's configuration. 
* remove old custom_incident_handler_dispatcher Co-authored-by: ddurruty-li <85372760+ddurruty-li@users.noreply.github.com> Co-authored-by: Damian Durruty <ddurruty@ddurruty-mn2.linkedin.biz> Co-authored-by: Patrick Baxter <patrickbx@gmail.com> Co-authored-by: Luke Young <bored-engineer@users.noreply.github.com> Co-authored-by: Witold Baryluk <witold.baryluk+github@gmail.com> Co-authored-by: mighq <contact@mighq.net> Co-authored-by: Michal Zubac <michal.zubac@inuits.eu> Co-authored-by: Joe Gillotti <joe@u13.net> Co-authored-by: Joe Gillotti <jgillotti@linkedin.com>
* change cache plan delete log level (linkedin#619) * skip error logging if iris bot fails to send messages to a channel it is not in (linkedin#621) * skip error logging if iris bot fails to send messages to a channel it is not in * removed redundant slack api call and added a warning * added a return statement to keep the event from being counted as an error in the metrics Co-authored-by: Damian Durruty <ddurruty@ddurruty-mn2.linkedin.biz> * rackspace webhook support (linkedin#405) * Start rackspace webhook impl by copying alertmanager * Plumb plan name through URL parameter for rackspace webhook * add test for rackspace webhook (WIP) * fix rackspace webhook test data * rackspace webhook should parse plan inside its class This detail is only relevant to the rackspace webhook class and doesn't belong in api.py. This just reverts the changes in api.py and adds support for parsing the plan in the rackspace webhook class. * deduplicate webhook code with class inheritance authored-by: Patrick Baxter <pb@coreos.com> * Add support for custom Slack formatting via attachments/blocks (linkedin#624) * Update iris_slack.py * Update iris_slack.py * bumnp version * fix tests and incident id insert in slack message * remove extra arguments (linkedin#626) * added twillio number override mechanism (linkedin#627) * skip error logging if iris bot fails to send messages to a channel it is not in * removed redundant slack api call and added a warning * added a return statement to keep the event from being counted as an error in the metrics * added twillio number override mechanism * minor change in case application_override_mapping is not defined in the default config Co-authored-by: Damian Durruty <ddurruty@ddurruty-mn2.linkedin.biz> * remove experimental-sender changes from master * Update __init__.py * check for active mailing_list target (linkedin#630) * use app's category default if user's not defined * remove internal app allowlist * Update __init__.py * remove non-inclusive 
language (linkedin#639) * remove non-inclusive language (linkedin#635) * master -> leader in config (linkedin#637) * remove non-inclusive language * master -> leader in config * restore sphinx required master_doc (linkedin#638) * remove non-inclusive language * master -> leader in config * restore sphinx required master_doc * db setup instruction (linkedin#646) the `schema_0.sql` contains `drop table` statement which could be not obvious. I experienced this issue when using a docker image with `DOCKER_DB_BOOTSTRAP=1` (which does a schema and dummy data), and wondering why I lost all my custom plans and templates, after iris container restart. Additionally provide alternative way to removing ONLY_FULL_GROUP_BY. This is useful when running mysql docker images. I.e. from oracla/mysql or bitnami mariadb images. This is easier than modifying the config file, or changing global server configuration when running in the container (i.e. docker or kubernetes). * use DB port from the config in image entrypoint (linkedin#650) * until now, 3306 was hardcoded * port in config file stays optional for backward compatibility Co-authored-by: Michal Zubac <michal.zubac@inuits.eu> * rebasing container image to ubuntu:20.04 (linkedin#649) * rebasing container image to ubuntu:20.04 * switching to Python v3 * which was driven by rsa>=3.1.4 -> oauth2client==1.4.12, it no longer supports Python v2 * adding ops/packer Makefile * to automate steps only found in README.md * done just for the docker image build * instructing packer to clear ENTRYPOINT from original ubuntu image * CMD was not working with the /bin/sh in place of ENTRYPOINT * fixes to actually make it work with Python 3.8 * correct plugin references for uwsgi * fix of execv inputs to match validation, which changed in Python 3.6 * added missing mysql client package, which is used in the entrypoint * not using 'gevent' parameter for uwsgi * it causes "DAMN ! worker N (pid: NNN) died :( trying respawn ..." 
* not sure why this happens for uwsgi+python3+gevent Co-authored-by: Michal Zubac <michal.zubac@inuits.eu> * add max len to webhook context too long response (linkedin#653) * Update __init__.py * Create __init__.py * Support multiple custom incident handler hooks (linkedin#664) Refactor handling of custom incident handlers to support multiple handlers. Retain compatibility with previous config syntax * Calculate incident priority for incident creation hook - Generate priority by looking for the highest severity priority across all of the plan notifications - Set this as a new priority field within the incident details as passed to the process_create() hook * Make application name in hook incident data support overwritten app name When creating test incidents within the API, the application given to the incident creation hooks is "iris" rather than the application being used. Fix this such that the hooks receive the overwritten application name. Also: move definition of `app` to the parent block in the function so it is more obvious this variable is used later on. * Don't prepend https:// to proxy hostnames (linkedin#668) Historically, this approach has always worked because http forward proxies generally only listen on http:// (not https://) and urllib3 has not supported connecting to a http proxy via https:// so it has always ignored the scheme. However, as of urllib3 >= 1.26 or so, urllib3 does support and attempt connecting to proxies via https:// (if this schem is provided) and it raises an exception if the proxy only listens on http:// Fix this by no longer enforcing a http:// prefix to proxy hostnames. If the user desires connecting to a https:// proxy, this prefix can be provided within Iris's configuration. 
* remove old custom_incident_handler_dispatcher * process ldap and oncall syncs concurrently (linkedin#677) Co-authored-by: ddurruty-li <85372760+ddurruty-li@users.noreply.github.com> Co-authored-by: Damian Durruty <ddurruty@ddurruty-mn2.linkedin.biz> Co-authored-by: Patrick Baxter <patrickbx@gmail.com> Co-authored-by: Luke Young <bored-engineer@users.noreply.github.com> Co-authored-by: Witold Baryluk <witold.baryluk+github@gmail.com> Co-authored-by: mighq <contact@mighq.net> Co-authored-by: Michal Zubac <michal.zubac@inuits.eu> Co-authored-by: Joe Gillotti <joe@u13.net> Co-authored-by: Joe Gillotti <jgillotti@linkedin.com>
* use locks with api cache * add ability to filter incidents by claimed * use gevent lock directly
* use locks with api cache * add ability to filter incidents by claimed * use gevent lock directly * import gevent.lock explicitly in cache
* use locks with api cache * add ability to filter incidents by claimed * use gevent lock directly * import gevent.lock explicitly in cache * use gevent BoundedSemaphore instead of Semaphore
* use locks with api cache * add ability to filter incidents by claimed * use gevent lock directly * import gevent.lock explicitly in cache * use gevent BoundedSemaphore instead of Semaphore * init api cache immediately
* use locks with api cache * add ability to filter incidents by claimed * use gevent lock directly * import gevent.lock explicitly in cache * use gevent BoundedSemaphore instead of Semaphore * init api cache immediately * set iris client init log to debug
…into experimental-sender
Experimental sender
bilbof added a commit to bilbof/iris that referenced this pull request on Apr 11, 2022
The api makes use of [gevent], a coroutine based networking library which relies heavily on monkey patching the stdlib. From the [gevent.monkey] docs:

> Warning Patching too late can lead to unreliable behaviour
> (for example, some modules may still use blocking sockets) or even errors.

This appears to have happened here. Thanks to @allwyn-pradip for pointing me at the right file in PR linkedin#690.

Resolves linkedin#686, linkedin#699, linkedin#644.

Blog on gevent: https://eng.lyft.com/what-the-heck-is-gevent-4e87db98a8

> In the case of gevent — monkey patching has to be the absolute first thing a process does

[gevent]: https://www.gevent.org/index.html
[gevent.monkey]: https://www.gevent.org/api/gevent.monkey.html
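The ordering requirement quoted in the commit message can be illustrated with a minimal sketch. It is not the change made in this repository: the module layout and the `socket_is_patched` helper are hypothetical, and the `ImportError` guard is only there so the snippet still runs where gevent is not installed. The point it shows is that `monkey.patch_all()` runs before any other import that could bind the blocking stdlib primitives.

```python
# Hypothetical entry-point module; names here are illustrative, not taken from iris.
try:
    from gevent import monkey
except ImportError:
    # Assumption for this sketch only: fall back to running unpatched.
    monkey = None

if monkey is not None:
    # Patch first, before importing anything that touches sockets, ssl or threads.
    monkey.patch_all()

# Only now import modules that rely on the (possibly patched) stdlib.
import socket  # noqa: E402  (deliberately imported after patching)


def socket_is_patched() -> bool:
    """Return True if gevent has monkey patched the stdlib socket module."""
    return monkey is not None and monkey.is_module_patched("socket")


if __name__ == "__main__":
    print("socket patched by gevent:", socket_is_patched())
```

If the patch call were moved below the other imports, modules imported earlier could keep references to the blocking implementations, which is the failure mode the gevent warning describes.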
diegocepedaw pushed a commit that referenced this pull request on Apr 11, 2022
The api makes use of [gevent], a coroutine based networking library which relies heavily on monkey patching the stdlib. From the [gevent.monkey] docs:

> Warning Patching too late can lead to unreliable behaviour
> (for example, some modules may still use blocking sockets) or even errors.

This appears to have happened here. Thanks to @allwyn-pradip for pointing me at the right file in PR #690.

Resolves #686, #699, #644.

Blog on gevent: https://eng.lyft.com/what-the-heck-is-gevent-4e87db98a8

> In the case of gevent — monkey patching has to be the absolute first thing a process does

[gevent]: https://www.gevent.org/index.html
[gevent.monkey]: https://www.gevent.org/api/gevent.monkey.html
@allwyn-pradip you should be able to close this PR now that #703 is in. Thanks - this PR showed how to fix the issue.
This would fix the priority not available when creating a plan