yasutakatou/IMS

IMS

Incident management tool for Slack.

v0.2

  • add hotline mode
    • Mention a specific message to make it an incident.
  • add check target id
    • You can specify the ID to check for messages.

v0.3

  • Users and channels can now be defined by name instead of ID.
    • Before) U024ZT3BHU5 After) adminuser

v0.4

  • add Reacji Channeler mode.
    • Added support for visualized messages using Reacji Channeler.

20220409

Added instructions for handling private channels.

note) No change in code.

v0.5

  • I can now report mixed A messages.

image

v0.6

  • Added reminder function and ability to delete remarks on reminder channels.

image

v0.7

  • Fixed a bug in Reacji mode that prevented the system from responding to text-only messages.

v0.8

  • Allow defining user IDs to be forwarded by Reacji.

v0.9

  • Added the ability to check the speaker in the incident management channel.

v0.91

  • Alerts with referring URLs now forward content to the default channel
    • With Reacji Channeler, it is difficult to understand the content of forwarded alerts, so alerts that contain links are now forwarded with their content included.

v0.92

  • Added mode to reverse definition behavior
    • The reverse action can be specified by prefixing the alert rule with "!".

v1.0

  • add No incident management mode
  • add No-reminder mode

Solution

As the center of communication at work has moved from e-mail to chat, you may have switched your monitoring tool's alert notifications to chat as well. But are people still reading every message and deciding how to respond by hand?
Let's change that!
This tool enables easy incident management through chat, accelerating ChatOps!!

Feature

This tool has three major functions.

  1. Check posted messages against the rules, label those that match, and repost them to the incident management channel.

image

With the reverse option, it is also possible to post the messages that do not match the rules.

  2. Check for reactions in the incident management channel and output the messages that have none.

image

  3. Periodically post a list of unhandled alerts to the report channel.

image

In other words, you can use it as follows.

  1. Invite this tool to the channel that receives your monitoring messages. The tool checks all messages.
  2. The report runs periodically and lists the messages that have no reaction, so check the unhandled ones and leave a history of your team's actions in the thread.

This makes it possible to

  1. Identify unanswered alerts
  2. Filter for known messages
  3. Keep a team history of responses to alerts.

All this can be done on Slack!

installation

If you want to put it under the path, you can use the following.

go get github.com/yasutakatou/IMS

If you want to create a binary and copy it yourself, use the following.

git clone https://github.com/yasutakatou/IMS
cd IMS
go build .

Alternatively, download a binary from the release page, save it, and copy it to a directory on your executable path.

uninstall

Delete the binary with the del or rm command. (It's simple!)

set up

Please follow the steps below to set up your environment.

  1. Register the tool as a bot.
  • Go to the Slack API site
  • Create New App
    • define (Name)
    • select (Workspace)
    • Create App
  • App-Level Tokens
    • Generate Token and Scopes
    • define (Name)
    • Add Scope
      • connections:write
    • Generate
      • Make a note of the token that begins with xapp-.
    • Done
  • Socket Mode
    • Enable Socket Mode
      • On
  • OAuth & Permissions
    • Scopes
    • Bot Token Scopes
      • channels:history
      • chat:write
      • files:write
      • reactions:write
      • users:read
        • v0.3 If you want to use -idlookup mode, you also need to define the following
          • channels:read
          • groups:read
          • im:read
          • mpim:read
    • Install to Workspace
    • Bot User OAuth Token
      • Make a note of the token that begins with xoxb-.
  • Event Subscriptions
    • Enable Events
      • On
    • Subscribe to bot events
    • Add Bot User Event
      • message.channels
    • Save Changes
  2. In the Slack app
    • invite bot
      • @(Name)
    • invite

note) Have the bot join every channel where you want to collect incidents.

20220409

If you want to use a private channel, add the following settings.

  • OAuth & Permissions
    • Scopes
    • Bot Token Scopes
      • groups:history
    • Install to Workspace
  • Event Subscriptions
    • Subscribe to bot events
    • Add Bot User Event
      • message.groups
    • Save Changes
  3. In your OS terminal
    • set environment variables
      • windows
        • set SLACK_APP_TOKEN=xapp-...
        • set SLACK_BOT_TOKEN=xoxb-...
      • linux
        • export SLACK_APP_TOKEN=xapp-...
        • export SLACK_BOT_TOKEN=xoxb-...
    • run this tool

v0.4) set up for Reacji Channeler

about Reacji Channeler

Reacji Channeler
Reacji Channeler for Slack

Define a forwarding reaction for the channel that collects the incidents.

image

If the rule is met, it will be automatically marked and forwarded.

image
image

Reports with links will go up on the channel for reporting.

image

The -reverse mode is also supported.

image
image
image

note) If you are in mode A, you will not be able to add tags to your report.

usecase

  1. Decide with your team which alert messages you will respond to and which you will ignore. -> config [Label]
  2. Decide on a channel for message retrieval, a channel for incident management, and a channel for reporting. -> config [Incidents]
  3. Decide which reaction mark will be used to mark an item as handled. -> config [Label]
  4. Define the channel for report output. -> config [Report]

config file

The config format is tab-separated values. A line is ignored if it begins with a sharp (#).
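
To make the format concrete, here is a minimal sketch of how such a file could be read. It is not the tool's actual parser: `parseConfig` is a hypothetical function, and it assumes that a section header is a line without tabs that is wrapped in square brackets.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseConfig groups tab-separated rows under their [Section] header.
// Empty lines and lines starting with '#' are skipped.
// Assumption: a header is a bracketed line that contains no tab.
func parseConfig(text string) map[string][][]string {
	sections := make(map[string][][]string)
	current := ""
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		isHeader := !strings.Contains(line, "\t") &&
			strings.HasPrefix(line, "[") && strings.HasSuffix(line, "]")
		if isHeader {
			current = strings.Trim(line, "[]")
			continue
		}
		if current == "" {
			continue // rows before the first header are ignored
		}
		sections[current] = append(sections[current], strings.Split(line, "\t"))
	}
	return sections
}

func main() {
	sample := "[Rules]\n" +
		"# lines starting with # are ignored\n" +
		".*Error.*\t.*:.*:.*\t[RuleX]\tCHANNEL1\tHot1\n" +
		"[Label]\n" +
		"white_check_mark\n"
	cfg := parseConfig(sample)
	for _, row := range cfg["Rules"] {
		fmt.Printf("rule with %d fields: %q\n", len(row), row)
	}
}
```

Note that a value such as [RuleX] inside a row is not mistaken for a header here, because the row contains tabs.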

auto read support

The config file is reloaded automatically, so after you rewrite it you do not need to rerun the tool.

[Rules]

Define rules for detecting messages.

[Rules]
.*Error.*	.*:.*:.*	[RuleX]	CHANNEL1	Hot1
  1. String to detect (regex allowed). note: the message must contain this string.
  2. Date and time range (regex allowed).
  3. Label given to messages that match the rule. (Used to analyze messages.)
  4. Channel label. When the rule matches, the message is posted to the channel defined under this label.
  5. Name of the [Hotline] label. Matching messages are posted with a mention and made an incident.

note) The date format is "2006/01/02 15:04:05 Mon(-Sun)".
If you want to detect messages that include "Fault" every day between 10:00 and 12:00, the rule is

.*Fault.* .*/.*/.* 1[0-2]:.*:.* .*

note) You can define multiple rules, not just one.
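
As an illustration of how the first two fields work together, the sketch below compiles both patterns and tests them against a message and a timestamp in the date format above. The `rule` type and its functions are hypothetical names for this example, not the tool's own code.

```go
package main

import (
	"fmt"
	"regexp"
	"time"
)

// dateLayout is the Go layout that corresponds to "2006/01/02 15:04:05 Mon".
const dateLayout = "2006/01/02 15:04:05 Mon"

// rule holds the first three fields of a [Rules] row (hypothetical type).
type rule struct {
	message *regexp.Regexp
	date    *regexp.Regexp
	label   string
}

func newRule(messagePattern, datePattern, label string) (rule, error) {
	m, err := regexp.Compile(messagePattern)
	if err != nil {
		return rule{}, err
	}
	d, err := regexp.Compile(datePattern)
	if err != nil {
		return rule{}, err
	}
	return rule{message: m, date: d, label: label}, nil
}

// matches is true only when both the text and the formatted timestamp match.
func (r rule) matches(text, stamp string) bool {
	return r.message.MatchString(text) && r.date.MatchString(stamp)
}

func main() {
	// The hour part 1[0-2] accepts 10, 11 and 12, so 12:59 still matches.
	r, err := newRule(`.*Fault.*`, `.*/.*/.* 1[0-2]:.*:.* .*`, "[Fault]")
	if err != nil {
		fmt.Println("bad rule:", err)
		return
	}
	stamp := time.Now().Format(dateLayout)
	fmt.Println(r.label, stamp, r.matches("disk Fault detected", stamp))
}
```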

v0.92

.*Debug.*	!.*:.*:.*	[Debug]	CHANNEL1	No

The reverse action can be specified by prefixing the alert rule with "!".

note) When used together with the -reverse option, the behavior is inverted again.
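
One way to picture the "!" prefix and its interaction with -reverse is as two successive inversions. The sketch below is an assumption about the logic, not the tool's implementation: `matchField` and `withReverse` are hypothetical helpers, and the "!" is handled on whichever pattern carries it.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// matchField reports whether value satisfies pattern.
// A leading "!" inverts the result of the regular expression.
func matchField(pattern, value string) (bool, error) {
	negate := strings.HasPrefix(pattern, "!")
	re, err := regexp.Compile(strings.TrimPrefix(pattern, "!"))
	if err != nil {
		return false, err
	}
	return re.MatchString(value) != negate, nil
}

// withReverse flips the outcome once more when -reverse is set
// (assumed behavior, based on the note above).
func withReverse(matched, reverse bool) bool {
	return matched != reverse
}

func main() {
	for _, msg := range []string{"Debug: cache warm", "Error: disk full"} {
		hit, err := matchField("!.*Debug.*", msg)
		if err != nil {
			fmt.Println("bad pattern:", err)
			return
		}
		fmt.Printf("%q matched=%v with -reverse=%v\n", msg, hit, withReverse(hit, true))
	}
}
```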

[Incidents]

This section configures the channels used for incident management.

[Incidents]
CHANNEL1	C025FKF3QJV	20
  1. Label for the channel.
  2. Channel ID used for incident management.
  3. Number of messages to look back through.

note) If 3. is too large, checks become slower.
note) You can define multiple rules, not just one.
note) v0.3: You can also specify a channel name instead of an ID.

Special Definition

In -reverse mode, this defines the default incident registration destination used when no rule matches.

DEFAULT	C025FKF3QJV	[Alert]
  1. "DEFAULT" is a fixed keyword.
  2. Channel ID used for incident management.
  3. Header added to the message.

note) v0.3: You can also specify a channel name instead of an ID.

[Label]

Define which reactions are marked as resolved.

note) This page is a good reference for what marks can be used.
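
The idea behind [Label] can be sketched as a simple filter: a message counts as resolved once it carries one of the defined reactions. The `message` type and `unresolved` function below are hypothetical and only illustrate that check.

```go
package main

import "fmt"

// message is a simplified stand-in for a Slack message (hypothetical type).
type message struct {
	Text      string
	Reactions []string // reaction names without colons, e.g. "white_check_mark"
}

// unresolved returns the messages that carry none of the resolved marks.
func unresolved(msgs []message, resolvedMarks []string) []message {
	marks := make(map[string]bool, len(resolvedMarks))
	for _, m := range resolvedMarks {
		marks[m] = true
	}
	var open []message
	for _, msg := range msgs {
		done := false
		for _, r := range msg.Reactions {
			if marks[r] {
				done = true
				break
			}
		}
		if !done {
			open = append(open, msg)
		}
	}
	return open
}

func main() {
	msgs := []message{
		{Text: "error and reactioned", Reactions: []string{"white_check_mark"}},
		{Text: "test message"},
	}
	for _, msg := range unresolved(msgs, []string{"white_check_mark"}) {
		fmt.Println("NG", msg.Text)
	}
}
```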

[Report]

Define the channel for report output.
The default cycle is once a day, but you can change it with option -loop.

note) v0.3: You can also specify a channel name instead of an ID.

[PostID]

Messages from the ID defined here will be checked.

note) You can define multiple IDs, not just one.
note) You can also specify the ID of the bot.
note) v0.3: You can also specify a user name instead of an ID.

image

[Hotline]

Defines who is mentioned for hotline incidents.

Hot1	U024ZT3BHU5	here
  1. Label for the definition.
  2. and later: IDs to mention.

note) You can define multiple rules, not just one.
note) A Slack user ID or here, channel, everyone can be defined.
note) v0.3: You can also specify a user name instead of an ID.
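
In Slack message text, a user is mentioned as `<@USERID>` and the special targets as `<!here>`, `<!channel>` and `<!everyone>`. The sketch below turns a [Hotline] row into such a mention string. `hotlineMentions` is a hypothetical helper, and it assumes user names have already been resolved to IDs.

```go
package main

import (
	"fmt"
	"strings"
)

// mention formats one target using Slack's mention syntax.
// Assumption: anything other than here, channel, everyone is a user ID.
func mention(target string) string {
	switch target {
	case "here", "channel", "everyone":
		return "<!" + target + ">"
	default:
		return "<@" + target + ">"
	}
}

// hotlineMentions splits a tab-separated [Hotline] row into its label
// and a space-separated list of mentions.
func hotlineMentions(row string) (label, mentions string) {
	fields := strings.Split(strings.TrimSpace(row), "\t")
	parts := make([]string, 0, len(fields))
	for _, f := range fields[1:] {
		parts = append(parts, mention(f))
	}
	return fields[0], strings.Join(parts, " ")
}

func main() {
	label, mentions := hotlineMentions("Hot1\tU024ZT3BHU5\there")
	fmt.Println(label, mentions)
}
```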

[Reacji]

Define the reaction used for Reacji Channeler.
With this definition, incidents are forwarded to the channel that collects them.

note) This page is a good reference for what marks can be used.

warning

[Reminder]

This function periodically picks up unaddressed incidents and notifies you of them.
Set the channel and time to be notified.

note) The first part is the channel name.
note) Specify the time you want to be notified in tab-delimited format using a regular expression.

alert	.*1.*	.*2.*

note) In the above example, reminders are posted to the alert channel from 10 to 24 o'clock.
note) You can define multiple rules, not just one.
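
When writing time patterns it is easy to match more hours than intended, so a quick check can help. The sketch below lists the hours that a set of patterns would cover. It rests on a loud assumption: that each pattern is tested against the two-digit hour ("00" to "23"). This README does not say which string the tool actually matches, and `matchingHours` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchingHours returns every hour (0-23) whose two-digit form matches
// at least one pattern. Assumption: the patterns apply to the hour only.
func matchingHours(patterns []string) ([]int, error) {
	compiled := make([]*regexp.Regexp, 0, len(patterns))
	for _, p := range patterns {
		re, err := regexp.Compile(p)
		if err != nil {
			return nil, err
		}
		compiled = append(compiled, re)
	}
	var hours []int
	for h := 0; h < 24; h++ {
		text := fmt.Sprintf("%02d", h)
		for _, re := range compiled {
			if re.MatchString(text) {
				hours = append(hours, h)
				break
			}
		}
	}
	return hours, nil
}

func main() {
	hours, err := matchingHours([]string{".*1.*", ".*2.*"})
	if err != nil {
		fmt.Println("bad pattern:", err)
		return
	}
	fmt.Println(hours)
}
```

Under that assumption the sample patterns cover 10 through 23 and also 01 and 02, so a tighter pattern may be preferable if the early hours are not wanted.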

[ReacjiID]

Messages from the user IDs defined here are forwarded by Reacji. This is mainly intended for webhook bots.

datadog

note) You can define multiple rules, not just one.

[MgmtReport]

Added the ability to check the speaker in the incident management channel.
Messages from the ID defined here will be checked and output to channel for report.
If empty, all submissions are forwarded to the reporting channel.

[MgmtReport]
U024ZT3BHU5

This is similar to [PostID]. It was implemented in response to a request to handle a manager's instructions as issues when the tool is used by a team.

note) You can define multiple IDs, not just one.
note) You can also specify the ID of the bot.
note) You can also specify a user name instead of an ID.

example

[Rules]
.*Error.*	.*:.*:.*	[RuleX]	CHANNEL1	Hot1
.*Warn.*	.*:.*:.*	[RuleX]	CHANNEL1	No
[Incidents]
CHANNEL1	C025FKF3QJV	20
DEFAULT	C025FKF3QJV	[Alert]
[Label]
white_check_mark
[Report]
C0256BTKP54
[PostID]
U024ZT3BHU5
[Hotline]
Hot1	U024ZT3BHU5	here

v0.3

[Rules]
.*Error.*	.*:.*:.*	[RuleX]	CHANNEL1	Hot1
.*Warn.*	.*:.*:.*	[RuleX]	CHANNEL1	No
.*Info.*	.*:.*:.*	[RuleX]	CHANNEL1	No
.*Debug.*	.*:.*:.*	[RuleX]	CHANNEL1	No
[Incidents]
CHANNEL1	incidents	20
DEFAULT	incidents	[Alert]
[Label]
white_check_mark
[Report]
report
[PostID]
ims
adminuser
[Hotline]
Hot1	adminuser	here

v0.4

[Rules]
.*Error.*	.*:.*:.*	[Error]	CHANNEL1	Hot1
.*Warn.*	.*:.*:.*	[Warn]	CHANNEL1	No
.*Info.*	.*:.*:.*	[Info]	CHANNEL1	No
.*Debug.*	.*:.*:.*	[Debug]	CHANNEL1	No
[Incidents]
CHANNEL1	incidents	20
DEFAULT	incidents	[Alert]	
[Label]
white_check_mark
[Report]
rep
[PostID]
user
adminuser
[Hotline]
Hot1	adminuser	here
[Reacji]
warning

v0.6

[Rules]
.*Error.*	.*:.*:.*	[Error]	CHANNEL1	Hot1
.*Warn.*	.*:.*:.*	[Warn]	CHANNEL1	No
.*Info.*	.*:.*:.*	[Info]	CHANNEL1	No
.*Debug.*	.*:.*:.*	[Debug]	CHANNEL1	No
[Incidents]
CHANNEL1	incidents	20
DEFAULT	incidents	[Alert]	
[Label]
white_check_mark
[Report]
rep
[PostID]
user
adminuser
[Hotline]
Hot1	adminuser	here
[Reacji]
warning
[Reminder]
alert	.*1.*	.*2.*

v0.8

[Rules]
.*Error.*	.*:.*:.*	[Error]	CHANNEL1	Hot1
.*Warn.*	.*:.*:.*	[Warn]	CHANNEL1	No
.*Info.*	.*:.*:.*	[Info]	CHANNEL1	No
.*Debug.*	.*:.*:.*	[Debug]	CHANNEL1	No
[Incidents]
CHANNEL1	incidents	20
DEFAULT	incidents	[Alert]	
[Label]
white_check_mark
[Report]
rep
[PostID]
user
adminuser
[Hotline]
Hot1	adminuser	here
[Reacji]
warning
[Reminder]
alert	.*1.*	.*2.*
[ReacjiID]
datadog

v0.9

[Rules]
.*Error.*	.*:.*:.*	[Error]	CHANNEL1	Hot1
.*Warn.*	.*:.*:.*	[Warn]	CHANNEL1	No
.*Info.*	.*:.*:.*	[Info]	CHANNEL1	No
.*Debug.*	.*:.*:.*	[Debug]	CHANNEL1	No
[Incidents]
CHANNEL1	incidents	20
DEFAULT	incidents	[Alert]	
[Label]
white_check_mark
[Report]
rep
[PostID]
user
adminuser
[Hotline]
Hot1	adminuser	here
[Reacji]
warning
[Reminder]
alert	.*1.*	.*2.*
[ReacjiID]
datadog
[MgmtReport]

options

  -auto
        [-auto=config auto read/write mode (true is enable)] (default true)
  -clearReminder
        [-clearReminder=clear reminder channel and exit mode.]
  -config string
        [-config=config file)] (default "IMS.ini")
  -debug
        [-debug=debug mode (true is enable)]
  -idlookup
        [-idlookup=resolve to ID definition (true is enable)] (default true)
  -log
        [-log=logging mode (true is enable)]
  -loop int
        [-loop=incident check loop time (Hour). ] (default 24)
  -noincident
        [-noincident=No incident management mode.]
  -noreminder
        [-noreminder=No-reminder mode.]
  -onlyReport
        [-onlyReport=incident check and exit mode.]
  -reacji
        [-reacji=Slack: reacji channeler mode (true is enable)]
  -reminder int
        [-reminder=Interval for posting reminders (Seconds). ] (default 30)
  -reverse
        [-reverse=check rule to reverse (true is enable)]
  -test string
        [-test=Test what happens when you set the message.]
  -verbose
        [-verbose=incident output verbose (true is enable)]

-auto

Enables config auto read/write mode (on by default).

-clearReminder

Clears the messages in the reminder channel and exits.

-config

Specify the configuration file name.

-debug

Run in the mode that outputs various logs.

-idlookup

When enabled, it converts channel names and user names into IDs.

-log

Enables logging mode.

-loop

Interval between incident checks, in hours. The default is 24 hours.

-noincident

No incident management mode.

-noreminder

No-reminder mode.

-onlyReport

If this option is specified, the tool will exit after the incidents report.

note) This can be used if you want to check the incident report manually.

-reacji

Activate the mode that uses Reacji Channeler.

-reminder int

The interval at which to check for reminders.

note) Units are in seconds.

-reverse

Reverses all check rules.

image

The result is reversed as follows.

image

note) Rules with a hotline defined are still made incidents even when reversed.

-test

If this option is specified, the tool will exit after the message check.
This can be used if you want to check the message check manually.

>IMS -test="Error test"

[Test] Error test
this message include rule (1)!

note) The number in parentheses () indicates the order of rules that have been matched.

-verbose

Displays not only unsolved messages, but also solved ones.

[message] norml message [date] 2021/05/02 21:27:28
[message] test message [date] 2021/05/02 17:54:05

to

NG [message] norml message [date] 2021/05/02 21:27:28
OK [message] error and reactioned [date] 2021/05/02 21:25:40 [user]  yasutakato
NG [message] test message [date] 2021/05/02 17:54:05

note) In this mode, the name of the person who resolved the message is also displayed.

license

Apache-2.0 License
BSD-2-Clause License
BSD-3-Clause License