A Logstash output plugin, written in Ruby, that forwards data collected and processed by Logstash input and filter plugins to Application Insights Analytics Open Schema.
Microsoft Application Insights Output Plugin for Logstash


  • This project is a plugin for Logstash.

  • This plugin has to be installed on top of the Logstash core pipeline. It is not a stand-alone program.

  • This plugin outputs events to Microsoft Application Insights Analytics open schema tables.

Plugin Features

Supported Logstash Versions

  • Logstash 2.3.2
  • Logstash 2.3.4
  • Logstash 2.4.0
  • Logstash 5.0.0

Note:

  • x64 Ruby for Windows is known to have some compatibility issues.
  • The plugin depends on azure-storage, which depends on the nokogiri gem; nokogiri does not support Ruby 2.2+ on Windows.

Setting up

Install Logstash

Install logstash-output-application_insights output plugin

One command installation:

bin/logstash-plugin install "logstash-output-application_insights"

Create configuration file

Example (input from files, output to Application Insights):

input {
  file {
    path => "/../files/*"
    start_position => "beginning"
  }
}
filter {
    # some filters here
}
output {
  application_insights {
    instrumentation_key => "5a6714a3-ec7b-4999-ab96-232f1da92059"
    table_id => "c24394e1-f077-420e-8a25-ef6fdf045938"
    storage_account_name_key => [ "my-storage-account", "pfrYTwPgKyYNfKBY2QdF+v5sbgx8/eAQp+FFkGpPBnkMDE1k+ZNK3r3qIPqqw8UsOIUqaF3dXBdPDouGJuxNXQ==" ]
      # set to true only if you want to allow Microsoft to collect telemetry data about this process
    enable_telemetry_to_microsoft => false
  }
}

Run Logstash

bin/logstash -f 'file://localhost/../your-config-file'

Installation options

One command installation:

bin/logstash-plugin install "logstash-output-application_insights"

If the above does not work, or if you would like to patch the code, here are two ways to install this plugin within your Logstash:

Option 1: Run in a local Logstash clone

  • Edit Logstash Gemfile and add the logstash-output-application-insights plugin path:
gem "logstash-output-application-insights", :path => "/../logstash-output-application-insights"
  • Install the plugin from the Logstash home
bin/logstash-plugin install --no-verify

Option 2: Run in an installed Logstash

  • Build your plugin gem
gem build logstash-output-application-insights.gemspec
  • Install the built gem from the Logstash home
bin/logstash-plugin install logstash-output-application-insights-<version>.gem

Configuration parameters

storage_account_name_key

Array of pairs of storage_account_name and an array of access_keys. No default. At least one pair is required. If not defined, values will be taken (if they exist) from the environment variables AZURE_STORAGE_ACCOUNT and AZURE_STORAGE_ACCESS_KEY. Examples:

storage_account_name_key => [ "my-storage-account", "pfrYTwPgKyYNfKBY2QdF+v5sbgx8/eAQp+FFkGpPBnkMDE1k+ZNK3r3qIPqqw8UsOIUqaF3dXBdPDouGJuxNXQ==" ]

storage_account_name_key => [ ["my-storage-account1", "key1"], ["my-storage-account2", "key2"], ["my-storage-account3", "key3"] ]

storage_account_name_key => [ ["my-storage-account1", ["key11", "key12"]], ["my-storage-account1", "key2"], ["my-storage-account1", ["key3"]] ]

Note: the storage account must be of "General purpose" kind.
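The accepted forms above all describe a list of account/keys pairs. The following sketch (a hypothetical helper, not part of the plugin's actual code) illustrates how the shorthand forms could expand under the semantics described above:

```ruby
# Hypothetical illustration of how the storage_account_name_key forms
# described above normalize to [account_name, [keys]] pairs.
def normalize_account_keys(value)
  # a flat ["account", "key"] pair is shorthand for a single account
  value = [value] unless value[0].is_a?(Array)
  value.map do |account, keys|
    # a bare key string is shorthand for an array with one key
    [account, keys.is_a?(Array) ? keys : [keys]]
  end
end

normalize_account_keys(["my-storage-account", "key1"])
# => [["my-storage-account", ["key1"]]]
normalize_account_keys([["acct1", ["k11", "k12"]], ["acct2", "k2"]])
# => [["acct1", ["k11", "k12"]], ["acct2", ["k2"]]]
```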

azure_storage_table_prefix

A prefix for the Azure storage table names used by this Logstash instance. Default: host name. It is recommended that each Logstash instance have a unique prefix, to avoid confusion and loss of tracking, although sharing tables won't damage proper execution. If not set, the host name is used (non-alphanumeric characters are removed and it is converted to lowercase), if a host name is available. The prefix string may contain only alphanumeric characters, is case sensitive, and must start with a letter. Example:

azure_storage_table_prefix => "myprefix"

azure_storage_container_prefix

A prefix for the Azure storage container names used by this Logstash instance. Default: host name. It is recommended that each Logstash instance have a unique prefix, to avoid confusion and loss of tracking, although sharing containers won't damage proper execution. If not set, the host name is used (non-alphanumeric characters are removed and it is converted to lowercase), if a host name is available. The prefix string may contain only alphanumeric characters and dashes (double dashes are not allowed), and is case insensitive. Example:

azure_storage_container_prefix => "myprefix"

azure_storage_blob_prefix

A prefix for the Azure storage blob names used by this Logstash instance. Default: host name. Each Logstash instance MUST have a unique prefix, to avoid loss of data! If not set, the host name is used (non-alphanumeric characters are removed and it is converted to lowercase), if a host name is available. The string may include only characters that are allowed in any valid URL. Example:

azure_storage_blob_prefix => "myprefix"
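The three prefix rules above differ subtly (tables: alphanumeric, case sensitive, must start with a letter; containers: alphanumeric plus single dashes, case insensitive). A minimal sketch of validators for the first two rules as stated in this README (hypothetical helper names, not part of the plugin):

```ruby
# Hypothetical validators for the prefix rules described above.

# table prefix: alphanumeric only, must start with a letter, case sensitive
def valid_table_prefix?(s)
  !!(s =~ /\A[A-Za-z][A-Za-z0-9]*\z/)
end

# container prefix: alphanumeric and dashes, no double dash, case insensitive
def valid_container_prefix?(s)
  !!(s =~ /\A[a-z0-9]+(-[a-z0-9]+)*\z/i)
end
```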

instrumentation_key

Default Application Insights Analytics instrumentation_key. No default. It is used only when the key is specified neither in the tables property associated with a table_id, nor as a field or metadata field in the event. Example:

instrumentation_key => "5A6714A3-EC7B-4999-AB96-232F1DA92059"

table_id

Default Application Insights Analytics table_id. No default. Used only when it is not specified as a field or metadata field in the event. Example:

table_id => "C24394E1-F077-420E-8A25-EF6FDF045938"

table_columns

Specifies the list of fields to pick from the event; fields not specified are ignored. No default (all event fields). If not specified, all fields in the event are included and their order is kept. The order is essential for CSV serialization. Example:

table_columns => [ "EventLogID", "AppName", "EnvironmentName", "ActivityID", "EventID", "Severity", "Title" ]

case_insensitive_columns

If set to true, event fields are treated as case insensitive. Default: false (case sensitive). Example:

case_insensitive_columns => true

blob_max_bytesize

Advanced, internal, should not be set. Default 4 GB. The Azure storage maximum blob size is 192 GB (= 50,000 blocks × 4 MB). Example:

blob_max_bytesize => 4000000000

blob_max_events

Specifies the maximum number of events in one blob. Default 1,000,000 events. Setting it too low may improve latency but will reduce ingestion performance; setting it too high may hurt latency up to the maximum delay, but ingestion will be more efficient and the load on the network will be lower. Example:

blob_max_events => 1000000

blob_max_delay

Specifies the maximum latency time, in seconds. Default 60 seconds. The latency time is measured from the time an event arrives until it is committed to Azure storage and Application Insights is notified. The total latency time may be higher, as this is not the full ingestion flow. Example:

blob_max_delay => 3600

blob_serialization

Specifies the blob serialization to create. Default "json". Currently 2 types are supported: "csv" and "json". Example:

blob_serialization => "json"

io_retry_delay

Specifies the interval of time, in seconds, between retries after IO failures. Example:

io_retry_delay => 0.5

io_max_retries

Specifies the number of retries on IO failures before giving up and moving to the available alternatives. Example:

io_max_retries => 3

blob_retention_time

Specifies the retention time of the blob in the container after Application Insights Analytics is notified. Default 604,800 seconds (1 week). Once the retention time expires, the blob is deleted from the container. Example:

blob_retention_time => 604800

blob_access_expiry_time

Specifies how long Application Insights Analytics has access to the blobs it is notified about. Default 86,400 seconds (1 day). Blob access is limited with a SAS URL. Example:

blob_access_expiry_time => 86400

csv_default_value

Specifies the string used as the value in a CSV record when the field does not exist in the event. Default "". Example:

csv_default_value => "-"
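To illustrate how csv_default_value interacts with table_columns during CSV serialization, here is a minimal sketch (a hypothetical helper, not the plugin's actual serializer), assuming missing fields are simply replaced by the configured default, in column order:

```ruby
require "csv"

# Hypothetical sketch: serialize an event hash into one CSV line, keeping
# the table_columns order and substituting csv_default_value for any
# field missing from the event.
def event_to_csv(event, table_columns, csv_default_value = "")
  row = table_columns.map { |col| event.fetch(col, csv_default_value) }
  CSV.generate_line(row).chomp
end

event = { "AppName" => "myapp", "Severity" => 2 }
event_to_csv(event, ["AppName", "Severity", "Title"], "-")
# => "myapp,2,-"
```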

serialized_event_field

Specifies a serialized event field name; if it exists in the current event, its value is taken as-is as the serialized event. No default. Example:

serialized_event_field => "serializedMessage"

logger_level

Specifies the log level. Valid values are: DEBUG, INFO, WARN, ERROR, FATAL, UNKNOWN. Default "INFO". Example:

logger_level => "INFO"

logger_files

Specifies the list of targets for the log; it may include files, devices, "stdout" and "stderr". Default "logstash-output-application-insights.log". Example:

logger_files => [ "c:/logstash/dev/runtime/log/logstash-output-application-insights.log", "stdout" ]

logger_progname

Specifies the program name that will be displayed in each log record. Default "AI". Should be modified only if another plugin uses the same program name. Example:

logger_progname => "MSAI"

logger_shift_size

Specifies the maximum logfile size. No default (no size limit). Only applies when shift age is a number. Not supported on Windows! Example (1 MB):

logger_shift_size => 1048576

logger_shift_age

Specifies the number of old logfiles to keep, or the frequency of rotation (daily, weekly or monthly). No default (never). Not supported on Windows! Examples:

logger_shift_age => "weekly"
logger_shift_age => 5

resurrect_delay

Specifies the time interval, in seconds, between tests that check whether a storage account came back to life after it stopped responding. Default 10 seconds. Example (half second):

resurrect_delay => 0.5

flow_control_suspend_bytes

Specifies the high water mark for the flow control, which is used to avoid an out-of-memory crash. Default 52,428,800 bytes (50 MB). Once memory consumption reaches the high water mark, the plugin stops accepting events until memory is below the low water mark. Example (200 MB):

flow_control_suspend_bytes => 209715200

flow_control_resume_bytes

Specifies the low water mark for the flow control, which is used to avoid an out-of-memory crash. Default 41,820,160 bytes (40 MB). Once memory consumption reaches the high water mark, the plugin stops accepting events until memory is below the low water mark. Example (10 MB):

flow_control_resume_bytes => 10455040

flow_control_delay

Specifies the amount of time the flow control suspends receiving events. Default 1 second. It allows GC and flushing of events to Azure storage before checking whether memory is below the low water mark. Example (half second):

flow_control_delay => 0.5
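The three flow-control parameters above work together: intake is suspended at the high water mark, resumed below the low water mark, with a pause between memory checks. A combined configuration fragment using the default values described above (the instrumentation_key and table_id placeholders are illustrative) might look like this:

```
output {
  application_insights {
    instrumentation_key => "<your-instrumentation-key>"
    table_id => "<your-table-id>"
    # suspend event intake above the high water mark (50 MB)
    flow_control_suspend_bytes => 52428800
    # resume once memory drops below the low water mark (40 MB)
    flow_control_resume_bytes => 41820160
    # wait between memory checks to allow GC and flushes
    flow_control_delay => 1
  }
}
```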

ca_file

File path of the CA file, required only if you have issues with SSL (see OpenSSL). No default. Example:

ca_file => "/path/to/cafile.crt"

enable_telemetry_to_microsoft

When set to true, telemetry about the plugin will be sent to Microsoft. Default false. Set it to true only if you want to allow Microsoft to collect telemetry data about this process. Example:

enable_telemetry_to_microsoft => true

disable_cleanup

When set to true, storage cleanup won't be done by the plugin (it should then be done by some other means, or by another Logstash process that has this flag disabled). Default false. Example:

disable_cleanup => true

disable_compression

When set to true, blobs won't be compressed (beware: it will require more storage, more memory and more bandwidth) Default false example:

disable_compression => true

delete_not_notified_blobs

When set to true, not-notified blobs are deleted; if not set, they are copied to the orphan-blobs container. Default false. Example:

delete_not_notified_blobs => true

validate_notification

When set to true, access to Application Insights will be validated at initialization; if validation fails, the Logstash process will abort. Default false. Example:

validate_notification => true

validate_storage

When set to true, access to Azure storage for each of the configured accounts will be validated at initialization; if validation fails, the Logstash process will abort. Default false. Example:

validate_storage => true

save_notified_blobs_records

When set to true, notified blob records are saved in the state table as long as the blobs are retained in their containers. Default false. Used for troubleshooting. Example:

save_notified_blobs_records => true

disable_notification

When set to true, notification is not sent to Application Insights, but the plugin behaves as if it were. Default false. Used for troubleshooting. Example:

disable_notification => true

disable_blob_upload

When set to true, events are not uploaded and blobs are not committed, but the plugin behaves as if they were uploaded and committed. Default false. Used for troubleshooting. Example:

disable_blob_upload => true

disable_truncation

When set to true, event fields won't be truncated to a maximum of 1 MB (beware: the maximum allowed size per field is 1 MB, so setting this to true just wastes bandwidth and storage). Default false. Used for troubleshooting. Example:

disable_truncation => true

stop_on_unknown_io_errors

When set to true, the process will stop if an unknown IO error is detected. Default false. Used for troubleshooting. Example:

stop_on_unknown_io_errors => true

azure_storage_host_suffix

When set, an alternative storage service will be used. Default "core.windows.net". Example:

azure_storage_host_suffix => "core.windows.net"

application_insights_endpoint

When set, blob-ready notifications are sent to an alternative endpoint. Default "https://dc.services.visualstudio.com/v2/track". Example:

application_insights_endpoint => "https://dc.services.visualstudio.com/v2/track"

notification_version

Advanced, internal, should not be set, the only current valid value is 1. example:

notification_version => 1

tables

Allows support for multiple tables, each configured with its own parameters, with the global parameters used as defaults. It is only required if the plugin needs to support multiple tables. tables is a hash, where each key is a table_id and each value is a hash of table-specific properties whose default values are the global properties. The specific properties are: instrumentation_key, table_columns, blob_max_delay, csv_default_value, serialized_event_field, blob_serialization, csv_separator. Template:

tables => { "table_id1" => { properties } "table_id2" => { properties } }

Examples:

tables => { "6f29a89e-1385-4317-85af-3ac1cea48058" => { "instrumentation_key" => "76c3b8e9-dfc6-4afd-8d4c-3b02fdadb19f", "blob_max_delay" => 60 } }
tables => { "6f29a89e-1385-4317-85af-3ac1cea48058" => { "instrumentation_key" => "76c3b8e9-dfc6-4afd-8d4c-3b02fdadb19f", "blob_max_delay" => 60 }
            "2e1b46aa-56d2-4e13-a742-d0db516d66fc" => { "instrumentation_key" => "76c3b8e9-dfc6-4afd-8d4c-3b02fdadb19f", "blob_max_delay" => 120 "ext" => "csv" "serialized_event_field" => "message" } 
          }

Environment variables

AZURE_STORAGE_ACCOUNT

Specifies the Azure storage account name. The plugin uses it to set the account name part of the storage_account_name_key property if that part is missing. Example:

AZURE_STORAGE_ACCOUNT="my-storage-account"

AZURE_STORAGE_ACCESS_KEY

Specifies the Azure storage account access key. The plugin uses it to set the key part of the storage_account_name_key property if that part is missing. Example:

AZURE_STORAGE_ACCESS_KEY="pfrYTwPgKyYNfKBY2QdF+v5sbgx8/eAQp+FFkGpPBnkMDE1k+ZNK3r3qIPqqw8UsOIUqaF3dXBdPDouGJuxNXQ=="

Setting up Http/Https Proxy

If you use a proxy server or firewall, you may need to set the HTTP_PROXY and/or HTTPS_PROXY environment variables in order to access Azure storage and Application Insights. Examples:

HTTP_PROXY=http://proxy.example.org
HTTPS_PROXY=https://proxy.example.org
  • If the proxy server requires a user name and password, include them in the following form:
HTTP_PROXY=http://username:password@proxy.example.org
  • If the proxy server uses a port other than 80, include the port number:
HTTP_PROXY=http://username:password@proxy.example.org:8080

Setting up SSL certificates

When using SSL/HTTPS, login or authentication typically requires a Certificate Authority (CA) certificate. If the required certificate is not already bundled in the system, it may be configured in the plugin (see ca_file above). Example:

ca_file => "/path/to/cafile.crt"

Getting Started for Contributors

If you would like to become an active contributor to this project, please follow the instructions provided in CONTRIBUTING.md and DEVELOPER.md.

Provide Feedback

If you encounter any bugs with the library please file an issue in the Issues section of the project.

Code of Conduct

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.