## Logstash Installation

### Installing Logstash

##### By now we have a system with Elasticsearch, Kibana, and Nginx installed on it, allowing us to store data, visualize it, and secure it through a firewall. Nevertheless, as we collect data, you might want to pre-process it as it arrives at your node.

##### Logstash is an open source, server-side data processing pipeline that ingests data from a multitude of sources simultaneously, transforms it, and then sends it to your favorite "stash."

##### Logstash needs to be provided with a data input, filters, and a data output. Filters process the data and create new fields using the many filter plugins available for Logstash. As an example, GROK allows us to derive structure from unstructured data.
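
##### To make "deriving structure from unstructured data" concrete, here is a rough shell analogue of what a GROK pattern does. It is only a sketch: the log line is made up, the regular expression is our own, and Logstash itself uses named GROK patterns rather than `sed`.

```shell
# A sample unstructured syslog line (made up for illustration).
line='Mar  3 10:15:01 web01 sshd[1234]: Failed password for root from 203.0.113.7 port 22 ssh2'

# One extended regular expression with five capture groups:
# timestamp, host, program, pid, message.
pattern='^([A-Z][a-z]{2} +[0-9]+ [0-9:]{8}) ([^ ]+) ([[:alnum:]_-]+)\[([0-9]+)\]: (.*)$'

# Print capture group number $1 of the sample line.
field() {
  printf '%s\n' "$line" | sed -E "s/$pattern/\\$1/"
}

timestamp=$(field 1)
host=$(field 2)
program=$(field 3)
pid=$(field 4)
message=$(field 5)

# The single string is now a set of named fields, much like a parsed event.
printf 'timestamp=%s\nhost=%s\nprogram=%s\npid=%s\nmessage=%s\n' \
  "$timestamp" "$host" "$program" "$pid" "$message"
```

##### Each capture group plays the role of one named field in a GROK expression; the filter configuration shown later in this notebook does the same job with reusable pattern names.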



In [None]:
%%bash
sudo apt install -y openjdk-8-jdk
sudo apt install -y logstash

### Validate Installation

##### We will now check the version of Java that was just installed as well as the status of the logstash service.

In [None]:
%%bash
java -version
sudo systemctl status logstash
ps -aux | grep logstash
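
##### Note that `java -version` writes its report to standard error, and that Java 8 identifies itself with the legacy "1.8" numbering. The sketch below shows one way to capture and normalize that value in a script; the helper name `java_major` is our own and is not part of any tool.

```shell
# Print the major version for strings such as "1.8.0_292" or "11.0.2".
java_major() {
  version=$1
  case $version in
    1.*) version=${version#1.} ;;   # legacy "1.x" scheme used up to Java 8
  esac
  # Keep everything before the first ".", "_" or "-".
  printf '%s\n' "${version%%[._-]*}"
}

# Only query the JVM if one is installed; 2>&1 is needed because the
# version report goes to standard error.
if command -v java >/dev/null 2>&1; then
  raw=$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
  echo "Detected Java major version: $(java_major "$raw")"
fi

# The helper can also be exercised on literal strings.
java_major 1.8.0_292
java_major 11.0.2
```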

### Securing Communications using Transport Layer Security (TLS) 

##### Since we will be using multiple Beats to ship logs from client servers to our Elastic Stack node, we need to create an SSL certificate and a key pair. This ensures that the logs you are shipping, which may contain sensitive data, are encrypted before they arrive at a central location. The certificate will be used by every Beat to verify the identity of the Elastic Stack node.

##### A certificate is a file containing public information, including a public key. During the initial handshake, data encrypted with that public key can only be decrypted by the party holding the matching private key. Verifying the certificate is important, since the handshake between client and server requires the client to trust the authority that signed the server's certificate. In our case the certificate is self-signed with the generated private key, so each client (Beat) can trust the Logstash node simply by being given a copy of that same certificate.

##### First, we will start by creating two directories where we will be storing our certificate and private key using the following commands:

In [None]:
%%bash
sudo mkdir -p /etc/pki/tls/certs
sudo mkdir -p /etc/pki/tls/private

##### Next we will add our server's private IP address to the subjectAltName (SAN) field of the SSL certificate that we are about to create, by adding this information to the OpenSSL configuration file under the [ v3_ca ] section:

In [None]:
%%bash
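# Read the IPv4 address assigned to the ens3 interface.
temp_ens3=$(ip addr show ens3 | grep -Po 'inet \K[\d.]+')
# Insert the SAN entry at line 254, which is expected to fall inside the [ v3_ca ] section of the stock file.
sudo sed -i "254isubjectAltName = IP: $temp_ens3" /etc/ssl/openssl.cnf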
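
##### The line-number insert used in this notebook only works while line 254 of openssl.cnf sits inside the [ v3_ca ] section. As a hedged alternative, the sketch below anchors the insert on the section header instead. It works on a scratch file with made-up contents and a documentation-range IP address, so nothing under /etc is modified.

```shell
# Work on a scratch file so the real /etc/ssl/openssl.cnf is untouched.
scratch=$(mktemp)
cat > "$scratch" <<'EOF'
[ req ]
distinguished_name = req_distinguished_name

[ v3_ca ]
subjectKeyIdentifier = hash
basicConstraints = critical,CA:true
EOF

san_ip='192.0.2.10'   # hypothetical address; the notebook reads it from ens3

# Copy every line, and print the SAN entry right after the [ v3_ca ] header.
patched=$(mktemp)
awk -v san="subjectAltName = IP:$san_ip" '
  { print }
  /^\[ *v3_ca *\]/ { print san }
' "$scratch" > "$patched"

cat "$patched"
```

##### Matching on the header keeps the edit correct even if a package update shifts the line numbers of the file.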

##### Now we will proceed to create our certificate and key with the following command:

In [None]:
%%bash
sudo openssl req -config /etc/ssl/openssl.cnf -x509 -days 3650 -batch -nodes -newkey rsa:2048 -keyout /etc/pki/tls/private/logstash-forwarder.key -out /etc/pki/tls/certs/logstash-forwarder.crt
    

##### We can validate that the key and certificate were generated by listing the contents of the target directories. The "logstash-forwarder.crt" file must then be copied to every client server that ships information to Logstash using a Beat. In our case, since we are deploying an all-in-one node, there is no need to copy this file anywhere. We only need to point our Beats at this certificate so that they can verify the Logstash node, and configure Logstash to use the certificate and key to identify itself to the Beats.

In [None]:
%%bash

ls /etc/pki/tls/private/

ls /etc/pki/tls/certs/
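
##### Listing the files shows that they exist, but not that they belong together. The sketch below illustrates two further checks on a throwaway certificate created in a temporary directory: that the subjectAltName carries the IP address, and that the certificate's public key matches the private key. The file names, the CN, and the IP address are assumptions for the demo, and the `-addext` option requires OpenSSL 1.1.1 or later; the same two inspection commands can be pointed at the real files with sudo.

```shell
workdir=$(mktemp -d)
key="$workdir/demo.key"
crt="$workdir/demo.crt"
status=skipped   # stays "skipped" if openssl is missing or too old

# Create a throwaway self-signed certificate with an IP subjectAltName.
if command -v openssl >/dev/null 2>&1 &&
   openssl req -x509 -newkey rsa:2048 -nodes -days 1 -batch \
     -subj '/CN=logstash-demo' \
     -addext 'subjectAltName = IP:192.0.2.10' \
     -keyout "$key" -out "$crt" 2>/dev/null
then
  # Check 1: show the SAN extension recorded in the certificate.
  openssl x509 -in "$crt" -noout -text | grep -A1 'Subject Alternative Name'

  # Check 2: the public key in the certificate must equal the one
  # derived from the private key.
  cert_pub=$(openssl x509 -in "$crt" -noout -pubkey)
  key_pub=$(openssl pkey -in "$key" -pubout)
  if [ "$cert_pub" = "$key_pub" ]; then
    status=match
  else
    status=mismatch
  fi
fi

echo "key/certificate check: $status"
```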

### Configuring Logstash

##### Next we will copy the configuration files for logstash that we downloaded from the repository into logstash's configuration directory "/etc/logstash/conf.d". There are 3 configuration files -- namely, input, filters, output.

* input: This configuration file defines the port on which Logstash listens for incoming data.
* filters: In this file we set the filters we want to use to pre-process incoming data using the installed filter plugins. In our case, we will be using GROK to parse the data and derive structure from unstructured data, and geoip to obtain geographical information from an IP address.
* output: In this configuration file we define the output to which Logstash will send the processed data. In our case we will be sending it to Elasticsearch.
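
##### As a rough picture of how the three files fit together, the sketch below writes a minimal pipeline into a temporary directory. The file names match the ones used in this notebook, but the contents are assumptions for illustration (including the hypothetical `src_ip` field and the `localhost:9200` host) and are not the files shipped in the repository.

```shell
conf_dir=$(mktemp -d)

# Input: listen for Beats connections.
cat > "$conf_dir/02-beats-input.conf" <<'EOF'
input {
  beats {
    port => 5044
  }
}
EOF

# Filters: parse syslog lines with GROK, then enrich an IP field with geoip.
cat > "$conf_dir/10-syslog-filter.conf" <<'EOF'
filter {
  grok {
    match => { "message" => "%{SYSLOGLINE}" }
  }
  geoip {
    source => "src_ip"
  }
}
EOF

# Output: send processed events to Elasticsearch, one index per Beat and day.
cat > "$conf_dir/30-elasticsearch-output.conf" <<'EOF'
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
  }
}
EOF

ls "$conf_dir"
```

##### When Logstash is pointed at a directory it reads the files in alphabetical order, which is why the numeric prefixes place the input first, the filters second, and the output last.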

In [None]:
%%bash
sudo cp /home/ubuntu/ml_and_big_data_in_cloud_environmnets/files/logstash_conf/* /etc/logstash/conf.d/

### Input Configuration

##### In this configuration file we define the port on which Logstash listens for data from the different data shippers (Beats). Let us check our current configuration file:

In [None]:
%%bash
sudo cat /etc/logstash/conf.d/02-beats-input.conf

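##### As we can observe, port 5044 is the default port on which Logstash listens for data from data shippers. Nevertheless, as we want this traffic to be encrypted, we will now add the SSL settings, pointing to the certificate and key created earlier, to the input configuration: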

In [None]:
%%bash
sudo sed -i "4i\ \ \ \ ssl => true" /etc/logstash/conf.d/02-beats-input.conf
sudo sed -i '5i\ \ \ \ ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"' /etc/logstash/conf.d/02-beats-input.conf
sudo sed -i '6i\ \ \ \ ssl_key => "/etc/pki/tls/private/logstash-forwarder.key"' /etc/logstash/conf.d/02-beats-input.conf
sudo sed -i "7i\ \ \ \ ssl_verify_mode => none" /etc/logstash/conf.d/02-beats-input.conf

In [None]:
%%bash
sudo cat /etc/logstash/conf.d/02-beats-input.conf

### Filters Configuration

##### This file is used to set all the pre-processing we want to perform on the received data. In our case we use GROK, a plugin for parsing unstructured log data into something structured and queryable. Let us take a look at our current filters:

In [None]:
%%bash
sudo cat /etc/logstash/conf.d/10-syslog-filter.conf

### Output Configuration

##### In this configuration file we define the destination of the pre-processed data, the index in which the data will be stored, and the credentials of the "logstash_internal" user that Logstash will use to write the received data to Elasticsearch.

##### Note: "logstash_system" is a native user that is used only when Logstash monitoring is enabled. Logstash monitoring provides information about the node hosting Logstash and about the pipeline itself. The "logstash_internal" user is created in Elasticsearch for receiving information from Logstash. We could also use the "elastic" user for this purpose; nevertheless, this must be avoided since it is a superuser with all the rights to change any data and configuration in Elasticsearch.

##### Let us list the current users and roles registered in elasticsearch. Notice that native users are included in this list.

In [None]:
%%bash
sudo apt install -y jq
# Quote the URLs so that the shell does not try to interpret the "?" character.
curl -u elastic:elasticsiem -XGET '<ELASTICSEARCH-IP-ADDRESS>:9200/_xpack/security/role/?pretty' | jq -r 'keys'
curl -u elastic:elasticsiem -XGET '<ELASTICSEARCH-IP-ADDRESS>:9200/_xpack/security/user/?pretty' | jq -r 'keys'
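
##### The Elasticsearch requests in this notebook repeat the same host and credentials on every line. A small wrapper, sketched below, is one way to keep them in a single place. The function name `es_request`, the `ES_URL` and `ES_AUTH` variables, the `DRY_RUN` switch, and the `localhost:9200` address are all our own assumptions; with `DRY_RUN` set, the function only prints the request it would send, so the sketch runs without a cluster.

```shell
ES_URL='http://localhost:9200'   # assumption: replace with your Elasticsearch node
ES_AUTH='elastic:elasticsiem'    # credentials used throughout this notebook

# es_request METHOD PATH [JSON_BODY]
es_request() {
  method=$1
  path=$2
  body=${3:-}
  if [ -n "${DRY_RUN:-}" ]; then
    # Dry run: show the request instead of sending it.
    printf '%s %s%s\n' "$method" "$ES_URL" "$path"
    return 0
  fi
  if [ -n "$body" ]; then
    curl -s -u "$ES_AUTH" -H 'Content-Type: application/json' \
      -X "$method" "$ES_URL$path" -d "$body"
  else
    curl -s -u "$ES_AUTH" -X "$method" "$ES_URL$path"
  fi
}

DRY_RUN=1
es_request GET '/_xpack/security/role/?pretty'
es_request GET '/_xpack/security/user/?pretty'
```

##### Clearing `DRY_RUN` (for example with `unset DRY_RUN`) makes the same calls go to the cluster, and the output can be piped to `jq` exactly as in the cell above.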

### Create "logstash_writer" role

##### Before creating the "logstash_internal" user we need to define a role that has permission to write to the Logstash indices in Elasticsearch. We will use the API request shown below to perform this action. Note that the role is defined with only a handful of cluster and index privileges.

In [None]:
%%bash
curl -u elastic:elasticsiem -H "Content-Type: application/json" -XPOST <ELASTICSEARCH-IP-ADDRESS>:9200/_xpack/security/role/logstash_writer -d '{
  "cluster": ["manage_index_templates", "monitor", "manage_ilm"], 
  "indices": [
    {
      "names": [ "logstash-*", "filebeat-*", "packetbeat-*", "metricbeat-*", "auditbeat-*", "heartbeat-*", "csv-*" ], 
      "privileges": ["write","delete","create_index","manage","manage_ilm"]  
    }
  ]
}'
#curl -u elastic:elasticsiem -H "Content-Type: application/json" -XPOST <ELASTICSEARCH-IP-ADDRESS>:9200/_xpack/security/role/logstash_writer -d '{"cluster": ["manage_index_templates", "monitor", "manage_ilm"], "indices": [{"names": [ "logstash-*" ], "privileges": ["write","delete","create_index","manage","manage_ilm"]}]}'
curl -u elastic:elasticsiem -XGET '<ELASTICSEARCH-IP-ADDRESS>:9200/_xpack/security/role/logstash_writer?pretty'

### Create "logstash_internal" user
##### Now that we have created this role, we can proceed to create a new user in Elasticsearch named "logstash_internal" and assign it the "logstash_writer" role we just created.

In [None]:
%%bash
curl -u elastic:elasticsiem -XGET '<ELASTICSEARCH-IP-ADDRESS>:9200/_xpack/security/user/?pretty' | jq -r 'keys'

In [None]:
%%bash

curl -u elastic:elasticsiem -H "Content-Type: application/json" -XPOST <ELASTICSEARCH-IP-ADDRESS>:9200/_xpack/security/user/logstash_internal -d '{
  "password" : "elasticsiem",
  "roles" : [ "logstash_writer"],
  "full_name" : "Internal Logstash User"
}'

### Check the "logstash_internal" user policy

In [None]:
%%bash
curl -u elastic:elasticsiem -XGET '<ELASTICSEARCH-IP-ADDRESS>:9200/_xpack/security/user/logstash_internal?pretty'

### Saving credentials of logstash_internal user using Logstash Keystore

##### To connect Logstash to Elasticsearch, we need to provide the credentials of the logstash_internal user. Since we could have one or many systems (under the control of external users) shipping information to Elasticsearch, we want to keep those credentials from being visible to everyone. To achieve this, we will store the credentials on the node hosting Logstash using logstash-keystore. In your terminal, type the following commands:

~~~
set +o history
export LOGSTASH_KEYSTORE_PASS=mypassword
set -o history
sudo -E /usr/share/logstash/bin/logstash-keystore --path.settings /etc/logstash create
sudo -E /usr/share/logstash/bin/logstash-keystore --path.settings /etc/logstash add ES_USER
sudo -E /usr/share/logstash/bin/logstash-keystore --path.settings /etc/logstash add ES_PWD
~~~

##### Now let us update the output configuration file so that it references the variables saved in logstash-keystore, as shown below:

In [None]:
%%bash
sudo sed -i '6i\ \ \ \ user => "${ES_USER}"' /etc/logstash/conf.d/30-elasticsearch-output.conf 
sudo sed -i '7i\ \ \ \ password => "${ES_PWD}"' /etc/logstash/conf.d/30-elasticsearch-output.conf
sudo cat /etc/logstash/conf.d/30-elasticsearch-output.conf

### Test your Logstash Configuration

##### In your terminal, type the command below to test all the setup we have configured in logstash.

~~~
sudo -E /usr/share/logstash/bin/logstash --path.settings /etc/logstash -f /etc/logstash/conf.d/ -t
~~~

### Enable Logstash Monitoring

We have now completed our configuration settings for Logstash; nevertheless, we will not have visibility into its functionality unless we enable monitoring. To enable monitoring we need to set the following X-Pack monitoring configuration:

In [None]:
%%bash

sudo sed -i '219i xpack.monitoring.enabled: true' /etc/logstash/logstash.yml
sudo sed -i '220i xpack.monitoring.elasticsearch.username: logstash_system' /etc/logstash/logstash.yml
sudo sed -i '221i xpack.monitoring.elasticsearch.password: elasticsiem' /etc/logstash/logstash.yml
sudo sed -i '222i xpack.monitoring.elasticsearch.hosts: ["<ELASTICSEARCH-IP-ADDRESS>:9200"]' /etc/logstash/logstash.yml
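
##### The inserts above depend on fixed line numbers and add the settings again each time the cell is re-run. As a hedged alternative, the sketch below appends each setting only if its key is not already present. It operates on a scratch file standing in for /etc/logstash/logstash.yml; the helper names `set_yaml` and `apply_monitoring` and the `localhost:9200` host are our own assumptions.

```shell
# Scratch stand-in for /etc/logstash/logstash.yml.
settings=$(mktemp)
printf '%s\n' '# scratch copy of logstash.yml' > "$settings"

# Append "key: value" only if no line already starts with that key.
# (The dots in the key are treated as regex wildcards, which is close
# enough for this check.)
set_yaml() {
  key=$1
  value=$2
  if ! grep -q "^$key:" "$settings"; then
    printf '%s: %s\n' "$key" "$value" >> "$settings"
  fi
}

apply_monitoring() {
  set_yaml xpack.monitoring.enabled true
  set_yaml xpack.monitoring.elasticsearch.username logstash_system
  set_yaml xpack.monitoring.elasticsearch.password elasticsiem
  set_yaml xpack.monitoring.elasticsearch.hosts '["localhost:9200"]'
}

apply_monitoring
apply_monitoring   # the second run adds nothing
cat "$settings"
```

##### Because these are flat top-level YAML keys, their position within the file does not matter, so appending them at the end is equivalent to inserting them at a particular line.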

### Run logstash

~~~
sudo -E /usr/share/logstash/bin/logstash --path.settings /etc/logstash -f /etc/logstash/conf.d/
sudo systemctl enable logstash
~~~

##### If you need to debug Logstash, you can use the following:

~~~
sudo -E /usr/share/logstash/bin/logstash --path.settings /etc/logstash -f /etc/logstash/conf.d/ --log.level=debug --pipeline.unsafe_shutdown
~~~

### Listing Elasticsearch Templates for Input Data

In [None]:
%%bash
curl -u elastic:elasticsiem -XGET '<ELASTICSEARCH-IP-ADDRESS>:9200/_template?filter_path=*.order&pretty'
curl -u elastic:elasticsiem -XGET '<ELASTICSEARCH-IP-ADDRESS>:9200/_template?filter_path=*.version&pretty'