# What you'll learn

After watching this video, you'll be able to:
* Identify the major reasons to store log data.
* Recognize best practices for storing logs.

# Store log data

![image.png](attachment:3255dc0a-27ef-441c-ba05-c910b8cb7fe1.png)

* Monitoring and analyzing logs enhance the observability of the network, providing transparency and visibility into the cloud computing environment.
* Observability is not the end goal in itself; it should be viewed as a means to accomplish real business objectives.
* This observability can be achieved by storing log data.

# Why store log data?

There are many reasons for storing log data.

Let's explore these reasons.

![image.png](attachment:9c0e5281-3668-4c70-81c1-ecbe1bb7d602.png)

Firstly, log data helps to improve the reliability of the systems.
* Log files contain information on system performance, which helps in determining the need for increased capacity to enhance the user experience.
* You can utilize log files to identify sluggish queries, errors that prolong transactions, or bugs that affect the application's performance.
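
As a rough illustration of spotting sluggish queries in stored logs, the sketch below scans log text for entries slower than a threshold. The log format, the sample entries, and the names `find_slow_queries` and `threshold_ms` are assumptions made for this example, not part of any particular logging product.

```python
import re

# Hypothetical log format: "<timestamp> <LEVEL> query=<name> duration_ms=<int>"
SAMPLE_LOG = """\
2024-05-01T10:00:00Z INFO query=list_orders duration_ms=120
2024-05-01T10:00:02Z INFO query=search_products duration_ms=2350
2024-05-01T10:00:03Z ERROR query=checkout duration_ms=5100
2024-05-01T10:00:05Z INFO query=get_profile duration_ms=45
"""

LINE_PATTERN = re.compile(r"^(\S+) (\w+) query=(\w+) duration_ms=(\d+)$")


def find_slow_queries(log_text, threshold_ms=1000):
    """Return (timestamp, query, duration_ms) for entries slower than threshold_ms."""
    slow = []
    for line in log_text.splitlines():
        match = LINE_PATTERN.match(line.strip())
        if match is None:
            continue  # skip lines that do not follow the assumed format
        timestamp, _level, query, duration = match.groups()
        if int(duration) > threshold_ms:
            slow.append((timestamp, query, int(duration)))
    return slow


for timestamp, query, duration in find_slow_queries(SAMPLE_LOG):
    print(f"{timestamp} {query} took {duration} ms")
```

The same approach only works if the logs are still available, which is why retention matters for performance investigations.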

![image.png](attachment:180d5c9f-3f76-41ce-a60b-705f5c495062.png)

Another reason to store log data is to maintain the security posture of the environment.
* Log files record events like failed login attempts, authentication failures, or unexpected server overloads, which can signal a potential ongoing cyberattack to analysts.
* Advanced security monitoring tools can promptly send alerts and automate responses upon detecting these events on the network.
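
To make the idea of detecting suspicious events concrete, here is a minimal sketch that counts failed logins per source address and flags the noisy ones. The event tuples, the `login_failed` label, and the `max_failures` threshold are hypothetical; real monitoring tools apply far richer rules.

```python
from collections import Counter

# Hypothetical authentication events: (source_ip, event)
AUTH_EVENTS = [
    ("203.0.113.7", "login_failed"),
    ("203.0.113.7", "login_failed"),
    ("198.51.100.4", "login_ok"),
    ("203.0.113.7", "login_failed"),
    ("198.51.100.4", "login_failed"),
]


def suspicious_sources(events, max_failures=2):
    """Return source IPs with more than max_failures failed logins, sorted."""
    failures = Counter(ip for ip, event in events if event == "login_failed")
    return sorted(ip for ip, count in failures.items() if count > max_failures)


print(suspicious_sources(AUTH_EVENTS))
```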

![image.png](attachment:e9ed8ddd-6575-417d-9879-676edfff9df1.png)

The next reason to store log data is to improve decision making about IT systems.
* The user's behavior with an application is recorded and stored in log files.
* This leads to an area of inquiry known as user behavior analytics.
* By analyzing user actions, developers can refine the application so that users reach their goals faster, which improves customer satisfaction and drives revenue.

![image.png](attachment:5b67aba8-ae80-43da-bfaa-b8795b77974d.png)

Stored logs can also be used for auditing purposes.
* Log messages capture important application events as well as management and financial information.
* While these may not offer daily advantages, they are crucial for meeting business requirements.


# Retention period for log data

![image.png](attachment:482029a6-6502-4d88-ad94-91c20edfa75c.png)

* To gain insight into an app's behavior and performance changes over time, it is necessary to visualize log data spanning days or weeks.
* This enables the identification of patterns and trends.
* Regarding log retention policies, complying with audit requirements and related rules often requires keeping log data for extended periods, even years.
* In such cases, prematurely purging old logs can result in significant consequences.
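
The effect of a retention period can be sketched as a simple date comparison: anything older than the cutoff is eligible for purging. The function name `expired_logs` and the sample dates below are illustrative assumptions.

```python
from datetime import date, timedelta


def expired_logs(log_dates, retention_days, today):
    """Return the log dates that fall outside the retention window."""
    cutoff = today - timedelta(days=retention_days)
    return [d for d in log_dates if d < cutoff]


today = date(2024, 6, 30)
log_dates = [date(2023, 6, 1), date(2024, 1, 15), date(2024, 6, 29)]

# A two-week policy would purge logs that a one-year audit policy must keep.
print(expired_logs(log_dates, retention_days=14, today=today))
print(expired_logs(log_dates, retention_days=365, today=today))
```

Running both policies against the same dates shows how a short window would discard data that an audit-driven policy still needs.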

# Storing logs in cloud

![image.png](attachment:d940b036-91e3-4710-a23a-95d303358d54.png)

* Storing logs in the cloud enables scalable storage capacity that aligns with log data needs without sacrificing quick accessibility.
* Growth in storage capacity or log volume should not noticeably slow retrieval.
* Storage services like Amazon S3 offer the ability to securely store log data in the cloud by employing AES-256 encryption for data at rest.
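
As a hedged sketch of requesting AES-256 encryption at rest, the snippet below builds the arguments for an S3 upload with the `ServerSideEncryption` parameter set to `AES256`. The helper `build_encrypted_put_request`, the bucket name, and the object key are hypothetical; the snippet only builds the request so that it runs without AWS credentials.

```python
def build_encrypted_put_request(bucket, key, body):
    """Build keyword arguments for an S3 PutObject call using AES-256 encryption."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        # "AES256" asks S3 to encrypt the object at rest with S3-managed keys.
        "ServerSideEncryption": "AES256",
    }


request = build_encrypted_put_request(
    bucket="example-log-archive",      # hypothetical bucket name
    key="app/2024/05/01/app.log.gz",   # hypothetical object key
    body=b"compressed log bytes",
)
print(request["ServerSideEncryption"])

# With boto3 installed and AWS credentials configured, the upload would be:
#   import boto3
#   boto3.client("s3").put_object(**request)
```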

# Long-term log storage

![image.png](attachment:da884667-57d7-47b2-a735-bf06f61cb716.png)

* Collecting logs provides the capability for long-term storage and retention.
* Many compliance mandates have log storage and retention requirements, making it crucial to include this in your log collection strategy.
* Generally, it's recommended to store log data for a minimum of one year to facilitate future investigations if needed.

# Back up log data

![image.png](attachment:ce6f0539-f3f0-4e36-a742-e6ad5b985200.png)

* When storing logs, you can choose to back up data to on-premises servers or to the cloud.
* This decision is often associated with the company's digital transformation and the migration of resources to an online environment.

# Dimension for log retention: Criticality

For log retention policies, let us consider the analytical dimensions that give a relative idea of how long the retention period should be.

![image.png](attachment:4555da79-9126-479f-a790-849bdbece6e8.png)

The first aspect to consider is criticality.
* Retention policies can vary for different parts of the system based on their importance.
* Critical components can have longer retention periods for added assurance.
* For example, logs from low-value services might be purged after just two days.

# Dimension for log retention: Security

![image.png](attachment:6ed72a75-a3e0-48ab-b3ef-c40c675e7de6.png)

The second dimension to consider for log retention policies is security.
* Applications that involve sensitive or personal data and high-risk processing should have an extended retention policy.
* Examples include services responsible for credit card authorization and user authentication.
* On the other hand, services that track customer behavior on the app for improvement purposes might not need a longer retention policy.

# Dimension for log retention: Maturity

![image.png](attachment:87776a16-aa1f-42ac-9191-1d89d2fb6835.png)

The next dimension to consider is the maturity of the system.
* In well-established systems with limited ongoing feature development, occurrences of new issues are infrequent.
* So, for mature software systems, a shorter retention policy might be appropriate to reduce storage costs.

# Dimension for log retention: Frequency

![image.png](attachment:df7dcf8e-79c2-4e5e-aa3e-ef0118a6e6d8.png)

Another dimension to consider is frequency.
* Applications that run infrequently, for example once a month, benefit from extended retention policies so that developers can compare multiple executions when tracking down the source of an issue.
* Because executions are so far apart, debugging requires tracing further back in time.
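
One way to reason about this is to size the retention window by the number of past executions you want to keep. The sketch below assumes a job that runs roughly every 31 days; `retention_for_runs` and the choice of six runs are illustrative, not a recommendation.

```python
from datetime import timedelta


def retention_for_runs(run_interval, runs_to_keep):
    """Return a retention window long enough to hold the last runs_to_keep executions."""
    return run_interval * runs_to_keep


monthly_job = timedelta(days=31)
window = retention_for_runs(monthly_job, runs_to_keep=6)
print(f"Keep logs for at least {window.days} days")
```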

# Dimension for log retention: Cost-effectiveness

![image.png](attachment:aac377d7-aa37-4666-b923-af24b9cac78f.png)

Let's also consider the cost-effectiveness dimension for a retention policy.
* You should check that the retention policy fits the project's budget before committing to it.
* Estimate data generation and storage expenses for the intended duration.
* While prioritizing safety and long-term storage may be desirable, opting for a more cost-effective alternative could be reasonable.
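
A back-of-the-envelope estimate can help here: once the retention window is full, the stored volume is roughly the daily log volume multiplied by the retention period. The sketch below uses that approximation; the daily volume and the price per GB-month are made-up figures, so substitute your provider's actual pricing.

```python
def monthly_storage_cost(daily_gb, retention_days, price_per_gb_month):
    """Estimate the steady-state monthly cost of keeping retention_days of logs."""
    stored_gb = daily_gb * retention_days  # volume held once the window is full
    return stored_gb * price_per_gb_month


DAILY_GB = 5                 # hypothetical log volume per day
PRICE_PER_GB_MONTH = 0.02    # hypothetical price, not a real quote

for days in (30, 365):
    cost = monthly_storage_cost(DAILY_GB, days, PRICE_PER_GB_MONTH)
    print(f"{days} days retained: about {cost:.2f} per month")
```

The estimate ignores compression, storage tiers, and request charges, all of which can change the result considerably.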

# Dimension for log retention: Discovery and resolution

![image.png](attachment:c019392b-06f7-400f-a2bd-02b38d0732bc.png)

Finally, you should also consider the discovery and resolution dimension.
* You should monitor the average time it takes your development team to identify and fix problems.
* You should ensure that the log retention policy allows enough time for debugging.
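
This can be turned into a simple lower bound: the retention period should at least cover the average time to discover a problem plus the average time to resolve it, with some margin. The function name, the sample durations, and the `safety_factor` of 2.0 below are assumptions for illustration.

```python
import math
from datetime import timedelta


def minimum_retention_days(mean_time_to_detect, mean_time_to_resolve, safety_factor=2.0):
    """Return a lower bound, in whole days, for the log retention period."""
    total = (mean_time_to_detect + mean_time_to_resolve) * safety_factor
    return math.ceil(total / timedelta(days=1))


days = minimum_retention_days(
    mean_time_to_detect=timedelta(days=3),     # hypothetical team average
    mean_time_to_resolve=timedelta(hours=36),  # hypothetical team average
)
print(f"Retain logs for at least {days} days")
```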

# Log storage best practices

![image.png](attachment:84197b76-b0a9-4cf2-855b-e82087b5dc6a.png)

Moving further, let us consider a few best practices for storing logs.
* It is important to determine what information you need to log so that you can identify and diagnose issues quickly.
* Centralizing logs into a single system can make it easier to manage and analyze them, especially if you have multiple servers or applications.
* Depending on your needs, you may want to use a cloud-based solution like Amazon CloudWatch or a self-hosted solution like Elasticsearch.
* Rotating logs regularly can help prevent them from filling up your storage space and causing performance issues.
* Make sure only authorized personnel can access the logs to maintain security and confidentiality.
* Regularly reviewing log data can help identify trends or potential issues before they become major problems.
* Some organizations may be required by law or regulation to retain log data for a certain period, so you should know your obligations in this regard.
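
Log rotation, mentioned in the practices above, can be sketched with Python's standard `logging.handlers.RotatingFileHandler`. The file name, the 1 KB size limit, and the backup count of three are arbitrary values chosen so that rotation is visible in a short run; production settings would be much larger.

```python
import logging
import logging.handlers
import os
import tempfile

# Write to a temporary directory so the example leaves no files behind in the project.
log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "app.log")

# Rotate once the file reaches about 1 KB and keep at most three old files.
handler = logging.handlers.RotatingFileHandler(log_path, maxBytes=1024, backupCount=3)
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))

logger = logging.getLogger("rotation_demo")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep demo messages out of the root logger

for i in range(200):
    logger.info("request %d handled", i)

handler.close()
print(sorted(os.listdir(log_dir)))
```

Because `backupCount` caps the number of rotated files, disk usage stays bounded no matter how many messages are written. Older entries are discarded, so rotation settings should be chosen together with the retention policy.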

# Log storage tools

![image.png](attachment:6427e8fa-308b-46a9-9571-85a6a2f7135b.png)

For log storage, there are many tools available.

Let's explore a few of them.
* Elasticsearch is a distributed RESTful search and analytics engine that can be used for storing and analyzing logs.
* Splunk is a software platform used for searching, monitoring, and analyzing machine-generated big data via a web-style interface.
* Graylog is an open-source log management platform that collects, indexes and analyzes structured and unstructured log data.
* Logstash is a tool by Elastic used for collecting, parsing and storing logs for later use with Elasticsearch or other analytics platforms.
* Fluentd is an open-source data collector designed to unify logging infrastructure.
* Finally, Sumo Logic is a cloud-based log management platform that allows users to ingest, analyze and visualize logs in real time.

# Summary

![image.png](attachment:22a9ee46-36ce-4217-ae1d-5fc0c3204341.png)

In this video, you learned that:
* Monitoring and analyzing logs enhance the observability of the network, providing transparency and visibility into the cloud computing environment.
* Major reasons to consider storing log data include the reliability of systems, the security posture of environments, decision making, and auditing.
* Strategies for saving log data include setting a retention period, storing logs in the cloud, long-term log storage, and backing up log data.
* Criticality, security, maturity, frequency, cost-effectiveness, and discovery and resolution are the analytical dimensions that give a relative notion of how long the retention period should be.
* The salient log storage tools include Elasticsearch, Splunk, Graylog, Logstash, Fluentd, and Sumo Logic.