Implement Library for journald Logs Integration into Logcollector #22322

Closed
5 tasks done
Tracked by #12862
JcabreraC opened this issue Mar 4, 2024 · 4 comments · Fixed by #22359

Comments

@JcabreraC
Member

JcabreraC commented Mar 4, 2024

Wazuh version | Component    | Install type | Install method   | Platform
4.9.0         | Logcollector | Manager      | Packages/Sources | OS version

Description

This issue addresses Stage 2A of our development plan, focusing on the implementation of a library to facilitate the integration of journald logs with Logcollector. The library will leverage the systemd-journald API to provide a seamless and efficient method for Logcollector to monitor and collect logs from journald.

Objectives

  • Library Development: Develop a library that interfaces with the systemd-journald API, enabling Logcollector to collect logs directly from journald in later stages.
  • Real-time and Historical Log Collection: Ensure the library supports collecting logs in real time or from a specific point in time, with efficient filtering based on PCRE2 regular expressions.
  • Flexible Output Formats: Implement functionality to output logs in both JSON and syslog formats.
  • Comprehensive Testing: Conduct thorough testing of the library, particularly focusing on the iteration functions over the journald database, to ensure reliability and efficiency in log collection.

Tasks

  • Design and implement the library to interface with the systemd-journald API for log collection.
  • Develop functions within the library to filter logs based on user-defined criteria (e.g., PCRE2 over any field).
  • Implement functionality to output logs in both JSON and syslog formats, according to user configuration.
  • Test the library extensively, focusing on real-time and historical log collection capabilities, as well as the efficiency of filtering and output functions.
  • Document the library's API, usage examples, and best practices for integration with Logcollector.

Acceptance Criteria

  • A fully implemented library that allows Logcollector to collect journald logs through the systemd-journald API.
  • The library must support efficient log filtering, as well as outputting logs in JSON and syslog formats.
  • Comprehensive documentation of the library, including API details, usage examples, and integration guidelines.
  • Successful completion of tests demonstrating the library's functionality, efficiency, and reliability in various scenarios.

Additional Considerations

  • Performance Optimization: Focus on optimizing the library's performance to minimize the impact on system resources during log collection.
  • Extensibility: Design the library with extensibility in mind, allowing for future enhancements and additional features without significant refactoring.
  • Security: Ensure that the library follows best practices for security, particularly in handling log data and interacting with the systemd-journald API.
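
To make the two output formats concrete, here is a minimal Python sketch (an illustration, not the library's C code) that renders one journal entry, represented as a plain dictionary, as JSON and as a syslog-style line. MESSAGE, _HOSTNAME, _COMM, SYSLOG_IDENTIFIER, SYSLOG_PID, _PID and __REALTIME_TIMESTAMP are standard journal fields; the function names, the fallback values and the exact line layout are assumptions made for this example.

```python
import json
from datetime import datetime, timezone


def entry_to_json(entry):
    """Serialize a journal entry (field name -> value) as one JSON object."""
    return json.dumps(entry, sort_keys=True)


def entry_to_syslog(entry):
    """Render an entry as 'Mon DD HH:MM:SS host tag[pid]: message' (UTC)."""
    usec = int(entry["__REALTIME_TIMESTAMP"])  # microseconds since the epoch
    stamp = datetime.fromtimestamp(usec // 1_000_000, tz=timezone.utc)
    when = f"{stamp:%b} {stamp.day:2d} {stamp:%H:%M:%S}"

    host = entry.get("_HOSTNAME", "localhost")  # assumed fallback
    tag = entry.get("SYSLOG_IDENTIFIER") or entry.get("_COMM", "unknown")
    pid = entry.get("SYSLOG_PID") or entry.get("_PID")

    # Assumption: when no PID is recorded, the "[pid]" part is left out.
    prefix = f"{tag}[{pid}]" if pid else tag
    return f"{when} {host} {prefix}: {entry.get('MESSAGE', '')}"


sample = {
    "__REALTIME_TIMESTAMP": "1709553600000000",
    "_HOSTNAME": "host1",
    "SYSLOG_IDENTIFIER": "sshd",
    "_PID": "812",
    "MESSAGE": "Accepted publickey for root",
}
print(entry_to_json(sample))
print(entry_to_syslog(sample))
```

Keeping both renderers as functions of the same dictionary means that filtering can run once on the entry, before the configured output format is applied.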
@juliancnn
Member

Today's progress includes significant enhancements and critical observations in our ongoing development efforts for the journald logs integration into Logcollector. Here's a detailed breakdown:

  • Dynamic Loading of systemd: We've successfully eliminated the static dependency on systemd by adopting dlopen for dynamic loading. This change significantly enhances the flexibility of our integration, allowing Logcollector to operate without permanent alterations or dependencies on the host's operating system configuration.

  • Security Considerations for Library Loading: While dynamic loading is operational, we've identified a need to refine the process for searching and verifying the systemd library. It's crucial to implement a mechanism to check the library's authenticity and ownership to mitigate the risk of library spoofing. This improvement will ensure the security and integrity of the log collection process.

  • Real-time and Historical Log Collection: The functionality to fetch logs from any specified date to the present is now fully functional. This capability allows for versatile log extraction, supporting both historical data analysis and real-time monitoring.

  • Partial Implementation of PCRE2: The integration of PCRE2 (Perl Compatible Regular Expressions, version 2) is underway but not yet complete. Currently, we need to enhance memory management and the logic for grouped conditions (AND and OR). Addressing these areas will improve the efficiency and flexibility of log filtering.

  • JSON Log Dumping: The process of dumping logs in JSON format utilizes library functions introduced in systemd version 245, which are relatively recent. Given the wide range of systemd versions in use, it's necessary to revisit this implementation. We aim to ensure broader compatibility without sacrificing the functionality and efficiency of JSON log output.
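
As a rough illustration of the "from any specified date to the present" behaviour: the journal API expresses wall-clock positions as microseconds since the Unix epoch (this is the unit taken by sd_journal_seek_realtime_usec and stored in __REALTIME_TIMESTAMP). The Python sketch below converts a date to that unit and selects entries from an in-memory list. The helper names are hypothetical, and the list stands in for the real journal iteration, which is not reproduced here.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_realtime_usec(moment):
    """Convert a timezone-aware datetime to microseconds since the Unix epoch."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    delta = moment - EPOCH
    # Integer arithmetic avoids float rounding on microsecond values.
    return (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds


def entries_since(entries, start_usec):
    """Yield the entries stamped at or after start_usec, in their original order."""
    for entry in entries:
        if int(entry["__REALTIME_TIMESTAMP"]) >= start_usec:
            yield entry


start = to_realtime_usec(datetime(2024, 3, 4, 12, 0, tzinfo=timezone.utc))
journal = [
    {"__REALTIME_TIMESTAMP": str(start - 1), "MESSAGE": "too old"},
    {"__REALTIME_TIMESTAMP": str(start), "MESSAGE": "first collected"},
    {"__REALTIME_TIMESTAMP": str(start + 5), "MESSAGE": "second collected"},
]
for entry in entries_since(journal, start):
    print(entry["MESSAGE"])
```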

In summary, while we've made considerable progress, especially in dynamically loading systemd and enhancing log collection capabilities, there are critical areas that require further development. Enhancing security measures for library verification, refining PCRE2 integration, and ensuring compatibility in JSON log dumping are our immediate next steps. We remain committed to advancing our work, addressing these challenges, and delivering a robust solution for journald logs integration.
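
The dynamic-loading idea can be sketched in a few lines of Python, where ctypes.CDLL wraps dlopen and attribute lookup wraps dlsym. This is only an analogy for the C implementation: the helper name and the list of required symbols are assumptions, although the sd_journal_* functions named are part of the public journal API.

```python
import ctypes

# Hypothetical subset of the journal API a collector would need.
REQUIRED_SYMBOLS = (
    "sd_journal_open",
    "sd_journal_next",
    "sd_journal_get_data",
    "sd_journal_close",
)


def load_optional_library(soname, required=REQUIRED_SYMBOLS):
    """Return a library handle, or None if the library or a symbol is missing."""
    try:
        lib = ctypes.CDLL(soname)  # dlopen() under the hood on Linux
    except OSError:
        return None  # library not installed: run without journald support
    for symbol in required:
        if not hasattr(lib, symbol):  # failed dlsym() surfaces as AttributeError
            return None
    return lib


journal = load_optional_library("libsystemd.so.0")
print("journald support:", "enabled" if journal is not None else "disabled")
```

Because a missing library is reported as None rather than as a load-time failure, the caller can disable journald collection and keep running on hosts without systemd.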

@juliancnn
Member

Today, we've made significant strides in enhancing the security and compatibility of our dynamic library integration for journald log processing. Here are the key updates:

  • Dynamic Library Path Detection and Ownership Verification: The library now autonomously detects the path of the dynamically loaded systemd library (libsystemd.so.0) and verifies its ownership. We've implemented a check to ensure that the library is owned by root, bolstering the security of our implementation and preventing potential spoofing attacks.

  • Adaptation to Operating System Specifics: We've transitioned from using libsystemd.so to libsystemd.so.0. This change addresses compatibility issues encountered on some Linux distributions, such as Ubuntu, where the symbolic link from libsystemd.so to libsystemd.so.0 does not exist by default. Loading libsystemd.so failed on those systems, whereas libsystemd.so.0 is the name that system tools such as logger are linked against. The unversioned symbolic link is typically created only when the libsystemd-dev package is installed, a step not universally performed across installations. This adjustment keeps our library compatible across a broader range of systems.

  • Enhanced Systemd Compatibility: To further improve our library's compatibility with various versions of systemd, we have modified the library to limit its dependency to functions available up to systemd version 187. This decision was made to ensure that our solution can operate effectively even on systems running older versions of systemd, expanding the usability and reach of our implementation.
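
The ownership check can be illustrated with a short Python sketch. It covers only the verification step, assuming the library path is already known; how the C library discovers that path is not shown, and the function name is hypothetical.

```python
import os
import stat


def is_owned_by(path, uid=0):
    """Return True if path resolves to a regular file owned by uid (root by default)."""
    try:
        # Resolve symbolic links first, so the check applies to the real file.
        info = os.stat(os.path.realpath(path))
    except OSError:
        return False  # missing or unreadable: treat as untrusted
    return stat.S_ISREG(info.st_mode) and info.st_uid == uid


candidate = "/lib/x86_64-linux-gnu/libsystemd.so.0"
print(candidate, "trusted:", is_owned_by(candidate))
```

Refusing to load a library that fails this check is what guards against a spoofed libsystemd.so.0 placed in a user-writable location.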

Additionally, to provide context on how our changes align with existing system configurations, here's the output of ldd on /usr/bin/logger, showing that it is linked against libsystemd.so.0, consistent with our adjustments:

# ldd /usr/bin/logger
        linux-vdso.so.1 (0x00007ffd03bad000)
        libsystemd.so.0 => /lib/x86_64-linux-gnu/libsystemd.so.0 (0x00007f7c332c0000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f7c33097000)
        ...

These enhancements mark a significant step forward in our project, ensuring our implementation not only remains secure but also maximizes compatibility across diverse Linux environments. Our focus on addressing specific system dependencies and nuances is a testament to our commitment to delivering a robust and versatile solution for journald log processing.

@juliancnn
Member

Today's progress brought several substantial improvements and refinements to our project, enhancing functionality, reliability, and code quality. Here's a comprehensive update on our advancements:

  • Enhanced Filter Creation Logic: We've reworked how filters are created. A new entity, the filter list, has been introduced, together with functions in the context to manage these lists. Each new journal entry is checked against the list and accepted if it satisfies any of the filters in it.

  • Implementation of OR Logic Between Filters: We've successfully implemented OR logic among filters. This addition provides greater flexibility in log filtering, enabling the inclusion of entries that match any one of a set of conditions, rather than requiring all conditions to be met.

  • Unified and Comprehensive Error Handling: Error handling across the codebase has been standardized and made more comprehensive. This initiative ensures that errors are consistently managed and reported, improving the stability and debuggability of our solution.

  • Proof of Concept (PoC) Adaptations: The PoC has been updated to incorporate and test these latest enhancements. These adaptations validate the functionality of the new features and improvements in a live environment, ensuring their effectiveness before wider implementation.

  • Syslog PID Corner Case Resolution: We've identified and fixed a corner case related to the handling of the PID in syslog format. This fix ensures accurate and consistent processing of syslog entries, enhancing the reliability of log data interpretation.

  • Resolution of Scan-Build Warnings: All warnings identified by scan-build have been addressed and resolved. This effort not only improves the overall code quality but also minimizes potential risks and ensures a higher standard of reliability.

These updates represent a significant stride toward refining our project's capabilities and reliability. By enhancing the filter logic, implementing flexible OR logic, standardizing error handling, and addressing specific issues and warnings, we're ensuring our project not only meets but exceeds the expectations for quality and functionality. We look forward to continuing this momentum and delivering a robust solution for our users.
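
One way to picture the filter list is the Python sketch below. It assumes that a filter is a group of per-field conditions that must all match (AND), and that the filter list accepts an entry when at least one filter matches (OR). Python's re module stands in for PCRE2 here, the function names are hypothetical, and treating an empty list as "accept everything" is an assumption.

```python
import re


def make_filter(conditions):
    """Compile a filter from a mapping of field name -> regular expression."""
    return {field: re.compile(pattern) for field, pattern in conditions.items()}


def filter_matches(flt, entry):
    """A filter matches when every condition matches its field (AND)."""
    return all(
        field in entry and rx.search(entry[field]) is not None
        for field, rx in flt.items()
    )


def any_filter_matches(filter_list, entry):
    """Accept the entry when at least one filter matches (OR)."""
    if not filter_list:
        return True  # assumption: no filters configured means no restriction
    return any(filter_matches(flt, entry) for flt in filter_list)


filters = [
    make_filter({"_SYSTEMD_UNIT": r"^ssh\.service$"}),
    make_filter({"PRIORITY": r"^[0-3]$", "_COMM": r"^cron$"}),
]
entry = {"_SYSTEMD_UNIT": "cron.service", "PRIORITY": "3", "_COMM": "cron"}
print(any_filter_matches(filters, entry))
```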

@juliancnn
Member

Today's progress update includes three key developments:

  1. Rebasing with Master:

    • Successfully rebased the feature branch with the master branch, resolving all conflicts to ensure our work remains compatible with the latest codebase changes.
  2. Revisions Based on Review Feedback:

    • Implemented changes suggested during code review, particularly focusing on enhancing the mechanism for loading functions. This involved refining our approach to ensure greater efficiency and reliability.
  3. Refinement of 'Ignore If Missing' Functionality:

    • Addressed previously overlooked aspects of the 'ignore if missing' option, with comprehensive testing and fine-tuning to ensure it behaves as expected under various conditions.

These steps mark significant progress towards the robust integration of journald logs into Logcollector, enhancing its functionality and reliability.
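
The option for ignoring missing fields can be read as follows: when the field a condition refers to is absent from an entry, the condition is either skipped (treated as satisfied) or fails, depending on a per-condition flag. The Python sketch below encodes that reading. The semantics are an assumption rather than a statement of the library's behaviour, the class and function names are hypothetical, and re stands in for PCRE2.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Condition:
    field: str
    pattern: str
    ignore_if_missing: bool = False


def condition_holds(cond, entry):
    """Evaluate one condition against a journal entry (field name -> value)."""
    value = entry.get(cond.field)
    if value is None:
        # Field absent: the flag decides whether the condition is skipped or fails.
        return cond.ignore_if_missing
    return re.search(cond.pattern, value) is not None


strict = Condition("_KERNEL_DEVICE", r"^\+usb", ignore_if_missing=False)
lenient = Condition("_KERNEL_DEVICE", r"^\+usb", ignore_if_missing=True)
entry = {"MESSAGE": "no device field here"}
print(condition_holds(strict, entry), condition_holds(lenient, entry))
```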

@JcabreraC JcabreraC changed the title Implement Library for journald Logs Integration into Logcollector Implement Library for journald Logs Integration into Logcollector May 3, 2024
Projects
Status: Done

2 participants