Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue #170 Enhance ConnectorStore loading system #176

Conversation

NassimBtk
Copy link
Member

@NassimBtk NassimBtk commented May 17, 2024

Overview

  • Each connector object within the ConnectorStore defines a Map<String, EmbeddedFile> for storing associated external files (e.g., awk scripts).
  • The EmbeddedFile content is stored as a byte array, so that we can regenerate the original file if needed. When executing the awk compute, we convert the embedded file content (awk script) to text using UTF-8.
  • We keep the current behavior of the Connector Store reader, so that users retain full control over the community connectors in production.
  • Various enhancement and code factorization.

* Internalized external files.
* Updated Windows and OsCommand extensions in order to correctly fetch
embedded files.
* Updated http extension to manage header and body using embedded files.
* Formatted code.
* Updated unit tests in the agent to use real extensions.
fearture/issue-170-enhance-metricshub-connectorstore-loading-system
* Fixed JUnit tests on oscommand extension
* Fixed formatting and removed useless imports
* Decreased coverage on metricshub-agent due to code removal
fearture/issue-170-enhance-metricshub-connectorstore-loading-system
* Added unit tests for EmbeddedFile methods.

* Update README.md

* Enhanced log message in EmbeddedFileHelper
* Managed connector store extension providers
@@ -150,7 +150,7 @@
<limit>
<counter>COMPLEXITY</counter>
<value>COVEREDRATIO</value>
<minimum>0.51</minimum>
<minimum>0.50</minimum>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we decreased the test coverage percentage here ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even with new tests the coverage decreased, I will look into it.

*/
public static String getExtension(String filename) {
// Find the last index of '.' in the filename
int lastIndex = filename.lastIndexOf('.');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could be final

*/
public static String getBaseName(String filename) {
// Find the last index of '.' in the filename
int lastIndex = filename.lastIndexOf('.');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could be final

Copy link
Member

@bertysentry bertysentry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks beautiful! Well done! I only have a question about the id of embedded files, and whether we can ditch it. Thank you! 😊

try {
return File.createTempFile("SEN_Embedded_", extension);
return File.createTempFile("SEN_Embedded_" + baseName, extension);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't need SEN_Embedded_

Copy link
Member Author

@NassimBtk NassimBtk May 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, maybe we need metricshub_embedded_ as prefix.
The reason we have this prefix is to easily identify the files in case of an issue. The user or support can then efficiently clean everything in the user's directory, and possibly even on a production machine where files could fill up the filesystem, which could become critical.

void testDescription() {
assertEquals(
"EmbeddedFile 1: script.bat",
EmbeddedFile.builder().content("value".getBytes()).filename("script.bat").id(1).build().description()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the id really required? Wouldn't the name be sufficient? (I'm asking because I'm may be misunderstanding something)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name alone is insufficient since different files with the same name can exist in various subdirectories and be referenced in the connector. Therefore, a unique ID is necessary. Additionally, the global index of the embedded files should be represented using integer values instead of strings.

void testGetContentAsString() {
final EmbeddedFile embeddedFile = EmbeddedFile
.builder()
.content("value".getBytes())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You will need to specify the Charset in getBytes() to make sure the tests run properly without relying with the JRE and system configuration.

assertEquals(uriStrExpected, result.toString());
final String actualAwkScriptNormalized = result.getContentAsString().replaceAll("\r\n", "").replaceAll("\n", "");
final String expected =
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Urgl...

*/
private String reference;
private Integer id;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the requirement for this id? Can we avoid it?

Copy link
Member Author

@NassimBtk NassimBtk May 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the unique id of the embedded file, this will handle the case where different embedded files have the same name, and enhance slightly the memory of the global embedded file Map<Integer, EmbeddedFile>.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is generated by the connector parser 😉.

* Increased coverage on OsCommand and Http extension
* Fixed Timeout deserialization on the configuration objects.
@NassimBtk NassimBtk changed the title Issue #170 enhance ConnectorStore loading system Issue #170 Enhance ConnectorStore loading system May 17, 2024
* Cleaned-up Constants.java in the engine.
* Replaced temp file prefix `SEN_Embedded_` with `metricshub_embedded_`.
@NassimBtk NassimBtk merged commit c9678c3 into feature/EPIC-extension-system May 17, 2024
@NassimBtk NassimBtk deleted the fearture/issue-170-enhance-metricshub-connectorstore-loading-system branch May 17, 2024 15:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants