[BUG] audit log entries are written twice to the console for sample lineage connector #7229

Closed
davidradl opened this issue Dec 6, 2022 · 14 comments
Assignees
Labels
bug Something isn't working triage New bug/issue which needs checking & assigning

Comments

@davidradl
Member

Existing/related issue?

No response

Current Behavior

Using the following configuration to define the Sample Lineage connector, log entries are written twice.

I tried removing the repositoryServicesConfig section, but the resulting configuration is invalid. I assume there is a default log as well as the one that is specified in the config. I do not know how to configure the server so that audit log entries are not duplicated.

{
  "class": "OMAGServerConfig",
  "versionId": "V2.0",
  "localServerId": "02defb10-528c-4707-8049-ae7d82b88310",
  "localServerName": "lineagesample1",
  "localServerType": "Integration Daemon",
  "localServerURL": "https://localhost:9443",
  "localServerUserId": "OMAGServer",
  "maxPageSize": 1000,
  "integrationServicesConfig": [
    {
      "class": "IntegrationServiceConfig",
      "integrationServiceId": 606,
      "integrationServiceDevelopmentStatus": "IN_DEVELOPMENT",
      "integrationServiceContextManagerClass": "org.odpi.openmetadata.integrationservices.lineage.contextmanager.LineageIntegratorContextManager",
      "integrationServiceName": "Lineage Integrator",
      "integrationServiceFullName": "Lineage Integrator OMIS",
      "integrationServiceURLMarker": "lineage-integrator",
      "integrationServiceDescription": "Manage capture of lineage from a third party tool.",
      "integrationServiceWiki": "https://egeria-project.org/services/omis/lineage-integrator/overview/",
      "integrationServicePartnerOMAS": "Asset Manager OMAS",
      "defaultPermittedSynchronization": "FROM_THIRD_PARTY",
      "integrationServiceOperationalStatus": "ENABLED",
      "integrationConnectorConfigs": [
        {
          "class": "IntegrationConnectorConfig",
          "connectorId": "ba6dc870-2303-48fc-8611-d50b49706f48",
          "connectorName": "LineageIntegrator",
          "metadataSourceQualifiedName": "TestMetadataSourceQualifiedName",
          "connection": {
            "class": "VirtualConnection",
            "headerVersion": 0,
            "qualifiedName": "Egeria:IntegrationConnector:Lineage:OpenLineageEventReceiverConnection",
            "connectorType": {
              "class": "ConnectorType",
              "headerVersion": 0,
              "connectorProviderClassName": "org.odpi.openmetadata.adapters.connectors.integration.lineage.SampleLineageEventReceiverIntegrationProvider"
            },
            "embeddedConnections": [
              {
                "class": "EmbeddedConnection",
                "headerVersion": 0,
                "position": 0,
                "embeddedConnection": {
                  "class": "Connection",
                  "headerVersion": 0,
                  "qualifiedName": "Kafka Open Metadata Topic Connector for sample lineage",
                  "connectorType": {
                    "class": "ConnectorType",
                    "headerVersion": 0,
                    "connectorProviderClassName": "org.odpi.openmetadata.adapters.eventbus.topic.kafka.KafkaOpenMetadataTopicProvider"
                  },
                  "endpoint": {
                    "class": "Endpoint",
                    "headerVersion": 0,
                    "address": "legacyLineage"
                  },
                  "configurationProperties": {
                    "producer": {
                      "bootstrap.servers": "localhost:9092"
                    },
                    "local.server.id": "lineagesample1",
                    "consumer": {
                      "bootstrap.servers": "localhost:9092"
                    }
                  }
                }
              }
            ]
          },
          "refreshTimeInterval": 0,
          "usesBlockingCalls": false
        }
      ],
      "omagserverName": "cocoMDS1",
      "omagserverPlatformRootURL": "https://localhost:9443"
    }
  ],
  "repositoryServicesConfig": {
    "class": "RepositoryServicesConfig",
    "auditLogConnections": [
      {
        "class": "Connection",
        "headerVersion": 0,
        "qualifiedName": "Console- default",
        "displayName": "Console",
        "connectorType": {
          "class": "ConnectorType",
          "headerVersion": 0,
          "type": {
            "typeId": "954421eb-33a6-462d-a8ca-b5709a1bd0d4",
            "typeName": "ConnectorType",
            "typeVersion": 1,
            "typeDescription": "A set of properties describing a type of connector."
          },
          "guid": "4afac741-3dcc-4c60-a4ca-a6dede994e3f",
          "qualifiedName": "Egeria:AuditLogDestinationConnector:Console",
          "displayName": "Console Audit Log Destination Connector",
          "description": "Connector supports logging of audit log messages to stdout.",
          "connectorProviderClassName": "org.odpi.openmetadata.adapters.repositoryservices.auditlogstore.console.ConsoleAuditLogStoreProvider",
          "connectorFrameworkName": "Open Connector Framework (OCF)",
          "connectorInterfaceLanguage": "Java",
          "connectorInterfaces": [
            "org.odpi.openmetadata.repositoryservices.connectors.stores.auditlogstore.OMRSAuditLogStore"
          ],
          "recognizedConfigurationProperties": [
            "supportedSeverities"
          ]
        },
        "configurationProperties": {
          "supportedSeverities": [
            "",
            "Information",
            "Event",
            "Decision",
            "Action",
            "Error",
            "Exception",
            "Security",
            "Startup",
            "Shutdown",
            "Asset",
            "Types",
            "Cohort"
          ]
        }
      },
      {
        "class": "Connection",
        "headerVersion": 0,
        "type": {
          "typeVersion": 0
        },
        "guid": "5390bf3e-6b38-4eda-b34a-de55ac4252a7",
        "qualifiedName": "DefaultAuditLog.Connection.lineagesample1",
        "displayName": "DefaultAuditLog.Connection.lineagesample1",
        "description": "OMRS default audit log connection.",
        "connectorType": {
          "class": "ConnectorType",
          "headerVersion": 0,
          "type": {
            "typeVersion": 0
          },
          "guid": "4afac741-3dcc-4c60-a4ca-a6dede994e3f",
          "qualifiedName": "Console Audit Log Store Connector",
          "displayName": "Console Audit Log Store Connector",
          "description": "Connector supports logging of audit log messages to stdout.",
          "connectorProviderClassName": "org.odpi.openmetadata.adapters.repositoryservices.auditlogstore.console.ConsoleAuditLogStoreProvider"
        },
        "endpoint": {
          "class": "Endpoint",
          "headerVersion": 0,
          "type": {
            "typeVersion": 0
          },
          "guid": "836efeae-ab34-4425-89f0-6adf2faa1f2e",
          "qualifiedName": "DefaultAuditLog.Endpoint.samplelineage1.auditlog",
          "displayName": "DefaultAuditLog.Endpoint.samplelineage1.auditlog",
          "description": "OMRS default audit log endpoint.",
          "address": "samplelineage1.auditlog"
        }
      }
    ]
  }
}

Expected Behavior

It should be possible to configure the sample lineage integration connector so that there is only one console audit log destination.

Steps To Reproduce

run the sample lineage connector

Environment

- Egeria:
- OS:
- Java:
- Browser (for UI issues):
- Additional connectors and integration:

Any Further Information?

No response

@davidradl davidradl added bug Something isn't working triage New bug/issue which needs checking & assigning labels Dec 6, 2022
@mandy-chessell
Contributor

This behaviour is completely reasonable since it is configured to do so - the configuration shows two auditLogConnections defined - both to write to the console - therefore each will write each audit log record - resulting in 2 copies of each record on the console.

If you only want one copy of each message on the console then I suggest having only one auditLogConnection.
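To make the point above concrete, here is a minimal Java sketch of the fan-out. It is not Egeria code: the `Destination` interface and the `AuditLogFanOut` class are hypothetical names invented for illustration, and the shared list stands in for the console. Every configured destination receives every record, so two console destinations produce two copies.

```java
import java.util.ArrayList;
import java.util.List;

final class AuditLogFanOut {
    /** Hypothetical stand-in for an audit log destination; not an Egeria class. */
    interface Destination {
        void store(String record);
    }

    /** Delivers the record to every configured destination. */
    static void log(List<Destination> destinations, String record) {
        for (Destination destination : destinations) {
            destination.store(record);
        }
    }

    /** Configures n console-style destinations that share one "console" and logs a single record. */
    static List<String> run(int consoleDestinations, String record) {
        List<String> console = new ArrayList<>();
        List<Destination> destinations = new ArrayList<>();
        for (int i = 0; i < consoleDestinations; i++) {
            destinations.add(console::add);
        }
        log(destinations, record);
        return console;
    }

    public static void main(String[] args) {
        System.out.println(run(2, "sample audit record").size()); // prints 2
        System.out.println(run(1, "sample audit record").size()); // prints 1
    }
}
```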

Details of how to configure the audit logs are in the administration guide. Here is a link:

https://egeria-project.org/guides/admin/servers/configuring-an-integration-daemon/#configure-the-audit-log

When an integration daemon is defined, a console audit log destination is added by default. You can add new destinations, with different severity lists. If you want to delete the default console audit log destination then use the delete method before adding the ones you want.
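The delete-then-add sequence could be sketched roughly as below. This is a sketch under assumptions, not a definitive recipe: the URL layout (`/open-metadata/admin-services/users/{userId}/servers/{serverName}/audit-log-destinations`, with `DELETE` to clear the list and `POST .../console` taking a JSON list of severities) is how I recall the administration guide describing it, so check the guide linked above before relying on it. The user id `adminUser` and the class and method names are hypothetical. The code only builds the requests; it does not send them.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.List;

final class AuditLogDestinationRequests {
    /** Assumed URL layout for the audit log destination admin calls. */
    static String destinationsUrl(String platformRoot, String adminUserId, String serverName) {
        return platformRoot + "/open-metadata/admin-services/users/" + adminUserId
                + "/servers/" + serverName + "/audit-log-destinations";
    }

    /** Builds two requests: clear all destinations, then add a single console destination. */
    static List<HttpRequest> singleConsoleDestination(String destinationsUrl, String severitiesJson) {
        HttpRequest clearAll = HttpRequest.newBuilder(URI.create(destinationsUrl))
                .DELETE()
                .build();
        HttpRequest addConsole = HttpRequest.newBuilder(URI.create(destinationsUrl + "/console"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(severitiesJson))
                .build();
        return List.of(clearAll, addConsole);
    }

    public static void main(String[] args) {
        // Platform root and server name are taken from the configuration in this issue.
        String url = destinationsUrl("https://localhost:9443", "adminUser", "lineagesample1");
        for (HttpRequest request : singleConsoleDestination(url, "[\"Error\", \"Exception\"]")) {
            System.out.println(request.method() + " " + request.uri());
        }
    }
}
```

Sending the requests in this order should leave exactly one console destination in the configuration document, provided the endpoints behave as assumed.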

@mandy-chessell
Contributor

If this documentation does not answer your question - please make changes to it so it is better for the next person.

@mandy-chessell
Contributor

Can this be closed or are you going to transfer it to egeria-docs to make the changes to the documentation?

@davidradl
Member Author

@mandy-chessell thanks for your comments, they are really helpful. I want to understand how we ended up with two audit log entries first, then I will move this to egeria-docs or close it as required.

@davidradl
Member Author

davidradl commented Jan 10, 2023

I have produced a set of REST calls to configure an integration server. The steps are:

  1. Set the server type as an Integration Daemon.
  2. Set an audit log using ...audit-log-destinations/connection.
  3. Configure the integration daemon using /{{serverName}}/integration-services/{{integration service name}}.

If I query the configuration it has 2 audit logs.

If I miss out step 2 the server will not start as there are no repository services defined

In the logic to set the audit log in OMAGServerAdminServices addAuditLogDestination I see

    public VoidResponse addAuditLogDestination(String     userId,
                                               String     serverName,
                                               Connection auditLogDestination)
    {
        final String methodName = "addAuditLogDestination";

        ...
        if (auditLogDestination != null)
        {
            OMAGServerConfig serverConfig = configStore.getServerConfig(userId, serverName, methodName);
            ...
            RepositoryServicesConfig repositoryServicesConfig = serverConfig.getRepositoryServicesConfig();

            if (repositoryServicesConfig == null)
            {
                OMRSConfigurationFactory configurationFactory = new OMRSConfigurationFactory();

                repositoryServicesConfig = configurationFactory.getDefaultRepositoryServicesConfig();
            }

            List<Connection> auditLogDestinations = repositoryServicesConfig.getAuditLogConnections();

            if (auditLogDestinations == null)
            ...

So in my case there is a supplied auditLogDestination connection and repositoryServicesConfig == null, so we create a repositoryServicesConfig that contains a default audit log destination. We then add the one that has been requested, so we end up with 2 audit log destinations.
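The behaviour described above can be reproduced with a small stand-alone Java model. This is a simplified sketch, not the Egeria implementation: `RepoConfig`, `defaultConfig` and the destination names are hypothetical stand-ins, and the assumption that the default repository services config already carries a console destination comes from the analysis in this comment.

```java
import java.util.ArrayList;
import java.util.List;

final class DefaultAuditLogSketch {
    /** Hypothetical stand-in for RepositoryServicesConfig; holds destination names only. */
    static final class RepoConfig {
        List<String> auditLogConnections;
    }

    /** Plays the role of getDefaultRepositoryServicesConfig(): assumed to include a default console destination. */
    static RepoConfig defaultConfig() {
        RepoConfig config = new RepoConfig();
        config.auditLogConnections = new ArrayList<>(List.of("default console"));
        return config;
    }

    /** Follows the shape of the excerpt: fall back to the default config, then append the requested destination. */
    static RepoConfig addAuditLogDestination(RepoConfig existing, String requested) {
        RepoConfig config = (existing == null) ? defaultConfig() : existing;
        if (config.auditLogConnections == null) {
            config.auditLogConnections = new ArrayList<>();
        }
        config.auditLogConnections.add(requested);
        return config;
    }

    public static void main(String[] args) {
        // No repository services configured yet, as in the REST sequence described above.
        RepoConfig config = addAuditLogDestination(null, "requested console");
        System.out.println(config.auditLogConnections); // prints [default console, requested console]
    }
}
```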

davidradl added a commit that referenced this issue Jan 10, 2023
#7229 do not create a 2nd unneeded default audit log destination, when adding an audit log destination
@davidradl
Member Author

@juergenhemelt fyi I have fixed the double log issue we both saw.

@mandy-chessell
Contributor

@davidradl This fix needs to be reversed as it is not correct behaviour. The original behaviour was correct.

@juergenhemelt

I would expect audit log configurations to be in place the way I configured them. So if one configures explicitly a console audit log there should be only that one. A default audit log should be in place only if there is no other configured. I found it always strange that one has to configure a (default) audit log before configuring a connector. So I think the implementation of @davidradl is more intuitive than the original one.

@davidradl
Member Author

@davidradl This fix needs to be reversed as it is not correct behaviour. The original behaviour was correct.

@mandy-chessell what is your thinking here? As is, the incremental configuration ends up with 2 audit log destinations. This may be required in some use cases, but shouldn't logging to only the one supplied destination be the standard way we configure? It is possible to supply a complete configuration with only one log destination, or to remove the 2nd log destination with admin calls. I am unsure what the issue is with the fix. What am I missing?

@planetf1
Member

planetf1 commented Jan 11, 2023

An audit log destination is identified either by severities + a simple name (i.e. console), or by connection + properties.
Could we somehow apply de-duplication to configured destinations (severities + names + properties), for example by taking a hash, thus avoiding the specific console duplicate case whilst also preserving the existing default behaviour?
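One way the de-duplication idea could look is sketched below in Java. It is only an illustration: the `Destination` class is a hypothetical, simplified view of a destination (name, severities, properties), not an Egeria type, and it uses a canonical string key rather than a hash, which has the same effect without the possibility of collisions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

final class DestinationDeduplicator {
    /** Hypothetical, simplified view of an audit log destination; not an Egeria class. */
    static final class Destination {
        final String name;
        final List<String> severities;
        final Map<String, String> properties;

        Destination(String name, List<String> severities, Map<String, String> properties) {
            this.name = name;
            this.severities = severities;
            this.properties = properties;
        }

        /** Canonical key: severities and properties are sorted so ordering differences do not matter. */
        String key() {
            return name + "|" + new TreeSet<>(severities) + "|" + new TreeMap<>(properties);
        }
    }

    /** Keeps the first destination seen for each key and drops later equivalents. */
    static List<Destination> deduplicate(List<Destination> configured) {
        Map<String, Destination> unique = new LinkedHashMap<>();
        for (Destination destination : configured) {
            unique.putIfAbsent(destination.key(), destination);
        }
        return new ArrayList<>(unique.values());
    }

    public static void main(String[] args) {
        List<Destination> configured = List.of(
                new Destination("console", List.of("Error", "Exception"), Map.of()),
                new Destination("console", List.of("Exception", "Error"), Map.of()),
                new Destination("console", List.of("Information"), Map.of()));
        System.out.println(deduplicate(configured).size()); // prints 2
    }
}
```

Note that two console destinations with different severity lists are kept as distinct under this key, so a message whose severity appears in both lists would still be written twice.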

@davidradl
Copy link
Member Author

davidradl commented Jan 12, 2023

In the developer community call today, we talked about this issue. It stems from the Postman collection I was using. The collection should have specified the default audit log destination; any other audit log destination will be added as a second one.
The Developers Dojo was broken by this change, so I have reverted it and will close this issue.

We should set the default audit log as documented here https://egeria-project.org/guides/admin/servers/configuring-an-integration-daemon/#configure-the-audit-log.

I will amend the postman collection that caused this in odpi/egeria-connector-integration-lineage-event-driven-sample#34

@juergenhemelt

@juergenhemelt

I still think the default audit log should be in place when there is no other configured. It should not be required to configure an audit log explicitly if you are fine with the default one.

@davidradl
Member Author

davidradl commented Jan 12, 2023

@juergenhemelt
Re-opening for the discussion
It is possible to run without audit logs, maybe if Egeria is running embedded and there is nowhere to write to. We also have had a requirement to not write to the console (which the default audit log would do).

I think you are suggesting that if we start an integration connector with a configuration that does not have an audit log, then we should not fail as we do at the moment, instead we should create an OMRS service with the default audit log at runtime.

I like the way that, at the moment, the config file is very declarative and you get exactly what it specifies. Failing the start in this case means that the config author needs to consider what they want to do with audit logs. For development, the default audit log destination is likely to be OK; for production, this is unlikely to be what is required.

The incremental construction of the config file is where we saw this double log issue; in production, the complete config file is often supplied.

@mandy-chessell what do you think?

@davidradl davidradl reopened this Jan 12, 2023
@mandy-chessell
Contributor

Recap on current design that @davidradl knows but may be new to @juergenhemelt

Today, there is a strict separation between operations and configuration. At operations time, the configuration document is passed to the platform and it is read-only. It starts the services described in the configuration document.

This is to allow a strict separation between users who are able to change configuration and those who are able to run the configuration (useful if the configuration needs to be audited).

On server-startup, the platform does check that the repository services are configured to be able to distinguish between an empty/default configuration document and one that someone has tried to set up. This was a common error in the early days and so the check was added. It also checks that the combination of services enabled is consistent with the server types we have defined. It is not strictly necessary but helped people initially create a consistent runtime environment.

The platform does not care how the configuration document is built. It can be built by an offline utility that creates the configuration document as required and posts it whole to the platform. This allows different tools (for example, those of a vendor who has embedded Egeria) to configure the Egeria servers through their own tooling. We have a vendor that does this.

The configuration REST services also provide helper services to set up the configuration document. If you remember, the whole point of the platform is to allow the governance operation team to set up and run Egeria without needing IT intervention each time they want a new server/service enabled.

There are two basic types of helper services. There are the coarse-grained helper services and the fine-grained helper services.

The coarse-grained helper services allow the governance team to configure each section of the configuration document separately. This is useful for editing the configuration document, or if the governance team wants a different set of options to those offered in the fine-grained configuration services. You would use these, for example, if you did not want to have any audit logs configured - or, in the case of one of the organizations that runs Egeria in production, if you do not want the console audit log at all, only the SLF4J audit log.

The fine-grained helper services are designed to offer simple configuration options that use a lot of defaults. They are aimed at developers, or people learning about Egeria. If the defaults are not what is needed, then the coarse-grained helper services are there. So these services are not intended to be comprehensive - just to help people get up and running. There is a fine balance between coverage of potential options and too many helper services to choose from.

The fine-grained services require the governance team to explicitly configure the audit log (even if it is just the default) to encourage them to think about the audit logging they need since they may not think about it in the way IT would. The assumption is that the users of these services are inexperienced.

Response to the suggestion above
The services to configure the audit log described above are from the fine-grained helper services. They are being discussed as if they are the only way to configure the servers - and that they must cover all of the options. This is misleading and not the design intent.

Always enabling the console audit log would annoy organizations that do not want the console audit log.

My suggestion is that if you do not want the defaults offered by the fine-grained services, then you use the coarse-grained options, or build the configuration document you want offline and post it to the platform as one. Then you do not risk disrupting other users and our education material that uses both the fine-grained and coarse-grained services.
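As an illustration of the "build it offline and post it whole" route, the Java sketch below checks a configuration document for duplicate console destinations and then builds the request that would post it. Treat it as a sketch under assumptions: the `/configuration` endpoint path is how I recall the admin services exposing "set the whole configuration document" and should be verified against the administration guide; `adminUser`, `WholeConfigPost` and its methods are hypothetical names; and counting occurrences of the console provider class name is a crude text check rather than real JSON parsing. The request is built but not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class WholeConfigPost {
    /** Counts non-overlapping occurrences of token in text. */
    static int count(String text, String token) {
        int occurrences = 0;
        for (int i = text.indexOf(token); i >= 0; i = text.indexOf(token, i + token.length())) {
            occurrences++;
        }
        return occurrences;
    }

    /** Builds a POST of the whole configuration document, refusing documents with duplicate console destinations. */
    static HttpRequest buildPost(String platformRoot, String adminUserId, String serverName, String configJson) {
        if (count(configJson, "ConsoleAuditLogStoreProvider") > 1) {
            throw new IllegalArgumentException("more than one console audit log destination configured");
        }
        // Assumed endpoint for storing a complete configuration document.
        URI uri = URI.create(platformRoot + "/open-metadata/admin-services/users/" + adminUserId
                + "/servers/" + serverName + "/configuration");
        return HttpRequest.newBuilder(uri)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(configJson))
                .build();
    }

    public static void main(String[] args) {
        // A deliberately tiny placeholder document, not a valid server configuration.
        String configJson = "{\"connectorProviderClassName\": \"...ConsoleAuditLogStoreProvider\"}";
        HttpRequest request = buildPost("https://localhost:9443", "adminUser", "lineagesample1", configJson);
        System.out.println(request.method() + " " + request.uri());
    }
}
```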


4 participants