<!-- SPDX-License-Identifier: CC-BY-4.0 -->
<!-- Copyright Contributors to the ODPi Egeria project 2024. -->

![Egeria Logo](https://raw.githubusercontent.com/odpi/egeria/main/assets/img/ODPi_Egeria_Logo_color.png)

### Egeria Workbook

# Enabling the Metadata Observability Harvesters

## Introduction

Egeria exchanges metadata with many different types of tools, data platforms and engines, and coordinates their governance.  Its open metadata repositories accumulate metadata about these tools, their data and processes, as well as the governance actions taking place.  Egeria therefore provides detailed insight into the workings of your data and AI landscape.  This notebook explains how to set up the harvesters that extract open metadata from the live repositories and add it to a PostgreSQL database, ready for observability analysis and reporting.

Once configured, the harvesters run continuously, giving you up-to-date data for dashboards and reports.

First, let's initialize **pyegeria**.

In [None]:
# Initialize pyegeria

%run ../pyegeria/initialize-pyegeria.ipynb


In [None]:
# url, user_id and user_pwd are assumed to be set by the initialization notebook run above
view_server = "qs-view-server"
egeria_tech = EgeriaTech(view_server, url, user_id, user_pwd)
token = egeria_tech.create_egeria_bearer_token()


---

## Loading support for Metadata Observability

The definitions of the harvesting connectors, templates and associated reference data are loaded via a [Content Pack](https://egeria-project.org/content-packs/) called `ObservabilityContentPack.omarchive`.  This content pack depends on the definitions in `PostgresContentPack.omarchive`.  The content packs can be loaded multiple times without ill effect, so run the following commands to make sure they are loaded.

---

In [None]:

egeria_tech.add_archive_file("content-packs/PostgresContentPack.omarchive", None, "qs-metadata-store")

print("PostgreSQL Archive loaded!")

egeria_tech.add_archive_file("content-packs/ObservabilityContentPack.omarchive", None, "qs-metadata-store")

print("Observability Archive loaded!")



----
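
The two archives can also be loaded by a small helper that makes the dependency order explicit.  This is only a sketch: `load_content_packs` is a hypothetical name, not part of pyegeria, and it assumes that the function passed in accepts the same three arguments as the `egeria_tech.add_archive_file` calls above.

```python
from typing import Callable, List, Sequence

# PostgresContentPack comes first because ObservabilityContentPack depends on it.
CONTENT_PACKS = (
    "content-packs/PostgresContentPack.omarchive",
    "content-packs/ObservabilityContentPack.omarchive",
)


def load_content_packs(add_archive_file: Callable,
                       packs: Sequence[str] = CONTENT_PACKS,
                       server_name: str = "qs-metadata-store") -> List[str]:
    """Load each content pack in the order given and return the names loaded."""
    loaded = []
    for pack in packs:
        # Same argument order as the calls used earlier in this notebook.
        add_archive_file(pack, None, server_name)
        loaded.append(pack)
    return loaded
```

In this notebook it could be called as `load_content_packs(egeria_tech.add_archive_file)`.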

These archives add the following integration connectors:

* HarvestSurveys - periodically extracts details of the survey reports found in the Open Metadata Ecosystem and maintains a set of tables in a PostgreSQL database schema that describe the survey reports.
* HarvestOpenMetadata - periodically extracts details about the activity going on in the Open Metadata Ecosystem and maintains a set of tables in a PostgreSQL database schema that describe the types of activity and who is engaged in it.

---

In [None]:
display_integration_daemon_status(['HarvestSurveys', 'HarvestOpenMetadata'],
                                  view_server='qs-view-server', view_url='https://host.docker.internal:9443',
                                  integ_server='qs-integration-daemon', integ_url='https://host.docker.internal:9443',
                                  width=150, paging=True)

----
The content packs also populate the following governance engines:

* MetadataObservability
* PostgreSQLGovernance
* PostgreSQLSurvey 

These governance engines are called during the processes that configure the integration connectors.

---

In [None]:
display_gov_eng_status(['MetadataObservability', 'PostgreSQLGovernance', 'PostgreSQLSurvey'],
                       status_filter=["*"],
                       engine_host='qs-engine-host', view_server='qs-view-server',
                       paging=True, jupyter=True, width=150, sort=True)

----

## Harvesting Survey Reports

The *HarvestSurveyReports:CreateAsCatalogTargetGovernanceActionProcess* governance action process is used to set up the *HarvestSurveys* integration connector.

----

In [None]:
harvestSurveysName="HarvestSurveyReports:CreateAsCatalogTargetGovernanceActionProcess"

process_guid = egeria_tech.get_element_guid_by_unique_name(harvestSurveysName)

process_graph = egeria_tech.get_gov_action_process_graph(process_guid)
print_governance_action_process_graph(process_graph)


----

The code below initiates this process to set up *HarvestSurveys*.  Notice that the request parameters match properties in the process's specification.  The surveys will be harvested into the *harvested_surveys* schema in the *egeria* database, located in the PostgreSQL server that is included in the workspaces.

----

In [None]:


requestParameters = {
    "serverName" : "LocalPostgreSQL1",
    "hostIdentifier" : "localhost",
    "portNumber" : "5442",
    "secretsStorePathName" : "loading-bay/secrets/default.omsecrets",
    "versionIdentifier" : "1.0",
    "schemaDescription" : "PostgreSQL database schema in egeria-workspaces.",
    "databaseName" : "egeria",
    "schemaName" : "harvested_surveys"
}

egeria_tech.initiate_gov_action_process(harvestSurveysName, None, None, None, requestParameters, None, None)



In [None]:
display_engine_activity_c()

In [None]:
#run list_gov_eng_status

----
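
Once the process has completed and the connector has run, one way to check the result is to list the tables in the target schema.  The sketch below assumes a DB-API 2.0 connection whose driver uses the `%s` parameter style (as psycopg does).  `list_schema_tables` is a hypothetical helper, and the connection details in the usage comment are placeholders that depend on your environment.

```python
from typing import Any, List

# information_schema.tables is the standard PostgreSQL catalog view of tables.
LIST_TABLES_SQL = (
    "SELECT table_name FROM information_schema.tables "
    "WHERE table_schema = %s ORDER BY table_name"
)


def list_schema_tables(connection: Any, schema_name: str) -> List[str]:
    """Return the names of the tables found in schema_name."""
    cursor = connection.cursor()
    try:
        cursor.execute(LIST_TABLES_SQL, (schema_name,))
        return [row[0] for row in cursor.fetchall()]
    finally:
        cursor.close()


# Hypothetical usage, assuming psycopg is installed and the database is reachable:
# import psycopg
# with psycopg.connect(host="localhost", port=5442, dbname="egeria",
#                      user="<user>", password="<password>") as conn:
#     print(list_schema_tables(conn, "harvested_surveys"))
```

An empty list would suggest that the schema has not been created yet, or that the connection points at a different database.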

## Harvesting Open Metadata Ecosystem Activity

The *HarvestOpenMetadataEcosystem:CreateAsCatalogTargetGovernanceActionProcess* governance action process is used to set up the *HarvestOpenMetadata* integration connector.  The activity will be harvested into the *harvested_om_activity* schema in the same *egeria* database.


----

In [None]:
harvestOpenMetadataName="HarvestOpenMetadataEcosystem:CreateAsCatalogTargetGovernanceActionProcess"

process_guid = egeria_tech.get_element_guid_by_unique_name(harvestOpenMetadataName)

process_graph = egeria_tech.get_gov_action_process_graph(process_guid)
print_governance_action_process_graph(process_graph)


In [None]:


requestParameters = {
    "serverName" : "LocalPostgreSQL1",
    "hostIdentifier" : "localhost",
    "portNumber" : "5442",
    "secretsStorePathName" : "loading-bay/secrets/default.omsecrets",
    "versionIdentifier" : "1.0",
    "schemaDescription" : "PostgreSQL database schema in egeria-workspaces.",
    "databaseName" : "egeria",
    "schemaName" : "harvested_om_activity"
}

egeria_tech.initiate_gov_action_process(harvestOpenMetadataName, None, None, None, requestParameters, None, None)



In [None]:
display_engine_activity_c()
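
----

The two sets of request parameters used in this notebook differ only in `schemaName`.  A helper along the following lines could build both.  `build_harvester_request_parameters` is a hypothetical name rather than part of pyegeria, and its defaults are simply the values used in this notebook.

```python
from typing import Dict


def build_harvester_request_parameters(
        schema_name: str,
        *,
        server_name: str = "LocalPostgreSQL1",
        host: str = "localhost",
        port: int = 5442,
        secrets_store: str = "loading-bay/secrets/default.omsecrets",
        version: str = "1.0",
        schema_description: str = "PostgreSQL database schema in egeria-workspaces.",
        database_name: str = "egeria") -> Dict[str, str]:
    """Build the request parameters for one harvester's set-up process."""
    return {
        "serverName": server_name,
        "hostIdentifier": host,
        # The notebook passes the port number as a string, so convert it here.
        "portNumber": str(port),
        "secretsStorePathName": secrets_store,
        "versionIdentifier": version,
        "schemaDescription": schema_description,
        "databaseName": database_name,
        "schemaName": schema_name,
    }


survey_parameters = build_harvester_request_parameters("harvested_surveys")
activity_parameters = build_harvester_request_parameters("harvested_om_activity")
```

Either dictionary could then be passed to `egeria_tech.initiate_gov_action_process` in place of the literal `requestParameters`.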