![Egeria Logo](https://raw.githubusercontent.com/odpi/egeria/master/assets/img/ODPi_Egeria_Logo_color.png)

### Egeria Hands-On Labs
# Welcome to the Overview of the Egeria Hands-on Labs

Egeria is an open source project that provides open standards and implementation libraries to connect tools, catalogs and platforms together so they can share information (called metadata) about data and the technology that supports it.

The Egeria technology includes:

- libraries that can be embedded into technologies that need to share metadata
- an integration platform called the **Open Metadata and Governance (OMAG) Platform** for hosting connectors, metadata servers and governance servers

Collectively the Egeria capabilities serve to integrate a diverse range of technologies to create a coherent view of an organization's IT assets and a consistent implementation of governance across infrastructure, security, data, privacy and throughout the software development lifecycle.

The **Egeria Hands-on Labs** provide a practical introduction to the Egeria technology.  They work through different scenarios using the personas from a fictitious company called **Coco Pharmaceuticals**.   Each lab introduces the characters involved and the work they are engaged in.  It defines the key Egeria concepts that you need to understand in order to follow along and then issues calls to Egeria to complete the work, providing explanations of the results received.  **Bold text** is used to emphasize concepts and elements of special note.

You will notice that some of the concepts described include a hyperlink to more information.  This extra information is not required to understand and follow along with the lab; the links are provided for your convenience if you want to follow up on a topic in more detail.   For example, if you wish to understand more about [Coco Pharmaceuticals](https://opengovernance.odpi.org/coco-pharmaceuticals/), follow the link.

The labs themselves are implemented using **Jupyter Notebooks**.  These are workspaces that can display text and images (like this section) as well as run code.

## Using the Jupyter Notebooks

This introduction is implemented in a Jupyter Notebook.  Notice along the top there are a number of buttons like this:

<img src="images/jupyter-notebook-play-button.png" style="float:left">

One of them is a triangle.  This is the **play** button needed to step through the notebook.

Press the play button and notice that the left-hand border has changed.

The blue line has moved so it is next to this text.  Press the play button again.

Now the blue line is here.  Each time you press the play button, the notebook advances to the next "cell".  Some cells have descriptions in them and some are code fragments.  When you play a cell with code in it, the code runs.  Often this is to issue a call to an Egeria server.

Some of the code can be complex.  However, it is surrounded by explanations, so it is not a problem if you are not a software developer.

This is about all you need to know about the Jupyter Notebooks.  (One other quick tip: you can also use `SHIFT-Enter` instead of the play button.)

## Starting up the Egeria platforms

Egeria's hands-on labs assume that there are three OMAG Server Platforms running, along with Apache Zookeeper and Apache Kafka.   Click the play button again to move to the code below, and then once more to run it.  This will check if the platforms are running.

You are looking to confirm that `CorePlatform`, `DataLakePlatform` and `Dev Platform` are running.
It has completed its check when you see `Done.`.

    %run common/environment-check.ipynb

---
If you see that all platforms are running then you are ready to begin.  If they are not running then
there are a number of choices on how to start them.  Follow [this link to set up and run the platforms](https://egeria.odpi.org/open-metadata-resources/open-metadata-labs/).
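
For readers curious about what such a check involves, the following is a minimal sketch in Python and not the code in `common/environment-check.ipynb`.  It assumes each platform answers an HTTP GET on a platform "origin" path.  The host names, ports, request path and the `garygeeke` user id are all assumptions made for illustration, so adjust them to match your environment.

```python
from typing import Callable, Dict
from urllib.error import URLError
from urllib.request import urlopen

# Hypothetical platform URLs: adjust hosts and ports to match your environment.
PLATFORMS: Dict[str, str] = {
    "CorePlatform": "http://localhost:8080",
    "DataLakePlatform": "http://localhost:8081",
    "Dev Platform": "http://localhost:8082",
}

# Assumed path of the platform "origin" request (check your Egeria release).
ORIGIN_PATH = "/open-metadata/platform-services/users/{user}/server-platform/origin"


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if an HTTP GET to url answers with status 200."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (URLError, OSError):
        # Connection refused, timeout, TLS failure and HTTP errors all land here.
        return False


def check_platforms(
    platforms: Dict[str, str],
    user: str = "garygeeke",  # assumed user id
    probe: Callable[[str], bool] = http_probe,
) -> Dict[str, bool]:
    """Probe every platform and record whether it responded."""
    return {
        name: probe(root + ORIGIN_PATH.format(user=user))
        for name, root in platforms.items()
    }


def report(status: Dict[str, bool]) -> None:
    """Print one line per platform, then a completion marker."""
    for name, running in status.items():
        print(f"{name}: {'running' if running else 'NOT running'}")
    print("Done.")
```

Calling `report(check_platforms(PLATFORMS))` prints one line per platform.  The `probe` parameter lets you swap in a different check, for example one that accepts the self-signed certificates used by some lab setups.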

## Configuring the servers (one time task)

Once the platforms are running, it is necessary to configure the Egeria servers that run on the platforms.
If the `%run common/environment-check.ipynb` command showed that the servers are not configured,
run through the [Configuring Egeria Servers Lab](./egeria-server-config.ipynb) to create the configuration for the
servers.

The details of this configuration lab may be of interest to IT and operations audiences who want to understand how Egeria is configured;
other readers may simply want to run all of the cells without spending much time reading through the detail of what they are doing.

Once the servers are configured, you are ready to start the labs.
The configuration is used each time you start the servers, so the configuration lab should only need to be run once, no matter how many of
the other labs you run.

## Choosing the hands-on labs to run

Egeria is a complex project with many options as it is attempting to integrate the huge variety of technology that runs in organizations today.  To make it easy for you to learn about the parts of Egeria that interest you, the labs are divided into topics focused around different personas at Coco Pharmaceuticals.

The different topics are as follows:

* **Administration Labs** - follow Gary Geeke (IT Infrastructure Lead) as he demonstrates how to manage the Egeria platforms and servers.
* **Asset Management Labs** - follow Peter Profile (Information Analyst) and Erin Overview (Information Architect) as they build and maintain a catalog of Coco Pharmaceuticals' assets.  Other colleagues, such as Callie Quartile (Data Scientist), then use the catalog to locate and use the assets they need.
* **Information Architecture Labs** - follow Erin Overview as she works to set up standards around the
information used in Coco Pharmaceuticals.
* **Conformance Testing Labs** - follow Polly Tasker as she runs Egeria's conformance test suite to ensure a metadata server conforms to the Egeria open metadata protocols.
* **Asset UI Labs** - follow Callie Quartile as she utilises Egeria's UI to perform investigative operations in order to understand the metadata flow better.

The sections that follow provide more information on these labs.

## The Administration Labs

<img src="https://raw.githubusercontent.com/odpi/data-governance/master/docs/coco-pharmaceuticals/personas/gary-geeke.png" style="float:left">

Coco Pharmaceuticals is going through a major business transformation that requires them to drastically reduce their cycle times, collaborate laterally across the different parts of the business and react quickly to the changing needs of their customers. (See [https://opengovernance.odpi.org/coco-pharmaceuticals/](https://opengovernance.odpi.org/coco-pharmaceuticals/) for the background to this transformation.)

Part of the changes needed to the IT systems that support the business is the roll out of a distributed metadata and governance capability that is provided by ODPi Egeria.

[Gary Geeke](https://opengovernance.odpi.org/coco-pharmaceuticals/personas/gary-geeke.html) is the IT Infrastructure leader at Coco Pharmaceuticals.

We first meet Gary in the labs to configure and start the Egeria OMAG servers.  In the administration
labs, Gary is exploring the OMAG Server Platforms and the servers' configurations and operations in more detail.

The administration labs (under **administration-labs**) include the following activities:

* [Managing Servers](./administration-labs/managing-servers.ipynb) covers how to start, restart and stop servers on the OMAG Server Platforms.
* [Understanding Server Configuration](./administration-labs/understanding-server-config.ipynb) takes a deeper look at the contents of a **Configuration Document** that controls which services are activated in an Egeria server.
* [Understanding Platform Services](./administration-labs/understanding-platform-services.ipynb) shows the commands to query the servers running on a platform plus other platform services.
* [Understanding Cohorts](./administration-labs/understanding-cohorts.ipynb) shows how to query a server to discover the open metadata repository cohorts that it is connected to and the other servers in that cohort.  A server's
cohort membership determines how much metadata it has access to.
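
To give a flavour of the kind of calls these labs issue, here is a hedged sketch that only builds request descriptions and sends nothing.  The URL paths are assumptions based on the general shape of Egeria's REST APIs and may differ between releases; the platform URL, the `garygeeke` user id and the `cocoMDS1` server name are hypothetical values.  The labs themselves show the actual calls.

```python
from dataclasses import dataclass
from typing import Tuple

Request = Tuple[str, str]  # (HTTP method, URL)


@dataclass(frozen=True)
class PlatformRequests:
    """Builds (method, URL) pairs for one platform; it sends no requests."""

    platform_url: str
    user: str

    def _admin(self, server: str) -> str:
        # Assumed root of the administration services for one server.
        return (
            f"{self.platform_url}/open-metadata/admin-services"
            f"/users/{self.user}/servers/{server}"
        )

    def start_server(self, server: str) -> Request:
        return ("POST", self._admin(server) + "/instance")

    def stop_server(self, server: str) -> Request:
        return ("DELETE", self._admin(server) + "/instance")

    def get_configuration(self, server: str) -> Request:
        return ("GET", self._admin(server) + "/configuration")

    def active_servers(self) -> Request:
        # Assumed platform-services query for the servers running on the platform.
        return (
            "GET",
            f"{self.platform_url}/open-metadata/platform-services"
            f"/users/{self.user}/server-platform/servers/active",
        )


# Hypothetical values for illustration only.
core = PlatformRequests("http://localhost:8080", "garygeeke")
method, url = core.start_server("cocoMDS1")
print(method, url)
```

Keeping URL construction separate from sending makes it easy to inspect a request before issuing it with whichever HTTP client you prefer.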


## The Asset Management Labs

<img src="https://raw.githubusercontent.com/odpi/data-governance/master/docs/coco-pharmaceuticals/personas/peter-profile.png" style="float:left">
<img src="https://raw.githubusercontent.com/odpi/data-governance/master/docs/coco-pharmaceuticals/personas/erin-overview.png" style="float:right">

As part of the widespread business transformation that Coco Pharmaceuticals has embarked on, they
have created a data lake for managing data for research, analytics, and exchange between their internal organizations and business partners (such as hospitals).  As a result, the data lake has to be designed to handle a wide variety of data, including some highly sensitive and regulated data.

In the asset management labs (under **asset-management-labs**) we look at how different roles work with the data lake and how Egeria's open metadata is captured, maintained and used.

The first lab called [Building a Data Catalog](./asset-management-labs/building-a-data-catalog.ipynb) explains how data is cataloged and used in the data lake.  The two main characters engaged in the first part of this lab are
[Peter Profile](https://opengovernance.odpi.org/coco-pharmaceuticals/personas/peter-profile.html) and
[Erin Overview](https://opengovernance.odpi.org/coco-pharmaceuticals/personas/erin-overview.html).

<img src="https://raw.githubusercontent.com/odpi/data-governance/master/docs/coco-pharmaceuticals/personas/callie-quartile.png" style="float:right">

Finally, we show how visibility of the catalog is controlled through **governance zones**.
This involves [Callie Quartile](https://opengovernance.odpi.org/coco-pharmaceuticals/personas/callie-quartile.html), a data scientist, who is keen to get access to the data that Peter and Erin are cataloging.
She is unable to see the data until Peter and Erin have completed setting up the catalog entries for it.
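
The idea behind governance zones can be illustrated with a toy model.  This is a sketch for intuition only and not Egeria's implementation: it assumes each asset carries a set of zone names and that a user sees an asset only when it shares at least one zone with the zones their view supports.  The zone names and file names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Asset:
    name: str
    zones: Set[str] = field(default_factory=set)


def visible_assets(catalog: List[Asset], supported_zones: Set[str]) -> List[str]:
    """Return the names of assets that share at least one zone with the view."""
    return [asset.name for asset in catalog if asset.zones & supported_zones]


# Newly arrived files start in a hypothetical "quarantine" zone.
catalog = [
    Asset("week1.csv", {"quarantine"}),
    Asset("week2.csv", {"quarantine"}),
]

# Assumption: Callie's view only supports the "data-lake" zone.
callie_zones = {"data-lake"}
print(visible_assets(catalog, callie_zones))  # prints []

# Once the catalog entry is complete, the asset is moved to "data-lake".
catalog[0].zones = {"data-lake"}
print(visible_assets(catalog, callie_zones))  # prints ['week1.csv']
```

In this model, nothing about the file itself changes when it becomes visible; only its zone membership does.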

The [Automated Curation Lab](./asset-management-labs/improving-data-quality.ipynb) shows Peter and Erin automating
the cataloguing procedure described in **Building a Data Catalog** so that files are made available to Callie as
soon as they arrive.

The [Improving Data Quality Lab](./asset-management-labs/improving-data-quality.ipynb) shows Peter setting up a
new governance engine to validate the quality of measurements coming in from the hospitals
in support of the clinical trials.  This is incorporated into the automated process set up in **Automated Curation**.

The [Understanding an Asset Lab](./asset-management-labs/understanding-an-asset.ipynb)
has Callie demonstrating how to use the metadata from the catalog to help her with her data science.

The [Open Lineage Lab](./asset-management-labs/open-lineage.ipynb) shows how Peter and Erin can use Egeria to capture and access data lineage information for the data assets in the organization.

## The Information Architecture Labs

In the first information architecture lab,
[Working with Standard Models](infomation-architecture/working-with-standard-models.ipynb), Erin Overview is using the standard
[Cloud Information Model](https://cloudinformationmodel.org/about.html)
as a set of standard terms to clarify the new sales procedures needed for their personalized medicine business.


## The Conformance Testing Labs

In the [Conformance Test Suite Lab](./conformance-testing-labs/run-conformance-test-suite.ipynb), we show how to run the conformance test suite against different technologies.

## The Asset UI Labs

In the [Asset Search Lab](./ui-labs/ui-asset-search.ipynb), we show how to create a few simple assets, and subsequently search for them in the Egeria UI.

----