This repository contains the material for the SAP TechEd 2020 workshop with Session ID DAT263. These tutorials can be completed as part of an SAP-guided workshop or in your own time using your own SAP Data Intelligence instance.
This session introduces participants to the SAP Data Intelligence Modeler and shows how to use it to create data pipelines. We try to cover as many aspects as possible within an interactive 2-hour workshop, following a use case based on a customer request in the area of IoT and quality management. The background story is quite simple.
If you are doing these tutorials as part of a workshop, please follow the 2-hour tutorials.
If you are doing these tutorials on your own time, please follow the 3-hour tutorials, which include two additional exercises: file concatenation and Jupyter Notebook analysis.
Every day, a customer receives the configured values of several IoT devices, which reflect the nominal value each device should produce. We refer to this as the configuration dataset. Throughout the day, actual performance values of each device are received; we refer to this dataset as the performance dataset. All datasets are stored as files in separate subdirectories of an object store, e.g. an Amazon S3 bucket.
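For illustration only, the two datasets could look roughly like the sketch below. The file layout, column names, and values shown here are assumptions and are not taken from the actual workshop files.

```python
import pandas as pd

# Hypothetical object store layout (paths are illustrative only):
#   s3://<bucket>/configuration/config_2020-12-01.csv
#   s3://<bucket>/performance/perf_2020-12-01_08-00.csv
#   s3://<bucket>/performance/perf_2020-12-01_12-00.csv

# Configuration dataset: one nominal value per device, delivered once per day
configuration = pd.DataFrame({
    "DEVICE":        ["D-001", "D-002"],
    "NOMINAL_VALUE": [100.0, 250.0],
})

# Performance dataset: actual values received for each device throughout the day
performance = pd.DataFrame({
    "DEVICE":    ["D-001", "D-001", "D-002"],
    "TIMESTAMP": pd.to_datetime(["2020-12-01 08:00", "2020-12-01 12:00", "2020-12-01 09:30"]),
    "VALUE":     [98.7, 101.2, 243.9],
})

print(configuration)
print(performance)
```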
The tasks derived from this scenario are as follows:

- Append all configuration files and all performance files into corresponding single files and store them in another object store location (3-hour tutorials only).
- Merge the two resulting files into an SAP HANA table using projections, aggregation, and a join (illustrated in the sketch after this list).
- Do a simple data validation and create a quality management service ticket for the failed data.
- To improve the quality check, a data scientist should be able to analyze the IoT data in order to eventually develop an early-alert schema (3-hour tutorials only).
- The central device configuration and performance table should be exposed via a web service so that the device status can be retrieved from outside.
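As a rough illustration of the merge and validation steps above, the following pandas sketch shows the kind of transformation the pipeline performs. It is not the workshop implementation (the exercises build this with Modeler operators and write the result to SAP HANA); the column names and the 10% tolerance are assumptions.

```python
import pandas as pd

# Hypothetical example data; in the pipeline, these come from the concatenated
# configuration and performance files in the object store.
configuration = pd.DataFrame({
    "DEVICE":        ["D-001", "D-002"],
    "NOMINAL_VALUE": [100.0, 250.0],
})
performance = pd.DataFrame({
    "DEVICE": ["D-001", "D-001", "D-002"],
    "VALUE":  [98.7, 101.2, 212.4],
})

# Aggregation: average actual value per device
avg_performance = performance.groupby("DEVICE", as_index=False)["VALUE"].mean()

# Projection + join: combine nominal and actual values per device
# (in the exercises, the joined result is written to an SAP HANA table)
merged = configuration.merge(avg_performance, on="DEVICE", how="inner")

# Simple validation: flag devices whose average value deviates from the
# nominal value by more than an assumed 10% tolerance
merged["DEVIATION_PCT"] = (
    (merged["VALUE"] - merged["NOMINAL_VALUE"]).abs() / merged["NOMINAL_VALUE"] * 100
)
failed = merged[merged["DEVIATION_PCT"] > 10]

# In the workshop scenario, each failed row would trigger a quality
# management service ticket.
print(failed)
```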
After completing all the tasks, you will be familiar with the general concept of using operators in the SAP Data Intelligence Modeler, know how to read and ingest data from and to multiple data sources, and know how to analyze this data using a Jupyter Notebook.
To complete the exercises, you need the following:

- One of the following SAP Data Intelligence versions:
  - SAP Data Intelligence 3.1 On-premise edition, patch 0
  - SAP Data Intelligence 3.1 Trial Edition
  - SAP Data Intelligence Cloud Edition 2010 or newer
- Chrome browser (recommended)
- [Workshop participants only] Login credentials for your SAP Data Intelligence Cloud instance
  - See the registration page.
- [Self-guided users only] The following connections must be created in the Connection Manager:
  - A cloud storage connection, e.g. S3 / GCS / WASB / ADL
  - A Smart Data Lake (SDL) connection
  - An SAP HANA database connection

Note that the above connections are already predefined in the SAP Data Intelligence 3.1 Trial Edition.
If you do not have access to an instance of SAP Data Intelligence, or if you just want to review the tutorials, you can watch a video walkthrough on [SAP HANA Academy](https://www.youtube.com/playlist?list=PLkzo92owKnVyY89xEshp_cSQ0QF8EE927).
2-hour workshop track:

- Getting Started
- Exercise 1 - Joining and writing workflow data to SAP HANA
- Exercise 2 - Running a simple data validation
- Exercise 3 - Create a REST API receiving data from devices (simulation)

3-hour self-guided track:

- Getting Started (Self-guided)
- Exercise 1 - Appending multiple source files to a single file
- Exercise 2 - Joining and writing workflow data to SAP HANA
- Exercise 3 - Running a simple data validation
- Exercise 4 - Analyse data with Jupyter Notebook
- Exercise 5 - Create a REST API receiving data from devices (simulation)
Support for the content in this repository is available during the time of the online session for which this content was designed. Otherwise, you may request support via the Issues tab.
Copyright (c) 2021 SAP SE or an SAP affiliate company. All rights reserved. This file is licensed under the Apache Software License, version 2.0 except as noted otherwise in the LICENSE file.