Cloud Training and Cloud Inference on Amazon Sagemaker/Elastic Kubernetes Service

This is a workflow to demonstrate how users utilize the Intel’s hardware (Cascade Lake or above) and related optimized software to perform cloud training and inference on Amazon Sagemaker platform. The main software packages used here are: HuggingFace, Amazon SageMaker, Intel Optimized TensorFlow and Intel Neural Compressor

Implementation Details

This workflow ultilizes the infracsture provided by the Amazon. There are two versions provided by this workflow - EKS and SageMaker.

Architecture

Sagemaker:

EKS:

Model Spec

The uncased BERT base model is used to demonstrate this workflow.

bert-base-uncased-config = {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 128,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "transformers_version": "4.21.1",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}

Dataset

Microsoft Research Paraphrase Corpus is used as the dataset for training and testing.

Type	Format	Rows
Training Dataset	HuggingFace Dataset	3668
Testing Dataset	HuggingFace Dataset	1725

Getting Started

Prerequisites

Hardware

Name	Description
CPU	Intel CPU - Cascade Lake or above
Usable RAM	16 GB
Disk Size	256 GB

Software

Docker is required to start this workflow. You will also need AWS credentials and the related AWS CLI installed on the machine to push data/docker image to the Amazon ECR.

For setting up the AWS credential account, you may refer to the following link:
https://aws.amazon.com/account/

For configuring the AWS credentials, you may refer to the following link:
https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html

Installation

Clone the repository

git clone https://github.com/intel/NLP-Workflow-with-AWS.git
cd NLP-Workflow-with-AWS

Build the docker image to prepare the environment for running the Juypter notebooks.

cd scripts
sh build_main_image.sh

Usage

Use the built docker image (by build_main_image.sh) to run the Juypter notebooks. Execute the following command:

cd scripts
sh start_script.sh

Sagemaker/EKS Training

After starting the container (i.e. executing the step in 'Usage'), execute the following command in the interactive shell.

cd /root/notebooks
jupyter notebook --allow-root

Start the notebook that is named as training and eks/sagemaker.

Sagemaker/EKS Inference

After starting the container (i.e. executing the step in 'Usage'), execute the following command in the interactive shell.

cd /root/notebooks
jupyter notebook --allow-root

Start the notebook that is named as inference and eks/sagemaker.

License

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
assets		assets
notebooks		notebooks
scripts		scripts
src		src
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

assets

assets

notebooks

notebooks

scripts

scripts

src

src

Dockerfile

Dockerfile

LICENSE

LICENSE

README.md

README.md

Repository files navigation

Cloud Training and Cloud Inference on Amazon Sagemaker/Elastic Kubernetes Service

Table of Contents

Implementation Details

Architecture

Model Spec

Dataset

Getting Started

Prerequisites

Hardware

Software

Installation

Usage

Sagemaker/EKS Training

Sagemaker/EKS Inference

License

About

Releases 2

Packages

Contributors 2

Languages

License

intel/NLP-Workflow-with-AWS

Folders and files

Latest commit

History

Repository files navigation

Cloud Training and Cloud Inference on Amazon Sagemaker/Elastic Kubernetes Service

Table of Contents

Implementation Details

Architecture

Model Spec

Dataset

Getting Started

Prerequisites

Hardware

Software

Installation

Usage

Sagemaker/EKS Training

Sagemaker/EKS Inference

License

About

Resources

License

Security policy

Stars

Watchers

Forks

Languages