
Attention Network for Text Summarization on Kubeflow

Presented at the O'Reilly Artificial Intelligence Conference, London: "Deep learning and attention networks all the way to production" https://conferences.oreilly.com/artificial-intelligence/ai-eu/public/schedule/detail/78072

Authors

  1. Dr. Vijay Agneeswaran
  2. Pramod Singh
  3. Akshay Kulkarni

Highlights of the session:

  1. Overview of attention networks (what and why); see the sketch after this list
  2. Set up the GCP environment
  3. Attention networks for text summarization
  4. How to leverage Kubeflow for industrialization
    1. Set up Kubeflow on GCP
    2. Use TensorFlow 2.0 to create an attention network
    3. Use the Kubeflow notebook server for training
    4. Containerize the model for training at scale
    5. Save the trained model to GCS
  5. Challenges and future work
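
As background for the first highlight, here is a minimal sketch of a Bahdanau-style additive attention layer in TensorFlow 2.0. The class name, layer sizes, and variable names are illustrative assumptions, not the exact network built in the session notebooks.

import tensorflow as tf

class BahdanauAttention(tf.keras.layers.Layer):
    """Additive attention: scores every encoder step against the decoder state."""

    def __init__(self, units):
        super(BahdanauAttention, self).__init__()
        self.W1 = tf.keras.layers.Dense(units)  # projects encoder outputs
        self.W2 = tf.keras.layers.Dense(units)  # projects the decoder state
        self.V = tf.keras.layers.Dense(1)       # collapses to one score per step

    def call(self, query, values):
        # query: decoder hidden state, shape (batch, hidden)
        # values: encoder outputs, shape (batch, src_len, hidden)
        query_with_time_axis = tf.expand_dims(query, 1)
        score = self.V(tf.nn.tanh(self.W1(values) + self.W2(query_with_time_axis)))
        weights = tf.nn.softmax(score, axis=1)             # attention distribution
        context = tf.reduce_sum(weights * values, axis=1)  # weighted source summary
        return context, weights

In a summarizer, the context vector is concatenated with the decoder input at each step, letting the decoder focus on different parts of the source text while generating the summary.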

Research Papers

  1. Pascal Vincent, Hugo Larochelle, Isabelle Lajoie, Yoshua Bengio, and Pierre-Antoine Manzagol, “Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion”

  2. Alan Akbik, Duncan Blythe, and Roland Vollgraf, “Contextual String Embeddings for Sequence Labeling”

Step-by-Step Guide for Running the Attention Network on Kubeflow

  1. Access the code: clone the GitHub repo into your Google Cloud Shell
git clone https://github.com/pramodsinghwalmart/AI_Conf_London.git

  2. Navigate to the code directory
cd AI_Conf_London

  3. Set the current working directory
WORKING_DIR=$(pwd)

  4. Set up Kubeflow on GCP

Make sure the gcloud SDK is installed and pointing to the right GCP project. You can use 'gcloud init' to do this.

gcloud components install kubectl

  5. Set up environment variables
export PROJECT=<PROJECT_ID>

export DEPLOYMENT_NAME=kubeflow

export ZONE=us-central1-a

gcloud config set project ${PROJECT}

gcloud config set compute/zone ${ZONE}

  6. Use the one-click deploy interface provided by GCP to set up Kubeflow via https://deploy.kubeflow.cloud/#/ . Fill in the deployment name (kubeflow) and the project ID, and select an appropriate GCP zone (us-central1-a). You can select "Login with username and password" to access the Kubeflow service. Once the deployment is complete, you can connect to the cluster.

  7. Connect to the cluster

gcloud container clusters get-credentials \
    ${DEPLOYMENT_NAME} --zone ${ZONE} --project ${PROJECT}

  8. Set the kubectl context to the kubeflow namespace and list the cluster resources
kubectl config set-context $(kubectl config current-context) --namespace=kubeflow

kubectl get all

  9. Install kustomize. Kubeflow uses kustomize to help manage deployments. Version 2.0.3 of kustomize is already available in the kustomize folder. This tutorial does not work with later versions of kustomize, due to kustomize issue 1295.
cd kustomize

mv kustomize_2.0.3_linux_amd64 kustomize

chmod u+x kustomize

cd ..

Add the kustomize command to the PATH

PATH=$PATH:$(pwd)/kustomize

Check that kustomize is installed properly

kustomize version

  10. Allow Docker to access the GCR registry
gcloud auth configure-docker --quiet

  11. Create a GCS bucket for model storage
cd training/GCS

export BUCKET=${PROJECT}-${DEPLOYMENT_NAME}-bucket

gsutil mb -c regional -l us-central1 gs://${BUCKET}

  12. Build the training image with Docker and push it to GCR
export TRAIN_IMG_PATH=gcr.io/${PROJECT}/${DEPLOYMENT_NAME}-train:latest

Build the TensorFlow model into a container

docker build $WORKING_DIR -t $TRAIN_IMG_PATH -f $WORKING_DIR/Dockerfile

Push the newly built image to GCR

docker push ${TRAIN_IMG_PATH}

  13. Prepare the training component to run on GKE using kustomize

Give the job a name so that you can identify it later

kustomize edit add configmap attention --from-literal=name=attention-training

Configure the custom training image

kustomize edit add configmap attention --from-literal=imagename=gcr.io/${PROJECT}/${DEPLOYMENT_NAME}-train

kustomize edit set image training-image=${TRAIN_IMG_PATH}

Set the training parameters (training steps, batch size, and learning rate). Note: we declare these parameters with kustomize, but none of them are actually used in this tutorial.

kustomize edit add configmap attention --from-literal=trainSteps=200

kustomize edit add configmap attention --from-literal=batchSize=100

kustomize edit add configmap attention --from-literal=learningRate=0.01

Configure the Cloud Storage locations where the model will be saved

kustomize edit add configmap attention --from-literal=modelDir=gs://${BUCKET}

kustomize edit add configmap attention --from-literal=exportDir=gs://${BUCKET}/export
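
For illustration, a hypothetical sketch of how the training script could pick up these two values and write to the bucket. The MODEL_DIR and EXPORT_DIR environment variable names and the stand-in model are assumptions; the repo's actual script may wire the configmap values through differently.

import os
import tensorflow as tf

# Assumed env var names; the modelDir/exportDir configmap values would be
# surfaced to the container by the kustomize-generated manifests.
model_dir = os.environ.get("MODEL_DIR", "/tmp/model")
export_dir = os.environ.get("EXPORT_DIR", "/tmp/export")

# Stand-in model; the real script builds the attention network.
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
model.compile(optimizer="adam", loss="mse")

# With GOOGLE_APPLICATION_CREDENTIALS set (next step), TensorFlow can read
# and write gs:// paths directly.
model.save_weights(os.path.join(model_dir, "ckpt"))  # training checkpoints
tf.saved_model.save(model, export_dir)               # final SavedModel export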

  14. Check the permissions for your training component. You need to ensure that your Python code has the required permissions to read from and write to your Cloud Storage bucket. Kubeflow solves this by creating a user service account within your project as part of the deployment. You can use the following command to list the service accounts for your Kubeflow deployment
gcloud iam service-accounts list | grep ${DEPLOYMENT_NAME}

Kubeflow granted the user service account the necessary permissions to read and write to your storage bucket. Kubeflow also added a Kubernetes secret named user-gcp-sa to your cluster, containing the credentials needed to authenticate as this service account within the cluster

kubectl describe secret user-gcp-sa

To access your storage bucket from inside the training container, you must set the GOOGLE_APPLICATION_CREDENTIALS environment variable to point to the JSON file contained in the secret. Set the variable by passing the following parameters

kustomize edit add configmap attention --from-literal=secretName=user-gcp-sa

kustomize edit add configmap attention --from-literal=secretMountPath=/var/secrets

kustomize edit add configmap attention --from-literal=GOOGLE_APPLICATION_CREDENTIALS=/var/secrets/user-gcp-sa.json

  15. Train the model on GKE

Preview the manifests that kustomize generates

kustomize build .

Apply them to the cluster to start the training job

kustomize build . | kubectl apply -f -

Follow the logs of the training pod

kubectl logs -f attention-training-chief-0

  16. Check the saved model at the GCS bucket export location
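gsutil ls -r gs://${BUCKET}/export

The listing should show the exported SavedModel files under the exportDir configured earlier.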

  17. Clean up resources

Delete the deployment

gcloud deployment-manager --project=${PROJECT} deployments delete ${DEPLOYMENT_NAME}

Delete the docker image

gcloud container images delete gcr.io/$PROJECT/${DEPLOYMENT_NAME}-train:latest

Delete the bucket

gsutil rm -r gs://${BUCKET}
