#In this Course
You will learn:

Overview of SageMaker
Algorithm Implementation
Training and Deploying Models
Performance Tuning of Models
Reinforcement Learning with SageMaker
Ground Truth - Automating Data Labeling
Scaling and Commercializing Models
Authentication, Monitoring, and Security
Limits and Integrating Frameworks

#What is SageMaker?
Amazon SageMaker provides a platform for data scientists and developers to build and train machine learning models and deploy them into a production environment.
It offers instances backed by environments integrated with Jupyter Notebook to help in every step of the Data Science life cycle.
It provides optimized built-in algorithms along with support for external frameworks, which can operate on large amounts of data in a distributed environment.
It is flexible enough for any workflow, and billing is pay-as-you-go, with no commitments or minimum fees.

![alt text](https://docs-cdn.fresco.me/system/attachments/files/004/823/891/large/95bf857c820ed273d46d2d03cc59a8d77c93dc7f/Datascience_lifecycle.jpeg)

SageMaker is used in the entire Data Science Life cycle.
It covers data cleaning, preparation, pre-processing, training, evaluation, and deployment.
Machine Learning is a continuous process; SageMaker facilitates this through inference collection and model retraining with new data.

#How Does This Work?
SageMaker leverages several Amazon/AWS Services to achieve its functionality.
It uses Amazon S3, API, Lambda, EC2, and CloudWatch.
These services are used to store, compute, monitor and serve the applications in real-time within milliseconds.
Considering the Data Science life cycle:
Data can be stored in and fetched from Amazon S3.
Cleaning and pre-processing of data is facilitated through instances integrated with Jupyter Notebooks.
Training can be done on both CPU and GPU instances with pre-loaded or customized algorithms.
Evaluation and inference collection can be done using the AWS SDK or SageMaker SDK and Jupyter Notebooks.
For deployment, models can be re-engineered for application integration or deployed through Hosting Services with simple API calls.
Move over to the next cards to learn it in detail.


#What's Next?
Once an Instance is created,

Data can be imported from S3 using Jupyter Notebook Scripts.
Data exploration, transformation and pre-processing are facilitated through Jupyter Notebooks.
Once the data is prepared, models are trained, deployed, and validated as required.
You can find Step by Step Instructions here for Data Download and Pre-processing, Model Training, Deployment, Validation, Cleaning Up and Considerations.

We will look in depth at training and deployment in the next cards.

#Training a Model
Training a model through SageMaker requires a Training Job.
A Training Job is created using the S3 bucket details of the training data, the Amazon Elastic Container Registry (ECR) path of the training code image, the compute resources, and the output destination.
Once a training job completes, the resulting model artifacts are stored as output files in the S3 bucket specified as the output destination.
A Training Job can be triggered from the Management Console, but care must be taken in choosing resources so that memory issues do not cause the job to fail.
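
The ingredients of a Training Job listed above can be sketched as a single request dictionary. This is only an illustration: the field names follow the shape of the SageMaker `CreateTrainingJob` API as commonly documented, while the helper name `build_training_job_request`, the bucket, the image URI, and the role ARN are hypothetical placeholders.

```python
import json


def build_training_job_request(job_name, image_uri, role_arn,
                               train_s3_uri, output_s3_uri,
                               instance_type="ml.m4.xlarge"):
    """Assemble the inputs a training job needs into one request dict."""
    return {
        "TrainingJobName": job_name,
        # Training code is packaged as a Docker image stored in Amazon ECR.
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        # Where the training data lives in S3.
        "InputDataConfig": [{
            "ChannelName": "train",
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",
                    "S3Uri": train_s3_uri,
                    "S3DataDistributionType": "FullyReplicated",
                }
            },
        }],
        # Where model artifacts are written when the job finishes.
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        # Compute resources for the job.
        "ResourceConfig": {
            "InstanceType": instance_type,
            "InstanceCount": 1,
            "VolumeSizeInGB": 10,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }


# All values below are placeholders, not real resources.
request = build_training_job_request(
    job_name="demo-training-job",
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/my-algo:latest",
    role_arn="arn:aws:iam::123456789012:role/DemoSageMakerRole",
    train_s3_uri="s3://demo-bucket/train/",
    output_s3_uri="s3://demo-bucket/output/",
)
print(json.dumps(request, indent=2))
```

Assuming `boto3` is installed and credentials are configured, a dictionary of this shape could be passed as `boto3.client("sagemaker").create_training_job(**request)`.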

#Deploying a Model
A model can be deployed using Hosting Services.
With this method, once a model is created, an HTTPS endpoint can be configured and created.
Different versions of endpoints can be created and updated as per requirements.
Once a model is deployed, predictions can be obtained one at a time through the endpoint or in bulk through batch transform.
Batch transform is preferred for generating inferences on entire datasets when real-time processing is not needed.
It is made possible through integration with Amazon S3.

#Validating Models
Validation can be done in two ways:

Offline: Using "hold-out" datasets, where some part of the training data is retained and used for evaluation. Typically 20-30% is kept aside for this practice.

Online: In this method, multiple models are deployed behind the same endpoint, and the endpoint is configured to send small samples of live data to each model as specified. Once each model is evaluated, the chosen model can be configured to take all the traffic.
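
The online method can be illustrated with a small calculation. The sketch below assumes that an endpoint splits traffic in proportion to a weight assigned to each deployed model; the function `traffic_shares` and the model names are hypothetical.

```python
def traffic_shares(variant_weights):
    """Return the fraction of requests each model variant receives.

    Assumes traffic is split in proportion to each variant's weight.
    """
    total = sum(variant_weights.values())
    if total <= 0:
        raise ValueError("at least one variant needs a positive weight")
    return {name: weight / total for name, weight in variant_weights.items()}


# The current model keeps most traffic while a candidate model is
# evaluated on a small sample of live requests.
shares = traffic_shares({"current-model": 9.0, "candidate-model": 1.0})
print(shares)

# After evaluation, all traffic is routed to the chosen model.
final = traffic_shares({"current-model": 0.0, "candidate-model": 1.0})
print(final)
```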

SageMaker Console is the preferred method to perform the tasks related to training, deployment, and validation.


#Instances
Instances in SageMaker are EC2 instances running the Jupyter Notebook application. Such an instance is generally called a SageMaker Notebook Instance.
Jupyter Notebook can be considered as a REPL environment for code execution where ML models can be built, trained, evaluated, and deployed.
Creating a notebook instance performs the following steps:
Creates a Virtual Private Cloud (VPC) subnet - Used by default to access the instance.
Launches an ML Compute Instance - The notebook instance runs inside the VPC subnet.
Installs Anaconda Packages - Common Anaconda packages and deep learning packages such as TensorFlow and Apache MXNet are installed in the instance.
Creates a Storage Volume - From 5 GB (default) up to 16 TB of storage, in 1 GB increments.
Copies Jupyter Notebook Samples - Working sample notebooks are loaded into the instance.


#Interacting with Instances
Instances can be accessed via Management Console or through API.
Access can be secured via IP Filter or VPC Interface.
Example notebooks are available in the instance as well as on GitHub to help understand the working of SageMaker.
Kernel support is provided for Python 2/3, TensorFlow, PySpark and MXNet by default.
Kernels can be installed for Scala, R, and Theano.
They can further be integrated with Git Repositories.

#Instance Selection
EC2 instance types must be specified to run these training algorithms.
Recommended instances are:
ml.m4.xlarge, ml.m4.4xlarge, and ml.m4.10xlarge
ml.c4.xlarge, ml.c4.2xlarge, and ml.c4.8xlarge
ml.p2.xlarge, ml.p2.8xlarge, and ml.p2.16xlarge
Depending on the algorithm, GPU instances can also be used but may incur additional costs.
A more cost-effective option can be selected later, according to scale.

#Algorithm Selection
To build a model, SageMaker provides algorithms in three ways.
Built-In Algorithms: These algorithms are already present in SageMaker and have been tested extensively for performance and accuracy. SageMaker provides a hyperparameter tuning service which can further improve the performance of these algorithms without much effort.
Custom Algorithms: These algorithms are brought in from Git or built on the SageMaker platform. They are custom-built for a specific problem when the built-in algorithms are not suitable.
AWS Marketplace: Algorithms can be subscribed to or purchased from their respective publishers in AWS Marketplace if they satisfy our needs.


#Built-In Algorithms
SageMaker provides 17 built-in algorithms, including algorithms for classification, text processing, image processing, clustering, etc.
Based on the algorithm selected, it can be trained, tested, and validated accordingly.
For text-based algorithms, the csv/text data format is used for training.
Training data can be collected using Amazon Athena, AWS Glue, Amazon RDS, Amazon EMR, and Amazon Redshift, but it must be pre-processed before the training job is started.
"Hold-out" datasets for fine-tuning models can be specified while training is initiated.
Training Logs can be monitored via Amazon CloudWatch.


#Available Built-in Algorithms
| Algorithm | Usage |
|---|---|
| BlazingText | NLP |
| DeepAR Forecasting | Forecasting scalar time series |
| Factorization Machines | Classification and regression |
| Image Classification | Supervised image classification |
| IP Insights | Unsupervised IPv4 classification |
| K-Means | Unsupervised clustering |
| K-Nearest Neighbors | Non-parametric classification/regression |
| Latent Dirichlet Allocation | Text processing |
| Linear Learner | Classification/regression |
| Neural Topic Model | Unsupervised text processing |
| Object2Vec | General-purpose neural embedding |
| Object Detection | Object detection and classification in images |
| Principal Component Analysis | Unsupervised feature reduction |
| Random Cut Forest | Unsupervised anomaly detection |
| Semantic Segmentation | Computer vision |
| Sequence-to-Sequence | Machine translation |
| XGBoost | Classification/regression |


#Custom Algorithms
SageMaker algorithms are packaged as Docker images.
This provides flexibility for both built-in and custom algorithms.
Custom algorithms can be implemented in multiple ways:
Using custom inference code with built-in algorithms.
Using custom algorithms with SageMaker inference code.
Using custom inference code and algorithms packaged as a Docker image.
Writing code in a Jupyter Notebook instance using the frameworks supported by SageMaker.


#Extending Custom Algorithms
Once created, custom algorithms can be extended further as:
Algorithm Resource: The algorithm, along with optional inference code, can be turned into an Algorithm Resource, which can be used for training jobs. It can also be published in AWS Marketplace to monetize it.
Model Package Resource: From inference code and/or an algorithm resource, together with the artifact location, a Model Package Resource can be created and published, which can be used directly to create deployable models.
These resources can either be monetized, or we can use/subscribe to resources of others through AWS Marketplace.

#From AWS Marketplace
Algorithm resources can be used to create training jobs, hyperparameter tuning, and model package resources.
Model package resources can be used to create models, publish endpoints through hosting services and perform batch transform/live inference jobs.


#Training Metrics
In the previous topics, we saw how training jobs are run once an algorithm is selected and how the resulting models can be validated.

Once a training job is run, the details such as training error and prediction accuracy are sent as logs to Amazon CloudWatch, where they can be analyzed and visualized. They can be monitored to analyze model performance.

Metrics are pre-defined for built-in algorithms. However, for custom algorithms, you must define what must be sent as metrics through the Management Console or SDK.

SageMaker uses regular expressions to extract metric values from the logs.
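
How a regular expression picks a metric out of a log line can be sketched in plain Python. The metric names, the regex patterns, and the log lines below are made up for illustration; a definition is assumed to be a name plus a pattern whose first capture group holds the value.

```python
import re

# Hypothetical metric definitions: a name plus a regular expression whose
# first capture group holds the metric value.
metric_definitions = [
    {"Name": "train:error", "Regex": r"Train_error=([0-9.]+)"},
    {"Name": "validation:accuracy", "Regex": r"Valid_acc=([0-9.]+)"},
]

# Made-up log lines standing in for what a training container might print.
log_lines = [
    "epoch=1 Train_error=0.42 Valid_acc=0.81",
    "epoch=2 Train_error=0.31 Valid_acc=0.86",
]


def extract_metrics(lines, definitions):
    """Collect every value matched by each metric's regex, in log order."""
    found = {d["Name"]: [] for d in definitions}
    for line in lines:
        for d in definitions:
            match = re.search(d["Regex"], line)
            if match:
                found[d["Name"]].append(float(match.group(1)))
    return found


metrics = extract_metrics(log_lines, metric_definitions)
print(metrics)
```

For a custom algorithm, the design choice is mostly about logging: the training code has to print each metric in a stable, machine-readable form so that one pattern per metric is enough.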

You can find more on this, here.

#Incremental Training
Incremental training lets you reuse existing artifacts, or part of them, along with an expanded dataset. It should be considered when inference accuracy decreases over time.
It saves time and resources, because training need not start from scratch when parts of public artifacts or our own artifacts can be reused.
It can be used to continue a paused/stopped training job, and it enables training with hyperparameter tuning or different datasets while saving resources and remaining cost-effective.
It can be done via either the Management Console or the SDK.

#Including Metadata
To include Metadata with the dataset for the training job, an Augmented Manifest file must be used.
This file must also be stored in S3, but the file type must be specified beforehand.
Also, the file must be in JSON Lines format.
This can be performed either via the Management console or SDK.
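
The JSON Lines requirement means one complete JSON object per line. A minimal sketch follows, assuming each record pairs a data object in S3 (under the conventional `source-ref` key) with a metadata attribute; the attribute name `class` and the S3 paths are hypothetical.

```python
import json

# Hypothetical records: each line pairs a data object in S3 with metadata.
records = [
    {"source-ref": "s3://demo-bucket/images/cat.jpg", "class": 0},
    {"source-ref": "s3://demo-bucket/images/dog.jpg", "class": 1},
]


def to_json_lines(items):
    """Serialize one JSON object per line (JSON Lines)."""
    return "\n".join(json.dumps(item) for item in items) + "\n"


manifest_text = to_json_lines(records)
print(manifest_text, end="")

# Reading it back: every non-empty line parses as its own JSON object.
parsed = [json.loads(line) for line in manifest_text.splitlines() if line]
```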

#Hyperparameter Tuning
Hyperparameters are parameters of an algorithm that are set before training starts; at a high level, they determine the capacity and learning speed of the model.
Each algorithm can have multiple hyperparameters, each with its own range of values.
SageMaker supports automatic Hyperparameter tuning, where multiple models are trained by varying the hyperparameters within specified ranges to find the best possible model.
It is supported for both built-in and custom algorithms, but for custom algorithms the metrics must be specified explicitly.

#How Does It Work?
You must specify the metric that you are trying to improve through Hyperparameter tuning before using it in SageMaker.
For built-in algorithms, the metrics do not need to be defined, as they are pre-defined.
In SageMaker, Bayesian Optimization is used for Hyperparameter tuning.
An initial set of hyperparameter values is tested, and the results are modeled through regression to choose the next, most promising combination of hyperparameters.
Sometimes it explores values that have not been tried yet, in search of a possible improvement.
Hyperparameter tuning improves productivity by searching for the best values, but it cannot always improve the model, particularly for complex algorithms or scenarios.


#Working with Hyperparameter Tuning
First, we must choose the metrics to be monitored (up to 20), one of which is specified as the objective to improve.
Ranges for the hyperparameters must be specified to run the jobs.
The best practices include:
Selecting a limited number of hyperparameters.
Limiting the ranges of hyperparameters.
Specifying whether parameters are linear or log scaled.
Reducing compute resources through an optimal choice of concurrently running jobs.
Using multiple instances to distribute the tuning task.
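
The practices above can be sketched as a tuning configuration. The key names follow the shape of the SageMaker `HyperParameterTuningJobConfig` structure as commonly documented; the metric name, the hyperparameter names, and all values are illustrative assumptions, and `tuned_parameter_names` is a hypothetical helper.

```python
tuning_job_config = {
    "Strategy": "Bayesian",
    # The single metric the tuning job tries to improve.
    "HyperParameterTuningJobObjective": {
        "Type": "Maximize",
        "MetricName": "validation:auc",
    },
    # A cap on total jobs and on how many run concurrently.
    "ResourceLimits": {
        "MaxNumberOfTrainingJobs": 20,
        "MaxParallelTrainingJobs": 2,
    },
    # A small number of hyperparameters, each with a narrow range and an
    # explicit linear or logarithmic scale.
    "ParameterRanges": {
        "ContinuousParameterRanges": [
            {"Name": "eta", "MinValue": "0.01", "MaxValue": "0.5",
             "ScalingType": "Logarithmic"},
        ],
        "IntegerParameterRanges": [
            {"Name": "max_depth", "MinValue": "3", "MaxValue": "8",
             "ScalingType": "Linear"},
        ],
    },
}


def tuned_parameter_names(config):
    """List the hyperparameters that the tuning job is allowed to vary."""
    ranges = config["ParameterRanges"]
    return sorted(r["Name"] for group in ranges.values() for r in group)


print(tuned_parameter_names(tuning_job_config))
```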


#Pause, Continue, and Stop
Warm Start Hyperparameter Tuning jobs use the previous point of tuning as a start point.
Although they take more time, they can help with iterative tuning, tuning over new data, customizing hyperparameters based on previous runs, and restarting a stopped tuning job.
Also, a few algorithms, such as XGBoost and Linear Learner, support stopping a tuning job early.
This is preferred when the tuning job is ineffective in improving the chosen metric.
It saves time and compute resources, and tuning can be continued once the results are analyzed and the hyperparameters are re-customized.
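
A warm start can be sketched as a small configuration that points at earlier tuning jobs. The key names and the two warm start types follow the SageMaker `WarmStartConfig` structure as commonly documented; the helper `build_warm_start_config` and the job name are hypothetical.

```python
def build_warm_start_config(parent_job_names, data_changed):
    """Describe a warm start from one or more earlier tuning jobs."""
    if not parent_job_names:
        raise ValueError("warm start needs at least one parent tuning job")
    # Tuning over new data transfers prior knowledge; otherwise the same
    # data and algorithm are reused.
    warm_start_type = (
        "TransferLearning" if data_changed else "IdenticalDataAndAlgorithm"
    )
    return {
        "ParentHyperParameterTuningJobs": [
            {"HyperParameterTuningJobName": name} for name in parent_job_names
        ],
        "WarmStartType": warm_start_type,
    }


config = build_warm_start_config(["demo-tuning-job-1"], data_changed=True)
print(config)
```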


#Deploying and Inferences
Once a model is trained and tuned, it can be deployed to get inferences, i.e., predictions/outputs for input data.
Publishing the model as an HTTPS endpoint using hosting services provides this functionality.
SageMaker also offers a few further options for obtaining inferences.
These include:
Inference Pipelines
SageMaker Neo
Elastic Inference


#Inference Pipelines
An Inference Pipeline is a linear sequence of 2 to 5 containers, which can be used for pre-processing, prediction, and post-processing of input data.
These can be made from built-in models or custom models.
The pipeline is completely managed, and each container communicates with the next container in the pipeline using HTTP requests.
The final output can be sent to the client, and the pipelines can also be configured for the batch transform.
The containers run on the same EC2 instance reducing latency, and their order can be configured and changed using management console or API.
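
The linear chaining can be sketched locally with plain functions standing in for containers. This is a toy model of the idea, not how SageMaker implements it: the class `InferencePipeline` and the three step functions are hypothetical, and real containers exchange HTTP requests rather than Python objects.

```python
from typing import Callable, List


class InferencePipeline:
    """A linear chain of steps; each step's output feeds the next step."""

    def __init__(self, steps: List[Callable]):
        if not 2 <= len(steps) <= 5:
            raise ValueError("a pipeline holds between 2 and 5 containers")
        self.steps = list(steps)

    def invoke(self, payload):
        for step in self.steps:
            payload = step(payload)
        return payload


# Hypothetical stand-ins for the three containers.
def preprocess(text):
    return [float(value) for value in text.split(",")]


def predict(features):
    return sum(features) / len(features)  # toy "model": the mean


def postprocess(score):
    return {"label": "high" if score >= 0.5 else "low", "score": score}


pipeline = InferencePipeline([preprocess, predict, postprocess])
result = pipeline.invoke("0.2,0.9,0.7")
print(result)
```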


#Using the Pipeline
Inference Pipelines can be used to eliminate external pre-processing.
Spark and scikit-learn jobs, for example, can be run in containers before prediction, using AWS Glue and SageMaker packaging, respectively.
A few lines of code are enough to eliminate external processing in the pipeline.
The containers and pipeline can be monitored continuously using Amazon CloudWatch.


#Suggested Practices
It is recommended to deploy multiple instances in different Availability Zones for the service endpoints to avoid outages in case of instance failure.
The VPC used to access the instances should likewise have at least two subnets in two Availability Zones.
Reliable performance can be attained with small instances over multiple Availability Zones.


#Moving Further
In this topic, you have learned how a model can be optimized while training, how you can add multiple steps in inference generation and improve performance, and manage resources efficiently.

Refer to the following to learn more about Hyperparameter Tuning and Inferences, and get step-by-step guides for doing the same.

Now, move over to the next topics to explore how models in Amazon SageMaker can be scaled, organized, and monetized.


#Search in SageMaker
Search Functionality in SageMaker can be used for:

Finding, Organizing and Evaluating Training jobs. This can be done based on metadata, properties or hyperparameters.
Ranking of models based on metrics such as validation accuracy or training loss.
Tracing models to their roots such as training jobs and related resources. These resources can be datasets for example.
Additionally, tags can be used to group related resources.

Note: Search is currently in preview.


#Searching Related Resources
With Amazon SageMaker Search, related information on resources such as models and training jobs can be investigated to analyze possible issues when model performance degrades.
Investigations can be done as follows:
Find the training job, metrics, algorithm, hyperparameters, and dataset associated with a model.
Find the training job, model, and endpoint associated with a dataset.
Investigate model containers and find the models associated with them and the pipeline they are connected in.
Search is supported through both the Management Console and the API.

#Auto Scale
SageMaker supports Automatic Scaling for production environments.
Based on workload, instances can be added or deleted as needed.
It is facilitated through Amazon CloudWatch Metrics and configured using Scaling Policy.
It can be configured using Management Console, CLI or Auto Scaling API through setting a pre-defined metric to be monitored for scaling.

#Configuring Scaling
Automatic Scaling requires the following components to be configured:

Permissions: The SagemakerFullAccessPolicy IAM policy must be active to perform any actions related to auto-scaling.
Service Linked Role: AWS services are linked to IAM policies via a Service Linked Role, and permission must be given to AWSServiceRoleForApplicationAutoScaling_SageMakerEndpoint.
Target Metric: A target metric must be selected to trigger the auto-scaling policy, because auto-scaling uses a target tracking scaling policy.
Maximum and Minimum Capacity: Limits must be set on the maximum and minimum number of instances that can be maintained for scaling.
Cool Down Period: The period for which scaling activities must wait between consecutive scaling jobs. The scale-in (reduce instances) and scale-out (increase instances) times must be specified (default 300 seconds).
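
The target metric, capacity limits, and cool down periods can be sketched as two request dictionaries. The key names follow the Application Auto Scaling API as commonly documented for SageMaker endpoints; the helper `build_scaling_requests`, the endpoint and variant names, and the target value are hypothetical.

```python
def build_scaling_requests(endpoint_name, variant_name, min_instances,
                           max_instances, target_invocations,
                           cooldown_seconds=300):
    """Return (scalable target, scaling policy) request dicts."""
    if min_instances > max_instances:
        raise ValueError("min_instances must not exceed max_instances")
    common = {
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
    }
    # Maximum and minimum capacity.
    target = dict(common, MinCapacity=min_instances, MaxCapacity=max_instances)
    # Target tracking policy on a pre-defined metric, with cool down periods.
    policy = dict(
        common,
        PolicyName=f"{endpoint_name}-target-tracking",
        PolicyType="TargetTrackingScaling",
        TargetTrackingScalingPolicyConfiguration={
            "TargetValue": float(target_invocations),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType":
                    "SageMakerVariantInvocationsPerInstance",
            },
            "ScaleInCooldown": cooldown_seconds,
            "ScaleOutCooldown": cooldown_seconds,
        },
    )
    return target, policy


target, policy = build_scaling_requests("demo-endpoint", "AllTraffic", 1, 4, 70)
print(target["ResourceId"])
```

Assuming `boto3` is available, these dictionaries could be passed to `register_scalable_target(**target)` and `put_scaling_policy(**policy)` on an `application-autoscaling` client.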


#Other Considerations
Auto Scaling can be configured, edited or deleted via API or Management console.
It is recommended to perform load testing to check if auto-scaling is working.
Updating and deleting auto-scaled endpoints may require some work based on Permissions given to Application Auto Scaling.
When there is no traffic to the auto-scaled endpoint, auto-scaling may not work as metrics cannot be analyzed without traffic.
Step Scaling is also supported by SageMaker where the scaling is based only on capacity rather than target tracking.


#Monetize on Your Models
In previous topics, we discussed how algorithms and model packages can be subscribed to from AWS Marketplace and used.
Once we create a model, it can be transformed into a Model Package Resource. Similarly, algorithm resources can be created from algorithms.
These resources can be published for sale on AWS Marketplace.
Other users can subscribe to them, use them in their own models, and pay for that use.
AWS Private Marketplace is also available as a place for pre-approved, authorized products.
You can register as a seller and leverage these services to monetize your work.

#Reinforcement Learning
Reinforcement Learning (RL) optimizes the behavior of an agent in an environment by learning a policy.
The simplest example is a robot trying to get out of a maze in the lowest possible time.
In this method, the agent observes the environment, takes an action, and is rewarded based on the current state of the environment.
In the long term, the goal is to get the maximum reward from its actions.
It is suited to scenarios where an agent needs to make autonomous decisions.


#Where is it Used?
RL is particularly used for solving large, complex problems.
It is well suited for dynamic and uncertain environments as the agent learns continuously through reward and punishment for its actions.
It is currently being adopted in fields of Gaming AI, Heating, Ventilation, and Air-Conditioning (HVAC) systems, Supply Chains and Industrial Robotics.

#Frameworks Supported
RL is based on Markov Decision Processes (MDPs).
It requires a Deep Learning Framework, RL toolkit and RL environment.
SageMaker supports Apache MXNet and TensorFlow frameworks for Deep Learning.
The interaction between environment and agent is managed by the RL toolkit; SageMaker supports the Ray RLlib and Intel Coach toolkits, which provide industry-leading RL algorithms.
SageMaker supports a wide variety of open-source and custom RL environments.


#More on RL Environments
RL Environments are simulations of Real-World scenarios used to simulate the working of the agent.
They are useful when real-world training is not possible due to safety considerations (drone piloting) or when decision making takes more time (gameplay).
SageMaker provides OpenAI Gym environments by default.
Users can also use open-source environments such as EnergyPlus, RoboSchool, etc.
Commercial environments (e.g., MATLAB and Simulink) can be used by building custom containers, but they must be licensed by the user.

#Markov Decision Process
An RL problem must be defined in the form of an MDP before it can be worked on. Consider a problem of auto-scaling where capacity must be changed based on a defined set of conditions. The conditions may be thresholds, alarms, or manual steps. The components of this problem include:

Objective - Scaling the instance capacity to get the required load profile of an application.
Environment - A simulation with data of daily/weekly variations. This must be a custom environment with a load profile. The system is such that when a new instance is created to facilitate scaling, the instance might take some time before it starts taking the jobs/requests.
State - It represents the current load, working instances, and failed jobs/requests.
Action - Add, Remove, or keep the instances at the same number.
Reward - Positive when transactions are successful; a penalty when the threshold of failed transactions is crossed.
This is an example of Markov Decision Processes for Reinforcement Learning.
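
The auto-scaling MDP can be sketched as a toy environment with a `reset`/`step` interface similar in spirit to OpenAI Gym. Everything here is an illustrative assumption: the class `AutoScalingEnv`, the capacity per instance, the failure threshold, the reward values, and the load profile are made up, and a realistic reward would likely also account for instance cost.

```python
class AutoScalingEnv:
    """Toy simulation of the auto-scaling MDP (all numbers are made up)."""

    ACTIONS = {0: -1, 1: 0, 2: +1}  # remove, keep, add one instance
    CAPACITY_PER_INSTANCE = 100     # requests one instance serves per step
    FAILURE_THRESHOLD = 50          # failed requests tolerated per step

    def __init__(self, load_profile, instances=1):
        self.load_profile = list(load_profile)
        self.start_instances = instances
        self.reset()

    def reset(self):
        self.step_index = 0
        self.instances = self.start_instances
        return self._state(failed=0)

    def _current_load(self):
        return self.load_profile[self.step_index % len(self.load_profile)]

    def _state(self, failed):
        # State: current load, working instances, failed requests.
        return (self._current_load(), self.instances, failed)

    def step(self, action):
        load = self._current_load()
        # Requests are served by the instances already running; a newly
        # added instance only starts taking requests on the next step.
        served = min(load, self.instances * self.CAPACITY_PER_INSTANCE)
        failed = load - served
        self.instances = max(1, self.instances + self.ACTIONS[action])
        # Positive reward for successful requests, penalty when the
        # failed-request threshold is crossed.
        reward = served / 100
        if failed > self.FAILURE_THRESHOLD:
            reward -= 10
        self.step_index += 1
        done = self.step_index >= len(self.load_profile)
        return self._state(failed), reward, done


env = AutoScalingEnv(load_profile=[80, 250, 120])
state = env.reset()
total_reward, done = 0.0, False
while not done:
    state, reward, done = env.step(2)  # naive policy: always add an instance
    total_reward += reward
print(state, total_reward)
```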

#Workflow in SageMaker
For Reinforcement Learning in SageMaker, the following steps can be considered as the basic workflow.

Formulating the RL problem - The problem is defined as the components of the Markov Decision Process.
Defining the RL environment
Defining the presets such as hyperparameters for the algorithm.
Writing the training code for the training job
Training the RL model using the Amazon SageMaker RLEstimator. This can be done locally or via SageMaker (which captures metrics for CloudWatch).
Visualizing metrics on CloudWatch, so that over time we can check whether the reward is affecting the performance of the model.
Evaluating the model using checkpointed data of previous models.
Deploying RL models via hosted services or AWS IoT Greengrass.


#Distributed Training and Tuning
Amazon SageMaker RL supports multi-core and multi-instance distributed training.
Training and Environment rollout can also be distributed.
Currently, it supports:
Single training instance and multiple rollout instances with the same instance type.
Single trainer instance and multiple rollout instances with different instance types.
Single training instance with multiple cores for the rollout for single threaded lightweight scenarios.
Multiple instances for training and rollouts.
Hyperparameter tuning is supported for all parameter selections.


#Moving Further
Now, you have a basic understanding of implementing Reinforcement Learning in Amazon SageMaker.

You can learn more about this, here.

Now, move on to the next topic to learn about Ground Truth, an advanced labeling service provided as part of Amazon SageMaker.

#What is Ground Truth?
Models require high-quality, large, labeled datasets. Ground Truth helps you to manage data labeling in an automated way.
The labeling workforce can come from Amazon Mechanical Turk, from a vendor in AWS Marketplace, or from your own private workforce managed through Amazon SageMaker Ground Truth.
The labeled data can be used for your own models or for SageMaker models.
There is an option for automated data labeling, where Ground Truth processes the data to be labeled to check if it requires manual work or not. This saves time and effort.
This can be done using pre-built or custom tools and templates.
Instructions can be given to workers based on labeling template.


#How Does Ground Truth Work?
First, the datasets must be stored in an Amazon S3 bucket.
The bucket contains the input data for labeling, the input manifest file that Ground Truth uses to read the data, and the output manifest file with the labeling job results.
A labeling job must be created, configuring parameters such as permissions and files.
The workers who will label the data must be selected.
The instructions to the workers must be configured.
Once this is done, the job can be monitored, cloned or stopped.
Find step-by-step instructions here.

#Data Labeling
Each data object to be labeled is considered a task and the labeling completes when every task is completed.
On workforce selection, data objects are divided into batches and sent to workers, one batch after another.
Batching ensures that the workforce is not overloaded and helps with iterative training of automated labeling models.
Annotation consolidation is a feature where multiple labels from different workers are consolidated, both for multi-labeling and for determining the best label using a probabilistic estimate.
This might increase cost and time, but it improves accuracy and is helpful for text and image classification, for example.
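
The idea of consolidating several worker labels into one can be illustrated with a simple majority vote. This is a deliberately simplified stand-in, not the probabilistic estimate Ground Truth uses; the function `consolidate` and the labels are hypothetical.

```python
from collections import Counter


def consolidate(labels):
    """Pick the most common label and the share of workers who chose it."""
    if not labels:
        raise ValueError("need at least one worker label")
    counts = Counter(labels)
    best_label, votes = counts.most_common(1)[0]
    return best_label, votes / len(labels)


# Three workers labeled the same image; two of them agree.
label, agreement = consolidate(["cat", "cat", "dog"])
print(label, round(agreement, 2))
```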
Automated data labeling, when enabled, saves time and manual effort by using machine learning on samples of data labeled by the workforce.
A random sample is selected and labeled by workers; an ML model is then built and validated, and if its accuracy is within the specified threshold, automated labeling jobs are run. However, this may increase training and inference instance costs.


#Data Management and Instructions
All the input data must be stored in Amazon S3 along with manifest files, which help Ground Truth to read the data.
Parts of Input data can be filtered and selected to be sent for labeling.
Output data is stored in multiple directories separated by purpose in selected Amazon S3 bucket.
Instructions can be given in short or full format.
Short format instructions are shown alongside the data object while labeling and are recommended for simple jobs.
For complex jobs, detailed full format instructions, which can be shown in a dialog box, are better.

#More on Ground Truth
Ground Truth uses Amazon Cognito to manage your private workforce.
You can create teams among your private workforce for specific labeling jobs.
Also, Ground Truth provides flexibility for creating custom workflows for better application specific productivity.

#Authentication and Access Control
SageMaker resources such as instances can be configured for limiting access from unwanted personnel using AWS Identity and Access Management (IAM) in tandem with Amazon SageMaker.
IAM access depends on identities such as AWS Account Root User, IAM User, and IAM Role.
Resources that support fine-grained access control include Batch Transform Job, Endpoint, Endpoint Config, Hyperparameter Tuning Job, Model, Notebook Instance, Notebook Instance Lifecycle Configuration, and Training Job.
Generally, AWS IAM has resource-based policies and Identity-based policies. However, SageMaker only supports Identity-based policies.
Permission policies must be attached to the respective identities to give access.
Conditions to activate a particular policy can also be set.
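
An identity-based policy with a condition can be sketched as a JSON document. The overall structure follows the standard IAM policy grammar; the specific actions, the account ID, the region, and the resource ARN are illustrative assumptions, and `is_identity_based` is a hypothetical helper that only checks that no `Principal` element is present.

```python
import json

# Illustrative identity-based policy: it is attached to a user or role,
# so it names no Principal.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": [
            "sagemaker:CreateTrainingJob",
            "sagemaker:DescribeTrainingJob",
        ],
        "Resource": "arn:aws:sagemaker:us-east-1:123456789012:training-job/*",
        # The statement only applies when this condition holds.
        "Condition": {
            "StringEquals": {"aws:RequestedRegion": "us-east-1"}
        },
    }],
}


def is_identity_based(document):
    """Identity-based policies do not name a Principal in any statement."""
    return all("Principal" not in s for s in document["Statement"])


print(is_identity_based(policy))
print(json.dumps(policy, indent=2))
```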


#Using the Policies
Accessing the SageMaker Management Console and the Ground Truth Console requires a number of policies.
The policies can be limited based on the role.
AWS managed policies include:
AdministratorAccess: Total control over SageMaker.
DataScientist: A wide range of permissions covering the use cases of data scientists.
In addition, the AmazonSageMakerFullAccess policy gives the identity total control over SageMaker functionality.
Also, tags on resources can be used to limit access to specific identities.
Additional permissions must be given to all types of identities to use other services such as Amazon S3.


#CloudWatch Monitoring
Amazon CloudWatch is integrated with Amazon SageMaker and is currently the only way to monitor SageMaker based activities.
Instance performance, running jobs, and model performance are all monitored by CloudWatch. Dashboards are provided to analyze and visualize metrics in near real time.
Metrics are refreshed at a one-minute frequency, and statistics are stored for up to 15 months.
Search in CloudWatch is limited to current resources whose metrics are updated at the latest point in time (less than two weeks ago). However, selecting a specific resource can give historical data.
All the logs from model/algorithm containers, notebook instances, training jobs, and endpoints are also managed/captured by Amazon CloudWatch.


#CloudTrail Monitoring
All the work happening in SageMaker is monitored via CloudWatch.
However, actions taken by user identities through SageMaker API calls, such as starting training jobs, are captured by AWS CloudTrail.
CloudTrail also captures the operations performed via the Management Console.
In addition, CloudTrail logs non-API service events, such as those from Automatic Model Tuning jobs.
All the operations are shown as Event History and can be stored in a specified Amazon S3 bucket.


#Securing with VPC
Security in SageMaker can be managed using Amazon Virtual Private Cloud (VPC). It can be maintained at multiple levels using VPC. Once VPC is enabled, Notebook Instances can be accessed only via interface endpoint (PrivateLink) or a NAT gateway.

Let's check various levels where VPC can be configured.

Notebook Instance:

SageMaker Notebook Instances are Internet-enabled by default. This helps to download popular open-source packages without restrictions. By disabling Internet access to instances and enabling a VPC, we can ensure that malicious code from unknown users/locations does not run on the instance.

Training Jobs:

Training Jobs store artifact data on Amazon S3. Internet-enabled containers may compromise data.


Communication between ML instances in a distributed training job: By default, training jobs run in a VPC. However, a private VPC can help to comply with regulations, if any. Data transfer between instances can also be encrypted, at the cost of training time.

Endpoints: Models are by default hosted in a VPC. However, communication between the Amazon S3 bucket containing the artifacts and the hosting service happens via the Internet. A private VPC can provide security in such a case.
Batch Transform Jobs: Similar to endpoints, batch transform jobs run in a VPC. However, communication with the Amazon S3 bucket containing the artifacts can be secured using a private VPC.

Training and Inference Containers: These are Internet-enabled by default to access open-source resources, if any, and can be secured if necessary.

Note: Apart from this, all the resources in AWS Marketplace are scanned for Common Vulnerabilities and Exposures (CVE) using the National Vulnerability Database (NVD).

#Limits
SageMaker is only limited by resource allocation.
It has limitations with respect to Notebook instances, Training Instances, Automatic Model Tuning - Parameters, runtime concurrency, Hosting Instances and Batch Transform Instances.
You can find a list of detailed limits, here.
Regarding availability, it is currently available in 14 regions. Find the list, here.


#Integrating Frameworks
SageMaker is deeply integrated with:

Apache Spark: Open-source cluster computing framework
TensorFlow: Open-source dataflow programming and neural networks library
Apache MXNet: Open-source Deep Learning Framework
Scikit-learn: Open-source Machine Learning library
PyTorch: Open-source Machine Learning library
Chainer: Open-source Deep Learning Framework
You can learn more on this, here.

