blazingtext_hosting_pretrained_fasttext
blazingtext_text_classification_dbpedia
blazingtext_word2vec_subwords_text8
blazingtext_word2vec_text8
deepar_electricity
deepar_synthetic
factorization_machines_mnist
imageclassification_caltech
imageclassification_mscoco_multi_label
ipinsights_login
k_nearest_neighbors_covtype
lda_topic_modeling
linear_learner_mnist
ntm_synthetic
object2vec_movie_recommendation
object2vec_multilabel_genre_classification
object2vec_sentence_similarity
object_detection_birds
object_detection_pascalvoc_coco
pca_mnist
random_cut_forest
semantic_segmentation_pascalvoc
seq2seq_translation_en-de
xgboost_abalone
xgboost_mnist
README.md

Amazon SageMaker Examples

Introduction to Amazon Algorithms

These examples provide quick walkthroughs to get you up and running with Amazon SageMaker's custom-developed algorithms. Most of these algorithms can train on distributed hardware, are designed to scale to large datasets, and aim to be faster and cheaper than popular alternatives.

  • k-means is our introductory example for Amazon SageMaker. It walks through the process of clustering MNIST images of handwritten digits using Amazon SageMaker k-means.
  • Factorization Machines showcases Amazon SageMaker's implementation of the algorithm to predict whether a handwritten digit from the MNIST dataset is a 0 or not using a binary classifier.
  • Latent Dirichlet Allocation (LDA) introduces topic modeling using Amazon SageMaker Latent Dirichlet Allocation (LDA) on a synthetic dataset.
  • Linear Learner predicts whether a handwritten digit from the MNIST dataset is a 0 or not using a binary classifier from Amazon SageMaker Linear Learner.
  • Neural Topic Model (NTM) uses Amazon SageMaker Neural Topic Model (NTM) to uncover topics in documents from a synthetic data source, where topic distributions are known.
  • Principal Components Analysis (PCA) uses Amazon SageMaker PCA to calculate eigendigits from MNIST.
  • Seq2Seq uses the Amazon SageMaker Seq2Seq algorithm that's built on top of Sockeye, which is a sequence-to-sequence framework for Neural Machine Translation based on MXNet. Seq2Seq implements state-of-the-art encoder-decoder architectures which can also be used for tasks like Abstractive Summarization in addition to Machine Translation. This notebook shows translation from English to German text.
  • Image Classification includes full training and transfer learning examples of Amazon SageMaker's Image Classification algorithm. It uses a ResNet deep convolutional neural network to classify images from the Caltech dataset.
  • XGBoost for regression predicts the age of abalone (Abalone dataset) using regression from Amazon SageMaker's implementation of XGBoost.
  • XGBoost for multi-class classification uses Amazon SageMaker's implementation of XGBoost to classify handwritten digits from the MNIST dataset as one of the ten digits using a multi-class classifier. Both single-machine and distributed use cases are presented.
  • DeepAR for time series forecasting illustrates how to use the Amazon SageMaker DeepAR algorithm for time series forecasting on a synthetically generated data set.
  • BlazingText Word2Vec generates Word2Vec embeddings from a cleaned text dump of Wikipedia articles using SageMaker's fast and scalable BlazingText implementation.
  • Object detection for bird images demonstrates how to use the Amazon SageMaker Object Detection algorithm with a public dataset of bird images.
  • Object detection for Pascal VOC provides three sample notebooks that demonstrate how to use the Amazon SageMaker Object Detection algorithm with the Pascal VOC dataset. One uses the RecordIO format, and another uses JSON format. The third notebook shows how to use incremental training.
  • Object2Vec for movie recommendation demonstrates how Object2Vec can model data consisting of pairs of singleton tokens, with movie recommendation as a running example.
  • Object2Vec for multi-label classification shows how the Object2Vec algorithm can train on data consisting of pairs of sequences and singleton tokens, in the setting of predicting movie genres from plot descriptions.
  • Object2Vec for sentence similarity explains how to train Object2Vec with sequence pairs as input, with sentence similarity analysis as the application.
  • IP Insights for suspicious logins shows how to train IP Insights on login events for a web server to identify suspicious login attempts.
  • Semantic Segmentation shows how to train a model using the Amazon SageMaker Semantic Segmentation algorithm. It also demonstrates how to host the model and produce segmentation masks and segmentation probabilities.
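
The Factorization Machines and Linear Learner examples above both frame MNIST as a binary problem: is the digit a 0 or not. As a rough illustration of that label preparation step, here is a minimal sketch in plain Python. The function name `to_binary_labels`, the `positive_digit` parameter, and the sample labels are hypothetical and chosen for illustration; they are not taken from the notebooks or from the SageMaker SDK.

```python
from collections import Counter


def to_binary_labels(digit_labels, positive_digit=0):
    """Return 1.0 where the label equals positive_digit, else 0.0."""
    return [1.0 if label == positive_digit else 0.0 for label in digit_labels]


# Hypothetical stand-in for MNIST labels (digits 0-9); not real dataset values.
digit_labels = [5, 0, 4, 1, 9, 2, 1, 3, 1, 0]
binary_labels = to_binary_labels(digit_labels)

# Count each class; with roughly balanced digits, positives are the minority.
balance = Counter(binary_labels)
print(binary_labels)
print(balance[1.0], "positives,", balance[0.0], "negatives")
```

Because only about one digit in ten is a 0, the resulting classes are imbalanced, so it may be worth looking at precision and recall alongside accuracy when evaluating a binary classifier trained this way.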