# Document Understanding Solution - Text Classification


---

This notebook's CI test result for us-west-2 is as follows. CI test results in other regions can be found at the end of the notebook. 

![This us-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-2/introduction_to_applying_machine_learning|identify_key_insights_from_textual_document|document_text_classification.ipynb)

---


Text Classification refers to classifying an input sentence into one of the class labels of the training dataset. In this notebook, we demonstrate how to use the [JumpStart API](https://sagemaker.readthedocs.io/en/stable/overview.html#use-prebuilt-models-with-sagemaker-jumpstart) for Text Classification. In particular, we demonstrate three use cases of Text Classification:

1. How to directly deploy a pretrained Transformer-based text classification model to perform Sentiment Analysis.
2. How to fine-tune a pre-trained Transformer model on a custom dataset, and then run inference on the fine-tuned model.
3. How to run [SageMaker Automatic Model Tuning](https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning.html) (a hyperparameter optimization procedure) to find the best model and compare it with the model fine-tuned in point 2. The performance of both models is evaluated on hold-out test data.

**Note**: When running this notebook on SageMaker Studio, make
sure the `PyTorch 1.10 Python 3.8 CPU Optimized` image/kernel is used. When
running this notebook on a SageMaker Notebook Instance, make
sure the `sagemaker-soln` kernel is used.

## 1. Set Up

Before executing the notebook, some initial setup steps are required. This notebook requires the latest versions of `sagemaker` and `ipywidgets`.

In [None]:
!pip install -U sagemaker ipywidgets

In [None]:
import sagemaker, boto3, json
import sys
import config


aws_region = boto3.Session().region_name
sess = sagemaker.Session()
aws_role = sagemaker.get_execution_role()
DEFAULT_BUCKET = sess.default_bucket()

## 2. Select a pre-trained text classification model

You can continue with the default model, or choose a different one from the list of text classification models printed later in this section. A complete list of JumpStart models can also be accessed at JumpStart Models.

In [None]:
model_id = "tensorflow-tc-bert-en-uncased-L-12-H-768-A-12-2"


You can also select a different JumpStart model. Here, we download the JumpStart `models_manifest.json` file from the JumpStart S3 bucket, filter out all the Text Classification models, and list them so that you can select one.

In [None]:
# download JumpStart model_manifest file.
boto3.client("s3").download_file(
    f"jumpstart-cache-prod-{aws_region}", "models_manifest.json", "models_manifest.json"
)
with open("models_manifest.json", "rb") as json_file:
    model_list = json.load(json_file)

# Filter the manifest down to the Text Classification models. The manifest has one
# entry per model version, so drop duplicate model IDs while preserving their order.
tc_models = list(
    dict.fromkeys(model["model_id"] for model in model_list if "-tc-" in model["model_id"])
)

print("The available text classification models are listed below.\n")
for each in tc_models:
    print(each)
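
Because the manifest lists one entry per model version, it can also help to see which versions exist before pinning one. The sketch below works on a small hand-written list rather than the downloaded file. The `version` key, the helper `versions_by_model`, and the sample entries are assumptions made for illustration, so check them against the actual `models_manifest.json`.

```python
from collections import defaultdict

# Hypothetical manifest entries; the real data is the list loaded into `model_list`.
# The "version" key is an assumption about the manifest schema.
sample_manifest = [
    {"model_id": "tensorflow-tc-bert-en-uncased-L-12-H-768-A-12-2", "version": "1.0.0"},
    {"model_id": "tensorflow-tc-bert-en-uncased-L-12-H-768-A-12-2", "version": "1.1.2"},
    {"model_id": "tensorflow-ic-example-model", "version": "1.0.0"},
]


def versions_by_model(manifest, task_marker="-tc-"):
    """Group version strings by model ID for models whose ID contains task_marker."""
    grouped = defaultdict(list)
    for entry in manifest:
        if task_marker in entry["model_id"]:
            grouped[entry["model_id"]].append(entry.get("version", "unknown"))
    return dict(grouped)


tc_versions = versions_by_model(sample_manifest)
for model_name, versions in tc_versions.items():
    print(model_name, versions)
```

Only the entry whose ID contains `-tc-` is kept, with its versions in manifest order.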

## 3. Run inference on the pre-trained text classification model

This is a Text Classification model built upon a Text Embedding model from [TensorFlow Hub](https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/4). It takes a text string as input and classifies the input text as either a positive or negative movie review.

The Text Embedding model, which is pre-trained on the Wikipedia and BookCorpus datasets, returns an embedding of the input text.

The model available for deployment is created by attaching a binary classification layer to the output of the Text Embedding model, and then fine-tuning the entire model on SST2 dataset. The [SST2](https://nlp.stanford.edu/sentiment/index.html) dataset comprises positive and negative movie reviews.

### 3.1. Retrieve JumpStart artifacts & deploy an endpoint
We retrieve the `deploy_image_uri`, `deploy_source_uri`, and `base_model_uri` for the pre-trained model. To host the pre-trained model, we create an instance of `sagemaker.model.Model` and deploy it.

In [None]:
import time
from sagemaker import image_uris, model_uris, script_uris
from sagemaker.model import Model
from sagemaker.predictor import Predictor
from sagemaker.utils import name_from_base

# Pin the model version. Setting model_version="*" would fetch the latest version instead.
infer_model_id, infer_model_version = model_id, "1.1.2"

endpoint_name_tc = f"{config.SOLUTION_PREFIX}-text-classification-endpoint"

inference_instance_type = config.HOSTING_INSTANCE_TYPE

# Retrieve the inference docker container uri.
deploy_image_uri = image_uris.retrieve(
    region=None,
    framework=None,
    image_scope="inference",
    model_id=infer_model_id,
    model_version=infer_model_version,
    instance_type=inference_instance_type,
)
# Retrieve the inference script uri.
deploy_source_uri = script_uris.retrieve(
    model_id=infer_model_id, model_version=infer_model_version, script_scope="inference"
)
# Retrieve the base model uri.
base_model_uri = model_uris.retrieve(
    model_id=infer_model_id, model_version=infer_model_version, model_scope="inference"
)
# Create the SageMaker model instance. Note that we need to pass Predictor class when we deploy model through Model class,
# for being able to run inference through the sagemaker API.
model = Model(
    image_uri=deploy_image_uri,
    source_dir=deploy_source_uri,
    model_data=base_model_uri,
    entry_point="inference.py",
    role=aws_role,
    predictor_cls=Predictor,
    name=endpoint_name_tc,
)
# deploy the Model.
base_model_predictor = model.deploy(
    initial_instance_count=1,
    instance_type=inference_instance_type,
    endpoint_name=endpoint_name_tc,
)

time.sleep(10)

### 3.2. Example input sentences for inference

These examples are taken from the SST2 dataset downloaded from [TensorFlow](https://www.tensorflow.org/datasets/catalog/glue#gluesst2). [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0). [Dataset Homepage](https://nlp.stanford.edu/sentiment/index.html).

In [None]:
text1 = "astonishing ... ( frames ) profound ethical and philosophical questions in the form of dazzling pop entertainment"
text2 = "simply stupid , irrelevant and deeply , truly , bottomlessly cynical "

### 3.3. Query endpoint and parse response
Input to the endpoint is a single sentence. Response from the endpoint is a dictionary containing the predicted class label, and a list of class label probabilities.
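
To make the response format concrete, here is a minimal offline sketch that parses a hand-written payload instead of calling the endpoint. The payload values and the helper `parse_sample` are invented for illustration; only the three keys (`probabilities`, `labels`, `predicted_label`) come from the parsing code in this notebook.

```python
import json

# Hypothetical verbose response body; real values come from the endpoint.
sample_response = json.dumps(
    {"probabilities": [0.02, 0.98], "labels": [0, 1], "predicted_label": 1}
).encode("utf-8")


def parse_sample(raw_response):
    """Decode the JSON body and return the fields used in this notebook."""
    payload = json.loads(raw_response)
    return payload["probabilities"], payload["labels"], payload["predicted_label"]


sample_probs, sample_labels, sample_pred = parse_sample(sample_response)
# In this sample, the predicted label is the label with the highest probability.
best_index = max(range(len(sample_probs)), key=sample_probs.__getitem__)
print(sample_labels[best_index], sample_pred)
```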

In [None]:
newline, bold, unbold = "\n", "\033[1m", "\033[0m"


def query_endpoint(encoded_text):
    response = base_model_predictor.predict(
        encoded_text,
        {"ContentType": "application/x-text", "Accept": "application/json;verbose"},
    )
    return response


def parse_response(query_response):
    model_predictions = json.loads(query_response)
    probabilities, labels, predicted_label = (
        model_predictions["probabilities"],
        model_predictions["labels"],
        model_predictions["predicted_label"],
    )
    return probabilities, labels, predicted_label


for text in [text1, text2]:
    query_response = query_endpoint(text.encode("utf-8"))
    probabilities, labels, predicted_label = parse_response(query_response)
    print(
        f"Inference:{newline}"
        f"Input text: '{text}'{newline}"
        f"Model prediction: {probabilities}{newline}"
        f"Labels: {labels}{newline}"
        f"Predicted Label: {bold}{predicted_label}{unbold}{newline}"
    )

### 3.4. Clean up the endpoint

In [None]:
# Delete the SageMaker endpoint and the attached resources
base_model_predictor.delete_model()
base_model_predictor.delete_endpoint()

## 4. Fine-tune the pre-trained model on a custom dataset

Previously, we saw how to run inference on a pre-trained model that was fine-tuned on the SST2 dataset. Next, we discuss how a model can be fine-tuned on a custom dataset with any number of classes.

The Text Embedding model can be fine-tuned on any text classification dataset in the same way the model available for inference has been fine-tuned on the SST2 movie review dataset.

The model available for fine-tuning attaches a classification layer to the Text Embedding model and initializes the layer parameters to random values. The output dimension of the classification layer is determined based on the number of classes detected in the input data. The fine-tuning step fine-tunes all the model parameters to minimize prediction error on the input data and returns the fine-tuned model. The model returned by fine-tuning can be further deployed for inference. Below are the instructions for how the training data should be formatted for input to the model.

- Input: A directory containing a 'data.csv' file.
    - Each row of the first column of 'data.csv' should have an integer class label between 0 and the number of classes.
    - Each row of the second column should have the corresponding text.
- Output: A trained model that can be deployed for inference.

Below is an example of 'data.csv' file showing values in its first two columns. Note that the file should not have any header.

|   |   |
|---|---|
|0	|hide new secretions from the parental units| 
|0	|contains no wit , only labored gags| 
|1	|that loves its characters and communicates something rather beautiful about human nature| 
|...|...|


Source: [TensorFlow Hub](model_url). License: [Apache 2.0 License](https://jumpstart-cache-alpha-us-west-2.s3-us-west-2.amazonaws.com/licenses/Apache-License/LICENSE-2.0.txt).

The SST2 dataset is downloaded from [TensorFlow](https://www.tensorflow.org/datasets/catalog/glue#gluesst2). [Apache 2.0 License](https://jumpstart-cache-prod-us-west-2.s3-us-west-2.amazonaws.com/licenses/Apache-License/LICENSE-2.0.txt). [Dataset Homepage](https://nlp.stanford.edu/sentiment/index.html).
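
As a sketch of the input format described in this section, the following snippet writes a small header-less `data.csv` with the standard `csv` module and checks that every label is a non-negative integer. The directory name `my_dataset` and the helper `write_training_csv` are hypothetical, and the rows reuse the examples from the table above.

```python
import csv
import os
import tempfile

rows = [
    (0, "hide new secretions from the parental units"),
    (0, "contains no wit , only labored gags"),
    (1, "that loves its characters and communicates something rather beautiful about human nature"),
]


def write_training_csv(directory, examples):
    """Write (label, text) pairs to <directory>/data.csv without a header row."""
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, "data.csv")
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        for label, text in examples:
            if not isinstance(label, int) or label < 0:
                raise ValueError(f"label must be a non-negative integer, got {label!r}")
            writer.writerow([label, text])
    return path


# Hypothetical output directory, created under a temporary folder for this sketch.
csv_path = write_training_csv(os.path.join(tempfile.mkdtemp(), "my_dataset"), rows)
with open(csv_path, newline="", encoding="utf-8") as handle:
    written = list(csv.reader(handle))
print(written[0])
```

Using `csv.writer` rather than manual string joining means that sentences containing commas are quoted correctly, so each row still parses back into exactly two columns.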

### 4.1. Retrieve JumpStart training artifacts

Here, for the selected model, we retrieve the training docker container, the training algorithm source, the pre-trained model, and a Python dictionary of the training hyper-parameters that the algorithm accepts with their default values. Note that this notebook pins `model_version` to "1.1.2"; setting `model_version="*"` fetches the latest model instead. We also need to specify the `training_instance_type` to fetch `train_image_uri`.

In [None]:
from sagemaker import image_uris, model_uris, script_uris, hyperparameters

model_id, model_version = (
    model_id,
    "1.1.2",
)  # all the other options of model_id are the same as the one in Section 2.
training_instance_type = config.TRAINING_INSTANCE_TYPE

# Retrieve the docker image
train_image_uri = image_uris.retrieve(
    region=None,
    framework=None,
    model_id=model_id,
    model_version=model_version,
    image_scope="training",
    instance_type=training_instance_type,
)
# Retrieve the training script
train_source_uri = script_uris.retrieve(
    model_id=model_id, model_version=model_version, script_scope="training"
)
# Retrieve the pre-trained model tarball to further fine-tune
train_model_uri = model_uris.retrieve(
    model_id=model_id, model_version=model_version, model_scope="training"
)

### 4.2. Set training parameters

Now that we are done with all the setup that is needed, we are ready to fine-tune our Text Classification model. To begin, let us create a `sagemaker.estimator.Estimator` object. This estimator launches the training job.

There are two kinds of parameters that need to be set for training.

The first set are the parameters for the training job. These include:

1. Training data path: the S3 folder in which the input data is stored.
2. Output path: the S3 folder in which the training output is stored.
3. Training instance type: the type of machine on which to run the training. Typically, GPU instances are used for this kind of training. We defined the training instance type above in order to fetch the correct `train_image_uri`.

The second set of parameters are algorithm specific training hyper-parameters.

In [None]:
# Sample training data is available in this bucket
training_data_bucket = f"jumpstart-cache-prod-{aws_region}"
training_data_prefix = "training-datasets/SST/"

training_dataset_s3_path = f"s3://{training_data_bucket}/{training_data_prefix}"

output_bucket = DEFAULT_BUCKET
output_prefix = "TC"

s3_output_location = f"s3://{output_bucket}/{output_prefix}/output"

For algorithm-specific hyper-parameters, we start by fetching a Python dictionary of the training hyper-parameters that the algorithm accepts, with their default values. These can then be overridden with custom values.
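
The override pattern amounts to updating a dictionary of string values. Here is a minimal sketch, using made-up default values in place of the dictionary returned by `hyperparameters.retrieve_default`; the helper `apply_overrides` and the `epochs` entry are hypothetical.

```python
# Made-up defaults standing in for the retrieved dictionary.
default_hyperparameters = {"batch-size": "32", "adam-learning-rate": "2e-05", "epochs": "3"}


def apply_overrides(defaults, overrides):
    """Return a copy of defaults with overrides applied, rejecting unknown names."""
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise KeyError(f"unsupported hyperparameters: {sorted(unknown)}")
    merged = dict(defaults)
    # Values are stored as strings, matching how the notebook sets them.
    merged.update({name: str(value) for name, value in overrides.items()})
    return merged


custom = apply_overrides(default_hyperparameters, {"batch-size": 64, "adam-learning-rate": "1e-6"})
print(custom)
```

Rejecting names that are not in the defaults is a design choice of this sketch: it surfaces typos such as `batch_size` before a training job is launched.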

In [None]:
from sagemaker import hyperparameters

# Retrieve the default hyper-parameters for fine-tuning the model
hyperparameters = hyperparameters.retrieve_default(model_id=model_id, model_version=model_version)

# [Optional] Override default hyperparameters with custom values
hyperparameters["batch-size"] = "64"
hyperparameters["adam-learning-rate"] = "1e-6"
print(hyperparameters)

### 4.3. Download, preprocess, and upload the training data

In [None]:
!aws s3 cp --recursive $training_dataset_s3_path data/sst2

In [None]:
import pandas as pd

data = pd.read_csv("data/sst2/data.csv", header=None)
data.columns = ["Target", "Sentence Input"]

View the first five observations of the training data.

In [None]:
data.head(5)

In [None]:
from sklearn.model_selection import train_test_split

In [None]:
train_data, test_data = train_test_split(data, test_size=0.01, random_state=42)

In [None]:
train_data.to_csv("data/sst2/split_train.csv", header=False, index=False)

Upload the split training data to the S3 bucket. The training data is further split into training and validation data during training. The test data is used as hold-out data to evaluate the model performance.

In [None]:
import os
import boto3

prefix = "TC"
boto3.Session().resource("s3").Bucket(DEFAULT_BUCKET).Object(
    os.path.join(prefix, "train/data.csv")
).upload_file("data/sst2/split_train.csv")
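
`os.path.join` builds the object key with the local path separator, which is `/` on Linux. If the same code might run on another platform, `posixpath.join` always uses `/`. A small sketch of how the key and the training channel URI relate; the bucket name is a placeholder.

```python
import posixpath

example_bucket = "my-example-bucket"  # placeholder, not a real bucket
example_prefix = "TC"

# Object key for the uploaded file, always joined with forward slashes.
object_key = posixpath.join(example_prefix, "train/data.csv")
# Training channel URI: the S3 "folder" that contains data.csv.
train_channel_uri = f"s3://{example_bucket}/{posixpath.dirname(object_key)}"
print(object_key, train_channel_uri)
```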

### 4.4. Fine-tuning without hyperparameter optimization

We start by creating the estimator object with all the required assets and then launch the training job.

In [None]:
from sagemaker.estimator import Estimator
from sagemaker.utils import name_from_base
from sagemaker.tuner import HyperparameterTuner

from sagemaker import get_execution_role

role = get_execution_role()

training_job_name = f"{config.SOLUTION_PREFIX}-tc-finetune"

# Create SageMaker Estimator instance
tc_estimator = Estimator(
    role=role,
    image_uri=train_image_uri,
    source_dir=train_source_uri,
    model_uri=train_model_uri,
    entry_point="transfer_learning.py",
    instance_count=1,
    instance_type=training_instance_type,
    max_run=360000,
    hyperparameters=hyperparameters,
    output_path=s3_output_location,
    tags=[{"Key": config.TAG_KEY, "Value": config.SOLUTION_PREFIX}],
    base_job_name=training_job_name,
)

training_data_path_updated = f"s3://{DEFAULT_BUCKET}/{prefix}/train"
# Launch a SageMaker Training job by passing s3 path of the training data
tc_estimator.fit({"training": training_data_path_updated}, logs=True)

### 4.5. Deploy & run Inference on the fine-tuned model

A trained model does nothing on its own. We now want to use the model to perform inference. For this example, that means predicting the class label of an input sentence. We follow the same steps as in Section 3, "Run inference on the pre-trained text classification model". We start by retrieving the JumpStart artifacts for deploying an endpoint. However, instead of the pre-trained base model, we deploy the `tc_estimator` that we fine-tuned.



In [None]:
import uuid

inference_instance_type = "ml.g4dn.2xlarge"

# Retrieve the inference docker container uri
deploy_image_uri = image_uris.retrieve(
    region=None,
    framework=None,
    image_scope="inference",
    model_id=model_id,
    model_version=model_version,
    instance_type=inference_instance_type,
)
# Retrieve the inference script uri
deploy_source_uri = script_uris.retrieve(
    model_id=model_id, model_version=model_version, script_scope="inference"
)
unique_hash = str(uuid.uuid4())[:6]
endpoint_name_tc_finetune = f"{config.SOLUTION_PREFIX}-{unique_hash}-tc-finetune-endpoint"

# Use the estimator from the previous step to deploy to a SageMaker endpoint
finetuned_predictor = tc_estimator.deploy(
    initial_instance_count=1,
    instance_type=inference_instance_type,
    entry_point="inference.py",
    image_uri=deploy_image_uri,
    source_dir=deploy_source_uri,
    endpoint_name=endpoint_name_tc_finetune,
)

time.sleep(10)

Next, we query each of the examples in the test data to get its predicted label.

In [None]:
ground_truth, test_examples = (
    test_data.iloc[:, 0].values.tolist(),
    test_data.iloc[:, 1].values.tolist(),
)

In [None]:
newline, bold, unbold = "\n", "\033[1m", "\033[0m"


def query_endpoint(encoded_text, predictor):
    response = predictor.predict(
        encoded_text,
        {"ContentType": "application/x-text", "Accept": "application/json;verbose"},
    )
    return response


def parse_response(query_response):
    model_predictions = json.loads(query_response)
    probabilities, labels, predicted_label = (
        model_predictions["probabilities"],
        model_predictions["labels"],
        model_predictions["predicted_label"],
    )
    return probabilities, labels, predicted_label


predict_prob, predict_label = [], []
for text in test_examples:
    query_response = query_endpoint(text.encode("utf-8"), finetuned_predictor)
    probabilities, labels, predicted_label = parse_response(query_response)
    predict_prob.append(probabilities)
    predict_label.append(predicted_label)

### 4.6. Compute evaluation metrics
Since it is a binary classification task, we use [accuracy score](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.accuracy_score.html) and [f1 score](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html) as the evaluation metrics.

In [None]:
from sklearn.metrics import accuracy_score, f1_score

# scikit-learn metrics take the ground truth first, then the predictions.
f1 = f1_score(ground_truth, predict_label)
accuracy = accuracy_score(ground_truth, predict_label)
result = {"Accuracy": [accuracy], "F1 Score": [f1]}

In [None]:
result = pd.DataFrame.from_dict(result, orient="index", columns=["No HPO"])

In [None]:
result

For both accuracy and F1 score, a larger value indicates better performance.
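
For reference, both metrics can be computed by hand from the confusion counts. The sketch below uses a small invented set of labels, not results from this notebook, and treats label 1 as the positive class (the default for scikit-learn's binary `f1_score`). The helper `binary_metrics` is hypothetical.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, f1) for binary labels, computed from confusion counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    # F1 is the harmonic mean of precision and recall: 2*TP / (2*TP + FP + FN).
    f1 = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 0.0
    return accuracy, f1


# Invented example labels.
toy_truth = [1, 0, 1, 1, 0, 1]
toy_pred = [1, 0, 0, 1, 1, 1]
toy_accuracy, toy_f1 = binary_metrics(toy_truth, toy_pred)
print(round(toy_accuracy, 3), round(toy_f1, 3))  # 0.667 0.75
```

Here four of the six predictions are correct, and there are three true positives, one false positive, and one false negative, which gives an F1 score of 6 / 8 = 0.75.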

## 5. Fine-tune the pre-trained model on a custom dataset with automatic model tuning (AMT)

Amazon SageMaker automatic model tuning, also known as hyperparameter tuning, finds the best version of a model by running many training jobs on your dataset using the algorithm and ranges of hyperparameters that you specify. It then chooses the hyperparameter values that result in a model that performs the best, as measured by a metric that you choose. We use a [HyperparameterTuner](https://sagemaker.readthedocs.io/en/stable/api/training/tuner.html) object to interact with Amazon SageMaker hyperparameter tuning APIs.
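
The objective metric is read from the training job's logs using a regular expression. The sketch below applies the pattern used later in this notebook for `val_accuracy` to a log line. The log line itself is invented for illustration and may not match the exact format the training script emits.

```python
import re

# Same pattern as the metric definition used for tuning in this notebook.
val_accuracy_pattern = re.compile(r"val_accuracy: ([0-9\.]+)")

# Invented log line in the style of Keras training output.
sample_log_line = (
    "100/100 - 12s - loss: 0.3012 - accuracy: 0.8710 - val_loss: 0.2954 - val_accuracy: 0.8825"
)

match = val_accuracy_pattern.search(sample_log_line)
extracted_value = float(match.group(1)) if match else None
print(extracted_value)
```

Because the pattern includes the `val_` prefix, it picks up the validation accuracy and skips the training `accuracy` value on the same line.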

### 5.1. Fine-tuning with hyperparameter optimization

In [None]:
from sagemaker.tuner import ContinuousParameter


# Define objective metric per framework, based on which the best model is selected.
metric_definitions_per_model = {
    "tensorflow": {
        "metrics": [{"Name": "val_accuracy", "Regex": "val_accuracy: ([0-9\\.]+)"}],
        "type": "Maximize",
    }
}

# You can select from the hyperparameters supported by the model and configure the ranges of
# values to be searched when training the optimal model. See
# https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning-define-ranges.html
hyperparameter_ranges = {
    "adam-learning-rate": ContinuousParameter(0.00001, 0.01, scaling_type="Logarithmic")
}

# Increase the total number of training jobs run by AMT, for increased accuracy (and training time).
max_jobs = 6
# Change the number of parallel training jobs run by AMT to reduce total training time,
# constrained by your account limits. If max_parallel_jobs equals max_jobs, all jobs start
# at once, so Bayesian search cannot learn from earlier jobs and behaves like random search.
max_parallel_jobs = 6
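
To see why logarithmic scaling suits a learning-rate range that spans several orders of magnitude, the sketch below draws log-uniform samples from the same interval. It only illustrates the scaling idea with plain random sampling; it is not how SageMaker's tuner selects candidates, and the helper `sample_log_uniform` is hypothetical.

```python
import math
import random


def sample_log_uniform(low, high, rng):
    """Draw one value whose logarithm is uniform between log(low) and log(high)."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
lr_samples = [sample_log_uniform(1e-5, 1e-2, rng) for _ in range(1000)]

# With log scaling, roughly a third of the samples land in each decade;
# a linear scale would put about 90% of them in the top decade (1e-3 to 1e-2).
decade_counts = [
    sum(1 for lr in lr_samples if 10.0 ** exp <= lr < 10.0 ** (exp + 1)) for exp in (-5, -4, -3)
]
print(decade_counts)
```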

In [None]:
from sagemaker.estimator import Estimator
from sagemaker.utils import name_from_base
from sagemaker.tuner import HyperparameterTuner

tuning_job_name = f"{config.SOLUTION_PREFIX}-{unique_hash}-tc-hpo"

# Create SageMaker Estimator instance
tc_estimator = Estimator(
    role=role,
    image_uri=train_image_uri,
    source_dir=train_source_uri,
    model_uri=train_model_uri,
    entry_point="transfer_learning.py",
    instance_count=1,
    instance_type=training_instance_type,
    max_run=360000,
    hyperparameters=hyperparameters,
    output_path=s3_output_location,
    base_job_name=tuning_job_name,
    tags=[{"Key": config.TAG_KEY, "Value": config.SOLUTION_PREFIX}],
)


metric_definitions = next(
    value for key, value in metric_definitions_per_model.items() if model_id.startswith(key)
)

hp_tuner = HyperparameterTuner(
    tc_estimator,
    metric_definitions["metrics"][0]["Name"],
    hyperparameter_ranges,
    metric_definitions["metrics"],
    max_jobs=max_jobs,
    max_parallel_jobs=max_parallel_jobs,
    objective_type=metric_definitions["type"],
    base_tuning_job_name=tuning_job_name,
)

# Launch a SageMaker Tuning job to search for the best hyperparameters
hp_tuner.fit({"training": training_data_path_updated})

### 5.2. Deploy & run Inference on the fine-tuned model

In [None]:
# Retrieve the inference docker container uri
deploy_image_uri = image_uris.retrieve(
    region=None,
    framework=None,
    image_scope="inference",
    model_id=model_id,
    model_version=model_version,
    instance_type=inference_instance_type,
)
# Retrieve the inference script uri
deploy_source_uri = script_uris.retrieve(
    model_id=model_id, model_version=model_version, script_scope="inference"
)

endpoint_name_hpo = f"{config.SOLUTION_PREFIX}-tc-hpo-endpoint"

# Deploy the model from the best training job found by the tuner to a SageMaker endpoint
finetuned_predictor_hpo = hp_tuner.deploy(
    initial_instance_count=1,
    instance_type=inference_instance_type,
    entry_point="inference.py",
    image_uri=deploy_image_uri,
    source_dir=deploy_source_uri,
    endpoint_name=endpoint_name_hpo,
)

time.sleep(10)

In [None]:
predict_prob_hpo, predict_label_hpo = [], []
for text in test_examples:
    query_response = query_endpoint(text.encode("utf-8"), finetuned_predictor_hpo)
    probabilities, labels, predicted_label = parse_response(query_response)
    predict_prob_hpo.append(probabilities)
    predict_label_hpo.append(predicted_label)

In [None]:
# scikit-learn metrics take the ground truth first, then the predictions.
f1_hpo = f1_score(ground_truth, predict_label_hpo)
accuracy_hpo = accuracy_score(ground_truth, predict_label_hpo)
result_hpo = {"Accuracy": [accuracy_hpo], "F1 Score": [f1_hpo]}

In [None]:
result_hpo = pd.DataFrame.from_dict(result_hpo, orient="index", columns=["With HPO"])

In [None]:
pd.concat([result, result_hpo], axis=1)

The results with hyperparameter optimization show better performance on the hold-out test data.

### 5.3. Clean up the endpoints

When you've finished with the text classification endpoints (and their associated
endpoint configurations), make sure that you delete them to avoid accidental
charges.

In [None]:
# Delete the SageMaker endpoint and the attached resources
finetuned_predictor.delete_model()
finetuned_predictor.delete_endpoint()

finetuned_predictor_hpo.delete_model()
finetuned_predictor_hpo.delete_endpoint()

## Next Stage

We've just looked at how you can classify text with pre-trained and fine-tuned models.
Up next we look at a technique that can be used to query the document for
specifics, called Question Answering.

[Click here to continue with Question Answering.](./3_question_answering.ipynb)

## Notebook CI Test Results

This notebook was tested in multiple regions. The test results are as follows, except for us-west-2 which is shown at the top of the notebook.

![This us-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-1/introduction_to_applying_machine_learning|identify_key_insights_from_textual_document|document_text_classification.ipynb)

![This us-east-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-east-2/introduction_to_applying_machine_learning|identify_key_insights_from_textual_document|document_text_classification.ipynb)

![This us-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/us-west-1/introduction_to_applying_machine_learning|identify_key_insights_from_textual_document|document_text_classification.ipynb)

![This ca-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ca-central-1/introduction_to_applying_machine_learning|identify_key_insights_from_textual_document|document_text_classification.ipynb)

![This sa-east-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/sa-east-1/introduction_to_applying_machine_learning|identify_key_insights_from_textual_document|document_text_classification.ipynb)

![This eu-west-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-1/introduction_to_applying_machine_learning|identify_key_insights_from_textual_document|document_text_classification.ipynb)

![This eu-west-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-2/introduction_to_applying_machine_learning|identify_key_insights_from_textual_document|document_text_classification.ipynb)

![This eu-west-3 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-west-3/introduction_to_applying_machine_learning|identify_key_insights_from_textual_document|document_text_classification.ipynb)

![This eu-central-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-central-1/introduction_to_applying_machine_learning|identify_key_insights_from_textual_document|document_text_classification.ipynb)

![This eu-north-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/eu-north-1/introduction_to_applying_machine_learning|identify_key_insights_from_textual_document|document_text_classification.ipynb)

![This ap-southeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-1/introduction_to_applying_machine_learning|identify_key_insights_from_textual_document|document_text_classification.ipynb)

![This ap-southeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-southeast-2/introduction_to_applying_machine_learning|identify_key_insights_from_textual_document|document_text_classification.ipynb)

![This ap-northeast-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-1/introduction_to_applying_machine_learning|identify_key_insights_from_textual_document|document_text_classification.ipynb)

![This ap-northeast-2 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-northeast-2/introduction_to_applying_machine_learning|identify_key_insights_from_textual_document|document_text_classification.ipynb)

![This ap-south-1 badge failed to load. Check your device's internet connectivity, otherwise the service is currently unavailable](https://h75twx4l60.execute-api.us-west-2.amazonaws.com/sagemaker-nb/ap-south-1/introduction_to_applying_machine_learning|identify_key_insights_from_textual_document|document_text_classification.ipynb)
