# Semantic Search with Amazon OpenSearch 

This is a quick demo of how to use Amazon OpenSearch to develop a semantic search application.

![word vector](word2vec.png)


### Upgrade PyTorch and restart Kernel

In [None]:
!pip install --upgrade torch

In [None]:
from IPython.display import display_html
def restartkernel() :
    display_html("<script>Jupyter.notebook.kernel.restart()</script>",raw=True)
restartkernel()

### Verify PyTorch version

In [1]:
import torch
print(torch.__version__)

1.10.2+cu102


### Install required libraries, such as Hugging Face Transformers

In [2]:
!pip install -q transformers
!pip install -q boto3
!pip install -q requests
!pip install -q requests-aws4auth
!pip install -q opensearch-py
!pip install -q tqdm
!pip install -q "transformers[torch]"
!pip install -U sentence-transformers rank_bm25



### Print SageMaker version

In [3]:
import boto3
import re
import time
import sagemaker
from sagemaker import get_execution_role

role = get_execution_role()

s3_resource = boto3.resource("s3")
s3 = boto3.client('s3')

print(f'SageMaker SDK Version: {sagemaker.__version__}')

SageMaker SDK Version: 2.106.0




## Difference between BM25 similarity and semantic similarity
### BM25 similarities

In [4]:
from rank_bm25 import BM25Okapi
from sklearn.feature_extraction import _stop_words
import string
from tqdm.autonotebook import tqdm
import numpy as np

passages=["does this work with xbox?",
          "Does the M70 work with Android phones?", 
          "does this work with iphone?",
          "Can this work with an xbox "
         ]

def bm25_tokenizer(text):
    tokenized_doc = []
    for token in text.lower().split():
        token = token.strip(string.punctuation)

        if len(token) > 0 and token not in _stop_words.ENGLISH_STOP_WORDS:
            tokenized_doc.append(token)
    return tokenized_doc


tokenized_corpus = []
for passage in tqdm(passages):
    tokenized_corpus.append(bm25_tokenizer(passage))

bm25 = BM25Okapi(tokenized_corpus)

bm25_scores = bm25.get_scores(bm25_tokenizer(passages[0]))

all_sentence_combinations = []
for i in range(len(bm25_scores)):
    all_sentence_combinations.append([bm25_scores[i], i])

all_sentence_combinations = sorted(all_sentence_combinations, key=lambda x: x[0], reverse=True)

print("Top most similar pairs:")
for score, i in all_sentence_combinations[0:4]:
    print("{} \t {} \t {:.4f}".format(passages[i],bm25_tokenizer(passages[i]),bm25_scores[i]))
    



Top most similar pairs:
does this work with xbox? 	 ['does', 'work', 'xbox'] 	 0.0255
does this work with iphone? 	 ['does', 'work', 'iphone'] 	 0.0255
Does the M70 work with Android phones? 	 ['does', 'm70', 'work', 'android', 'phones'] 	 0.0198
Can this work with an xbox  	 ['work', 'xbox'] 	 0.0149


### Semantic similarities

In [5]:
from sentence_transformers import SentenceTransformer, util
model = SentenceTransformer('all-MiniLM-L6-v2')

#Encode all sentences
embeddings = model.encode(passages)

#Compute cosine similarity between all pairs
cos_sim = util.cos_sim(embeddings, embeddings)

#cosine similarity score with query
all_sentence_combinations = []
for i in range(len(cos_sim)):
    all_sentence_combinations.append([cos_sim[0][i], i])

#Sort list by the highest cosine similarity score
all_sentence_combinations = sorted(all_sentence_combinations, key=lambda x: x[0], reverse=True)

print("Top most similar pairs:")
for score, i in all_sentence_combinations[0:4]:
    print("{} \t {:.4f}".format(passages[i],cos_sim[0][i]))

Top most similar pairs:
does this work with xbox? 	 1.0000
Can this work with an xbox  	 0.9444
does this work with iphone? 	 0.4522
Does the M70 work with Android phones? 	 0.3235
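
For intuition, a cosine score like the ones above is the dot product of two embedding vectors divided by the product of their lengths. The following dependency-free sketch computes it on hand-picked 3-dimensional vectors; `cosine_similarity` is an illustrative helper, not part of `sentence_transformers`, and the toy vectors are assumptions standing in for real embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors, chosen by hand for illustration only.
query = [1.0, 0.0, 1.0]
same_direction = [2.0, 0.0, 2.0]   # parallel to the query
orthogonal = [0.0, 3.0, 0.0]       # shares no direction with the query

print(round(cosine_similarity(query, same_direction), 4))
print(round(cosine_similarity(query, orthogonal), 4))
```

Vectors pointing the same way score close to 1 regardless of their length, while orthogonal vectors score 0, which is why paraphrases with similar embeddings rank near the top.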


## Get CloudFormation stack output variables

### Note: change "cloudformation_stack_name" to the name of the CloudFormation stack you created when provisioning your environment.

In [7]:
cfn = boto3.client('cloudformation')

def get_cfn_outputs(stackname):
    outputs = {}
    for output in cfn.describe_stacks(StackName=stackname)['Stacks'][0]['Outputs']:
        outputs[output['OutputKey']] = output['OutputValue']
    return outputs

## Setup variables to use for the rest of the demo
cloudformation_stack_name = "semantic-search-2"

outputs = get_cfn_outputs(cloudformation_stack_name)

bucket = outputs['s3BucketTraining']
aos_host = outputs['DomainEndpoint']

outputs

{'DomainEndpoint': 'search-opensearchservi-yomfx3pcjpau-q4udafmcuza4w5fgpyea4xyvci.us-east-1.es.amazonaws.com',
 'S3BucketSecureURL': 'https://semantic-search-2-s3buckethosting-33xm5021tpid.s3.amazonaws.com',
 'SageMakerNotebookURL': 'https://console.aws.amazon.com/sagemaker/home?region=us-east-1#/notebook-instances/openNotebook/NotebookInstance-kj0OB9Muj8Ed?view=classic',
 's3BucketTraining': 'semantic-search-2-s3buckettraining-8dfzorcextw5',
 'Arn': 'arn:aws:es:us-east-1:522880334446:domain/opensearchservi-yomfx3pcjpau',
 'osDomainName': 'opensearchservi-yomfx3pcjpau',
 's3BucketHostingBucketName': 'semantic-search-2-s3buckethosting-33xm5021tpid'}

## Step 1: Prepare BERT Model in SageMaker

Use a Hugging Face BERT model to convert text into vectors; each sentence is represented as a 768-dimensional vector.
![BERT](nlp_bert.png)

In [8]:
import torch
from transformers import AutoTokenizer, AutoModel
from transformers import DistilBertTokenizer, DistilBertModel

#model_name = "distilbert-base-uncased"
#model_name = "sentence-transformers/msmarco-distilbert-base-dot-prod-v3"
model_name = "sentence-transformers/distilbert-base-nli-stsb-mean-tokens"


#Mean Pooling - Take attention mask into account for correct averaging
def mean_pooling(model_output, attention_mask):
    token_embeddings = model_output[0] #First element of model_output contains all token embeddings
    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    sum_embeddings = torch.sum(token_embeddings * input_mask_expanded, 1)
    sum_mask = torch.clamp(input_mask_expanded.sum(1), min=1e-9)
    return sum_embeddings / sum_mask


def sentence_to_vector(raw_inputs):
    tokenizer = DistilBertTokenizer.from_pretrained(model_name)
    model = DistilBertModel.from_pretrained(model_name)
    inputs_tokens = tokenizer(raw_inputs, padding=True, return_tensors="pt")
    
    with torch.no_grad():
        outputs = model(**inputs_tokens)

    sentence_embeddings = mean_pooling(outputs, inputs_tokens['attention_mask'])
    return sentence_embeddings
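
To see what `mean_pooling` does, here is a dependency-free sketch of the same idea on plain Python lists: token vectors are averaged, but positions whose attention mask is 0 (padding) are left out of both the sum and the count. `masked_mean` and the toy numbers are illustrative assumptions, not part of the notebook.

```python
def masked_mean(token_embeddings, attention_mask):
    """Average token vectors, ignoring positions where the mask is 0."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for j, value in enumerate(vec):
                total[j] += value
    count = max(count, 1e-9)  # mirrors the clamp in mean_pooling
    return [t / count for t in total]

# Two real tokens followed by one padding token.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = masked_mean(tokens, mask)
print(pooled)
```

Without the mask, the padding row would pull the average far away from the two real tokens; with it, the result depends only on the sentence itself.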


### Save pre-trained BERT model to local and then upload to S3

In this section, we host the pretrained BERT model on a SageMaker PyTorch model server to generate fixed-length, 768-dimensional sentence embeddings from [sentence-transformers](https://github.com/UKPLab/sentence-transformers) using [HuggingFace Transformers](https://huggingface.co/sentence-transformers/distilbert-base-nli-stsb-mean-tokens).


In [9]:
import os
from transformers import AutoTokenizer, AutoModel
saved_model_dir = 'transformer'
os.makedirs(saved_model_dir, exist_ok=True)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name) 

tokenizer.save_pretrained(saved_model_dir)
model.save_pretrained(saved_model_dir)

In [10]:
sagemaker_session = sagemaker.Session()
role = sagemaker.get_execution_role()


In [11]:
!cd transformer && tar czvf ../model.tar.gz *

config.json
pytorch_model.bin
special_tokens_map.json
tokenizer_config.json
tokenizer.json
vocab.txt


In [12]:
#Upload the model to S3

inputs = sagemaker_session.upload_data(path='model.tar.gz', key_prefix='sentence-transformers-model')
inputs

's3://sagemaker-us-east-1-522880334446/sentence-transformers-model/model.tar.gz'

### Deploy the BERT model to SageMaker Endpoint

We need to create a PyTorchModel object. The deploy() method on the model object creates an endpoint that serves prediction requests in real time. If instance_type is set to a SageMaker instance type (e.g. ml.m5.large), the model is deployed on SageMaker. If instance_type is set to local, the model is deployed locally as a Docker container, ready for local testing.

We also need a Predictor class that accepts text as input and returns JSON. The default behaviour is to accept a NumPy array.

In [13]:
from sagemaker.pytorch import PyTorch, PyTorchModel
from sagemaker.predictor import Predictor
from sagemaker import get_execution_role

class StringPredictor(Predictor):
    def __init__(self, endpoint_name, sagemaker_session):
        super(StringPredictor, self).__init__(endpoint_name, sagemaker_session, content_type='text/plain')

Deploy the BERT model to a SageMaker endpoint

#### Note: This process will take several minutes to complete.

In [14]:
pytorch_model = PyTorchModel(model_data = inputs, 
                             role=role, 
                             entry_point ='inference.py',
                             source_dir = './code',
                             py_version = 'py38', 
                             framework_version = '1.10.2',
                             predictor_cls=StringPredictor)

predictor = pytorch_model.deploy(instance_type='ml.m5d.large', 
                                 initial_instance_count=1, 
                                 endpoint_name = f'semantic-search-model-{int(time.time())}')

-------!

content_type is a no-op in sagemaker>=2.
See: https://sagemaker.readthedocs.io/en/stable/v2.html for details.


### Test the SageMaker Endpoint.

Input is text data, output is vector data

In [15]:
import json
original_payload = 'Does this work with xbox?'
features = predictor.predict(original_payload)
vector_data = json.loads(features)

vector_data

[-0.07037321478128433,
 0.123631551861763,
 -0.1777588576078415,
 0.37857410311698914,
 0.10541839897632599,
 -0.17814640700817108,
 1.121869444847107,
 -0.1818254292011261,
 0.8721059560775757,
 -0.07338754832744598,
 -0.07336762547492981,
 0.3294079303741455,
 -0.7744457721710205,
 -0.2792033553123474,
 0.16999515891075134,
 -0.9044400453567505,
 -1.1421384811401367,
 0.5440397262573242,
 0.14487577974796295,
 -0.06362169981002808,
 -0.06719405204057693,
 0.22005021572113037,
 -0.33827993273735046,
 1.2252930402755737,
 -0.08813939243555069,
 0.45869138836860657,
 1.0466194152832031,
 0.023347239941358566,
 -0.008586255833506584,
 0.7446303367614746,
 -0.037270449101924896,
 0.39659425616264343,
 -0.6779797673225403,
 0.7318069934844971,
 0.16607779264450073,
 0.16313423216342926,
 0.19561108946800232,
 1.6662780046463013,
 1.0857700109481812,
 -0.7747550010681152,
 0.18123018741607666,
 -0.06501846760511398,
 0.9792702794075012,
 0.45116177201271057,
 0.3411131799221039,
 0.22166623

## Step 2: Ingest data to OpenSearch Cluster
Load the Amazon Product Question and Answer (PQA) data set from https://registry.opendata.aws/amazon-pqa/

### Downloading Amazon Product Question and Answer Data

Datasets: https://registry.opendata.aws/amazon-pqa/

In [16]:
!aws s3 ls --no-sign-request s3://amazon-pqa/

2021-05-20 13:11:25 2267692311 amazon-pqa.tar.gz
2021-05-09 11:53:53  442066567 amazon_pqa_accessories.json
2021-05-09 11:53:49  275062405 amazon_pqa_activity_&_fitness_trackers.json
2021-05-09 11:53:49  127094083 amazon_pqa_adapters.json
2021-05-09 11:53:49  143639699 amazon_pqa_amazon_echo_&_alexa_devices.json
2021-05-09 11:53:49  106017252 amazon_pqa_area_rugs.json
2021-05-09 11:53:49  164430689 amazon_pqa_backpacks.json
2021-05-09 11:53:49  679285046 amazon_pqa_basic_cases.json
2021-05-09 11:53:49  390964941 amazon_pqa_batteries.json
2021-05-09 11:53:49  107896488 amazon_pqa_battery_chargers.json
2021-05-09 11:53:49   77113272 amazon_pqa_bed_frames.json
2021-05-09 11:53:49  157944761 amazon_pqa_beds.json
2021-05-09 11:53:49  218133567 amazon_pqa_bullet_cameras.json
2021-05-09 11:53:50  118106256 amazon_pqa_camcorders.json
2021-05-09 11:53:50   71239417 amazon_pqa_car.json
2021-05-09 11:53:50  137487049 amazon_pqa_car_stereo_receivers.json
2021-05-09 11:53:50  153301

In [17]:
!aws s3 cp --no-sign-request s3://amazon-pqa/amazon_pqa_headsets.json ./amazon-pqa/amazon_pqa_headsets.json

download: s3://amazon-pqa/amazon_pqa_headsets.json to amazon-pqa/amazon_pqa_headsets.json


### Ingest 1,000 rows of data for testing

In [18]:
import json
import pandas as pd

def load_pqa(file_name, number_rows=1000):
    # Collect rows in a list first; growing a DataFrame row by row is slow.
    rows = []
    with open(file_name) as f:
        for i, line in enumerate(f):
            if i == number_rows:
                break
            data = json.loads(line)
            # Keep the question and its first answer only.
            rows.append([data['question_text'], data['answers'][0]['answer_text']])
    return pd.DataFrame(rows, columns=['question', 'answer'])


qa_list = load_pqa('amazon-pqa/amazon_pqa_headsets.json',number_rows=1000)
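
`load_pqa` assumes that every record has at least one answer; `data['answers'][0]` raises `IndexError` otherwise. Below is a hedged sketch of a more defensive parser, assuming the JSON-lines layout used above (`question_text`, `answers`, `answer_text`). The helper name `iter_qa_pairs` and the two sample records are hypothetical.

```python
import io
import json

def iter_qa_pairs(lines, limit=None):
    """Yield (question, first_answer) tuples from JSON-lines input,
    skipping blank lines and records that have no answers."""
    yielded = 0
    for line in lines:
        if not line.strip():
            continue
        record = json.loads(line)
        answers = record.get("answers") or []
        if not answers:
            continue
        yield record["question_text"], answers[0]["answer_text"]
        yielded += 1
        if limit is not None and yielded >= limit:
            break

# Hypothetical sample: the second record has no answers and is skipped.
sample = io.StringIO(
    '{"question_text": "does this work with xbox?", "answers": [{"answer_text": "Yes"}]}\n'
    '{"question_text": "is it wireless?", "answers": []}\n'
)
pairs = list(iter_qa_pairs(sample))
print(pairs)
```

The resulting pairs could be passed to `pd.DataFrame(pairs, columns=['question', 'answer'])` to obtain the same shape that `load_pqa` returns.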




Convert the text data into vector data

In [19]:
vector_sentences = sentence_to_vector(qa_list["question"].tolist())

Use Python API to set up connection with OpenSearch Cluster

In [20]:
# from elasticsearch import Elasticsearch, RequestsHttpConnection
# from requests_aws4auth import AWS4Auth
# region = 'us-east-1' 
# service = 'es'
# credentials = boto3.Session().get_credentials()
# awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service, session_token=credentials.token)

# es_client = Elasticsearch(
#     hosts = [{'host': aos_host, 'port': 443}],
#     http_auth = awsauth,
#     use_ssl = True,
#     verify_certs = True,
#     connection_class = RequestsHttpConnection
# )

from opensearchpy import OpenSearch, RequestsHttpConnection, AWSV4SignerAuth
import boto3

#es_host = 'search-semanti-domain-7fc1mmzarfpg-vtklyjm33bhijjarsdhbyl7jxq.us-east-1.es.amazonaws.com' 
region = 'us-east-1' 

credentials = boto3.Session().get_credentials()
auth = AWSV4SignerAuth(credentials, region)
index_name = 'nlp_pqa'

aos_client = OpenSearch(
    hosts = [{'host': aos_host, 'port': 443}],
    http_auth = auth,
    use_ssl = True,
    verify_certs = True,
    connection_class = RequestsHttpConnection
)

Create an index with three fields: "question_vector" holds the sentence embedding, while "question" and "answer" hold the raw text.

In [21]:
knn_index = {
    "settings": {
        "index.knn": True,
        "index.knn.space_type": "cosinesimil",
        "analysis": {
          "analyzer": {
            "default": {
              "type": "standard",
              "stopwords": "_english_"
            }
          }
        }
    },
    "mappings": {
        "properties": {
            "question_vector": {
                "type": "knn_vector",
                "dimension": 768,
                "store": True
            },
            "question": {
                "type": "text",
                "store": True
            },
            "answer": {
                "type": "text",
                "store": True
            }
        }
    }
}


In [24]:
#aos_client.indices.delete(index="nlp_pqa")


{'acknowledged': True}

In [25]:
aos_client.indices.create(index="nlp_pqa",body=knn_index,ignore=400)


{'acknowledged': True, 'shards_acknowledged': True, 'index': 'nlp_pqa'}

Show the created index information

In [26]:
aos_client.indices.get(index="nlp_pqa")

{'nlp_pqa': {'aliases': {},
  'mappings': {'properties': {'answer': {'type': 'text', 'store': True},
    'question': {'type': 'text', 'store': True},
    'question_vector': {'type': 'knn_vector',
     'store': True,
     'dimension': 768}}},
  'settings': {'index': {'number_of_shards': '5',
    'provided_name': 'nlp_pqa',
    'knn.space_type': 'cosinesimil',
    'knn': 'true',
    'creation_date': '1663728080536',
    'analysis': {'analyzer': {'default': {'type': 'standard',
       'stopwords': '_english_'}}},
    'number_of_replicas': '1',
    'uuid': 'LTd-oO0gRuS4tmzXLlHPZA',
    'version': {'created': '135248027'}}}}}

In [27]:
i = 0
for c in qa_list["question"].tolist():
    content=c
    vector=vector_sentences[i].tolist()
    answer=qa_list["answer"][i]
    i+=1
    aos_client.index(index='nlp_pqa',body={"question_vector": vector, "question": content,"answer":answer})
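
Indexing one document per request, as above, costs one HTTP round trip per question. A common alternative is the bulk helper in `opensearch-py` (`opensearchpy.helpers.bulk`), which accepts an iterable of action dictionaries. The sketch below only builds those actions, so it runs without a cluster; `build_bulk_actions` is a hypothetical helper, and the toy 2-dimensional vector stands in for a 768-dimensional one.

```python
def build_bulk_actions(index_name, questions, answers, vectors):
    """Yield one bulk 'index' action per question/answer/vector triple."""
    for question, answer, vector in zip(questions, answers, vectors):
        yield {
            "_index": index_name,
            "_source": {
                "question_vector": list(vector),
                "question": question,
                "answer": answer,
            },
        }

actions = list(build_bulk_actions(
    "nlp_pqa",
    ["does this work with xbox?"],
    ["Yes"],
    [[0.1, 0.2]],  # toy vector; the real index expects 768 dimensions
))
print(actions[0]["_source"]["question"])

# With a live client, the actions could be sent in one call:
#   from opensearchpy import helpers
#   helpers.bulk(aos_client, actions)
```

In the notebook, the inputs would be `qa_list["question"]`, `qa_list["answer"]`, and `vector_sentences.tolist()`.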

### Ingest all the headset PQA data into OpenSearch Cluster
Uncomment the following code to ingest every headset question, its answer, and the corresponding question vector into the OpenSearch index.

### Note: it will take more than 10 minutes to complete.

In [28]:
# import json
# from tqdm.contrib.concurrent import process_map
# from multiprocessing import cpu_count


# def load_pqa_as_json(file_name):
#     result=[]
#     with open(file_name) as f:
#         for line in f:
#             data = json.loads(line)
#             result.append(data)
#     return result


# qa_list_json = load_pqa_as_json('amazon-pqa/amazon_pqa_headsets.json')


# def es_import(question):
#     vector = json.loads(predictor.predict(question["question_text"]))
#     aos_client.index(index='nlp_pqa',
#              body={"question_vector": vector, "question": question["question_text"],"answer":question["answers"][0]["answer_text"]}
#             )
        
# workers = 4 * cpu_count()
    
# process_map(es_import, qa_list_json, max_workers=workers,chunksize=1000)

### Query the number of documents in the OpenSearch Cluster

In [29]:
res = aos_client.search(index="nlp_pqa", body={"query": {"match_all": {}}})
print("Got %d Hits:" % res['hits']['total']['value'])

Got 976 Hits:


## Step 3: Semantic Search 
### Generate vector data for user input query 

Generate vector data for the question by calling SageMaker model

In [30]:
query_raw_sentences = ['does this work with xbox?']
client = boto3.client('sagemaker-runtime')
ENDPOINT_NAME = predictor.endpoint_name
response = client.invoke_endpoint(EndpointName=ENDPOINT_NAME,
                                       ContentType='text/plain',
                                       Body=query_raw_sentences[0])

search_vector = json.loads((response['Body'].read()))


### Search vector data with "Semantic Search"

OpenSearch KNN


In [31]:

query={
    "size": 50,
    "query": {
        "knn": {
            "question_vector":{
                "vector":search_vector,
                "k":50
            }
        }
    }
}

res = aos_client.search(index="nlp_pqa", 
                       body=query,
                       stored_fields=["question","answer"])
#print("Got %d Hits:" % res['hits']['total']['value'])
query_result=[]
for hit in res['hits']['hits']:
    row=[hit['_id'],hit['_score'],hit['fields']['question'][0],hit['fields']['answer'][0]]
    query_result.append(row)

query_result_df = pd.DataFrame(data=query_result,columns=["_id","_score","question","answer"])
display(query_result_df)

| | _id | _score | question | answer |
|---|---|---|---|---|
| 0 | Nj_rXYMB3Pe6ZRwq6uBv | 0.976584 | Does this work with xbox one? | sorry, Im not an xbox user. |
| 1 | Qj_sXYMB3Pe6ZRwqA-I_ | 0.976127 | Does this work with the xbox one? | Yeah of course , but you must have an adapter ... |
| 2 | tD_rXYMB3Pe6ZRwq_OFz | 0.967617 | does this work on xbox one? | I'm sorry, but not! |
| 3 | rD_rXYMB3Pe6ZRwq49_Z | 0.966998 | Does this work for xbox one S? | It should work. |
| 4 | xD_rXYMB3Pe6ZRwq_eFT | 0.963552 | Does it work for xbox one? | Thanks for your inquiry, it just works with PS... |
| 5 | Hz_sXYMB3Pe6ZRwqDuNr | 0.954933 | Will it work with Xbox One? | With the chat adapter for xbox one remotes, bi... |
| 6 | uD_rXYMB3Pe6ZRwq5N9Y | 0.953493 | Do they work with xbox one? | No they don't , but let's hope that Microsoft ... |
| 7 | kz_rXYMB3Pe6ZRwq-uH_ | 0.952881 | will these work with xbox one? | Yes |
| 8 | 1j_rXYMB3Pe6ZRwq_uEr | 0.950155 | Will it work for xbox one? | Sorry, it is not compatible with PS4 Xbox one.... |
| 9 | Oz_sXYMB3Pe6ZRwqD-Or | 0.948001 | Do they work with Xbox One system? | If you have the Xbox controller stereo headset... |


### Search the same query with "Keyword Search"

In [32]:
query={
    "size": 50,
    "query": {
        "match": {
            "question":"does this work with xbox?"
        }
    }
}

res = aos_client.search(index="nlp_pqa", 
                       body=query,
                       stored_fields=["question","answer"])
#print("Got %d Hits:" % res['hits']['total']['value'])
query_result=[]
for hit in res['hits']['hits']:
    row=[hit['_id'],hit['_score'],hit['fields']['question'][0],hit['fields']['answer'][0]]
    query_result.append(row)

query_result_df = pd.DataFrame(data=query_result,columns=["_id","_score","question","answer"])
display(query_result_df)


| | _id | _score | question | answer |
|---|---|---|---|---|
| 0 | Nj_rXYMB3Pe6ZRwq6uBv | 7.322312 | Does this work with xbox one? | sorry, Im not an xbox user. |
| 1 | xD_rXYMB3Pe6ZRwq_eFT | 6.912413 | Does it work for xbox one? | Thanks for your inquiry, it just works with PS... |
| 2 | CT_sXYMB3Pe6ZRwqDeMX | 6.912413 | does it work on xbox 1 | I am not sure about that but for the price I w... |
| 3 | Qj_sXYMB3Pe6ZRwqA-I_ | 6.807226 | Does this work with the xbox one? | Yeah of course , but you must have an adapter ... |
| 4 | 3D_rXYMB3Pe6ZRwq5t8W | 6.64985 | does it work for an xbox 1? | As long as your controller has a 3.5 headset j... |
| 5 | tD_rXYMB3Pe6ZRwq_OFz | 6.64985 | does this work on xbox one? | I'm sorry, but not! |
| 6 | vT_rXYMB3Pe6ZRwq_OHf | 6.64985 | Does it work for Xbox 360? | Sorry , it can't .Just for PS4 |
| 7 | rD_rXYMB3Pe6ZRwq49_Z | 6.251604 | Does this work for xbox one S? | It should work. |
| 8 | 8j_sXYMB3Pe6ZRwqC-Lc | 6.251604 | Does it work for Xbox 360 and os4 | Yes. If you get the correct mixamp for those c... |
| 9 | wT_rXYMB3Pe6ZRwq_eEt | 6.096318 | If you have an adaptor does it work for xbox one | Yes it does. Enjoy |
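
One way to quantify how much the two result lists differ is to compare their document IDs. In the ten rows shown for each query above, five `_id` values appear in both lists. The helper below is an illustrative sketch (`overlap_at_k` is not part of any library), and the IDs in it are short placeholders rather than the real `_id` values.

```python
def overlap_at_k(ids_a, ids_b, k=10):
    """Return the IDs found in the top-k of both rankings, in ids_a order."""
    top_b = set(ids_b[:k])
    return [doc_id for doc_id in ids_a[:k] if doc_id in top_b]

# Placeholder IDs standing in for the `_id` column of each result set.
semantic_ids = ["d1", "d2", "d3", "d4"]
keyword_ids = ["d1", "d5", "d2", "d6"]

shared = overlap_at_k(semantic_ids, keyword_ids, k=4)
print(f"{len(shared)} of 4 results appear in both lists: {shared}")
```

With the notebook's data, the inputs would be the `_id` column of each result DataFrame, saved under separate names because both cells reuse `query_result_df`.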


## Step 4: Deploying a full-stack semantic search application

The architecture of the full-stack semantic search application is as follows:

![full stack semantic search](semantic_search_fullstack.jpg)


### Disable S3 "Block all public access"

Go to the S3 console, click "Block Public Access settings for this account", and make sure "Block all public access" is off.

In [49]:
s3_resource.Object(bucket, 'backend/template.yaml').upload_file('./backend/template.yaml', ExtraArgs={'ACL':'public-read'})


sam_template_url = f'https://{bucket}.s3.amazonaws.com/backend/template.yaml'
print("cloudformation template url:" + sam_template_url)


# Generate the CloudFormation Quick Create Link

print("Click the URL below to create the backend API for NLU search:\n")
print((
    'https://console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/create/review'
    f'?templateURL={sam_template_url}'
    '&stackName=semantic-search-api'
    f'&param_BucketName={outputs["s3BucketTraining"]}'
    f'&param_DomainName={outputs["osDomainName"]}'
    f'&param_ElasticSearchURL={outputs["DomainEndpoint"]}'
    f'&param_SagemakerEndpoint={predictor.endpoint_name}'
))


cloudformation template url:https://semantic-search-2-s3buckettraining-8dfzorcextw5.s3.amazonaws.com/backend/template.yaml
Click the URL below to create the backend API for NLU search:

https://console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/create/review?templateURL=https://semantic-search-2-s3buckettraining-8dfzorcextw5.s3.amazonaws.com/backend/template.yaml&stackName=semantic-search-api&param_BucketName=semantic-search-2-s3buckettraining-8dfzorcextw5&param_DomainName=opensearchservi-yomfx3pcjpau&param_ElasticSearchURL=search-opensearchservi-yomfx3pcjpau-q4udafmcuza4w5fgpyea4xyvci.us-east-1.es.amazonaws.com&param_SagemakerEndpoint=semantic-search-model-1663726417


Now that you have a working Amazon SageMaker endpoint for generating sentence embeddings and a k-NN index on OpenSearch, you are ready to build a real-world full-stack ML-powered web app. The SAM template you just uploaded deploys an Amazon API Gateway and an AWS Lambda function. The Lambda function runs your code in response to HTTP requests sent to the API Gateway.

In [35]:
!pygmentize backend/lambda/app.py

import json
from os import environ

import boto3
from urllib.parse import urlparse

from elasticsearch import Elasticsearch, RequestsHttpConnection
from requests_aws4auth import AWS4Auth

# Global variables that are reused
sm_runtime_client = boto3.client('sagemaker-runtime')
s3_client = boto3.client('s3')


def get_features(sm_runtime_client, sagemaker_endpoint, payload):
    response = sm_runtime_client.invoke_endpoint(
        EndpointName=sagemaker_endpoint,
        ContentType='te

## Once the CloudFormation Stack shows CREATE_COMPLETE, proceed to this cell below:

In [50]:
import json
api_endpoint = get_cfn_outputs('semantic-search-api')['TextSimilarityApi']

with open('./frontend/src/config/config.json', 'w') as outfile:
    json.dump({'apiEndpoint': api_endpoint}, outfile)

## Deploy frontend services

In [51]:
# add NPM to the path so we can assemble the web frontend from our notebook code

from os import environ

npm_path = ':/home/ec2-user/anaconda3/envs/JupyterSystemEnv/bin'

if npm_path not in environ['PATH']:
    ADD_NPM_PATH = environ['PATH']
    ADD_NPM_PATH = ADD_NPM_PATH + npm_path
else:
    ADD_NPM_PATH = environ['PATH']
    
%set_env PATH=$ADD_NPM_PATH

env: PATH=/usr/local/cuda-10.1/bin:/opt/amazon/openmpi/bin:/opt/amazon/efa/bin:/home/ec2-user/anaconda3/condabin:/home/ec2-user/.dl_binaries/bin:/usr/local/cuda/bin:/usr/libexec/gcc/x86_64-amazon-linux/4.8.5:/home/ec2-user/anaconda3/envs/pytorch_p36/bin:/home/ec2-user/anaconda3/condabin:/opt/amazon/openmpi/bin:/opt/amazon/efa/bin:/home/ec2-user/anaconda3/condabin:/home/ec2-user/.dl_binaries/bin:/usr/local/cuda/bin:/usr/libexec/gcc/x86_64-amazon-linux/4.8.5:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/opt/aws/bin:/opt/aws/bin:/home/ec2-user/anaconda3/envs/JupyterSystemEnv/bin


In [52]:
%cd ./frontend/

!npm install

/home/ec2-user/SageMaker/semantic-search/frontend
npm WARN fork-ts-checker-webpack-plugin@6.5.2 requires a peer of typescript@>= 2.7 but none is installed. You must install peer dependencies yourself.
npm WARN tsutils@3.21.0 requires a peer of typescript@>=2.8.0 || >= 3.2.0-dev || >= 3.3.0-dev || >= 3.4.0-dev || >= 3.5.0-dev || >= 3.6.0-dev || >= 3.6.0-beta || >= 3.7.0-dev || >= 3.7.0-beta but none is installed. You must install peer dependencies yourself.
npm WARN optional SKIPPING OPTIONAL DEPENDENCY: fsevents@2.3.2 (node_modules/fsevents):
npm WARN notsup SKIPPING OPTIONAL DEPENDENCY: Unsupported platform for fsevents@2.3.2: wanted {"os":"darwin","arch":"any"} (current: {"os":"linux","arch":"x64"})

audited 1942 packages in 8.516s

223 packages are looking for funding
  run `npm fund` for details

found 1

In [53]:
!npm run-script build


> frontend@0.1.0 build /home/ec2-user/SageMaker/semantic-search/frontend
> react-scripts build

Creating an optimized production build...

[eslint] 
src/App.js
  Line 4:25:  'FormControl' is defined but never used                    no-unused-vars
  Line 4:38:  'Select' is defined but never used                         no-unused-vars
  Line 4:46:  'MenuItem' is defined but never used                       no-unused-vars
  Line 10:8:  'GridList' is defined but never used                       no-unused-vars
  Line 11:8:  'GridListTile' is defined but never used                   no-unused-vars
  Line 12:8:  'GridListTileBar' is defined but never used                no-unused-vars
  Line 51:7:  'BorderLinearProgress' is assigned a value but never used  no-unused-vars

src/config/index

In [54]:
hosting_bucket = f"s3://{outputs['s3BucketHostingBucketName']}"

!aws s3 sync ./build/ $hosting_bucket --acl public-read

upload: build/manifest.json to s3://semantic-search-2-s3buckethosting-33xm5021tpid/manifest.json
upload: build/static/js/main.aa8ede42.js.LICENSE.txt to s3://semantic-search-2-s3buckethosting-33xm5021tpid/static/js/main.aa8ede42.js.LICENSE.txt
upload: build/favicon.ico to s3://semantic-search-2-s3buckethosting-33xm5021tpid/favicon.ico
upload: build/logo512.png to s3://semantic-search-2-s3buckethosting-33xm5021tpid/logo512.png
upload: build/index.html to s3://semantic-search-2-s3buckethosting-33xm5021tpid/index.html
upload: build/static/css/main.4ef2127a.css.map to s3://semantic-search-2-s3buckethosting-33xm5021tpid/static/css/main.4ef2127a.css.map
upload: build/asset-manifest.json to s3://semantic-search-2-s3buckethosting-33xm5021tpid/asset-manifest.json
upload: build/logo192.png to s3://semantic-search-2-s3buckethosting-33xm5021tpid/logo192.png
upload: build/robots.txt to s3://semantic-search-2-s3buckethosting-33xm5021tpid/robots.txt
upload: build/static/media/roboto-latin-100italic.7

## Browse your frontend service

In [55]:
print('Click the URL below:\n')
print(outputs['S3BucketSecureURL'] + '/index.html')

Click the URL below:

https://semantic-search-2-s3buckethosting-33xm5021tpid.s3.amazonaws.com/index.html


You can search for a question, for example "does this work with xbox?", and compare the results. You will see the difference between keyword search and semantic search.

![full stack semantic search](full-stack-semantic-search-ui.jpg)

In keyword search, questions such as "Does this work for a switch?" and "does this work with pc", which contain "does this work", are returned even though their meaning is quite different from the query.

In semantic search, questions such as "Do I need to buy anything extra to used in xbox one s controller?" and "How do these headphones connect to the Xbox360 controller?" are returned; their meaning is very close to the query.
![full stack semantic search](full-stack-semantic-search-ui-2.jpg)
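
The contrast described above can be made concrete with a simple word-overlap measure. The sketch below uses Jaccard similarity over lowercased tokens, which is a deliberate simplification and not the BM25 scoring that OpenSearch applies; `tokens` and `jaccard` are illustrative helpers.

```python
import string

def tokens(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    stripped = (tok.strip(string.punctuation) for tok in text.lower().split())
    return {tok for tok in stripped if tok}

def jaccard(a, b):
    """Share of distinct tokens that two texts have in common."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0

query = "does this work with xbox?"
lexical_match = "does this work with pc"
semantic_match = "How do these headphones connect to the Xbox360 controller?"

print(round(jaccard(query, lexical_match), 2))
print(round(jaccard(query, semantic_match), 2))
```

The question about a PC shares four of six distinct words with the query, while the question about the Xbox360 controller shares none, so any purely lexical measure favors the former even though the latter is closer in meaning.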

## Cleanup

Make sure that you stop the notebook instance, delete the Amazon SageMaker endpoint, and delete the OpenSearch domain to prevent any additional charges.

In [None]:
# Delete the endpoint
predictor.delete_endpoint()

# Empty S3 Contents
training_bucket_resource = s3_resource.Bucket(bucket)
training_bucket_resource.objects.all().delete()

hosting_bucket_resource = s3_resource.Bucket(outputs['s3BucketHostingBucketName'])
hosting_bucket_resource.objects.all().delete()