[Reference](https://medium.com/tripadvisor/a-quick-tutorial-on-how-to-build-vector-search-d74ad26f9ffe)

In [1]:
from flask import Flask, request, jsonify
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np

# Initialize the Flask application
app = Flask(__name__)

# Define the model name and load the SentenceTransformer model once at startup,
# so it is not reloaded on every request
MODEL_NAME = 'sentence-transformers/all-mpnet-base-v2'
model = SentenceTransformer(MODEL_NAME)

# Define fixed categories
CATEGORIES = ['eatery', 'accommodation', 'attraction']

# Precompute embeddings for the fixed categories
category_embeddings = model.encode(CATEGORIES)
category_embeddings = np.array(category_embeddings)  # Ensure it's a NumPy array


@app.route('/map_to_category', methods=['POST'])
def map_to_category():
    """
    Expects a JSON payload with:
    {
      "input": "inn"
    }
    Returns a JSON response containing:
      - the original input,
      - the closest category (from CATEGORIES),
      - cosine similarity scores for all categories.
    """
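    # silent=True makes get_json() return None instead of raising on a missing or
    # malformed JSON body, so every bad request gets the same error message below.
    data = request.get_json(silent=True)

    if not isinstance(data, dict) or 'input' not in data:
        return jsonify({'error': 'JSON payload must contain an "input" key.'}), 400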

    input_text = data['input']

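    # Compute the embedding for the input text. Encoding a one-item list yields a 2D array
    # of shape (1, embedding_dim), which is the layout cosine_similarity expects.
    input_embedding = model.encode([input_text])
    input_embedding = np.array(input_embedding)

    # --- Compute Cosine Similarity ---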
    # Cosine similarity measures the orientation of two vectors regardless of their magnitude.
    # For two vectors u and v, the cosine similarity is computed as:
    #    cosine_similarity(u, v) = (u · v) / (||u|| * ||v||)
    # This score ranges from -1 to 1. In our semantic space, higher values (closer to 1)
    # indicate that the texts are more similar.
    similarities = cosine_similarity(input_embedding, category_embeddings)
    similarities = similarities[0]  # Convert from a 2D array to a 1D array

    # --- Identify the Closest Category ---
    # Once we have the similarity scores, the category with the highest score is considered
    # the best match for the input string. We use np.argmax to find the index of the maximum
    # similarity score and map it back to the corresponding category.
    closest_index = int(np.argmax(similarities))
    # The scores are in the same order as the CATEGORIES list, so the index maps
    # directly to the corresponding category.
    closest_category = CATEGORIES[closest_index]

    # Prepare the response with details.
    response = {
        'input': input_text,
        'closest_category': closest_category,
        'similarities': {cat: float(sim) for cat, sim in zip(CATEGORIES, similarities)}
    }

    return jsonify(response)

if __name__ == '__main__':
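    # Run the Flask development server on all network interfaces, port 5001
    # (reachable locally at http://127.0.0.1:5001/).
    # debug=True enables the reloader and the interactive debugger: local development only.
    app.run(host='0.0.0.0', port=5001, debug=True)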

 * Serving Flask app '__main__'
 * Debug mode: on


 * Running on all addresses (0.0.0.0)
 * Running on http://127.0.0.1:5001
 * Running on http://172.28.0.12:5001
INFO:werkzeug:Press CTRL+C to quit
INFO:werkzeug: * Restarting with stat
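
Once the server is running, the endpoint can be exercised with a small client. The sketch below uses only the standard library. The helper names `build_request` and `query_category` and the `BASE_URL` value are illustrative assumptions, not part of the original notebook; the port simply matches the one passed to `app.run`. Because `app.run` blocks, the client would have to run from a separate process or terminal.

```python
import json
import urllib.request

# Assumed address: the port matches app.run(..., port=5001) in the cell above.
BASE_URL = "http://127.0.0.1:5001"


def build_request(text, base_url=BASE_URL):
    """Build (but do not send) a POST request for the /map_to_category endpoint."""
    body = json.dumps({"input": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/map_to_category",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def query_category(text, base_url=BASE_URL, timeout=10):
    """Send the request and return the decoded JSON response as a dict."""
    with urllib.request.urlopen(build_request(text, base_url), timeout=timeout) as resp:
        return json.load(resp)
```

With the server up, `query_category("inn")` should return a dict with the `input`, `closest_category`, and `similarities` keys described in the endpoint's docstring.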

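The cosine similarity formula and the argmax step described in the code comments can be checked by hand on small vectors. The following is a dependency-free sketch: the three-dimensional vectors are made-up stand-ins for real sentence embeddings (which have far more dimensions), and the names `cosine_sim` and `closest_category` are illustrative, not taken from any library.

```python
import math


def cosine_sim(u, v):
    """Return (u . v) / (||u|| * ||v||); assumes both vectors are non-zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def closest_category(input_vec, category_vecs):
    """Return (best category, all scores), choosing the highest cosine similarity."""
    scores = {cat: cosine_sim(input_vec, vec) for cat, vec in category_vecs.items()}
    return max(scores, key=scores.get), scores


# Made-up toy vectors standing in for real category embeddings.
toy_categories = {
    "eatery": [1.0, 0.0, 0.0],
    "accommodation": [0.0, 1.0, 0.0],
    "attraction": [0.0, 0.0, 1.0],
}
toy_input = [0.2, 0.9, 0.1]  # hypothetical embedding for an input such as "inn"

best, scores = closest_category(toy_input, toy_categories)
print(best)  # prints "accommodation"
```

Scaling a vector does not change the score: `cosine_sim([1, 2, 3], [2, 4, 6])` is 1 up to floating-point rounding, which is the "regardless of magnitude" property the comments refer to.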