<a href="https://colab.research.google.com/github/jyotidabass/Chatbot/blob/main/FlaskPineConeChatbot.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Install required libraries**

In [9]:
!pip install Flask SQLAlchemy requests google-auth-oauthlib google-auth pandas
!pip install pinecone

Collecting pinecone
  Downloading pinecone-4.0.0-py3-none-any.whl (214 kB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m214.4/214.4 kB[0m [31m3.3 MB/s[0m eta [36m0:00:00[0m
Installing collected packages: pinecone
Successfully installed pinecone-4.0.0


**Set up the Pinecone embeddings vector database**

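Create a Pinecone project and a new index. Upsert your vectors (e.g., pre-trained BERT embeddings) into the index, together with the corresponding metadata (e.g., engineer profiles). The index dimension must match the embedding size, which is 768 for the BERT-style vectors assumed later in this notebook.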
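
The upsert step described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: `build_records`, `upsert_in_batches`, the `bio` and `name` profile fields, and the `embed` callable are hypothetical names introduced here. The `index` argument is assumed to be an index handle from the `pinecone` client (for example `Pinecone(api_key=...).Index(...)`), whose `upsert(vectors=[...])` method accepts records with `id`, `values`, and `metadata` keys.

```python
def build_records(profiles, embed):
    """Turn engineer profiles into Pinecone-style upsert records.

    `profiles` is a list of dicts with hypothetical keys id, name, bio, skills.
    `embed` is any callable that maps text to a list of floats.
    """
    records = []
    for profile in profiles:
        records.append({
            "id": str(profile["id"]),         # Pinecone record IDs are strings
            "values": embed(profile["bio"]),  # the embedding vector
            "metadata": {"name": profile["name"], "skills": profile["skills"]},
        })
    return records


def upsert_in_batches(index, records, batch_size=100):
    """Send records to the index in fixed-size batches; return how many were sent."""
    for start in range(0, len(records), batch_size):
        index.upsert(vectors=records[start:start + batch_size])
    return len(records)
```

Batching keeps each request small, which matters once the number of profiles grows beyond a few hundred.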


**Set up the SQLAlchemy database**

Create a SQLite or PostgreSQL database and set up SQLAlchemy to connect to it. Populate the database with information about the engineers, including their skills, experience, and other relevant details.
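
As a rough sketch of the database setup, the snippet below creates an `engineers` table with the columns the search functions in this notebook read (`id`, `type`, `budget`, `skills`). The `name` column, the helper names `init_db` and `add_engineers`, and the comma-separated format for `skills` are assumptions made for illustration. It uses the standard-library `sqlite3` module to keep the example dependency-free; a SQLAlchemy engine such as `create_engine('sqlite:///engine_data.db')` can then read the same file.

```python
import sqlite3

# Columns id, type, budget, and skills are the ones the search code queries;
# name is a hypothetical extra column.
SCHEMA = """
CREATE TABLE IF NOT EXISTS engineers (
    id      INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    type    TEXT NOT NULL CHECK (type IN ('full-time', 'part-time')),
    budget  REAL NOT NULL,
    skills  TEXT NOT NULL
)
"""


def init_db(path=":memory:"):
    """Open (or create) the database and make sure the engineers table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def add_engineers(conn, rows):
    """Insert (name, type, budget, skills) tuples using bound parameters."""
    conn.executemany(
        "INSERT INTO engineers (name, type, budget, skills) VALUES (?, ?, ?, ?)",
        rows,
    )
    conn.commit()
```

Calling `init_db("engine_data.db")` instead of the in-memory default would write to the file that the Flask app's engine points at.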


**Create the Flask app**

Create a Flask app to handle the front-end requests and serve the search function and results.

In [10]:
from flask import Flask, request, jsonify
from sqlalchemy import create_engine, text
import requests
import pandas as pd

app = Flask(__name__)
engine = create_engine('sqlite:///engine_data.db')

@app.route("/search", methods=["POST"])
def search():
    # Extract the search query from the request
    query = request.form["query"]

    # Perform the search
    if "scalar" in query:
        results = scalar_search(query)
    else:
        results = semantic_search(query)

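    # Return the results as a JSON list of row objects
    # (a pandas DataFrame cannot be passed to jsonify directly)
    return jsonify(results.to_dict(orient="records"))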

@app.route("/")
def home():
    return "Chatbot Search API"

if __name__ == "__main__":
    app.run(debug=True)

 * Serving Flask app '__main__'
 * Debug mode: on


 * Running on http://127.0.0.1:5000
INFO:werkzeug:Press CTRL+C to quit
INFO:werkzeug: * Restarting with stat


Create a scalar_search function that takes a query and performs a search based on the scalar criteria (part-time/full-time, budget, and skills). This function should return the relevant candidate rows.

In [11]:
import re

def scalar_search(query):
    # Extract scalar parameters from the query
    part_time = "part-time" in query
    full_time = "full-time" in query
    budget = ... # Extract budget from query
    skills = ... # Extract skills from query

    # Load all candidates; the filters below are applied in pandas
    candidates = pd.read_sql_query("SELECT * FROM engineers", engine)

    # Filter on employment type, keeping both types when the query names neither
    wanted_types = [t for t, wanted in (("part-time", part_time), ("full-time", full_time)) if wanted]
    if wanted_types:
        candidates = candidates[candidates["type"].isin(wanted_types)]

    # Filter on budget
    candidates = candidates[candidates["budget"] >= budget]

    # Filter on skills: keep rows whose skills column mentions any requested skill
    pattern = "|".join(re.escape(skill) for skill in skills)
    candidates = candidates[candidates["skills"].str.contains(pattern, case=False, na=False)]

    return candidates
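
The two `...` placeholders in scalar_search depend on how queries are phrased, which this notebook does not specify. One possible sketch, assuming budgets are written as dollar amounts (e.g. `$4,500`) and skills come from a fixed vocabulary, is shown below. `extract_budget`, `extract_skills`, and `KNOWN_SKILLS` are hypothetical names introduced for illustration.

```python
import re

# Hypothetical skill vocabulary; in practice this could be loaded from the database.
KNOWN_SKILLS = ["python", "flask", "sql", "react"]


def extract_budget(query):
    """Return the first dollar amount in the query as a float, or None if absent."""
    match = re.search(r"\$\s*(\d[\d,]*(?:\.\d+)?)", query)
    return float(match.group(1).replace(",", "")) if match else None


def extract_skills(query, vocabulary=KNOWN_SKILLS):
    """Return the vocabulary skills that appear as whole words in the query."""
    lowered = query.lower()
    return [s for s in vocabulary if re.search(rf"\b{re.escape(s)}\b", lowered)]
```

Because `extract_budget` returns `None` when no amount is found, a caller would need to skip the budget filter in that case rather than compare against `None`.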

Create a semantic_search function that takes a query and performs a semantic search using the Pinecone embeddings vector database. This function should return the relevant candidate rows.

In [12]:
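from sqlalchemy import bindparam

def semantic_search(query):
    # Embed the query using your pre-trained model
    embedding = ... # Embed the query using BERT, for example (a list of 768 floats)

    # Query the Pinecone index for similar vectors.
    # Replace the host and key placeholders with the values for your own index;
    # topK=10 is an arbitrary choice for the number of matches to return.
    response = requests.post(
        "https://your_index_host/query",
        headers={"Api-Key": "your_pinecone_api_key"},
        json={"vector": embedding, "topK": 10},
        timeout=10,
    )
    response.raise_for_status()

    # Process the results; IDs are assumed to be the engineers' integer primary keys
    result_ids = [int(match["id"]) for match in response.json()["matches"]]

    # Get the candidate rows using a bound parameter instead of string formatting
    stmt = text("SELECT * FROM engineers WHERE id IN :ids").bindparams(
        bindparam("ids", expanding=True)
    )
    candidates = pd.read_sql_query(stmt, engine, params={"ids": result_ids})

    return candidates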
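
Since the `pinecone` package is already installed in the first cell, the similarity lookup could also go through the client instead of a raw HTTP call. The sketch below assumes `index` is a handle obtained from the client (for example `Pinecone(api_key=...).Index(...)`) and that the query response exposes `matches` with an `id` attribute on each match. `top_match_ids` and `order_by_rank` are hypothetical helpers. The second one addresses a detail the SQL lookup loses: an `IN (...)` query does not return rows in similarity order.

```python
def top_match_ids(index, embedding, top_k=5):
    """Return the IDs of the top_k nearest vectors, best match first."""
    response = index.query(vector=embedding, top_k=top_k)
    return [match.id for match in response.matches]


def order_by_rank(rows, ranked_ids):
    """Reorder database rows (dicts with an "id" key) to follow the Pinecone ranking.

    Rows whose ID is not in ranked_ids are dropped.
    """
    rank = {str(match_id): position for position, match_id in enumerate(ranked_ids)}
    known = [row for row in rows if str(row["id"]) in rank]
    return sorted(known, key=lambda row: rank[str(row["id"])])
```

With a DataFrame of candidates, the rows could be passed as `candidates.to_dict(orient="records")` before reordering.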